Similar literature
 20 similar articles found (search time: 484 ms)
1.
The study of female labor supply has been a topic of relevance in the economic literature. Generally, the data are left-censored and the classic tobit model has been extensively used in the modeling strategy. This model, however, assumes normality for the error distribution and is not recommended for data with positive skewness, heavy tails and heteroscedasticity, as is the case of female labor supply data. Moreover, it is well known that the quantile regression approach accounts for the influences of different quantiles in the estimated coefficients. We take all these features into account and propose a parametric quantile tobit regression model based on quantile log-symmetric distributions. The proposed method allows one to model data with positive skewness (for which the classic tobit model is not suitable), to study the influence of the quantiles of interest, and to account for heteroscedasticity. The model parameters are estimated by maximum likelihood and a Monte Carlo experiment is performed to evaluate alternative estimators. The new method is applied to two distinct female labor supply data sets. The results indicate that the log-symmetric quantile tobit model fits the data better than the classic tobit model.
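
For orientation, the classic tobit model that this abstract uses as its baseline can be sketched as a log-likelihood for left-censored data with normal errors. This is a minimal illustration and not the proposed log-symmetric quantile model; the single covariate, the censoring limit of 0, and all function and parameter names are assumptions made for the example.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(y, x, beta0, beta1, sigma, limit=0.0):
    """Log-likelihood of a classic tobit model with one covariate.

    Observations at or below `limit` are treated as left-censored
    (hypothetical setup: the latent response is beta0 + beta1 * x + error).
    """
    total = 0.0
    for yi, xi in zip(y, x):
        mu = beta0 + beta1 * xi
        if yi <= limit:
            # Censored: only P(latent response <= limit) is observed.
            total += math.log(norm_cdf((limit - mu) / sigma))
        else:
            # Uncensored: ordinary normal density contribution.
            total += math.log(norm_pdf((yi - mu) / sigma) / sigma)
    return total
```

Maximizing this function over `beta0`, `beta1` and `sigma` gives the classic tobit estimates; the normality assumption enters through `norm_pdf` and `norm_cdf`, which is the part the abstract argues is unsuitable for skewed, heavy-tailed data.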

2.
The purpose of this paper is to develop a new linear regression model for count data, namely the generalized-Poisson Lindley (GPL) linear model. The GPL linear model is obtained by applying the generalized linear model framework to the GPL distribution. The model parameters are estimated by maximum likelihood. We use the GPL linear model to fit two real data sets and compare it with the Poisson, negative binomial (NB) and Poisson-weighted exponential (P-WE) models for count data. It is found that the GPL linear model can fit over-dispersed count data, and it shows the highest log-likelihood and the smallest AIC and BIC values. As a consequence, the linear regression model from the GPL distribution is a valuable alternative to the Poisson, NB, and P-WE models.
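
The comparison criteria named in this abstract follow standard definitions, which can be sketched as below. The example uses an intercept-only Poisson log-likelihood purely as a stand-in, since the GPL likelihood is not given in the abstract; the function names and the sample dispersion index are assumptions made for illustration.

```python
import math


def poisson_loglik(counts, mu):
    """Log-likelihood of i.i.d. Poisson counts with common mean mu."""
    return sum(c * math.log(mu) - mu - math.lgamma(c + 1) for c in counts)


def aic(loglik, k):
    """Akaike information criterion for a model with k parameters."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik


def dispersion_index(counts):
    """Sample variance divided by sample mean; values above 1 suggest over-dispersion."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return var / mean
```

Lower AIC and BIC values favor a model, so a competing model is preferred when its gain in log-likelihood outweighs the penalty for its extra parameters.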

3.
This paper proposes a linear mixed model (LMM) with spatial effects, trend, seasonality and outliers for spatio-temporal time series data. A linear trend, dummy variables for seasonality, a binary method for outliers and a multivariate conditional autoregressive (MCAR) model for spatial effects are adopted. A Bayesian method using Gibbs sampling in Markov Chain Monte Carlo is used for parameter estimation. The proposed model is applied to forecast rice and cassava yields, a spatio-temporal data type, in Thailand. The data have been extracted from the Office of Agricultural Economics, Ministry of Agriculture and Cooperatives of Thailand. The proposed model is compared with our previous model, an LMM with MCAR, and a log-transformed LMM with MCAR. We found that the proposed model is the most appropriate, using the mean absolute error criterion. It fits the data very well in both the fitting part and the validation part for both rice and cassava. Therefore, it is recommended as a primary model for forecasting these types of spatio-temporal time series data.

4.
We investigate mixed models for repeated measures data from cross-over studies in general, but in particular for data from thorough QT studies. We extend both the conventional random effects model and the saturated covariance model for univariate cross-over data to repeated measures cross-over (RMC) data; the resulting models we call the RMC model and Saturated model, respectively. Furthermore, we consider a random effects model for repeated measures cross-over data previously proposed in the literature. We assess the standard errors of point estimates and the coverage properties of confidence intervals for treatment contrasts under the various models. Our findings suggest: (i) Point estimates of treatment contrasts from all models considered are similar; (ii) Confidence intervals for treatment contrasts under the random effects model previously proposed in the literature do not have adequate coverage properties; the model therefore cannot be recommended for analysis of marginal QT prolongation; (iii) The RMC model and the Saturated model have similar precision and coverage properties; both models are suitable for assessment of marginal QT prolongation; and (iv) The Akaike Information Criterion (AIC) is not a reliable criterion for selecting a covariance model for RMC data in the following sense: the model with the smallest AIC is not necessarily associated with the highest precision for the treatment contrasts, even if the model with the smallest AIC value is also the most parsimonious model.

5.
We consider inference in randomized longitudinal studies with missing data that is generated by skipped clinic visits and loss to follow-up. In this setting, it is well known that full data estimands are not identified unless unverified assumptions are imposed. We assume a non-future dependence model for the drop-out mechanism and partial ignorability for the intermittent missingness. We posit an exponential tilt model that links non-identifiable distributions and distributions identified under partial ignorability. This exponential tilt model is indexed by non-identified parameters, which are assumed to have an informative prior distribution, elicited from subject-matter experts. Under this model, full data estimands are shown to be expressed as functionals of the distribution of the observed data. To avoid the curse of dimensionality, we model the distribution of the observed data using a Bayesian shrinkage model. In a simulation study, we compare our approach to a fully parametric and a fully saturated model for the distribution of the observed data. Our methodology is motivated by, and applied to, data from the Breast Cancer Prevention Trial.

6.
Methodology for Bayesian inference is considered for a stochastic epidemic model which permits mixing on both local and global scales. Interest focuses on estimation of the within- and between-group transmission rates given data on the final outcome. The model is sufficiently complex that the likelihood of the data is numerically intractable. To overcome this difficulty, an appropriate latent variable is introduced, about which asymptotic information is known as the population size tends to infinity. This yields a method for approximate inference for the true model. The methods are applied to real data, tested with simulated data, and also applied to a simple epidemic model for which exact results are available for comparison.

7.
Recurrence data usually come from sampling a system at different ages. Such data are common in manufacturing, reliability, medicine, and risk analysis. In this article, an accelerated model for recurrence data is proposed. The model is based on the time between failures (TBFs) for an accelerated time regression method using the number of failures as a dummy covariate. Using this model, the repair (or cure) effect can also be studied. A graphical display of recurrence data created by plotting the log-TBFs versus the number of failures is proposed to detect any linear or nonlinear trend for log-TBFs. The model is then extended to incorporate covariates and/or time factors. Two data sets are used to demonstrate the usefulness of the proposed model.
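
The quantities behind the graphical display described in this abstract can be sketched as follows. This is an illustrative computation only, assuming a single system observed from age 0 whose failures are recorded as strictly increasing cumulative ages; the function name is hypothetical.

```python
import math


def log_tbfs(failure_ages):
    """Return (failure number, log time between failures) pairs.

    `failure_ages` holds the cumulative system age at each failure,
    assumed strictly increasing and measured from age 0.
    """
    pairs = []
    previous = 0.0
    for number, age in enumerate(failure_ages, start=1):
        gap = age - previous
        if gap <= 0:
            raise ValueError("failure ages must be strictly increasing")
        pairs.append((number, math.log(gap)))
        previous = age
    return pairs
```

Plotting the second element of each pair against the first gives the log-TBF display; a downward trend would indicate that failures are arriving faster as the system accumulates repairs.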

8.
A folded type model is developed for analysing compositional data. The proposed model involves an extension of the α‐transformation for compositional data and provides a new and flexible class of distributions for modelling data defined on the simplex sample space. Despite its seemingly complex structure, employment of the EM algorithm guarantees efficient parameter estimation. The model is validated through simulation studies and examples which illustrate that the proposed model performs better in terms of capturing the data structure, when compared to the popular logistic normal distribution, and can be advantageous over a similar model without folding.

9.
Crossover designs are popular in early phases of clinical trials and in bioavailability and bioequivalence studies. Assessment of carryover effects, in addition to the treatment effects, is a critical issue in crossover trials. The observed data from a crossover trial can be incomplete because of potential dropouts. A joint model for analyzing incomplete data from crossover trials is proposed in this article; the model includes a measurement model and an outcome-dependent informative model for the dropout process. The informative-dropout model is compared with the ignorable-dropout model as specific cases of the latter are nested subcases of the proposed joint model. Markov chain sampling methods are used for Bayesian analysis of this model. The joint model is used to analyze depression score data from a clinical trial in women with late luteal phase dysphoric disorder. Interestingly, the carryover effect is found to have a strong effect in the informative dropout model, but it is less significant when dropout is considered ignorable.

10.
Because of limitations of the univariate frailty model in analysis of multivariate survival data, a bivariate frailty model is introduced for the analysis of bivariate survival data. This provides tremendous flexibility especially in allowing negative associations between subjects within the same cluster. The approach involves incorporating into the model two possibly correlated frailties for each cluster. The bivariate lognormal distribution is used as the frailty distribution. The model is then generalized to multivariate survival data with two distinguished groups and also to alternating process data. A modified EM algorithm is developed with no requirement of specification of the baseline hazards. The estimators are generalized maximum likelihood estimators with subject-specific interpretation. The model is applied to a mental health study on evaluation of health policy effects for inpatient psychiatric care.

11.
We describe a mixed-effect hurdle model for zero-inflated longitudinal count data, where a baseline variable is included in the model specification. Association between the count data process and the endogenous baseline variable is modeled through a latent structure, assumed to be dependent across equations. We show how model parameters can be estimated in a finite mixture context, allowing for overdispersion, multivariate association and endogeneity of the baseline variable. The model behavior is investigated through a large-scale simulation experiment. An empirical example on health care utilization data is provided.

12.
All statistical methods involve basic model assumptions, which if violated render results of the analysis dubious. A solution to such a contingency is to seek an appropriate model or to modify the customary model by introducing additional parameters. Both of these approaches are in general cumbersome and demand uncommon expertise. An alternative is to transform the data to achieve compatibility with a well understood and convenient customary model with readily available software. The well-known example is the Box–Cox data transformation developed in order to make the normal theory linear model usable even when the assumptions of normality and homoscedasticity are not met. In reliability analysis the model appropriateness is determined by the nature of the hazard function. The well-known Weibull distribution is the most commonly employed model for this purpose. However, this model, which allows only a small spectrum of monotone hazard rates, is especially inappropriate if the data indicate bathtub-shaped hazard rates. In this paper, a new model based on the use of data transformation is presented for modeling bathtub-shaped hazard rates. Parameter estimation methods are studied for this new (transformation) approach. Examples and results of comparisons between the new model and other bathtub-shaped models are shown to illustrate the applicability of this new model.
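
The limitation of the Weibull model mentioned in this abstract can be checked numerically with a short sketch. It uses the standard two-parameter Weibull hazard, h(t) = (k/λ)(t/λ)^(k-1), and shows that the hazard is monotone for any fixed shape, so it cannot first fall and then rise as a bathtub curve does. The function names and the evaluation grid are assumptions for the example, and the sketch does not implement the transformation model proposed in the paper.

```python
def weibull_hazard(t, shape, scale=1.0):
    """Hazard rate of a Weibull distribution at time t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def hazard_trend(shape, scale=1.0):
    """Classify the hazard on a small grid as increasing, decreasing or constant."""
    grid = [0.25 * i for i in range(1, 21)]  # t = 0.25, 0.5, ..., 5.0
    values = [weibull_hazard(t, shape, scale) for t in grid]
    diffs = [b - a for a, b in zip(values, values[1:])]
    if all(d > 0 for d in diffs):
        return "increasing"
    if all(d < 0 for d in diffs):
        return "decreasing"
    if all(d == 0 for d in diffs):
        return "constant"
    return "non-monotone"
```

For any positive shape value the result is one of the three monotone labels, which is why a different construction is needed when the data suggest early failures followed by a stable period and then wear-out.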

13.
We propose a four-parameter extended generalized gamma model, which includes as special cases some important distributions and is very useful for modeling lifetime data. An advantage is that it can represent the error distribution for a new heteroscedastic log-odd log-logistic generalized gamma regression model. The proposed heteroscedastic regression model can be used more effectively in the analysis of survival data since it includes as special models several widely-known regression models. Further, for different parameter settings, sample sizes and censoring percentages, various simulations are performed. Overall, the new regression model is very useful for the analysis of real data.

14.
In this paper the exponentiated-Weibull model is modified to model the possibility that long-term survivors are present in the data. The modification leads to an exponentiated-Weibull mixture model which encompasses as special cases the exponential and Weibull mixture models typically used to model such data. Inference for the model parameters is considered via maximum likelihood and also via Bayesian inference by using Markov chain Monte Carlo simulation. Model comparison is considered by using likelihood ratio statistics and also the pseudo Bayes factor, which can be computed by using the generated samples. An example of a data set is considered for which the exponentiated-Weibull mixture model presents a better fit than the Weibull mixture model. Results of simulation studies are also reported, which show that the likelihood ratio statistics seem to be somewhat deficient for small and moderate sample sizes.
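
A mixture model with long-term survivors can be sketched as a population survival function that levels off at the surviving fraction. The sketch below assumes the common exponentiated-Weibull form F(t) = [1 - exp(-(t/σ)^α)]^θ and a standard mixture S(t) = p + (1 - p)(1 - F(t)); the paper's own parametrization may differ, and the names `alpha`, `theta`, `sigma` and `p_cured` are chosen for illustration.

```python
import math


def exp_weibull_cdf(t, alpha, theta, sigma):
    """Exponentiated-Weibull distribution function (assumed parametrization)."""
    return (1.0 - math.exp(-((t / sigma) ** alpha))) ** theta


def mixture_survival(t, p_cured, alpha, theta, sigma):
    """Population survival when a fraction p_cured never experiences the event."""
    susceptible_survival = 1.0 - exp_weibull_cdf(t, alpha, theta, sigma)
    return p_cured + (1.0 - p_cured) * susceptible_survival
```

Setting `theta = 1` recovers a Weibull mixture and additionally setting `alpha = 1` recovers an exponential mixture, which matches the nesting described in the abstract and is what makes likelihood ratio comparisons between these models possible.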

15.
A mixture model with Laplace and normal components is fitted to wind shear data available in grouped form. A set of equations is presented for iteratively estimating the parameters of the model using an application of the EM algorithm. Twenty-four sets of data are examined with this technique, and the model is found to give a good fit to the data. Some hypotheses about the parameters in the model are discussed in light of the estimates obtained.

16.
Biological control of pests is an important branch of entomology, providing environmentally friendly forms of crop protection. Bioassays are used to find the optimal conditions for the production of parasites and strategies for application in the field. In some of these assays, proportions are measured and, often, these data have an inflated number of zeros. In this work, six models will be applied to data sets obtained from biological control assays for Diatraea saccharalis, a common pest in sugar cane production. A natural choice for modelling proportion data is the binomial model. The second model will be an overdispersed version of the binomial model, estimated by a quasi-likelihood method. This model was initially built to model overdispersion generated by individual variability in the probability of success. When interest is only in the positive proportion data, a model can be based on the truncated binomial distribution and on its overdispersed version. The last two models include the zero proportions and are based on a finite mixture model with the binomial distribution or its overdispersed version for the positive data. Here, we will present the models, discuss their estimation and compare the results.
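
Two of the building blocks named in this abstract can be written down directly: the zero-truncated binomial for the positive data, and a mixture that adds a point mass at zero. The sketch below is one plausible reading of the abstract, not the authors' exact specification; the mixing weight `omega` and all function names are assumptions, and the overdispersed versions are not covered.

```python
import math


def binom_pmf(y, m, pi):
    """Binomial probability of y successes in m trials."""
    return math.comb(m, y) * pi ** y * (1.0 - pi) ** (m - y)


def truncated_binom_pmf(y, m, pi):
    """Zero-truncated binomial: binomial conditioned on y >= 1."""
    if y < 1 or y > m:
        return 0.0
    return binom_pmf(y, m, pi) / (1.0 - (1.0 - pi) ** m)


def zero_mixture_pmf(y, m, pi, omega):
    """Point mass omega at zero, truncated binomial for the positive counts."""
    if y == 0:
        return omega
    return (1.0 - omega) * truncated_binom_pmf(y, m, pi)
```

Because the zeros and the positive counts are governed by separate parameters in this form, the zero proportion can be fitted independently of the success probability among the positive observations.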

17.
The problem of deciding whether an intercept model or a no-intercept model is more appropriate for a given set of data is a problem with no simple solution. Often, the underlying physical situation will suggest an appropriate model; however, there still may be interest in assessing which model best fits the data or is the better predictor. In this article a different interpretation of regression through the origin is derived, that of a full fit to the original data set augmented by one further point. Examination of the leverage and influence of the augmented data point can provide help in comparing the models.

18.
In many longitudinal studies multiple characteristics of each individual, along with time to occurrence of an event of interest, are often collected. In such data sets, some of the correlated characteristics may be discrete and some of them may be continuous. In this paper, a joint model for analysing multivariate longitudinal data comprising mixed continuous and ordinal responses and a time to event variable is proposed. We model the association structure between longitudinal mixed data and time to event data using a multivariate zero-mean Gaussian process. For modeling discrete ordinal data we assume a continuous latent variable follows the logistic distribution and for continuous data a Gaussian mixed effects model is used. For the event time variable, an accelerated failure time model is considered under different distributional assumptions. For parameter estimation, a Bayesian approach using Markov Chain Monte Carlo is adopted. The performance of the proposed methods is illustrated using some simulation studies. A real data set is also analyzed, where different model structures are used. Model comparison is performed using a variety of statistical criteria.

19.
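斯琴, 《统计教育》 (Statistical Education), 2008, (12): 13-15, 19
This paper applies the equal-dimension new-information grey prediction model to the analysis and forecasting of urban residents' pension insurance in China, compares its forecasting accuracy with that of the traditional grey prediction model, and ultimately adopts the grey prediction model with equal-dimension new-information processing. At each prediction step, the equal-dimension new-information model refreshes the original data series, adding the newest information and discarding the oldest so that the series keeps the same dimension. A case study confirms that the proposed model successfully captures the pattern of movement in the data during modeling and provides reasonable and effective medium- and long-term forecasts. The author hopes that this grey prediction analysis of urban residents' pension insurance in China will lay some groundwork for subsequent research on the pension insurance system.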
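
The rolling scheme described in this abstract can be sketched with the standard GM(1,1) grey model. The sketch assumes the usual textbook formulation (accumulated series, mean-generated background values, least-squares estimates of the development coefficient `a` and grey input `b`) and assumes that each forecast is fed back into the window as the new information while the oldest point is dropped. The function names are hypothetical, the fitted `a` is assumed to be nonzero, and none of this is taken from the paper's own data or code.

```python
import math


def gm11_fit(x0):
    """Least-squares estimates (a, b) of x0(k) = -a * z1(k) + b for k >= 2."""
    x1, running = [], 0.0
    for value in x0:
        running += value
        x1.append(running)  # accumulated (1-AGO) series
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]  # background values
    y = x0[1:]
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    det = m * szz - sz * sz
    a = (sz * sy - m * szy) / det
    b = (sy + a * sz) / m
    return a, b


def gm11_next(x0):
    """One-step-ahead GM(1,1) forecast of the original series."""
    a, b = gm11_fit(x0)
    n = len(x0)
    c = x0[0] - b / a
    # Difference of consecutive fitted values of the accumulated series.
    return c * (math.exp(-a * n) - math.exp(-a * (n - 1)))


def rolling_forecast(x0, steps):
    """Equal-dimension rolling forecast: append the newest value, drop the oldest."""
    window = list(x0)
    forecasts = []
    for _ in range(steps):
        nxt = gm11_next(window)
        forecasts.append(nxt)
        window = window[1:] + [nxt]  # window length stays fixed
    return forecasts
```

Keeping the window length fixed means that each refit is driven by the most recent behavior of the series, which is the property the abstract credits for the model's medium- and long-term forecasting performance.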

20.
Shi, Wang, Murray-Smith and Titterington (Biometrics 63:714–723, 2007) proposed a Gaussian process functional regression (GPFR) model to model functional response curves with a set of functional covariates. Two main problems are addressed by their method: modelling nonlinear and nonparametric regression relationships and modelling covariance structure and mean structure simultaneously. The method gives very good results for curve fitting and prediction but side-steps the problem of heterogeneity. In this paper we present a new method for modelling functional data with ‘spatially’ indexed data, i.e., the heterogeneity is dependent on factors such as region and individual patient’s information. For data collected from different sources, we assume that the data corresponding to each curve (or batch) follows a Gaussian process functional regression model as a lower-level model, and introduce an allocation model for the latent indicator variables as a higher-level model. This higher-level model is dependent on the information related to each batch. This method takes advantage of both GPFR and mixture models and therefore improves the accuracy of predictions. The mixture model has also been used for curve clustering, but focusing on the problem of clustering functional relationships between response curve and covariates, i.e. the clustering is based on the surface shape of the functional response against the set of functional covariates. The model is examined on simulated data and real data.
