Dementia patients exhibit considerable heterogeneity in individual trajectories of cognitive decline, with some patients showing rapid decline following diagnosis while others exhibit slower decline or remain stable for several years. Dementia studies often collect longitudinal measures from multiple neuropsychological tests intended to measure patients' decline across a number of cognitive domains. We propose a multivariate finite mixture latent trajectory model to identify distinct longitudinal patterns of cognitive decline simultaneously in multiple cognitive domains, each of which is measured by multiple neuropsychological tests. The EM algorithm is used for parameter estimation, and posterior probabilities are used to predict latent class membership. We present results of a simulation study demonstrating adequate performance of our proposed approach and apply our model to the Uniform Data Set from the National Alzheimer's Coordinating Center to identify cognitive decline patterns among dementia patients.
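The estimation strategy named in this abstract (EM for the parameters, posterior probabilities for class membership) can be illustrated on a much simpler model. The sketch below is not the authors' multivariate latent trajectory model: it assumes a univariate Gaussian mixture, and the function name `em_gaussian_mixture` and all of its arguments are hypothetical choices made for illustration.

```python
import math


def em_gaussian_mixture(data, means, variances, weights, n_iter=50, var_floor=1e-6):
    """Illustrative EM for a univariate K-component Gaussian mixture.

    Assumption: this toy model stands in for the far richer latent
    trajectory model of the abstract; only the EM mechanics carry over.
    """
    k = len(means)
    means, variances, weights = list(means), list(variances), list(weights)
    post = []
    for _ in range(n_iter):
        # E-step: posterior class-membership probabilities, in log space
        # so that very small densities do not cause a division by zero.
        post = []
        for x in data:
            logs = [
                math.log(weights[j])
                - 0.5 * math.log(2.0 * math.pi * variances[j])
                - 0.5 * (x - means[j]) ** 2 / variances[j]
                for j in range(k)
            ]
            top = max(logs)
            unnorm = [math.exp(v - top) for v in logs]
            total = sum(unnorm)
            post.append([u / total for u in unnorm])
        # M-step: weighted updates of mixing weights, means and variances.
        for j in range(k):
            nj = max(sum(p[j] for p in post), 1e-12)
            weights[j] = nj / len(data)
            means[j] = sum(p[j] * x for p, x in zip(post, data)) / nj
            spread = sum(p[j] * (x - means[j]) ** 2 for p, x in zip(post, data)) / nj
            variances[j] = max(var_floor, spread)
    # `post` holds the posteriors from the final E-step.
    return means, variances, weights, post


data = [-0.1, 0.0, 0.1, 4.9, 5.0, 5.1]
means, variances, weights, post = em_gaussian_mixture(
    data, means=[1.0, 4.0], variances=[1.0, 1.0], weights=[0.5, 0.5]
)
# Predicted class = component with the largest posterior probability.
labels = [max(range(2), key=lambda j: p[j]) for p in post]
```

Assigning each observation to the class with the largest posterior probability mirrors the membership rule described in the abstract; the variance floor is a pragmatic guard against degenerate components, not something the abstract specifies.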

This article analyzes the effects of multicollinearity on the maximum likelihood (ML) estimator for the Tobit regression model. Furthermore, a ridge regression (RR) estimator is proposed, since the mean squared error (MSE) of ML becomes inflated when the regressors are collinear. To investigate the performance of the traditional ML and the RR approaches, we use Monte Carlo simulations with the MSE as the performance criterion. The simulation results indicate that the RR approach should always be preferred to the ML estimation method.

This paper discusses the problem of fitting a parametric model in Tobit mean regression models. The proposed test is based on the supremum of the Khmaladze-type transformation of a partial sum process of calibrated residuals. The asymptotic null distribution of this transformed process is shown to be the same as that of a time-transformed standard Brownian motion. Consistency of this sequence of tests against some fixed alternatives and asymptotic power under some local nonparametric alternatives are also discussed. Simulation studies are conducted to assess the finite sample performance of the proposed test. The power comparison with some existing tests shows some superiority of the proposed test at the chosen alternatives.

The standard Tobit model is constructed under the assumption of a normal distribution and has been widely applied in econometrics. Atypical or extreme data have a harmful effect on the maximum likelihood estimates of the standard Tobit model parameters. We therefore need diagnostic tools to evaluate the effect of extreme data and, if such data are detected, a Tobit model that is robust to them. The family of elliptically contoured distributions has the Laplace, logistic, normal and Student-t cases as some of its members. This family has been widely used to generalize models based on the normal distribution, with excellent practical results. In particular, because the Student-t distribution has an additional parameter, we can adjust the kurtosis of the data, providing robust estimates against extreme data. We propose a methodology based on a generalization of the standard Tobit model with errors following elliptical distributions. Diagnostics in the Tobit model with elliptical errors are developed. We derive residuals and global/local influence methods considering several perturbation schemes. This is important because different diagnostic methods can detect different atypical data. We implement the proposed methodology in an R package. We illustrate the methodology with real-world econometric data by using the R package, which shows its potential applications. In the application presented, the Tobit model based on the Student-t distribution with a small number of degrees of freedom performs very well in reducing the influence of extreme cases on the maximum likelihood estimates. It provides new empirical evidence on the capabilities of the Student-t distribution for accommodating atypical data.
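For orientation, the log-likelihood of the standard (normal-error) Tobit model that this abstract generalizes can be written in a few lines. This is a minimal sketch under stated assumptions: left censoring at a known limit `lower`, and hypothetical function names (`tobit_loglik`, `normal_cdf`, `normal_logpdf`). It does not reproduce the elliptical-error model or the R package mentioned above.

```python
import math


def normal_logpdf(z):
    """Log density of the standard normal distribution at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal CDF, written with erfc for better lower-tail accuracy."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def tobit_loglik(beta, sigma, X, y, lower=0.0):
    """Log-likelihood of a Tobit model with normal errors, left-censored at `lower`.

    Censored rows (y <= lower) contribute log P(y* <= lower);
    uncensored rows contribute the normal log density of y.
    """
    total = 0.0
    for row, yi in zip(X, y):
        mu = sum(b * x for b, x in zip(beta, row))
        if yi <= lower:
            total += math.log(normal_cdf((lower - mu) / sigma))
        else:
            total += normal_logpdf((yi - mu) / sigma) - math.log(sigma)
    return total


# One censored and one uncensored observation, intercept-only design.
value = tobit_loglik(beta=[0.0], sigma=1.0, X=[[1.0], [1.0]], y=[0.0, 1.0])
```

Replacing `normal_logpdf` and `normal_cdf` with their Student-t counterparts would give the heavier-tailed variant the abstract argues for; that substitution is left out here because the stdlib has no Student-t CDF.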

Finite mixture models are currently used to analyze heterogeneous longitudinal data. By relaxing the homogeneity restriction of nonlinear mixed-effects (NLME) models, finite mixture models can not only estimate model parameters but also cluster individuals into one of the pre-specified classes with class membership probabilities. This clustering may have clinical significance and might be associated with a clinically important binary outcome. This article develops a joint model, under a Bayesian framework, that combines a finite mixture of NLME models for longitudinal data in the presence of covariate measurement errors with a logistic regression for a binary outcome, the two being linked by individual latent class indicators. Simulation studies are conducted to assess the performance of the proposed joint model and a naive two-step model, in which the finite mixture model and the logistic regression are fitted separately. This is followed by an application to a real data set from an AIDS clinical trial, in which the viral dynamics and the dichotomized time to the first decline of the CD4/CD8 ratio are analyzed jointly.

We describe a semiparametric mixture model for human fertility studies. The probability of conception is a product of two components. The mixing distribution, the component that introduces the heterogeneity among the menstrual cycles that come from different couples, is characterized nonparametrically by a finite number of moments. The second component, the intercourse-related probability, is modeled parametrically to assess the possible exposure effects. We discuss an EM algorithm-based estimating procedure that incorporates the natural order in the moments.

Hedonic price models are commonly used in the study of markets for various goods, most notably those for wine, art, and jewelry. These models were developed to estimate implicit prices of product attributes within a given product class, where in the case of some goods, such as wine, substantial product differentiation exists. To address this issue, recent research on wine prices employs local polynomial regression clustering (LPRC) for estimating regression models under class uncertainty. This study demonstrates that a superior empirical approach, estimation of a mixture model, is applicable to a hedonic model of wine prices, provided only that the dependent variable in the model is rescaled. The present study also catalogues several advantages of estimating mixture models over LPRC modeling.

Mean survival time is often of inherent interest in medical and epidemiologic studies. In the presence of censoring and when covariate effects are of interest, Cox regression is the strong default, but mostly due to convenience and familiarity. When survival times are uncensored, covariate effects can be estimated as differences in mean survival through linear regression. Tobit regression can validly be performed through maximum likelihood when the censoring times are fixed (i.e., known for each subject, even in cases where the outcome is observed). However, Tobit regression is generally inapplicable when the response is subject to random right censoring. We propose Tobit regression methods based on weighted maximum likelihood which are applicable to survival times subject to both fixed and random censoring times. Under the proposed approach, known right censoring is handled naturally through the Tobit model, with inverse probability of censoring weighting used to overcome random censoring. Essentially, the re-weighted data are intended to represent those that would have been observed in the absence of random censoring. We develop methods for estimating the Tobit regression parameter, then the population mean survival time. A closed-form large-sample variance estimator is proposed for the regression parameter estimator, with a semiparametric bootstrap standard error estimator derived for the population mean. The proposed methods are easily implementable using standard software. Finite-sample properties are assessed through simulation. The methods are applied to a large cohort of patients wait-listed for kidney transplantation.

Merger and acquisition is an important corporate strategy. We collect recent merger and acquisition data for companies on the China A-share stock market to explore the relationship between corporate ownership structure and speed of merger success. When studying merger success, selection bias occurs if only completed mergers are analyzed. There is also a censoring problem when duration time is used to measure the speed. In this article, for time-to-event outcomes, we propose a semiparametric version of the type II Tobit model that can simultaneously handle selection bias and right censoring. The proposed model can also easily incorporate time-dependent covariates. A nonparametric maximum likelihood estimator is proposed. The resulting estimators are shown to be consistent, asymptotically normal, and semiparametrically efficient. Some Monte Carlo studies are carried out to assess the finite-sample performance of the proposed approach. Using the proposed model, we find that higher power balance of a company is associated with faster merger success.

This paper introduces and applies an EM algorithm for the maximum-likelihood estimation of a latent class version of the grouped-data regression model. This new model is applied to examine the effects of college athletic participation of females on incomes. No evidence for an “athlete” effect in the case of females has been found in the previous work by Long and Caudill [12], Henderson et al. [10], and Caudill and Long [5]. Our study is the first to find evidence of a lower wage for female athletes. This effect is present in a regime characterizing 42% of the sample. Further analysis indicates that female athletes in many otherwise low-paying jobs actually get paid less than non-athletes.

The maximum likelihood estimator (MLE) in nonlinear panel data models with fixed effects is widely understood (with a few exceptions) to be biased and inconsistent when T, the length of the panel, is small and fixed. However, there is surprisingly little theoretical or empirical evidence on the behavior of the estimator on which to base this conclusion. The received studies have focused almost exclusively on coefficient estimation in two binary choice models, the probit and logit models. In this note, we use Monte Carlo methods to examine the behavior of the MLE of the fixed effects tobit model. We find that the estimator's behavior is quite unlike that of the estimators of the binary choice models. Among our findings are that the location coefficients in the tobit model, unlike those in the probit and logit models, are unaffected by the “incidental parameters problem.” A surprising result emerges instead for the disturbance variance: the finite sample bias appears there rather than in the slopes. This has implications for estimation of marginal effects and asymptotic standard errors, which are also examined in this paper. The effects are also examined for the probit and truncated regression models, extending the range of received results in the first of these beyond the widely cited biases in the coefficient estimators.

This paper presents a new parametric model for recurrent events, in which the time of each recurrence is associated with one or multiple latent causes and no information is provided about the cause responsible for the event. This model is characterized by a rate function and is based on the Poisson-exponential distribution, namely the distribution of the maximum among a random number (truncated Poisson distributed) of exponential times. The time of each recurrence is then given by the maximum lifetime value among all latent causes. Inference is based on a maximum likelihood approach. A simulation study is performed to assess the frequentist properties of the estimation procedure for small and moderate sample sizes. We also investigate likelihood-based test procedures. A real example from a gastroenterology study concerning small bowel motility during the fasting state is used to illustrate the methodology. Finally, we apply the proposed model to a real data set and compare it with the classical homogeneous Poisson model, which is a particular case.
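The Poisson-exponential construction described here (the maximum of a zero-truncated Poisson number of exponential times) has a closed-form distribution function, which the sketch below checks against direct simulation. The parameter names `theta` (Poisson mean before truncation) and `lam` (exponential rate) and all function names are assumptions made for illustration; the code covers only the lifetime distribution, not the recurrent-event model or its rate function.

```python
import math
import random


def pe_cdf(t, theta, lam):
    """CDF of the maximum of N Exp(lam) times, N ~ zero-truncated Poisson(theta).

    Follows from E[(1 - exp(-lam*t))**N] = (exp(theta*u) - 1) / (exp(theta) - 1)
    with u = 1 - exp(-lam*t).
    """
    if t <= 0.0:
        return 0.0
    u = 1.0 - math.exp(-lam * t)
    return math.expm1(theta * u) / math.expm1(theta)


def pe_quantile(p, theta, lam):
    """Inverse of pe_cdf for 0 <= p < 1."""
    u = math.log1p(p * math.expm1(theta)) / theta
    return -math.log1p(-u) / lam


def poisson_draw(theta, rng):
    """Knuth's multiplication method for a Poisson(theta) draw (small theta)."""
    limit = math.exp(-theta)
    k = 0
    p = rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def pe_sample(theta, lam, rng):
    """Direct construction: reject N = 0, then take the maximum of N exponentials."""
    n = 0
    while n == 0:
        n = poisson_draw(theta, rng)
    return max(rng.expovariate(lam) for _ in range(n))


rng = random.Random(0)
draws = [pe_sample(2.0, 1.5, rng) for _ in range(20000)]
empirical = sum(1 for d in draws if d <= 1.0) / len(draws)
theoretical = pe_cdf(1.0, 2.0, 1.5)
```

Comparing `empirical` with `theoretical` is a quick sanity check that the closed form matches the latent-cause construction in the abstract.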

Summary.  The primary goal of multivariate statistical process performance monitoring is to identify deviations from normal operation within a manufacturing process. The basis of the monitoring schemes is historical data that have been collected when the process is running under normal operating conditions. These data are then used to establish confidence bounds to detect the onset of process deviations. In contrast with the traditional approaches that are based on the Gaussian assumption, this paper proposes the application of the infinite Gaussian mixture model (GMM) for the calculation of the confidence bounds, thereby relaxing the previous restrictive assumption. The infinite GMM is a special case of Dirichlet process mixtures and is introduced as the limit of the finite GMM, i.e. when the number of mixtures tends to ∞. On the basis of the estimation of the probability density function, via the infinite GMM, the confidence bounds are calculated by using the bootstrap algorithm. The methodology proposed is demonstrated through its application to a simulated continuous chemical process, and a batch semiconductor manufacturing process.

We propose a mixture integer-valued ARCH model for modeling integer-valued time series with overdispersion. The model consists of a mixture of K stationary or non-stationary integer-valued ARCH components. The advantages of the mixture model over the single-component model include the ability to handle multimodality and non-stationary components. The necessary and sufficient first- and second-order stationarity conditions, the necessary arbitrary-order stationarity conditions, and the autocorrelation function are derived. The parameters are estimated through an EM algorithm, and the model is selected by three information criteria, whose performance is studied via simulations. Finally, the model is applied to a real dataset.

The driving risk of novice teenage drivers is typically highest during the initial period after licensure and decreases rapidly thereafter. The change-point of driving risk is a critical parameter for evaluating teenage driving risk, and it varies substantially among drivers. This paper presents latent class recurrent-event change-point models for detecting the change-points. The proposed model is applied to the Naturalistic Teenage Driving Study, which continuously recorded the driving data of 42 novice teenage drivers for 18 months using advanced in-vehicle instrumentation. We propose a hierarchical Bayesian finite mixture model (BFMM) to estimate the change-points by clusters of drivers with similar risk profiles. The model is based on a non-homogeneous Poisson process with piecewise-constant intensity functions. Latent variables that identify the membership of the subjects are used to detect potential clusters among subjects. Application to the Naturalistic Teenage Driving Study identifies three distinct clusters with change-points at 52.30, 108.99 and 150.20 hours of driving after first licensure, respectively. The overall intensity rate and the pattern of change also differ substantially among clusters. The results of this research provide more insight into teenagers' driving behaviour and will be critical for improving young drivers' safety education and parent management programs, as well as providing a crucial reference for graduated driver licensing (GDL) regulations to encourage safer driving.
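The building block named in this abstract, a non-homogeneous Poisson process with piecewise-constant intensity, has a simple log-likelihood: the sum of log intensities at the event times minus the cumulative intensity over the observation window. The sketch below assumes a single subject and a single change-point, and picks the change-point by profile likelihood over a grid. It is a frequentist toy, not the hierarchical Bayesian mixture of the paper, and all function names and the example event times are hypothetical.

```python
import math


def cumulative_intensity(t, rate_before, rate_after, change_point):
    """Integral of the piecewise-constant intensity from 0 to t."""
    if t <= change_point:
        return rate_before * t
    return rate_before * change_point + rate_after * (t - change_point)


def nhpp_loglik(event_times, end_time, rate_before, rate_after, change_point):
    """Poisson-process log-likelihood on [0, end_time] with one change-point."""
    log_sum = 0.0
    for t in event_times:
        rate = rate_before if t <= change_point else rate_after
        log_sum += math.log(rate)
    return log_sum - cumulative_intensity(end_time, rate_before, rate_after, change_point)


def profile_change_point(event_times, end_time, candidates):
    """Grid search: for each candidate change-point (0 < tau < end_time),
    plug in the per-segment rate MLEs (events / exposure) and keep the best."""
    best_tau, best_value = None, -math.inf
    for tau in candidates:
        n_before = sum(1 for t in event_times if t <= tau)
        n_after = len(event_times) - n_before
        rate_before = n_before / tau
        rate_after = n_after / (end_time - tau)
        value = nhpp_loglik(event_times, end_time, rate_before, rate_after, tau)
        if value > best_value:
            best_tau, best_value = tau, value
    return best_tau, best_value


# Made-up event times (hours): frequent early events, sparse later ones.
events = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 30.0, 50.0, 70.0, 90.0]
best_tau, best_value = profile_change_point(events, 100.0, [5.0, 10.0, 20.0, 50.0])
```

In the paper's setting the change-point and rates would additionally vary by latent cluster; here the grid search simply recovers the point where the made-up event rate drops.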

Model-based clustering is a method that clusters data under the assumption of a statistical model structure. In this paper, we propose a novel model-based hierarchical clustering method for a finite statistical mixture model based on the Fisher distribution. The main foci of the proposed method are to: (a) provide an efficient solution for estimating the parameters of a Fisher mixture model (FMM); (b) generate a hierarchy of FMMs; and (c) select the optimal model. To this end, we develop a Bregman soft clustering method for FMM. Our model estimation strategy exploits Bregman divergence and hierarchical agglomerative clustering. Our model selection strategy, in turn, comprises a parsimony-based approach and an evaluation graph-based approach. We empirically validate our proposed method by applying it to simulated data. Next, we apply the method to real data to perform depth image analysis. We demonstrate that the proposed clustering method can be used as a potential tool for unsupervised depth image analysis.

Karlis and Santourian [14] (D. Karlis and A. Santourian, Model-based clustering with non-elliptically contoured distribution, Stat. Comput. 19 (2009), pp. 73–83, doi: 10.1007/s11222-008-9072-0) proposed a model-based clustering algorithm, the expectation–maximization (EM) algorithm, to fit the mixture of multivariate normal-inverse Gaussian (NIG) distributions. However, the EM algorithm for the mixture of multivariate NIG requires a set of initial values to begin the iterative process, and the number of components has to be given a priori. In this paper, we present a learning-based EM algorithm whose aim is to overcome the aforementioned weaknesses of Karlis and Santourian's EM algorithm [14]. The proposed learning-based EM algorithm was first inspired by Yang et al. [24] (M.-S. Yang, C.-Y. Lai, and C.-Y. Lin, A robust EM clustering algorithm for Gaussian mixture models, Pattern Recognit. 45 (2012), pp. 3950–3961, doi: 10.1016/j.patcog.2012.04.031); the process by which they perform self-clustering was then simulated. Numerical experiments showed promising results compared to Karlis and Santourian's EM algorithm. Moreover, the methodology is applicable to the analysis of extrasolar planets. Our analysis provides an understanding of the clustering results in the ln P–ln M and ln P–e spaces, where M is the planetary mass, P is the orbital period and e is the orbital eccentricity. Our identified groups interpret two phenomena: (1) the characteristics of two clusters in ln P–ln M space might be related to the tidal and disc interactions (see [9]: I.G. Jiang, W.H. Ip, and L.C. Yeh, On the fate of close-in extrasolar planets, Astrophys. J. 582 (2003), pp. 449–454, doi: 10.1086/344590); and (2) there are two clusters in ln P–e space.


In this article, we propose a new distribution obtained by mixing normal and Pareto distributions; the new distribution has an unusual hazard function. We model the mean and the variance with covariates to account for heterogeneity. The parameters are estimated by a Bayesian method using Markov chain Monte Carlo (MCMC) algorithms. The proposal distribution in the MCMC is constructed from a working variable defined in terms of the observations. In simulations, the method performs dependably. By fitting the model to a real dataset, we demonstrate that the proposed model and method can be more suitable than those of a previous report.


In this study, a Generalized, Multi-Stage Adjusted, Latent Class Linear Mixed Model is proposed for modeling heterogeneously distributed phenotype and genetic information across the whole genome in the presence of both serial and familial correlations. Genome data were analyzed by applying the proposed model to Genetic Analysis Workshop (GAW) data, and the model results were compared to the results of standard models. Moreover, the potential of the model is discussed using simulated data. In the model comparisons, the information criteria and the genomic control parameter were found to be smaller for the proposed model. The results of a power analysis show that the proposed model is more powerful.

