Similar articles
1.
Linear mixed models are widely used when multiple correlated measurements are made on each unit of interest. In many applications the units may form several distinct clusters, and such heterogeneity is more appropriately modelled by a finite mixture linear mixed model. The classical estimation approach, in which both the random effects and the error terms are assumed to follow normal distributions, is sensitive to outliers, and failure to accommodate outliers can greatly jeopardize model estimation and inference. We propose a new mixture linear mixed model based on the multivariate t distribution. Within each mixture component, the response and the random effects are assumed to jointly follow a multivariate t distribution, which conveniently robustifies the estimation procedure. An efficient expectation conditional maximization algorithm is developed for maximum likelihood estimation. The degrees-of-freedom parameters of the t distributions are chosen adaptively from the data, giving a flexible trade-off between estimation robustness and efficiency. Simulation studies and an application to longitudinal lung growth data showcase the efficacy of the proposed approach.

2.
Measurement error models constitute a wide class of models that includes linear and nonlinear regression models. They are very useful for modelling many real-life phenomena, particularly in the medical and biological areas. A great advantage of these models is that, in a certain sense, they can be represented as mixed effects models, allowing well-known techniques such as the EM algorithm to be used for parameter estimation. In this paper we consider a class of multivariate measurement error models in which the observed response and/or covariate are not fully observed, i.e. the observations are subject to threshold values below or above which the measurements are not quantifiable; such observations are treated as censored. We assume a Student-t distribution for the unobserved true values of the mismeasured covariate and for the error term of the model, providing a robust alternative for parameter estimation. Our approach relies on likelihood-based inference using an EM-type algorithm. The proposed method is illustrated through simulation studies and the analysis of an AIDS clinical trial dataset.

3.
We investigate mixed analysis of covariance models for the 'one-step' assessment of conditional QT prolongation. Initially, we consider three different covariance structures for the data, where between-treatment covariance of repeated measures is modelled respectively through random effects, random coefficients, and through a combination of random effects and random coefficients. In all three of those models, an unstructured covariance pattern is used to model within-treatment covariance. In a fourth model, proposed earlier in the literature, between-treatment covariance is modelled through random coefficients but the residuals are assumed to be independent identically distributed (i.i.d.). Finally, we consider a mixed model with saturated covariance structure. We investigate the precision and robustness of those models by fitting them to a large group of real data sets from thorough QT studies. Our findings suggest: (i) Point estimates of treatment contrasts from all five models are similar. (ii) The random coefficients model with i.i.d. residuals is not robust; the model potentially leads to both under- and overestimation of standard errors of treatment contrasts and therefore cannot be recommended for the analysis of conditional QT prolongation. (iii) The combined random effects/random coefficients model does not always converge; in the cases where it converges, its precision is generally inferior to the other models considered. (iv) Both the random effects and the random coefficients model are robust. (v) The random effects, the random coefficients, and the saturated model have similar precision and all three models are suitable for the one-step assessment of conditional QT prolongation.

4.
Prediction error is critical for assessing model fit and evaluating model prediction. We propose cross-validation (CV) and approximated CV methods for estimating prediction error under the Bregman divergence (BD), which embeds nearly all of the commonly used loss functions in the regression, classification and machine learning literature. The approximated CV formulas are derived analytically, which enables fast estimation of prediction error under BD. We then study a data-driven optimal bandwidth selector for local-likelihood estimation that minimizes the overall prediction error or, equivalently, the covariance penalty. It is shown that the covariance penalty and CV methods converge to the same mean prediction error criterion. We also propose a lower-bound scheme for computing local logistic regression estimates and demonstrate that the algorithm monotonically increases the target local likelihood and converges. The idea and methods are extended to generalized varying-coefficient models and additive models.
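As a rough illustration of estimating prediction error by cross-validation under one member of the Bregman family (squared-error loss, generated by phi(u) = u^2), the following Python sketch computes a K-fold CV error for an arbitrary fit/predict pair. The function names and the simple linear-regression example are illustrative assumptions, not taken from the paper, and the sketch does not cover the approximated CV formulas.

```python
import numpy as np

def bregman_squared_error(y, yhat):
    # Squared-error loss is the Bregman divergence generated by phi(u) = u^2.
    return (y - yhat) ** 2

def kfold_cv_prediction_error(x, y, fit, predict, loss, k=5, seed=0):
    """K-fold cross-validation estimate of mean prediction error under a given loss."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        model = fit(x[train_idx], y[train_idx])
        errors.append(loss(y[test_idx], predict(model, x[test_idx])))
    return np.mean(np.concatenate(errors))

# Hypothetical example: straight-line fit by least squares.
fit_ls = lambda x, y: np.polyfit(x, y, deg=1)
predict_ls = lambda coef, x: np.polyval(coef, x)

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 200)
y = 2.0 * x + rng.normal(scale=0.3, size=200)
print(kfold_cv_prediction_error(x, y, fit_ls, predict_ls, bregman_squared_error))
```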

5.
Three types of polynomial mixed model splines have been proposed: smoothing splines, P-splines and penalized splines using a truncated power function basis. The close connections between these models are demonstrated, showing that the default cubic forms of the splines differ only in the penalty used. A general definition of the mixed model spline is given that includes general constraints and can be used to produce natural or periodic splines. The impact of different penalties is demonstrated by evaluation across a set of functions with specific features, showing that the best penalty in terms of mean squared error of prediction depends on both the form of the underlying function and the signal-to-noise ratio.
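A minimal sketch of the truncated-power-function basis named above, fitted by penalized least squares: only the truncated-power coefficients are shrunk, mirroring the mixed-model view in which they act as random effects and the smoothing parameter plays the role of the error-to-random-effect variance ratio. The function names, knot placement and smoothing parameter are illustrative assumptions.

```python
import numpy as np

def truncated_power_basis(x, knots, degree=3):
    """Design matrices: polynomial part [1, x, ..., x^p] and truncated powers (x - k_j)_+^p."""
    X = np.vander(x, degree + 1, increasing=True)                 # unpenalized polynomial part
    Z = np.maximum(x[:, None] - knots[None, :], 0.0) ** degree    # penalized spline part
    return X, Z

def fit_penalized_spline(x, y, knots, lam, degree=3):
    """Penalized least squares with a ridge penalty on the truncated-power coefficients only."""
    X, Z = truncated_power_basis(x, knots, degree)
    C = np.hstack([X, Z])
    penalty = np.zeros(C.shape[1])
    penalty[X.shape[1]:] = lam                                    # shrink spline coefficients only
    coef = np.linalg.solve(C.T @ C + np.diag(penalty), C.T @ y)
    return coef, C @ coef

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 150))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=150)
knots = np.linspace(0.05, 0.95, 20)
coef, fitted = fit_penalized_spline(x, y, knots, lam=1.0)
```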

6.
We introduce a new multivariate GARCH model with multivariate thresholds in conditional correlations and develop a two-step estimation procedure that is feasible in large dimensional applications. Optimal threshold functions are estimated endogenously from the data and the model conditional covariance matrix is ensured to be positive definite. We study the empirical performance of our model in two applications using U.S. stock and bond market data. In both applications our model has, in terms of statistical and economic significance, higher forecasting power than several other multivariate GARCH models for conditional correlations.

7.
In this work we study local influence in nonlinear mixed effects models with M-estimation. A robust method for obtaining maximum likelihood estimates of the parameters is presented, and local influence for nonlinear mixed models based on robust (M-)estimation is discussed systematically using the curvature method. Curvature formulas are derived for case-weight perturbation, response-variable perturbation and random-error covariance perturbation. Simulation studies are carried out to assess the performance of the proposed methods. We illustrate the diagnostics with an example from Davidian and Giltinan, which was previously analysed in the non-robust setting.

8.
We consider measurement error models within the time series unobserved component framework. A variable of interest is observed with some measurement error and modelled as an unobserved component. The forecast and the prediction of this variable given the observed values are obtained from the Kalman filter and smoother, along with their conditional variances. By expressing the forecasts and predictions as weighted averages of the observed values, we investigate the effect of estimation error in the measurement and observation noise variances. We also develop corrected standard errors for prediction and forecasting that account for the fact that the measurement and observation error variances are estimated from the same sample used for forecasting and prediction. We apply the theory to the Yellowstone grizzly bear and US index of production datasets.
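For concreteness, a minimal sketch of the Kalman filter for the simplest unobserved-component case, the local level (random walk plus noise) model, assuming the noise variances are known; in the entry's setting these variances must be estimated from the same sample, which is precisely the source of the extra uncertainty studied. All names and values are illustrative.

```python
import numpy as np

def local_level_filter(y, sigma_obs2, sigma_state2, a0=0.0, p0=1e6):
    """Kalman filter for y_t = mu_t + obs error, mu_t = mu_{t-1} + state error.
    Returns filtered level estimates and their variances."""
    a, p = a0, p0
    filtered, variances = [], []
    for obs in y:
        p = p + sigma_state2                  # prediction step
        if not np.isnan(obs):                 # update step (skipped if missing)
            f = p + sigma_obs2                # prediction-error variance
            k = p / f                         # Kalman gain
            a = a + k * (obs - a)
            p = (1.0 - k) * p
        filtered.append(a)
        variances.append(p)
    return np.array(filtered), np.array(variances)

rng = np.random.default_rng(0)
mu = np.cumsum(rng.normal(scale=0.1, size=100))      # latent signal
y = mu + rng.normal(scale=0.5, size=100)             # noisy measurements
level, var = local_level_filter(y, sigma_obs2=0.25, sigma_state2=0.01)
```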

9.
The article considers a new approach to small area estimation based on joint modelling of means and variances. Model parameters are estimated via the expectation-maximization algorithm. The conditional mean squared error is used to evaluate the prediction error. Analytical expressions are obtained for the conditional mean squared error and its estimator. Our approximations are second-order correct, an unwritten standard in the small area literature. Simulation studies indicate that the proposed method outperforms existing methods in terms of prediction errors and their estimated values.

10.
In this article, robust estimation and prediction in multivariate autoregressive models with exogenous variables (VARX) are considered. The conditional least squares (CLS) estimators are known to be non-robust when outliers occur. To obtain robust estimators, the method introduced in Duchesne [2005. Robust and powerful serial correlation tests with new robust estimates in ARX models. J. Time Ser. Anal. 26, 49–81] and Bou Hamad and Duchesne [2005. On robust diagnostics at individual lags using RA-ARX estimators. In: Duchesne, P., Rémillard, B. (Eds.), Statistical Modeling and Analysis for Complex Data Problems. Springer, New York] is generalized to VARX models. The asymptotic distribution of the new estimators is studied, which yields, in particular, the asymptotic covariance matrix of the robust estimators. Classical conditional prediction intervals normally rely on estimators such as the usual non-robust CLS estimators. In the presence of outliers, such as additive outliers, these classical predictions can be severely biased, and more generally the occurrence of outliers may invalidate the usual conditional prediction intervals. Consequently, the new robust methodology is used to develop robust conditional prediction intervals that take parameter estimation uncertainty into account. In a simulation study, we investigate the finite sample properties of the robust prediction intervals under several scenarios for the occurrence of outliers, and the new intervals are compared to non-robust intervals based on classical CLS estimators.

11.
The demand for reliable statistics on subpopulations, when only reduced sample sizes are available, has promoted the development of small area estimation methods. In particular, an approach that is now widely used is based on the seminal work by Battese et al. [An error-components model for prediction of county crop areas using survey and satellite data, J. Am. Statist. Assoc. 83 (1988), pp. 28–36], which uses linear mixed models (MM). We investigate alternatives for when a linear MM does not hold because, on the one hand, linearity may not be assumed and/or, on the other, normality of the random effects may not be assumed. In particular, Opsomer et al. [Nonparametric small area estimation using penalized spline regression, J. R. Statist. Soc. Ser. B 70 (2008), pp. 265–283] propose an estimator that extends the linear MM approach to the case in which a linear relationship may not be assumed, using penalized spline regression. From a very different perspective, Chambers and Tzavidis [M-quantile models for small area estimation, Biometrika 93 (2006), pp. 255–268] have recently proposed an approach to small area estimation based on M-quantile (MQ) regression. This allows for models that are robust to outliers and to departures from distributional assumptions on the errors and the area effects. However, when the functional form of the relationship between the qth MQ and the covariates is not linear, it can lead to biased estimates of the small area parameters. Pratesi et al. [Semiparametric M-quantile regression for estimating the proportion of acidic lakes in 8-digit HUCs of the Northeastern US, Environmetrics 19(7) (2008), pp. 687–701] apply an extended version of this approach to the estimation of the small area distribution function, using a non-parametric specification of the conditional MQ of the response variable given the covariates [M. Pratesi, M.G. Ranalli, and N. Salvati, Nonparametric m-quantile regression using penalized splines, J. Nonparametric Stat. 21 (2009), pp. 287–304]. We derive the small area estimator of the mean under this model, together with its mean squared error estimator, and compare its performance with that of the other estimators via simulations on both real and simulated data.

12.
With a growing interest in using non-representative samples to train prediction models for numerous outcomes, it is necessary to account for the sampling design that gives rise to the data in order to assess the generalized predictive utility of a proposed prediction rule. After learning a prediction rule from a non-uniform sample, it is of interest to estimate the rule's error rate when applied to unobserved members of the population. Efron (1986) proposed a general class of covariance-penalty-inflated prediction error estimators that assume the available training data are representative of the target population for which the prediction rule is to be applied. We extend Efron's estimator to the complex sample context by incorporating Horvitz–Thompson sampling weights and show that it is consistent for the true generalization error rate when applied to the underlying superpopulation. The resulting Horvitz–Thompson–Efron estimator is equivalent to dAIC, a recent extension of Akaike's information criterion to survey sampling data, but is more widely applicable. The proposed methodology is assessed with simulations and applied to models predicting renal function obtained from the large-scale National Health and Nutrition Examination Survey. The Canadian Journal of Statistics 48: 204–221; 2020 © 2019 Statistical Society of Canada
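The weighting step can be sketched as follows: each sampled unit's loss is inflated by the inverse of its inclusion probability, Horvitz–Thompson style. This sketch shows only the reweighting of an apparent error rate, not the covariance-penalty inflation that the proposed estimator adds on top; the losses and probabilities below are hypothetical.

```python
import numpy as np

def ht_weighted_error_rate(loss, inclusion_prob):
    """Design-weighted estimate of the population mean prediction error:
    each sampled unit's loss is weighted by the inverse of its inclusion probability."""
    w = 1.0 / np.asarray(inclusion_prob)
    return np.sum(w * np.asarray(loss)) / np.sum(w)

# Hypothetical example: squared-error losses from a fitted rule on a non-uniform sample.
loss = np.array([0.10, 0.40, 0.25, 0.05])
pi = np.array([0.8, 0.2, 0.5, 0.9])        # sampling inclusion probabilities
print(ht_weighted_error_rate(loss, pi))
```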

13.
We consider marginal semiparametric partially linear models for longitudinal/clustered data and propose an estimation procedure based on a spline approximation of the non-parametric part of the model and an extension of the parametric marginal generalized estimating equations (GEE). Our estimates of both the parametric and the non-parametric parts of the model have properties parallel to those of parametric GEE; that is, the estimates are efficient if the covariance structure is correctly specified, and they remain consistent and asymptotically normal even if the covariance structure is misspecified. By showing that our estimate achieves the semiparametric information bound, we establish the efficiency of estimating the parametric part of the model in a stronger sense than is typically considered for GEE. The semiparametric efficiency of our estimate is obtained by assuming only conditional moment restrictions instead of the strict multivariate Gaussian error assumption.

14.
Estimation of the covariance matrix is important in the analysis of bivariate longitudinal data. A good estimator of the covariance matrix can improve the efficiency of the estimators of the mean regression coefficients. Furthermore, the covariance estimation is of interest in its own right, but modelling the covariance matrix of bivariate longitudinal data is challenging because of its complex structure and the positive-definiteness constraint. In addition, most existing approaches are based on maximum likelihood, which is very sensitive to outliers or heavy-tailed error distributions. In this article, an adaptive robust estimation method is proposed for bivariate longitudinal data. Unlike existing likelihood-based methods, the proposed method can adapt to different error distributions. Specifically, we first utilize the modified Cholesky block decomposition to parameterize the covariance matrices. Second, we apply the bounded Huber score function to develop a set of robust generalized estimating equations that estimate the parameters in the mean and covariance models simultaneously. A data-driven approach is presented for selecting the tuning parameter c in Huber's score function, which ensures that the proposed method is both robust and efficient. A simulation study and a real data analysis are conducted to illustrate the robustness and efficiency of the proposed approach.
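A small sketch of two ingredients named above, in their simplest form: the (scalar, non-block) modified Cholesky decomposition T Σ T' = D of a covariance matrix, and Huber's bounded score function with tuning constant c. The block version used for bivariate longitudinal data and the data-driven choice of c are not shown; all names and the example matrix are illustrative.

```python
import numpy as np

def modified_cholesky(sigma):
    """Modified Cholesky decomposition T @ sigma @ T.T = D, with T unit lower triangular
    and D diagonal (interpretable as autoregressive coefficients and innovation variances)."""
    L = np.linalg.cholesky(sigma)          # sigma = L L'
    d = np.diag(L)
    C = L / d                              # unit lower triangular, so sigma = C D C'
    D = np.diag(d ** 2)
    T = np.linalg.inv(C)                   # then T sigma T' = D
    return T, D

def huber_psi(r, c=1.345):
    """Huber's bounded score function applied to standardized residuals."""
    return np.clip(r, -c, c)

sigma = np.array([[2.0, 0.8, 0.3],
                  [0.8, 1.5, 0.6],
                  [0.3, 0.6, 1.2]])
T, D = modified_cholesky(sigma)
print(np.allclose(T @ sigma @ T.T, D))     # True
print(huber_psi(np.array([-3.0, -0.5, 0.2, 4.0])))
```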

15.
Growth models are generally fitted under the assumptions that the error terms are homoscedastic and normally distributed. However, these assumptions are often not verified in practice. In this work we propose four growth models (Morgan–Mercer–Flodin, von Bertalanffy, Gompertz and Richards) considering different distributions (normal, skew-normal) for the error terms and three different covariance structures. A maximum likelihood estimation procedure is described. A simulation study is performed in order to verify the appropriateness of the proposed growth curve models, and the methodology is also illustrated on a real dataset.
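As an illustration of the baseline these models relax, the sketch below fits a Gompertz curve by ordinary nonlinear least squares, i.e. under independent homoscedastic normal errors; the skew-normal errors and alternative covariance structures proposed in the entry are not handled by this call. The data and starting values are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

def gompertz(t, alpha, beta, kappa):
    """Gompertz growth curve: upper asymptote alpha, displacement beta, growth rate kappa."""
    return alpha * np.exp(-beta * np.exp(-kappa * t))

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 40)
y = gompertz(t, 100.0, 3.0, 0.6) + rng.normal(scale=2.0, size=t.size)   # simulated data

params, cov = curve_fit(gompertz, t, y, p0=[90.0, 2.0, 0.5])
print(params)    # estimates of (alpha, beta, kappa)
```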

16.
We propose a new model for conditional covariances based on predetermined idiosyncratic shocks as well as macroeconomic and own information instruments. The specification ensures positive definiteness by construction, is unique within the class of linear functions for our covariance decomposition, and yields a simple yet rich model of covariances. We introduce a property, invariance to variate order, that assures estimation is not impacted by a simple reordering of the variates in the system. Simulation results using realized covariances show smaller mean absolute errors (MAE) and root mean square errors (RMSE) for every element of the covariance matrix relative to a comparably specified BEKK model with own information instruments. We also find a smaller mean absolute percentage error (MAPE) and root mean square percentage error (RMSPE) for the entire covariance matrix. Supplementary materials for practitioners as well as all Matlab code used in the article are available online.

17.
Different longitudinal study designs require different statistical analysis methods and different methods of sample size determination. Statistical power analysis is a flexible approach to sample size determination for longitudinal studies; however, different statistical tests require different power analyses. In this paper, simulation-based power calculations of F-tests with Containment, Kenward-Roger or Satterthwaite approximations of the degrees of freedom are examined for sample size determination in a special case of linear mixed models (LMMs) that is frequently used in the analysis of longitudinal data. The roles of several factors are examined jointly, which has not been considered previously: the variance–covariance structure of the random effects [unstructured UN or factor analytic FA0], the autocorrelation structure among errors over time [independent IND, first-order autoregressive AR1 or first-order moving average MA1], the parameter estimation method [maximum likelihood ML or restricted maximum likelihood REML] and the iterative algorithm [ridge-stabilized Newton-Raphson or Quasi-Newton]. The factor with the greatest effect on statistical power is found to be the variance–covariance structure of the random effects in the LMM. The simulation-based analysis in this study gives an interesting insight into the statistical power of approximate F-tests for fixed effects in LMMs for longitudinal data.
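A stripped-down sketch of the simulation-based workflow: simulate longitudinal data from a random-intercept linear mixed model, fit it, and record how often the group-by-time effect is declared significant. It uses the default Wald z-test from statsmodels rather than the approximate F-tests with Containment, Kenward-Roger or Satterthwaite degrees of freedom examined in the entry, so it only illustrates the general approach; all parameter values and names are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def simulate_power(n_subj=40, n_time=4, slope_diff=0.5, sd_intercept=1.0,
                   sd_error=1.0, alpha=0.05, n_sim=200, seed=0):
    """Empirical power of the Wald test for a group-by-time interaction in a
    random-intercept linear mixed model, estimated by simulation."""
    rng = np.random.default_rng(seed)
    time = np.tile(np.arange(n_time), n_subj)
    subj = np.repeat(np.arange(n_subj), n_time)
    group = np.repeat(np.arange(n_subj) % 2, n_time)        # two equal-size groups
    rejections = 0
    for _ in range(n_sim):
        b0 = rng.normal(scale=sd_intercept, size=n_subj)     # random intercepts
        y = b0[subj] + slope_diff * group * time + rng.normal(scale=sd_error, size=len(time))
        data = pd.DataFrame({"y": y, "time": time, "group": group, "subj": subj})
        fit = smf.mixedlm("y ~ time * group", data, groups=data["subj"]).fit(reml=True)
        rejections += fit.pvalues["time:group"] < alpha
    return rejections / n_sim

print(simulate_power())
```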

18.
The purpose of this article is to obtain jackknifed ridge predictors in linear mixed models and to examine the superiority of the jackknifed ridge predictors, and of their linear combinations, over the ridge, principal components regression, r–k class and Henderson's predictors in terms of bias, covariance matrix and mean square error criteria. Numerical analyses illustrate the findings, and a simulation study is conducted to assess the performance of the jackknifed ridge predictors.

19.
We propose a new criterion for model selection in prediction problems. The covariance inflation criterion adjusts the training error by the average covariance of the predictions and responses, when the prediction rule is applied to permuted versions of the data set. This criterion can be applied to general prediction problems (e.g. regression or classification) and to general prediction rules (e.g. stepwise regression, tree-based models and neural nets). As a by-product we obtain a measure of the effective number of parameters used by an adaptive procedure. We relate the covariance inflation criterion to other model selection procedures and illustrate its use in some regression and classification problems. We also revisit the conditional bootstrap approach to model selection.
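A rough sketch of the permutation idea with squared-error loss: refit the rule to permuted responses, average the empirical covariance between fitted values and responses over permutations, and inflate the training error by it. The exact scaling and centering follow the paper and may differ from this sketch; the cubic-polynomial rule and data are hypothetical.

```python
import numpy as np

def covariance_inflation_penalty(x, y, fit, predict, n_perm=200, seed=0):
    """Permutation estimate of the covariance penalty (2/n) * sum_i cov(yhat_i, y_i):
    the prediction rule is refitted to permuted responses, and the empirical covariance
    between fitted values and responses is computed coordinate-wise over permutations."""
    rng = np.random.default_rng(seed)
    n = len(y)
    yhat_perm = np.empty((n_perm, n))
    y_perm = np.empty((n_perm, n))
    for b in range(n_perm):
        yp = rng.permutation(y)
        yhat_perm[b] = predict(fit(x, yp), x)
        y_perm[b] = yp
    cov = np.mean((yhat_perm - yhat_perm.mean(0)) * (y_perm - y_perm.mean(0)), axis=0)
    return 2.0 * np.sum(cov) / n

# Hypothetical adaptive-ish rule: a cubic polynomial fit.
fit_ls = lambda x, y: np.polyfit(x, y, deg=3)
predict_ls = lambda coef, x: np.polyval(coef, x)

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 100)
y = x + rng.normal(scale=0.5, size=100)
train_err = np.mean((y - predict_ls(fit_ls(x, y), x)) ** 2)
cic = train_err + covariance_inflation_penalty(x, y, fit_ls, predict_ls)
```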

20.
In this paper we prove a consistency result for sieved maximum likelihood estimators of the density in general random censoring models with covariates. The proof is based on the method of functional estimation. The estimation error is decomposed into a deterministic approximation error and a stochastic estimation error. The main part of the proof establishes a uniform law of large numbers for the conditional log-likelihood functional, using results and techniques from empirical process theory.

