Similar Literature
20 similar records found (search time: 15 ms)
1.
In modeling complex longitudinal data, semiparametric nonlinear mixed-effects (SNLME) models are flexible and useful, and covariates are often introduced to partially explain inter-individual variation. In practice, the data are often incomplete: longitudinal studies commonly involve both measurement errors and missing values. The likelihood method is a standard approach to inference for these models, but it can be computationally very challenging, so computationally efficient approximate methods are valuable. The performance of such approximate methods, however, has usually been assessed only through limited simulation studies, and theoretical results are unavailable for many of them. In this article, we consider a computationally efficient approximate method for a class of SNLME models with incomplete data and investigate its theoretical properties. We show that the estimates based on the approximate method are consistent and asymptotically normally distributed.

2.
CD4 and viral load play important roles in HIV/AIDS studies, and their relationship has received much attention, with well-known results. However, AIDS datasets are often highly complex in the sense that they typically contain outliers, measurement errors, and missing data. These complications can greatly affect the results of a statistical analysis, yet much of the literature fails to address them. In this paper, we revisit the relationship between CD4 and viral load and propose methods that simultaneously address outliers, measurement errors, and missing data. We find that the strength of the relationship may be severely mis-estimated if measurement errors and outliers are ignored. The proposed methods are general and can be used in other settings where several different types of longitudinal data must be modelled jointly in the presence of such complications.

3.
Quantile regression models, an important tool in practice, can describe the effects of risk factors on the entire conditional distribution of the response variable, with estimates that are robust to outliers. However, there has been little discussion of quantile regression for longitudinal data with both missing responses and measurement errors, which are common in practice. We develop a weighted and bias-corrected quantile loss function for quantile regression with longitudinal data that accommodates both missingness and measurement errors, and we establish the asymptotic properties of the proposed estimator. Simulation studies demonstrate the expected performance in correcting the bias resulting from missingness and measurement errors. Finally, we analyse the Lifestyle Education for Activity and Nutrition study and confirm the effectiveness of the intervention in producing weight loss after nine months at the upper quantiles.
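The check (pinball) loss at the heart of any quantile regression can be sketched in a few lines of pure Python. This is only a generic illustration of the loss function, not the weighted, bias-corrected version proposed in the abstract above; the helper names are made up for the example:

```python
def pinball(u, tau):
    """Check loss rho_tau(u) = u * (tau - 1{u < 0}); asymmetric around zero."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def sample_quantile_by_loss(data, tau, candidates):
    """The candidate minimising average check loss is a tau-th sample quantile."""
    return min(candidates, key=lambda q: sum(pinball(x - q, tau) for x in data))

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
# tau = 0.5 recovers the median; tau = 0.9 an upper quantile.
print(sample_quantile_by_loss(data, 0.5, data))  # -> 5
print(sample_quantile_by_loss(data, 0.9, data))  # -> 9
```

Under-prediction and over-prediction are penalised with weights tau and 1 - tau, which is what lets the fitted curve target a chosen quantile instead of the mean.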

4.
We compare the commonly used two-step methods and the joint likelihood method for joint models of longitudinal and survival data via extensive simulations. The longitudinal models include LME, GLMM, and NLME models, and the survival models include Cox models and AFT models. We find that the joint likelihood method outperforms the two-step methods for various joint models, but it can be computationally challenging when the dimension of the random effects in the longitudinal model is not small. We therefore propose an approximate joint likelihood method that is computationally efficient. The proposed approximation performs well in the joint model context, and it performs better for more "continuous" longitudinal data. Finally, a real AIDS data example shows that patients with a higher initial viral load or a lower initial CD4 count are more likely to drop out earlier during anti-HIV treatment.

5.
In this paper, we study the estimation of p-values for robust tests for the linear regression model. The asymptotic distribution of these tests has only been studied under the restrictive assumption of errors with known scale or a symmetric distribution. Since these robust tests are based on robust regression estimates, Efron's bootstrap (1979) presents a number of problems. In particular, it is computationally very expensive, and it is not resistant to outliers in the data: the tails of bootstrap distribution estimates obtained by re-sampling the data may be severely affected by outliers. We show how to adapt the Robust Bootstrap (Ann. Statist. 30 (2002) 556; Bootstrapping MM-estimators for linear regression with fixed designs, http://mathstat.carleton.ca/~matias/pubs.html) to this problem. This method is fast to compute, resistant to outliers in the data, and asymptotically correct under weak regularity assumptions. We show that the Robust Bootstrap can be used to obtain asymptotically correct, computationally simple p-value estimates. A simulation study indicates that tests whose p-values are estimated with the Robust Bootstrap have better finite-sample significance levels than those obtained from the asymptotic theory based on the symmetry assumption. Although this paper is focused on robust scores-type tests (in: Directions in Robust Statistics and Diagnostics, Part I, Springer, New York), our approach can be applied to other robust tests (for example, the Wald- and dispersion-type tests also discussed in Markatou et al., 1991).

6.
Quantile regression (QR) models have recently received increasing attention for longitudinal data analysis. When continuous responses are non-central due to outliers and/or heavy tails, commonly used mean regression models may fail to produce efficient estimators, whereas QR models may perform satisfactorily. In addition, longitudinal outcomes are often measured with non-normality, substantial errors, and non-ignorable missing values. Statistical inference in such settings should account for all of these data features simultaneously; otherwise, erroneous or even misleading results may be produced. The literature contains considerable work accommodating one or some of these features, but relatively little that addresses all of them at once, even though longitudinal data often exhibit all of these characteristics, and inference becomes dramatically more complicated when they arise in both the longitudinal response and the covariates. In this article, we develop QR-based Bayesian semiparametric mixed-effects models to address the simultaneous impact of these multiple data features. The proposed models and methods are applied to a longitudinal data set arising from an AIDS clinical study, and simulation studies are conducted to assess the performance of the proposed method under various scenarios.

7.
Multivariate mixture regression models can be used to investigate the relationships between two or more response variables and a set of predictor variables while accounting for unobserved population heterogeneity. It is common to take multivariate normal distributions as the mixing components, but this choice is sensitive to heavy-tailed errors and outliers: although normal mixture models can in principle approximate any distribution, the number of components needed to accommodate heavy tails can be very large. Mixture regression models based on multivariate t distributions are a robust alternative. Missing data are inevitable in many situations, and parameter estimates may be biased if the missing values are not handled properly. In this paper, we propose a multivariate t mixture regression model with missing information to model heterogeneity in the regression function in the presence of outliers and missing values. Along with robust parameter estimation, the proposed method can be used for (i) visualization of the partial correlation between response variables across latent classes and heterogeneous regressions, and (ii) outlier detection and robust clustering even in the presence of missing values. We also propose a multivariate t mixture regression model using MM-estimation with missing information that is robust to high-leverage outliers. The proposed methodologies are illustrated through simulation studies and real data analysis.

8.
Estimation in mixed linear models is, in general, computationally demanding, since applied problems may involve extensive data sets and large numbers of random effects. Existing computer algorithms are slow and/or require large amounts of memory. These problems are compounded in generalized linear mixed models for categorical data, since even approximate methods involve fitting of a linear mixed model within steps of an iteratively reweighted least squares algorithm. Only in models in which the random effects are hierarchically nested can the computations for fitting these models to large data sets be carried out rapidly. We describe a data augmentation approach to these computational difficulties in which we repeatedly fit an overlapping series of submodels, incorporating the missing terms in each submodel as 'offsets'. The submodels are chosen so that they have a nested random-effect structure, thus allowing maximum exploitation of the computational efficiency which is available in this case. Examples of the use of the algorithm for both metric and discrete responses are discussed, all calculations being carried out using macros within the MLwiN program.

9.
During recent years, analysts have been relying on approximate methods of inference to estimate multilevel models for binary or count data. In an earlier study of random-intercept models for binary outcomes we used simulated data to demonstrate that one such approximation, known as marginal quasi-likelihood, leads to a substantial attenuation bias in the estimates of both fixed and random effects whenever the random effects are non-trivial. In this paper, we fit three-level random-intercept models to actual data for two binary outcomes, to assess whether refined approximation procedures, namely penalized quasi-likelihood and second-order improvements to marginal and penalized quasi-likelihood, also underestimate the underlying parameters. The extent of the bias is assessed by two standards of comparison: exact maximum likelihood estimates, based on a Gauss–Hermite numerical quadrature procedure, and a set of Bayesian estimates, obtained from Gibbs sampling with diffuse priors. We also examine the effectiveness of a parametric bootstrap procedure for reducing the bias. The results indicate that second-order penalized quasi-likelihood estimates provide a considerable improvement over the other approximations, but all the methods of approximate inference result in a substantial underestimation of the fixed and random effects when the random effects are sizable. We also find that the parametric bootstrap method can eliminate the bias but is computationally very intensive.
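The parametric-bootstrap bias correction examined above can be illustrated on a deliberately simple target: the ML variance estimator, which is biased downward by the factor (n - 1)/n. This toy sketch only shows the generic recipe (simulate from the fitted model, re-estimate, subtract the estimated bias); it is not the multilevel binary-outcome setting of the paper:

```python
import math
import random

def mle_var(xs):
    """ML estimate of the variance: divides by n, so it is biased downward."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

random.seed(42)
n = 15
sample = [random.gauss(0.0, 1.0) for _ in range(n)]
mu_hat = sum(sample) / n
theta_hat = mle_var(sample)

# Parametric bootstrap: resample from the *fitted* model, re-estimate,
# and use the average bootstrap estimate to estimate the bias.
B = 2000
boot = []
for _ in range(B):
    bs = [random.gauss(mu_hat, math.sqrt(theta_hat)) for _ in range(n)]
    boot.append(mle_var(bs))
bias_hat = sum(boot) / B - theta_hat   # estimates roughly -theta_hat / n
theta_corrected = theta_hat - bias_hat # bias-corrected estimate

print(theta_hat, theta_corrected)
```

The correction pulls the estimate up by about theta_hat / n, close to the usual n/(n - 1) degrees-of-freedom fix; the same logic, at far greater cost, underlies the bootstrap correction the paper evaluates.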

10.
Different longitudinal study designs require different statistical analysis methods and, correspondingly, different methods of sample size determination. Statistical power analysis is a flexible approach to sample size determination for longitudinal studies, but different statistical tests require different power analyses. In this paper, simulation-based power calculations for F-tests with the Containment, Kenward-Roger, or Satterthwaite approximation of the degrees of freedom are examined for sample size determination in a special case of the linear mixed model (LMM) frequently used in the analysis of longitudinal data. The roles of several factors are examined together, which has not been considered previously: the variance-covariance structure of the random effects [unstructured (UN) or factor-analytic (FA0)], the autocorrelation structure of the errors over time [independent (IND), first-order autoregressive (AR1), or first-order moving average (MA1)], the parameter estimation method [maximum likelihood (ML) or restricted maximum likelihood (REML)], and the iterative algorithm [ridge-stabilized Newton-Raphson or quasi-Newton]. The variance-covariance structure of the random effects is found to be the factor with the greatest effect on statistical power. The simulation-based analysis thus gives useful insight into the power of approximate F-tests for fixed effects in LMMs for longitudinal data.

11.
In survival analysis, time-dependent covariates usually arise as longitudinal data collected periodically and measured with error. The longitudinal data can be assumed to follow a linear mixed-effects model, and Cox regression models may be used to model the survival events. The hazard rate then depends on the underlying error-prone time-dependent covariate, which may be described by random effects. Most existing methods for such models place a parametric distributional assumption on the random effects and specify normally distributed errors for the linear mixed-effects model; these assumptions may not always hold in practice. In this article, we propose a new likelihood method for Cox regression models with error-contaminated time-dependent covariates that requires no parametric distributional assumptions on the random effects or the random errors. Asymptotic properties of the parameter estimators are provided. Simulation results show that in certain situations the proposed method is more efficient than existing methods.

12.
In this paper we consider the impact of both missing data and measurement errors on a longitudinal analysis of participation in higher education in Australia. We develop a general method for handling both discrete and continuous measurement errors that also allows for the incorporation of missing values and random effects in both binary and continuous response multilevel models. Measurement errors are allowed to be mutually dependent and their distribution may depend on further covariates. We show that our methodology works via two simple simulation studies. We then consider the impact of our measurement error assumptions on the analysis of the real data set.

13.
The least squares estimates of the parameters in the multistage dose-response model are unduly affected by outliers in a data set, whereas the minimum sum of absolute errors (MSAE) estimates are more resistant to outliers. Algorithms to compute the MSAE estimates, however, can be tedious and computationally burdensome. We propose a linear approximation to the dose-response model that can be used to find the MSAE estimates by a simple and computationally less intensive algorithm. A few illustrative examples and a Monte Carlo study show that the MSAE estimates obtained from the exact model and from the linear approximation are comparable.
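The outlier resistance of the MSAE criterion is easiest to see in the pure location case, where minimising the sum of absolute errors has a closed-form solution, the sample median. This toy sketch sidesteps the multistage dose-response structure entirely and just contrasts the two criteria:

```python
def least_squares_location(xs):
    """Minimiser of the sum of squared errors: the sample mean."""
    return sum(xs) / len(xs)

def msae_location(xs):
    """Minimiser of the sum of absolute errors: a sample median."""
    s = sorted(xs)
    n = len(s)
    return s[n // 2] if n % 2 else 0.5 * (s[n // 2 - 1] + s[n // 2])

clean = [1.0, 2.0, 3.0, 4.0, 5.0]
dirty = [1.0, 2.0, 3.0, 4.0, 500.0]   # one gross outlier

print(least_squares_location(dirty))  # -> 102.0, dragged far from the bulk
print(msae_location(dirty))           # -> 3.0, unchanged by the outlier
```

In the regression setting the MSAE estimate has no closed form, which is why the paper's linear approximation and simple algorithm matter.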

14.
Homogeneity of between-individual variances and autocorrelation coefficients is a standard assumption in the analysis of longitudinal data, but it can be violated in complex datasets. In this paper we propose and analyse nonlinear mixed models with AR(1) errors for longitudinal data, introducing Huber's function into the log-likelihood to obtain robust estimates via the Fisher scoring method, which helps reduce the influence of outliers. We then study tests of homogeneity of the between-individual variances and autocorrelation coefficients based on Huber's M-estimation. Simulation studies are carried out to assess the performance of the proposed score test, and results for plasma concentration data are reported as an illustrative example.

15.
When there are frequent capture occasions, both semiparametric and nonparametric estimators for the size of an open population have been proposed using kernel smoothing methods. While kernel smoothing methods are mathematically tractable, fitting them to data is computationally intensive. Here, we use smoothing splines in the form of P-splines to provide an alternate less computationally intensive method of fitting these models to capture–recapture data from open populations with frequent capture occasions. We fit the model to capture data collected over 64 occasions and model the population size as a function of time, seasonal effects and an environmental covariate. A small simulation study is also conducted to examine the performance of the estimators and their standard errors.

16.
This paper generalizes the tolerance interval approach for assessing agreement between two methods of continuous measurement for repeated measurement data, a common scenario in applications. The repeated measurements may be longitudinal or they may be replicates of the same underlying measurement. Our approach is to first model the data using a mixed model and then construct a relevant asymptotic tolerance interval (or band) for the distribution of appropriately defined differences. We present the methodology in the general context of a mixed model that can incorporate covariates, heteroscedasticity and serial correlation in the errors. Simulation for the no-covariate case shows good small-sample performance of the proposed methodology. For the longitudinal data, we also describe an extension for the case when the observed time profiles are modelled nonparametrically through penalized splines. Two real data applications are presented.

17.
The analysis of survival endpoints subject to right-censoring is an important research area in statistics, particularly among econometricians and biostatisticians. The two most popular semiparametric models are the proportional hazards model and the accelerated failure time (AFT) model. Rank-based estimation in the AFT model is computationally challenging because it requires optimizing a non-smooth loss function. Previous work has shown that rank-based estimators may be written as solutions to linear programming (LP) problems; however, the size of the LP problem is O(n^2 + p) with n^2 linear constraints, where n denotes the sample size and p the dimension of the parameters, so as n and/or p increases the feasibility of such a solution in practice becomes questionable. Among data mining and statistical learning researchers, there is interest in extending ordinary low-dimensional regression coefficient estimators into high-dimensional tools through regularization. Applying this recipe to rank-based coefficient estimators leads to formidable optimization problems, which may be avoided through smooth approximations to the non-smooth functions. We review smooth approximations and quasi-Newton methods for rank-based estimation in AFT models. The computational cost of our method is substantially smaller than that of the corresponding LP problem, and the method applies to small- and large-scale problems alike. The algorithm described here allows one to couple rank-based estimation for censored data with virtually any regularization, and is exemplified through four case studies.
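A standard way to smooth a non-differentiable loss of the kind discussed above is to replace |u| with sqrt(u^2 + eps^2): the approximation error is at most eps everywhere, and the surrogate has a well-defined gradient, so quasi-Newton methods apply. This is a generic sketch of the smoothing idea under that assumption, not the paper's specific approximation:

```python
import math

def smooth_abs(u, eps=1e-3):
    """Differentiable surrogate for |u|; overshoots |u| by at most eps."""
    return math.sqrt(u * u + eps * eps)

def smooth_abs_grad(u, eps=1e-3):
    """Gradient of the surrogate -- well defined even at the kink u = 0."""
    return u / math.sqrt(u * u + eps * eps)

# The surrogate never falls below |u| and exceeds it by at most eps.
for u in [-3.0, -0.5, 0.0, 1e-6, 2.0]:
    err = smooth_abs(u) - abs(u)
    assert 0.0 <= err <= 1e-3

print(smooth_abs_grad(0.0))  # -> 0.0 (the kink has been smoothed away)
```

Shrinking eps trades smoothness for fidelity to the original loss, which is the usual tuning knob when coupling such surrogates with gradient-based solvers and regularization.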

18.
In this paper, we develop an efficient wavelet-based regularized linear quantile regression framework for coefficient estimation, where the responses are scalars and the predictors include both scalars and functions. The framework has two main ingredients: wavelet transformation and regularized linear quantile regression. The wavelet transform approximates functional data by representing it with finitely many wavelet coefficients, effectively capturing its local features, while quantile regression is robust to response outliers and heavy-tailed errors and, compared with other methods, provides a more complete picture of how responses change conditional on covariates. Regularization removes small wavelet coefficients to achieve sparsity and efficiency. An algorithm based on the Alternating Direction Method of Multipliers (ADMM) is derived to solve the optimization problems. We conduct numerical studies to investigate the finite-sample performance of our method and apply it to real data from ADHD studies.
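The wavelet-representation idea can be illustrated with a single level of the Haar transform, the simplest orthonormal wavelet: pairwise averages capture the coarse shape, pairwise differences capture the local features, and the signal is recovered exactly. None of the paper's regularized quantile regression machinery appears in this sketch:

```python
import math

def haar_forward(x):
    """One level of the orthonormal Haar transform (len(x) must be even)."""
    s = math.sqrt(2.0)
    approx = [(a + b) / s for a, b in zip(x[0::2], x[1::2])]
    detail = [(a - b) / s for a, b in zip(x[0::2], x[1::2])]
    return approx, detail

def haar_inverse(approx, detail):
    """Exact reconstruction from approximation and detail coefficients."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / s, (a - d) / s])
    return x

signal = [4.0, 4.0, 4.0, 4.0, 9.0, 1.0, 4.0, 4.0]  # flat, with one local jump
approx, detail = haar_forward(signal)
print(detail)  # the jump shows up in exactly one detail coefficient
rec = haar_inverse(approx, detail)
assert all(abs(a - b) < 1e-12 for a, b in zip(signal, rec))
```

Because local features land in a few large coefficients while the rest stay near zero, penalising small coefficients (as the regularization step does) yields a sparse representation with little loss of information.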

19.
Most regression problems in practice require flexible semiparametric forms of the predictor for modelling the dependence of responses on covariates. Moreover, it is often necessary to add random effects accounting for overdispersion caused by unobserved heterogeneity or for correlation in longitudinal or spatial data. We present a unified approach for Bayesian inference via Markov chain Monte Carlo simulation in generalized additive and semiparametric mixed models. Different types of covariates, such as the usual covariates with fixed effects, metrical covariates with non-linear effects, unstructured random effects, trend and seasonal components in longitudinal data and spatial covariates, are all treated within the same general framework by assigning appropriate Markov random field priors with different forms and degrees of smoothness. We applied the approach in several case-studies and consulting cases, showing that the methods are also computationally feasible in problems with many covariates and large data sets. In this paper, we choose two typical applications.

20.
The multiple longitudinal outcomes collected in many clinical trials are often analyzed by multilevel item response theory (MLIRT) models. The normality assumption for the continuous outcomes in the MLIRT models can be violated due to skewness and/or outliers. Moreover, patients' follow-up may be stopped by some terminal events (e.g., death or dropout), which are dependent on the multiple longitudinal outcomes. We proposed a joint modeling framework based on the MLIRT model to account for three data features: skewness, outliers, and dependent censoring. Our method development was motivated by a clinical study for Parkinson's disease.
