期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Approximate bounded influence estimation for longitudinal data with outliers and measurement errors

Lang Wu Jin Qiu 《Journal of statistical planning and inference》2011,141(7):2321-2330

Mixed effects models or random effects models are popular for the analysis of longitudinal data. In practice, longitudinal data are often complex since there may be outliers in both the response and the covariates and there may be measurement errors. The likelihood method is a common approach for these problems but it can be computationally very intensive and sometimes may even be computationally infeasible. In this article, we consider approximate robust methods for nonlinear mixed effects models to simultaneously address outliers and measurement errors. The approximate methods are computationally very efficient. We show the consistency and asymptotic normality of the approximate estimates. The methods can also be extended to missing data problems. An example is used to illustrate the methods and a simulation is conducted to evaluate the methods. 相似文献

2.

Two-step and likelihood methods for joint models of longitudinal and survival data

Qian Ye 《统计学通讯:模拟与计算》2017,46(8):6019-6033

We compare the commonly used two-step methods and joint likelihood method for joint models of longitudinal and survival data via extensive simulations. The longitudinal models include LME, GLMM, and NLME models, and the survival models include Cox models and AFT models. We find that the full likelihood method outperforms the two-step methods for various joint models, but it can be computationally challenging when the dimension of the random effects in the longitudinal model is not small. We thus propose an approximate joint likelihood method which is computationally efficient. We find that the proposed approximation method performs well in the joint model context, and it performs better for more “continuous” longitudinal data. Finally, a real AIDS data example shows that patients with higher initial viral load or lower initial CD4 are more likely to drop out earlier during an anti-HIV treatment. 相似文献

3.

A new local estimation method for single index models for longitudinal data

Hongmei Lin Jianhong Shi Jicai Liu Yanghui Liu 《Journal of nonparametric statistics》2016,28(3):644-658

Single index models are natural extensions of linear models and overcome the so-called curse of dimensionality. They are very useful for longitudinal data analysis. In this paper, we develop a new efficient estimation procedure for single index models with longitudinal data, based on Cholesky decomposition and local linear smoothing method. Asymptotic normality for the proposed estimators of both the parametric and nonparametric parts will be established. Monte Carlo simulation studies show excellent finite sample performance. Furthermore, we illustrate our methods with a real data example. 相似文献

4.

Non-concave penalization in linear mixed-effect models and regularized selection of fixed effects

Abhik Ghosh Magne Thoresen 《AStA Advances in Statistical Analysis》2018,102(2):179-210

Mixed-effect models are very popular for analyzing data with a hierarchical structure. In medical applications, typical examples include repeated observations within subjects in a longitudinal design, patients nested within centers in a multicenter design. However, recently, due to the medical advances, the number of fixed-effect covariates collected from each patient can be quite large, e.g., data on gene expressions of each patient, and all of these variables are not necessarily important for the outcome. So, it is very important to choose the relevant covariates correctly for obtaining the optimal inference for the overall study. On the other hand, the relevant random effects will often be low-dimensional and pre-specified. In this paper, we consider regularized selection of important fixed-effect variables in linear mixed-effect models along with maximum penalized likelihood estimation of both fixed and random-effect parameters based on general non-concave penalties. Asymptotic and variable selection consistency with oracle properties are proved for low-dimensional cases as well as for high dimensionality of non-polynomial order of sample size (number of parameters is much larger than sample size). We also provide a suitable computationally efficient algorithm for implementation. Additionally, all the theoretical results are proved for a general non-convex optimization problem that applies to several important situations well beyond the mixed model setup (like finite mixture of regressions) illustrating the huge range of applicability of our proposal. 相似文献

5.

Fast regression surrogates for computer models with time-dependent outputs

Dorin Drignei Dalia Eugenia Popescu 《Journal of Statistical Computation and Simulation》2013,83(6):1058-1067

The study of physical processes is often aided by computer models or codes. Computer models that simulate such processes are sometimes computationally intensive and therefore not very efficient exploratory tools. In this paper, we address computer models characterized by temporal dynamics and propose new statistical correlation structures aimed at modelling their time dependence. These correlations are embedded in regression models with input-dependent design matrix and input-correlated errors that act as fast statistical surrogates for the computationally intensive dynamical codes. The methods are illustrated with an automotive industry application involving a road load data acquisition computer model. 相似文献

6.

The Effect of Drop-Out on the Efficiency of Longitudinal Experiments 总被引：1，自引：0，他引：1

Geert Verbeke & Emmanuel Lesaffre 《Journal of the Royal Statistical Society. Series C, Applied statistics》1999,48(3):363-375

It is shown that drop-out often reduces the efficiency of longitudinal experiments considerably. In the framework of linear mixed models, a general, computationally simple method is provided, for designing longitudinal studies when drop-out is to be expected, such that there is little risk of large losses of efficiency due to the missing data. All the results are extensively illustrated using data from a randomized experiment with rats. 相似文献

7.

Quantile regression-based Bayesian semiparametric mixed-effects models for longitudinal data with non-normal,missing and mismeasured covariate

《Journal of Statistical Computation and Simulation》2012,82(6):1183-1202

Quantile regression (QR) models have received increasing attention recently for longitudinal data analysis. When continuous responses appear non-centrality due to outliers and/or heavy-tails, commonly used mean regression models may fail to produce efficient estimators, whereas QR models may perform satisfactorily. In addition, longitudinal outcomes are often measured with non-normality, substantial errors and non-ignorable missing values. When carrying out statistical inference in such data setting, it is important to account for the simultaneous treatment of these data features; otherwise, erroneous or even misleading results may be produced. In the literature, there has been considerable interest in accommodating either one or some of these data features. However, there is relatively little work concerning all of them simultaneously. There is a need to fill up this gap as longitudinal data do often have these characteristics. Inferential procedure can be complicated dramatically when these data features arise in longitudinal response and covariate outcomes. In this article, our objective is to develop QR-based Bayesian semiparametric mixed-effects models to address the simultaneous impact of these multiple data features. The proposed models and method are applied to analyse a longitudinal data set arising from an AIDS clinical study. Simulation studies are conducted to assess the performance of the proposed method under various scenarios. 相似文献

8.

Improved estimation procedures for multilevel models with binary response: a case-study 总被引：2，自引：1，他引：1

Germán Rodríguez & Noreen Goldman 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2001,164(2):339-355

During recent years, analysts have been relying on approximate methods of inference to estimate multilevel models for binary or count data. In an earlier study of random-intercept models for binary outcomes we used simulated data to demonstrate that one such approximation, known as marginal quasi-likelihood, leads to a substantial attenuation bias in the estimates of both fixed and random effects whenever the random effects are non-trivial. In this paper, we fit three-level random-intercept models to actual data for two binary outcomes, to assess whether refined approximation procedures, namely penalized quasi-likelihood and second-order improvements to marginal and penalized quasi-likelihood, also underestimate the underlying parameters. The extent of the bias is assessed by two standards of comparison: exact maximum likelihood estimates, based on a Gauss–Hermite numerical quadrature procedure, and a set of Bayesian estimates, obtained from Gibbs sampling with diffuse priors. We also examine the effectiveness of a parametric bootstrap procedure for reducing the bias. The results indicate that second-order penalized quasi-likelihood estimates provide a considerable improvement over the other approximations, but all the methods of approximate inference result in a substantial underestimation of the fixed and random effects when the random effects are sizable. We also find that the parametric bootstrap method can eliminate the bias but is computationally very intensive. 相似文献

9.

Least squares variogram fitting by spatial subsampling

Yoon Dong Lee Soumendra N. Lahiri 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2002,64(4):837-854

Summary. Least squares methods are popular for fitting valid variogram models to spatial data. The paper proposes a new least squares method based on spatial subsampling for variogram model fitting. We show that the method proposed is statistically efficient among a class of least squares methods, including the generalized least squares method. Further, it is computationally much simpler than the generalized least squares method. The method produces valid variogram estimators under very mild regularity conditions on the underlying random field and may be applied with different choices of the generic variogram estimator without analytical calculation. An extension of the method proposed to a class of spatial regression models is illustrated with a real data example. Results from a simulation study on finite sample properties of the method are also reported. 相似文献

10.

Monte Carlo Likelihood Estimation for Three Multivariate Stochastic Volatility Models

Borus Jungbacker 《Econometric Reviews》2013,32(2-3):385-408

Estimating parameters in a stochastic volatility (SV) model is a challenging task. Among other estimation methods and approaches, efficient simulation methods based on importance sampling have been developed for the Monte Carlo maximum likelihood estimation of univariate SV models. This paper shows that importance sampling methods can be used in a general multivariate SV setting. The sampling methods are computationally efficient. To illustrate the versatility of this approach, three different multivariate stochastic volatility models are estimated for a standard data set. The empirical results are compared to those from earlier studies in the literature. Monte Carlo simulation experiments, based on parameter estimates from the standard data set, are used to show the effectiveness of the importance sampling methods. 相似文献

11.

Monte Carlo Likelihood Estimation for Three Multivariate Stochastic Volatility Models

Borus Jungbacker Siem Jan Koopman 《Econometric Reviews》2006,25(2):385-408

Estimating parameters in a stochastic volatility (SV) model is a challenging task. Among other estimation methods and approaches, efficient simulation methods based on importance sampling have been developed for the Monte Carlo maximum likelihood estimation of univariate SV models. This paper shows that importance sampling methods can be used in a general multivariate SV setting. The sampling methods are computationally efficient. To illustrate the versatility of this approach, three different multivariate stochastic volatility models are estimated for a standard data set. The empirical results are compared to those from earlier studies in the literature. Monte Carlo simulation experiments, based on parameter estimates from the standard data set, are used to show the effectiveness of the importance sampling methods. 相似文献

12.

Kernel-based global MLE of partial linear random effects models for longitudinal data

Lei Liu 《Journal of nonparametric statistics》2017,29(3):615-635

Random effects models have been playing a critical role for modelling longitudinal data. However, there are little studies on the kernel-based maximum likelihood method for semiparametric random effects models. In this paper, based on kernel and likelihood methods, we propose a pooled global maximum likelihood method for the partial linear random effects models. The pooled global maximum likelihood method employs the local approximations of the nonparametric function at a group of grid points simultaneously, instead of one point. Gaussian quadrature is used to approximate the integration of likelihood with respect to random effects. The asymptotic properties of the proposed estimators are rigorously studied. Simulation studies are conducted to demonstrate the performance of the proposed approach. We also apply the proposed method to analyse correlated medical costs in the Medical Expenditure Panel Survey data set. 相似文献

13.

Estimation of parameters in incomplete data models defined by dynamical systems

Sophie Donnet Adeline Samson 《Journal of statistical planning and inference》2007

Parametric incomplete data models defined by ordinary differential equations (ODEs) are widely used in biostatistics to describe biological processes accurately. Their parameters are estimated on approximate models, whose regression functions are evaluated by a numerical integration method. Accurate and efficient estimations of these parameters are critical issues. This paper proposes parameter estimation methods involving either a stochastic approximation EM algorithm (SAEM) in the maximum likelihood estimation, or a Gibbs sampler in the Bayesian approach. Both algorithms involve the simulation of non-observed data with conditional distributions using Hastings–Metropolis (H–M) algorithms. A modified H–M algorithm, including an original local linearization scheme to solve the ODEs, is proposed to reduce the computational time significantly. The convergence on the approximate model of all these algorithms is proved. The errors induced by the numerical solving method on the conditional distribution, the likelihood and the posterior distribution are bounded. The Bayesian and maximum likelihood estimation methods are illustrated on a simulated pharmacokinetic nonlinear mixed-effects model defined by an ODE. Simulation results illustrate the ability of these algorithms to provide accurate estimates. 相似文献

14.

Different methods for handling incomplete longitudinal binary outcome due to missing at random dropout

《Statistical Methodology》2015

This paper compares the performance of weighted generalized estimating equations (WGEEs), multiple imputation based on generalized estimating equations (MI-GEEs) and generalized linear mixed models (GLMMs) for analyzing incomplete longitudinal binary data when the underlying study is subject to dropout. The paper aims to explore the performance of the above methods in terms of handling dropouts that are missing at random (MAR). The methods are compared on simulated data. The longitudinal binary data are generated from a logistic regression model, under different sample sizes. The incomplete data are created for three different dropout rates. The methods are evaluated in terms of bias, precision and mean square error in case where data are subject to MAR dropout. In conclusion, across the simulations performed, the MI-GEE method performed better in both small and large sample sizes. Evidently, this should not be seen as formal and definitive proof, but adds to the body of knowledge about the methods’ relative performance. In addition, the methods are compared using data from a randomized clinical trial. 相似文献

15.

Inference methods for saturated models in longitudinal clinical trials with incomplete binary data

Song JX 《Pharmaceutical statistics》2006,5(4):295-304

In the longitudinal studies with binary response, it is often of interest to estimate the percentage of positive responses at each time point and the percentage of having at least one positive response by each time point. When missing data exist, the conventional method based on observed percentages could result in erroneous estimates. This study demonstrates two methods of using expectation-maximization (EM) and data augmentation (DA) algorithms in the estimation of the marginal and cumulative probabilities for incomplete longitudinal binary response data. Both methods provide unbiased estimates when the missingness mechanism is missing at random (MAR) assumption. Sensitivity analyses have been performed for cases when the MAR assumption is in question. 相似文献

16.

Robust logistic regression of family data in the presence of missing genotypes

Yanping Qiu 《Journal of applied statistics》2019,46(5):926-945

Large cohort studies are commonly launched to study the risk effect of genetic variants or other risk factors on a chronic disorder. In these studies, family data are often collected to provide additional information for the purpose of improving the inference results. Statistical analysis of the family data can be very challenging due to the missing observations of genotypes, incomplete records of disease occurrences in family members, and the complicated dependence attributed to the shared genetic background and environmental factors. In this article, we investigate a class of logistic models with family-shared random effects to tackle these challenges, and develop a robust regression method based on the conditional logistic technique for statistical inference. An expectation–maximization (EM) algorithm with fast computation speed is developed to handle the missing genotypes. The proposed estimators are shown to be consistent and asymptotically normal. Additionally, a score test based on the proposed method is derived to test the genetic effect. Extensive simulation studies demonstrate that the proposed method performs well in finite samples in terms of estimate accuracy, robustness and computational speed. The proposed procedure is applied to an Alzheimer's disease study. 相似文献

17.

Two-Stage Estimation in Copula Models Used in Family Studies

Andersen EW 《Lifetime data analysis》2005,11(3):333-350

In this paper register based family studies provide the motivation for studying a two-stage estimation procedure in copula models for multivariate failure time data. The asymptotic properties of the estimators in both parametric and semi-parametric models are derived, generalising the approach by Shih and Louis (Biometrics vol. 51, pp. 1384–1399, 1995b) and Glidden (Lifetime Data Analysis vol. 6, pp. 141–156, 2000). Because register based family studies often involve very large cohorts a method for analysing a sampled cohort is also derived together with the asymptotic properties of the estimators. The proposed methods are studied in simulations and the estimators are found to be highly efficient. Finally, the methods are applied to a study of mortality in twins. 相似文献

18.

Empirical characteristic function tests for GARCH innovation distribution using multipliers

María Dolores Jiménez-Gamero Juan Carlos Pardo-Fernández 《Journal of Statistical Computation and Simulation》2017,87(10):2069-2093

Goodness-of-fit tests for the innovation distribution in GARCH models based on measuring deviations between the empirical characteristic function of the residuals and the characteristic function under the null hypothesis have been proposed in the literature. The asymptotic distributions of these test statistics depend on unknown quantities, so their null distributions are usually estimated through parametric bootstrap (PB). Although easy to implement, the PB can become very computationally expensive for large sample sizes, which is typically the case in applications of these models. This work proposes to approximate the null distribution through a weighted bootstrap. The procedure is studied both theoretically and numerically. Its asymptotic properties are similar to those of the PB, but, from a computational point of view, it is more efficient. 相似文献

19.

含非随机缺失数据的面板数据参数估计方法

于力超金勇进《统计研究》2016,33(1):95-102

抽样调查领域常采用对多个受访者进行跟踪调查得到面板数据,进而对总体特性进行统计推断,在面板数据中常含缺失数据,大多数处理面板缺失数据的软件都是直接删去含缺失值的受访者以得到完全数据集,当数据缺失机制为非随机缺失时会导致总体参数估计结果有偏。本文针对数据缺失机制为非随机缺失情形下,如何对面板数据进行统计分析进行了阐述,主要采用的是基于模型的似然推断法,对目标变量、缺失指示变量和随机效应向量的联合分布建模,在已有选择模型和模式混合模型的基础上,引入随机效应,研究目标变量期望的计算方法,并研究随机效应杂合模型下参数的估计方法,在变量分布相对简单的情形下给出了用极大似然法推断总体参数的估计步骤,最后通过模拟分析比较方法的优劣。相似文献

20.

Simulated maximum likelihood estimation in joint models for multiple longitudinal markers and recurrent events of multiple types,in the presence of a terminal event

M. H. Hof J. Z. Musoro R. B. Geskus G. H. Struijk I. J. M. ten Berge A. H. Zwinderman 《Journal of applied statistics》2017,44(15):2756-2777

In medical studies we are often confronted with complex longitudinal data. During the follow-up period, which can be ended prematurely by a terminal event (e.g. death), a subject can experience recurrent events of multiple types. In addition, we collect repeated measurements from multiple markers. An adverse health status, represented by ‘bad’ marker values and an abnormal number of recurrent events, is often associated with the risk of experiencing the terminal event. In this situation, the missingness of the data is not at random and, to avoid bias, it is necessary to model all data simultaneously using a joint model. The correlations between the repeated observations of a marker or an event type within an individual are captured by normally distributed random effects. Because the joint likelihood contains an analytically intractable integral, Bayesian approaches or quadrature approximation techniques are necessary to evaluate the likelihood. However, when the number of recurrent event types and markers is large, the dimensionality of the integral is high and these methods are too computationally expensive. As an alternative, we propose a simulated maximum-likelihood approach based on quasi-Monte Carlo integration to evaluate the likelihood of joint models with multiple recurrent event types and markers. 相似文献