1.

The impact of dichotomization in longitudinal data analysis: a simulation study

Bongin Yoo 《Pharmaceutical statistics》2010,9(4):298-312

In this paper, a simulation study is conducted to systematically investigate the impact of dichotomizing longitudinal continuous outcome variables under various types of missing data mechanisms. Generalized linear models (GLM) with standard generalized estimating equations (GEE) are widely used for longitudinal outcome analysis, but these semi‐parametric approaches are only valid when data are missing completely at random (MCAR). Alternatively, weighted GEE (WGEE) and multiple imputation GEE (MI‐GEE) were developed to ensure validity under missing at random (MAR). Using a simulation study, the performance of standard GEE, WGEE and MI‐GEE on incomplete longitudinal dichotomized outcome analysis is evaluated. For comparison, likelihood‐based linear mixed effects models (LMM) are used for incomplete longitudinal original continuous outcome analysis. Focusing on dichotomized outcome analysis, MI‐GEE with an imputation procedure applied to the original continuous missing data provides well controlled test sizes and more stable power estimates than any other GEE‐based approach. It is also shown that dichotomizing a longitudinal continuous outcome results in a substantial loss of power compared with LMM. Copyright © 2009 John Wiley & Sons, Ltd.
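The power loss from dichotomization can be illustrated with a much simpler setting than the one studied in the paper. The sketch below is an assumption-laden toy: it uses a single time point, complete data, two groups, and large-sample z-tests rather than GEE or LMM. The function names, sample size, effect size and cutoff are all hypothetical choices made for illustration.

```python
import math
import random
import statistics


def z_stat_means(a, b):
    """Two-sample z statistic comparing the means of the continuous outcome."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(b) - statistics.mean(a)) / se


def z_stat_props(a, b, cutoff):
    """Two-sample z statistic comparing proportions above the cutoff."""
    pa = sum(x > cutoff for x in a) / len(a)
    pb = sum(x > cutoff for x in b) / len(b)
    pooled = (pa * len(a) + pb * len(b)) / (len(a) + len(b))
    se = math.sqrt(pooled * (1 - pooled) * (1 / len(a) + 1 / len(b)))
    return (pb - pa) / se if se > 0 else 0.0


def simulate_power(n=50, effect=0.5, cutoff=0.0, reps=500, seed=1):
    """Empirical power of the continuous and the dichotomized analysis."""
    rng = random.Random(seed)
    hits_cont = hits_dich = 0
    for _ in range(reps):
        control = [rng.gauss(0.0, 1.0) for _ in range(n)]
        treated = [rng.gauss(effect, 1.0) for _ in range(n)]
        # Reject at the two-sided 5% level using the normal critical value.
        hits_cont += abs(z_stat_means(control, treated)) > 1.96
        hits_dich += abs(z_stat_props(control, treated, cutoff)) > 1.96
    return hits_cont / reps, hits_dich / reps


if __name__ == "__main__":
    power_cont, power_dich = simulate_power()
    print(f"continuous: {power_cont:.2f}, dichotomized: {power_dich:.2f}")
```

Under these assumptions the dichotomized analysis rejects noticeably less often than the continuous one, which is the qualitative pattern the abstract reports; the magnitudes here say nothing about the longitudinal, incomplete-data setting of the paper.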

2.

Estimation of regression vectors in linear mixed models with Dirichlet process random effects

Chen Li George Casella Malay Ghosh 《Communications in Statistics - Theory and Methods》2018,47(16):3935-3954

The Dirichlet process has been used extensively in Bayesian nonparametric modeling, and has proven to be very useful. In particular, mixed models with Dirichlet process random effects have been used in modeling many types of data and can often outperform their normal random effect counterparts. Here we examine the linear mixed model with Dirichlet process random effects from a classical view, and derive the best linear unbiased estimator (BLUE) of the fixed effects. We are also able to calculate the resulting covariance matrix and find that the covariance is directly related to the precision parameter of the Dirichlet process, giving a new interpretation of this parameter. We also characterize the relationship between the BLUE and the ordinary least-squares (OLS) estimator and show how confidence intervals can be approximated.

3.

Covariate Decomposition Methods for Longitudinal Missing‐at‐Random Data and Predictors Associated with Subject‐Specific Effects

John M. Neuhaus Charles E. McCulloch 《Australian & New Zealand Journal of Statistics》2014,56(4):331-345

Investigators often gather longitudinal data to assess changes in responses over time within subjects and to relate these changes to within‐subject changes in predictors. Missing data are common in such studies and predictors can be correlated with subject‐specific effects. Maximum likelihood methods for generalized linear mixed models provide consistent estimates when the data are ‘missing at random’ (MAR) but can produce inconsistent estimates in settings where the random effects are correlated with one of the predictors. On the other hand, conditional maximum likelihood methods (and closely related maximum likelihood methods that partition covariates into between‐ and within‐cluster components) provide consistent estimation when random effects are correlated with predictors but can produce inconsistent covariate effect estimates when data are MAR. Using theory, simulation studies, and fits to example data, this paper shows that decomposition methods using complete covariate information produce consistent estimates. In some practical cases these methods, which ostensibly require complete covariate information, actually involve only the observed covariates. These results offer an easy‐to‐use approach to simultaneously protect against bias from both cluster‐level confounding and MAR missingness in assessments of change.

4.

Estimation Bias in Complete-Case Analysis in Crossover Studies with Missing Data

Fang Liu 《Communications in Statistics - Theory and Methods》2013,42(5):812-827

Crossover designs are often used in clinical trials. It is not uncommon for subjects to discontinue before completing all treatment periods in a crossover study. Despite the availability of statistical methodologies that utilize all available data and of software for obtaining valid inferences under the assumption of missing at random (MAR), naïve approaches such as the complete case (CC) analysis, which is only valid under the strong assumption of missing completely at random (MCAR), are still widely used in practice. In this article, we obtain the analytical form of the estimation bias of treatment effects with CC for linear mixed models. We use simulation studies to examine the inflation of Type I error and efficiency loss in the inferences with CC under MAR. Invalidity and inefficiency of two other commonly used approaches for defining analyzed data in the presence of missing data, including data from at least two periods in a three‐period crossover and available cases for a specific comparison of interest, are also demonstrated through simulation studies.
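The difference between the two missingness assumptions can be made concrete with a small simulation. The following sketch is not the paper's crossover design: it assumes just two correlated measurements per subject with true means of zero, a first measurement that is always observed, and a hypothetical dropout rule for the second. The function name and all parameter values are illustrative assumptions.

```python
import random
import statistics


def complete_case_bias(mechanism, n=20000, rho=0.7, seed=7):
    """Bias of the complete-case mean of y2 (true mean 0) under MCAR or MAR."""
    rng = random.Random(seed)
    observed_y2 = []
    for _ in range(n):
        y1 = rng.gauss(0.0, 1.0)
        # y2 has mean 0, variance 1 and correlation rho with y1.
        y2 = rho * y1 + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        if mechanism == "MCAR":
            p_observe = 0.55                    # unrelated to any data
        elif mechanism == "MAR":
            p_observe = 0.9 if y1 > 0 else 0.2  # depends only on the observed y1
        else:
            raise ValueError("mechanism must be 'MCAR' or 'MAR'")
        if rng.random() < p_observe:
            observed_y2.append(y2)
    # The true mean of y2 is 0, so the complete-case mean is itself the bias.
    return statistics.mean(observed_y2)


if __name__ == "__main__":
    for mechanism in ("MCAR", "MAR"):
        print(mechanism, round(complete_case_bias(mechanism), 3))
```

In this toy setup the complete-case mean stays close to zero under MCAR but is pulled upward under MAR, because subjects with a high first measurement are both more likely to remain and more likely to have a high second measurement.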

5.

A Semiparametric Regression Model for Longitudinal Data with Non‐stationary Errors

Rui Li Chenlei Leng Jinhong You 《Scandinavian Journal of Statistics》2017,44(4):932-950

Motivated by the need to analyze the National Longitudinal Surveys data, we propose a new semiparametric longitudinal mean‐covariance model in which the effects on the dependent variable of some explanatory variables are linear and others are non‐linear, while the within‐subject correlations are modelled by a non‐stationary autoregressive error structure. We develop an estimation machinery based on the least squares technique by approximating non‐parametric functions via B‐spline expansions and establish the asymptotic normality of parametric estimators as well as the rate of convergence for the non‐parametric estimators. We further advocate a new model selection strategy in the varying‐coefficient model framework, for distinguishing whether a component is significant and subsequently whether it is linear or non‐linear. In addition, the proposed method can be employed for identifying the true order of lagged terms consistently. Monte Carlo studies are conducted to examine the finite sample performance of our approach, and an application to real data is also illustrated.

6.

A test of the missing data mechanism for repeated measures data

Taesung Park Seungyeoun Lee Robert F. Woolson 《Communications in Statistics - Theory and Methods》2013,42(10):2813-2829

The occurrence of missing data is an often unavoidable consequence of repeated measures studies. Fortunately, multivariate general linear models such as growth curve models and linear mixed models with random effects have been well developed to analyze incomplete normally-distributed repeated measures data. Most statistical methods have assumed that the missing data occur at random. This assumption may include two types of missing data mechanism: missing completely at random (MCAR) and missing at random (MAR) in the sense of Rubin (1976). In this paper, we develop a test procedure for distinguishing these two types of missing data mechanism for incomplete normally-distributed repeated measures data. The proposed test is similar in spirit to the test of Park and Davis (1992). We derive the test for incomplete normally-distributed repeated measures data using linear mixed models, while Park and Davis (1992) derived the test for incomplete repeated categorical data in the framework of Grizzle, Starmer, and Koch (1969). The proposed procedure can be applied easily to any other multivariate general linear model which allows for missing data. The test is illustrated using the hip-replacement patient data from Crowder and Hand (1990).

7.

Empirical Bayes estimates for correlated hierarchical data with overdispersion

Samuel Iddi Geert Molenberghs Mehreteab Aregay George Kalema 《Pharmaceutical statistics》2014,13(5):316-326

An extension of the generalized linear mixed model was constructed to simultaneously accommodate overdispersion and hierarchies present in longitudinal or clustered data. This so‐called combined model includes conjugate random effects at observation level for overdispersion and normal random effects at subject level to handle correlation, respectively. A variety of data types can be handled in this way, using different members of the exponential family. Both maximum likelihood and Bayesian estimation for covariate effects and variance components were proposed. The focus of this paper is the development of an estimation procedure for the two sets of random effects. These are necessary when making predictions for future responses or their associated probabilities. Such (empirical) Bayes estimates will also be helpful in model diagnosis, both when checking the fit of the model as well as when investigating outlying observations. The proposed procedure is applied to three datasets of different outcome types. Copyright © 2014 John Wiley & Sons, Ltd.

8.

Non‐stationary Cross‐Covariance Models for Multivariate Processes on a Globe

MIKYOUNG JUN 《Scandinavian Journal of Statistics》2011,38(4):726-747

Abstract. In geophysical and environmental problems, it is common to have multiple variables of interest measured at the same location and time. These multiple variables typically have dependence over space (and/or time). As a consequence, there is a growing interest in developing models for multivariate spatial processes, in particular, the cross‐covariance models. On the other hand, many data sets these days cover a large portion of the Earth such as satellite data, which require valid covariance models on a globe. We present a class of parametric covariance models for multivariate processes on a globe. The covariance models are flexible in capturing non‐stationarity in the data yet computationally feasible and require moderate numbers of parameters. We apply our covariance model to surface temperature and precipitation data from an NCAR climate model output. We compare our model to the multivariate version of the Matérn cross‐covariance function and models based on coregionalization and demonstrate the superior performance of our model in terms of AIC (and/or maximum loglikelihood values) and predictive skill. We also present some challenges in modelling the cross‐covariance structure of the temperature and precipitation data. Based on the fitted results using full data, we give the estimated cross‐correlation structure between the two variables.

9.

A test of separate hypotheses for comparing linear mixed models with non nested fixed effects

Ché L. Smith Lloyd J. Edwards 《Communications in Statistics - Theory and Methods》2017,46(11):5487-5500

As researchers increasingly rely on linear mixed models to characterize longitudinal data, there is a need for improved techniques for selecting among this class of models, which requires specification of both fixed and random effects via a mean model and variance-covariance structure. The process is further complicated when fixed and/or random effects are non nested between models. This paper explores the development of a hypothesis test to compare non nested linear mixed models based on extensions of the work begun by Sir David Cox. We assess the robustness of this approach for comparing models containing correlated measures of body fat for predicting longitudinal cardiometabolic risk.

10.

Nonlinear mixed‐effects models with misspecified random‐effects distribution

Reza Drikvandi 《Pharmaceutical statistics》2020,19(3):187-201

Nonlinear mixed‐effects models are being widely used for the analysis of longitudinal data, especially from pharmaceutical research. They use random effects which are latent and unobservable variables so the random‐effects distribution is subject to misspecification in practice. In this paper, we first study the consequences of misspecifying the random‐effects distribution in nonlinear mixed‐effects models. Our study is focused on Gauss‐Hermite quadrature, which is now the routine method for calculation of the marginal likelihood in mixed models. We then present a formal diagnostic test to check the appropriateness of the assumed random‐effects distribution in nonlinear mixed‐effects models, which is very useful for real data analysis. Our findings show that the estimates of fixed‐effects parameters in nonlinear mixed‐effects models are generally robust to deviations from normality of the random‐effects distribution, but the estimates of variance components are very sensitive to the distributional assumption of random effects. Furthermore, a misspecified random‐effects distribution will either overestimate or underestimate the predictions of random effects. We illustrate the results using a real data application from an intensive pharmacokinetic study.
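To show how Gauss‐Hermite quadrature turns a marginal likelihood integral into a weighted sum, here is a minimal sketch. It assumes a random‐intercept logistic model as a stand‐in for the nonlinear models in the paper, and it hard-codes the 3‐point rule (nodes 0 and ±√(3/2), weights (2/3)√π and (1/6)√π) purely to stay dependency-free; practical software uses more nodes and often adaptive versions. All function names are hypothetical.

```python
import math

# 3-point Gauss-Hermite rule for integrals of the form
#   integral of exp(-x^2) * g(x) dx  ~  sum_i w_i * g(x_i)
GH_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH_WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)


def normal_expectation(f, sigma=1.0):
    """Approximate E[f(b)] for b ~ N(0, sigma^2) using b = sqrt(2) * sigma * x."""
    total = sum(w * f(math.sqrt(2.0) * sigma * x) for x, w in zip(GH_NODES, GH_WEIGHTS))
    return total / math.sqrt(math.pi)


def marginal_likelihood(y, eta_fixed, sigma):
    """Marginal likelihood of one subject's binary responses.

    y         : list of 0/1 responses for the subject
    eta_fixed : fixed-effects linear predictor for each response
    sigma     : standard deviation of the normal random intercept
    """
    def conditional(b):
        # Likelihood of the responses given the random intercept b.
        lik = 1.0
        for yi, eta in zip(y, eta_fixed):
            p = 1.0 / (1.0 + math.exp(-(eta + b)))
            lik *= p if yi == 1 else 1.0 - p
        return lik

    return normal_expectation(conditional, sigma)


if __name__ == "__main__":
    print(marginal_likelihood([1, 0, 1], [0.2, -0.1, 0.4], sigma=1.0))
```

The rule integrates polynomials up to degree five exactly against the normal density, so it recovers the second and fourth normal moments; for the logistic integrand it is only an approximation whose quality depends on the number of nodes and on how well the assumed normal distribution matches the true random‐effects distribution.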

11.

Multiple imputation methods for recurrent event data with missing event category

Douglas E. Schaubel Jianwen Cai 《Revue canadienne de statistique》2006,34(4):677-692

Frequently in clinical and epidemiologic studies, the event of interest is recurrent (i.e., can occur more than once per subject). When the events are not of the same type, an analysis which accounts for the fact that events fall into different categories will often be more informative. Often, however, although event times may always be known, information through which events are categorized may potentially be missing. Complete‐case methods (whose application may require, for example, that events be censored when their category cannot be determined) are valid only when event categories are missing completely at random. This assumption is rather restrictive. The authors propose two multiple imputation methods for analyzing multiple‐category recurrent event data under the proportional means/rates model. The use of a proper or improper imputation technique distinguishes the two approaches. Both methods lead to consistent estimation of regression parameters even when the missingness of event categories depends on covariates. The authors derive the asymptotic properties of the estimators and examine their behaviour in finite samples through simulation. They illustrate their approach using data from an international study on dialysis.

12.

Longitudinal conditional models with intermittent missingness: SAS code and applications

《Journal of Statistical Computation and Simulation》2012,82(4):753-780

This work provides a set of macros written in SAS (Statistical Analysis System) for Windows, which can be used to fit conditional models under intermittent missingness in longitudinal data. A formalized transition model, including random effects for individuals and measurement error, is presented. Model fitting is based on the missing completely at random or missing at random assumptions, and the separability condition. The problem translates to maximization of the marginal observed data density only, which for Gaussian data is again Gaussian, meaning that the likelihood can be expressed in terms of the mean and covariance matrix of the observed data vector. A simulation study is presented and misspecification issues are considered. A practical application is also given, where conditional models are fitted to the data from a clinical trial that assessed the effect of a Cuban medicine on a disease of the respiratory system.

13.

Handling of missing data in long‐term clinical trials: a case study

Mark Janssens Geert Molenberghs René Kerstens 《Pharmaceutical statistics》2012,11(6):442-448

Missing data in clinical trials is a well‐known problem, and the classical statistical methods used can be overly simple. This case study shows how well‐established missing data theory can be applied to efficacy data collected in a long‐term open‐label trial with a discontinuation rate of almost 50%. Satisfaction with treatment in chronically constipated patients was the efficacy measure assessed at baseline and every 3 months postbaseline. The improvement in treatment satisfaction from baseline was originally analyzed with a paired t‐test ignoring missing data and discarding the correlation structure of the longitudinal data. As the original analysis started from missing completely at random assumptions regarding the missing data process, the satisfaction data were re‐examined, and several missing at random (MAR) and missing not at random (MNAR) techniques resulted in adjusted estimates for the improvement in satisfaction over 12 months. Throughout the different sensitivity analyses, the effect sizes remained significant and clinically relevant. Thus, even for an open‐label trial design, sensitivity analysis, with different assumptions for the nature of dropouts (MAR or MNAR) and with different classes of models (selection, pattern‐mixture, or multiple imputation models), has been found useful and provides evidence towards the robustness of the original analyses; additional sensitivity analyses could be undertaken to further qualify robustness. Copyright © 2012 John Wiley & Sons, Ltd.

14.

Modelling Survival Events with Longitudinal Covariates Measured with Error

Hongsheng Dai Jianxin Pan Yanchun Bao 《Communications in Statistics - Theory and Methods》2013,42(21):3819-3837

In survival analysis, time-dependent covariates are usually present as longitudinal data collected periodically and measured with error. The longitudinal data can be assumed to follow a linear mixed effect model and Cox regression models may be used for modelling of survival events. The hazard rate of survival times depends on the underlying time-dependent covariate measured with error, which may be described by random effects. Most existing methods proposed for such models make a parametric distributional assumption on the random effects and specify a normally distributed error term for the linear mixed effect model. These assumptions may not always be valid in practice. In this article, we propose a new likelihood method for Cox regression models with error-contaminated time-dependent covariates. The proposed method does not require any parametric distributional assumption on random effects and random errors. Asymptotic properties for parameter estimators are provided. Simulation results show that under certain situations the proposed methods are more efficient than the existing methods.

15.

Kernel‐based Generalized Cross‐validation in Non‐parametric Mixed‐effect Models

WANGLI XU LIXING ZHU 《Scandinavian Journal of Statistics》2009,36(2):229-247

Abstract. Although generalized cross‐validation (GCV) has been frequently applied to select bandwidth when kernel methods are used to estimate non‐parametric mixed‐effect models in which non‐parametric mean functions are used to model covariate effects, and additive random effects are applied to account for overdispersion and correlation, the optimality of the GCV has not yet been explored. In this article, we construct a kernel estimator of the non‐parametric mean function. An equivalence between the kernel estimator and a weighted least square type estimator is provided, and the optimality of the GCV‐based bandwidth is investigated. The theoretical derivations also show that kernel‐based and spline‐based GCV give very similar asymptotic results. This provides us with a solid base to use kernel estimation for mixed‐effect models. Simulation studies are undertaken to investigate the empirical performance of the GCV. A real data example is analysed for illustration.
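As a rough illustration of GCV‐based bandwidth selection, the sketch below applies the standard GCV criterion, GCV(h) = (RSS/n) / (1 − tr(S_h)/n)², to a Nadaraya‐Watson smoother with a Gaussian kernel. It is a simplification under stated assumptions: errors are treated as independent and no random effects are included, so it is not the estimator constructed in the paper. Function names and the bandwidth grid are hypothetical.

```python
import math


def smoother_matrix(x, h):
    """Rows of the Nadaraya-Watson smoother matrix S_h with a Gaussian kernel."""
    rows = []
    for xi in x:
        kernel = [math.exp(-0.5 * ((xi - xj) / h) ** 2) for xj in x]
        total = sum(kernel)
        rows.append([k / total for k in kernel])
    return rows


def gcv_score(x, y, h):
    """Generalized cross-validation score for bandwidth h."""
    n = len(x)
    s = smoother_matrix(x, h)
    fitted = [sum(sij * yj for sij, yj in zip(row, y)) for row in s]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    trace = sum(s[i][i] for i in range(n))
    # The score is undefined when trace == n (pure interpolation), so the
    # grid should avoid bandwidths far below the spacing of the x values.
    return (rss / n) / (1.0 - trace / n) ** 2


def select_bandwidth(x, y, grid):
    """Return the bandwidth in the grid with the smallest GCV score."""
    return min(grid, key=lambda h: gcv_score(x, y, h))
```

A typical use is to evaluate `select_bandwidth` over a modest grid of candidate bandwidths on the scale of the covariate; very small bandwidths make the smoother interpolate the data and should be excluded from the grid.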

16.

Empirical evaluation of the implementation of the EMA guideline on missing data in confirmatory clinical trials: Specification of mixed models for longitudinal data in study protocols

Sebastian Häckl Armin Koch Florian Lasch 《Pharmaceutical statistics》2019,18(6):636-644

In confirmatory clinical trials, the prespecification of the primary analysis model is a universally accepted scientific principle to allow strict control of the type I error. Consequently, both the ICH E9 guideline and the European Medicines Agency (EMA) guideline on missing data in confirmatory clinical trials require that the primary analysis model is defined unambiguously. This requirement applies to mixed models for longitudinal data handling missing data implicitly. To evaluate the compliance with the EMA guideline, we evaluated the model specifications in those clinical study protocols from development phases II and III submitted between 2015 and 2018 to the Ethics Committee at Hannover Medical School under the German Medicinal Products Act, which planned to use a mixed model for longitudinal data in the confirmatory testing strategy. Overall, 39 trials from different types of sponsors and a wide range of therapeutic areas were evaluated. While nearly all protocols specify the fixed and random effects of the analysis model (95%), only 77% give the structure of the covariance matrix used for modeling the repeated measurements. Moreover, the testing method (36%), the estimation method (28%), the computation method (3%), and the fallback strategy (18%) are given by less than half the study protocols. Subgroup analyses indicate that these findings are universal and not specific to clinical trial phases or size of company. Altogether, our results show that guideline compliance is to various degrees poor and consequently, strict type I error rate control at the intended level is not guaranteed.

17.

Estimation of group means in generalized linear mixed models

Jiexin Duan Michael Levine Junxiang Luo Yongming Qu 《Pharmaceutical statistics》2020,19(5):646-661

In this study, we investigate the concept of the mean response for a treatment group as well as its estimation and prediction for generalized linear models with a subject‐wise random effect. Generalized linear models are commonly used to analyze categorical data. The model‐based mean for a treatment group usually estimates the response at the mean covariate. However, the mean response for the treatment group for the studied population is at least equally important in the context of clinical trials. New methods were proposed to estimate such a mean response in generalized linear models; however, this has only been done when there are no random effects in the model. We suggest that, in a generalized linear mixed model (GLMM), there are at least two possible definitions of a treatment group mean response that can serve as estimation/prediction targets. The estimation of these treatment group means is important for healthcare professionals to be able to understand the absolute benefit vs risk. For both of these treatment group means, we propose a new set of methods for estimating/predicting them in GLMMs with a univariate subject‐wise random effect. Our methods also suggest an easy way of constructing corresponding confidence and prediction intervals for both possible treatment group means. Simulations show that the proposed confidence and prediction intervals provide correct empirical coverage probability under most circumstances. The proposed methods have also been applied to analyze hypoglycemia data from diabetes clinical trials.

18.

Statistical Inference for Single‐index Panel Data Models

Liping Zhu Jinhong You Qunfang Xu 《Scandinavian Journal of Statistics》2014,41(3):830-843

We study estimation and hypothesis testing in single‐index panel data models with individual effects. Through regressing the individual effects on the covariates linearly, we convert the estimation problem in single‐index panel data models to that in partially linear single‐index models. The conversion is valid regardless of the individual effects being random or fixed. We propose an estimating equation approach, which has a desirable double robustness property. We show that our method is applicable in single‐index panel data models with heterogeneous link functions. We further design a chi‐squared test to evaluate whether the individual effects are random or fixed. We conduct simulations to demonstrate the finite sample performance of the method and conduct a data analysis to illustrate its usefulness.

19.

A general joint model for longitudinal measurements and competing risks survival data with heterogeneous random effects

Xin Huang Gang Li Robert M. Elashoff Jianxin Pan 《Lifetime data analysis》2011,17(1):80-100

This article studies a general joint model for longitudinal measurements and competing risks survival data. The model consists of a linear mixed effects sub-model for the longitudinal outcome, a proportional cause-specific hazards frailty sub-model for the competing risks survival data, and a regression sub-model for the variance–covariance matrix of the multivariate latent random effects based on a modified Cholesky decomposition. The model provides a useful approach to adjust for non-ignorable missing data due to dropout for the longitudinal outcome, enables analysis of the survival outcome with informative censoring and intermittently measured time-dependent covariates, as well as joint analysis of the longitudinal and survival outcomes. Unlike previously studied joint models, our model allows for heterogeneous random covariance matrices. It also offers a framework to assess the homogeneous covariance assumption of existing joint models. A Bayesian MCMC procedure is developed for parameter estimation and inference. Its performance and frequentist properties are investigated using simulations. A real data example is used to illustrate the usefulness of the approach.
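For readers unfamiliar with the modified Cholesky decomposition mentioned above, the following sketch computes it for a given covariance matrix: it finds a unit lower‐triangular matrix T and positive values D such that T Σ Tᵀ = diag(D). This shows only the decomposition itself; how the paper links the entries of T and D to covariates in its regression sub-model is not reproduced here. The function name is hypothetical and the input is assumed to be symmetric positive definite.

```python
def modified_cholesky(sigma):
    """Return (T, D) with T unit lower-triangular and T sigma T^T = diag(D).

    sigma is a symmetric positive definite matrix given as a list of lists.
    """
    n = len(sigma)
    # Step 1: LDL^T factorization, sigma = L diag(D) L^T with unit diagonal L.
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    D = [0.0] * n
    for j in range(n):
        D[j] = sigma[j][j] - sum(L[j][k] ** 2 * D[k] for k in range(j))
        if D[j] <= 0:
            raise ValueError("sigma must be positive definite")
        for i in range(j + 1, n):
            partial = sum(L[i][k] * L[j][k] * D[k] for k in range(j))
            L[i][j] = (sigma[i][j] - partial) / D[j]
    # Step 2: T is the inverse of L, obtained by forward substitution.
    T = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i):
            T[i][j] = -sum(L[i][k] * T[k][j] for k in range(j, i))
    return T, D


if __name__ == "__main__":
    sigma = [[4.0, 2.0, 1.0], [2.0, 3.0, 0.5], [1.0, 0.5, 2.0]]
    T, D = modified_cholesky(sigma)
    print(T)
    print(D)
```

One reason this parameterization is attractive for covariance regression is that the below-diagonal entries of T are unconstrained and the entries of D only need to be positive, so any values of that form yield a valid positive definite covariance matrix.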

20.

The EM algorithm for the quasi-likelihood regression model

Myunghee Cho Paik 《Communications in Statistics - Theory and Methods》2013,42(6):1403-1430

The objective of this paper is to present a method which can accommodate certain types of missing data by using the quasi-likelihood function for the complete data. This method can be useful when we can make first and second moment assumptions only; in addition, it can be helpful when the EM algorithm applied to the actual likelihood becomes overly complicated. First we derive a loss function for the observed data using an exponential family density which has the same mean and variance structure of the complete data. This loss function is the counterpart of the quasi-deviance for the observed data. Then the loss function is minimized using the EM algorithm. The use of the EM algorithm guarantees a decrease in the loss function at every iteration. When the observed data can be expressed as a deterministic linear transformation of the complete data, or when data are missing completely at random, the proposed method yields consistent estimators. Examples are given for overdispersed polytomous data, linear random effects models, and linear regression with missing covariates. Simulation results for the linear regression model with missing covariates show that the proposed estimates are more efficient than estimates based on completely observed units, even when outcomes are bimodal or skewed.