Similar literature
Found 20 similar articles (search time: 15 ms)
1.
Clustered longitudinal data feature cross-sectional associations within clusters, serial dependence within subjects, and associations between responses at different time points from different subjects within the same cluster. Generalized estimating equations are often used for inference with data of this sort since they do not require full specification of the response model. When data are incomplete, however, they require data to be missing completely at random unless inverse probability weights are introduced based on a model for the missing data process. The authors propose a robust approach for incomplete clustered longitudinal data using composite likelihood. Specifically, pairwise likelihood methods are described for conducting robust estimation with minimal model assumptions made. The authors also show that the resulting estimates remain valid for a wide variety of missing data problems including missing at random mechanisms and so in such cases there is no need to model the missing data process. In addition to describing the asymptotic properties of the resulting estimators, it is shown that the method performs well empirically through simulation studies for complete and incomplete data. Pairwise likelihood estimators are also compared with estimators obtained from inverse probability weighted alternating logistic regression. An application to data from the Waterloo Smoking Prevention Project is provided for illustration. The Canadian Journal of Statistics 39: 34–51; 2011 © 2010 Statistical Society of Canada

2.
Suppose we have a random sample of size n from a multivariate distribution with finite moments, for which a parametric form is not available. We wish to obtain a confidence interval (CI) for the length of its mean. The usual method is to Studentize. The resulting CIs are not exact. The errors in their nominal levels are of order n^{-1/2} and n^{-1} in the one-sided and two-sided cases, respectively. We show how to reduce these errors to order n^{-3/2} and n^{-2}.

3.
Modern statistical methods using incomplete data have been increasingly applied in a wide variety of substantive problems. Similarly, receiver operating characteristic (ROC) analysis, a method used in evaluating diagnostic tests or biomarkers in medical research, has also become increasingly popular in both its development and application. While missing-data methods have been applied in ROC analysis, the impact of model mis-specification and/or assumptions (e.g. missing at random) underlying the missing data has not been thoroughly studied. In this work, we study the performance of multiple imputation (MI) inference in ROC analysis. In particular, we investigate parametric and non-parametric techniques for MI inference under common missingness mechanisms. Depending on the coherency of the imputation model with the underlying data generation mechanism, our results show that MI generally leads to well-calibrated inferences under ignorable missingness mechanisms.

4.
We propose a new stochastic approximation (SA) algorithm for maximum-likelihood estimation (MLE) in the incomplete-data setting. This algorithm is most useful for problems where the EM algorithm cannot be applied because of an intractable E-step or M-step. Compared to other algorithms that have been proposed for intractable EM problems, such as the MCEM algorithm of Wei and Tanner (1990), our proposed algorithm appears more generally applicable and efficient. The approach we adopt is inspired by the Robbins-Monro (1951) stochastic approximation procedure, and we show that the proposed algorithm can be used to solve some of the long-standing problems in computing an MLE with incomplete data. We prove that in general O(n) simulation steps are required in computing the MLE with the SA algorithm and O(n log n) simulation steps are required in computing the MLE using the MCEM and/or the MCNR algorithm, where n is the sample size of the observations. Examples include computing the MLE in the nonlinear error-in-variable model and nonlinear regression model with random effects.
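As a rough illustration of the classical Robbins-Monro procedure that the abstract cites as inspiration (not the authors' SA algorithm for incomplete-data MLE, which the abstract does not spell out), the sketch below finds the root of a function that can only be observed with noise. The function name `robbins_monro`, the step sizes a_k = a / k, and the toy target g(theta) = theta - 2 are all assumptions made for this example.

```python
import random


def robbins_monro(noisy_g, theta0, n_steps=5000, a=1.0):
    """Approximate a root of g(theta) = E[noisy_g(theta)].

    Uses the decreasing step sizes a_k = a / k, which satisfy the usual
    conditions sum(a_k) = infinity and sum(a_k^2) < infinity.
    """
    theta = theta0
    for k in range(1, n_steps + 1):
        step = a / k
        theta = theta - step * noisy_g(theta)
    return theta


# Hypothetical toy problem: g(theta) = theta - 2, observed with N(0, 1) noise.
rng = random.Random(0)
estimate = robbins_monro(lambda t: (t - 2.0) + rng.gauss(0.0, 1.0), theta0=0.0)
print(estimate)  # close to the true root 2.0
```

For this linear toy problem with a = 1 the iterate equals 2 minus the running mean of the noise draws, so the error shrinks at the usual 1/sqrt(k) rate.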

5.
In longitudinal data, missing observations occur commonly with incomplete responses and covariates. Missing data can have a ‘missing not at random’ mechanism and a non-monotone missing pattern, and moreover the response and covariates need not be missing simultaneously. To avoid complexities in both modelling and computation, a two-stage estimation method and a pairwise-likelihood method are proposed. The two-stage estimation method is computationally simple, but incurs a more severe efficiency loss. On the other hand, the pairwise approach leads to estimators with better efficiency, but can be computationally cumbersome. In this paper, we develop a compromise method using a hybrid pairwise-likelihood framework. Our proposed approach has better efficiency than the two-stage method, but its computational cost is still reasonable compared to the pairwise approach. The performance of the methods is evaluated empirically by means of simulation studies. Our methods are used to analyse longitudinal data obtained from the National Population Health Study.

6.
Patient dropout is a common problem in studies that collect repeated binary measurements. Generalized estimating equations (GEE) are often used to analyze such data. The dropout mechanism may be plausibly missing at random (MAR), i.e. unrelated to future measurements given covariates and past measurements. In this case, various authors have recommended weighted GEE with weights based on an assumed dropout model, or an imputation approach, or a doubly robust approach based on weighting and imputation. These approaches provide asymptotically unbiased inference, provided the dropout or imputation model (as appropriate) is correctly specified. Other authors have suggested that, provided the working correlation structure is correctly specified, GEE using an improved estimator of the correlation parameters (‘modified GEE’) show minimal bias. These modified GEE have not been thoroughly examined. In this paper, we study the asymptotic bias under MAR dropout of these modified GEE, the standard GEE, and also GEE using the true correlation. We demonstrate that all three methods are biased in general. The modified GEE may be preferred to the standard GEE and are subject to only minimal bias in many MAR scenarios but in others are substantially biased. Hence, we recommend the modified GEE be used with caution.

7.
In this paper, we study estimation of linear models in the framework of longitudinal data with dropouts. Under the assumptions that random errors follow an elliptical distribution and all the subjects share the same within-subject covariance matrix which does not depend on covariates, we develop a robust method for simultaneous estimation of mean and covariance. The proposed method is robust against outliers, and does not require modelling of the covariance or of the missing data process. Theoretical properties of the proposed estimator are established and simulation studies show its good performance. Finally, the proposed method is applied to a real data analysis for illustration.

8.
Randomized clinical trials with count measurements as the primary outcome are common in various medical areas, for example seizure counts in epilepsy trials or relapse counts in multiple sclerosis trials. Controlled clinical trials frequently use a conventional parallel-group design that assigns subjects randomly to one of two treatment groups and repeatedly evaluates them at baseline and at intervals across a treatment period of fixed duration. The primary interest is to compare the rates of change between treatment groups. Generalized estimating equations (GEEs) have been widely used to compare rates of change between treatment groups because of their robustness to misspecification of the true correlation structure. In this paper, we derive a sample size formula for comparing the rates of change between two groups in a repeatedly measured count outcome using GEE. The sample size formula incorporates general missing patterns such as independent missing and monotone missing, and general correlation structures such as AR(1) and compound symmetry (CS). The performance of the sample size formula is evaluated through simulation studies. Sample size estimation is illustrated by a clinical trial example from epilepsy.
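The two working correlation structures named in the abstract have simple closed forms, sketched below for m repeated measurements with correlation parameter rho. This only builds the matrices; the sample size formula itself is not given in the abstract and is not reproduced here. The function names `ar1_corr` and `cs_corr` are made up for this example.

```python
def ar1_corr(m, rho):
    """AR(1) structure: corr(Y_j, Y_k) = rho ** |j - k|."""
    return [[rho ** abs(j - k) for k in range(m)] for j in range(m)]


def cs_corr(m, rho):
    """Compound symmetry: 1 on the diagonal, rho everywhere else."""
    return [[1.0 if j == k else rho for k in range(m)] for j in range(m)]


for row in ar1_corr(3, 0.5):
    print(row)
for row in cs_corr(3, 0.5):
    print(row)
```

Under AR(1) the correlation decays with the gap between visits, whereas under CS every pair of visits is equally correlated.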

9.
The authors discuss prior distributions that are conjugate to the multivariate normal likelihood when some of the observations are incomplete. They present a general class of priors for incorporating information about unidentified parameters in the covariance matrix. They analyze the special case of monotone patterns of missing data, providing an explicit recursive form for the posterior distribution resulting from a conjugate prior distribution. They develop an importance sampling and a Gibbs sampling approach to sample from a general posterior distribution and compare the two methods.

10.
Incomplete growth curve data often result from missing or mistimed observations in a repeated measures design. Virtually all methods of analysis rely on the dispersion matrix estimates. A Monte Carlo simulation was used to compare three methods of estimation of dispersion matrices for incomplete growth curve data. The three methods were: 1) maximum likelihood estimation with a smoothing algorithm, which finds the closest positive semidefinite estimate of the pairwise estimated dispersion matrix; 2) a mixed effects model using the EM (expectation-maximization) algorithm; and 3) a mixed effects model with the scoring algorithm. The simulation included 5 dispersion structures, 20 or 40 subjects with 4 or 8 observations per subject, and 10% or 30% missing data. In all the simulations, the smoothing algorithm was the poorest estimator of the dispersion matrix. In most cases, there were no significant differences between the scoring and EM algorithms. The EM algorithm tended to be better than the scoring algorithm when the variances of the random effects were close to zero, especially for the simulations with 4 observations per subject and two random effects.

11.
A stochastic model is proposed to analyze the observation vectors of variable lengths in a long-term clinical trial. Using a Markovian normal density, the likelihood ratio tests for the usual hypotheses are derived and asymptotic distributions of the test statistics are obtained. The use of a 'step-down' procedure is discussed for the interim analysis and a numerical example is given to illustrate the methodology.

12.
The problem of testing equality of two multivariate normal covariance matrices is considered. Assuming that the incomplete data are of monotone pattern, a quantity similar to the Likelihood Ratio Test Statistic is proposed. A satisfactory approximation to the distribution of the quantity is derived. Hypothesis testing based on the approximate distribution is outlined. The merits of the test are investigated using Monte Carlo simulation. Monte Carlo studies indicate that the test is very satisfactory even for moderately small samples. The proposed methods are illustrated using an example.

13.
This article proposes various Searls-type ratio imputation methods (STRIM) along the lines of Ahmed et al. (2006). It is well known that the optimal ratio-type estimator attains the MSE of the regression estimator (or optimal difference estimator), but this may not always happen when the Searls-type transformation (STT) of Searls (1964) is used. These STRIM are shown to perform better than the imputation procedures of Ahmed et al. (2006). The STRIM may even outperform the Searls-type difference imputation methods (STDIM) proposed by us in our earlier work, Bhushan and Pandey (2016). The study concludes with a numerical study and a theoretical comparison.

14.
The problem of estimation of the mean vector of a multivariate normal distribution with unknown covariance matrix, under uncertain prior information (UPI) that the component mean vectors are equal, is considered. The shrinkage preliminary test maximum likelihood estimator (SPTMLE) for the parameter vector is proposed. The risk and covariance matrix of the proposed estimator are derived, and the parameter range in which the SPTMLE dominates the usual preliminary test maximum likelihood estimator (PTMLE) is investigated. It is shown that the proposed estimator provides a wider range than the usual preliminary test estimator in which it dominates the classical estimator. Further, the SPTMLE has a more appropriate size for the preliminary test than the PTMLE.

15.
Incomplete covariate data is a common occurrence in many studies in which the outcome is survival time. With generalized linear models, when the missing covariates are categorical, a useful technique for obtaining parameter estimates is the EM by the method of weights proposed in Ibrahim (1990). In this article, we extend the EM by the method of weights to survival outcomes whose distributions may not fall in the class of generalized linear models. This method requires the estimation of the parameters of the distribution of the covariates. We present a clinical trials example with five covariates, four of which have some missing values.

16.
Suppose that there are independent samples available from several multivariate normal populations with the same mean vector μ but possibly different covariance matrices. The problem of developing a confidence region for the common mean vector based on all the samples is considered. An exact confidence region centered at a generalized version of the well-known Graybill-Deal estimator of μ is developed, and a multiple comparison procedure based on this confidence region is outlined. Necessary percentile points for constructing the confidence region are given for the two-sample case. For more than two samples, a convenient method of approximating the percentile points is suggested. Also, a numerical example is presented to illustrate the methods. Further, for the bivariate case, the proposed confidence region and the ones based on individual samples are compared numerically with respect to their expected areas. The numerical results indicate that the new confidence region is preferable to the single-sample versions for practical use.
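For orientation, the classical univariate Graybill-Deal estimator of a common mean weights each sample mean by n_i / s_i^2, where s_i^2 is the sample variance. The sketch below shows only that scalar version; the generalized multivariate estimator and the confidence region developed in the paper are not described in the abstract and are not attempted here. The function name `graybill_deal` and the data are invented for this example.

```python
from statistics import mean, variance


def graybill_deal(samples):
    """Univariate Graybill-Deal estimate of a common mean.

    Each sample mean is weighted by n_i / s_i^2, the reciprocal of the
    estimated variance of that sample mean.
    """
    weights = [len(s) / variance(s) for s in samples]
    means = [mean(s) for s in samples]
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)


# Invented data: two samples assumed to share a mean but not a variance.
sample_a = [1.0, 2.0, 3.0]   # mean 2, variance 1 -> weight 3 / 1 = 3
sample_b = [2.0, 4.0, 6.0]   # mean 4, variance 4 -> weight 3 / 4 = 0.75
print(graybill_deal([sample_a, sample_b]))  # (3*2 + 0.75*4) / 3.75 = 2.4
```

The less variable sample dominates the estimate, which is why the result sits much closer to 2 than to 4.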

17.
Traditional factor analysis (FA) rests on the assumption of multivariate normality. However, in some practical situations, the data do not meet this assumption; thus, the statistical inference made from such data may be misleading. This paper aims at providing some new tools for the skew-normal (SN) FA model when missing values occur in the data. In such a model, the latent factors are assumed to follow a restricted version of multivariate SN distribution with additional shape parameters for accommodating skewness. We develop an analytically feasible expectation conditional maximization algorithm for carrying out parameter estimation and imputation of missing values under missing at random mechanisms. The practical utility of the proposed methodology is illustrated with two real data examples and the results are compared with those obtained from the traditional FA counterparts.  相似文献   

18.
Rank tests are considered that compare t treatments in repeated measures designs. A statistic is given that contains as special cases several that have been proposed for this problem, including one that corresponds to the randomized block ANOVA statistic applied to the rank transformed data. Another statistic is proposed, having a null distribution holding under more general conditions, that is the rank transform of the Hotelling statistic for repeated measures. A statistic of this type is also given for data that are ordered categorical rather than fully ranked. Unlike the Friedman statistic, the statistics discussed in this article utilize a single ranking of the entire sample. Power calculations for an underlying normal distribution indicate that the rank transformed ANOVA test can be substantially more powerful than the Friedman test.

19.
We consider the problem of estimation of a density function in the presence of incomplete data and study the Hellinger distance between our proposed estimators and the true density function. Here, the presence of incomplete data is handled by utilizing a Horvitz–Thompson-type inverse weighting approach, where the weights are the estimates of the unknown selection probabilities. We also address the problem of estimation of a regression function with incomplete data.
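One common way to write a Horvitz–Thompson-type inverse-weighted kernel density estimate is f(x) = (1 / (n h)) * sum_i (delta_i / p_i) K((x - X_i) / h), where delta_i indicates whether X_i was observed and p_i is its selection probability. The sketch below is a minimal version of that generic form, not necessarily the authors' estimator: the selection probabilities are simply passed in (the paper estimates them), and the Gaussian kernel, the bandwidth `h`, and the function name `ht_kde` are assumptions made for illustration.

```python
import math


def ht_kde(x, data, observed, probs, h):
    """Inverse-probability-weighted Gaussian kernel density estimate at x.

    data[i] is used only when observed[i] is true; each observed point is
    weighted by 1 / probs[i] to compensate for the unobserved ones.
    """
    n = len(data)
    total = 0.0
    for xi, is_observed, p in zip(data, observed, probs):
        if is_observed:
            z = (x - xi) / h
            total += math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * p)
    return total / (n * h)


# Invented data: the second value is missing (None) and is skipped.
values = [0.0, None, 1.0]
seen = [True, False, True]
selection_probs = [0.5, 0.5, 1.0]
print(ht_kde(0.0, values, seen, selection_probs, h=1.0))
```

When every point is observed with probability 1, the function reduces to an ordinary Gaussian kernel density estimate.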

20.
The problem of the estimation of mean frequency of events in the presence of censoring is important in assessing the efficacy, safety and cost of therapies. The mean frequency is typically estimated by dividing the total number of events by the total number of patients under study. This method, referred to in this paper as the ‘naïve estimator’, ignores the censoring. Other approaches available for this problem require many assumptions that are rarely acceptable. These include the assumption of independence, constant hazard rate over time and other similar distributional assumptions. In this paper a simple non-parametric estimator based on the sum of the products of Kaplan–Meier estimators is proposed as an estimator of mean frequency, and its approximate variance and standard error are derived. An illustration is provided to show the derivation of the proposed estimator. Although the clinical trial setting is used in this paper, the problem has applications in other areas where survival analysis is used and recurrent events are studied. Copyright © 2003 John Wiley & Sons, Ltd.
