Similar articles
 Found 20 similar articles (search time: 289 ms)
1.
We use simulations based on data on injury severity in car accidents to compare methods for the analysis of very large data sets containing clusters of individuals for which the measured response is polytomous. Retrospective sampling of clusters is used to expedite the analysis of the large data set while at the same time obtaining information about rare, but important, outcomes. An additional complication in the analysis of such data sets is that there can be two types of covariates: those which vary within a cluster and those which vary only among clusters. Weighted generalized estimating equations are developed to obtain consistent estimates of the regression coefficients in a proportional-odds model, along with a weighted robust covariance matrix to estimate the variabilities of these estimated coefficients.

2.
When modeling correlated binary data in the presence of informative cluster sizes, generalized estimating equations with either resampling or inverse-weighting are often used to correct for estimation bias. However, existing methods for the clustered longitudinal setting assume constant cluster sizes over time. We present a subject-weighted generalized estimating equations scheme that provides valid parameter estimation for the clustered longitudinal setting while allowing cluster sizes to change over time. We compare, via simulation, the performance of existing methods to our subject-weighted approach. The subject-weighted approach was the only method that showed negligible bias, with excellent coverage, for all model parameters.

3.
Power analysis for multi-center randomized controlled trials is quite difficult to perform for non-continuous responses when site differences are modeled by random effects using the generalized linear mixed-effects model (GLMM). First, it is not possible to construct power functions analytically, because of the extreme complexity of the sampling distribution of parameter estimates. Second, Monte Carlo (MC) simulation, a popular option for estimating power for complex models, does not work within the current context because of a lack of methods and software packages that would provide reliable estimates for fitting such GLMMs. For example, even statistical packages from software giants like SAS do not provide reliable estimates at the time of writing. Another major limitation of MC simulation is the lengthy running time, particularly for complex models such as the GLMM and when estimating power for multiple scenarios of interest. We present a new approach to address such limitations. The proposed approach defines a marginal model to approximate the GLMM and estimates power without relying on MC simulation. The approach is illustrated with both real and simulated data, with the simulation study demonstrating good performance of the method.

4.
This paper extends methods for nonlinear regression analysis that have been developed for the analysis of clustered data. Its novelty lies in its dual incorporation of random cluster effects and structural error in the measurement of the explanatory variables. Moments up to second order are assumed to have been specified for the latter to enable a generalized estimating equations approach to be used for fitting and testing nonlinear models linking response to these explanatory variables and random effects. Taylor expansion methods are used, and a difficulty with earlier approaches is overcome. Finally, we describe an application of this methodology to indicate how it can be used. That application concerns the degree of association of hospital admissions for acute respiratory health problems and air pollution.

5.
The author describes the relationship between the extended generalized estimating equations (EGEEs) of Hall & Severini (1998) and various similar methods. He proposes a true extended quasi‐likelihood approach for the clustered data case and explores restricted maximum likelihood‐like versions of the EGEE and extended quasi‐likelihood estimating equations. He also presents simulation results comparing the various estimators in terms of mean squared error of estimation based on three moderate sample size, discrete data situations.

6.
This paper investigates procedures for testing the homogeneity of proportions in the analysis of clustered binary data in the context of unequal dispersions across the treatment groups. We introduce a simple test procedure based on adjusted proportions using a sandwich estimator of the variance of the proportion estimators obtained by the generalized estimating equations approach of Zeger and Liang (1986) [Biometrics 42, 121-130]. We also extend the existing procedures for testing hypotheses about proportions in this context. These test procedures are then compared, by simulations, in terms of size and power. Moreover, we derive the score test for testing the homogeneity of the dispersion parameters among several groups of clustered binary data. An illustrative application of the recommended test procedures is also presented.

7.
Generalized estimating equations (GEE) have become a popular method for marginal regression modelling of data that occur in clusters. Features of the GEE methodology are the use of a ‘working covariance’, an approximation to the underlying covariance, which is used to improve the efficiency in estimating the regression coefficients, and the ‘sandwich’ estimate of variance, which provides a way of consistently estimating their standard errors. These techniques have been extended to include estimating equations for the underlying correlation structure, both to improve the efficiency of the regression coefficient estimates and to provide estimates of correlations between units in a cluster, when these are of interest. If the mean structure is of primary interest, then a simpler set of equations (GEE1) can be used, whereas if the underlying covariance structure is of interest in its own right, the use of the more complex GEE2 estimating equations is often recommended. In this paper, we compare the effect of increasing the complexity of the ‘working covariances’ on the variance of the parameter estimates, as well as the mean-squared error of the ‘sandwich’ estimate of variance. We give asymptotic expressions for these variances and mean-squared error terms. We use these to study the behaviour of different variants of GEE1 and GEE2 when we change the number of clusters, the cluster size, and the within-cluster correlation. We conclude that the extra complexity of the full GEE2 approach is not usually justified if the mean structure is of primary interest.
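
To make the ‘sandwich’ idea concrete, here is a minimal sketch for the simplest possible case: an intercept-only model for clustered binary data under an independence working covariance, where the GEE estimate reduces to the pooled proportion. The function name and the toy data are hypothetical and are not taken from the paper; the sketch only contrasts the naive binomial variance with the cluster-robust one.

```python
def pooled_proportion_with_sandwich(clusters):
    """Pooled proportion with naive and sandwich variance estimates.

    clusters: list of lists of 0/1 responses, one inner list per cluster.
    Estimating equation: sum_i (y_i - n_i * p) = 0, where y_i is the
    number of successes and n_i the size of cluster i.
    """
    total_n = sum(len(c) for c in clusters)
    total_y = sum(sum(c) for c in clusters)
    p_hat = total_y / total_n

    # "Bread": derivative of the estimating function, here sum of n_i.
    bread = total_n
    # "Meat": sum of squared cluster-level residual totals.
    meat = sum((sum(c) - len(c) * p_hat) ** 2 for c in clusters)

    robust_var = meat / bread ** 2               # sandwich estimate
    naive_var = p_hat * (1.0 - p_hat) / total_n  # assumes independence
    return p_hat, naive_var, robust_var


# Two perfectly homogeneous clusters: strong within-cluster correlation.
p_hat, naive_var, robust_var = pooled_proportion_with_sandwich(
    [[1, 1, 1, 1], [0, 0, 0, 0]]
)
print(p_hat, naive_var, robust_var)
```

With strongly correlated responses inside each cluster, as in the toy data, the sandwich variance is larger than the naive one, which is the situation the robust estimate is meant to protect against.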

8.
Some studies generate data that can be grouped into clusters in more than one way. Consider for instance a smoking prevention study in which responses on smoking status are collected over several years in a cohort of students from a number of different schools. This yields longitudinal data that are also cross‐sectionally clustered in schools. The authors present a model for analyzing binary data of this type, combining generalized estimating equations and estimation of random effects to address the longitudinal and cross‐sectional dependence, respectively. The estimation procedure for this model is discussed, as are the results of a simulation study used to investigate the properties of its estimates. An illustration using data from a smoking prevention trial is given.

9.
Measurement-error modelling occurs when one cannot observe a covariate, but instead has possibly replicated surrogate versions of this covariate measured with error. The vast majority of the literature in measurement-error modelling assumes (typically with good reason) that given the value of the true but unobserved (latent) covariate, the replicated surrogates are unbiased for the latent covariate and conditionally independent. In the area of nutritional epidemiology, there is some evidence from biomarker studies that this simple conditional independence model may break down due to two causes: (a) systematic biases depending on a person's body mass index, and (b) an additional random component of bias, so that the error structure is the same as a one-way random-effects model. We investigate this problem in the context of (1) estimating the distribution of usual nutrient intake, (2) estimating the correlation between a nutrient instrument and usual nutrient intake, and (3) estimating the true relative risk from an estimated relative risk using the error-prone covariate. While systematic bias due to body mass index appears to have little effect, the additional random effect in the variance structure is shown to have a potentially important effect on overall results, both on corrections for relative risk estimates and in estimating the distribution of usual nutrient intake. However, the effect of dietary measurement error on both factors is shown via examples to depend strongly on the data set being used. Indeed, one of our data sets suggests that dietary measurement error may be masking a strong risk of fat on breast cancer, while for a second data set this masking is not so clear. Until further understanding of dietary measurement is available, measurement-error corrections must be done on a study-specific basis, sensitivity analyses should be conducted, and even then results of nutritional epidemiology studies relating diet to disease risk should be interpreted cautiously.

10.
Summary.  Generalized estimating equations for correlated repeated ordinal score data are developed assuming a proportional odds model and a working correlation structure based on a first-order autoregressive process. Repeated ordinal scores on the same experimental units, not necessarily with equally spaced time intervals, are assumed and a new algorithm for the joint estimation of the model regression parameters and the correlation coefficient is developed. Approximate standard errors for the estimated correlation coefficient are developed and a simulation study is used to compare the new methodology with existing methodology. The work was part of a project on post-harvest quality of pot-plants and the generalized estimating equation model is used to analyse data on poinsettia and begonia pot-plant quality deterioration over time. The relationship between the key attributes of plant quality and the quality and longevity of ornamental pot-plants during shelf and after-sales life is explored.
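
A first-order autoregressive working correlation for unequally spaced measurement times is often written as corr(t_j, t_k) = alpha^|t_j - t_k|. Whether the paper uses exactly this parameterisation is an assumption; the sketch below only shows how such a matrix could be built, and the function name is hypothetical.

```python
def ar1_working_correlation(times, alpha):
    """Build an AR(1)-type working correlation matrix for unequal spacing.

    Assumed form (not confirmed by the abstract):
        corr(t_j, t_k) = alpha ** abs(t_j - t_k)
    times: measurement times for one experimental unit.
    alpha: correlation between two measurements one time unit apart,
           taken here to lie in [0, 1).
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha is assumed to lie in [0, 1)")
    return [[alpha ** abs(tj - tk) for tk in times] for tj in times]


# Measurements at days 0, 1 and 3: the gap of two days between the last
# two measurements yields a weaker correlation than the one-day gap.
R = ar1_working_correlation([0, 1, 3], alpha=0.5)
for row in R:
    print(row)
```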

11.
A brief review of the minimum discrimination information (MDI) approach in analyzing categorical data is presented in a question-answer format. An example is given to bring out situations in which the MDI approach is more useful. No new results are proved.

12.
The lymphocyte proliferative assay (LPA) of immune competence was conducted on 52 subjects, with up to 36 processing conditions per subject, to evaluate whether samples could be shipped or stored overnight, rather than being processed on fresh blood as currently required. The LPA study resulted in clustered binary data, with both cluster level and cluster-varying covariates. Two modelling strategies for the analysis of such clustered binary data are through the cluster-specific and population-averaged approaches. Whereas most research in this area has focused on the analysis of matched pairs data, in many situations, such as the LPA study, cluster sizes are naturally larger. Through considerations of interpretation and efficiency of these models when applied to large clusters, the mixed effect cluster-specific model was selected as most appropriate for the analysis of the LPA data. The model confirmed that the LPA response is significantly impaired in individuals infected with the human immunodeficiency virus (HIV). The LPA response was found to be significantly lower for shipped and overnight samples than for fresh samples, and this effect was significantly stronger among HIV-infected individuals. Surprisingly, an anticoagulant effect was not detected.

13.
Longitudinal or clustered response data arise in many applications such as biostatistics, epidemiology and environmental studies. The repeated responses cannot in general be assumed to be independent. One method of analysing such data is by using the generalized estimating equations (GEE) approach. The current GEE method for estimating regression effects in longitudinal data focuses on the modelling of the working correlation matrix assuming a known variance function. However, correct choice of the correlation structure may not necessarily improve estimation efficiency for the regression parameters if the variance function is misspecified [Wang YG, Lin X. Effects of variance-function misspecification in analysis of longitudinal data. Biometrics. 2005;61:413–421]. In this connection two problems arise: finding a correct variance function and estimating the parameters of the chosen variance function. In this paper, we study the problem of estimating the parameters of the variance function assuming that the form of the variance function is known and then the effect of a misspecified variance function on the estimates of the regression parameters. We propose a GEE approach to estimate the parameters of the variance function. This estimation approach borrows the idea of Davidian and Carroll [Variance function estimation. J Amer Statist Assoc. 1987;82:1079–1091] by solving a nonlinear regression problem where residuals are regarded as the responses and the variance function is regarded as the regression function. A limited simulation study shows that the proposed method performs at least as well as the modified pseudo-likelihood approach developed by Wang and Zhao [A modified pseudolikelihood approach for analysis of longitudinal data. Biometrics. 2007;63:681–689]. Both these methods perform better than the GEE approach.

14.
The present work demonstrates an application of a random effects model for analyzing birth intervals that are clustered into geographical regions. Observations from the same cluster are assumed to be correlated because they usually share certain unobserved characteristics. Ignoring the correlations among the observations may lead to incorrect standard errors of the estimates of parameters of interest. Besides comparing Cox's proportional hazards model and the random effects model for analyzing geographically clustered time-to-event data, this paper also reports important demographic and socioeconomic factors that may affect the length of birth intervals of Bangladeshi women.

15.
This paper discusses regression analysis of clustered interval-censored failure time data, which often occur in medical follow-up studies among other areas. For such data, the failure time may sometimes be related to the cluster size (the number of subjects within each cluster); that is, cluster sizes may be informative. For this problem, we present a within-cluster resampling method for the situation where the failure time of interest can be described by a class of linear transformation models. In addition to the establishment of the asymptotic properties of the proposed estimators of regression parameters, an extensive simulation study is conducted to assess the finite sample properties of the proposed method and suggests that it works well in practical situations. An application to the example that motivated this study is also provided.
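
The general idea of within-cluster resampling can be sketched on a much simpler target than the paper's: a marginal mean rather than a linear transformation model for interval-censored data. One observation is drawn at random from every cluster, the estimate is computed on that resampled data set, and the estimates are averaged over many resamples, so each cluster contributes equally regardless of its size. The function name, defaults and toy data below are hypothetical.

```python
import random


def wcr_mean(clusters, n_resamples=1000, seed=0):
    """Within-cluster resampling estimate of a marginal mean.

    clusters: list of non-empty lists of numeric responses.
    Each resample keeps exactly one randomly chosen observation per
    cluster; the final estimate averages the resample means.
    """
    rng = random.Random(seed)  # local generator for reproducibility
    estimates = []
    for _ in range(n_resamples):
        sample = [rng.choice(cluster) for cluster in clusters]
        estimates.append(sum(sample) / len(sample))
    return sum(estimates) / len(estimates)


# Informative cluster size: the large cluster has all 1s, the small one 0.
clusters = [[1, 1, 1], [0]]
pooled = sum(sum(c) for c in clusters) / sum(len(c) for c in clusters)
print(pooled)              # pooled mean, dominated by the large cluster
print(wcr_mean(clusters))  # each cluster weighted equally
```

A variance estimate for the resampling estimator is deliberately left out of this sketch.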

16.
We use Owen's (1988, 1990) empirical likelihood method in upgraded mixture models. Two groups of independent observations are available. One is $z_1, \ldots, z_n$, which is observed directly from a distribution $F(z)$. The other one is $x_1, \ldots, x_m$, which is observed indirectly from $F(z)$, where the $x_i$s have density $\int p(x \mid z)\, dF(z)$ and $p(x \mid z)$ is a conditional density function. We are interested in testing $H_0: p(x \mid z) = p(x \mid z; \theta)$, for some specified smooth density function. A semiparametric likelihood ratio based statistic is proposed and it is shown that it converges to a chi-squared distribution. This is a simple method for doing goodness of fit tests, especially when $x$ is a discrete variable with finitely many values. In addition, we discuss estimation of $\theta$ and $F(z)$ when $H_0$ is true. The connection between upgraded mixture models and general estimating equations is pointed out.

17.
In this paper, we provide a method for constructing a confidence interval for accuracy in correlated observations, where one sample of patients is being rated by two or more diagnostic tests. Confidence intervals for other measures of diagnostic tests, such as sensitivity, specificity, positive predictive value, and negative predictive value, have already been developed for clustered or correlated observations using the generalized estimating equations (GEE) method. Here, we use the GEE and delta‐method to construct confidence intervals for accuracy, the proportion of patients who are correctly classified. Simulation results verify that the estimated confidence intervals exhibit consistent/appropriate coverage rates.

18.
Estimates from an EM algorithm are somewhat sensitive to the initial values for the estimates, and this sensitivity is likely to increase when the model becomes larger and more complicated. In this paper, we examined how the estimates fluctuate during an EM procedure for a recursive model of categorical variables. It is found that the fluctuation takes place mostly during the initial stage of the procedure and that it can be reduced by applying a Bayes method of estimation. Both real and simulated data are used for illustration.

19.
Summary.  Using standard correlation bounds, we show that in generalized estimating equations (GEEs) the so-called 'working correlation matrix' $R(\alpha)$ for analysing binary data cannot in general be the true correlation matrix of the data. Methods for estimating the correlation parameter in current GEE software for binary responses disregard these bounds. To show that the GEE applied on binary data has high efficiency, we use a multivariate binary model so that the covariance matrix from estimating equation theory can be compared with the inverse Fisher information matrix. But $R(\alpha)$ should be viewed as the weight matrix, and it should not be confused with the correlation matrix of the binary responses. We also do a comparison with more general weighted estimating equations by using a matrix Cauchy–Schwarz inequality. Our analysis leads to simple rules for the choice of $\alpha$ in an exchangeable or autoregressive AR(1) weight matrix $R(\alpha)$, based on the strength of dependence between the binary variables. An example is given to illustrate the assessment of dependence and choice of $\alpha$.
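
The correlation bounds in question can be illustrated for a single pair of binary variables. Given marginal success probabilities p1 and p2, the joint probability P(Y1 = 1, Y2 = 1) must lie between max(0, p1 + p2 - 1) and min(p1, p2), which restricts the attainable correlation. The sketch below computes the resulting range; the function name is hypothetical and the code is an illustration of the standard bounds, not the authors' procedure.

```python
import math


def binary_correlation_bounds(p1, p2):
    """Attainable correlation range for two Bernoulli variables.

    p1, p2: marginal success probabilities, each strictly in (0, 1).
    Returns (lower, upper) implied by the limits on P(Y1=1, Y2=1).
    """
    if not (0.0 < p1 < 1.0 and 0.0 < p2 < 1.0):
        raise ValueError("p1 and p2 must lie strictly between 0 and 1")
    q1, q2 = 1.0 - p1, 1.0 - p2
    lower = max(-math.sqrt(p1 * p2 / (q1 * q2)),
                -math.sqrt(q1 * q2 / (p1 * p2)))
    upper = min(math.sqrt(p1 * q2 / (q1 * p2)),
                math.sqrt(q1 * p2 / (p1 * q2)))
    return lower, upper


# Equal margins allow the full range; unequal margins shrink it.
print(binary_correlation_bounds(0.5, 0.5))
print(binary_correlation_bounds(0.1, 0.5))
```

When the margins differ, a working value of the correlation parameter outside this range cannot be the true correlation of the pair, which is the point the abstract makes about treating $R(\alpha)$ as a weight matrix.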

20.
In this article, we investigate the use of implied probabilities (Back and Brown, 1993) to improve estimation in unconditional moment conditions models. Using the seminal contributions of Bonnal and Renault (2001) and Antoine et al. (2007), we propose two three-step Euclidean empirical likelihood (3S-EEL) estimators for weakly dependent data. Both estimators make use of a control variates principle that can be interpreted in terms of implied probabilities in order to achieve higher-order improvements relative to the traditional two-step GMM estimator. A Monte Carlo study reveals that the finite and large sample properties of the three-step estimators compare favorably to the existing approaches: the two-step GMM and the continuous updating estimator.
