Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
2.
Summary. To construct an optimal estimating function by weighting a set of score functions, we must either know or estimate consistently the covariance matrix for the individual scores. In problems with high dimensional correlated data the estimated covariance matrix could be unreliable. The smallest eigenvalues of the covariance matrix will be the most important for weighting the estimating equations, but in high dimensions these will be poorly determined. Generalized estimating equations introduced the idea of a working correlation to minimize such problems. However, it can be difficult to specify the working correlation model correctly. We develop an adaptive estimating equation method which requires no working correlation assumptions. This methodology relies on finding a reliable approximation to the inverse of the variance matrix in the quasi-likelihood equations. We apply a multivariate generalization of the conjugate gradient method to find estimating equations that preserve the information well at fixed low dimensions. This approach is particularly useful when the estimator of the covariance matrix is singular or close to singular, or impossible to invert owing to its large size.
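The idea of applying an inverse variance matrix without ever forming it can be illustrated with the textbook conjugate gradient iteration. The sketch below is only the standard single-vector method, not the multivariate generalization the abstract describes; the function name `conjugate_gradient` and the small matrix `V` are hypothetical and chosen for illustration.

```python
import numpy as np

def conjugate_gradient(V, b, n_iter=None, tol=1e-10):
    """Approximately solve V x = b for a symmetric positive-definite V.

    Stopping after a few iterations gives a low-dimensional approximation
    to V^{-1} b without inverting V explicitly.
    """
    n = b.shape[0]
    n_iter = n if n_iter is None else n_iter
    x = np.zeros(n)
    r = b - V @ x            # residual
    p = r.copy()             # search direction
    rs_old = r @ r
    for _ in range(n_iter):
        if np.sqrt(rs_old) < tol:
            break
        Vp = V @ p
        alpha = rs_old / (p @ Vp)
        x += alpha * p
        r -= alpha * Vp
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# Hypothetical example: a small, well-conditioned covariance matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5))
V = A.T @ A / 50 + 0.1 * np.eye(5)
b = rng.standard_normal(5)
x = conjugate_gradient(V, b)
```

Passing a smaller `n_iter` truncates the iteration early, which is the sense in which a fixed low dimension can stand in for the full inverse.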

3.
In longitudinal studies, the mixture generalized estimating equation (mix-GEE) was proposed to improve the efficiency of the fixed-effects estimator when the working correlation structure is misspecified. When subject-specific effects are of interest, mixed-effects models are widely used to analyze longitudinal data. However, most of the existing approaches assume a normal distribution for the random effects, and this could affect the efficiency of the fixed-effects estimator. In this article, a conditional mixture generalized estimating equation (cmix-GEE) approach is developed that combines the advantages of mix-GEE and the conditional quadratic inference function (CQIF) method. The advantage of our new approach is that it does not require the normality assumption for random effects and can accommodate the serial correlation between observations within the same cluster. A further feature of our proposed approach is that the estimators of the regression parameters are more efficient than those of CQIF even if the working correlation structure is not correctly specified. In addition, according to the estimates of some mixture proportions, the true working correlation matrix can be identified. We establish the asymptotic results for the fixed-effects parameter estimators. Simulation studies were conducted to evaluate our proposed method.

4.
The weighted generalized estimating equation (WGEE), an extension of the generalized estimating equation (GEE) method, is a method for analyzing incomplete longitudinal data. An inappropriate specification of the working correlation structure results in the loss of efficiency of the GEE estimation. In this study, we evaluated the efficiency of WGEE estimation for incomplete longitudinal data when the working correlation structure was misspecified. As a result, we found that the efficiency of the WGEE estimation was lower when an improper working correlation structure was selected, similar to the case of the GEE method. Furthermore, we modified the criterion proposed by Gosho et al. (2011) for selecting a working correlation structure, such that the GEE and WGEE methods can be applied to incomplete longitudinal data, and we investigated the performance of the modified criterion. The results revealed that when the modified criterion was adopted, the proportion of cases in which the true correlation structure was selected was likely to be higher than with other competing approaches.

5.
This paper proposes a working estimating equation which is computationally easy to use for spatial count data. The proposed estimating equation is a modification of quasi-likelihood estimating equations that does not require the covariance matrix to be correctly specified. Under some regularity conditions, we show that the proposed estimator is consistent and asymptotically normal. A simulation comparison also indicates that the proposed method has competitive performance in dealing with over-dispersed data from a parameter-driven model.

6.
The generalized estimating equations (GEE) approach is one of the most commonly used methods for regression analysis of longitudinal data, especially with discrete outcomes. The GEE method accounts for the association among the responses of a subject through a working correlation matrix and its correct specification ensures efficient estimation of the regression parameters in the marginal mean regression model. This study proposes a predicted residual sum of squares (PRESS) statistic as a working correlation selection criterion in GEE. A simulation study is designed to assess the performance of the proposed GEE PRESS criterion and to compare its performance with its counterpart criteria in the literature. The results show that the GEE PRESS criterion has better performance than the weighted error sum of squares SC criterion in all cases but is surpassed in performance by the Gaussian pseudo-likelihood criterion. Lastly, the working correlation selection criteria are illustrated with data from the Coronary Artery Risk Development in Young Adults study.

7.
Summary. Using standard correlation bounds, we show that in generalized estimating equations (GEEs) the so-called 'working correlation matrix' R(α) for analysing binary data cannot in general be the true correlation matrix of the data. Methods for estimating the correlation parameter in current GEE software for binary responses disregard these bounds. To show that the GEE applied to binary data has high efficiency, we use a multivariate binary model so that the covariance matrix from estimating equation theory can be compared with the inverse Fisher information matrix. But R(α) should be viewed as the weight matrix, and it should not be confused with the correlation matrix of the binary responses. We also do a comparison with more general weighted estimating equations by using a matrix Cauchy–Schwarz inequality. Our analysis leads to simple rules for the choice of α in an exchangeable or autoregressive AR(1) weight matrix R(α), based on the strength of dependence between the binary variables. An example is given to illustrate the assessment of dependence and choice of α.
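The kind of correlation bound referred to above can be made concrete for a single pair of binary variables. The sketch below assumes the usual bounds that follow from requiring the joint probability P(Y1 = 1, Y2 = 1) to lie between max(0, p1 + p2 - 1) and min(p1, p2); the function name `binary_correlation_bounds` is hypothetical, and the abstract does not state which form of the bounds the authors use.

```python
import math

def binary_correlation_bounds(p1, p2):
    """Range of feasible correlations between two Bernoulli variables
    with success probabilities p1 and p2 (both strictly in (0, 1))."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    lower = max(-math.sqrt(p1 * p2 / (q1 * q2)),
                -math.sqrt(q1 * q2 / (p1 * p2)))
    upper = min(math.sqrt(p1 * q2 / (p2 * q1)),
                math.sqrt(p2 * q1 / (p1 * q2)))
    return lower, upper

# Equal margins allow the full range; unequal margins narrow it.
full_range = binary_correlation_bounds(0.5, 0.5)
narrow_range = binary_correlation_bounds(0.1, 0.5)
```

With margins 0.1 and 0.5 the feasible correlation is confined to roughly ±1/3, so a working parameter α outside that interval could not be the true correlation of such a pair.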

8.
The generalized estimating equations (GEE) approach has attracted considerable interest for the analysis of correlated response data. This paper considers the model selection criterion based on the multivariate quasi‐likelihood (MQL) in the GEE framework. The GEE approach is closely related to the MQL. We derive a necessary and sufficient condition for the uniqueness of the risk function based on the MQL by using properties of differential geometry. Furthermore, we establish a formal derivation of model selection criterion as an asymptotically unbiased estimator of the prediction risk under this condition, and we explicitly take into account the effect of estimating the correlation matrix used in the GEE procedure.

9.
A new covariance matrix estimator is proposed under the assumption that at every time period all pairwise correlations are equal. This assumption, which is pragmatically applied in various areas of finance, makes it possible to estimate arbitrarily large covariance matrices with ease. The model, called DECO, involves first adjusting for individual volatilities and then estimating correlations. A quasi-maximum likelihood result shows that DECO provides consistent parameter estimates even when the equicorrelation assumption is violated. We demonstrate how to generalize DECO to block equicorrelation structures. DECO estimates for U.S. stock return data show that (block) equicorrelated models can provide a better fit of the data than DCC. Using out-of-sample forecasts, DECO and Block DECO are shown to improve portfolio selection compared to an unrestricted dynamic correlation structure.
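The two-step structure described above (standardize by volatility, then estimate one common correlation) can be sketched in a static setting. This is a deliberate simplification and not the DECO model itself, which is dynamic: the sketch assumes constant volatilities and takes the average off-diagonal sample correlation as the single equicorrelation value. The function name `equicorrelation_matrix` and the simulated data are hypothetical.

```python
import numpy as np

def equicorrelation_matrix(returns):
    """Static equicorrelation sketch: returns (rho, R_eq) where R_eq has
    ones on the diagonal and the common value rho everywhere else."""
    # Step 1: adjust for individual volatilities (constant here, by assumption).
    z = (returns - returns.mean(axis=0)) / returns.std(axis=0, ddof=1)
    # Step 2: average all pairwise correlations into one number.
    R = np.corrcoef(z, rowvar=False)
    n = R.shape[0]
    rho = (R.sum() - np.trace(R)) / (n * (n - 1))
    R_eq = (1.0 - rho) * np.eye(n) + rho * np.ones((n, n))
    return rho, R_eq

# Hypothetical data: 200 periods of returns on 6 assets sharing one factor.
rng = np.random.default_rng(1)
factor = rng.standard_normal((200, 1))
returns = 0.7 * factor + rng.standard_normal((200, 6))
rho, R_eq = equicorrelation_matrix(returns)
```

Because the whole correlation structure is summarized by one number, the estimate stays cheap to compute no matter how many assets are included.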

10.
When spatial data are correlated, currently available data‐driven smoothing parameter selection methods for nonparametric regression will often fail to provide useful results. The authors propose a method that adjusts the generalized cross‐validation criterion for the effect of spatial correlation in the case of bivariate local polynomial regression. Their approach uses a pilot fit to the data and the estimation of a parametric covariance model. The method is easy to implement and leads to improved smoothing parameter selection, even when the covariance model is misspecified. The methodology is illustrated using water chemistry data collected in a survey of lakes in the Northeastern United States.

11.
In this paper, we consider the estimation of both the parameters and the nonparametric link function in partially linear single‐index models for longitudinal data that may be unbalanced. In particular, a new three‐stage approach is proposed to estimate the nonparametric link function using marginal kernel regression and the parametric components with generalized estimating equations. The resulting estimators properly account for the within‐subject correlation. We show that the parameter estimators are asymptotically semiparametrically efficient. We also show that the asymptotic variance of the link function estimator is minimized when the working error covariance matrices are correctly specified. The new estimators are more efficient than estimators in the existing literature. These asymptotic results are obtained without assuming normality. The finite‐sample performance of the proposed method is demonstrated by simulation studies. In addition, two real‐data examples are analyzed to illustrate the methodology.

12.
Summary. Model selection for marginal regression analysis of longitudinal data is challenging owing to the presence of correlation and the difficulty of specifying the full likelihood, particularly for correlated categorical data. The paper introduces a novel Bayesian information criterion type model selection procedure based on the quadratic inference function, which does not require the full likelihood or quasi-likelihood. With probability approaching 1, the criterion selects the most parsimonious correct model. Although a working correlation matrix is assumed, there is no need to estimate the nuisance parameters in the working correlation matrix; moreover, the model selection procedure is robust against the misspecification of the working correlation matrix. The criterion proposed can also be used to construct a data-driven Neyman smooth test for checking the goodness of fit of a postulated model. This test is especially useful and often yields much higher power in situations where the classical directional test behaves poorly. The finite sample performance of the model selection and model checking procedures is demonstrated through Monte Carlo studies and analysis of a clinical trial data set.

13.
In longitudinal data analysis, efficient estimation of regression coefficients requires a correct specification of the covariance structure, and efficient estimation of the covariance matrix requires a correct specification of the mean regression model. In this article, we propose a general semiparametric model for the mean and the covariance simultaneously using the modified Cholesky decomposition. A regression spline-based approach within the framework of generalized estimating equations is proposed to estimate the parameters in the mean and the covariance. Under regularity conditions, asymptotic properties of the resulting estimators are established. Extensive simulation is conducted to investigate the performance of the proposed estimator and in the end a real data set is analysed using the proposed approach.
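For readers unfamiliar with the modified Cholesky decomposition mentioned above, a minimal numerical sketch follows. It assumes the common convention T Σ Tᵀ = D, with T unit lower triangular and D diagonal; the abstract does not spell out the convention the authors use, and the function name `modified_cholesky` and the AR(1) example matrix are hypothetical.

```python
import numpy as np

def modified_cholesky(sigma):
    """Return (T, D) with T unit lower triangular and D diagonal such that
    T @ sigma @ T.T == D, for a symmetric positive-definite sigma."""
    L = np.linalg.cholesky(sigma)   # sigma = L @ L.T, L lower triangular
    d = np.diag(L)
    C = L / d                       # divide column j by d[j]: unit diagonal
    T = np.linalg.inv(C)            # inverse of unit lower triangular matrix
    D = np.diag(d ** 2)             # innovation variances
    return T, D

# Hypothetical example: AR(1) covariance with correlation 0.5 over 4 time points.
idx = np.arange(4)
sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])
T, D = modified_cholesky(sigma)
```

The below-diagonal entries of T are the negatives of the coefficients from regressing each measurement on its predecessors, and the entries of D are unconstrained positive variances, which is what makes this parameterization convenient for regression-type modelling of a covariance matrix.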

14.
For longitudinal data, the within-subject dependence structure and covariance parameters may be of practical and theoretical interest. The estimation of covariance parameters has received much attention and has been studied mainly in the framework of generalized estimating equations (GEEs). The GEE method, however, is sensitive to outliers. In this paper, an alternative set of robust generalized estimating equations for both the mean and covariance parameters is proposed in the partial linear model for longitudinal data. The asymptotic properties of the proposed estimators of the regression parameters, the non-parametric function and the covariance parameters are obtained. Simulation studies are conducted to evaluate the performance of the proposed estimators under different contaminations. The proposed method is illustrated with a real data analysis.

15.
Summary. We introduce a flexible marginal modelling approach for statistical inference for clustered and longitudinal data under minimal assumptions. This estimated estimating equations approach is semiparametric and the proposed models are fitted by quasi-likelihood regression, where the unknown marginal means are a function of the fixed effects linear predictor with unknown smooth link, and variance–covariance is an unknown smooth function of the marginal means. We propose to estimate the nonparametric link and variance–covariance functions via smoothing methods, whereas the regression parameters are obtained via the estimated estimating equations. These are score equations that contain nonparametric function estimates. The proposed estimated estimating equations approach is motivated by its flexibility and easy implementation. Moreover, if data follow a generalized linear mixed model, with either a specified or an unspecified distribution of random effects and link function, the model proposed emerges as the corresponding marginal (population-average) version and can be used to obtain inference for the fixed effects in the underlying generalized linear mixed model, without the need to specify any other components of this generalized linear mixed model. Among marginal models, the estimated estimating equations approach provides a flexible alternative to modelling with generalized estimating equations. Applications of estimated estimating equations include diagnostics and link selection. The asymptotic distribution of the proposed estimators for the model parameters is derived, enabling statistical inference. Practical illustrations include Poisson modelling of repeated epileptic seizure counts and simulations for clustered binomial responses.

16.
Recent literature provides many computational and modeling approaches for covariance matrix estimation in penalized Gaussian graphical models, but relatively little study has been carried out on the choice of the tuning parameter. This paper tries to fill this gap by focusing on the problem of shrinkage parameter selection when estimating sparse precision matrices using the penalized likelihood approach. Previous approaches typically used K-fold cross-validation in this regard. In this paper, we first derive the generalized approximate cross-validation for tuning parameter selection, which is not only a more computationally efficient alternative but also achieves a smaller error rate for model fitting compared to leave-one-out cross-validation. For consistency in the selection of nonzero entries in the precision matrix, we employ a Bayesian information criterion which provably can identify the nonzero conditional correlations in the Gaussian model. Our simulations demonstrate the general superiority of the two proposed selectors in comparison with leave-one-out cross-validation, 10-fold cross-validation and Akaike information criterion.

17.
Recent work has shown that Lasso-based regularization is very useful for estimating the high-dimensional inverse covariance matrix. A particularly useful scheme is based on penalizing the ℓ1 norm of the off-diagonal elements to encourage sparsity. We embed this type of regularization into high-dimensional classification. A two-stage estimation procedure is proposed which first recovers structural zeros of the inverse covariance matrix and then enforces block sparsity by moving non-zeros closer to the main diagonal. We show that the block-diagonal approximation of the inverse covariance matrix leads to an additive classifier, and demonstrate that accounting for the structure can yield better performance accuracy. The effect of the block size on classification is explored, and a class of asymptotically equivalent structure approximations in a high-dimensional setting is specified. We suggest a variable selection at the block level and investigate properties of this procedure in growing dimension asymptotics. We present a consistency result on the feature selection procedure, establish asymptotic lower and upper bounds for the fraction of separative blocks and specify constraints under which reliable classification with block-wise feature selection can be performed. The relevance and benefits of the proposed approach are illustrated on both simulated and real data.

18.
Based on various improved robust covariance estimators in the literature, several modified versions of the well-known correlated information criterion (CIC) for working intra-cluster correlation structure (ICS) selection are proposed. Performances of these modified criteria are examined and compared to the CIC via simulations. When the response is Gaussian, binary, or Poisson, the modified criteria are demonstrated to have higher detection rates when the true ICS is exchangeable, while the CIC would perform better when the true ICS is AR(1). An application of the criteria is made to a real dataset.

19.
Predictive criteria, including the adjusted squared multiple correlation coefficient, the adjusted concordance correlation coefficient, and the predictive error sum of squares, are available for model selection in the linear mixed model. These criteria all involve some sort of comparison of observed values and predicted values, adjusted for the complexity of the model. The predicted values can be conditional on the random effects or marginal, i.e., based on averages over the random effects. These criteria have not been investigated for model selection success.

We used simulations to investigate selection success rates for several versions of these predictive criteria as well as several versions of Akaike's information criterion and the Bayesian information criterion, and the pseudo F-test. The simulations involved the simple scenario of selection of a fixed parameter when the covariance structure is known.

Several variance–covariance structures were used. For compound symmetry structures, higher success rates for the predictive criteria were obtained when marginal rather than conditional predicted values were used. Information criteria had higher success rates when a certain term (normally left out in SAS MIXED computations) was included in the criteria. Various penalty functions were used in the information criteria, but these had little effect on success rates. The pseudo F-test performed as expected. For the autoregressive with random effects structure, the results were the same except that success rates were higher for the conditional version of the predictive error sum of squares.

Characteristics of the data, such as the covariance structure, parameter values, and sample size, greatly impacted performance of various model selection criteria. No one criterion was consistently better than the others.

20.
Time-course gene sets are collections of predefined groups of genes in some patients gathered over time. The analysis of time-course gene sets for testing gene sets which vary significantly over time is an important problem in genomic data analysis. In this paper, the method of generalized estimating equations (GEEs), which is a semi-parametric approach, is applied to time-course gene set data. We propose a special structure of working correlation matrix to handle the association among repeated measurements of each patient over time. Also, the proposed working correlation matrix permits estimation of the effects of the same gene among different patients. The proposed approach is applied to an HIV therapeutic vaccine trial (DALIA-1 trial). This data set has two phases, pre-ATI and post-ATI, which depend on a vaccination period. Using multiple testing, the significant gene sets in the pre-ATI phase are detected and data on two randomly selected gene sets in the post-ATI phase are also analyzed. Some simulation studies are performed to illustrate the proposed approach. The results of the simulation studies confirm the good performance of our proposed approach.
