首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
Generalized estimating equations (GEE) have become a popular method for marginal regression modelling of data that occur in clusters. Features of the GEE methodology are the use of a ‘working covariance’, an approximation to the underlying covariance, which is used to improve the efficiency in estimating the regression coefficients, and the ‘sandwich’ estimate of variance, which provides a way of consistently estimating their standard errors. These techniques have been extended to include estimating equations for the underlying correlation structure, both to improve the efficiency of the regression coefficient estimates and to provide estimates of correlations between units in a cluster, when these are of interest. If the mean structure is of primary interest, then a simpler set of equations (GEE1) can be used, whereas if the underlying covariance structure is of interest in its own right, the use of the more complex GEE2 estimating equations is often recommended. In this paper, we compare the effect of increasing the complexity of the ‘working covariances’ on the variance of the parameter estimates, as well as the mean-squared error of the ‘sandwich’ estimate of variance. We give asymptotic expressions for these variances and mean-squared error terms. We use these to study the behaviour of different variants of GEE1 and GEE2 when we change the number of clusters, the cluster size, and the within-cluster correlation. We conclude that the extra complexity of the full GEE2 approach is not usually justified if the mean structure is of primary interest.  相似文献   

2.
The semiparametric accelerated failure time (AFT) model is not as widely used as the Cox relative risk model due to computational difficulties. Recent developments in least squares estimation and induced smoothing estimating equations for censored data provide promising tools to make the AFT models more attractive in practice. For multivariate AFT models, we propose a generalized estimating equations (GEE) approach, extending the GEE to censored data. The consistency of the regression coefficient estimator is robust to misspecification of working covariance, and the efficiency is higher when the working covariance structure is closer to the truth. The marginal error distributions and regression coefficients are allowed to be unique for each margin or partially shared across margins as needed. The initial estimator is a rank-based estimator with Gehan’s weight, but obtained from an induced smoothing approach with computational ease. The resulting estimator is consistent and asymptotically normal, with variance estimated through a multiplier resampling method. In a large scale simulation study, our estimator was up to three times as efficient as the estimateor that ignores the within-cluster dependence, especially when the within-cluster dependence was strong. The methods were applied to the bivariate failure times data from a diabetic retinopathy study.  相似文献   

3.
A semiparametric method is developed to estimate the dependence parameter and the joint distribution of the error term in the multivariate linear regression model. The nonparametric part of the method treats the marginal distributions of the error term as unknown, and estimates them using suitable empirical distribution functions. Then the dependence parameter is estimated by either maximizing a pseudolikelihood or solving an estimating equation. It is shown that this estimator is asymptotically normal, and a consistent estimator of its large sample variance is given. A simulation study shows that the proposed semiparametric method is better than the parametric ones available when the error distribution is unknown, which is almost always the case in practice. It turns out that there is no loss of asymptotic efficiency as a result of the estimation of regression parameters. An empirical example on portfolio management is used to illustrate the method.  相似文献   

4.
Summary.  We introduce a flexible marginal modelling approach for statistical inference for clustered and longitudinal data under minimal assumptions. This estimated estimating equations approach is semiparametric and the proposed models are fitted by quasi-likelihood regression, where the unknown marginal means are a function of the fixed effects linear predictor with unknown smooth link, and variance–covariance is an unknown smooth function of the marginal means. We propose to estimate the nonparametric link and variance–covariance functions via smoothing methods, whereas the regression parameters are obtained via the estimated estimating equations. These are score equations that contain nonparametric function estimates. The proposed estimated estimating equations approach is motivated by its flexibility and easy implementation. Moreover, if data follow a generalized linear mixed model, with either a specified or an unspecified distribution of random effects and link function, the model proposed emerges as the corresponding marginal (population-average) version and can be used to obtain inference for the fixed effects in the underlying generalized linear mixed model, without the need to specify any other components of this generalized linear mixed model. Among marginal models, the estimated estimating equations approach provides a flexible alternative to modelling with generalized estimating equations. Applications of estimated estimating equations include diagnostics and link selection. The asymptotic distribution of the proposed estimators for the model parameters is derived, enabling statistical inference. Practical illustrations include Poisson modelling of repeated epileptic seizure counts and simulations for clustered binomial responses.  相似文献   

5.
Xing-Cai Zhou 《Statistics》2013,47(3):668-684
In this paper, empirical likelihood inference in mixture of semiparametric varying-coefficient models for longitudinal data with non-ignorable dropout is investigated. We estimate the non-parametric function based on the estimating equations and the local linear profile-kernel method. An empirical log-likelihood ratio statistic for parametric components is proposed to construct confidence regions and is shown to be an asymptotically chi-squared distribution. The non-parametric version of Wilk's theorem is also derived. A simulation study is undertaken to illustrate the finite sample performance of the proposed method.  相似文献   

6.
Information from multiple informants is frequently used to assess psychopathology. We consider marginal regression models with multiple informants as discrete predictors and a time to event outcome. We fit these models to data from the Stirling County Study; specifically, the models predict mortality from self report of psychiatric disorders and also predict mortality from physician report of psychiatric disorders. Previously, Horton et al. found little relationship between self and physician reports of psychopathology, but that the relationship of self report of psychopathology with mortality was similar to that of physician report of psychopathology with mortality. Generalized estimating equations (GEE) have been used to fit marginal models with multiple informant covariates; here we develop a maximum likelihood (ML) approach and show how it relates to the GEE approach. In a simple setting using a saturated model, the ML approach can be constructed to provide estimates that match those found using GEE. We extend the ML technique to consider multiple informant predictors with missingness and compare the method to using inverse probability weighted (IPW) GEE. Our simulation study illustrates that IPW GEE loses little efficiency compared with ML in the presence of monotone missingness. Our example data has non-monotone missingness; in this case, ML offers a modest decrease in variance compared with IPW GEE, particularly for estimating covariates in the marginal models. In more general settings, e.g., categorical predictors and piecewise exponential models, the likelihood parameters from the ML technique do not have the same interpretation as the GEE. Thus, the GEE is recommended to fit marginal models for its flexibility, ease of interpretation and comparable efficiency to ML in the presence of missing data.  相似文献   

7.
Abstract.  Multivariate failure time data arises when each study subject can potentially ex-perience several types of failures or recurrences of a certain phenomenon, or when failure times are sampled in clusters. We formulate the marginal distributions of such multivariate data with semiparametric accelerated failure time models (i.e. linear regression models for log-transformed failure times with arbitrary error distributions) while leaving the dependence structures for related failure times completely unspecified. We develop rank-based monotone estimating functions for the regression parameters of these marginal models based on right-censored observations. The estimating equations can be easily solved via linear programming. The resultant estimators are consistent and asymptotically normal. The limiting covariance matrices can be readily estimated by a novel resampling approach, which does not involve non-parametric density estimation or evaluation of numerical derivatives. The proposed estimators represent consistent roots to the potentially non-monotone estimating equations based on weighted log-rank statistics. Simulation studies show that the new inference procedures perform well in small samples. Illustrations with real medical data are provided.  相似文献   

8.
In this paper, we consider improved estimating equations for semiparametric partial linear models (PLM) for longitudinal data, or clustered data in general. We approximate the non‐parametric function in the PLM by a regression spline, and utilize quadratic inference functions (QIF) in the estimating equations to achieve a more efficient estimation of the parametric part in the model, even when the correlation structure is misspecified. Moreover, we construct a test which is an analogue to the likelihood ratio inference function for inferring the parametric component in the model. The proposed methods perform well in simulation studies and real data analysis conducted in this paper.  相似文献   

9.
A finite mixture model is considered in which the mixing probabilities vary from observation to observation. A parametric model is assumed for one mixture component distribution, while the others are nonparametric nuisance parameters. Generalized estimating equations (GEE) are proposed for the semi‐parametric estimation. Asymptotic normality of the GEE estimates is demonstrated and the lower bound for their dispersion (asymptotic covariance) matrix is derived. An adaptive technique is developed to derive estimates with nearly optimal small dispersion. An application to the sociological analysis of voting results is discussed. The Canadian Journal of Statistics 41: 217–236; 2013 © 2013 Statistical Society of Canada  相似文献   

10.
Kai B  Li R  Zou H 《Annals of statistics》2011,39(1):305-332
The complexity of semiparametric models poses new challenges to statistical inference and model selection that frequently arise from real applications. In this work, we propose new estimation and variable selection procedures for the semiparametric varying-coefficient partially linear model. We first study quantile regression estimates for the nonparametric varying-coefficient functions and the parametric regression coefficients. To achieve nice efficiency properties, we further develop a semiparametric composite quantile regression procedure. We establish the asymptotic normality of proposed estimators for both the parametric and nonparametric parts and show that the estimators achieve the best convergence rate. Moreover, we show that the proposed method is much more efficient than the least-squares-based method for many non-normal errors and that it only loses a small amount of efficiency for normal errors. In addition, it is shown that the loss in efficiency is at most 11.1% for estimating varying coefficient functions and is no greater than 13.6% for estimating parametric components. To achieve sparsity with high-dimensional covariates, we propose adaptive penalization methods for variable selection in the semiparametric varying-coefficient partially linear model and prove that the methods possess the oracle property. Extensive Monte Carlo simulation studies are conducted to examine the finite-sample performance of the proposed procedures. Finally, we apply the new methods to analyze the plasma beta-carotene level data.  相似文献   

11.
We have previously(Segal and Neuhaus, 1993) devised methods for obtaining marginal regression coefficients and associated variance estimates for multivariate survival data, using a synthesis of the Poisson regression formulation for univariate censored survival analysis and generalized estimating equations (GEE's). The method is parametric in that a baseline survival distribution is specified. Analogous semiparametric models, with unspecified baseline survival, have also been developed (Wei, Lin and Weissfeld, 1989; Lin, 1994).Common to both these approaches is the provision of robust variances for the regression parameters. However, none of this work has addressed the more difficult area of dependence estimation. While GEE approaches ostensibly provide such estimates, we show that there are problems adopting these with multivariate survival data. Further, we demonstrate that these problems can affect estimation of the regression coefficients themselves. An alternate, ad hoc approach to dependence estimation, based on design effects, is proposed and evaluated via simulation and illustrative examples. This revised version was published online in July 2006 with corrections to the Cover Date.  相似文献   

12.
There are a variety of economic areas, such as studies of employment duration and of the durability of capital goods, in which data on important variables typically are censored. The standard techinques for estimating a model from censored data require the distributions of unobservable random components of the model to be specified a priori up to a finite set of parameters, and misspecification of these distributions usually leads to inconsistent parameter estimates. However, economic theory rarely gives guidance about distributions and the standard estimation techniques do not provide convenient methods for identifying distributions from censored data. Recently, several distribution-free or semiparametric methods for estimating censored regression models have been developed. This paper presents the results of using two such methods to estimate a model of employment duration. The paper reports the operating characteristics of the semiparametric estimators and compares the semiparametric estimates with those obtained from a standard parametric model.  相似文献   

13.
Longitudinal data analysis requires a proper estimation of the within-cluster correlation structure in order to achieve efficient estimates of the regression parameters. When applying likelihood-based methods one may select an optimal correlation structure by the AIC or BIC. However, such information criteria are not applicable for estimating equation based approaches. In this paper we develop a model averaging approach to estimate the correlation matrix by a weighted sum of a group of patterned correlation matrices under the GEE framework. The optimal weight is determined by minimizing the difference between the weighted sum and a consistent yet inefficient estimator of the correlation structure. The computation of our proposed approach only involves a standard quadratic programming on top of the standard GEE procedure and can be easily implemented in practice. We provide theoretical justifications and extensive numerical simulations to support the application of the proposed estimator. A couple of well-known longitudinal data sets are revisited where we implement and illustrate our methodology.  相似文献   

14.
Abstract.  This paper describes our studies on non-parametric maximum-likelihood estimators in a semiparametric mixture model for competing-risks data, in which proportional hazards models are specified for failure time models conditional on cause and a multinomial model is specified for the marginal distribution of cause conditional on covariates. We provide a verifiable identifiability condition and, based on it, establish an asymptotic profile likelihood theory for this model. We also provide efficient algorithms for the computation of the non-parametric maximum-likelihood estimate and its asymptotic variance. The success of this method is demonstrated in simulation studies and in the analysis of Taiwan severe acute respiratory syndrome data.  相似文献   

15.
This article is concerned with the problem of multicollinearity in the linear part of a seemingly unrelated semiparametric (SUS) model. It is also suspected that some additional non stochastic linear constraints hold on the whole parameter space. In the sequel, we propose semiparametric ridge and non ridge type estimators combining the restricted least squares methods in the model under study. For practical aspects, it is assumed that the covariance matrix of error terms is unknown and thus feasible estimators are proposed and their asymptotic distributional properties are derived. Also, necessary and sufficient conditions for the superiority of the ridge-type estimator over the non ridge type estimator for selecting the ridge parameter K are derived. Lastly, a Monte Carlo simulation study is conducted to estimate the parametric and nonparametric parts. In this regard, kernel smoothing and cross validation methods for estimating the nonparametric function are used.  相似文献   

16.
Abstract.  A new semiparametric method for density deconvolution is proposed, based on a model in which only the ratio of the unconvoluted to convoluted densities is specified parametrically. Deconvolution results from reweighting the terms in a standard kernel density estimator, where the weights are defined by the parametric density ratio. We propose that in practice, the density ratio be modelled on the log-scale as a cubic spline with a fixed number of knots. Parameter estimation is based on maximization of a type of semiparametric likelihood. The resulting asymptotic properties for our deconvolution estimator mirror the convergence rates in standard density estimation without measurement error when attention is restricted to our semiparametric class of densities. Furthermore, numerical studies indicate that for practical sample sizes our weighted kernel estimator can provide better results than the classical non-parametric kernel estimator for a range of densities outside the specified semiparametric class.  相似文献   

17.
Longitudinal or clustered response data arise in many applications such as biostatistics, epidemiology and environmental studies. The repeated responses cannot in general be assumed to be independent. One method of analysing such data is by using the generalized estimating equations (GEE) approach. The current GEE method for estimating regression effects in longitudinal data focuses on the modelling of the working correlation matrix assuming a known variance function. However, correct choice of the correlation structure may not necessarily improve estimation efficiency for the regression parameters if the variance function is misspecified [Wang YG, Lin X. Effects of variance-function misspecification in analysis of longitudinal data. Biometrics. 2005;61:413–421]. In this connection two problems arise: finding a correct variance function and estimating the parameters of the chosen variance function. In this paper, we study the problem of estimating the parameters of the variance function assuming that the form of the variance function is known and then the effect of a misspecified variance function on the estimates of the regression parameters. We propose a GEE approach to estimate the parameters of the variance function. This estimation approach borrows the idea of Davidian and Carroll [Variance function estimation. J Amer Statist Assoc. 1987;82:1079–1091] by solving a nonlinear regression problem where residuals are regarded as the responses and the variance function is regarded as the regression function. A limited simulation study shows that the proposed method performs at least as well as the modified pseudo-likelihood approach developed by Wang and Zhao [A modified pseudolikelihood approach for analysis of longitudinal data. Biometrics. 2007;63:681–689]. Both these methods perform better than the GEE approach.  相似文献   

18.
Copula models have become increasingly popular for modelling the dependence structure in multivariate survival data. The two-parameter Archimedean family of Power Variance Function (PVF) copulas includes the Clayton, Positive Stable (Gumbel) and Inverse Gaussian copulas as special or limiting cases, thus offers a unified approach to fitting these important copulas. Two-stage frequentist procedures for estimating the marginal distributions and the PVF copula have been suggested by Andersen (Lifetime Data Anal 11:333–350, 2005), Massonnet et al. (J Stat Plann Inference 139(11):3865–3877, 2009) and Prenen et al. (J R Stat Soc Ser B 79(2):483–505, 2017) which first estimate the marginal distributions and conditional on these in a second step to estimate the PVF copula parameters. Here we explore an one-stage Bayesian approach that simultaneously estimates the marginal and the PVF copula parameters. For the marginal distributions, we consider both parametric as well as semiparametric models. We propose a new method to simulate uniform pairs with PVF dependence structure based on conditional sampling for copulas and on numerical approximation to solve a target equation. In a simulation study, small sample properties of the Bayesian estimators are explored. We illustrate the usefulness of the methodology using data on times to appendectomy for adult twins in the Australian NH&MRC Twin registry. Parameters of the marginal distributions and the PVF copula are simultaneously estimated in a parametric as well as a semiparametric approach where the marginal distributions are modelled using Weibull and piecewise exponential distributions, respectively.  相似文献   

19.
We consider regression analysis when part of covariates are incomplete in generalized linear models. The incomplete covariates could be due to measurement error or missing for some study subjects. We assume there exists a validation sample in which the data is complete and is a simple random subsample from the whole sample. Based on the idea of projection-solution method in Heyde (1997, Quasi-Likelihood and its Applications: A General Approach to Optimal Parameter Estimation. Springer, New York), a class of estimating functions is proposed to estimate the regression coefficients through the whole data. This method does not need to specify a correct parametric model for the incomplete covariates to yield a consistent estimate, and avoids the ‘curse of dimensionality’ encountered in the existing semiparametric method. Simulation results shows that the finite sample performance and efficiency property of the proposed estimates are satisfactory. Also this approach is computationally convenient hence can be applied to daily data analysis.  相似文献   

20.
In this paper, we derive an exact formula for the covariance of two innovations computed from a spatial Gibbs point process and suggest a fast method for estimating this covariance. We show how this methodology can be used to estimate the asymptotic covariance matrix of the maximum pseudo‐likelihood estimator of the parameters of a spatial Gibbs point process model. This allows us to construct asymptotic confidence intervals for the parameters. We illustrate the efficiency of our procedure in a simulation study for several classical parametric models. The procedure is implemented in the statistical software R , and it is included in spatstat , which is an R package for analyzing spatial point patterns.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号