首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 46 毫秒
1.
There are a variety of economic areas, such as studies of employment duration and of the durability of capital goods, in which data on important variables typically are censored. The standard techinques for estimating a model from censored data require the distributions of unobservable random components of the model to be specified a priori up to a finite set of parameters, and misspecification of these distributions usually leads to inconsistent parameter estimates. However, economic theory rarely gives guidance about distributions and the standard estimation techniques do not provide convenient methods for identifying distributions from censored data. Recently, several distribution-free or semiparametric methods for estimating censored regression models have been developed. This paper presents the results of using two such methods to estimate a model of employment duration. The paper reports the operating characteristics of the semiparametric estimators and compares the semiparametric estimates with those obtained from a standard parametric model.  相似文献   

2.
In the analysis of semi‐competing risks data interest lies in estimation and inference with respect to a so‐called non‐terminal event, the observation of which is subject to a terminal event. Multi‐state models are commonly used to analyse such data, with covariate effects on the transition/intensity functions typically specified via the Cox model and dependence between the non‐terminal and terminal events specified, in part, by a unit‐specific shared frailty term. To ensure identifiability, the frailties are typically assumed to arise from a parametric distribution, specifically a Gamma distribution with mean 1.0 and variance, say, σ2. When the frailty distribution is misspecified, however, the resulting estimator is not guaranteed to be consistent, with the extent of asymptotic bias depending on the discrepancy between the assumed and true frailty distributions. In this paper, we propose a novel class of transformation models for semi‐competing risks analysis that permit the non‐parametric specification of the frailty distribution. To ensure identifiability, the class restricts to parametric specifications of the transformation and the error distribution; the latter are flexible, however, and cover a broad range of possible specifications. We also derive the semi‐parametric efficient score under the complete data setting and propose a non‐parametric score imputation method to handle right censoring; consistency and asymptotic normality of the resulting estimators is derived and small‐sample operating characteristics evaluated via simulation. Although the proposed semi‐parametric transformation model and non‐parametric score imputation method are motivated by the analysis of semi‐competing risks data, they are broadly applicable to any analysis of multivariate time‐to‐event outcomes in which a unit‐specific shared frailty is used to account for correlation. Finally, the proposed model and estimation procedures are applied to a study of hospital readmission among patients diagnosed with pancreatic cancer.  相似文献   

3.
In survival analysis, time-dependent covariates are usually present as longitudinal data collected periodically and measured with error. The longitudinal data can be assumed to follow a linear mixed effect model and Cox regression models may be used for modelling of survival events. The hazard rate of survival times depends on the underlying time-dependent covariate measured with error, which may be described by random effects. Most existing methods proposed for such models assume a parametric distribution assumption on the random effects and specify a normally distributed error term for the linear mixed effect model. These assumptions may not be always valid in practice. In this article, we propose a new likelihood method for Cox regression models with error-contaminated time-dependent covariates. The proposed method does not require any parametric distribution assumption on random effects and random errors. Asymptotic properties for parameter estimators are provided. Simulation results show that under certain situations the proposed methods are more efficient than the existing methods.  相似文献   

4.
This paper deals with a longitudinal semi‐parametric regression model in a generalised linear model setup for repeated count data collected from a large number of independent individuals. To accommodate the longitudinal correlations, we consider a dynamic model for repeated counts which has decaying auto‐correlations as the time lag increases between the repeated responses. The semi‐parametric regression function involved in the model contains a specified regression function in some suitable time‐dependent covariates and a non‐parametric function in some other time‐dependent covariates. As far as the inference is concerned, because the non‐parametric function is of secondary interest, we estimate this function consistently using the independence assumption‐based well‐known quasi‐likelihood approach. Next, the proposed longitudinal correlation structure and the estimate of the non‐parametric function are used to develop a semi‐parametric generalised quasi‐likelihood approach for consistent and efficient estimation of the regression effects in the parametric regression function. The finite sample performance of the proposed estimation approach is examined through an intensive simulation study based on both large and small samples. Both balanced and unbalanced cluster sizes are incorporated in the simulation study. The asymptotic performances of the estimators are given. The estimation methodology is illustrated by reanalysing the well‐known health care utilisation data consisting of counts of yearly visits to a physician by 180 individuals for four years and several important primary and secondary covariates.  相似文献   

5.
When variable selection with stepwise regression and model fitting are conducted on the same data set, competition for inclusion in the model induces a selection bias in coefficient estimators away from zero. In proportional hazards regression with right-censored data, selection bias inflates the absolute value of parameter estimate of selected parameters, while the omission of other variables may shrink coefficients toward zero. This paper explores the extent of the bias in parameter estimates from stepwise proportional hazards regression and proposes a bootstrap method, similar to those proposed by Miller (Subset Selection in Regression, 2nd edn. Chapman & Hall/CRC, 2002) for linear regression, to correct for selection bias. We also use bootstrap methods to estimate the standard error of the adjusted estimators. Simulation results show that substantial biases could be present in uncorrected stepwise estimators and, for binary covariates, could exceed 250% of the true parameter value. The simulations also show that the conditional mean of the proposed bootstrap bias-corrected parameter estimator, given that a variable is selected, is moved closer to the unconditional mean of the standard partial likelihood estimator in the chosen model, and to the population value of the parameter. We also explore the effect of the adjustment on estimates of log relative risk, given the values of the covariates in a selected model. The proposed method is illustrated with data sets in primary biliary cirrhosis and in multiple myeloma from the Eastern Cooperative Oncology Group.  相似文献   

6.
Abstract.  We propose a new method for fitting proportional hazards models with error-prone covariates. Regression coefficients are estimated by solving an estimating equation that is the average of the partial likelihood scores based on imputed true covariates. For the purpose of imputation, a linear spline model is assumed on the baseline hazard. We discuss consistency and asymptotic normality of the resulting estimators, and propose a stochastic approximation scheme to obtain the estimates. The algorithm is easy to implement, and reduces to the ordinary Cox partial likelihood approach when the measurement error has a degenerate distribution. Simulations indicate high efficiency and robustness. We consider the special case where error-prone replicates are available on the unobserved true covariates. As expected, increasing the number of replicates for the unobserved covariates increases efficiency and reduces bias. We illustrate the practical utility of the proposed method with an Eastern Cooperative Oncology Group clinical trial where a genetic marker, c- myc expression level, is subject to measurement error.  相似文献   

7.
There exists a recent study where dynamic mixed‐effects regression models for count data have been extended to a semi‐parametric context. However, when one deals with other discrete data such as binary responses, the results based on count data models are not directly applicable. In this paper, we therefore begin with existing binary dynamic mixed models and generalise them to the semi‐parametric context. For inference, we use a new semi‐parametric conditional quasi‐likelihood (SCQL) approach for the estimation of the non‐parametric function involved in the semi‐parametric model, and a semi‐parametric generalised quasi‐likelihood (SGQL) approach for the estimation of the main regression, dynamic dependence and random effects variance parameters. A semi‐parametric maximum likelihood (SML) approach is also used as a comparison to the SGQL approach. The properties of the estimators are examined both asymptotically and empirically. More specifically, the consistency of the estimators is established and finite sample performances of the estimators are examined through an intensive simulation study.  相似文献   

8.
We derived two methods to estimate the logistic regression coefficients in a meta-analysis when only the 'aggregate' data (mean values) from each study are available. The estimators we proposed are the discriminant function estimator and the reverse Taylor series approximation. These two methods of estimation gave similar estimators using an example of individual data. However, when aggregate data were used, the discriminant function estimators were quite different from the other two estimators. A simulation study was then performed to evaluate the performance of these two estimators as well as the estimator obtained from the model that simply uses the aggregate data in a logistic regression model. The simulation study showed that all three estimators are biased. The bias increases as the variance of the covariate increases. The distribution type of the covariates also affects the bias. In general, the estimator from the logistic regression using the aggregate data has less bias and better coverage probabilities than the other two estimators. We concluded that analysts should be cautious in using aggregate data to estimate the parameters of the logistic regression model for the underlying individual data.  相似文献   

9.
Abstract

In this article, we consider a panel data partially linear regression model with fixed effect and non parametric time trend function. The data can be dependent cross individuals through linear regressor and error components. Unlike the methods using non parametric smoothing technique, a difference-based method is proposed to estimate linear regression coefficients of the model to avoid bandwidth selection. Here the difference technique is employed to eliminate the non parametric function effect, not the fixed effects, on linear regressor coefficient estimation totally. Therefore, a more efficient estimator for parametric part is anticipated, which is shown to be true by the simulation results. For the non parametric component, the polynomial spline technique is implemented. The asymptotic properties of estimators for parametric and non parametric parts are presented. We also show how to select informative ones from a number of covariates in the linear part by using smoothly clipped absolute deviation-penalized estimators on a difference-based least-squares objective function, and the resulting estimators perform asymptotically as well as the oracle procedure in terms of selecting the correct model.  相似文献   

10.
Abstract

Sliced average variance estimation (SAVE) is one of the best methods for estimating central dimension-reduction subspace in semi parametric regression models when covariates are normal. In recent days SAVE is being used to analyze DNA microarray data especially in tumor classification but most important drawback is normality of covariates. In this article, the asymptotic behavior of estimates of CDR space under varying slice size is studied through simulation studies when covariates are non normal but follows linearity condition as well as when covariates slightly perturbed from normal distribution and we observed that serious error may occur under violation normality assumption.  相似文献   

11.
In this article, we propose semiparametric methods to estimate the cumulative incidence function of two dependent competing risks for left-truncated and right-censored data. The proposed method is based on work by Huang and Wang (1995). We extend previous model by allowing for a general parametric truncation distribution and a third competing risk before recruitment. Based on work by Vardi (1989), several iterative algorithms are proposed to obtain the semiparametric estimates of cumulative incidence functions. The asymptotic properties of the semiparametric estimators are derived. Simulation results show that a semiparametric approach assuming the parametric truncation distribution is correctly specified produces estimates with smaller mean squared error than those obtained in a fully nonparametric model.  相似文献   

12.
Summary.  Multiple imputation is now a well-established technique for analysing data sets where some units have incomplete observations. Provided that the imputation model is correct, the resulting estimates are consistent. An alternative, weighting by the inverse probability of observing complete data on a unit, is conceptually simple and involves fewer modelling assumptions, but it is known to be both inefficient (relative to a fully parametric approach) and sensitive to the choice of weighting model. Over the last decade, there has been a considerable body of theoretical work to improve the performance of inverse probability weighting, leading to the development of 'doubly robust' or 'doubly protected' estimators. We present an intuitive review of these developments and contrast these estimators with multiple imputation from both a theoretical and a practical viewpoint.  相似文献   

13.
In many longitudinal studies of recurrent events there is an interest in assessing how recurrences vary over time and across treatments or strata in the population. Usual analyses of such data assume a parametric form for the distribution of the recurrences over time. Here, we consider a semiparametric model for the analysis of such longitudinal studies where data are collected as panel counts. The model is a non-homogeneous Poisson process with a multiplicative intensity incorporating covariates through a proportionality assumption. Heterogeneity is accounted for in the model through subject-specific random effects. The key feature of the model is the use of regression splines to model the distribution of recurrences over time. This provides a flexible and robust method of relaxing parametric assumptions. In addition, quasi-likelihood methods are proposed for estimation, requiring only first and second moment assumptions to obtain consistent estimates. Simulations demonstrate that the method produces estimators of the rate with low bias and whose standardized distributions are well approximated by the normal. The usefulness of this approach, especially as an exploratory tool, is illustrated by analyzing a study designed to assess the effectiveness of a pheromone treatment in disturbing the mating habits of the Cherry Bark Tortrix moth.  相似文献   

14.
We propose a class of state-space models for multivariate longitudinal data where the components of the response vector may have different distributions. The approach is based on the class of Tweedie exponential dispersion models, which accommodates a wide variety of discrete, continuous and mixed data. The latent process is assumed to be a Markov process, and the observations are conditionally independent given the latent process, over time as well as over the components of the response vector. This provides a fully parametric alternative to the quasilikelihood approach of Liang and Zeger. We estimate the regression parameters for time-varying covariates entering either via the observation model or via the latent process, based on an estimating equation derived from the Kalman smoother. We also consider analysis of residuals from both the observation model and the latent process.  相似文献   

15.
Kai B  Li R  Zou H 《Annals of statistics》2011,39(1):305-332
The complexity of semiparametric models poses new challenges to statistical inference and model selection that frequently arise from real applications. In this work, we propose new estimation and variable selection procedures for the semiparametric varying-coefficient partially linear model. We first study quantile regression estimates for the nonparametric varying-coefficient functions and the parametric regression coefficients. To achieve nice efficiency properties, we further develop a semiparametric composite quantile regression procedure. We establish the asymptotic normality of proposed estimators for both the parametric and nonparametric parts and show that the estimators achieve the best convergence rate. Moreover, we show that the proposed method is much more efficient than the least-squares-based method for many non-normal errors and that it only loses a small amount of efficiency for normal errors. In addition, it is shown that the loss in efficiency is at most 11.1% for estimating varying coefficient functions and is no greater than 13.6% for estimating parametric components. To achieve sparsity with high-dimensional covariates, we propose adaptive penalization methods for variable selection in the semiparametric varying-coefficient partially linear model and prove that the methods possess the oracle property. Extensive Monte Carlo simulation studies are conducted to examine the finite-sample performance of the proposed procedures. Finally, we apply the new methods to analyze the plasma beta-carotene level data.  相似文献   

16.
We consider the estimation of life length of people who were born in the seventeenth or eighteenth century in England. The data consist of a sequence of times of life events that is either ended by a time of death or is right-censored by an unobserved time of migration. We propose a semi parametric model for the data and use a maximum likelihood method to estimate the unknown parameters in this model. We prove the consistency of the maximum likelihood estimators and describe an algorithm to obtain the estimates numerically. We have applied the algorithm to data and the estimates found are presented.  相似文献   

17.
A finite mixture model is considered in which the mixing probabilities vary from observation to observation. A parametric model is assumed for one mixture component distribution, while the others are nonparametric nuisance parameters. Generalized estimating equations (GEE) are proposed for the semi‐parametric estimation. Asymptotic normality of the GEE estimates is demonstrated and the lower bound for their dispersion (asymptotic covariance) matrix is derived. An adaptive technique is developed to derive estimates with nearly optimal small dispersion. An application to the sociological analysis of voting results is discussed. The Canadian Journal of Statistics 41: 217–236; 2013 © 2013 Statistical Society of Canada  相似文献   

18.
Missing covariate data are common in biomedical studies. In this article, by using the non parametric kernel regression technique, a new imputation approach is developed for the Cox-proportional hazard regression model with missing covariates. This method achieves the same efficiency as the fully augmented weighted estimators (Qi et al. 2005. Journal of the American Statistical Association, 100:1250) and has a simpler form. The asymptotic properties of the proposed estimator are derived and analyzed. The comparisons between the proposed imputation method and several other existing methods are conducted via a number of simulation studies and a mouse leukemia data.  相似文献   

19.
We consider mixed effects models for longitudinal, repeated measures or clustered data. Unmeasured or omitted covariates in such models may be correlated with the included covanates, and create model violations when not taken into account. Previous research and experience with longitudinal data sets suggest a general form of model which should be considered when omitted covariates are likely, such as in observational studies. We derive the marginal model between the response variable and included covariates, and consider model fitting using the ordinary and weighted least squares methods, which require simple non-iterative computation and no assumptions on the distribution of random covariates or error terms, Asymptotic properties of the least squares estimators are also discussed. The results shed light on the structure of least squares estimators in mixed effects models, and provide large sample procedures for statistical inference and prediction based on the marginal model. We present an example of the relationship between fluid intake and output in very low birth weight infants, where the model is found to have the assumed structure.  相似文献   

20.
When multilevel models are estimated from survey data derived using multistage sampling, unequal selection probabilities at any stage of sampling may induce bias in standard estimators, unless the sources of the unequal probabilities are fully controlled for in the covariates. This paper proposes alternative ways of weighting the estimation of a two-level model by using the reciprocals of the selection probabilities at each stage of sampling. Consistent estimators are obtained when both the sample number of level 2 units and the sample number of level 1 units within sampled level 2 units increase. Scaling of the weights is proposed to improve the properties of the estimators and to simplify computation. Variance estimators are also proposed. In a limited simulation study the scaled weighted estimators are found to perform well, although non-negligible bias starts to arise for informative designs when the sample number of level 1 units becomes small. The variance estimators perform extremely well. The procedures are illustrated using data from the survey of psychiatric morbidity.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号