We use a class of parametric counting process regression models that are commonly employed in the analysis of failure time data to formulate the subject-specific capture probabilities for removal and recapture studies conducted in continuous time. We estimate the regression parameters by modifying the conventional likelihood score function for left-truncated and right-censored data to accommodate an unknown population size and missing covariates on uncaptured subjects, and we subsequently estimate the population size by a martingale-based estimating function. The resultant estimators for the regression parameters and population size are consistent and asymptotically normal under appropriate regularity conditions. We assess the small sample properties of the proposed estimators through Monte Carlo simulation and we present an application to a bird banding exercise.

This paper develops a novel weighted composite quantile regression (CQR) method for estimating a linear model when some covariates are missing at random and the missingness probability can be modelled parametrically. By incorporating the unbiased estimating equations of incomplete data into empirical likelihood (EL), we obtain the EL-based weights, and then re-adjust the inverse probability weighted CQR for estimating the vector of regression coefficients. Theoretical results show that the proposed method can achieve semiparametric efficiency if the selection probability function is correctly specified; the EL weighted CQR is therefore more efficient than the inverse probability weighted CQR. Besides, our algorithm is computationally simple and easy to implement. Simulation studies are conducted to examine the finite sample performance of the proposed procedures. Finally, we apply the new method to analyse the US News College data.
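
The inverse probability weighted CQR criterion mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the equally spaced quantile levels k/(K+1), and the argument layout are assumptions, and the empirical likelihood re-weighting step is not shown.

```python
def check_loss(tau, u):
    """Quantile check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def ipw_cqr_objective(beta, intercepts, x, y, observed, prob):
    """Inverse-probability-weighted composite quantile loss.

    beta       -- slope coefficients shared across quantile levels
    intercepts -- one intercept per quantile level
    observed   -- observed[i] is 1 when row i has complete covariates
    prob       -- prob[i] is the (modelled) selection probability of row i
    """
    k = len(intercepts)
    # Equally spaced quantile levels; an assumption made for illustration.
    taus = [(j + 1) / (k + 1) for j in range(k)]
    total = 0.0
    for xi, yi, di, pi in zip(x, y, observed, prob):
        if not di:
            continue  # incomplete rows receive weight zero
        fitted = sum(b * v for b, v in zip(beta, xi))
        for tau, a in zip(taus, intercepts):
            total += check_loss(tau, yi - a - fitted) / pi
    return total
```

Minimising this objective over `beta` and `intercepts` with any general-purpose optimiser would give an inverse probability weighted CQR fit; the EL-weighted version described in the abstract would replace the `1 / pi` weights with EL-based ones.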

In a case–cohort design a random sample from the study cohort, referred to as a subcohort, and all the cases outside the subcohort are selected for collecting extra covariate data. The union of the selected subcohort and all cases is referred to as the case–cohort set. Such a design is generally employed when the collection of information on an extra covariate for the study cohort is expensive. An advantage of the case–cohort design over the more traditional case–control and nested case–control designs is that it provides a set of controls which can be used for multiple end-points, in which case there is information on some covariates and event follow-up for the whole study cohort. Here, we propose a Bayesian approach to analyse such a case–cohort design as a cohort design with incomplete data on the extra covariate. We construct likelihood expressions when multiple end-points are of interest simultaneously and propose a Bayesian data augmentation method to estimate the model parameters. A simulation study is carried out to illustrate the method and the results are compared with the complete cohort analysis.

The accelerated failure time (AFT) model is an important regression tool to study the association between failure time and covariates. In this paper, we propose a robust weighted generalized M (GM) estimation for the AFT model with right-censored data by appropriately using the Kaplan–Meier weights in the GM–type objective function to estimate the regression coefficients and scale parameter simultaneously. This estimation method is computationally simple and can be implemented with existing software. Asymptotic properties including the root-n consistency and asymptotic normality are established for the resulting estimator under suitable conditions. We further show that the method can be readily extended to handle a class of nonlinear AFT models. Simulation results demonstrate satisfactory finite sample performance of the proposed estimator. The practical utility of the method is illustrated by a real data example.
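
As a rough illustration of the Kaplan–Meier weights referred to above, the sketch below computes the jumps of the Kaplan–Meier estimator at each observation (often called Stute weights). It assumes this standard form is the one intended; the function name and the tie-breaking rule are illustrative choices, and the GM-type objective itself is not shown.

```python
def kaplan_meier_weights(times, events):
    """Return the Kaplan-Meier jump at each observation.

    times  -- observed times (failure or censoring)
    events -- 1 if the failure was observed, 0 if censored
    Censored observations get weight 0; their mass moves to later failures.
    """
    n = len(times)
    # Sort by observed time; at ties, place failures before censored points.
    order = sorted(range(n), key=lambda i: (times[i], -events[i]))
    weights = [0.0] * n
    surv = 1.0  # Kaplan-Meier survival just before the current ordered time
    for rank, i in enumerate(order, start=1):
        at_risk = n - rank + 1
        if events[i]:
            weights[i] = surv / at_risk
            surv *= (at_risk - 1) / at_risk
    return weights
```

With no censoring every weight equals 1/n, so a weighted objective built on these weights reduces to its usual unweighted form.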

Recurrent events are frequently encountered in biomedical studies. Evaluating covariate effects on the marginal recurrent event rate is of practical interest. There are two main types of rate models for recurrent event data: the multiplicative rates model and the additive rates model. We consider a more flexible additive–multiplicative rates model for analysis of recurrent event data, wherein some covariate effects are additive while others are multiplicative. We formulate estimating equations for estimating the regression parameters. The estimators for these regression parameters are shown to be consistent and asymptotically normally distributed under appropriate regularity conditions. Moreover, an estimator of the baseline mean function is proposed and its large sample properties are investigated. We also conduct simulation studies to evaluate the finite sample behavior of the proposed estimators. A medical study of patients with cystic fibrosis who suffered from recurrent pulmonary exacerbations is provided to illustrate the proposed method.

A method based on pseudo-observations has been proposed for direct regression modeling of functionals of interest with right-censored data, including the survival function, the restricted mean and the cumulative incidence function in competing risks. The models, once the pseudo-observations have been computed, can be fitted using standard generalized estimating equation software. Regression models can however yield problematic results if the number of covariates is large in relation to the number of events observed. Guidelines of events per variable are often used in practice. These rules of thumb for the number of events per variable have primarily been established based on simulation studies for the logistic regression model and Cox regression model. In this paper we conduct a simulation study to examine the small sample behavior of the pseudo-observation method to estimate risk differences and relative risks for right-censored data. We investigate how coverage probabilities and relative bias of the pseudo-observation estimator interact with sample size, number of variables and average number of events per variable.
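
The pseudo-observation idea above can be sketched for the survival function at a fixed time t, using the usual leave-one-out (jackknife) construction n·S(t) − (n−1)·S₋ᵢ(t) with the Kaplan–Meier estimator. The function names are illustrative, inputs are assumed to be Python lists with at least two subjects, and the subsequent GEE fit is not shown.

```python
def km_survival(times, events, t):
    """Kaplan-Meier estimate of S(t) = P(T > t)."""
    surv = 1.0
    # Loop over the distinct observed failure times up to t.
    for s in sorted({u for u, d in zip(times, events) if d and u <= t}):
        at_risk = sum(1 for u in times if u >= s)
        deaths = sum(1 for u, d in zip(times, events) if d and u == s)
        surv *= 1.0 - deaths / at_risk
    return surv


def pseudo_observations(times, events, t):
    """Jackknife pseudo-observations for S(t), one per subject."""
    n = len(times)
    full = km_survival(times, events, t)
    pseudo = []
    for i in range(n):
        # Re-estimate S(t) with subject i left out.
        loo = km_survival(times[:i] + times[i + 1:],
                          events[:i] + events[i + 1:], t)
        pseudo.append(n * full - (n - 1) * loo)
    return pseudo
```

Without censoring, each pseudo-observation equals the indicator that the subject survived beyond t, which is why the values can be used as responses in a standard regression fit.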

Many analyses for incomplete longitudinal data are directed to examining the impact of covariates on the marginal mean responses. We consider the setting in which longitudinal responses are collected from individuals nested within clusters. We discuss methods for assessing covariate effects on the mean and association parameters when covariates are incompletely observed. Weighted first and second order estimating equations are constructed to obtain consistent estimates of mean and association parameters when covariates are missing at random. Empirical studies demonstrate that estimators from the proposed method have negligible finite sample biases in moderate samples. An application to the National Alzheimer's Coordinating Center (NACC) Uniform Data Set (UDS) demonstrates the utility of the proposed method.

Missing covariate values are a common problem in survival analysis. In this paper we propose a novel method for the Cox regression model that is close to maximum likelihood but avoids the use of the EM-algorithm. It exploits the fact that the observed hazard function is multiplicative in the baseline hazard function, the idea being to profile out this function before estimating the parameter of interest. In this step one uses a Breslow-type estimator to estimate the cumulative baseline hazard function. We focus on the situation where the observed covariates are categorical, which allows us to calculate estimators without having to assume anything about the distribution of the covariates. We show that the proposed estimator is consistent and asymptotically normal, and derive a consistent estimator of the variance–covariance matrix that does not involve any choice of a perturbation parameter. Moderate sample size performance of the estimators is investigated via simulation and by application to a real data example.
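
For orientation, the standard complete-data Breslow estimator of the cumulative baseline hazard is sketched below. This is an assumption about the general shape of the estimator only: the abstract's version is adapted to missing covariates, and that adaptation is not reproduced here. The function name and argument layout are illustrative.

```python
import math


def breslow_cumulative_hazard(times, events, x, beta, t):
    """Breslow estimate of the cumulative baseline hazard at time t.

    Each observed failure at time t_i <= t contributes
    1 / sum_{j at risk at t_i} exp(beta' x_j).
    """
    risk_scores = [math.exp(sum(b * v for b, v in zip(beta, xi))) for xi in x]
    total = 0.0
    for ti, di in zip(times, events):
        if di and ti <= t:
            # Risk set: subjects still under observation at time ti.
            denom = sum(r for tj, r in zip(times, risk_scores) if tj >= ti)
            total += 1.0 / denom
    return total
```

With `beta` set to zero the estimator reduces to the Nelson–Aalen estimator, which gives a quick sanity check.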

Recurrent event data often arise in biomedical studies, with examples including hospitalizations, infections, and treatment failures. In observational studies, it is often of interest to estimate the effects of covariates on the marginal recurrent event rate. The majority of existing rate regression methods assume multiplicative covariate effects. We propose a semiparametric model for the marginal recurrent event rate, wherein the covariates are assumed to add to the unspecified baseline rate. Covariate effects are summarized by rate differences, meaning that the absolute effect on the rate function can be determined from the regression coefficient alone. We describe modifications of the proposed method to accommodate a terminating event (e.g., death). Proposed estimators of the regression parameters and baseline rate are shown to be consistent and asymptotically Gaussian. Simulation studies demonstrate that the asymptotic approximations are accurate in finite samples. The proposed methods are applied to a state-wide kidney transplant data set.

Semiparametric regression models with multiple covariates are commonly encountered. When some covariates are not associated with the response variable, variable selection may lead to sparser models, more lucid interpretations and more accurate estimation. In this study, we adopt a sieve approach for the estimation of nonparametric covariate effects in semiparametric regression models. We adopt a two-step iterated penalization approach for variable selection. In the first step, a mixture of the Lasso and group Lasso penalties is employed to conduct the first-round variable selection and obtain the initial estimate. In the second step, a mixture of the weighted Lasso and weighted group Lasso penalties, with weights constructed using the initial estimate, is employed for variable selection. We show that the proposed iterated approach has the variable selection consistency property, even when the number of unknown parameters diverges with the sample size. Numerical studies, including simulation and analysis of a diabetes dataset, show satisfactory performance of the proposed approach.

In this paper, a generalized partially linear model (GPLM) with missing covariates is studied, and a Monte Carlo EM (MCEM) algorithm with the penalized-spline (P-spline) technique is developed to estimate the regression coefficients and the nonparametric function. Because classical model selection procedures such as Akaike's information criterion (AIC) become invalid for the considered models with incomplete data, new model selection criteria for GPLMs with missing covariates are proposed under two different missingness mechanisms, namely missing at random (MAR) and missing not at random (MNAR). The most attractive feature of our method is that it is rather general and can be extended to various situations with missing observations based on the EM algorithm. In particular, when no missing data are involved, the new model selection criteria reduce to the classical AIC. Therefore, we can not only compare models with missing observations under MAR/MNAR settings, but also compare missing-data models with complete-data models simultaneously. Theoretical properties of the proposed estimator, including consistency of the model selection criteria, are investigated. A simulation study and a real example are used to illustrate the proposed methodology.

We consider logistic regression with covariate measurement error. Most existing approaches require certain replicates of the error‐contaminated covariates, which may not be available in the data. We propose generalized method of moments (GMM) nonparametric correction approaches that use instrumental variables observed in a calibration subsample. The instrumental variable is related to the underlying true covariates through a general nonparametric model, and the probability of being in the calibration subsample may depend on the observed variables. We first take a simple approach adopting the inverse selection probability weighting technique using the calibration subsample. We then improve the approach based on the GMM using the whole sample. The asymptotic properties are derived, and the finite sample performance is evaluated through simulation studies and an application to a real data set.

Analysis of massive datasets is challenging owing to limitations of computer primary memory. Composite quantile regression (CQR) is a robust and efficient estimation method. In this paper, we extend CQR to massive datasets and propose a divide-and-conquer CQR method. The basic idea is to split the entire dataset into several blocks, apply the CQR method to the data in each block, and finally combine these regression results via a weighted average. The proposed approach significantly reduces the required amount of primary memory, and the resulting estimate is as efficient as if the entire data set were analysed simultaneously. Moreover, to improve the efficiency of CQR, we propose a weighted CQR estimation approach. To achieve sparsity with high-dimensional covariates, we develop a variable selection procedure to select significant parametric components and prove that the method possesses the oracle property. Both simulations and data analysis are conducted to illustrate the finite sample performance of the proposed methods.
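
The split, fit and combine steps described above can be sketched as follows. To keep the example short, a least-squares slope stands in for the per-block CQR fit, and the blocks are weighted by their sizes; both are assumptions made for illustration and not the estimator or weights of the paper.

```python
def ols_slope(xs, ys):
    """Least-squares slope for one block (stand-in for a per-block CQR fit)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return sxy / sxx


def divide_and_conquer(xs, ys, n_blocks, estimator=ols_slope):
    """Fit each block separately and combine by a weighted average."""
    n = len(xs)
    size = -(-n // n_blocks)  # ceiling division: rows per block
    estimates, weights = [], []
    for start in range(0, n, size):
        bx, by = xs[start:start + size], ys[start:start + size]
        estimates.append(estimator(bx, by))
        weights.append(len(bx))  # block-size weights (assumed)
    return sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
```

Only one block needs to be held in memory at a time, which is the point of the approach; in practice the blocks would be read from disk rather than sliced from in-memory lists.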

We consider failure time regression analysis with an auxiliary variable in the presence of a validation sample. We extend the nonparametric inference procedure of Zhou and Pepe to handle a continuous auxiliary or proxy covariate. We estimate the induced relative risk function with a kernel smoother and allow the selection probability of the validation set to depend on the observed covariates. We present some asymptotic properties for the kernel estimator and provide some simulation results. The method proposed is illustrated with a data set from an on-going epidemiologic study.

Quantile regression is a flexible approach to assessing covariate effects on failure time, which has attracted considerable interest in survival analysis. When the dimension of covariates is much larger than the sample size, feature screening and variable selection become indispensable. In this article, we introduce a new feature screening method for ultrahigh dimensional censored quantile regression. The proposed method works for a general class of survival models, allows for heterogeneity of data and enjoys desirable properties including the sure screening property and the ranking consistency property. Moreover, an iterative version of the screening algorithm is proposed to accommodate more complex situations. Monte Carlo simulation studies are designed to evaluate the finite sample performance under different model settings. We also illustrate the proposed methods through an empirical analysis.

Screening procedures play an important role in data analysis, especially in high-throughput biological studies where the datasets consist of more covariates than independent subjects. In this article, a Bayesian screening procedure is introduced for the binary response models with logit and probit links. In contrast to many screening rules based on marginal information involving one or a few covariates, the proposed Bayesian procedure simultaneously models all covariates and uses closed-form screening statistics. Specifically, we use the posterior means of the regression coefficients as screening statistics; by imposing a generalized g-prior on the regression coefficients, we derive the analytical form of their posterior means and compute the screening statistics without Markov chain Monte Carlo implementation. We evaluate the utility of the proposed Bayesian screening method using simulations and real data analysis. When the sample size is small, the simulation results suggest improved performance with comparable computational cost.

This article considers the analysis of complex monitored health data, where, in addition to a set of covariates, one or several signals often reflect the current health status, which can be represented by a finite number of states. In particular, we consider a novel application of a non-parametric state intensity regression method in order to study time-dependent effects of covariates on the state transition intensities. The method can handle baseline, time-varying as well as dynamic covariates. Because of its non-parametric nature, the method can handle different data types and challenges under minimal assumptions. If the signal that reflects the current health status is continuous, we propose the application of a weighted median and a hysteresis filter as data pre-processing steps in order to facilitate robust analysis. In intensity regression, covariates can be aggregated by a suitable functional form over a time history window. We propose to study the estimated cumulative regression parameters for different choices of the time history window in order to investigate short- and long-term effects of the given covariates. The proposed framework is discussed and applied to resuscitation data of newborns collected in Tanzania.
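
The two pre-processing steps named above can be sketched as simple building blocks. The abstract does not give their exact form, so the definitions below are assumptions: a weighted median taken as the smallest value whose cumulative weight reaches half the total, and a two-threshold (Schmitt-trigger style) hysteresis filter that maps a continuous signal to two states. Function names, thresholds and the two-state coding are illustrative.

```python
def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total weight."""
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in sorted(zip(values, weights)):
        running += weight
        if running >= half:
            return value


def hysteresis_filter(signal, low, high, initial_state=0):
    """Map a continuous signal to states 0/1 using two thresholds.

    The state switches to 1 when the signal reaches `high`, back to 0 when
    it drops to `low`, and is kept unchanged in between, which suppresses
    rapid switching caused by noise around a single threshold.
    """
    state = initial_state
    states = []
    for value in signal:
        if value >= high:
            state = 1
        elif value <= low:
            state = 0
        states.append(state)
    return states
```

Applying `weighted_median` over a sliding window would smooth the raw signal, and `hysteresis_filter` would then turn the smoothed signal into the discrete states on which a state transition model operates.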

Proportion differences are often used to estimate and test treatment effects in clinical trials with binary outcomes. In order to adjust for other covariates or intra-subject correlation among repeated measures, logistic regression or longitudinal data analysis models such as generalized estimating equations or generalized linear mixed models may be used for the analyses. However, these analysis models are often based on the logit link, which results in parameter estimates and comparisons on the log-odds ratio scale rather than on the proportion difference scale. A two-step method has been proposed in the literature to approximate the calculation of confidence intervals for the proportion difference using a concept of effective sample sizes. However, the performance of this two-step method was not investigated in that paper. In this note, we examine the properties of the two-step method and propose an adjustment to the effective sample size formula based on Bayesian information theory. Simulations are conducted to evaluate the performance and to show that the modified effective sample size improves the coverage property of the confidence intervals.

We study the problem of fitting a heteroscedastic median regression model with doubly truncated data. A self-consistency equation is proposed to obtain an estimator. We set up a least absolute deviation estimating function. We establish consistency and asymptotic normality for the case when covariates are discrete. The finite sample performance of the proposed estimators is investigated through simulation studies. The proposed method is illustrated using the AIDS Blood Transfusion Data.

When variable selection with stepwise regression and model fitting are conducted on the same data set, competition for inclusion in the model induces a selection bias in coefficient estimators away from zero. In proportional hazards regression with right-censored data, selection bias inflates the absolute values of the parameter estimates for selected variables, while the omission of other variables may shrink coefficients toward zero. This paper explores the extent of the bias in parameter estimates from stepwise proportional hazards regression and proposes a bootstrap method, similar to those proposed by Miller (Subset Selection in Regression, 2nd edn. Chapman & Hall/CRC, 2002) for linear regression, to correct for selection bias. We also use bootstrap methods to estimate the standard error of the adjusted estimators. Simulation results show that substantial biases could be present in uncorrected stepwise estimators and, for binary covariates, could exceed 250% of the true parameter value. The simulations also show that the conditional mean of the proposed bootstrap bias-corrected parameter estimator, given that a variable is selected, is moved closer to the unconditional mean of the standard partial likelihood estimator in the chosen model, and to the population value of the parameter. We also explore the effect of the adjustment on estimates of log relative risk, given the values of the covariates in a selected model. The proposed method is illustrated with data sets in primary biliary cirrhosis and in multiple myeloma from the Eastern Cooperative Oncology Group.
