首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 343 毫秒
1.
This article compares the accuracy of the median unbiased estimator with that of the maximum likelihood estimator for a logistic regression model with two binary covariates. The former estimator is shown to be uniformly more accurate than the latter for small to moderately large sample sizes and a broad range of parameter values. In view of the recently developed efficient algorithms for generating exact distributions of sufficient statistics in binary-data problems, these results call for a serious consideration of median unbiased estimation as an alternative to maximum likelihood estimation, especially when the sample size is not large, or when the data structure is sparse.  相似文献   

2.
Abstract.  We consider inference for a semiparametric regression model where some covariates are measured with errors, and the errors in both the regression model and the mismeasured covariates are serially correlated. We propose a weighted estimating equations-based estimator (WEEBE) for the regression coefficients. We show that the WEEBE is asymptotically more efficient than the estimators that neglect the serial correlations. This is an interesting new finding since earlier results in the statistical literature have shown that the weighted estimation is not as efficient as the unweighted estimation when the measurement errors and serially correlated errors of the regression models exist simultaneously (Biometrics, 49, 1993, 1262; Technometrics, 42, 2000, 137). The proposed WEEBE does not require undersmoothing the regressor functions in order to make it attain the root- n consistency. Simulation studies show that the proposed estimator has nice finite sample properties. A real data set is used to illustrate the proposed method.  相似文献   

3.
This article studies the absolute penalized convex function estimator in sparse and high-dimensional additive hazards model. Under such model, we assume that the failure time data are interval-censored and the number of time-dependent covariates can be larger than the sample size. We establish oracle inequalities based on some natural extensions of the compatibility and cone invertibility factors of the Hessian matrix at the true parameters in the model. Some similar inequalities based on an extension of the restricted eigenvalue are also established. Under mild conditions, we prove that the compatibility and cone invertibility factors and the restricted eigenvalues are bounded from below by positive constants for time-dependent covariates.  相似文献   

4.
High-dimensional sparse modeling with censored survival data is of great practical importance, as exemplified by applications in high-throughput genomic data analysis. In this paper, we propose a class of regularization methods, integrating both the penalized empirical likelihood and pseudoscore approaches, for variable selection and estimation in sparse and high-dimensional additive hazards regression models. When the number of covariates grows with the sample size, we establish asymptotic properties of the resulting estimator and the oracle property of the proposed method. It is shown that the proposed estimator is more efficient than that obtained from the non-concave penalized likelihood approach in the literature. Based on a penalized empirical likelihood ratio statistic, we further develop a nonparametric likelihood approach for testing the linear hypothesis of regression coefficients and constructing confidence regions consequently. Simulation studies are carried out to evaluate the performance of the proposed methodology and also two real data sets are analyzed.  相似文献   

5.
We consider hypothesis testing problems for low‐dimensional coefficients in a high dimensional additive hazard model. A variance reduced partial profiling estimator (VRPPE) is proposed and its asymptotic normality is established, which enables us to test the significance of each single coefficient when the data dimension is much larger than the sample size. Based on the p‐values obtained from the proposed test statistics, we then apply a multiple testing procedure to identify significant coefficients and show that the false discovery rate can be controlled at the desired level. The proposed method is also extended to testing a low‐dimensional sub‐vector of coefficients. The finite sample performance of the proposed testing procedure is evaluated by simulation studies. We also apply it to two real data sets, with one focusing on testing low‐dimensional coefficients and the other focusing on identifying significant coefficients through the proposed multiple testing procedure.  相似文献   

6.
We propose a profile conditional likelihood approach to handle missing covariates in the general semiparametric transformation regression model. The method estimates the marginal survival function by the Kaplan-Meier estimator, and then estimates the parameters of the survival model and the covariate distribution from a conditional likelihood, substituting the Kaplan-Meier estimator for the marginal survival function in the conditional likelihood. This method is simpler than full maximum likelihood approaches, and yields consistent and asymptotically normally distributed estimator of the regression parameter when censoring is independent of the covariates. The estimator demonstrates very high relative efficiency in simulations. When compared with complete-case analysis, the proposed estimator can be more efficient when the missing data are missing completely at random and can correct bias when the missing data are missing at random. The potential application of the proposed method to the generalized probit model with missing continuous covariates is also outlined.  相似文献   

7.
In biostatistical applications interest often focuses on the estimation of the distribution of time T between two consecutive events. If the initial event time is observed and the subsequent event time is only known to be larger or smaller than an observed monitoring time C, then the data conforms to the well understood singly-censored current status model, also known as interval censored data, case I. Additional covariates can be used to allow for dependent censoring and to improve estimation of the marginal distribution of T. Assuming a wrong model for the conditional distribution of T, given the covariates, will lead to an inconsistent estimator of the marginal distribution. On the other hand, the nonparametric maximum likelihood estimator of FT requires splitting up the sample in several subsamples corresponding with a particular value of the covariates, computing the NPMLE for every subsample and then taking an average. With a few continuous covariates the performance of the resulting estimator is typically miserable. In van der Laan, Robins (1996) a locally efficient one-step estimator is proposed for smooth functionals of the distribution of T, assuming nothing about the conditional distribution of T, given the covariates, but assuming a model for censoring, given the covariates. The estimators are asymptotically linear if the censoring mechanism is estimated correctly. The estimator also uses an estimator of the conditional distribution of T, given the covariates. If this estimate is consistent, then the estimator is efficient and if it is inconsistent, then the estimator is still consistent and asymptotically normal. In this paper we show that the estimators can also be used to estimate the distribution function in a locally optimal way. Moreover, we show that the proposed estimator can be used to estimate the distribution based on interval censored data (T is now known to lie between two observed points) in the presence of covariates. The resulting estimator also has a known influence curve so that asymptotic confidence intervals are directly available. In particular, one can apply our proposal to the interval censored data without covariates. In Geskus (1992) the information bound for interval censored data with two uniformly distributed monitoring times at the uniform distribution (for T has been computed. We show that the relative efficiency of our proposal w.r.t. this optimal bound equals 0.994, which is also reflected in finite sample simulations. Finally, the good practical performance of the estimator is shown in a simulation study. This revised version was published online in July 2006 with corrections to the Cover Date.  相似文献   

8.
This paper considers estimation and prediction in the Aalen additive hazards model in the case where the covariate vector is high-dimensional such as gene expression measurements. Some form of dimension reduction of the covariate space is needed to obtain useful statistical analyses. We study the partial least squares regression method. It turns out that it is naturally adapted to this setting via the so-called Krylov sequence. The resulting PLS estimator is shown to be consistent provided that the number of terms included is taken to be equal to the number of relevant components in the regression model. A standard PLS algorithm can also be constructed, but it turns out that the resulting predictor can only be related to the original covariates via time-dependent coefficients. The methods are applied to a breast cancer data set with gene expression recordings and to the well known primary biliary cirrhosis clinical data.  相似文献   

9.
10.
In this paper, we consider the non-penalty shrinkage estimation method of random effect models with autoregressive errors for longitudinal data when there are many covariates and some of them may not be active for the response variable. In observational studies, subjects are followed over equally or unequally spaced visits to determine the continuous response and whether the response is associated with the risk factors/covariates. Measurements from the same subject are usually more similar to each other and thus are correlated with each other but not with observations of other subjects. To analyse this data, we consider a linear model that contains both random effects across subjects and within-subject errors that follows autoregressive structure of order 1 (AR(1)). Considering the subject-specific random effect as a nuisance parameter, we use two competing models, one includes all the covariates and the other restricts the coefficients based on the auxiliary information. We consider the non-penalty shrinkage estimation strategy that shrinks the unrestricted estimator in the direction of the restricted estimator. We discuss the asymptotic properties of the shrinkage estimators using the notion of asymptotic biases and risks. A Monte Carlo simulation study is conducted to examine the relative performance of the shrinkage estimators with the unrestricted estimator when the shrinkage dimension exceeds two. We also numerically compare the performance of the shrinkage estimators to that of the LASSO estimator. A longitudinal CD4 cell count data set will be used to illustrate the usefulness of shrinkage and LASSO estimators.  相似文献   

11.
ABSTRACT

We study the method for generating pseudo random numbers under various special cases of the Cox model with time-dependent covariates when the baseline hazard function may not be constant and the random variable may equal infinity with a positive probability. During our simulation studies in computing the partial likelihood estimates, in between 3% and 20% of the time with a moderate sample size, it happens that the partial likelihood estimate of the regression coefficient is ∞ for the data from the Cox model. We propose a semi-parametric estimator as a modification for such a case. We present simulation results on the asymptotic properties of the semi-parametric estimator.  相似文献   

12.
We examine the finite sample properties of the maximum likelihood estimator for the binary logit model with random covariates. Previous studies have either relied on large-sample asymptotics or have assumed non-random covariates. Analytic expressions for the first-order bias and second-order mean squared error function for the maximum likelihood estimator in this model are derived, and we undertake numerical evaluations to illustrate these analytic results for the single covariate case. For various data distributions, the bias of the estimator is signed the same as the covariate’s coefficient, and both the absolute bias and the mean squared errors increase symmetrically with the absolute value of that parameter. The behaviour of a bias-adjusted maximum likelihood estimator, constructed by subtracting the (maximum likelihood) estimator of the first-order bias from the original estimator, is examined in a Monte Carlo experiment. This bias-correction is effective in all of the cases considered, and is recommended for use when this logit model is estimated by maximum likelihood using small samples.  相似文献   

13.
Summary. In many biomedical studies, covariates are subject to measurement error. Although it is well known that the regression coefficients estimators can be substantially biased if the measurement error is not accommodated, there has been little study of the effect of covariate measurement error on the estimation of the dependence between bivariate failure times. We show that the dependence parameter estimator in the Clayton–Oakes model can be considerably biased if the measurement error in the covariate is not accommodated. In contrast with the typical bias towards the null for marginal regression coefficients, the dependence parameter can be biased in either direction. We introduce a bias reduction technique for the bivariate survival function in copula models while assuming an additive measurement error model and replicated measurement for the covariates, and we study the large and small sample properties of the dependence parameter estimator proposed.  相似文献   

14.
A semiparametric approach to model skewed/heteroscedastic regression data is discussed. We work with a semiparametric transform-both-sides regression model, which contains a parametric regression function and a nonparametric transformation. This model is adequate when the relationship between the median response and the explanatory variable has been specified by a theoretical result or a previous empirical study. The transform-both-sides model with a parametric transformation has been studied extensively and applied successfully to a number data sets. Allowing a nonparametric transformation function increases the flexibility of the model. In this article, we estimate the nonparametric transformation function by the conditional kernel density approach developed by Wang and Ruppert (1995), and then use a pseudo-maximum likelihood estimator to estimate the regression parameters. This estimate of the regression parameters has not been studied previously. In this article, the asymptotic distribution of this pseudo-MLE is derived. We also show that when σ, the standard deviation of the error, goes to zero (small σ asymptotics), this estimator is adaptive. Adaptive means that the regression parameters are estimated as precisely as when the transformation is known exactly. A similar result holds in the parametric approaches of Carroll and Ruppert (1984) and Ruppert and Aldershof (1989). Simulated and real examples are provided to illustrate the performance of the proposed estimator for finite sample size.  相似文献   

15.
This article studies variable selection and parameter estimation in the partially linear model when the number of covariates in the linear part increases to infinity. Using the bridge penalty method, we succeed in selecting the important covariates of the linear part. Under regularity conditions, we have shown that the bridge penalized estimator of the parametric part enjoys the oracle property. We also obtain the convergence rate of the estimator of the nonparametric part. Simulation studies show that the bridge estimator performs as well as the oracle estimator for the partially linear model. An application is analyzed to illustrate the bridge procedure.  相似文献   

16.
High-dimensional data with a group structure of variables arise always in many contemporary statistical modelling problems. Heavy-tailed errors or outliers in the response often exist in these data. We consider robust group selection for partially linear models when the number of covariates can be larger than the sample size. The non-convex penalty function is applied to achieve both goals of variable selection and estimation in the linear part simultaneously, and we use polynomial splines to estimate the nonparametric component. Under regular conditions, we show that the robust estimator enjoys the oracle property. Simulation studies demonstrate the performance of the proposed method with samples of moderate size. The analysis of a real example illustrates that our method works well.  相似文献   

17.
Recurrent events are frequently encountered in biomedical studies. Evaluating the covariates effects on the marginal recurrent event rate is of practical interest. There are mainly two types of rate models for the recurrent event data: the multiplicative rates model and the additive rates model. We consider a more flexible additive–multiplicative rates model for analysis of recurrent event data, wherein some covariate effects are additive while others are multiplicative. We formulate estimating equations for estimating the regression parameters. The estimators for these regression parameters are shown to be consistent and asymptotically normally distributed under appropriate regularity conditions. Moreover, the estimator of the baseline mean function is proposed and its large sample properties are investigated. We also conduct simulation studies to evaluate the finite sample behavior of the proposed estimators. A medical study of patients with cystic fibrosis suffered from recurrent pulmonary exacerbations is provided for illustration of the proposed method.  相似文献   

18.
A method based on pseudo-observations has been proposed for direct regression modeling of functionals of interest with right-censored data, including the survival function, the restricted mean and the cumulative incidence function in competing risks. The models, once the pseudo-observations have been computed, can be fitted using standard generalized estimating equation software. Regression models can however yield problematic results if the number of covariates is large in relation to the number of events observed. Guidelines of events per variable are often used in practice. These rules of thumb for the number of events per variable have primarily been established based on simulation studies for the logistic regression model and Cox regression model. In this paper we conduct a simulation study to examine the small sample behavior of the pseudo-observation method to estimate risk differences and relative risks for right-censored data. We investigate how coverage probabilities and relative bias of the pseudo-observation estimator interact with sample size, number of variables and average number of events per variable.  相似文献   

19.
The accelerated failure time (AFT) model is an important regression tool to study the association between failure time and covariates. In this paper, we propose a robust weighted generalized M (GM) estimation for the AFT model with right-censored data by appropriately using the Kaplan–Meier weights in the GM–type objective function to estimate the regression coefficients and scale parameter simultaneously. This estimation method is computationally simple and can be implemented with existing software. Asymptotic properties including the root-n consistency and asymptotic normality are established for the resulting estimator under suitable conditions. We further show that the method can be readily extended to handle a class of nonlinear AFT models. Simulation results demonstrate satisfactory finite sample performance of the proposed estimator. The practical utility of the method is illustrated by a real data example.  相似文献   

20.
Estimating equations which are not necessarily likelihood-based score equations are becoming increasingly popular for estimating regression model parameters. This paper is concerned with estimation based on general estimating equations when true covariate data are missing for all the study subjects, but surrogate or mismeasured covariates are available instead. The method is motivated by the covariate measurement error problem in marginal or partly conditional regression of longitudinal data. We propose to base estimation on the expectation of the complete data estimating equation conditioned on available data. The regression parameters and other nuisance parameters are estimated simultaneously by solving the resulting estimating equations. The expected estimating equation (EEE) estimator is equal to the maximum likelihood estimator if the complete data scores are likelihood scores and conditioning is with respect to all the available data. A pseudo-EEE estimator, which requires less computation, is also investigated. Asymptotic distribution theory is derived. Small sample simulations are conducted when the error process is an order 1 autoregressive model. Regression calibration is extended to this setting and compared with the EEE approach. We demonstrate the methods on data from a longitudinal study of the relationship between childhood growth and adult obesity.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号