Similar documents
1.
We consider parametric regression problems with some covariates missing at random. It is shown that the regression parameter remains identifiable under natural conditions. When the always observed covariates are discrete, we propose a semiparametric maximum likelihood method, which does not require parametric specification of the missing data mechanism or the covariate distribution. The global maximum likelihood estimator (MLE), which maximizes the likelihood over the whole parameter set, is shown to exist under simple conditions. For ease of computation, we also consider a restricted MLE which maximizes the likelihood over covariate distributions supported by the observed values. Under regularity conditions, the two MLEs are asymptotically equivalent and strongly consistent for a class of topologies on the parameter set.

2.
Doubly robust (DR) estimators of the mean with missing data are compared. An estimator is DR if either the regression of the missing variable on the observed variables or the missing data mechanism is correctly specified. One method is to include the inverse of the propensity score as a linear term in the imputation model [D. Firth and K.E. Bennett, Robust models in probability sampling, J. R. Statist. Soc. Ser. B. 60 (1998), pp. 3–21; D.O. Scharfstein, A. Rotnitzky, and J.M. Robins, Adjusting for nonignorable drop-out using semiparametric nonresponse models (with discussion), J. Am. Statist. Assoc. 94 (1999), pp. 1096–1146; H. Bang and J.M. Robins, Doubly robust estimation in missing data and causal inference models, Biometrics 61 (2005), pp. 962–972]. Another method is to calibrate the predictions from a parametric model by adding a mean of the weighted residuals [J.M. Robins, A. Rotnitzky, and L.P. Zhao, Estimation of regression coefficients when some regressors are not always observed, J. Am. Statist. Assoc. 89 (1994), pp. 846–866; D.O. Scharfstein, A. Rotnitzky, and J.M. Robins, Adjusting for nonignorable drop-out using semiparametric nonresponse models (with discussion), J. Am. Statist. Assoc. 94 (1999), pp. 1096–1146]. The penalized spline propensity prediction (PSPP) model incorporates the propensity score into the model non-parametrically [R.J.A. Little and H. An, Robust likelihood-based analysis of multivariate data with missing values, Statist. Sin. 14 (2004), pp. 949–968; G. Zhang and R.J. Little, Extensions of the penalized spline propensity prediction method of imputation, Biometrics, 65(3) (2008), pp. 911–918]. All these methods have consistency properties under misspecification of regression models, but their comparative efficiency and confidence coverage in finite samples have received little attention.
In this paper, we compare the root mean square error (RMSE), width of confidence interval and non-coverage rate of these methods under various mean and response propensity functions. We study the effects of sample size and robustness to model misspecification. The PSPP method yields estimates with smaller RMSE and width of confidence interval compared with other methods under most situations. It also yields estimates with confidence coverage close to the 95% nominal level, provided the sample size is not too small.
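
The second approach described above, calibrating model predictions by adding a mean of inverse-probability-weighted residuals, can be sketched as follows. This is only an illustrative sketch and not code from the paper: the function name `aipw_mean` and the two working models passed in as callables (`outcome_model`, `propensity_model`) are assumptions made for the example, and both models are taken as already fitted.

```python
from typing import Callable, Optional, Sequence


def aipw_mean(
    x: Sequence[float],
    y: Sequence[Optional[float]],
    observed: Sequence[bool],
    outcome_model: Callable[[float], float],
    propensity_model: Callable[[float], float],
) -> float:
    """Augmented inverse-probability-weighted estimate of E[Y].

    outcome_model(x)    : working prediction of Y given x (hypothetical, pre-fitted)
    propensity_model(x) : working probability that Y is observed given x
    """
    n = len(x)
    total = 0.0
    for xi, yi, ri in zip(x, y, observed):
        prediction = outcome_model(xi)
        # Every unit contributes its model prediction.
        total += prediction
        # Respondents also contribute a residual weighted by 1 / propensity.
        if ri:
            total += (yi - prediction) / propensity_model(xi)
    return total / n
```

With a constant-zero outcome model the estimate reduces to the plain inverse-probability-weighted mean; with a perfect outcome model the weighted residuals vanish and only the predictions remain, which is the intuition behind double robustness.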

3.
Biao Zhang 《Statistics》2016,50(5):1173-1194
Missing covariate data occurs often in regression analysis. We study methods for estimating the regression coefficients in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866] on regression analyses with missing covariates, in which they pioneered the use of two working models, the working propensity score model and the working conditional score model. A recent approach to missing covariate data analysis is the empirical likelihood method of Qin et al. [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503], which effectively combines unbiased estimating equations. In this paper, we consider an alternative likelihood approach based on the full likelihood of the observed data. This full likelihood-based method enables us to generate estimators for the vector of the regression coefficients that are (a) asymptotically equivalent to those of Qin et al. [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503] when the working propensity score model is correctly specified, and (b) doubly robust, like the augmented inverse probability weighting (AIPW) estimators of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866]. Thus, the proposed full likelihood-based estimators improve on the efficiency of the AIPW estimators when the working propensity score model is correct but the working conditional score model is possibly incorrect, and also improve on the empirical likelihood estimators of Qin, Zhang and Leung [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503] when the reverse is true, that is, the working conditional score model is correct but the working propensity score model is possibly incorrect. In addition, we consider a regression method for estimation of the regression coefficients when the working conditional score model is correctly specified; the asymptotic variance of the resulting estimator is no greater than the semiparametric variance bound characterized by the theory of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866]. Finally, we compare the finite-sample performance of various estimators in a simulation study.

4.
Abstract.  We propose and study a class of regression models, in which the mean function is specified parametrically as in the existing regression methods, but the residual distribution is modelled non-parametrically by a kernel estimator, without imposing any assumption on its distribution. This specification is different from the existing semiparametric regression models. The asymptotic properties of this likelihood and the maximum likelihood estimate (MLE) under this semiparametric model are studied. We show that under some regularity conditions, the MLE under this model is consistent (when compared with the possible pseudo-consistency of the parameter estimation under the existing parametric regression model), is asymptotically normal with rate and efficient. The non-parametric pseudo-likelihood ratio has the Wilks property as the true likelihood ratio does. Simulated examples are presented to evaluate the accuracy of the proposed semiparametric MLE method.

5.
Semiparametric transformation models provide flexible regression models for survival analysis, including the Cox proportional hazards and the proportional odds models as special cases. We consider the application of semiparametric transformation models in case-cohort studies, where the covariate data are observed only on cases and on a subcohort randomly sampled from the full cohort. We first propose an approximate profile likelihood approach with full-cohort data, which amounts to the pseudo-partial likelihood approach of Zucker [2005. A pseudo-partial likelihood method for semiparametric survival regression with covariate errors. J. Amer. Statist. Assoc. 100, 1264–1277]. Simulation results show that our proposal is almost as efficient as the nonparametric maximum likelihood estimator. We then extend this approach to the case-cohort design, applying the Horvitz–Thompson weighting method to the estimating equations from the approximated profile likelihood. Two levels of weights can be utilized to achieve unbiasedness and to gain efficiency. The resulting estimator has a closed-form asymptotic covariance matrix, and is found in simulations to be substantially more efficient than the estimator based on martingale estimating equations. The extension to left-truncated data will be discussed. We illustrate the proposed method on data from a cardiovascular risk factor study conducted in Taiwan.

6.
A generalized self-consistency approach to maximum likelihood estimation (MLE) and model building was developed in Tsodikov [2003. Semiparametric models: a generalized self-consistency approach. J. Roy. Statist. Soc. Ser. B Statist. Methodology 65(3), 759–774] and applied to a survival analysis problem. We extend the framework to obtain second-order results such as information matrix and properties of the variance. The multinomial model motivates the paper and is used throughout as an example. Computational challenges with the multinomial likelihood motivated Baker [1994. The Multinomial–Poisson transformation. The Statist. 43, 495–504] to develop the Multinomial–Poisson (MP) transformation for a large variety of regression models with multinomial likelihood kernel. Multinomial regression is transformed into a Poisson regression at the cost of augmenting model parameters and restricting the problem to discrete covariates. Imposing normalization restrictions by means of Lagrange multipliers [Lang, J., 1996. On the comparison of multinomial and Poisson log-linear models. J. Roy. Statist. Soc. Ser. B Statist. Methodology 58, 253–266] justifies the approach. Using the self-consistency framework we develop an alternative solution to multinomial model fitting that does not require augmenting parameters while allowing for a Poisson likelihood and arbitrary covariate structures. Normalization restrictions are imposed by averaging over artificial “missing data” (fake mixture). Lack of probabilistic interpretation at the “complete-data” level makes the use of the generalized self-consistency machinery essential.

7.
Epstein [Truncated life tests in the exponential case, Ann. Math. Statist. 25 (1954), pp. 555–564] introduced a hybrid censoring scheme (called Type-I hybrid censoring) and Chen and Bhattacharyya [Exact confidence bounds for an exponential parameter under hybrid censoring, Comm. Statist. Theory Methods 17 (1988), pp. 1857–1870] derived the exact distribution of the maximum-likelihood estimator (MLE) of the mean of a scaled exponential distribution based on a Type-I hybrid censored sample. Childs et al. [Exact likelihood inference based on Type-I and Type-II hybrid censored samples from the exponential distribution, Ann. Inst. Statist. Math. 55 (2003), pp. 319–330] provided an alternate simpler expression for this distribution, and also developed analogous results for another hybrid censoring scheme (called Type-II hybrid censoring). The purpose of this paper is to derive the exact bivariate distribution of the MLE of the parameter vector of a two-parameter exponential model based on hybrid censored samples. The marginal distributions are derived and exact confidence bounds for the parameters are obtained. The results are also used to derive the exact distribution of the MLE of the pth quantile, as well as the corresponding confidence bounds. These exact confidence intervals are then compared with parametric bootstrap confidence intervals in terms of coverage probabilities. Finally, we present some numerical examples to illustrate the methods of inference developed here.

8.
We consider statistical inference of unknown parameters in estimating equations (EEs) when some covariates have nonignorably missing values, which is quite common in practice but has rarely been discussed in the literature. When an instrument, a fully observed covariate vector that helps identify parameters under nonignorable missingness, is available, the conditional distribution of the missing covariates given other covariates can be estimated by the pseudolikelihood method of Zhao and Shao [(2015), ‘Semiparametric pseudo likelihoods in generalised linear models with nonignorable missing data’, Journal of the American Statistical Association, 110, 1577–1590] and be used to construct unbiased EEs. These modified EEs then constitute a basis for valid inference by empirical likelihood. Our method is applicable to a wide range of EEs used in practice. It is semiparametric since no parametric model for the propensity of missing covariate data is assumed. Asymptotic properties of the proposed estimator and the empirical likelihood ratio test statistic are derived. Some simulation results and a real data analysis are presented for illustration.

9.
We propose a profile conditional likelihood approach to handle missing covariates in the general semiparametric transformation regression model. The method estimates the marginal survival function by the Kaplan-Meier estimator, and then estimates the parameters of the survival model and the covariate distribution from a conditional likelihood, substituting the Kaplan-Meier estimator for the marginal survival function in the conditional likelihood. This method is simpler than full maximum likelihood approaches, and yields a consistent and asymptotically normal estimator of the regression parameter when censoring is independent of the covariates. The estimator demonstrates very high relative efficiency in simulations. When compared with complete-case analysis, the proposed estimator can be more efficient when the missing data are missing completely at random and can correct bias when the missing data are missing at random. The potential application of the proposed method to the generalized probit model with missing continuous covariates is also outlined.
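
The first step of the method above, estimating the marginal survival function with the Kaplan-Meier estimator, can be illustrated with a short sketch. The function name `kaplan_meier` and the input layout (parallel lists of times and event indicators) are assumptions made for this example; the conditional likelihood step of the paper is not reproduced here.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event occurred.

    times  : observed follow-up times
    events : 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths > 0:
            # Product-limit update: multiply by the conditional survival at t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set after t.
        at_risk -= leaving
    return curve
```

Subjects censored at a time tied with an event are treated as still at risk at that time, which is the usual convention.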

10.
Summary. The paper considers canonical link generalized linear models with stratum-specific nuisance intercepts and missing covariate data. This family includes the conditional logistic regression model. Existing methods for this problem, each of which uses a conditioning argument to eliminate the nuisance intercept, model either the missing covariate data or the missingness process. The paper compares these methods under a common likelihood framework. The semiparametric efficient estimator is identified, and a new estimator, which reduces dependence on the model for the missing covariate, is proposed. A simulation study compares the methods with respect to efficiency and robustness to model misspecification.

11.
Hu Yang 《Statistics》2013,47(6):759-766
In this paper, we introduce a stochastic restricted kd class estimator for the vector of parameters in a linear model when additional linear restrictions on the parameter vector are assumed to hold. The stochastic restricted kd class estimator is a generalization of the ordinary mixed estimator and the kd class estimator. We show that our new biased estimator is superior in the mean squared error matrix sense to the kd class estimator [S. Sakallıoğlu and S. Kaçiranlar, A new biased estimator based on ridge estimation, Statist. Papers 49 (2008), pp. 669–689] and the stochastic restricted Liu estimator [H. Yang and J.W. Xu, An alternative stochastic restricted Liu estimator in linear regression, Statist. Papers 50 (2009), pp. 639–647]. Finally, a numerical example is given to show the theoretical results.

12.
This article investigates the confidence regions for semiparametric nonlinear reproductive dispersion models (SNRDMs), which are an extension of nonlinear regression models. Based on the local linear estimate of the nonparametric component and the generalized profile likelihood estimate of the parameter in SNRDMs, a modified geometric framework of Bates and Watts is proposed. Within this geometric framework, we present three kinds of improved approximate confidence regions for the parameters and parameter subsets in terms of curvatures. The work extends the previous results of Hamilton et al. [in Accounting for intrinsic nonlinearity in nonlinear regression parameter inference regions, Ann. Statist. 10, pp. 386–393, 1982], Hamilton [in Confidence regions for parameter subset in nonlinear regression, Biometrika, 73, pp. 57–64, 1986], Wei [in On confidence regions of embedded models in regular parameter families (a geometric approach), Austral. J. Statist. 36, pp. 327–338, 1994], Tang et al. [in Confidence regions in quasi-likelihood nonlinear models: a geometric approach, J. Biomath. 15, pp. 55–64, 2000b] and Zhu et al. [in On confidence regions of semiparametric nonlinear regression models, Acta. Math. Scient. 20, pp. 68–75, 2000].

13.
In this note, the asymptotic variance formulas are explicitly derived and compared between the parametric and semiparametric estimators of a regression parameter and survival probability under the additive hazards model. To obtain explicit formulas, it is assumed that the covariate term including a regression coefficient follows a gamma distribution and the baseline hazard function is constant. The results show that the semiparametric estimator of the regression coefficient parameter is fully efficient relative to the parametric counterpart when the survival time and a covariate are independent, as in the proportional hazards model. Relative to a more realistic case of the parametric additive hazards model with a Weibull baseline, the loss of efficiency of the semiparametric estimator of survival probability is moderate.

14.
A semiparametric estimator based on an unknown density is uniformly adaptive if the expected loss of the estimator converges to the asymptotic expected loss of the maximum likelihood estimator based on the true density (MLE), and if convergence does not depend on either the parameter values or the form of the unknown density. Without uniform adaptivity, the asymptotic expected loss of the MLE need not approximate the expected loss of a semiparametric estimator for any finite sample. I show that a two-step semiparametric estimator is uniformly adaptive for the parameters of nonlinear regression models with autoregressive moving average errors.

15.
Consider the Lehmann model with time-dependent covariates, which is different from Cox’s model. We find that (1) the parameter space for β under the Lehmann model is restricted, and the maximum point of the parametric likelihood for β may lie outside the parameter space; (2) for some particular time-dependent covariate, under the standard generalized likelihood the semiparametric maximum likelihood estimator (SMLE) is inconsistent, and we propose a modified generalized likelihood which leads to a consistent SMLE.

16.
As a compromise between parametric regression and nonparametric regression, partially linear models are frequently used in statistical modelling. This article considers statistical inference for this semiparametric model when the linear covariate is measured with additive error and some additional linear restrictions on the parametric component are assumed to hold. We propose a restricted corrected profile least-squares estimator for the parametric component, and study the asymptotic normality of the estimator. To test hypotheses on the parametric component, we construct a Wald test statistic and obtain its limiting distribution. Some simulation studies are conducted to illustrate our approaches.

17.
The logistic regression model is used when the response variables are dichotomous. In the presence of multicollinearity, the variance of the maximum likelihood estimator (MLE) becomes inflated. The Liu estimator for the linear regression model was proposed by Liu to remedy this problem. Urgan and Tez and Mansson et al. examined the Liu estimator (LE) for the logistic regression model. We introduce the restricted Liu estimator (RLE) for the logistic regression model. Moreover, a Monte Carlo simulation study is conducted to compare the performances of the MLE, restricted maximum likelihood estimator (RMLE), LE, and RLE for the logistic regression model.
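
For orientation, the Liu estimator referred to above is usually written in the following form. This is the standard textbook form as commonly stated, not a formula quoted from the abstract; the symbols $X$ (design matrix), $d$ (shrinkage parameter, $0 < d < 1$) and $\hat{W}$ (diagonal weight matrix with entries $\hat{\pi}_i(1-\hat{\pi}_i)$ evaluated at the MLE) are notation assumed here. The restricted version proposed in the paper is not reproduced.

```latex
% Liu estimator in the linear model
\hat{\beta}_{d} = (X^{\top}X + I)^{-1}\,(X^{\top}X + dI)\,\hat{\beta}_{\mathrm{OLS}},
\qquad 0 < d < 1 .

% Analogue commonly used for logistic regression
\hat{\beta}_{d} = (X^{\top}\hat{W}X + I)^{-1}\,(X^{\top}\hat{W}X + dI)\,\hat{\beta}_{\mathrm{MLE}} .
```

Setting $d = 1$ recovers the unshrunken estimator in both cases, so $d$ controls how strongly the estimate is pulled toward zero.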

18.
Jingjing Wu 《Statistics》2015,49(4):711-740
The successful application of the Hellinger distance approach to fully parametric models is well known. The corresponding optimal estimators, known as minimum Hellinger distance (MHD) estimators, are efficient and have excellent robustness properties [Beran R. Minimum Hellinger distance estimators for parametric models. Ann Statist. 1977;5:445–463]. This combination of efficiency and robustness makes MHD estimators appealing in practice. However, their application to semiparametric statistical models, which have a nuisance parameter (typically of infinite dimension), has not been fully studied. In this paper, we investigate a methodology to extend the MHD approach to general semiparametric models. We introduce the profile Hellinger distance and use it to construct a minimum profile Hellinger distance estimator of the finite-dimensional parameter of interest. This approach is analogous in some sense to the profile likelihood approach. We investigate the asymptotic properties such as the asymptotic normality, efficiency, and adaptivity of the proposed estimator. We also investigate its robustness properties. We present its small-sample properties using a Monte Carlo study.

19.
In the context of discrete data, a sequential fixed-width confidence interval for an unknown parameter in a parametric model is constructed using a minimum Hellinger distance estimator (MHD) as the center of the interval. It is shown that our sequential procedure is asymptotically consistent and efficient, when the assumed parametric model is correct. These results, in addition to being exactly the same as those obtained by Khan [1969, A general method of determining fixed-width confidence intervals. Ann. Math. Statist. 40, 704–709] and Yu [1989, On fixed-width confidence intervals associated with maximum likelihood estimation. J. Theoret. Probab. 2, 193–199] using a maximum likelihood estimator (MLE), offer an alternative which has several in-built robustness properties. Monte Carlo simulations show that the performance of our sequential procedure based on MHD, measured in terms of average sample size and the coverage probability, is as good as that based on MLE, when the assumed Poisson model is correct. However, when the samples come from a gross-error contaminated Poisson model, our numerical results show that the deviation from the Poisson model assumption severely affects the performance of the sequential procedure based on MLE, while the procedure based on MHD continues to perform well, thus exhibiting robustness of MHD against gross-error contaminations even for random sample sizes.
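
The robustness contrast described above can be illustrated for a fixed sample with a rough sketch of a minimum Hellinger distance fit of a Poisson mean. This is not the sequential procedure of the paper: the function names, the search interval and the grid-plus-golden-section optimiser are all assumptions chosen for the example. For discrete data, minimising the Hellinger distance between the empirical and model probability mass functions is equivalent to maximising the affinity, the sum over observed values of the square root of their product.

```python
import math
from collections import Counter


def poisson_pmf(x, theta):
    """Poisson probability mass at x, computed on the log scale for stability."""
    return math.exp(-theta + x * math.log(theta) - math.lgamma(x + 1))


def affinity(counts, n, theta):
    """Sum over observed values of sqrt(empirical pmf * model pmf)."""
    return sum(math.sqrt(c / n * poisson_pmf(x, theta)) for x, c in counts.items())


def mhd_poisson(sample, grid_points=2000, iterations=60):
    """Crude minimum Hellinger distance estimate of a Poisson mean."""
    counts = Counter(sample)
    n = len(sample)
    # Assumed search interval: from just above zero to just past the largest value.
    lo, hi = 1e-6, max(sample) + 1.0
    step = (hi - lo) / grid_points
    # Coarse grid search guards against local maxima created by outliers.
    grid = (lo + k * step for k in range(grid_points + 1))
    best = max(grid, key=lambda t: affinity(counts, n, t))
    # Golden-section refinement around the best grid point.
    a, b = max(lo, best - step), min(hi, best + step)
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iterations):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if affinity(counts, n, c) > affinity(counts, n, d):
            b = d
        else:
            a = c
    return (a + b) / 2.0
```

For a sample such as `[2, 3, 1, 2, 4, 2, 3, 2, 1, 3, 50]`, the sample mean (the Poisson MLE) is dragged upward by the single gross error, whereas the estimate from `mhd_poisson` stays near the bulk of the data.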

20.
Estimation of two normal means with an order restriction is considered when a covariance matrix is known. It is shown that the restricted maximum likelihood estimator (MLE) stochastically dominates both estimators proposed by Hwang and Peddada [Confidence interval estimation subject to order restrictions. Ann Statist. 1994;22(1):67–93] and Peddada et al. [Estimation of order-restricted means from correlated data. Biometrika. 2005;92:703–715]. The estimators are also compared under the Pitman nearness criterion and it is shown that the MLE is closer to ordered means than the other two estimators. Estimation of linear functions of ordered means is also considered and a necessary and sufficient condition on the coefficients is given for the MLE to dominate the other estimators in terms of mean squared error.
