20 similar articles found (search time: 13 ms)
1.
This paper develops a smoothed empirical likelihood (SEL)-based method to construct confidence intervals for quantile regression parameters with auxiliary information. First, we define the SEL ratio and show that it follows a chi-square distribution. We then construct confidence intervals according to this ratio. Finally, Monte Carlo experiments are employed to evaluate the proposed method.
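As a rough illustration of the chi-square calibration that this abstract relies on, the sketch below computes an empirical likelihood ratio in the simplest setting, a scalar mean, rather than the smoothed quantile-regression version studied in the paper. The names `el_log_ratio` and `in_el_interval`, the bisection solver, and the hard-coded 95% critical value are assumptions made for this example only.

```python
import math

CHI2_1_95 = 3.841  # approximate 0.95 quantile of the chi-square distribution with 1 df


def el_log_ratio(xs, mu, iters=200):
    """Return -2 log R(mu), the empirical log-likelihood ratio for a scalar mean."""
    z = [x - mu for x in xs]
    if min(z) >= 0.0 or max(z) <= 0.0:
        return math.inf  # mu is not strictly inside the range of the data
    # The Lagrange multiplier must keep every weight positive: 1 + lam * z_i > 0.
    lo = (-1.0 / max(z)) * (1.0 - 1e-9)
    hi = (-1.0 / min(z)) * (1.0 - 1e-9)

    def score(lam):
        # Strictly decreasing in lam on (lo, hi); its root gives the EL weights.
        return sum(zi / (1.0 + lam * zi) for zi in z)

    for _ in range(iters):  # plain bisection on the score equation
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log(1.0 + lam * zi) for zi in z)


def in_el_interval(xs, mu, critical=CHI2_1_95):
    """True if mu lies in the EL confidence interval at the given critical value."""
    return el_log_ratio(xs, mu) <= critical
```

A confidence interval is then the set of values `mu` for which `in_el_interval` returns `True`; scanning a grid of candidate values is one simple way to trace it out.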
2.
Quantile regression (QR) is a popular approach to estimate functional relations between variables for all portions of a probability distribution. Parameter estimation in QR with missing data is one of the most challenging issues in statistics. Regression quantiles can be substantially biased when observations are subject to missingness. We study several inverse probability weighting (IPW) estimators for parameters in QR when covariates or responses are subject to missing not at random. Maximum likelihood and semiparametric likelihood methods are employed to estimate the respondent probability function. To achieve nice efficiency properties, we develop an empirical likelihood (EL) approach to QR with the auxiliary information from the calibration constraints. The proposed methods are less sensitive to misspecified missing mechanisms. Asymptotic properties of the proposed IPW estimators are shown under general settings. The efficiency gain of the EL-based IPW estimator is quantified theoretically. Simulation studies and a data set on the work limitation of injured workers from Canada are used to illustrate our proposed methodologies.
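The inverse probability weighting idea in this abstract can be sketched for the simplest case, an unconditional quantile with known response probabilities. This is not the estimator of the paper, which handles regression quantiles and estimated response probabilities; the function name `ipw_quantile` and its arguments are hypothetical.

```python
def ipw_quantile(ys, observed, probs, tau):
    """Weighted tau-th quantile of the observed ys, using weights 1 / pi_i.

    ys       -- responses (entries for unobserved cases are ignored, may be None)
    observed -- 1 if y_i was observed, 0 otherwise
    probs    -- assumed-known probability that case i is observed
    tau      -- quantile level in (0, 1)
    """
    pairs = sorted((y, 1.0 / p) for y, d, p in zip(ys, observed, probs) if d)
    if not pairs:
        raise ValueError("no observed responses")
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for y, w in pairs:
        cumulative += w
        # The smallest y whose cumulative weight reaches tau * total
        # minimises the weighted check loss sum_i w_i * rho_tau(y_i - q).
        if cumulative >= tau * total:
            return y
    return pairs[-1][0]
```

Cases that were unlikely to be observed receive large weights, so they stand in for the similar cases that went missing.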
3.
There is much literature on statistical inference for distributions under missing data, but surprisingly little attention has been paid to missing data in the context of estimating a distribution with auxiliary information. This article proposes using auxiliary information in the presence of missing data. We use the method of Zhou, Wan and Wang (2008) to mitigate the effects of missing data through a reformulation of the estimating equations, imputed through a semi-parametric procedure. We can then estimate the distribution and the τth quantile of the distribution by taking auxiliary information into account. Asymptotic properties of the distribution estimator and the corresponding sample quantile are derived and analyzed. The distribution estimators based on our method are found to significantly outperform the corresponding estimators without auxiliary information. Some simulation studies are conducted to illustrate the finite sample performance of the proposed estimators.
4.
In this paper, we investigate empirical-likelihood-based inference for the construction of confidence intervals and regions for the parameters of interest in single index models with covariates missing at random. An augmented inverse probability weighted-type empirical likelihood ratio for the parameters of interest is defined such that this ratio is asymptotically standard chi-squared. Our approach directly calibrates the empirical log-likelihood ratio and does not need multiplication by an adjustment factor for the original ratio. Our bias-corrected empirical likelihood is self-scale invariant and no plug-in estimator for the limiting variance is needed. Some simulation studies are carried out to assess the performance of our proposed method.
5.
This paper studies penalized quantile regression for dynamic panel data with fixed effects, where the penalty involves l1 shrinkage of the fixed effects. Using extensive Monte Carlo simulations, we present evidence that the penalty term reduces the dynamic panel bias and increases the efficiency of the estimators. The underlying intuition is that there is no need to use instrumental variables for the lagged dependent variable in the dynamic panel data model without fixed effects. This provides an additional use for the shrinkage models, other than model selection and efficiency gains. We propose a Bayesian information criterion based estimator for the parameter that controls the degree of shrinkage. We illustrate the usefulness of the novel econometric technique by estimating a “target leverage” model that includes a speed of capital structure adjustment. Using the proposed penalized quantile regression model, the estimates of the adjustment speeds lie between 3% and 44% across the quantiles, showing strong evidence that there is substantial heterogeneity in the speed of adjustment among firms.
6.
Suppose that we have a linear regression model Y=X′β+ν0(X)ε with random error ε, where X is a random design variable and is observed completely, and Y is the response variable and some Y-values are missing at random (MAR). In this paper, based on the ‘complete’ data set for Y after inverse probability weighted imputation, we construct empirical likelihood statistics on EY and β which have χ2-type limiting distributions under some new conditions compared with Xue (2009). Combined with Xue (2009), our results broaden the applicable scope of the approach.
7.
Logistic regression plays an important role in many fields. In practice, we often encounter missing covariates in different applied sectors, particularly in biomedical sciences. Ibrahim (1990) proposed a method to handle missing covariates in the generalized linear model (GLM) setup. It is well known that logistic regression estimates using small or medium sized missing data are biased. Considering data that are missing at random, in this paper we reduce the bias by two methods: first, we derive a closed form bias expression using Cox and Snell (1968), and second, we use a likelihood based modification similar to Firth (1993). We show analytically that the Firth type likelihood modification in Ibrahim (1990) leads to second order bias reduction. The proposed methods are simple to apply to an existing method and need no analytical work, with the exception of a small change in the optimization function. We have carried out extensive simulation studies comparing the methods, and our simulation results are also supported by a real-world data set.
8.
Randomized response is an interview technique designed to eliminate response bias when sensitive questions are asked. In this paper, we present a logistic regression model on randomized response data when the covariates on some subjects are missing at random. In particular, we propose Horvitz and Thompson (1952)-type weighted estimators by using different estimates of the selection probabilities. We present large sample theory for the proposed estimators and show that they are more efficient than the estimator using the true selection probabilities. Simulation results support theoretical analysis. We also illustrate the approach using data from a survey of cable TV.
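To make the contrast between true and estimated selection probabilities concrete, here is a minimal sketch of Horvitz-Thompson-type weighting for a plain mean with a constant selection probability. It is not the logistic randomized-response estimator of the paper; the function names `ht_mean` and `ht_mean_estimated` and the constant-probability setup are assumptions made for illustration.

```python
def ht_mean(ys, observed, pi):
    """Horvitz-Thompson-type mean: each observed y is weighted by 1 / pi."""
    n = len(ys)
    return sum(y for y, d in zip(ys, observed) if d) / (n * pi)


def ht_mean_estimated(ys, observed):
    """Same estimator, with pi replaced by the observed selection fraction."""
    pi_hat = sum(observed) / len(observed)
    return ht_mean(ys, observed, pi_hat)
```

With an estimated constant probability the estimator reduces to the average of the observed values, because the weights are rescaled by the number of cases actually observed.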
9.
In this article, we study variable selection and estimation for linear regression models with missing covariates. The proposed estimation method is almost as efficient as the popular least-squares-based estimation method for normal random errors and is empirically shown to be much more efficient and robust with respect to heavy tailed errors or outliers in the responses and covariates. To achieve sparsity, a variable selection procedure based on SCAD is proposed to conduct estimation and variable selection simultaneously. The procedure is shown to possess the oracle property. To deal with the missing covariates, we consider the inverse probability weighted estimators for the linear model when the selection probability is known or unknown. It is shown that the estimator using the estimated selection probability has a smaller asymptotic variance than that with the true selection probability, and thus is more efficient. Therefore, the important Horvitz-Thompson property is verified for the penalized rank estimator with missing covariates in the linear model. Some numerical examples are provided to demonstrate the performance of the estimators.
10.
In this paper, we develop Bayesian methodology and computational algorithms for variable subset selection in Cox proportional hazards models with missing covariate data. A new joint semi-conjugate prior for the piecewise exponential model is proposed in the presence of missing covariates and its properties are examined. The covariates are assumed to be missing at random (MAR). Under this new prior, a version of the Deviance Information Criterion (DIC) is proposed for Bayesian variable subset selection in the presence of missing covariates. Monte Carlo methods are developed for computing the DICs for all possible subset models in the model space. A Bone Marrow Transplant (BMT) dataset is used to illustrate the proposed methodology.
11.
Myunghee Cho Paik 《Communications in Statistics - Simulation and Computation》2013,42(1):1-19
Various methods have been suggested in the literature to handle a missing covariate in the presence of surrogate covariates. These methods belong to one of two paradigms. In the imputation paradigm, Pepe and Fleming (1991) and Reilly and Pepe (1995) suggested filling in missing covariates using the empirical distribution of the covariate obtained from the observed data. We can proceed one step further by imputing the missing covariate using nonparametric maximum likelihood estimates (NPMLE) of the density of the covariate. Recently Murphy and Van der Vaart (1998a) showed that such an approach yields a consistent, asymptotically normal, and semiparametric efficient estimate for the logistic regression coefficient. In the weighting paradigm, Zhao and Lipsitz (1992) suggested an estimating function using completely observed records after weighting inversely by the probability of observation. An extension of this weighting approach designed to achieve the semiparametric efficiency bound is considered by Robins, Hsieh and Newey (RHN) (1995). The two ends of each paradigm (NPMLE and RHN) attain the efficiency bound and are asymptotically equivalent. However, both require a substantial amount of computation. A question arises whether and when, in practical situations, this extensive computation is worthwhile. In this paper we investigate the performance of single and multiple imputation estimates, weighting estimates, semiparametric efficient estimates, and two new imputation estimates. Simulation studies suggest that the sample size should be substantially large (e.g. n=2000) for NPMLE and RHN to be more efficient than simpler imputation estimates. When the sample size is moderately large (n≤1500), simpler imputation estimates have as small a variance as semiparametric efficient estimates.
12.
Gabriela Ciuperca 《Journal of Statistical Computation and Simulation》2013,83(4):739-758
In this paper, a nonlinear model with response variables missing at random is studied. In order to improve the coverage accuracy for model parameters, the empirical likelihood (EL) ratio method is considered. On the complete data, the EL statistic for the parameters and its approximation have a χ2 asymptotic distribution. When the responses are reconstituted using a semi-parametric method, the empirical log-likelihood on the response variables associated with the imputed data is also asymptotically χ2. The Wilks theorem for EL on the parameters, based on reconstituted data, is also satisfied. These results can be used to construct the confidence region for the model parameters and the response variables. It is shown via Monte Carlo simulations that the EL methods outperform the normal approximation-based method in terms of coverage probability for the unknown parameter, including on the reconstituted data. The advantages of the proposed method are exemplified on real data.
13.
In this article, we consider the inverse probability weighted estimators for a single-index model with missing covariates when the selection probabilities are known or unknown. It is shown that the estimator for the index parameter using estimated selection probabilities has a smaller asymptotic variance than that with true selection probabilities, and thus is more efficient. Therefore, the important Horvitz-Thompson property is verified for the index parameter in the single-index model. However, this difference disappears for the estimators of the link function. Some numerical examples and a real data application are also presented to illustrate the performance of the estimators.
14.
Subset selection is an extensively studied problem in statistical learning and is especially popular in regression analysis. The problem has received considerable attention for generalized linear models as well as other types of regression methods. Quantile regression is one of the most widely used regression methods. In this article, we consider the subset selection problem for quantile regression analysis, adopting some recent Bayesian information criteria. We also use heuristic optimization during the selection process. Simulation and real data application results demonstrate the capability of these information criteria. According to the results, these information criteria can determine the true models effectively in quantile regression models.
15.
The authors study the empirical likelihood method for linear regression models. They show that when missing responses are imputed using least squares predictors, the empirical log‐likelihood ratio is asymptotically a weighted sum of chi‐square variables with unknown weights. They obtain an adjusted empirical log‐likelihood ratio which is asymptotically standard chi‐square and hence can be used to construct confidence regions. They also obtain a bootstrap empirical log‐likelihood ratio and use its distribution to approximate that of the empirical log‐likelihood ratio. A simulation study indicates that the proposed methods are comparable in terms of coverage probabilities and average lengths of confidence intervals, and perform better than a normal approximation based method.
16.
We propose a new adaptive L1 penalized quantile regression estimator for high-dimensional sparse regression models with heterogeneous error sequences. We show that under weaker conditions compared with alternative procedures, the adaptive L1 quantile regression selects the true underlying model with probability converging to one, and the unique estimates of nonzero coefficients it provides have the same asymptotic normal distribution as the quantile estimator which uses only the covariates with non-zero impact on the response. Thus, the adaptive L1 quantile regression enjoys oracle properties. We propose a completely data driven choice of the penalty level λn, which ensures good performance of the adaptive L1 quantile regression. Extensive Monte Carlo simulation studies have been conducted to demonstrate the finite sample performance of the proposed method.
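The kind of objective an adaptive L1 penalized quantile regression minimises can be sketched as the check loss plus a weighted absolute-value penalty. The exact weighting scheme and the data-driven choice of the penalty level are not given in the abstract, so the `weights` and `lam` arguments below are placeholders, and the function names are hypothetical.

```python
def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def adaptive_l1_objective(beta, xs, ys, tau, lam, weights):
    """Check-loss fit plus an adaptively weighted L1 penalty.

    beta    -- candidate coefficient vector
    xs, ys  -- covariate rows and responses
    tau     -- quantile level in (0, 1)
    lam     -- penalty level (assumed given here)
    weights -- one non-negative penalty weight per coefficient (assumed given)
    """
    fit = sum(
        check_loss(y - sum(b * x for b, x in zip(beta, row)), tau)
        for row, y in zip(xs, ys)
    )
    penalty = lam * sum(w * abs(b) for w, b in zip(weights, beta))
    return fit + penalty
```

An estimator would minimise this objective over `beta`, for instance by linear programming; the sketch only evaluates it for a given candidate.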
17.
Distribution function estimation plays a foundational role in statistics, since the population distribution is always involved in statistical inference and is usually unknown. In this paper, we consider the estimation of the distribution function of a response variable Y with missing responses in regression problems. It is proved that the augmented inverse probability weighted estimator converges weakly to a zero mean Gaussian process. An augmented inverse probability weighted empirical log-likelihood function is also defined. It is shown that the empirical log-likelihood converges weakly to the square of a Gaussian process with mean zero and variance one. We apply these results to the construction of Gaussian process approximation based confidence bands and empirical likelihood based confidence bands of the distribution function of Y. A simulation is conducted to evaluate the confidence bands.
18.
Empirical likelihood-based inference in nonlinear regression models with missing responses at random
This paper investigates the estimation of regression parameters and the response mean in nonlinear regression models when response variables are missing with probabilities depending on covariates. We propose four empirical likelihood (EL)-based estimators for the regression parameters and the response mean. The resulting estimators are shown to be consistent and asymptotically normal under some general assumptions. To construct the confidence regions for the regression parameters as well as the response mean, we develop four EL ratio statistics, which are proven to have the χ2 distribution asymptotically. Simulation studies and an artificial data set are used to illustrate the proposed methodologies. Empirical results show that the EL method behaves better than the normal approximation method and that the coverage probabilities and average lengths depend on the selection probability function.
19.
Ji Chen 《Journal of nonparametric statistics》2019,31(2):420-434
Non-response or missing data is a common phenomenon in many areas. Non-ignorable non-response, a response mechanism that depends on the values of the variable having non-response, is the most difficult type of non-response to handle. This paper considers statistical inference of unknown parameters in estimating equations (EEs) when the variable of interest has non-ignorable non-response. By utilising the cutting-edge technique of non-response instruments, a parametric response propensity function can be identified and estimated. Then a semiparametric likelihood is constructed with the propensity function, EEs and auxiliary information being incorporated into the constraints to make the inference valid and improve the estimation efficiency. Asymptotic distributions for the resulting parameter estimates are derived. Empirical results including two simulation studies and a real example show that the proposed method gives promising results.
20.
Denis H. Y. Leung, Jing Qin 《Journal of the Royal Statistical Society. Series C, Applied statistics》2006,55(3):379-396
Summary. In many surveys, missing response is a common problem. As an example, Zahner, Jacobs, Freeman and Trainor analysed data from a study of child psychopathology in the State of Connecticut, USA. In that study, the response variable, psychopathology, was inferred from questions that were addressed to teachers of the children and was subject to a high level of missingness. However, the missing responses were supplemented by surrogate information that was provided by the parents and/or the primary care providers of the children. In such a situation, it is conceivable that the supplemental information can be used to recover some of the information that has been lost in the cases with missing response. This paper considers a method using empirical likelihood. Empirical likelihood is well known in providing nonparametric inference. But its application has largely been confined to complete-data situations. The method proposed exploits the semiparametric nature of empirical likelihood. The method gives consistent estimates if the cases with non-missing responses form a random sample of the population. In large samples, the method behaves similarly to a regression estimate that is applied to estimating equations. The method is easy to implement with standard statistical packages. In a small sample study, the method was found to give favourable results, when compared with existing methods.