Similar articles
 20 similar articles found (search time: 31 ms)
1.
In this paper, we conducted a simulation study to evaluate the performance of four algorithms for estimating the generalized propensity score (GPS): multinomial logistic regression (MLR), bagging (BAG), random forest (RF), and gradient boosting (GB). As with the propensity score (PS), the ultimate goal of using the GPS is to obtain unbiased estimates of average treatment effects (ATEs) in observational studies. We used the GPS estimates computed from these four algorithms with the generalized doubly robust (GDR) estimator to estimate ATEs, and we evaluated these ATE estimates in terms of bias and mean squared error (MSE). Simulation results show that, overall, the GB algorithm produced the best ATE estimates under these evaluation criteria. Thus, we recommend using the GB algorithm for estimating the GPS in practice.

2.
Coefficient estimation in linear regression models with missing data is routinely carried out in the mean regression framework. However, the mean regression theory breaks down if the error variance is infinite. In addition, correct specification of the likelihood function for existing imputation approaches is often challenging in practice, especially for skewed data. In this paper, we develop a novel composite quantile regression and a weighted quantile average estimation procedure for parameter estimation in linear regression models when some responses are missing at random. Instead of imputing the missing response by randomly drawing from its conditional distribution, we propose to impute both missing and observed responses by their estimated conditional quantiles given the observed data and to use the parametrically estimated propensity scores to weight the check functions that define a regression parameter. Both estimation procedures are resistant to heavy-tailed errors or outliers in the response and can achieve good robustness and efficiency. Moreover, we propose adaptive penalization methods to simultaneously select significant variables and estimate unknown parameters. Asymptotic properties of the proposed estimators are carefully investigated. An efficient algorithm is developed for fast implementation of the proposed methodologies. We also discuss a model selection criterion, which is based on an ICQ-type statistic, to select the penalty parameters. The performance of the proposed methods is illustrated via simulated and real data sets.

3.
The asymmetric Laplace likelihood naturally arises in the estimation of conditional quantiles of a response variable given covariates. The estimation of its parameters entails unconstrained maximization of a concave and non-differentiable function over the real space. In this note, we describe a maximization algorithm based on the gradient of the log-likelihood that generates a finite sequence of parameter values along which the likelihood increases. The algorithm can be applied to the estimation of mixed-effects quantile regression, Laplace regression with censored data, and other models based on Laplace likelihood. In a simulation study and in a number of real-data applications, the proposed algorithm has shown notable computational speed.

4.
Expectile regression [Newey W, Powell J. Asymmetric least squares estimation and testing, Econometrica. 1987;55:819–847] is a useful tool for estimating the conditional expectiles of a response variable given a set of covariates. Expectile regression at the 50% level is the classical conditional mean regression. In many real applications, having multiple expectiles at different levels provides a more complete picture of the conditional distribution of the response variable. The multiple linear expectile regression model has been well studied [Newey W, Powell J. Asymmetric least squares estimation and testing, Econometrica. 1987;55:819–847; Efron B. Regression percentiles using asymmetric squared error loss, Stat Sin. 1991;1:93–125], but it can be too restrictive for many real applications. In this paper, we derive a regression tree-based gradient boosting estimator for nonparametric multiple expectile regression. The new estimator, referred to as ER-Boost, is implemented in an R package erboost publicly available at http://cran.r-project.org/web/packages/erboost/index.html. We use two homoscedastic/heteroscedastic random-function-generator models in simulation to show the high predictive accuracy of ER-Boost. As an application, we apply ER-Boost to analyse North Carolina County crime data. From the nonparametric expectile regression analysis of this dataset, we draw several interesting conclusions that are consistent with the previous study using the economic model of crime. This real data example also demonstrates some useful features of ER-Boost, such as its ability to handle different types of covariates and its model interpretation tools.
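
The statement that the 50% expectile is the conditional mean can be checked on a plain sample. The sketch below is not ER-Boost and uses no covariates; it only computes the expectile of a list of numbers by iterating an asymmetrically weighted mean, which is the first-order condition of the asymmetric squared error loss. The function name `expectile` and the sample values are illustrative assumptions.

```python
def expectile(values, tau, tol=1e-12, max_iter=1000):
    """Return the tau-expectile of a sample.

    The tau-expectile m minimises sum(|tau - 1{y < m}| * (y - m)**2).
    Setting the derivative to zero shows that m is a weighted mean with
    weight tau above m and 1 - tau below it, so we iterate that mean.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    m = sum(values) / len(values)  # start from the ordinary mean
    for _ in range(max_iter):
        # Points above the current estimate get weight tau, the rest 1 - tau.
        weights = [tau if v > m else 1.0 - tau for v in values]
        new_m = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_m - m) < tol:
            return new_m
        m = new_m
    return m


sample = [1.0, 2.0, 3.0, 10.0]   # hypothetical data
print(expectile(sample, 0.5))    # the ordinary mean of the sample
print(expectile(sample, 0.9))    # pulled towards the upper tail
```

At `tau = 0.5` the weights are all equal, so the iteration stops at the sample mean immediately; levels above 0.5 move the estimate towards large observations and levels below 0.5 towards small ones.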

5.
Quantile regression (QR) provides estimates of a range of conditional quantiles. This stands in contrast to traditional regression techniques, which focus on a single conditional mean function. Lee et al. [Regularization of case-specific parameters for robustness and efficiency. Statist Sci. 2012;27(3):350–372] proposed efficient QR by rounding the sharp corner of the loss. The main modification generally involves an asymmetric ℓ2 adjustment of the loss function around zero. We extend the idea of ℓ2 adjusted QR to linear heterogeneous models. The ℓ2 adjustment is constructed to diminish as sample size grows. Conditions to retain consistency properties are also provided.

6.
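The gradient boosting view explains how boosting algorithms work by assuming that the space spanned by the base learners is a continuous function space; in practice, however, the base-learner space formed from a finite sample is not necessarily continuous. To address this problem, we start from the additive-model perspective and, based on squared loss, propose a new method, resampling boosted regression trees. The method is a stagewise updating algorithm for a weighted additive model. Experimental results show that it significantly improves the performance of a single regression tree, reduces prediction error, and achieves lower prediction error than the L2Boost algorithm.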

7.
Quantile regression (QR), proposed by Koenker and Bassett [Regression quantiles, Econometrica 46(1) (1978), pp. 33–50], is a statistical technique that estimates conditional quantiles. It has been widely studied and applied in economics. Meinshausen [Quantile regression forests, J. Mach. Learn. Res. 7 (2006), pp. 983–999] proposed quantile regression forests (QRF), a non-parametric method based on random forests. QRF performs well in terms of prediction accuracy, but it struggles with noisy data sets. This motivates us to propose a multi-step QR tree method using GUIDE (Generalized, Unbiased, Interaction Detection and Estimation), developed by Loh [Regression trees with unbiased variable selection and interaction detection, Statist. Sinica 12 (2002), pp. 361–386]. Our simulation study shows that the multi-step QR tree performs better than a single tree or QRF, especially on data sets with many irrelevant variables.

8.
This paper proposes a consistent parametric test of Granger-causality in quantiles. Although the concept of Granger-causality is defined in terms of the conditional distribution, most articles have tested Granger-causality using conditional mean regression models in which the causal relations are linear. Rather than focusing on a single part of the conditional distribution, we develop a test that evaluates nonlinear causalities and possible causal relations in all conditional quantiles, which provides a sufficient condition for Granger-causality when all quantiles are considered. The proposed test statistic has correct asymptotic size, is consistent against fixed alternatives, and has power against Pitman deviations from the null hypothesis. As the proposed test statistic is asymptotically nonpivotal, we tabulate critical values via a subsampling approach. We present Monte Carlo evidence and an application considering the causal relation between the gold price, the USD/GBP exchange rate, and the oil price.

9.
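A review of quantile regression techniques   Total citations: 16, self-citations: 0, citations by others: 16
Ordinary least squares regression builds a linear model relating the conditional mean of the response Y, given X = x, to the covariates X. Quantile regression instead models the relationship between the covariates X and the conditional quantiles of the response Y. Compared with ordinary mean regression, it fully captures the effect of X on the location, scale, and shape of the distribution of Y, and it has very wide application, especially where the tails are of particular interest. This article introduces the concept of quantile regression together with its estimation, testing, and goodness of fit, reviews the development of quantile regression and its applications in several areas of economic research, and closes with a summary.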

10.
A class of trimmed linear conditional estimators based on regression quantiles for the linear regression model is introduced. This class serves as a robust analogue of non-robust linear unbiased estimators. Asymptotic analysis then shows that the trimmed least squares estimator based on regression quantiles (Koenker and Bassett (1978)) is the best in this estimator class in terms of asymptotic covariance matrices. The class of trimmed linear conditional estimators contains the Mallows-type bounded influence trimmed means (see De Jongh et al. (1988)) and trimmed instrumental variables estimators. A large sample methodology based on the trimmed instrumental variables estimator for confidence ellipsoids and hypothesis testing is also provided.

11.
Value at risk (VaR) is the standard measure of market risk used by financial institutions. Interpreting the VaR as the quantile of future portfolio values conditional on current information, the conditional autoregressive value at risk (CAViaR) model specifies the evolution of the quantile over time using an autoregressive process and estimates the parameters with regression quantiles. Utilizing the criterion that, in each period, the probability of exceeding the VaR must be independent of all the past information, we introduce a new test of model adequacy, the dynamic quantile test. Applications to real data provide empirical support for this methodology.
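
The autoregressive quantile recursion and the "probability of exceeding the VaR" criterion can be sketched as follows. The abstract does not give a functional form, so the recursion below assumes one common choice, the symmetric absolute value form, in which the quantile depends on its own lag and on the absolute value of the previous return. The coefficients, the starting quantile, the return series, and the function names are all hypothetical. No estimation or test statistic is computed; the sketch only builds the hit sequence on which a dynamic quantile test would operate.

```python
def sav_caviar_path(returns, beta, q0):
    """Quantile path q_t = b0 + b1 * q_{t-1} + b2 * |y_{t-1}|.

    `beta` = (b0, b1, b2) and the starting quantile `q0` are assumed given;
    in practice they would be estimated by regression quantiles.
    """
    b0, b1, b2 = beta
    path = [q0]
    for y_prev in returns[:-1]:
        path.append(b0 + b1 * path[-1] + b2 * abs(y_prev))
    return path


def hits(returns, quantiles, theta):
    """Hit_t = 1{y_t < q_t} - theta.

    Under a correctly specified model, each hit has mean zero given past
    information, which is the property a dynamic quantile test examines.
    """
    return [(1.0 if y < q else 0.0) - theta for y, q in zip(returns, quantiles)]


returns = [0.0, -2.0, 1.0]                       # hypothetical returns
path = sav_caviar_path(returns, (-0.1, 0.5, -0.5), q0=-1.0)
print(path)
print(hits(returns, path, theta=0.05))
```

Here the quantile is expressed on the return scale, so a 5% quantile is negative and a hit occurs when the return falls below it. Reporting VaR as a positive number only flips the sign convention.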

12.
We propose a new algorithm for simultaneous variable selection and parameter estimation for the single-index quantile regression (SIQR) model. The proposed algorithm, which is non-iterative, consists of two steps. Step 1 performs an initial variable selection method. Step 2 uses the results of Step 1 to obtain better estimates of the conditional quantiles and, using them, to perform simultaneous variable selection and estimation of the parametric component of the SIQR model. It is shown that the initial variable selection method consistently estimates the relevant variables, and the estimated parametric component derived in Step 2 satisfies the oracle property.

13.
Tianqing Liu. Statistics, 2016, 50(1): 89-113
This paper proposes an empirical likelihood-based weighted (ELW) quantile regression approach for estimating the conditional quantiles when some covariates are missing at random. The proposed ELW estimator is computationally simple and achieves semiparametric efficiency if the probability of missingness is correctly specified. The limiting covariance matrix of the ELW estimator can be estimated by a resampling technique, which does not involve nonparametric density estimation or numerical derivatives. Simulation results show that the ELW method works remarkably well in finite samples. A real data example is used to illustrate the proposed ELW method.

14.
15.
Bias-corrected random forests in regression   Total citations: 1, self-citations: 0, citations by others: 1
It is well known that random forests reduce the variance of the regression predictors compared to a single tree, while leaving the bias unchanged. In many situations, the dominating component in the risk turns out to be the squared bias, which makes bias correction necessary. In this paper, random forests are used to estimate the regression function. Five different methods for estimating bias are proposed and discussed. Simulated and real data are used to study the performance of these methods. Our proposed methods are effective in reducing bias in the regression context.

16.
Quantile regression is a very important statistical tool for predictive modelling and risk assessment. For many applications, conditional quantiles at different levels are estimated separately. Consequently, the monotonicity of conditional quantiles can be violated when quantile regression curves cross each other. In this paper, we propose a new non-crossing Bayesian multiple quantile regression based on a heavy-tailed distribution. We consider a linear quantile regression model for simultaneous Bayesian estimation of multiple quantiles under a regularly varying assumption. The numerical and competitive performance of the proposed method is illustrated by simulation.
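
The crossing problem described above is easy to see with two straight lines. The sketch below is not the Bayesian method of the abstract; it only checks, on a grid of covariate values, whether separately specified linear quantile curves keep their order. The intercepts, slopes, grid, and function names are hypothetical.

```python
def linear_quantile(intercept, slope):
    """Return a linear conditional quantile curve x -> intercept + slope * x."""
    return lambda x: intercept + slope * x


def crossing_points(curves, grid):
    """Return the grid points where monotonicity in the quantile level fails.

    `curves` must be ordered by increasing quantile level, so at every x the
    values should be non-decreasing along the list.
    """
    bad = []
    for x in grid:
        values = [curve(x) for curve in curves]
        if any(lo > hi for lo, hi in zip(values, values[1:])):
            bad.append(x)
    return bad


# Hypothetical fits for the 25% and 75% quantiles, estimated separately.
q25 = linear_quantile(1.0, 2.0)
q75 = linear_quantile(2.0, 1.5)
print(crossing_points([q25, q75], [0, 1, 2, 3, 4]))  # → [3, 4]
```

The two lines meet at x = 2, and beyond that point the fitted 25% quantile lies above the fitted 75% quantile, which no valid conditional distribution allows.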

17.
Alice L. Morais. Statistics, 2017, 51(2): 294-313
We extend the Weibull power series (WPS) class of distributions to the new class of extended Weibull power series (EWPS) class of distributions. The EWPS distributions are related to series and parallel systems with a random number of components, whereas the WPS distributions [Morais AL, Barreto-Souza W. A compound class of Weibull and power series distributions. Computational Statistics and Data Analysis. 2011;55:1410–1425] are related to series systems only. Unlike the WPS distributions, for which the Weibull is a limiting special case, the Weibull law is a particular case of the EWPS distributions. We prove that the distributions in this class are identifiable under a simple assumption. We also prove stochastic and hazard rate order results and highlight that the shapes of the EWPS distributions are markedly more flexible than the shapes of the WPS distributions. We define a regression model for the EWPS response random variable to model a scale parameter and its quantiles. We present the maximum likelihood estimator and prove its consistency and asymptotic normal distribution. Although series and parallel systems motivated the construction of this class, the EWPS distributions are suitable for modelling a wide range of positive data sets. To illustrate potential uses of this model, we apply it to a real data set on the tensile strength of coconut fibres and present a simple device for diagnostic purposes.

18.
The check loss function is used to define quantile regression. In cross-validation, it is also employed as a validation function when the true distribution is unknown. However, our empirical study indicates that validation with the check loss often leads to overfitting the data. In this work, we suggest a modified or L2-adjusted check loss which rounds the sharp corner in the middle of check loss. This has the effect of guarding against overfitting to some extent. The adjustment is devised to shrink to zero as sample size grows. Through various simulation settings of linear and nonlinear regressions, the improvement due to modification of the check loss by quadratic adjustment is examined empirically.
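
For reference, the unmodified check loss and its link to quantiles can be written in a few lines. The sketch below covers only the standard check loss, not the L2-adjusted version proposed in the abstract, whose exact form is not given here. The function names and the sample are illustrative assumptions.

```python
def check_loss(residual, tau):
    """Check (pinball) loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def empirical_risk(candidate, sample, tau):
    """Average check loss when every observation is predicted by `candidate`."""
    return sum(check_loss(y - candidate, tau) for y in sample) / len(sample)


def best_constant(sample, tau):
    """Minimise the empirical check loss over the sample points.

    The risk is convex and piecewise linear in the candidate, so a minimiser
    always exists among the observed values.
    """
    return min(sorted(sample), key=lambda c: empirical_risk(c, sample, tau))


sample = [1, 2, 3, 4, 100]            # hypothetical data with one outlier
print(check_loss(2.0, 0.25))          # → 0.5
print(check_loss(-2.0, 0.25))         # → 1.5
print(best_constant(sample, 0.5))     # → 3
print(best_constant(sample, 0.9))     # → 100
```

The kink at zero residual is the "sharp corner" that the adjusted loss rounds off. With `tau = 0.5` the minimiser is the sample median, which ignores the size of the outlier, whereas the mean of this sample is 22.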

19.
In this paper, we propose an adaptive stochastic gradient boosting tree for classification studies with imbalanced data. The adjustment of cost-sensitivity and the predictive threshold are integrated, together with a composite criterion, into the original stochastic gradient boosting tree to deal with the imbalanced data structure. A numerical study shows that the proposed method can significantly enhance the classification accuracy for the minority class with only a small loss in the true negative rate for the majority class. We discuss the relation of the cost-sensitivity to the threshold manipulation using simulations. An illustrative example of the analysis of suboptimal health-state data in traditional Chinese medicine is discussed.

20.