
1.

Classification trees aided mixed regression model

Oguz Akbilgic 《Journal of applied statistics》2015,42(8):1773-1781

This paper introduces a novel hybrid regression method (MixReg) that combines two linear regression methods, ordinary least squares (OLS) and least squares ratio (LSR) regression. LSR regression finds the regression coefficients that minimize the sum of squared error rates, whereas OLS minimizes the sum of squared errors themselves. The goal of this study is to combine the two methods so that the proposed method is superior to both OLS and LSR regression in terms of the R² statistic and the relative error rate. Applications of MixReg to both simulated and real data show that it outperforms both OLS and LSR regression.
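A minimal sketch of the two objectives contrasted in the abstract, for a single predictor with an intercept. It assumes that the "squared error rate" is ((y_i − ŷ_i)/y_i)², which makes LSR a weighted least squares fit with weights 1/y_i². The function names are illustrative and do not come from the paper, and the tree-based MixReg combination itself is not reproduced here.

```python
def weighted_line_fit(x, y, w):
    """Fit y ~ a + b*x by minimizing sum(w_i * (y_i - a - b*x_i)**2)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return ybar - b * xbar, b


def ols_fit(x, y):
    # OLS: minimize sum((y_i - yhat_i)**2), i.e. unit weights.
    return weighted_line_fit(x, y, [1.0] * len(x))


def lsr_fit(x, y):
    # LSR (as assumed here): minimize sum(((y_i - yhat_i) / y_i)**2),
    # which is weighted least squares with weights 1 / y_i**2.
    # Requires every y_i to be nonzero.
    return weighted_line_fit(x, y, [1.0 / yi ** 2 for yi in y])


def sum_sq_relative_error(x, y, coef):
    """Sum of squared relative errors of the line coef = (a, b)."""
    a, b = coef
    return sum(((yi - a - b * xi) / yi) ** 2 for xi, yi in zip(x, y))
```

By construction, `lsr_fit` can never have a larger sum of squared relative errors than `ols_fit` on the same data, while `ols_fit` can never have a larger sum of squared absolute errors.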

2.

Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion (total citations: 1; self-citations: 0; citations by others: 1)

Clifford M. Hurvich Jeffrey S. Simonoff & Chih-Ling Tsai 《Journal of the Royal Statistical Society. Series B, Statistical methodology》1998,60(2):271-293

Many different methods have been proposed to construct nonparametric estimates of a smooth regression function, including local polynomial, (convolution) kernel and smoothing spline estimators. Each of these estimators uses a smoothing parameter to control the amount of smoothing performed on a given data set. In this paper an improved version of a criterion based on the Akaike information criterion (AIC), termed AIC_C, is derived and examined as a way to choose the smoothing parameter. Unlike plug-in methods, AIC_C can be used to choose smoothing parameters for any linear smoother, including local quadratic and smoothing spline estimators. The use of AIC_C avoids the large variability and tendency to undersmooth (compared with the actual minimizer of average squared error) seen when other 'classical' approaches (such as generalized cross-validation (GCV) or the AIC) are used to choose the smoothing parameter. Monte Carlo simulations demonstrate that the AIC_C-based smoothing parameter is competitive with a plug-in method (assuming that one exists) when the plug-in method works well but also performs well when the plug-in approach fails or is unavailable.
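A hedged sketch of choosing a bandwidth with AIC_C for a linear smoother ŷ = Hy. The criterion is coded in the commonly quoted form log(σ̂²) + 1 + 2(tr(H) + 1)/(n − tr(H) − 2), with σ̂² the mean squared residual. The abstract does not state the formula, so treat that form as an assumption. The Gaussian-kernel Nadaraya-Watson smoother and all function names are illustrative choices, not the paper's.

```python
import math


def nw_hat_matrix(x, h):
    """Rows of the hat matrix H of a Gaussian-kernel Nadaraya-Watson smoother."""
    rows = []
    for xi in x:
        k = [math.exp(-0.5 * ((xi - xj) / h) ** 2) for xj in x]
        total = sum(k)
        rows.append([kj / total for kj in k])
    return rows


def aicc(y, hat):
    """AICc = log(sigma2) + 1 + 2*(tr(H) + 1) / (n - tr(H) - 2)."""
    n = len(y)
    fitted = [sum(hij * yj for hij, yj in zip(row, y)) for row in hat]
    sigma2 = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / n
    trace = sum(hat[i][i] for i in range(n))
    denom = n - trace - 2.0
    if denom <= 0 or sigma2 <= 0:
        return math.inf  # criterion is undefined for this smoother
    return math.log(sigma2) + 1.0 + 2.0 * (trace + 1.0) / denom


def choose_bandwidth(x, y, grid):
    """Return the bandwidth in grid with the smallest AICc."""
    return min(grid, key=lambda h: aicc(y, nw_hat_matrix(x, h)))
```

Only the hat matrix depends on the smoother, so the same `aicc` function could be applied to any other linear smoother for which H is available.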

3.

Collaborative sliced inverse regression

Alessandro Chiancone Stéphane Girard Jocelyn Chanussot 《Communications in Statistics - Theory and Methods》2017,46(12):6035-6053

Sliced inverse regression (SIR) is an effective method for dimensionality reduction in high-dimensional regression problems. However, the method has requirements on the distribution of the predictors that are hard to check since they depend on unobserved variables. It has been shown that, if the distribution of the predictors is elliptical, then these requirements are satisfied. In the case of mixture models, ellipticity is violated and, in addition, there is no assurance of a single underlying regression model among the different components. Our approach clusters the predictor space to force the condition to hold on each cluster and includes a merging technique to look for different underlying models in the data. A study on simulated data and two real applications are provided. It appears that SIR, unsurprisingly, is not capable of dealing with a mixture of Gaussians involving different underlying models, whereas our approach is able to correctly investigate the mixture.

4.

Simple Transformation Techniques for Improved Non-parametric Regression (total citations: 2; self-citations: 0; citations by others: 2)

B. U. Park W. C. Kim D. Ruppert M. C. Jones D. F. Signorini & R. Kohn 《Scandinavian Journal of Statistics》1997,24(2):145-163

We propose and investigate two new methods for achieving less bias in non-parametric regression. We show that the new methods have bias of order h⁴, where h is a smoothing parameter, in contrast to the basic kernel estimator's order h². The methods are conceptually very simple. At the first stage, perform an ordinary non-parametric regression on {x_i, Y_i} to obtain m̂(x_i) (we use local linear fitting). In the first method, at the second stage, repeat the non-parametric regression but on the transformed dataset {m̂(x_i), Y_i}, taking the estimator at x to be this second stage estimator at m̂(x). In the second, and more appealing, method, again perform non-parametric regression on {m̂(x_i), Y_i}, but this time make the kernel weights depend on the original x scale rather than using the m̂(x) scale. We concentrate more of our effort in this paper on the latter because of its advantages over the former. Our emphasis is largely theoretical, but we also show that the latter method has practical potential through some simulated examples.

5.

Efficient Bandwidth Selection in Non-parametric Regression

KATHRYN A. PREWITT 《Scandinavian Journal of Statistics》2003,30(1):75-92

In this paper we use non-parametric local polynomial methods to estimate the regression function, m(x). Y may be a binary or continuous response variable, and X is continuous with non-uniform density. The main contributions of this paper are the weak convergence of a bandwidth process for kernels of order (0, k), k = 2^j, j ≥ 1, and the proposal of a local data-driven bandwidth selection method which is particularly beneficial for the case when X is not distributed uniformly. This selection method minimizes estimates of the asymptotic MSE and estimates the bias portion in an innovative way which relies on the order of the kernel and not estimation of m²(x) directly. We show that utilization of this method results in the achievement of the optimal asymptotic MSE by the estimator, i.e. the method is efficient. Simulation studies are provided which illustrate the method for both binary and continuous response cases.

6.

Regression Methods for Combining Multiple Classifiers

T. Górecki M. Krzyśko 《Communications in Statistics - Simulation and Computation》2015,44(3):739-755

As no single classification method outperforms other classification methods under all circumstances, decision-makers may solve a classification problem using several classification methods and examine their performance for classification purposes in the learning set. Based on this performance, better classification methods might be adopted and poor methods might be avoided. However, which single classification method is the best to predict the classification of new observations is still not clear, especially when some methods offer similar classification performance in the learning set. In this article we present various regression and classical methods, which combine several classification methods to predict the classification of new observations. The quality of the combined classifiers is examined on some real data. Nonparametric regression is the best method of combining classifiers.

7.

On adaptive linear regression

Arnab Maity Michael Sherman 《Journal of applied statistics》2008,35(12):1409-1422

Ordinary least squares (OLS) is omnipresent in regression modeling. Occasionally, least absolute deviations (LAD) or other methods are used as an alternative when there are outliers. Although some data-adaptive estimators have been proposed, they are typically difficult to implement. In this paper, we propose an easy-to-compute adaptive estimator which is simply a linear combination of OLS and LAD. We demonstrate large sample normality of our estimator and show that its performance is close to best for both light-tailed (e.g. normal and uniform) and heavy-tailed (e.g. double exponential and t₃) error distributions. We demonstrate this through three simulation studies and illustrate our method on state public expenditures and luteinizing hormone data sets. We conclude that our method is general and easy to use, and that it gives good efficiency across a wide range of error distributions.
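A rough sketch of what "a linear combination of OLS and LAD" can look like for a single predictor with an intercept. The weight `lam` is a hypothetical fixed input supplied by the caller. The paper's data-adaptive rule for choosing the weight is not given in the abstract and is not reproduced here. The brute-force LAD fit relies on the fact that some LAD minimizer passes through two data points.

```python
from itertools import combinations


def ols_line(x, y):
    """Ordinary least squares fit of y ~ a + b*x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sxy / sxx
    return ybar - b * xbar, b


def lad_line(x, y):
    """Brute-force LAD fit of y ~ a + b*x.

    Some minimizer of sum(|y_i - a - b*x_i|) passes through two data
    points, so it is enough to try every pair with distinct x values.
    """
    best, best_cost = None, float("inf")
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue
        b = (y[j] - y[i]) / (x[j] - x[i])
        a = y[i] - b * x[i]
        cost = sum(abs(yk - a - b * xk) for xk, yk in zip(x, y))
        if cost < best_cost:
            best, best_cost = (a, b), cost
    return best


def combined_line(x, y, lam):
    """lam * OLS + (1 - lam) * LAD; lam is a hypothetical fixed weight."""
    (a1, b1), (a2, b2) = ols_line(x, y), lad_line(x, y)
    return lam * a1 + (1 - lam) * a2, lam * b1 + (1 - lam) * b2
```

With one gross outlier in `y`, the OLS slope is pulled towards the outlier while the LAD line stays on the bulk of the data, so weights closer to 0 favour robustness and weights closer to 1 favour efficiency under light-tailed errors.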

8.

Doubly robust weighted composite quantile regression based on SCAD-L2

Zhimiao Cao Xiaoning Kang Mingqiu Wang 《Revue canadienne de statistique》2023,51(1):38-76

In this article, a robust variable selection procedure based on the weighted composite quantile regression (WCQR) is proposed. Compared with the composite quantile regression (CQR), WCQR is robust to heavy-tailed errors and outliers in the explanatory variables. For the choice of the weights in the WCQR, we employ a weighting scheme based on the principal component method. To select variables with grouping effect, we consider WCQR with SCAD-L₂ penalization. Furthermore, under some suitable assumptions, the theoretical properties, including the consistency and oracle property of the estimator, are established with a diverging number of parameters. In addition, we study the numerical performance of the proposed method in the case of ultrahigh-dimensional data. Simulation studies and real examples are provided to demonstrate the superiority of our method over the CQR method when there are outliers in the explanatory variables and/or the random error is from a heavy-tailed distribution.

9.

Krylov Sequences as a Tool for Analysing Iterated Regression Algorithms

ANDERS BJÖRKSTRÖM 《Scandinavian Journal of Statistics》2010,37(1):166-175

We use Krylov sequences to analyse a class of regression methods based on successive identification of latent factors. Some results already proved for partial least squares regression (PLSR) are shown to hold for other methods also. We prove that the well-known peculiar pattern of alternating shrinkage and inflation of the principal components is not unique to PLSR. We also show that for any method in the class under study, the coefficient of determination is always at least as high as for principal components regression with the same number of factors.

10.

Estimating the structural dimension of regressions via parametric inverse regression

Efstathia Bura & R. Dennis Cook 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2001,63(2):393-410

A new estimation method for the dimension of a regression at the outset of an analysis is proposed. A linear subspace spanned by projections of the regressor vector X, which contains part or all of the modelling information for the regression of a vector Y on X, and its dimension are estimated via the means of parametric inverse regression. Smooth parametric curves are fitted to the p inverse regressions via a multivariate linear model. No restrictions are placed on the distribution of the regressors. The estimate of the dimension of the regression is based on optimal estimation procedures. A simulation study shows the method to be more powerful than sliced inverse regression in some situations.

11.

Simulation-Extrapolation via the Bezier Curve in Measurement Error Models

Choongrak Kim Changkon Hong Meeseon Jeong 《Communications in Statistics - Simulation and Computation》2013,42(4):1135-1147

Simulation-extrapolation (SIMEX) is a method for correcting for bias in measurement error models, and parametric SIMEX estimates are often used. In this paper, we propose a nonparametric method for computing the SIMEX estimate via the Bezier curve, which is a popular smoothing technique in the computer graphics area. Comparisons are done for the bias of the limit values of parametric SIMEX estimates and the Bezier estimate in the various nonlinear measurement error models.

12.

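An EM algorithm for the conditional regression function in nonparametric heteroscedastic models: an empirical study based on rural food consumption and net income

Wang Jixia (王继霞) Shen Peiping (申培萍) 《Statistics & Information Forum》(统计与信息论坛) 2014,(1):9-12

This paper studies the EM algorithm for the regression function in nonparametric heteroscedastic models and uses it to obtain an estimate of the conditional regression function. In addition, an empirical analysis of the relationship between rural residents' food consumption expenditure and net income shows that the EM-based estimator fits better than the least squares estimator. The Engel coefficient is also fitted and its trend is analysed.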

13.

Estimation of regression equation with cauchy disturbances

K.R. Kadiyala K.S.R. Murthy 《Revue canadienne de statistique》1977,5(1):111-120

In this paper we present two methods of estimating a linear regression equation with Cauchy disturbances. The first method uses the maximum likelihood principle and therefore the estimators obtained are consistent. The asymptotic covariance is derived, which provides the statistics necessary for making inference in large samples. The second method is the method of least lines, which minimizes the sum of absolute errors (MSAE) from the fitted regression. These two methods are then compared through a Monte Carlo study. The maximum likelihood method emerges as superior to the MSAE method. However, the MSAE procedure, which does not depend on the distribution of the error term, appears to be a close competitor to the maximum likelihood estimator.

14.

A simulation study on SPSS ridge regression and ordinary least squares regression procedures for multicollinearity data

John Zhang Mahmud Ibrahim 《Journal of applied statistics》2005,32(6):571-588

This study compares the SPSS ordinary least squares (OLS) regression and ridge regression procedures in dealing with multicollinear data. The LS regression method is one of the most frequently applied statistical procedures. It is well documented that the LS method is extremely unreliable in parameter estimation when the independent variables are dependent (the multicollinearity problem). The ridge regression procedure deals with the multicollinearity problem by introducing a small bias in the parameter estimation. The application of ridge regression involves the selection of a bias parameter, and it is not clear whether it works better in applications. This study uses a Monte Carlo method to compare the results of the OLS procedure with the ridge regression procedure in SPSS.
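To make the role of the bias parameter concrete, here is a small sketch of the textbook ridge estimator (X'X + kI)⁻¹X'y for two predictors and no intercept, where k = 0 reduces to OLS. It is not the SPSS procedure evaluated in the study. The function names and the two-predictor restriction are assumptions made to keep the example self-contained.

```python
def solve_2x2(m, v):
    """Solve the 2x2 linear system m @ beta = v by Cramer's rule."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [(v[0] * d - b * v[1]) / det, (a * v[1] - c * v[0]) / det]


def ridge_two_predictors(x1, x2, y, k):
    """Return (X'X + k*I)^{-1} X'y for two predictors and no intercept.

    k = 0 gives ordinary least squares; k > 0 is the ridge bias parameter.
    """
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    g1 = sum(a * c for a, c in zip(x1, y))
    g2 = sum(b * c for b, c in zip(x2, y))
    return solve_2x2([[s11 + k, s12], [s12, s22 + k]], [g1, g2])
```

When `x1` and `x2` are nearly collinear, the determinant of X'X is close to zero and the OLS coefficients become unstable. Adding k to the diagonal moves the determinant away from zero and shrinks the coefficient vector, at the cost of some bias.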

15.

Least absolute value estimation in regression models: an annotated bibliography

Terry E. Dielman 《Communications in Statistics - Theory and Methods》2013,42(4):513-541

This paper presents a comprehensive listing of articles on least absolute value (LAV) estimation as applied to linear and non-linear regression models and in systems of equations. References to the LAV method as applied in approximation theory are also included. Annotations describing the content of each article follow each reference.

16.

Estimation and inference in regression discontinuity designs with asymmetric kernels

Eduardo Fé 《Journal of applied statistics》2014,41(11):2406-2417

We study the behaviour of the Wald estimator of causal effects in regression discontinuity design when local linear regression (LLR) methods are combined with an asymmetric gamma kernel. We show that the resulting statistic is no more complex to implement than existing methods, remains consistent at the usual non-parametric rate, and maintains an asymptotic normal distribution but, crucially, has bias and variance that do not depend on kernel-related constants. As a result, the new estimator is more efficient and yields more reliable inference. A limited Monte Carlo experiment is used to illustrate the efficiency gains. As a by-product of the main discussion, we extend previously published work by establishing the asymptotic normality of the LLR estimator with a gamma kernel. Finally, the new method is used in a substantive application.

17.

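Risk premium prediction based on quantile regression

Yang Liang (杨亮) Meng Shengwang (孟生旺) 《Statistics & Information Forum》(统计与信息论坛) 2016,(9):83-88

Risk premium prediction is an important part of non-life insurance ratemaking. When quantile regression is used to determine risk premiums in the traditional way, the quantile level is usually assumed to be given in advance, which lacks objectivity. To address this, a new method for determining risk premiums with quantile regression is proposed. The total risk premium of the policy portfolio is determined from the ruin probability, a quantile regression model is built for individual policies, and an equation is set up linking the individual premiums to the total risk premium. The quantile level is then solved for numerically, which yields predictions of the risk premium for individual policies. An analysis of a set of real data shows that the method predicts well.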

18.

Linearized Restricted Ridge Regression Estimator in Linear Regression

Xu-Qing Liu Feng Gao Jian-Wen Xu 《Communications in Statistics - Theory and Methods》2013,42(24):4503-4514

This article primarily aims to put forward the linearized restricted ridge regression (LRRR) estimator in linear regression models. Two types of LRRR estimators are investigated under the PRESS criterion, and the optimal LRRR estimators and the optimal restricted generalized ridge regression estimator are obtained. We apply the results to the Hald data and finally conduct a simulation study using the method of McDonald and Galarneau.

19.

Group variable selection in cardiopulmonary cerebral resuscitation data for veterinary patients

Young Joo Yoon Cheolwoo Park Erik Hofmeister 《Journal of applied statistics》2012,39(7):1605-1621

Cardiopulmonary cerebral resuscitation (CPCR) is a procedure to restore spontaneous circulation in patients with cardiopulmonary arrest (CPA). While animals with CPA generally have a lower success rate of CPCR than people do, CPCR studies in veterinary patients have been limited. In this paper, we construct a model for predicting success or failure of CPCR, and identifying and evaluating factors that affect the success of CPCR in veterinary patients. Due to reparametrization using multiple dummy variables or close proximity in nature, many variables in the data form groups, and thus a desirable method should take this grouping feature into account in variable selection. To accomplish these goals, we propose an adaptive group bridge method for a logistic regression model. The performance of the proposed method is evaluated under different simulated setups and compared with several other regression methods. Using the logistic group bridge model, we analyze data from a CPCR study for veterinary patients and discuss their implications on the practice of veterinary medicine.

20.

Root-n-consistent semiparametric estimation of partially linear models based on k-nn method

Zhenjuan Liu Xuewen Lu 《Econometric Reviews》1997,16(4):411-420

In the context of the partially linear semiparametric model examined by Robinson (1988), we show that root-n-consistent estimation results established using kernel and series methods can also be obtained by using the k-nearest-neighbor (k-nn) method.
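A minimal sketch of the double-residual idea behind Robinson-type estimation of the partially linear model y = βx + g(z) + error, with the two conditional means estimated by k-nn averages. It assumes scalar x and z, unweighted neighbours, and that each point counts among its own neighbours. These choices and the function names are illustrative and are not taken from the paper.

```python
def knn_mean(z, values, k):
    """k-nn regression estimate of E[value | z] at every sample point.

    Each point's own value is included among its k nearest neighbours.
    """
    out = []
    for zi in z:
        nearest = sorted(range(len(z)), key=lambda j: abs(z[j] - zi))[:k]
        out.append(sum(values[j] for j in nearest) / k)
    return out


def partially_linear_beta(x, z, y, k):
    """Estimate beta in y = beta*x + g(z) + error (scalar x, scalar z)."""
    ex = knn_mean(z, x, k)  # k-nn estimate of E[x | z]
    ey = knn_mean(z, y, k)  # k-nn estimate of E[y | z]
    rx = [xi - e for xi, e in zip(x, ex)]
    ry = [yi - e for yi, e in zip(y, ey)]
    # Least squares regression of the y-residuals on the x-residuals.
    return sum(a * b for a, b in zip(rx, ry)) / sum(a * a for a in rx)
```

Subtracting the estimated conditional means removes g(z) approximately, so the slope of the residual regression estimates β without specifying a form for g.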