1.

The peculiar shrinkage properties of partial least squares regression

Neil A. Butler & Michael C. Denham 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2000,62(3):585-593

Partial least squares regression has been widely adopted within some areas as a useful alternative to ordinary least squares regression in the manner of other shrinkage methods such as principal components regression and ridge regression. In this paper we examine the nature of this shrinkage and demonstrate that partial least squares regression exhibits some undesirable properties.

2.

Two Stochastic Restricted Principal Components Regression Estimator in Linear Regression

Jibo Wu Hu Yang 《Communications in Statistics - Theory and Methods》2013,42(20):3793-3804

In this article, we propose two stochastic restricted principal components regression estimators by combining the approach followed in obtaining the ordinary mixed estimator with the principal components regression estimator in the linear regression model. The performance of the two new estimators in terms of the matrix MSE criterion is studied. We also give an example and a Monte Carlo simulation to illustrate the theoretical results.

3.

On the use of cross-validation to assess performance in multivariate prediction

P. Jonathan W. J. Krzanowski W. V. McCarthy 《Statistics and Computing》2000,10(3):209-229

We describe a Monte Carlo investigation of a number of variants of cross-validation for the assessment of performance of predictive models, including different values of k in leave-k-out cross-validation, and implementation either in a one-deep or a two-deep fashion. We assume an underlying linear model that is being fitted using either ridge regression or partial least squares, and vary a number of design factors such as sample size n relative to number of variables p, and error variance. The investigation encompasses both the non-singular (i.e. n > p) and the singular (i.e. n ≤ p) cases. The latter is now common in areas such as chemometrics but has as yet received little rigorous investigation. Results of the experiments enable us to reach some definite conclusions and to make some practical recommendations.

4.

Multivariate Calibration — Direct and Indirect Regression Methodology

Rolf Sundberg 《Scandinavian Journal of Statistics》1999,26(2):161-207

This paper tries first to introduce and motivate the methodology of multivariate calibration. Next a review is given, mostly avoiding technicalities, of the somewhat messy theory of the subject. Two approaches are distinguished: the estimation approach (controlled calibration) and the prediction approach (natural calibration). Among problems discussed are the choice of estimator, the choice of confidence region, methodology for handling situations with more variables than observations, near-collinearities (with counter-measures like ridge type regression, principal components regression, partial least squares regression and continuum regression), pretreatment of data, and cross-validation vs true prediction. Examples discussed in detail concern estimation of the age of a rhinoceros from its horn lengths (low-dimensional), and nitrate prediction in waste-water from high-dimensional spectroscopic measurements.

5.

Estimation and Prediction in the Presence of Spatial Confounding for Spatial Linear Models

Garritt L. Page Yajun Liu Zhuoqiong He Donchu Sun 《Scandinavian Journal of Statistics》2017,44(3):780-797

In studies that produce data with spatial structure, it is common that covariates of interest vary spatially in addition to the error. Because of this, the error and covariate are often correlated. When this occurs, it is difficult to distinguish the covariate effect from residual spatial variation. In an i.i.d. normal error setting, it is well known that this type of correlation produces biased coefficient estimates, but predictions remain unbiased. In a spatial setting, recent studies have shown that coefficient estimates remain biased, but spatial prediction has not been addressed. The purpose of this paper is to provide a more detailed study of coefficient estimation from spatial models when covariate and error are correlated and then begin a formal study regarding spatial prediction. This is carried out by investigating properties of the generalized least squares estimator and the best linear unbiased predictor when a spatial random effect and a covariate are jointly modelled. Under this setup, we demonstrate that the mean squared prediction error is possibly reduced when covariate and error are correlated.

6.

Semiparametric Ridge Regression Approach in Partially Linear Models

M. Roozbeh M. Arashi H. A. Niroumand 《Communications in Statistics - Simulation and Computation》2013,42(3):449-460

In this article, we introduce a semiparametric ridge regression estimator for the vector-parameter in a partial linear model. It is also assumed that some additional artificial linear restrictions are imposed on the whole parameter space and the errors are dependent. This estimator is a generalization of the well-known restricted least-squares estimator and is confined to the (affine) subspace which is generated by the restrictions. Asymptotic distributional bias and risk are also derived and the comparison result is then given.

7.

Study of partial least squares and ridge regression methods

Luis Firinguetti Golam Kibria Rodrigo Araya 《Communications in Statistics - Simulation and Computation》2017,46(8):6631-6644

This article considers both Partial Least Squares (PLS) and Ridge Regression (RR) methods to combat the multicollinearity problem. A simulation study has been conducted to compare their performance with respect to Ordinary Least Squares (OLS). With varying degrees of multicollinearity, it is found that both the PLS and RR estimators produce significant reductions in the Mean Square Error (MSE) and Prediction Mean Square Error (PMSE) over OLS. The simulation study also indicates that RR performs better when the error variance is large, while the PLS estimator achieves its best results when the model includes more variables. A further advantage of the ridge regression method over PLS is that it can provide a 95% confidence interval for the regression coefficients while PLS cannot.
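
The comparison of ridge regression with OLS under multicollinearity can be illustrated with a minimal sketch. It is not the simulation design of the article: the sample size, correlation level `rho`, error standard deviation `sigma`, ridge parameter `k`, and the function names are all assumptions chosen for illustration, and PLS is left out to keep the example dependency-free.

```python
import numpy as np

def ols_fit(X, y):
    """Ordinary least squares coefficients via a least-squares solver."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def ridge_fit(X, y, k):
    """Ridge coefficients (X'X + k I)^{-1} X'y for ridge parameter k >= 0."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

def simulate(n=50, rho=0.99, sigma=1.0, seed=0):
    """Draw three regressors sharing a common factor (hypothetical design)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, 1))
    # Each column mixes the shared factor z with independent noise,
    # so pairwise correlations are roughly rho.
    X = np.sqrt(rho) * z + np.sqrt(1.0 - rho) * rng.standard_normal((n, 3))
    beta_true = np.array([1.0, 1.0, 1.0])
    y = X @ beta_true + sigma * rng.standard_normal(n)
    return X, y, beta_true

X, y, beta_true = simulate()
beta_ols = ols_fit(X, y)
beta_ridge = ridge_fit(X, y, k=1.0)

# Squared estimation error of each estimator for this single draw.
err_ols = float(np.sum((beta_ols - beta_true) ** 2))
err_ridge = float(np.sum((beta_ridge - beta_true) ** 2))
print("OLS:", beta_ols, "squared error:", err_ols)
print("Ridge:", beta_ridge, "squared error:", err_ridge)
```

A single draw says little about MSE; averaging the squared errors over many seeds would be needed to reproduce the kind of comparison the abstract describes.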

8.

Linearized Ridge Regression Estimator in Linear Regression

Xu-Qing Liu Feng Gao 《Communications in Statistics - Theory and Methods》2013,42(12):2182-2192

In this article, we aim to study the linearized ridge regression (LRR) estimator in a linear regression model motivated by the work of Liu (1993). The LRR estimator and the two types of generalized Liu estimators are investigated under the PRESS criterion. The method of obtaining the optimal generalized ridge regression (GRR) estimator is derived from the optimal LRR estimator. We apply the Hald data as a numerical example and then make a simulation study to show the main results. It is concluded that the idea of transforming the GRR estimator as a complicated function of the biasing parameters to a linearized version should be paid more attention in the future.

9.

Testing Covariate Effects in Aalen's Linear Hazard Model

Jon Ketil Grønnesby 《Scandinavian Journal of Statistics》1997,24(1):125-135

The performance of tests in Aalen's linear regression model is studied using asymptotic power calculations and stochastic simulation. Aalen's original least squares test is compared to two modifications: a weighted least squares test with correct weights and a test where the variance is re-estimated under the null hypothesis. The test with re-estimated variance provides the highest power of the tests for the setting of this paper, and the gain is substantial for covariates following a skewed distribution like the exponential. It is further shown that Aalen's choice for weight function with re-estimated variance is optimal in the one-parameter case against proportional alternatives.

10.

Model Reduction for Prediction in Regression Models

Inge S. Helland 《Scandinavian Journal of Statistics》2000,27(1):1-20

We look at prediction in regression models under squared loss for the random x case with many explanatory variables. Model reduction is done by conditioning upon only a small number of linear combinations of the original variables. The corresponding reduced model will then essentially be the population model for the chemometricians' partial least squares algorithm. Estimation of the selection matrix under this model is briefly discussed, and analogous results for the case with multivariate response are formulated. Finally, it is shown that an assumption of multinormality may be weakened to assuming elliptically symmetric distribution, and that some of the results are valid without any distributional assumption at all.

11.

Component selection norms for principal components regression

R. Carter Hill Thomas B. Fomby S. R. Johnson 《Communications in Statistics - Theory and Methods》2013,42(4):309-334

Multicollinearity or near exact linear dependence among the vectors of regressor variables in a multiple linear regression analysis can have important effects on the quality of least squares parameter estimates. One frequently suggested approach for these problems is principal components regression. This paper investigates alternative variable selection procedures and their implications for such an analysis.

12.

Dimension Reduction in the Linear Model for Right-Censored Data: Predicting the Change of HIV-1 RNA Levels using Clinical and Protease Gene Mutation Data

Huang J Harrington D 《Lifetime data analysis》2004,10(4):425-443

With rapid development in the technology of measuring disease characteristics at the molecular or genetic level, it is possible to collect a large amount of data on various potential predictors of the clinical outcome of interest in medical research. It is often of interest to use the information on a large number of predictors effectively to predict the outcome of interest. Various statistical tools were developed to overcome the difficulties caused by the high dimensionality of the covariate space in the setting of a linear regression model. This paper focuses on the situation where the outcomes of interest are subject to right censoring. We applied the extended partial least squares method, along with other commonly used approaches for analyzing high-dimensional covariates, to the ACTG333 data set. In particular, we compared the prediction performance of different approaches with extensive cross-validation studies. The results show that the Buckley–James based partial least squares, stepwise subset model selection and principal components regression have similar promising predictive power and the partial least squares method has several advantages in terms of interpretability and numerical computation.

13.

Comparison of prediction methods for multicollinear data

Tormod Naes Harald Martens 《Communications in Statistics - Simulation and Computation》2013,42(3):545-576

In this paper we discuss the partial least squares (PLS) prediction method. The method is compared to the predictor based on principal component regression (PCR). Both theoretical considerations and computations on artificial and real data are presented.

14.

On Necessary and Sufficient Conditions for Ordinary Least Squares Estimators to Be Best Linear Unbiased Estimators

George A. Milliken Mohammed Albohali 《The American statistician》2013,67(4):298-299

Two often-quoted necessary and sufficient conditions for ordinary least squares estimators to be best linear unbiased estimators are described. Another necessary and sufficient condition is described, providing an additional tool for checking to see whether the covariance matrix of a given linear model is such that the ordinary least squares estimator is also the best linear unbiased estimator. The new condition is used to show that one of the two published conditions is only a sufficient condition.

15.

Prediction of response values in linear regression models from replicated experiments

H. Toutenburg Shalabh 《Statistical Papers》2002,43(3):423-433

This paper considers the problem of prediction in a linear regression model when data sets are available from replicated experiments. Pooling the data sets for the estimation of regression parameters, we present three predictors: one arising from the least squares method and two stemming from the Stein-rule method. Efficiency properties of these predictors are discussed when they are used to predict actual and average values of the response variable within/outside the sample. Received: November 17, 1999; revised version: August 10, 2000

16.

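Least Squares Method for Fuzzy Linear Regression Models

Lu Pei Lu Qiujun 《Statistics & Information Forum》2016,(2):14-20

For regression analysis in data systems where both the independent and dependent variables are fuzzy, and to avoid the problems caused by the possible increase in estimation error when the independent variables degenerate into crisp numerical variables, a general structure for a regression model with a fuzzy adjustment term is proposed, and the analytical expression of the model is studied using a least squares method based on a complete distance between fuzzy numbers. Using the concept of level cut sets, the fuzzy multiple regression model is transformed into two conventional regression models, and parameter estimates are obtained by least squares according to the distance between fuzzy numbers. A numerical example on employee job performance evaluation illustrates the effectiveness of the method, and, in combination with the Bootstrap method, the dynamic changes in the random uncertainty of the regression parameters are studied.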

17.

Optimal Designs for Best Linear Unbiased Prediction in Diallel Crosses

《Communications in Statistics - Theory and Methods》2013,42(7):1579-1586

Recently, Ghosh and Das (2003) considered the estimation of variance components and the variances of these estimates. While comparing the yielding capacities of the cross (i, j), Kempthorne and Curnow (1961) proposed the estimation of the yielding capacity of any cross based on the least squares estimators of the general combining ability effects and/or the mean yield of the cross (i, j). In this article, the problem of predicting the yielding capacity of the cross (i, j) from the sample of inbred lines is considered. The properties of the best linear unbiased predictor for predicting the unobserved general combining ability effects together with the general mean effect are studied. We characterize A-optimal complete diallel cross designs and some efficient partial diallel cross designs under this setup.

18.

Efficient Computation of Reduced Regression Models

Stuart R. Lipsitz Garrett M. Fitzmaurice Debajyoti Sinha Nathanael Hevelone Edward Giovannucci Quoc-Dien Trinh 《The American statistician》2017,71(2):171-176

We consider settings where it is of interest to fit and assess regression submodels that arise as various explanatory variables are excluded from a larger regression model. The larger model is referred to as the full model; the submodels are the reduced models. We show that a computationally efficient approximation to the regression estimates under any reduced model can be obtained from a simple weighted least squares (WLS) approach based on the estimated regression parameters and covariance matrix from the full model. This WLS approach can be considered an extension to unbiased estimating equations of a first-order Taylor series approach proposed by Lawless and Singhal. Using data from the 2010 Nationwide Inpatient Sample (NIS), a 20% weighted, stratified, cluster sample of approximately 8 million hospital stays from approximately 1000 hospitals, we illustrate the WLS approach when fitting interval censored regression models to estimate the effect of type of surgery (robotic versus nonrobotic surgery) on hospital length-of-stay while adjusting for three sets of covariates: patient-level characteristics, hospital characteristics, and zip-code level characteristics. Ordinarily, standard fitting of the reduced models to the NIS data takes approximately 10 hours; using the proposed WLS approach, the reduced models take seconds to fit.
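
One way to read the WLS idea is as a generalized least squares projection of the full-model estimate onto the subspace where the excluded coefficients are zero. The sketch below is an assumption-laden illustration, not the authors' implementation: the function name `reduced_from_full` and the simulated data are hypothetical, and it uses plain linear OLS, a special case in which the result coincides with a direct refit rather than approximating it.

```python
import numpy as np

def reduced_from_full(beta_full, cov_full, drop):
    """Reduced-model coefficients computed from full-model estimates only.

    Minimizes (beta_full - b)' cov_full^{-1} (beta_full - b) subject to
    b[drop] = 0, which gives b_keep = beta_keep - V_kd V_dd^{-1} beta_drop.
    """
    p = len(beta_full)
    drop = np.asarray(drop)
    keep = np.setdiff1d(np.arange(p), drop)
    V_kd = cov_full[np.ix_(keep, drop)]
    V_dd = cov_full[np.ix_(drop, drop)]
    b_keep = beta_full[keep] - V_kd @ np.linalg.solve(V_dd, beta_full[drop])
    return keep, b_keep

# Illustration on simulated data (all values hypothetical).
rng = np.random.default_rng(1)
n, p = 200, 5
X = rng.standard_normal((n, p))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.standard_normal(n)

# Full-model OLS estimate and its estimated covariance matrix.
XtX_inv = np.linalg.inv(X.T @ X)
beta_full = XtX_inv @ X.T @ y
resid = y - X @ beta_full
cov_full = (resid @ resid / (n - p)) * XtX_inv

# Reduced model without columns 3 and 4, obtained without refitting.
keep, beta_wls = reduced_from_full(beta_full, cov_full, drop=[3, 4])

# Direct refit on the retained columns, for comparison.
beta_direct, *_ = np.linalg.lstsq(X[:, keep], y, rcond=None)
print(beta_wls)
print(beta_direct)
```

For nonlinear or censored regression models such as those in the abstract, the same formula would only approximate the refitted estimates.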

19.

Classification trees aided mixed regression model

Oguz Akbilgic 《Journal of applied statistics》2015,42(8):1773-1781

This paper introduces a novel hybrid regression method (MixReg) combining two linear regression methods, ordinary least squares (OLS) and least squares ratio (LSR) regression. LSR regression finds the regression coefficients that minimize the sum of squared error rates, while OLS minimizes the sum of squared errors themselves. The goal of this study is to combine the two methods in such a way that the proposed method is superior to both OLS and LSR regression in terms of the R² statistic and relative error rate. Applications of MixReg, on both simulated and real data, show that the MixReg method outperforms both OLS and LSR regression.
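
The difference between the two criteria can be made concrete with a small sketch. It assumes that the "error rate" of an observation means (y - ŷ)/y, which is one reading of the abstract, and that no response value is zero. Under that assumption LSR is a weighted least squares problem with weights 1/y². The function names and simulated data are hypothetical, and MixReg itself is not implemented here.

```python
import numpy as np

def ols_fit(X, y):
    """Coefficients minimizing the sum of squared errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def lsr_fit(X, y):
    """Coefficients minimizing sum(((y - X b) / y)^2), assuming y != 0.

    Dividing each row of X by its y turns the problem into an ordinary
    least squares fit of a vector of ones on the rescaled design.
    """
    if np.any(y == 0):
        raise ValueError("LSR as sketched here needs nonzero responses")
    X_scaled = X / y[:, None]
    beta, *_ = np.linalg.lstsq(X_scaled, np.ones(len(y)), rcond=None)
    return beta

def sse(X, y, beta):
    """Sum of squared errors."""
    return float(np.sum((y - X @ beta) ** 2))

def ssre(X, y, beta):
    """Sum of squared relative errors."""
    return float(np.sum(((y - X @ beta) / y) ** 2))

# Simulated data with an intercept and one regressor (hypothetical values).
rng = np.random.default_rng(2)
n = 100
X = np.column_stack([np.ones(n), rng.uniform(1.0, 5.0, n)])
y = X @ np.array([2.0, 3.0]) + rng.normal(0.0, 0.5, n)

beta_ols = ols_fit(X, y)
beta_lsr = lsr_fit(X, y)
print("OLS:", beta_ols, "SSE:", sse(X, y, beta_ols), "SSRE:", ssre(X, y, beta_ols))
print("LSR:", beta_lsr, "SSE:", sse(X, y, beta_lsr), "SSRE:", ssre(X, y, beta_lsr))
```

Each estimator is optimal on its own criterion, so neither dominates on both, which is the trade-off a hybrid method would aim to balance.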

20.

Multicollinearity and the value of a priori information

Thomas B. Fomby R. Carter Hill 《Communications in Statistics - Theory and Methods》2013,42(5):477-486

A measure of multicollinearity is defined which is useful in evaluating maintained hypotheses and aiding estimator selection as it suggests when a non-traditional estimator proposed by Bock (1975) is minimax and dominates ordinary least squares. An example is used to illustrate the presented methodology.