Similar literature
Found 20 similar articles; search took 421 ms.
1.
Summary. Partial least squares regression has been an alternative to ordinary least squares for handling multicollinearity in several areas of scientific research since the 1960s. It has recently gained much attention in the analysis of high dimensional genomic data. We show that known asymptotic consistency of the partial least squares estimator for a univariate response does not hold with the very large p and small n paradigm. We derive a similar result for a multivariate response regression with partial least squares. We then propose a sparse partial least squares formulation which aims simultaneously to achieve good predictive performance and variable selection by producing sparse linear combinations of the original predictors. We provide an efficient implementation of sparse partial least squares regression and compare it with well-known variable selection and dimension reduction approaches via simulation experiments. We illustrate the practical utility of sparse partial least squares regression in a joint analysis of gene expression and genome-wide binding data.

2.
Summary. Microaggregation by individual ranking is one of the most commonly applied disclosure control techniques for continuous microdata. The paper studies the effect of microaggregation by individual ranking on the least squares estimation of a multiple linear regression model. It is shown that the traditional least squares estimates are asymptotically unbiased. Moreover, the least squares estimates asymptotically have the same variances as the least squares estimates based on the original (non-aggregated) data. Thus, asymptotically, microaggregation by individual ranking does not result in a loss of efficiency in the least squares estimation of a multiple linear regression model. I thank Hans Schneeweiss for very helpful discussions and comments. Financial support from the Deutsche Forschungsgemeinschaft (German Science Foundation) is gratefully acknowledged.

3.
This paper proposes an adaptive estimator that is more precise than the ordinary least squares estimator if the distribution of random errors is skewed or has long tails. The adaptive estimates are computed using a weighted least squares approach with weights based on the lengths of the tails of the distribution of residuals. Smaller weights are assigned to those observations that have residuals in the tails of long-tailed distributions and larger weights are assigned to observations having residuals in the tails of short-tailed distributions. Monte Carlo methods are used to compare the performance of the proposed estimator and the performance of the ordinary least squares estimator. The estimates that were studied in this simulation include the difference between the means of two populations, the mean of a symmetric distribution, and the slope of a regression line. The adaptive estimators are shown to have lower mean squared errors than those for the ordinary least squares estimators for short-tailed, long-tailed, and skewed distributions, provided the sample size is at least 20. The ordinary least squares estimator has slightly lower mean squared error for normally distributed errors. The adaptive estimator is recommended for general use for studies having sample sizes of at least 20 observations unless the random errors are known to be normally distributed.
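The weighted least squares step mentioned in this abstract can be sketched for the simplest case, a straight-line fit. This is only an illustration: the function name `weighted_line_fit` is hypothetical, and the weights are supplied by the caller, because the abstract does not give the tail-length weighting rule that the paper uses.

```python
def weighted_line_fit(x, y, w):
    """Fit y = intercept + slope * x by weighted least squares.

    The weights w are caller-supplied placeholders; the adaptive,
    tail-length-based weights of the paper are NOT reproduced here.
    """
    total = sum(w)
    # Weighted means of x and y.
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    # Weighted sums of squares and cross-products about the means.
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope
```

With equal weights the function reduces to ordinary least squares; giving a small weight to an observation with a large residual pulls the fitted line back toward the remaining points.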

4.
Generalized least squares estimation of a system of seemingly unrelated regressions is usually a two-stage method: (1) estimation of the cross-equation covariance matrix from ordinary least squares residuals for transforming the data, and (2) application of least squares to the transformed data. In the presence of multicollinearity, ridge regression is conventionally applied at stage 2. We investigate the usage of ridge residuals at stage 1, and show analytically that the covariance matrix based on the least squares residuals does not always result in a more efficient estimator. A simulation study and an application to a system of firms' gross investment support our finding.

5.
Partial least squares regression has been widely adopted within some areas as a useful alternative to ordinary least squares regression in the manner of other shrinkage methods such as principal components regression and ridge regression. In this paper we examine the nature of this shrinkage and demonstrate that partial least squares regression exhibits some undesirable properties.

6.
Response surface methodology is useful for exploring a response over a region of factor space and in searching for extrema. Its generality makes it applicable to a variety of areas. Classical response surface methodology for a continuous response variable is generally based on least squares fitting. The sensitivity of least squares to outlying observations carries over to the surface procedures. To overcome this sensitivity, we propose response surface methodology based on robust procedures for continuous response variables. This robust methodology is analogous to the methodology based on least squares, while being much less sensitive to outlying observations. The results of a Monte Carlo study comparing it and classical surface methodologies for normal and contaminated normal errors are presented. The results show that as the proportion of contamination increases, the robust methodology correctly identifies a higher proportion of extrema than the least squares methods and that the robust estimates of extrema tend to be closer to the true extrema than the least squares methods.

7.
Penalized least squares estimators, like the ordinary least squares estimator, are sensitive to the influence of outliers. We propose a sparse regression estimator for robust variable selection and estimation based on a robust initial estimator. It is proven that our estimator has at least the same breakdown value as the initial estimator. Numerical examples are presented to illustrate our method.

8.
The effect of spatial autocorrelation on inferences made using ordinary least squares estimation is considered. It is found, in some cases, that ordinary least squares estimators provide a reasonable alternative to the estimated generalized least squares estimators recommended in the spatial statistics literature. One of the most serious problems in using ordinary least squares is that the usual variance estimators are severely biased when the errors are correlated. An alternative variance estimator that adjusts for any observed correlation is proposed. The need to take autocorrelation into account in variance estimation negates much of the advantage that ordinary least squares estimation has in terms of computational simplicity.

9.
Minimax estimation of a binomial probability under the LINEX loss function is considered. It is shown that no equalizer estimator is available in the statistical decision problem under consideration. It is pointed out that the problem can be solved by determining the Bayes estimator with respect to a least favorable distribution having finite support. In this situation, the optimal estimator and the least favorable distribution can be determined only by using numerical methods. Some properties of the minimax estimators and the corresponding least favorable prior distributions are provided depending on the parameters of the loss function. The properties presented are exploited in computing the minimax estimators and the least favorable distributions. The results obtained can be applied to determine minimax estimators of a cumulative distribution function and minimax estimators of a survival function.
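For orientation, the LINEX (linear-exponential) loss is commonly written as below. The abstract does not state its parameterization, so the symbols $a$, $b$, and $\hat{p}$ are assumptions taken from the usual textbook form, not from the paper.

```latex
L(\Delta) = b\left[e^{a\Delta} - a\Delta - 1\right],
\qquad \Delta = \hat{p} - p, \quad a \neq 0, \quad b > 0 .
```

For $a > 0$ the loss grows roughly exponentially in overestimation ($\Delta > 0$) and roughly linearly in underestimation, which is the asymmetry that distinguishes it from squared error loss.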

10.
Summary. Least squares methods are popular for fitting valid variogram models to spatial data. The paper proposes a new least squares method based on spatial subsampling for variogram model fitting. We show that the method proposed is statistically efficient among a class of least squares methods, including the generalized least squares method. Further, it is computationally much simpler than the generalized least squares method. The method produces valid variogram estimators under very mild regularity conditions on the underlying random field and may be applied with different choices of the generic variogram estimator without analytical calculation. An extension of the method proposed to a class of spatial regression models is illustrated with a real data example. Results from a simulation study on finite sample properties of the method are also reported.

11.
The importance of the two-way classification model is well known, but the standard method of analysis is least squares. Often, the data of the model call for a more robust estimation technique. This paper demonstrates the equivalence between the problem of obtaining least absolute value estimates for the two-way classification model and a capacitated transportation problem. A special purpose primal algorithm is developed to provide the least absolute value estimates. A computational comparison is made between an implementation of this specialized algorithm and a standard capacitated transportation code.

12.
This paper dwells on the choice between the ordinary least squares and the estimated generalized least squares estimators when the presence of heteroskedasticity is suspected. Since the estimated generalized least squares estimator does not dominate the ordinary least squares estimator completely over the whole parameter space, it is of interest to the researcher to know in advance whether the degree of severity of heteroskedasticity is such that the OLS estimator outperforms the estimated generalized least squares (or 2SAE) estimator. Casting the problem in the non-spherical error mold and exploiting the principle underlying the Bayesian pretest estimator, an intuitive non-mathematical procedure is proposed to serve as an aid to the researcher in deciding when to use either the ordinary least squares (OLS) or the estimated generalized least squares (2SAE) estimators.

13.
The Rawlsian perspective on social policy pays particular attention to the least advantaged members of society, but how should "the least advantaged" be identified? The concept of deprivation dominance operationalizes in part the Rawlsian evaluation of the welfare of the least advantaged members of society, but a statistical procedure for testing deprivation dominance is needed. In this paper, we construct a new distribution-free test for deprivation dominance and apply it to Canadian income survey data.

14.
We propose a robust estimator in the errors-in-variables model using the least trimmed squares estimator. We call this estimator the orthogonal least trimmed squares (OLTS) estimator. We show that the OLTS estimator has a high breakdown point and appropriate equivariance properties. We develop an algorithm for the OLTS estimate. Simulations are performed to compare the efficiencies of the OLTS estimates with the total least squares (TLS) estimates and a numerical example is given to illustrate the effectiveness of the estimate.

15.
Several approaches have been suggested for fitting linear regression models to censored data. These include Cox's proportional hazard models based on quasi-likelihoods. Methods of fitting based on least squares and maximum likelihood have also been proposed. The methods proposed so far all require special purpose optimization routines. We describe an approach here which requires only a modified standard least squares routine.

We present methods for fitting a linear regression model to censored data by least squares and by the method of maximum likelihood. In the least squares method, the censored values are replaced by their expectations, and the residual sum of squares is minimized. Several variants are suggested in the ways in which the expectation is calculated. A parametric (assuming a normal error model) and two non-parametric approaches are described. We also present a method for solving the maximum likelihood equations in the estimation of the regression parameters in the censored regression situation. It is shown that the solutions can be obtained by a recursive algorithm which needs only a least squares routine for optimization. The suggested procedures gain considerably in computational efficiency. The Stanford Heart Transplant data is used to illustrate the various methods.
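The parametric variant described above, in which each censored value is replaced by its expectation under a normal error model, can be sketched as follows. The sketch assumes right censoring and treats the current fitted means and the error standard deviation as given; the function names are hypothetical, and the paper's own update rules (including how the scale is re-estimated) are not reproduced.

```python
import math


def expected_if_right_censored(mu, sigma, c):
    """E[Y | Y > c] for Y ~ Normal(mu, sigma^2), the mean of a truncated normal."""
    z = (c - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density at z
    tail = 0.5 * math.erfc(z / math.sqrt(2.0))                # upper tail probability P(Z > z)
    return mu + sigma * pdf / tail


def impute_censored(y, censored, fitted, sigma):
    """Replace right-censored responses by their conditional expectations.

    y        : observed values (the censoring point where censored)
    censored : booleans, True where y is right-censored
    fitted   : current fitted means from a least squares fit (assumed given)
    sigma    : current error standard deviation (assumed given)
    """
    return [
        expected_if_right_censored(m, sigma, yi) if is_cens else yi
        for yi, is_cens, m in zip(y, censored, fitted)
    ]
```

One would alternate this imputation with an ordinary least squares refit on the imputed responses until the coefficients stabilize, so that only a standard least squares routine is needed at each pass.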

16.
Summary. Because highly correlated data arise from many scientific fields, we investigate parameter estimation in a semiparametric regression model with diverging number of predictors that are highly correlated. For this, we first develop a distribution-weighted least squares estimator that can recover directions in the central subspace, then use the distribution-weighted least squares estimator as a seed vector and project it onto a Krylov space by partial least squares to avoid computing the inverse of the covariance of predictors. Thus, distribution-weighted partial least squares can handle the cases with high dimensional and highly correlated predictors. Furthermore, we also suggest an iterative algorithm for obtaining a better initial value before implementing partial least squares. For theoretical investigation, we obtain strong consistency and asymptotic normality when the dimension $p$ of predictors is of convergence rate $O\{n^{1/2}/\log(n)\}$ and $o(n^{1/3})$ respectively, where $n$ is the sample size. When there are no other constraints on the covariance of predictors, the rates $n^{1/2}$ and $n^{1/3}$ are optimal. We also propose a Bayesian information criterion type of criterion to estimate the dimension of the Krylov space in the partial least squares procedure. Illustrative examples with a real data set and comprehensive simulations demonstrate that the method is robust to non-ellipticity and works well even in 'small $n$–large $p$' problems.

17.
The structured total least squares estimator, defined via a constrained optimization problem, is a generalization of the total least squares estimator when the data matrix and the applied correction satisfy given structural constraints. In the paper, an affine structure with additional assumptions is considered. In particular, Toeplitz and Hankel structured, noise free and unstructured blocks are allowed simultaneously in the augmented data matrix. An equivalent optimization problem is derived that has as decision variables only the estimated parameters. The cost function of the equivalent problem is used to prove consistency of the structured total least squares estimator. The results for the general affine structured multivariate model are illustrated by examples of special models. Modification of the results for block-Hankel/Toeplitz structures is also given. As a by-product of the analysis of the cost function, an iterative algorithm for the computation of the structured total least squares estimator is proposed.

18.
The small sample performance of least median of squares, reweighted least squares (RLS), least squares, least absolute deviations, and three partially adaptive estimators is compared using Monte Carlo simulations. Two data problems are addressed in the paper: (1) data generated from non-normal error distributions and (2) contaminated data. Breakdown plots are used to investigate the sensitivity of partially adaptive estimators to data contamination relative to RLS. One partially adaptive estimator performs especially well when the errors are skewed, while another partially adaptive estimator and RLS perform particularly well when the errors are extremely leptokurtotic. In comparison with RLS, partially adaptive estimators are only moderately effective in resisting data contamination; however, they outperform least squares and least absolute deviation estimators.
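To make the difference between three of the estimators named above concrete, the sketch below evaluates the criterion that each one minimizes for a given vector of residuals. The function names are hypothetical, and the plain sample median is used for least median of squares as a simplification; published definitions often use a specific order statistic instead.

```python
import statistics


def ls_criterion(residuals):
    """Least squares: sum of squared residuals."""
    return sum(r * r for r in residuals)


def lad_criterion(residuals):
    """Least absolute deviations: sum of absolute residuals."""
    return sum(abs(r) for r in residuals)


def lms_criterion(residuals):
    """Least median of squares: median of the squared residuals
    (plain sample median, an assumption of this sketch)."""
    return statistics.median([r * r for r in residuals])


clean = [1, -2, 3, 4]
contaminated = [1, -2, 3, 100]  # one residual replaced by a gross outlier

print(ls_criterion(clean), ls_criterion(contaminated))    # 30 10014
print(lad_criterion(clean), lad_criterion(contaminated))  # 10 106
print(lms_criterion(clean), lms_criterion(contaminated))  # 6.5 6.5
```

The single outlier inflates the least squares criterion most, the least absolute deviations criterion less, and leaves the median of squares unchanged, which is the intuition behind the high breakdown point of least median of squares.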

19.
When all experimental runs cannot be performed under homogeneous conditions, blocking can be used to increase the power for testing the treatment effects. Orthogonal blocking provides the same estimator of the polynomial effects as the one that would be obtained by ignoring the blocks. In many real-life design scenarios, there is at least one factor that is hard to change, leading to a split-plot structure. This paper shows that for a balanced ordinary least square–generalized least square equivalent split-plot design, orthogonal blocking can be achieved. Orthogonally blocked split-plot central composite designs are constructed and a catalog is provided.

20.
This article presents the framework of a linear measurement error model for analysing the data in environmental studies where observations on variables are subject to measurement errors. The problem of predicting the average and actual values of such variables, separately as well as simultaneously, is discussed. The methods of ordinary least squares, joint least squares and generalized least squares are employed to construct the predictors. Efficiency properties of these predictors are derived and a comparative study is made. The predictors are applied to a data set and their performance properties are analysed.
