Similar literature
1.
The restrictive properties of compositional data, that is, multivariate data with positive parts that carry only relative information in their components, call for special care when applying standard statistical methods such as regression analysis. Among the special methods suitable for handling this problem is the total least squares procedure (TLS, orthogonal regression, regression with errors in variables, calibration problem), performed after an appropriate log-ratio transformation. The difficulty or even impossibility of deeper statistical analysis (confidence regions, hypothesis testing) using the standard TLS techniques can be overcome by a calibration solution based on linear regression. This approach can be combined with standard statistical inference, for example, confidence and prediction regions and bounds, hypothesis testing, etc., suitable for interpretation of results. Here, we deal with the simplest TLS problem where we assume a linear relationship between two errorless measurements of the same object (substance, quantity). We propose an iterative algorithm for estimating the calibration line and also give confidence ellipses for the location of unknown errorless results of measurement. Moreover, illustrative examples from the fields of geology, geochemistry and medicine are included. It is shown that the iterative algorithm converges to the same values as those obtained using the standard TLS techniques. Fitted lines and confidence regions are presented for both original and transformed compositional data. The paper contains basic principles of linear models and addresses many related problems.

2.
This paper considers the problem of prediction in a linear regression model when data sets are available from replicated experiments. Pooling the data sets for the estimation of regression parameters, we present three predictors: one arising from the least squares method and two stemming from the Stein-rule method. Efficiency properties of these predictors are discussed when they are used to predict actual and average values of the response variable within/outside the sample. Received: November 17, 1999; revised version: August 10, 2000

3.
The authors consider dimensionality reduction methods used for prediction, such as reduced rank regression, principal component regression and partial least squares. They show how it is possible to obtain intermediate solutions by estimating simultaneously the latent variables for the predictors and for the responses. They obtain a continuum of solutions that goes from reduced rank regression to principal component regression via maximum likelihood and least squares estimation. Different solutions are compared using simulated and real data.

4.
This study compares the SPSS ordinary least squares (OLS) regression and ridge regression procedures in dealing with multicollinear data. The LS regression method is one of the most frequently applied statistical procedures. It is well documented that the LS method is extremely unreliable in parameter estimation when the independent variables are correlated with one another (the multicollinearity problem). The ridge regression procedure deals with the multicollinearity problem by introducing a small bias in the parameter estimation. Applying ridge regression involves selecting a bias parameter, and it is not clear whether it works better in applications. This study uses a Monte Carlo method to compare the results of the OLS procedure with those of the ridge regression procedure in SPSS.
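
As a rough illustration of the kind of Monte Carlo comparison described in this abstract, the following Python sketch (NumPy rather than SPSS) contrasts OLS with the ridge estimator (X'X + kI)^(-1) X'y on two nearly collinear predictors. The sample size, the bias parameter k, the true coefficients and the noise levels are all assumptions chosen for illustration, not values from the study.

```python
import numpy as np

def ols(X, y):
    # Ordinary least squares estimate of the coefficients.
    return np.linalg.lstsq(X, y, rcond=None)[0]

def ridge(X, y, k):
    # Ridge estimate (X'X + kI)^(-1) X'y; k is the bias parameter.
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

def simulate(n_rep=200, n=30, k=1.0, seed=0):
    # Average squared estimation error of both estimators over n_rep
    # simulated data sets with two nearly collinear predictors.
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, 1.0])  # assumed true coefficients
    err_ols, err_ridge = 0.0, 0.0
    for _ in range(n_rep):
        z = rng.normal(size=n)
        # Second column is almost a copy of the first (multicollinearity).
        X = np.column_stack([z, z + 0.01 * rng.normal(size=n)])
        y = X @ beta + rng.normal(size=n)
        err_ols += np.sum((ols(X, y) - beta) ** 2)
        err_ridge += np.sum((ridge(X, y, k) - beta) ** 2)
    return err_ols / n_rep, err_ridge / n_rep

mse_ols, mse_ridge = simulate()
print(f"OLS coefficient MSE:   {mse_ols:.3f}")
print(f"Ridge coefficient MSE: {mse_ridge:.3f}")
```

This particular setup favours ridge regression because the true coefficients lie along the well-determined direction of the design; other choices of coefficients or of k can narrow or reverse the gap, which is the question the abstract raises.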

5.
In this paper, we propose a new estimation method for binary quantile regression and variable selection which can be implemented by an iteratively reweighted least squares approach. In contrast to existing approaches, this method is computationally simple, guaranteed to converge to a unique solution, and can be implemented with standard software packages. We demonstrate our methods using Monte Carlo experiments and then apply the proposed method to the widely used work trip mode choice dataset. The results indicate that the proposed estimators work well in finite samples.

6.
In this paper, we propose an outlier-detection approach that uses the properties of an intercept estimator in a difference-based regression model (DBRM) that we introduce here. The DBRM uses multiple linear regression, and we devised it to detect outliers in multiple linear regression. Our outlier-detection approach uses only the intercept; it does not require estimates for the other parameters in the DBRM. In this paper, we first employ a difference-based intercept estimator to study the outlier-detection problem in a multiple regression model. We compare our approach with several existing methods in a simulation study, and the results suggest that our approach outperforms the others. We also demonstrate the advantage of our approach using a real data application. Our approach can be extended to outlier detection in nonparametric regression models.

7.
Regression Analysis (RA) is one of the most frequently used tools for forecasting. The Ordinary Least Squares (OLS) technique is the basic instrument of RA, and there are many regression techniques based on OLS. This paper presents a new regression approach, called Least Squares Ratio (LSR), and compares OLS and LSR according to the mean squared errors of estimation of the theoretical regression parameters (mse β) and of the dependent value (mse y).

8.
In this paper we discuss the partial least squares (PLS) prediction method. The method is compared to the predictor based on principal component regression (PCR). Both theoretical considerations and computations on artificial and real data are presented.

9.
Short-term forecasting of wind generation requires a model of the function for the conversion of meteorological variables (mainly wind speed) to power production. Such a power curve is nonlinear and bounded, in addition to being nonstationary. Local linear regression is an appealing nonparametric approach for power curve estimation, for which the model coefficients can be tracked with recursive Least Squares (LS) methods. This may lead to an inaccurate estimate of the true power curve, owing to the assumption that a noise component is present on the response variable axis only. Therefore, this assumption is relaxed here, by describing a local linear regression with orthogonal fit. Local linear coefficients are defined as those which minimize a weighted Total Least Squares (TLS) criterion. An adaptive estimation method is introduced in order to accommodate nonstationarity. This has the additional benefit of lowering the computational costs of updating local coefficients every time new observations become available. The estimation method is based on tracking the left-most eigenvector of the augmented covariance matrix. A robustification of the estimation method is also proposed. Simulations on semi-artificial datasets (for which the true power curve is available) underline the properties of the proposed regression and related estimation methods. An important result is the significantly higher ability of local polynomial regression with orthogonal fit to accurately approximate the target regression, even though it may hardly be visible when calculating error criteria against corrupted data.
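
The link between an orthogonal (TLS) fit and an eigenvector of the covariance matrix, which this abstract relies on, can be sketched for the simplest case of a single global straight line. The Python code below is an unweighted, non-adaptive illustration only; the function name `tls_line` and the test data are assumptions for this sketch and are not taken from the paper, whose method is local, weighted and recursive.

```python
import numpy as np

def tls_line(x, y):
    # Fit y = a + b*x by orthogonal (total least squares) regression.
    # The fitted line passes through the centroid and is perpendicular to
    # the eigenvector of the covariance matrix with the smallest eigenvalue.
    data = np.column_stack([x, y])
    mean = data.mean(axis=0)
    cov = np.cov(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    normal = eigvecs[:, 0]                  # normal vector of the fitted line
    # Line equation: normal[0]*(x - mean_x) + normal[1]*(y - mean_y) = 0.
    # A vertical line (normal[1] == 0) is not handled in this sketch.
    b = -normal[0] / normal[1]
    a = mean[1] - b * mean[0]
    return a, b

# Noise-free points on y = 2 + 3x are recovered up to rounding error.
x = np.linspace(0.0, 5.0, 20)
a, b = tls_line(x, 2.0 + 3.0 * x)
print(a, b)
```

Unlike ordinary least squares, this fit treats both axes symmetrically, which matches the assumption of noise on both the explanatory and the response variable.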

10.
The pros and cons of applying regression shrinkage prediction arguments and methods to autoregressive time series forecasting are discussed. Simulation evidence of the performance of a Stein regression prediction formula suggests that the overall dominance of the shrunken predictor over least squares in regression no longer holds in time series samples of a reasonable length. Rather, shrinkage appears the better of the two, with respect to prediction mean squared error, only for weaker relationships and seems to be inferior to the least squares predictor when the autoregressive relationship is strong.

11.
Most methods for survival prediction from high-dimensional genomic data combine the Cox proportional hazards model with some technique of dimension reduction, such as partial least squares regression (PLS). Applying PLS to the Cox model is not entirely straightforward, and multiple approaches have been proposed. The method of Park et al. (Bioinformatics 18(Suppl. 1):S120–S127, 2002) uses a reformulation of the Cox likelihood to a Poisson type likelihood, thereby enabling estimation by iteratively reweighted partial least squares for generalized linear models. We propose a modification of the method of Park et al. (2002) such that estimates of the baseline hazard and the gene effects are obtained in separate steps. The resulting method has several advantages over the method of Park et al. (2002) and other existing Cox PLS approaches, as it allows for estimation of survival probabilities for new patients, enables a less memory-demanding estimation procedure, and allows for incorporation of lower-dimensional non-genomic variables like disease grade and tumor thickness. We also propose to combine our Cox PLS method with an initial gene selection step in which genes are ordered by their Cox score and only the highest-ranking k% of the genes are retained, obtaining a so-called supervised partial least squares regression method. In simulations, both the unsupervised and the supervised version outperform other Cox PLS methods.

12.
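The introduction of spatial geographic information complicates parameter estimation in spatial regression models. Because maximum likelihood is the main method used, it is commonly believed that least squares has no role in estimating the parameters of spatial regression models. By analyzing the parameter estimation techniques for spatial regression models, this study finds that least squares and maximum likelihood are used to estimate different parameters of the model, and that only by combining the two can all parameters be estimated quickly and effectively. Mathematical argument shows that the least squares estimator of the spatial regression model parameters is the best linear unbiased estimator. Significance tests can be carried out on the regression parameters under the condition that the estimators are normally distributed, whereas the spatial effect parameters cannot be tested in this way.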

13.
We consider the issue of performing testing inferences on the parameters that index the linear regression model under heteroskedasticity of unknown form. Quasi-t test statistics use asymptotically correct standard errors obtained from heteroskedasticity-consistent covariance matrix estimators. An alternative approach involves making an assumption about the functional form of the response variances and jointly modelling mean and dispersion effects. In this paper we compare the accuracy of testing inferences made using the two approaches. We consider several different quasi-t tests and also z tests performed after estimated generalized least squares estimation, which was carried out using three different estimation strategies. The numerical evidence shows that some quasi-t tests are typically considerably less size distorted in small samples than the tests carried out after the joint modelling of mean and dispersion effects. Finally, we present and discuss two empirical applications.

14.
We propose a robust regression method called regression with outlier shrinkage (ROS) for the traditional n > p cases. It improves over the other robust regression methods such as least trimmed squares (LTS) in the sense that it can achieve maximum breakdown value and full asymptotic efficiency simultaneously. Moreover, its computational complexity is no more than that of LTS. We also propose a sparse estimator, called sparse regression with outlier shrinkage (SROS), for robust variable selection and estimation. It is proven that SROS can not only give consistent selection but also estimate the nonzero coefficients with full asymptotic efficiency under the normal model. In addition, we introduce a concept of nearly regression equivariant estimator for understanding the breakdown properties of sparse estimators, and prove that SROS achieves the maximum breakdown value of nearly regression equivariant estimators. Numerical examples are presented to illustrate our methods.

15.
Approaches for regressor construction in the linear prediction problem are investigated in a framework similar to partial least squares and continuum regression, but weighted to allow for custom specification of an evaluative scheme. A cross-validatory continuum regression procedure is proposed, and shown to compare well with ordinary continuum regression in empirical demonstrations.

16.
We consider the construction of designs for the extrapolation of regression responses, allowing both for possible heteroscedasticity in the errors and for imprecision in the specification of the response function. We find minimax designs and correspondingly optimal estimation weights in the context of the following problems: (1) for ordinary least squares estimation, determine a design to minimize the maximum value of the integrated mean squared prediction error (IMSPE), with the maximum being evaluated over both types of departure; (2) for weighted least squares estimation, determine both weights and a design to minimize the maximum IMSPE; (3) choose weights and design points to minimize the maximum IMSPE, subject to a side condition of unbiasedness. Solutions to (1) and (2) are given for multiple linear regression with no interactions, a spherical design space and an annular extrapolation space. For (3) the solution is given in complete generality; as one example we consider polynomial regression. Applications to a dose-response problem for bioassays are discussed. Numerical comparisons, including a simulation study, indicate that, as well as being easily implemented, the designs and weights for (3) perform as well as those for (1) and (2) and outperform some common competitors for moderate but undetectable amounts of model bias.

17.
Errors-in-variables (EIV) regression is often used to gauge the linear relationship between two variables that are both subject to measurement and other errors, for example when comparing two measurement platforms (e.g., RNA sequencing vs. microarray). Scientists are often at a loss as to which EIV regression model to use, for there are infinitely many choices. We provide sound guidelines toward viable solutions to this dilemma by introducing two general nonparametric EIV regression frameworks: the compound regression and the constrained regression. It is shown that these approaches are equivalent to each other and to the general parametric structural modeling approach. The advantages of these methods lie in their intuitive geometric representations, their distribution-free nature, and their ability to offer candidate solutions with various optimal properties when the ratio of the error variances is unknown. Each includes the classic nonparametric regression methods of ordinary least squares, geometric mean regression (GMR), and orthogonal regression as special cases. Under these general frameworks, one can readily uncover some surprising optimal properties of the GMR, and truly comprehend the benefit of data normalization. Supplementary materials for this article are available online.

18.
Random coefficient regression models have been used to analyze cross-sectional and longitudinal data in economics and growth-curve data from biological and agricultural experiments. In the literature several estimators, including the ordinary least squares and the estimated generalized least squares (EGLS), have been considered for estimating the parameters of the mean model. Based on the asymptotic properties of the EGLS estimators, test statistics have been proposed for testing linear hypotheses involving the parameters of the mean model. An alternative estimator, the simple mean of the individual regression coefficients, provides estimation and hypothesis-testing procedures that are simple to compute and teach. The large sample properties of this simple estimator are shown to be similar to those of the EGLS estimator. The performance of the proposed estimator is compared with that of the existing estimators by Monte Carlo simulation.

19.
The geometric characterization of linear regression in terms of the ‘concentration ellipse’ by Galton [Galton, F., 1886, Family likeness in stature (with Appendix by Dickson, J.D.H.). Proceedings of the Royal Society of London, 40, 42–73.] and Pearson [Pearson, K., 1901, On lines and planes of closest fit to systems of points in space. Philosophical Magazine, 2, 559–572.] was extended to the case of unequal variances of the presumably uncorrelated errors in the experimental data [McCartin, B.J., 2003, A geometric characterization of linear regression. Statistics, 37(2), 101–117.]. In this paper, this geometric characterization is further extended to planar (and also linear) regression in three dimensions where a beautiful interpretation in terms of the concentration ellipsoid is developed.

20.
We consider settings where it is of interest to fit and assess regression submodels that arise as various explanatory variables are excluded from a larger regression model. The larger model is referred to as the full model; the submodels are the reduced models. We show that a computationally efficient approximation to the regression estimates under any reduced model can be obtained from a simple weighted least squares (WLS) approach based on the estimated regression parameters and covariance matrix from the full model. This WLS approach can be considered an extension to unbiased estimating equations of a first-order Taylor series approach proposed by Lawless and Singhal. Using data from the 2010 Nationwide Inpatient Sample (NIS), a 20% weighted, stratified, cluster sample of approximately 8 million hospital stays from approximately 1000 hospitals, we illustrate the WLS approach when fitting interval censored regression models to estimate the effect of type of surgery (robotic versus nonrobotic surgery) on hospital length-of-stay while adjusting for three sets of covariates: patient-level characteristics, hospital characteristics, and zip-code level characteristics. Ordinarily, standard fitting of the reduced models to the NIS data takes approximately 10 hours; using the proposed WLS approach, the reduced models take seconds to fit.
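
One way to read the WLS idea in this abstract is as a regression of the full-model estimates on a selection matrix for the retained parameters, weighted by the inverse of the full-model covariance matrix. The Python sketch below follows that reading for an ordinary linear model, where the result coincides with a direct refit; the function name `reduced_from_full` and the simulated data are assumptions for illustration, and the paper's interval censored setting is not reproduced here.

```python
import numpy as np

def reduced_from_full(theta_full, cov_full, keep):
    # Approximate reduced-model estimates from full-model output only.
    # Z selects the retained parameters; the weight matrix is the inverse
    # of the full-model covariance matrix.
    p = len(theta_full)
    Z = np.eye(p)[:, keep]
    W = np.linalg.inv(cov_full)
    return np.linalg.solve(Z.T @ W @ Z, Z.T @ W @ theta_full)

# Simulated linear model with three predictors (illustrative values).
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=50)

theta_full = np.linalg.lstsq(X, y, rcond=None)[0]
# Covariance of the OLS estimates up to the factor sigma^2, which cancels.
cov_full = np.linalg.inv(X.T @ X)

keep = [0, 1]  # drop the third predictor
approx = reduced_from_full(theta_full, cov_full, keep)
direct = np.linalg.lstsq(X[:, keep], y, rcond=None)[0]
print(approx, direct)
```

For ordinary least squares the two results agree exactly; for nonlinear or censored models the WLS result is only an approximation, which is the trade-off the abstract describes against the cost of refitting.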
