Similar literature
 Found 20 similar documents (search time: 171 ms)
1.
We consider a regression analysis of a multivariate response on a vector of predictors. In this article, we develop a sliced inverse regression-based method for reducing the dimension of predictors without requiring a prespecified parametric model. Our proposed method preserves as much regression information as possible. We derive the asymptotic weighted chi-squared test for dimension. Simulation results are reported and comparisons are made with three methods: most predictable variates, k-means inverse regression, and the canonical correlation approach.

2.
In this paper, we introduce linear modeling of canonical correlation analysis, which estimates canonical direction matrices by minimising a quadratic objective function. The linear modeling results in a class of estimators of canonical direction matrices, and an optimal class is derived in the sense described herein. The optimal class guarantees several desirable advantages: first, its estimates of canonical direction matrices are asymptotically efficient; second, its test statistic for determining the number of canonical covariates always has a chi-squared distribution asymptotically; third, it is straightforward to construct tests for variable selection. The standard canonical correlation analysis and other existing methods turn out to be suboptimal members of the class. Finally, we study the role of canonical variates as a means of dimension reduction for predictors and responses in multivariate regression. Numerical studies and data analysis are presented.

3.
In this article, we consider the problem of variable selection in linear regression when multicollinearity is present in the data. It is well known that in the presence of multicollinearity, the performance of the least squares (LS) estimator of the regression parameters is not satisfactory. Consequently, subset selection methods based on LS estimates, such as Mallows' Cp, lead to the selection of inadequate subsets. To overcome the problem of multicollinearity in subset selection, a new subset selection algorithm based on the ridge estimator is proposed. It is shown that the new algorithm is a better alternative to Mallows' Cp when the data exhibit multicollinearity.
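
For orientation, the LS-based criterion that this abstract takes as its baseline can be sketched as follows. This is a minimal illustration of the classical Mallows' Cp only, assuming the usual definition Cp = SSE_p / sigma2_full - n + 2p; it is not the ridge-based algorithm the abstract proposes, and the function names and simulated data are hypothetical.

```python
import numpy as np

def sse(X, y):
    """Residual sum of squares from a least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def mallows_cp(X_full, y, subset):
    """Classical Mallows' Cp for the columns listed in `subset`."""
    n, k = X_full.shape
    sigma2_full = sse(X_full, y) / (n - k)   # error variance from the full model
    p = len(subset)                          # number of columns in the submodel
    return sse(X_full[:, subset], y) / sigma2_full - n + 2 * p

# Hypothetical data: intercept plus three predictors, only the first matters.
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X @ np.array([1.0, 2.0, 0.0, 0.0]) + rng.normal(size=n)

cp_full = mallows_cp(X, y, [0, 1, 2, 3])   # equals 4 by construction
cp_small = mallows_cp(X, y, [0, 1])        # intercept + first predictor
```

By construction the full model always has Cp equal to its number of columns, so only the values for proper subsets carry information.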

4.
One of the standard variable selection procedures in multiple linear regression is to use a penalisation technique in least-squares (LS) analysis. In this setting, many different types of penalties have been introduced to achieve variable selection. It is well known that LS analysis is sensitive to outliers, and consequently outliers can present serious problems for the classical variable selection procedures. Since rank-based procedures have desirable robustness properties compared to LS procedures, we propose a rank-based adaptive lasso-type penalised regression estimator and a corresponding variable selection procedure for linear regression models. The proposed estimator and variable selection procedure are robust against outliers in both response and predictor space. Furthermore, since rank regression can yield unstable estimators in the presence of multicollinearity, in order to provide inference that is robust against multicollinearity, we adjust the penalty term in the adaptive lasso function by incorporating the standard errors of the rank estimator. The theoretical properties of the proposed procedures are established and their performances are investigated by means of simulations. Finally, the estimator and variable selection procedure are applied to the Plasma Beta-Carotene Level data set.

5.
In discrete event simulation, the method of control variates is often used to reduce the variance of estimation for the mean of the output response. In the present paper, it is shown that when three or more control variates are used, the usual linear regression estimator of the mean response is one of a large class of unbiased estimators, many of which have smaller variance than the usual estimator. In simulation studies using control variates, a confidence interval for the mean response is typically reported as well. Intervals with shorter width have been proposed using control variates in the literature. The present paper, however, develops confidence intervals which not only have shorter width but also have higher coverage probability than the usual confidence interval.
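
The "usual linear regression estimator" mentioned in this abstract can be illustrated with a small sketch. It assumes controls whose true means are known and adjusts the sample mean of the response by the fitted regression slope; the improved estimators and intervals of the paper are not reproduced, and all names and the simulated model are hypothetical.

```python
import numpy as np

def control_variate_mean(y, C, mu_C):
    """Regression-adjusted estimate of E[y] using controls C with known means mu_C."""
    y = np.asarray(y, dtype=float)
    C = np.asarray(C, dtype=float).reshape(len(y), -1)
    Cc = C - C.mean(axis=0)
    # Least-squares slope of the centred response on the centred controls
    beta, *_ = np.linalg.lstsq(Cc, y - y.mean(), rcond=None)
    return float(y.mean() - (C.mean(axis=0) - mu_C) @ beta)

rng = np.random.default_rng(1)

def one_run(n=200):
    """One simulated experiment: plain mean and control-variate estimate."""
    C = rng.normal(size=(n, 3))                      # three controls, known mean 0
    y = 5.0 + C @ np.array([1.0, -2.0, 0.5]) + 0.3 * rng.normal(size=n)
    return y.mean(), control_variate_mean(y, C, np.zeros(3))

runs = np.array([one_run() for _ in range(500)])
var_plain, var_cv = runs.var(axis=0)                 # variance across replications
```

In this toy setup the controls explain most of the response, so the adjusted estimator varies far less across replications than the plain sample mean.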

6.
For estimation of time-varying coefficient longitudinal models, the widely used local least-squares (LS) or covariance-weighted local LS smoothing uses information from the local sample average. Motivated by the fact that a combination of multiple quantiles provides a more complete picture of the distribution, we investigate quantile regression-based methods to improve efficiency by optimally combining information across quantiles. Under the working independence scenario, the asymptotic variance of the proposed estimator approaches the Cramér–Rao lower bound. In the presence of dependence among within-subject measurements, we adopt a prewhitening technique to transform regression errors into independent innovations and show that the prewhitened optimally weighted quantile average estimator asymptotically achieves the Cramér–Rao bound for the independent innovations. Fully data-driven bandwidth selection and optimal weights estimation are implemented through a two-step procedure. Monte Carlo studies show that the proposed method delivers more robust and superior overall performance than that of the existing methods.

7.
The canonical variates in canonical correlation analysis are often interpreted by looking at the weights or loadings of the variables in each canonical variate and effectively ignoring those variables whose weights or loadings are small. It is shown that such a procedure can be misleading. The related problem of selecting a subset of the original variables which preserves the information in the most important canonical variates is also examined. Because of different possible definitions of ‘the information in canonical variates’, any such subset selection needs very careful consideration.

8.
Neglecting heteroscedasticity of error terms may lead to wrong identification of a regression model (see appendix). Employing White's (heteroscedasticity-resistant) estimator of the covariance matrix of the estimated regression coefficients may lead to the correct decision about the significance of individual explanatory variables under heteroscedasticity. However, White's estimator was established for least squares (LS) regression analysis (when the error terms are normally distributed, LS and maximum likelihood (ML) analysis coincide, and hence White's estimate of the covariance matrix is then available for ML regression analysis, too). Establishing a White-type estimate for another estimator of regression coefficients requires a Bahadur representation of the estimator in question under heteroscedasticity of the error terms. The derivation of a Bahadur representation for other (robust) estimators requires some tools. The key tool proved to be a tight approximation of the empirical distribution function (d.f.) of residuals by the theoretical d.f. of the error terms of the regression model. We need the approximation to be uniform in the argument of the d.f. as well as in the regression coefficients. The present paper offers this approximation for the situation when the error terms are heteroscedastic.
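
As a reference point for the LS case discussed in this abstract, White's estimator in its basic (HC0) sandwich form can be sketched as below. This is an illustration under the assumption of an ordinary LS fit; it does not touch the Bahadur-representation results of the paper, and the function name and simulated data are hypothetical.

```python
import numpy as np

def white_hc0(X, y):
    """OLS coefficients and White's (HC0) sandwich covariance estimate."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta                          # LS residuals
    meat = X.T @ (X * (e ** 2)[:, None])      # X' diag(e_i^2) X
    return beta, XtX_inv @ meat @ XtX_inv

# Hypothetical heteroscedastic data: error s.d. grows with x.
rng = np.random.default_rng(2)
n = 500
x = rng.uniform(1, 5, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + x * rng.normal(size=n)

beta_hat, cov_hc0 = white_hc0(X, y)
se_hc0 = np.sqrt(np.diag(cov_hc0))            # robust standard errors
```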

9.
In this article, we propose a more general criterion, called the Sp-criterion, for subset selection in the multiple linear regression model. Many subset selection methods are based on the Least Squares (LS) estimator of β, but whenever the data contain an influential observation or the distribution of the error variable deviates from normality, the LS estimator performs ‘poorly’ and hence a method based on this estimator (for example, Mallows’ Cp-criterion) tends to select a ‘wrong’ subset. The proposed method overcomes this drawback and its main feature is that it can be used with any type of estimator (either the LS estimator or any robust estimator) of β without any need for modification of the proposed criterion. Moreover, this technique is operationally simple to implement as compared to other existing criteria. The method is illustrated with examples.

10.
The mode of a distribution provides an important summary of data and is often estimated on the basis of some non-parametric kernel density estimator. This article develops a new data analysis tool called modal linear regression in order to explore high-dimensional data. Modal linear regression models the conditional mode of a response Y given a set of predictors x as a linear function of x. Modal linear regression differs from standard linear regression in that standard linear regression models the conditional mean (as opposed to mode) of Y as a linear function of x. We propose an expectation–maximization algorithm in order to estimate the regression coefficients of modal linear regression. We also provide asymptotic properties for the proposed estimator without the symmetric assumption of the error density. Our empirical studies with simulated data and real data demonstrate that the proposed modal regression gives shorter predictive intervals than mean linear regression, median linear regression and MM-estimators.

11.
Vasicek (1976) proposed an estimator of entropy based on spacings. A new estimator of entropy is proposed. This new estimator is based on local linear regression. Comparisons between this new estimator and Vasicek's estimator are made. The mean square error (MSE) of the new estimator is consistently smaller than the MSE of Vasicek's estimator.
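
The spacing-based baseline referred to here can be sketched in a few lines. The code assumes the standard form of Vasicek's estimator, the average of log(n/(2m) (X_(i+m) - X_(i-m))) with order statistics clamped at the sample extremes; the new local-linear-regression estimator of the abstract is not shown, and the window sizes used are arbitrary choices.

```python
import numpy as np

def vasicek_entropy(x, m):
    """Vasicek's m-spacing entropy estimate (natural logarithm)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    idx = np.arange(n)
    upper = x[np.minimum(idx + m, n - 1)]   # X_(i+m), clamped to the maximum
    lower = x[np.maximum(idx - m, 0)]       # X_(i-m), clamped to the minimum
    return float(np.mean(np.log(n / (2 * m) * (upper - lower))))

# Evenly spaced points on [0, 1]: the entropy of Uniform(0, 1) is 0.
h_uniform = vasicek_entropy(np.linspace(0.0, 1.0, 1001), m=10)

# Standard normal sample: the true entropy is 0.5 * log(2 * pi * e).
rng = np.random.default_rng(3)
h_normal = vasicek_entropy(rng.normal(size=5000), m=35)
true_normal = 0.5 * np.log(2 * np.pi * np.e)
```

The clamping at the boundaries shortens the outermost spacings, which is one source of the estimator's negative bias in small samples.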

12.
Summary. A new estimator of the regression parameters is introduced in a multivariate multiple-regression model in which both the vector of explanatory variables and the vector of response variables are assumed to be random. The affine equivariant estimate matrix is constructed using the sign covariance matrix (SCM) where the sign concept is based on Oja's criterion function. The influence function and asymptotic theory are developed to consider robustness and limiting efficiencies of the SCM regression estimate. The estimate is shown to be consistent with a limiting multinormal distribution. The influence function, as a function of the length of the contamination vector, is shown to be linear in elliptic cases; for the least squares (LS) estimate it is quadratic. The asymptotic relative efficiencies with respect to the LS estimate are given in the multivariate normal as well as the t-distribution cases. The SCM regression estimate is highly efficient in the multivariate normal case and, for heavy-tailed distributions, it performs better than the LS estimate. Simulations are used to consider finite sample efficiencies with similar results. The theory is illustrated with an example.

13.
A Bayesian formulation of the canonical form of the standard regression model is used to compare various Stein-type estimators and the ridge estimator of regression coefficients. A particular (“constant prior”) Stein-type estimator having the same pattern of shrinkage as the ridge estimator is recommended for use.

14.
In regression analysis, the restricted principal components regression estimator has been proposed to deal with the problem of multicollinearity. In this paper, we compare the restricted principal components regression estimator, the principal components regression estimator, and the ordinary least-squares estimator with each other under Pitman's closeness criterion. We show that the restricted principal components regression estimator is always superior to the principal components regression estimator; that, under certain conditions, the restricted principal components regression estimator is superior to the ordinary least-squares estimator; and that, under certain conditions, the principal components regression estimator is superior to the ordinary least-squares estimator, all under Pitman's closeness criterion.

15.
This work is concerned with the estimation of multi-dimensional regression and the asymptotic behavior of the test involved in selecting models. The main problem with such models is that we need to know the covariance matrix of the noise to get an optimal estimator. We show in this article that if we choose to minimize the logarithm of the determinant of the empirical error covariance matrix, then we get an asymptotically optimal estimator. Moreover, under suitable assumptions, we show that this cost function leads to a very simple asymptotic law for testing the number of parameters of an identifiable and regular regression model. Numerical experiments confirm the theoretical results.

16.
The Dabrowska (Ann Stat 16:1475–1489, 1988) product integral representation of the multivariate survivor function is extended, leading to a nonparametric survivor function estimator for an arbitrary number of failure time variates that has a simple recursive formula for its calculation. Empirical process methods are used to sketch proofs for this estimator’s strong consistency and weak convergence properties. Summary measures of pairwise and higher-order dependencies are also defined and nonparametrically estimated. Simulation evaluation is given for the special case of three failure time variates.

17.
As is well known, the ordinary least-squares estimator (OLSE) is unbiased and has the minimum variance among all linear unbiased estimators. However, under multicollinearity the estimator is generally unstable and poor in the sense that the variance of the regression coefficients may be inflated and the absolute values of the estimates may be too large. There are several classes of biased estimators in the statistical literature that decrease the effect of multicollinearity in the design matrix. Here, based on the Cholesky decomposition, we propose such an estimator, which slightly distorts the data. The exact risk expressions as well as the biases are derived for the proposed estimator. Also, some results demonstrating the superiority of the suggested estimator over the OLSE are obtained. Finally, a Monte Carlo simulation study and a real data application related to acetylene data are presented to support our theoretical discussions.

18.
The seemingly unrelated regression (SUR) method is applied to the instrumental variable (IV) estimation of the canonical contagion models. A finite-sample Monte Carlo experiment shows that the resulting IV-SUR estimator is substantially better than the existing IV estimator in terms of both bias and mean squared error under diverse settings of instruments, conditional heteroscedasticity, and cross-section correlation.

19.
Variance estimation is a fundamental problem in statistical modelling. In ultrahigh dimensional linear regression where the dimensionality is much larger than the sample size, traditional variance estimation techniques are not applicable. Recent advances in variable selection in ultrahigh dimensional linear regression make this problem accessible. One of the major problems in ultrahigh dimensional regression is the high spurious correlation between the unobserved realized noise and some of the predictors. As a result, the realized noises are actually predicted when extra irrelevant variables are selected, leading to serious underestimation of the level of noise. We propose a two-stage refitted procedure via a data splitting technique, called refitted cross-validation, to attenuate the influence of irrelevant variables with high spurious correlations. Our asymptotic results show that the resulting procedure performs as well as the oracle estimator, which knows in advance the mean regression function. The simulation studies lend further support to our theoretical claims. The naive two-stage estimator and the plug-in one-stage estimators using the lasso and smoothly clipped absolute deviation are also studied and compared. Their performances can be improved by the proposed refitted cross-validation method.
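
The data-splitting idea described in this abstract can be sketched roughly as follows: select variables on one half of the data, refit by least squares on the other half, swap the roles, and average the two residual-variance estimates. The selector here is simple marginal-correlation screening, swapped in as an assumption in place of the lasso or SCAD; the function names, the number of retained variables, and the simulated data are all hypothetical.

```python
import numpy as np

def screen(X, y, k):
    """Indices of the k columns most correlated with y (stand-in selector)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return np.argsort(corr)[-k:]

def refit_variance(X, y, cols):
    """Residual variance from OLS of y on an intercept plus the chosen columns."""
    Z = np.column_stack([np.ones(len(y)), X[:, cols]])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    return float(resid @ resid) / (len(y) - Z.shape[1])

def rcv_variance(X, y, k, rng):
    """Select on one half, refit on the other, swap roles, then average."""
    perm = rng.permutation(len(y))
    a, b = perm[: len(y) // 2], perm[len(y) // 2:]
    v1 = refit_variance(X[b], y[b], screen(X[a], y[a], k))
    v2 = refit_variance(X[a], y[a], screen(X[b], y[b], k))
    return 0.5 * (v1 + v2)

# Hypothetical data: more predictors than observations, two true signals,
# true noise variance equal to 1.
rng = np.random.default_rng(4)
n, p = 400, 500
X = rng.normal(size=(n, p))
y = 3.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(size=n)

sigma2_rcv = rcv_variance(X, y, k=5, rng=rng)
```

Because the refit uses data that played no part in the selection, spuriously correlated predictors chosen on one half have no special ability to absorb noise on the other half.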

20.
Swindel (1976) introduced a modified ridge regression estimator based on prior information. A necessary and sufficient condition is derived for Swindel's proposed estimator to have lower risk than the conventional ordinary ridge regression estimator when both estimators are computed using the same value of k.
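
For readers unfamiliar with the two estimators being compared, a minimal sketch follows. The ordinary ridge estimator is (X'X + kI)^{-1} X'y; the modified version is written here as (X'X + kI)^{-1} (X'y + k b0) for a prior guess b0, which is the form commonly attributed to Swindel and should be read as an assumption rather than a quotation from the paper. Function names and data are hypothetical.

```python
import numpy as np

def ridge(X, y, k):
    """Ordinary ridge estimator (X'X + kI)^{-1} X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

def ridge_with_prior(X, y, k, b0):
    """Ridge estimator shrunk toward a prior guess b0 (assumed modified form)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y + k * b0)

# Hypothetical data with two nearly collinear columns.
rng = np.random.default_rng(5)
n = 100
z = rng.normal(size=n)
X = np.column_stack([z + 0.01 * rng.normal(size=n),
                     z + 0.01 * rng.normal(size=n),
                     rng.normal(size=n)])
y = X @ np.array([1.0, 1.0, 0.5]) + rng.normal(size=n)

b_ols = ridge(X, y, 0.0)                                   # k = 0 gives least squares
b_ridge = ridge(X, y, 1.0)                                 # shrinks toward zero
b_prior = ridge_with_prior(X, y, 1.0, np.array([1.0, 1.0, 0.5]))
```

With b0 set to the zero vector the second function reduces to the first, so both estimators can be evaluated at the same value of k, as in the comparison the abstract describes.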
