Similar literature
 20 similar articles found (search time: 31 ms)
1.
A new statistic, (p), is developed for variable selection in a system-of-equations model. The standardized total mean square error in the (p) statistic is weighted by the covariance matrix of the dependent variables rather than by the error covariance matrix of the true model used in the original definition. The new statistic can also be used for model selection among non-nested models. The estimate of (p), SC(p), is derived and shown to reduce to SCε(p), which has a form similar to Cp in a single-equation model, when the covariance matrix of the sampled dependent variables is replaced by the error covariance matrix under the full model.

2.
Robust automatic selection techniques for the smoothing parameter of a smoothing spline are introduced. They are based on a robust predictive error criterion and can be viewed as robust versions of Cp and cross-validation. They lead to smoothing splines which are stable and reliable in terms of mean squared error over a large spectrum of model distributions.

3.
We developed robust estimators that minimize a weighted L1 norm for the first-order bifurcating autoregressive model. When all of the weights are fixed, our estimate is an L1 estimate that is robust against outlying points in the response space and more efficient than the least squares estimate for heavy-tailed error distributions. When the weights are random and depend on the points in the factor space, the weighted L1 estimate is robust against outlying points in the factor space. Simulated and artificial examples are presented. The behavior of the proposed estimate is modeled through a Monte Carlo study.

4.
The process capability index Cp is the most popular index used in the manufacturing industry to provide a numerical measure of process precision. For normally distributed processes subject to fully automatic inspection, the inspected processes follow truncated normal distributions. In this article, we provide the formulae for the moments used in the Edgeworth approximation of the precision measure Cp for truncated normally distributed processes. Based on these moments, lower confidence bounds for various sample sizes and confidence levels are provided and tabulated. Consequently, practitioners can use the lower confidence bounds to determine whether their manufacturing processes meet preset precision requirements.
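As a point of reference for the index discussed in this abstract, the following is a minimal sketch of the textbook point estimate Cp = (USL - LSL) / (6s). It is not the truncated-normal lower confidence bound derived in the article; the function name `process_capability_cp` and its parameters are hypothetical names chosen for illustration.

```python
import statistics


def process_capability_cp(samples, lsl, usl):
    """Textbook point estimate of Cp = (USL - LSL) / (6 * s).

    samples: measured values from the process (at least two).
    lsl, usl: lower and upper specification limits.
    Note: this ignores the truncation effect studied in the article.
    """
    if usl <= lsl:
        raise ValueError("usl must be greater than lsl")
    s = statistics.stdev(samples)  # sample standard deviation (n - 1 divisor)
    if s == 0:
        raise ValueError("sample standard deviation is zero")
    return (usl - lsl) / (6.0 * s)
```

A value of Cp near or above 1 indicates that the specification width covers roughly six sample standard deviations; a confidence bound, as in the article, is needed to account for sampling error.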

5.
We propose the L1 distance between the distribution of a binned data sample and a probability distribution from which it is hypothetically drawn as a statistic for testing agreement between the data and a model. We study the distribution of this distance for N-element samples drawn from k bins of equal probability and derive asymptotic formulae for the mean and dispersion of L1 in the large-N limit. We argue that the L1 distance is asymptotically normally distributed, with the mean and dispersion being accurately reproduced by asymptotic formulae even for moderately large values of N and k.
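The statistic described in this abstract can be sketched as follows. The normalization used here (sum over bins of the absolute difference between the observed relative frequency and the model probability) is an assumption, since the abstract does not state it; `l1_distance` is a hypothetical name.

```python
from collections import Counter


def l1_distance(bin_indices, probabilities):
    """L1 distance between a binned sample and a model distribution.

    bin_indices: one bin index (0 .. k-1) per observation.
    probabilities: model probability of each of the k bins.
    Assumed normalization: sum_i |n_i / N - p_i|.
    """
    n = len(bin_indices)
    if n == 0:
        raise ValueError("sample is empty")
    counts = Counter(bin_indices)
    return sum(abs(counts.get(i, 0) / n - p)
               for i, p in enumerate(probabilities))
```

For the equal-probability case studied in the abstract, `probabilities` would be `[1 / k] * k`.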

6.
The conceptual predictive statistic, Cp, is a widely used criterion for model selection in linear regression. Cp serves as an estimator of a discrepancy, a measure that reflects the disparity between the generating model and a fitted candidate model. This discrepancy, based on scaled squared error loss, is asymmetric: an alternate measure is obtained by reversing the roles of the two models in the definition of the measure. We propose a variant of the Cp statistic based on estimating a symmetrized version of the discrepancy targeted by Cp. We claim that the resulting criterion provides better protection against overfitting than Cp, since the symmetric discrepancy is more sensitive towards detecting overspecification than its asymmetric counterpart. We illustrate our claim by presenting simulation results. Finally, we demonstrate the practical utility of the new criterion by discussing a modeling application based on data collected in a cardiac rehabilitation program at University of Iowa Hospitals and Clinics.
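For orientation, the classical (asymmetric) statistic that this abstract starts from is commonly written as Cp = SSE_p / σ̂² - n + 2p, where σ̂² is the error variance estimated from the full model. The sketch below implements only that standard form, not the symmetrized variant proposed in the article; `mallows_cp` and its argument names are hypothetical.

```python
def mallows_cp(sse_candidate, sigma2_full, n, p):
    """Classical Mallows' Cp = SSE_p / sigma2_full - n + 2 * p.

    sse_candidate: residual sum of squares of the candidate model.
    sigma2_full: error variance estimate from the full model.
    n: number of observations.
    p: number of regression parameters in the candidate model.
    """
    if sigma2_full <= 0:
        raise ValueError("sigma2_full must be positive")
    return sse_candidate / sigma2_full - n + 2 * p
```

When the candidate is the full model itself with k parameters and σ̂² = SSE_full / (n - k), the statistic equals k exactly, which is why candidates with Cp close to p are usually regarded as adequate.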

7.
Communications in Statistics - Theory and Methods, 2012, 41(13-14): 2465-2489
The Akaike information criterion, AIC, and Mallows’ Cp statistic have been proposed for selecting a smaller number of regressors in multivariate regression models with a fully unknown covariance matrix. All of these criteria are, however, based on the implicit assumption that the sample size is substantially larger than the dimension of the covariance matrix. To obtain a stable estimator of the covariance matrix, the dimension of the covariance matrix must be much smaller than the sample size. When the dimension is close to the sample size, it is necessary to use ridge-type estimators for the covariance matrix. In this article, we use ridge-type estimators for the covariance matrix and obtain the modified AIC and modified Cp statistic under the asymptotic theory in which both the sample size and the dimension go to infinity. It is shown numerically that these modified procedures perform very well in the sense of selecting the true model in large-dimensional cases.

8.
In this article, we propose a more general criterion, called the Sp-criterion, for subset selection in the multiple linear regression model. Many subset selection methods are based on the Least Squares (LS) estimator of β, but whenever the data contain an influential observation or the distribution of the error variable deviates from normality, the LS estimator performs ‘poorly’ and hence a method based on this estimator (for example, Mallows’ Cp-criterion) tends to select a ‘wrong’ subset. The proposed method overcomes this drawback, and its main feature is that it can be used with any type of estimator of β (either the LS estimator or any robust estimator) without any need to modify the proposed criterion. Moreover, this technique is operationally simpler to implement than other existing criteria. The method is illustrated with examples.

9.
This article is concerned with testing multiple hypotheses, one for each of a large number of small data sets. Such data are sometimes referred to as high-dimensional, low-sample size data. Our model assumes that each observation within a randomly selected small data set follows a mixture of C shifted and rescaled versions of an arbitrary density f. A novel kernel density estimation scheme, in conjunction with clustering methods, is applied to estimate f. The Bayes information criterion and a new criterion, the weighted mean of within-cluster variances, are used to estimate C, which is the number of mixture components or clusters. These results are applied to the multiple testing problem. The null sampling distribution of each test statistic is determined by f, and hence a bootstrap procedure that resamples from an estimate of f is used to approximate this null distribution.

10.
The problem of estimating the common mean μ of two univariate normal populations with unknown and unequal variances is considered from a decision-theoretic point of view. We restrict our attention to an appropriate class C and its three subclasses C0, C1, C2 of unbiased estimates of μ. We consider the usual estimate μ0 of μ, which is the weighted linear combination of the sample means with weights equal to the reciprocals of the sample variances. Its admissibility in C0 and extended admissibility in C are proved. Admissible estimates in C1 and C2 are also obtained. The loss is always assumed to be squared error. The question of admissibility of μ0 in the class of all estimators is still open.

11.
The concept of causality is naturally defined in terms of conditional distributions; however, almost all empirical work focuses on causality in mean. This paper proposes a nonparametric statistic to test conditional independence and Granger non-causality between two variables conditionally on another one. The test statistic is based on the comparison of conditional distribution functions using an L2 metric. We use the Nadaraya–Watson method to estimate the conditional distribution functions. We establish the asymptotic size and power properties of the test statistic and we motivate the validity of the local bootstrap. We run a simulation experiment to investigate the finite sample properties of the test and we illustrate its practical relevance by examining the Granger non-causality between S&P 500 Index returns and the VIX volatility index. Contrary to the conventional t-test, which is based on a linear mean-regression, we find that the VIX index predicts excess returns at both short and long horizons.
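The Nadaraya–Watson step mentioned in this abstract can be illustrated with a minimal sketch of a kernel estimate of the conditional distribution function F(y | x). The Gaussian kernel, the scalar conditioning variable, and the name `nw_conditional_cdf` are assumptions made for illustration; the abstract does not specify the kernel or bandwidth choice, and the test statistic itself is not reproduced here.

```python
import math


def nw_conditional_cdf(x0, y0, xs, ys, bandwidth):
    """Nadaraya-Watson estimate of P(Y <= y0 | X = x0).

    xs, ys: paired observations of the conditioning and response variables.
    bandwidth: positive smoothing parameter (chosen by the user).
    Uses a Gaussian kernel, which is an assumption of this sketch.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights are zero; increase the bandwidth")
    # Weighted average of the indicators 1{Y_i <= y0}.
    return sum(w for w, y in zip(weights, ys) if y <= y0) / total
```

The estimate is a weighted empirical distribution function, so it always lies in [0, 1] and is non-decreasing in `y0`.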

12.
Superefficiency of a projection density estimator. The author constructs a projection density estimator with a data-driven truncation index. This estimator reaches the superoptimal rates 1/n in mean integrated square error and {ln ln(n)/n}^{1/2} in uniform almost sure convergence over a given subspace which is dense in the class of all possible densities; the rate of the estimator is quasi-optimal everywhere else. The subspace in question may be chosen a priori by the statistician.

13.
Consider the problem of estimating the mean of a p (≥3)-variate multi-normal distribution with identity variance-covariance matrix and with unweighted sum of squared error loss. A class of minimax, noncomparable (i.e. no estimate in the class dominates any other estimate in the class) estimates is proposed; the class contains rules dominating the simple James-Stein estimates. The estimates are essentially smoothed versions of the scaled, truncated James-Stein estimates studied by Efron and Morris. Explicit and analytically tractable expressions for their risks are obtained and are used to give guidelines for selecting estimates within the class.
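The baseline that this abstract improves on, the simple James-Stein estimate for a single observation x from a p-variate normal distribution with identity covariance, can be sketched as follows. The sketch shrinks toward the origin and offers the standard positive-part truncation as an option; it does not implement the smoothed class proposed in the article, and `james_stein` is a hypothetical name.

```python
def james_stein(x, positive_part=False):
    """James-Stein estimate (1 - (p - 2) / ||x||^2) * x for p >= 3.

    x: a single observed vector, assumed N(theta, I).
    positive_part: if True, truncate the shrinkage factor at zero.
    """
    p = len(x)
    if p < 3:
        raise ValueError("the James-Stein estimate requires p >= 3")
    norm_sq = sum(v * v for v in x)
    if norm_sq == 0.0:
        return list(x)  # x is the zero vector; nothing to shrink
    factor = 1.0 - (p - 2) / norm_sq
    if positive_part:
        factor = max(0.0, factor)
    return [factor * v for v in x]
```

Without truncation the factor can become negative when the squared norm of x is below p - 2, which reverses the sign of every coordinate; the positive-part option avoids this.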

14.
A number of efficient computer codes are available for the simple linear L1 regression problem. However, a number of these codes can be made more efficient by utilizing the least squares solution. In fact, a couple of available computer programs already do so.

We report the results of a computational study comparing several openly available computer programs for solving the simple linear L1 regression problem with and without computing and utilizing a least squares solution.

15.
The performance of nine different nonparametric regression estimates is empirically compared on ten different real datasets. The number of data points in the real datasets varies between 7,900 and 18,000, where each real dataset contains between 5 and 20 variables. The nonparametric regression estimates include kernel, partitioning, nearest neighbor, additive spline, neural network, penalized smoothing splines, local linear kernel, regression trees, and random forests estimates. The main result is a table containing the empirical L2 risks of all nine nonparametric regression estimates on the evaluation part of the different datasets. The neural networks and random forests are the two estimates performing best. The datasets are publicly available, so that any new regression estimate can be easily compared with all nine estimates considered in this article by just applying it to the publicly available data and by computing its empirical L2 risks on the evaluation part of the datasets.
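The comparison criterion named in this abstract, the empirical L2 risk on the evaluation part of a dataset, is commonly defined as the average squared prediction error. The sketch below assumes that definition; `empirical_l2_risk` and the callable `estimate` are hypothetical names, and the abstract's actual data split is not reproduced.

```python
def empirical_l2_risk(estimate, eval_x, eval_y):
    """Average squared error of a fitted regression estimate.

    estimate: callable mapping one input point to a predicted value.
    eval_x, eval_y: inputs and responses of the evaluation part.
    """
    if len(eval_x) != len(eval_y) or not eval_x:
        raise ValueError("evaluation data must be non-empty and aligned")
    return sum((estimate(x) - y) ** 2
               for x, y in zip(eval_x, eval_y)) / len(eval_x)
```

Any new estimate can be scored this way as long as it is fitted only on the non-evaluation part of the data.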

16.
When one or a few observations are deleted in the multiple linear regression model, the deletion can affect the variable selection. In this paper we derive the formula for the Mallows Cp criterion when k observations are deleted and express it as a function of basic building blocks such as residuals and leverages. Also, two real data sets are used to see how the selected model changes as a few observations are deleted.

17.
In this article, we propose a method of averaging generalized least squares estimators for linear regression models with heteroskedastic errors. The averaging weights are chosen to minimize Mallows’ Cp-like criterion. We show that the weight vector selected by our method is optimal. It is also shown that this optimality holds even when the variances of the error terms are estimated and the feasible generalized least squares estimators are averaged. The variances can be estimated parametrically or nonparametrically. Monte Carlo simulation results are encouraging. An empirical example illustrates that the proposed method is useful for predicting a measure of firms’ performance.

18.
For the general linear regression model Y = Xη + e, we construct small-sample exponentially tilted empirical confidence intervals for a linear parameter θ = aᵀη and for nonlinear functions of η. The coverage error for the intervals is Op(1/n), as shown in Tingley and Field (1990). The technique, though sample-based, does not require bootstrap resampling. The first step is calculation of an estimate for η. We have used a Mallows estimate. The algorithm applies whenever η is estimated as the solution of a system of equations having expected value 0. We include calculations of the relative efficiency of the estimator (compared with the classical least-squares estimate). The intervals are compared with asymptotic intervals as found, for example, in Hampel et al. (1986). We demonstrate that the procedure gives sensible intervals for small samples.

19.
In this paper we show that versions of statistical functionals which are obtained by smoothing the corresponding empirical d.f. with an appropriate kernel can reduce the variance and the mean square error of the statistic. This is shown by studying the influence function of the functional. The smaller variance is achieved when the influence function is either discontinuous or piecewise linear with convexity towards the x-axis. Examples for M- and L-estimators are given.

20.
In Bayesian inference, the Bayes estimator is the alternative with the minimum expected loss. In most cases, the loss function measures the distance between the alternative and the parameter. Therefore, any distance can lead to a loss function. Among the best known distance functions is the Lp distance, where the choice of the value p may be difficult and arbitrary. This paper examines robust models where the loss function is modelled by the Lp family. Our solution concept is the non-dominated alternative. We characterize the non-dominated set by having the posterior distribution function satisfy a particular asymmetry property. We also include an example to illustrate the methodology described.
