1.
The maximum absolute studentized residual is commonly used for testing for a single outlier in a linear regression model. This test statistic, however, is seldom discussed in a nonlinear regression setting. We simulate the critical values for the tests under various nonlinear models. The associated critical values are found to be very close to one another. Moreover, they are very well approximated using the critical values obtained from F-distributions based on the Bonferroni equations in linear models. The results are promising even in samples of size 6.
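As a reading aid, the linear-model Bonferroni approximation mentioned in the abstract can be written out as follows. This is the standard linear-regression result, not a formula taken from the paper itself; the symbols $n$ (sample size), $p$ (number of regression parameters), $r_i$ (internally studentized residual) and $\alpha$ (test level) are notation assumed here.

```latex
% Each internally studentized residual r_i maps to a t-distributed quantity:
%   t_i = r_i \sqrt{\frac{n-p-1}{n-p-r_i^2}} \sim t_{n-p-1},
% so t_i^2 \sim F_{1,\,n-p-1}. The Bonferroni bound over the n residuals gives
\[
  P\Bigl(\max_i |t_i| > c\Bigr) \le n\,P\bigl(|t_i| > c\bigr) = \alpha
  \quad\text{when}\quad c^2 = F_{1-\alpha/n;\,1,\,n-p-1}.
\]
% Solving t^2 = r^2 (n-p-1)/(n-p-r^2) for r yields the critical value for \max_i |r_i|:
\[
  r_{\text{crit}} = \sqrt{\frac{(n-p)\,F_{1-\alpha/n;\,1,\,n-p-1}}{\,n-p-1+F_{1-\alpha/n;\,1,\,n-p-1}\,}} .
\]
```

In a nonlinear model these distributional statements hold only approximately, which is why the abstract compares them against simulated critical values.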
2.
This article considers both Partial Least Squares (PLS) and Ridge Regression (RR) methods to combat the multicollinearity problem. A simulation study has been conducted to compare their performance with that of Ordinary Least Squares (OLS). With varying degrees of multicollinearity, it is found that both the PLS and RR estimators produce significant reductions in the Mean Square Error (MSE) and Prediction Mean Square Error (PMSE) over OLS. The simulation study also shows that RR performs better when the error variance is large, while the PLS estimator achieves its best results when the model includes more variables. A further advantage of ridge regression over PLS is that it can provide 95% confidence intervals for the regression coefficients, whereas PLS cannot.
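To make the ridge idea concrete, here is a minimal sketch of the ridge estimator (X'X + kI)^{-1} X'y for two nearly collinear predictors. The data, the ridge constant `k`, and the helper name `ridge_2d` are illustrative assumptions, not taken from the article; the model is fitted without an intercept for brevity.

```python
import math
import random

def ridge_2d(x1, x2, y, k):
    """Solve (X'X + k*I) b = X'y for two predictors, no intercept.

    k = 0 gives ordinary least squares; k > 0 gives ridge regression.
    """
    s11 = sum(a * a for a in x1) + k
    s22 = sum(b * b for b in x2) + k
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)

# Hypothetical data: x2 is almost a copy of x1, so X'X is nearly singular.
rng = random.Random(0)
x1 = [rng.gauss(0, 1) for _ in range(50)]
x2 = [a + 0.01 * rng.gauss(0, 1) for a in x1]
y = [a + b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

ols = ridge_2d(x1, x2, y, k=0.0)    # unstable under collinearity
ridge = ridge_2d(x1, x2, y, k=1.0)  # shrunken, more stable coefficients
print("OLS:  ", ols)
print("ridge:", ridge)
```

The ridge coefficient vector is never longer than the OLS vector, which is the shrinkage that trades a little bias for a lower variance.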
3.
Five widely used test statistics for detecting outliers and influential observations were studied using the Monte Carlo method. The test statistic based on Studentized residuals, with critical values given by Tietjen, Moore and Beckman (1973), appears to be the best procedure for detecting a single outlier in simple linear regression.
4.
A weighted bootstrap approximation for comparing the error distributions in nonparametric regression
Gustavo I. Rivas Martínez 《Journal of Statistical Computation and Simulation》2017,87(18):3503-3520
Several procedures have been proposed for testing the equality of error distributions in two or more nonparametric regression models. Here we deal with methods based on comparing estimators of the cumulative distribution function (CDF) of the errors in each population to an estimator of the common CDF under the null hypothesis. The null distribution of the associated test statistics has been approximated by means of a smooth bootstrap (SB) estimator. This paper proposes to approximate their null distribution through a weighted bootstrap. It is shown that it produces a consistent estimator. The finite sample performance of this approximation is assessed by means of a simulation study, where it is also compared to the SB. This study reveals that, from a computational point of view, the proposed approximation is more efficient than the one provided by the SB.
5.
Consider a partially linear regression model with an unknown vector parameter β, an unknown function g(·), and unknown heteroscedastic error variances. In this paper we develop an asymptotic semiparametric generalized least squares estimation theory under some weak moment conditions. These moment conditions are satisfied by many of the error distributions encountered in practice, and our theory does not require the number of replications to go to infinity.
6.
Consider the problem of pointwise estimation of f in a multivariate isotonic regression model $Z=f(X_1,\dots,X_d)+\epsilon$, where Z is the response variable, f is an unknown nonparametric regression function, which is isotonic with respect to each component, and $\epsilon$ is the error term. In this article, we investigate the behavior of the least squares estimator of f. We generalize the greatest convex minorant characterization of the isotonic regression estimator to the multivariate case and use it to establish the asymptotic distribution of a properly normalized version of the estimator. Moreover, based on this estimator, we test whether the multivariate isotonic regression function at a fixed point is larger (or smaller) than a specified value, and the consistency of the test is established. The practicability of the estimator and the test is shown on simulated and real data as well.
7.
《Journal of Statistical Computation and Simulation》2012,82(1):15-23
Heterogeneity in lifetime data may be modelled by multiplying an individual's hazard by an unobserved frailty. We test for the presence of frailty of this kind in univariate and bivariate data with Weibull distributed lifetimes, using statistics based on the ordered Cox–Snell residuals from the null model of no frailty. The form of the statistics is suggested by outlier testing in the gamma distribution. We find through simulation that the sum of the k largest or k smallest order statistics, for suitably chosen k, provides a powerful test when the frailty distribution is assumed to be gamma or positive stable, respectively. We provide recommended values of k for sample sizes up to 100 and simple formulae for estimated critical values for tests at the 5% level.
8.
We discuss and evaluate bootstrap algorithms for obtaining confidence intervals for parameters in Generalized Linear Models when the data are correlated. The methods are based on a stratified bootstrap and are suited to correlation occurring within “blocks” of data (e.g., individuals within a family, teeth within a mouth, etc.). Application of the intervals to data from a Dutch follow-up study on preterm infants shows the corroborative usefulness of the intervals, while the intervals are seen to be a powerful diagnostic in studying annual measles data. In a simulation study, we compare the coverage rates of the proposed intervals with existing methods (e.g., via Generalized Estimating Equations). In most cases, the bootstrap intervals are seen to perform better than current methods, and are produced in an automatic fashion, so that the user need not know (or have to guess) the dependence structure within a block.
9.
Nityananda Sarkar 《Communications in Statistics - Theory and Methods》2013,42(7):1987-2000
It is well known in the literature on multicollinearity that one of its major consequences for the ordinary least squares estimator is that the estimator has large sampling variances, which in turn might inappropriately lead to the exclusion of otherwise significant coefficients from the model. To circumvent this problem, two accepted estimation procedures that are often suggested are the restricted least squares method and the ridge regression method. While the former leads to a reduction in the sampling variance of the estimator, the latter ensures a smaller mean square error value for the estimator. In this paper we propose a new estimator based on a criterion that combines the ideas underlying these two estimators. The standard properties of this new estimator are studied. It is also shown that this estimator is superior to both the restricted least squares and the ordinary ridge regression estimators by the criterion of mean square error of the estimator of the regression coefficients when the restrictions are indeed correct. The conditions for superiority of this estimator over the other two are also derived for the situation when the restrictions are not correct.
10.
Bin Wang Satya N. Mishra Madhuri S. Mulekar Nutan Mishra Kun Huang 《Journal of statistical planning and inference》2010
The generalized bootstrap is a parametric bootstrap method in which the underlying distribution function is estimated by fitting a generalized lambda distribution to the observed data. In this study, the generalized bootstrap is compared with the traditional parametric and non-parametric bootstrap methods in estimating the quantiles at different levels, especially for high quantiles. The performances of the three methods are evaluated in terms of cover rate, average interval width and standard deviation of width of the 95% bootstrap confidence intervals. Simulation results showed that the generalized bootstrap has overall better performance than the non-parametric bootstrap in high quantile estimation.
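For orientation, the non-parametric baseline in this comparison can be sketched as a percentile bootstrap interval for a high quantile. This is a minimal illustration only: it does not implement the generalized (lambda-distribution) bootstrap, and the sample, the quantile rule, and the function names are assumptions made here.

```python
import random

def quantile(sorted_xs, p):
    """Linear-interpolation sample quantile of an already sorted list."""
    pos = p * (len(sorted_xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    frac = pos - lo
    return sorted_xs[lo] * (1 - frac) + sorted_xs[hi] * frac

def bootstrap_quantile_ci(data, p, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the p-th quantile."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        # Resample n observations with replacement and re-estimate the quantile.
        resample = sorted(rng.choices(data, k=n))
        estimates.append(quantile(resample, p))
    estimates.sort()
    alpha = 1 - level
    return quantile(estimates, alpha / 2), quantile(estimates, 1 - alpha / 2)

# Hypothetical sample: 200 draws from a unit exponential distribution.
rng = random.Random(1)
sample = [rng.expovariate(1.0) for _ in range(200)]
lower, upper = bootstrap_quantile_ci(sample, p=0.95)
print(f"95% CI for the 0.95 quantile: ({lower:.3f}, {upper:.3f})")
```

Because every resampled quantile lies within the observed data range, this interval can never extend beyond the sample maximum, which is one reason high quantiles are hard for the non-parametric bootstrap.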
11.
Jack Kleijnen 《Communications in Statistics - Simulation and Computation》2013,42(3):303-313
In experimental design applications unbiased estimators $s_i^2$ of the variances $\sigma_i^2$ are possible. These estimators may be used in Weighted Least Squares (WLS) when estimating the parameters β. The resulting small-sample behavior is investigated in a Monte Carlo experiment. This experiment shows that an asymptotically valid covariance formula can be used if $s_i^2$ is based on, say, at least 5 observations. The WLS estimator based on the estimators $s_i^2$ gives more accurate estimators of β, provided the $\sigma_i^2$ differ by a factor of, say, 10.
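The setup described above can be sketched as follows, assuming a straight-line model with replicated observations at each design point; the design, the true coefficients, and the standard deviations are hypothetical values chosen for illustration, not those of the article's Monte Carlo experiment.

```python
import random
import statistics

def wls_line(xs, ys, weights):
    """Weighted least squares fit of y = b0 + b1 * x (closed form)."""
    sw = sum(weights)
    xbar = sum(w * x for w, x in zip(weights, xs)) / sw
    ybar = sum(w * y for w, y in zip(weights, ys)) / sw
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

rng = random.Random(0)
design = [1.0, 2.0, 3.0, 4.0, 5.0]
sigmas = [0.2, 0.5, 1.0, 1.5, 2.0]   # hypothetical heteroscedastic errors
m = 5                                # replicates per design point

means, weights = [], []
for x, sigma in zip(design, sigmas):
    reps = [1.0 + 2.0 * x + rng.gauss(0, sigma) for _ in range(m)]
    means.append(statistics.mean(reps))
    s2 = statistics.variance(reps)   # unbiased estimate s_i^2 of sigma_i^2
    weights.append(m / s2)           # inverse estimated variance of the mean

b0, b1 = wls_line(design, means, weights)
print(f"WLS estimates: intercept={b0:.3f}, slope={b1:.3f}")
```

With only a few replicates the weights themselves are noisy, which is the small-sample issue the abstract investigates.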
12.
Jong-Wuu Wu 《Statistical Papers》2001,42(4):489-503
In this paper, we suggest a least squares procedure for determining the number of upper outliers in an exponential sample by minimizing the sample mean squared error. Moreover, the method can reduce the masking or “swamping” effects. In addition, we have found that the least squares procedure is easier and simpler to compute than the test procedure $T_k$ suggested by Zhang (1998) for determining the number of upper outliers, since Zhang (1998) needs to use the complicated null distribution of $T_k$. We give three practical examples and a simulated example to illustrate the procedures. Further, simulation studies are given to show the advantages of the proposed method. Finally, the proposed least squares procedure can also determine the number of upper outliers in other continuous univariate distributions (for example, Pareto, Gumbel, Weibull, etc.). Received: May 10, 1999; revised version: June 5, 2000
13.
The paper considers the consequences of incorrectly using the ordinary least squares estimator when the true but unknown model is a switching regression. Bias and mean square error expressions are given for slope and residual variance estimators. Except in very specialized cases, the estimators are biased. A numerical example illustrates some of the issues raised and provides a comparison between the ordinary least squares and maximum likelihood estimators.
14.
Andreas Fromkorth Michael Kohler 《Journal of statistical planning and inference》2011,141(1):172-188
Estimation of a regression function from independent and identically distributed data is considered. The L2 error with integration with respect to the design measure is used as the error criterion. Upper bounds on the L2 error of least squares regression estimates are presented, which bound the error of the estimate in the case that the values of the independent and the dependent variables in the sample given to the estimate are perturbed by some arbitrary procedure. The bounds are applied to analyze regression-based Monte Carlo methods for pricing American options in the case of errors in modelling the price process.
15.
J. S. Chawla 《Statistical Papers》1988,29(1):227-230
A necessary and sufficient condition is obtained under which the ridge estimator is better than the least squares estimator relative to the matrix mean square error.
16.
Alex de la Cruz Huayanay Jorge L. Bazán Vicente G. Cancho Dipak K. Dey 《Journal of Statistical Computation and Simulation》2019,89(9):1694-1714
In binary regression, imbalanced data result from the presence of values equal to zero (or one) in a proportion that is significantly greater than the corresponding real values of one (or zero). In this work, we evaluate two methods developed to deal with imbalanced data and compare them to the use of asymmetric links. The results of a simulation study show that the correction methods do not adequately correct bias in the estimation of regression coefficients and that the models with the power and reverse power links considered produce better results for certain types of imbalanced data. Additionally, we present an application to imbalanced data, identifying the best model among the various ones proposed. The parameters are estimated using a Bayesian approach based on the Hamiltonian Monte Carlo method with the No-U-Turn Sampler algorithm, and the models were compared using different criteria for model comparison, predictive evaluation and quantile residuals.
17.
Bootstrapping has been used as a diagnostic tool for validating model results for a wide array of statistical models. Here we evaluate the use of the non-parametric bootstrap for model validation in mixture models. We show that the bootstrap is problematic for validating the results of class enumeration and demonstrating the stability of parameter estimates in both finite mixture and regression mixture models. In only 44% of simulations did bootstrapping detect the correct number of classes in at least 90% of the bootstrap samples for a finite mixture model without any model violations. For regression mixture models and cases with violated model assumptions, the performance was even worse. Consequently, we cannot recommend the non-parametric bootstrap for validating mixture models.
The cause of the problem is that, when resampling is used, influential individual observations have a high likelihood of being sampled many times. The presence of multiple replications of even moderately extreme observations is shown to lead to additional latent classes being extracted. To verify that these replications cause the problems, we show that leave-k-out cross-validation, where sub-samples are taken without replacement, does not suffer from the same problem.
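The replication effect described above is easy to illustrate. The sketch below estimates how often one fixed observation appears at least twice in a bootstrap resample, compares this with the exact probability, and contrasts it with leave-k-out sub-sampling, where duplicates cannot occur. The sample size, k, and function names are assumptions chosen for illustration; no mixture model is fitted.

```python
import random

def duplication_rates(n, k, n_rep, seed=0):
    """Share of resamples in which observation 0 appears at least twice."""
    rng = random.Random(seed)
    indices = list(range(n))
    boot_hits = 0
    sub_hits = 0
    for _ in range(n_rep):
        boot = rng.choices(indices, k=n)      # with replacement (bootstrap)
        if boot.count(0) >= 2:
            boot_hits += 1
        sub = rng.sample(indices, n - k)      # without replacement (leave-k-out)
        if sub.count(0) >= 2:
            sub_hits += 1
    return boot_hits / n_rep, sub_hits / n_rep

def exact_bootstrap_duplication(n):
    """P(a given observation is drawn at least twice in n draws with replacement)."""
    p_zero = (1 - 1 / n) ** n
    p_one = (1 - 1 / n) ** (n - 1)   # n * (1/n) * (1 - 1/n)^(n-1)
    return 1 - p_zero - p_one

boot_rate, sub_rate = duplication_rates(n=100, k=10, n_rep=5000)
print(f"bootstrap: {boot_rate:.3f} (exact {exact_bootstrap_duplication(100):.3f})")
print(f"leave-k-out: {sub_rate:.3f}")
```

For moderate n the exact probability is close to 1 - 2/e, so roughly a quarter of bootstrap samples contain any given observation more than once.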
18.
This paper presents a brief review of the asymptotic properties of the pseudo-maximum likelihood estimator in the regression model where the reciprocal of the mean of the dependent variable is considered to be a linear function of the regressor variables, and the observations on the dependent variable are assumed to have an inverse Gaussian distribution. The large sample theory for the pseudo-maximum likelihood estimator presented in Babu and Chaubey (1996) is highlighted and a simulation study is carried out to compare the approximation yielded by the bootstrap distribution to that of the asymptotic distribution.
19.
A. H.M. Rahmatullah Imon 《Journal of applied statistics》2009,36(3):347-358
The heterogeneity of error variance often causes a huge interpretive problem in linear regression analysis. Before taking any remedial measures we first need to detect this problem. A large number of diagnostic plots are now available in the literature for detecting heteroscedasticity of error variances. Among them the ‘residuals’ and ‘fits’ (R–F) plot is very popular and commonly used. In the R–F plot residuals are plotted against the fitted responses, where both these components are obtained using the ordinary least squares (OLS) method. It is now evident that the OLS fits and residuals suffer a huge setback in the presence of unusual observations and hence the R–F plot may not exhibit the real scenario. The deletion residuals based on a data set free from all unusual cases should estimate the true errors in a better way than the OLS residuals. In this paper we propose ‘deletion residuals’ and the ‘deletion fits’ (DR–DF) plot for the detection of the heterogeneity of error variances in a linear regression model to get a more convincing and reliable graphical display. Examples show that this plot locates unusual observations more clearly than the R–F plot. The advantage of using deletion residuals in the detection of heteroscedasticity of error variance is investigated through Monte Carlo simulations under a variety of situations.
20.
We study nonlinear least-squares problems that can be transformed into linear problems by a change of variables. We derive a general formula for the statistically optimal weights and prove that the resulting linear regression gives an optimal estimate (which satisfies an analogue of the Rao-Cramer lower bound) in the limit of small noise.
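A familiar instance of this idea is the exponential model y = a·exp(b·x), which becomes linear after taking logarithms. The sketch below uses the first-order (delta-method) weights w_i = y_i², since Var(log y) ≈ σ²/y² for small noise. This weight choice is a standard approximation assumed here for illustration and is not claimed to be the paper's general formula; the data and parameter values are hypothetical.

```python
import math
import random

def weighted_line(xs, zs, ws):
    """Weighted least squares fit of z = c0 + c1 * x (closed form)."""
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    zbar = sum(w * z for w, z in zip(ws, zs)) / sw
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    sxz = sum(w * (x - xbar) * (z - zbar) for w, x, z in zip(ws, xs, zs))
    slope = sxz / sxx
    return zbar - slope * xbar, slope

# Hypothetical data from y = a * exp(b * x) with small additive noise.
rng = random.Random(0)
a_true, b_true, sigma = 2.0, 0.5, 0.01
xs = [0.5 * i for i in range(10)]
ys = [a_true * math.exp(b_true * x) + rng.gauss(0, sigma) for x in xs]

# Change of variables z = log(y) makes the model linear: z = log(a) + b * x.
zs = [math.log(y) for y in ys]
# Delta method: Var(log y) ~ sigma^2 / y^2, so weight each point by y^2.
ws = [y * y for y in ys]

log_a, b_hat = weighted_line(xs, zs, ws)
a_hat = math.exp(log_a)
print(f"a_hat={a_hat:.4f}, b_hat={b_hat:.4f}")
```

Without the weights, the log transform would overemphasize the small-y observations, whose relative noise is largest.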