1.

A comparison of some lack of fit tests based on near replicates

James W. Neill Dallas E. Johnson 《Communications in Statistics - Theory and Methods》2013,42(10):3533-3570

Several tests for regression lack of fit proposed by Christensen (1989), Shillington (1979) and Neill and Johnson (1985) are compared. The tests considered are applicable for the case of nonreplication and reduce to the classical lack of fit test when independent replications are available. A simulation study is used to compare the size and power of the test procedures for small sample sizes and various configurations of nonreplication. In addition, each test is shown to be consistent as well as invariant with respect to location and scale changes made on the regressor variables.

2.

General lack of fit tests based on families of groupings

Forrest R. Miller James W. Neill 《Journal of statistical planning and inference》2008,138(8):2433-2449

Lack of fit tests based on groupings of the observations are developed. These tests are first applied to models with replication. In this case, the classic Fisher test assumes that the true model is contained in the one-way ANOVA model. However, Christensen [(2003). Significantly insignificant F tests. Amer. Statist. 57, 27–32] has noted that small values of the F-statistic may indicate lack of fit due to features which are not part of the proposed model. Such model inadequacy is called within-cluster lack of fit, whereas the standard Fisher lack of fit is called between-cluster lack of fit. Typically, lack of fit exists as a combination of these two pure types, and can be extremely difficult to detect depending on the nature of the mixture. In this paper, the one-way ANOVA model is embedded in larger models using groupings of the observations, which provides tests with good power for detecting all of the above types of model inadequacies, including mixtures. In particular, several such tests are considered, each based on a different grouping of the observations, and the multiple testing approach of Baraud et al. [(2003). Adaptive tests of linear hypotheses by model selection. Ann. Statist. 31, 225–251] is followed. More generally, the preceding testing procedure based on families of groupings is extended to the case of nonreplication. For this case, it is proposed that such families be determined by linear orders on the predictors based on disjoint parallel tubes in predictor space. Test statistics follow the cluster-based regression lack of fit tests presented by Christensen [(1989). Lack of fit based on near or exact replicates. Ann. Statist. 17, 673–683; (1991). Small sample characterizations of near replicate lack of fit tests. J. Amer. Statist. Assoc. 86, 752–756], by considering the groupings as determining special types of clusterings. 
In order to detect general lack of fit, several such tests are again considered, each based on a different grouping of the observations, and the multiple testing approach given by Baraud et al. [(2003). Adaptive tests of linear hypotheses by model selection. Ann. Statist. 31, 225–251] is followed. Simulation results illustrating the power of the proposed testing procedure are given.

3.

Approaches to regression analysis with multiple measurements from individual sampling units

《Journal of Statistical Computation and Simulation》2012,82(3-4):149-175

Application of ordinary least-squares regression to data sets which contain multiple measurements from individual sampling units produces an unbiased estimator of the parameters but a biased estimator of the covariance matrix of the parameter estimates. The present work considers a random coefficient, linear model to deal with such data sets: this model permits many senses in which multiple measurements are taken from a sampling unit, not just when it is measured at several times. Three procedures to estimate the covariance matrix of the error term of the model are considered. Given these, three procedures to estimate the parameters of the model and their covariance matrix are considered; these are ordinary least-squares, generalized least-squares, and an adjusted ordinary least-squares procedure which produces an unbiased estimator of the covariance matrix of the parameters with small samples. These various procedures are compared in simulation studies using three examples from the biological literature. The possibility of testing hypotheses about the vector of parameters is also considered. It is found that all three procedures for regression estimation produce estimators of the parameters with bias of no practical consequence. Both generalized least-squares and adjusted ordinary least-squares generally produce estimators of the covariance matrix of the parameter estimates with bias of no practical consequence, while ordinary least-squares produces a negatively biased estimator. Neither ordinary nor generalized least-squares provides satisfactory hypothesis tests of the vector of parameter estimates. It is concluded that adjusted ordinary least-squares, when applied with either of two of the procedures used to estimate the error covariance matrix, shows promise for practical application with data sets of the nature considered here.

4.

Goodness-of-fit test for Gaussian regression with block correlated errors

S. Huet 《Statistics》2015,49(2):239-266

We propose a procedure to test that the expectation of a Gaussian vector is linear against a nonparametric alternative. We consider the case where the covariance matrix of the observations has a block diagonal structure. This framework encompasses regression models with autocorrelated errors, heteroscedastic regression models, mixed-effects models and growth curves. Our procedure does not depend on any prior information about the alternative. We prove that the test is asymptotically of the nominal level and consistent. We characterize the set of vectors on which the test is powerful and prove the classical √(log log(n)/n) convergence rate over directional alternatives. We propose a bootstrap version of the test as an alternative to the initial one and provide a simulation study in order to evaluate both procedures for small sample sizes when the purpose is to test goodness of fit in a Gaussian mixed-effects model. Finally, we illustrate the procedures using a real data set.

5.

Admissible variable-selection procedures when fitting misspecified regression models by least squares

Paul Kabaila 《Communications in Statistics - Theory and Methods》2013,42(10):2299-2306

For loss equal to squared error of prediction, Kempthorne (1984) has proved that all variable-selection procedures are admissible for choosing among least-squares fits of a normal linear regression model. We extend this result to the case of a normal linear regression model in which the form of the expected response vector is misspecified.

6.

BOOTSTRAP TESTS FOR THE ERROR DISTRIBUTION IN LINEAR AND NONPARAMETRIC REGRESSION MODELS

Natalie Neumeyer Holger Dette Eva-Renate Nagel 《Australian & New Zealand Journal of Statistics》2006,48(2):129-156

In this paper we investigate several tests for the hypothesis of a parametric form of the error distribution in the common linear and non‐parametric regression model, which are based on empirical processes of residuals. It is well known that tests in this context are not asymptotically distribution‐free and the parametric bootstrap is applied to deal with this problem. The performance of the resulting bootstrap test is investigated from an asymptotic point of view and by means of a simulation study. The results demonstrate that even for moderate sample sizes the parametric bootstrap provides a reliable and easily accessible solution to the problem of goodness‐of‐fit testing of assumptions regarding the error distribution in linear and non‐parametric regression models.

7.

Specification tests in ordered logit and probit models

Andrew A. Weiss 《Econometric Reviews》1997,16(4):361-391

In this paper, I study the application of various specification tests to ordered logit and probit models with heteroskedastic errors, with the primary focus on the ordered probit model. The tests are Lagrange multiplier tests, information matrix tests, and chi-squared goodness of fit tests. The alternatives are omitted variables in the regression equation, omitted variables in the equation describing the heteroskedasticity, and non-logistic/non-normal errors. The alternative error distributions include a generalized logistic distribution in the ordered logit model and the Pearson family in the ordered probit model.

8.

Influence diagnostics in constrained general linear models

Hadi Emami Mostafa Emami 《Communications in Statistics - Theory and Methods》2013,42(18):5331-5340

Constrained general linear models (CGLMs) have wide applications in practice. As in other data analyses, the identification of influential observations that may be potential outliers is an important step in fitting CGLMs. We develop multiple case-deletion diagnostics for detecting influential observations in CGLMs. The diagnostics are functions of basic building blocks: studentized residuals, the error contrast matrix, and the inverse of the response variable covariance matrix. The basic building blocks are computed only once from the complete data analysis and provide information on the influence of the data on different aspects of the model fit. Computational formulas are given which make the procedures feasible. An illustrative example with a real data set is also reported.

9.

Testing for lack of fit in regression - a review

James W. Neill Dallas E. Johnson 《Communications in Statistics - Theory and Methods》2013,42(4):485-511

Assessment of the adequacy of a proposed linear regression model is necessarily subjective. However, the following three criteria may warrant investigation: whether the distributional assumptions for the stochastic portion of the model are satisfied, whether the predictive capability of the model is satisfactory, and whether the deterministic portion of the model is adequate in a statistical sense. The first two criteria have been reviewed in the literature to some extent. This paper reviews statistical tests and procedures which aid the experimenter in determining lack of fit or functional misspecification associated with the deterministic portion of a proposed linear regression model.

10.

Lack of fit tests based on sums of ordered residuals for linear models

Mohammad W. Hattab Ronald Christensen 《Australian & New Zealand Journal of Statistics》2018,60(2):230-257

Christensen & Lin (2015) suggested two lack of fit tests to assess the adequacy of a linear model based on partial sums of residuals. In particular, their tests evaluated the adequacy of the mean function. Their tests relied on asymptotic results without requiring small sample normality. We propose four new tests, find their asymptotic distributions, and propose an alternative simulation method for defining tests that is remarkably robust to the distribution of the errors. To assess their strengths and weaknesses, the Christensen & Lin (2015) tests and the new tests were compared in different scenarios by simulation. In particular, the new tests include two based on partial sums of absolute residuals. Previous partial sums of residuals tests have used signed residuals whose values when summed can cancel each other out. The use of absolute residuals requires small sample normality, but allows detection of lack of fit that was previously not possible with partial sums of residuals.
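
The cancellation of signed residuals mentioned in this abstract can be illustrated with a small sketch. It is not the test statistic of either paper: the function name `residual_partial_sums`, the straight-line model and the toy data are assumptions made only for illustration. The code fits a line by least squares, orders the residuals by the predictor, and returns the running sums of the signed and of the absolute residuals.

```python
from itertools import accumulate


def residual_partial_sums(x, y):
    """Running sums of signed and absolute OLS residuals, ordered by x.

    Hypothetical helper for illustration only; it assumes a straight-line
    model y = a + b*x and is not the statistic proposed in the abstract.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Order the observations by the predictor before summing residuals.
    ordered = sorted(zip(x, y), key=lambda pair: pair[0])
    residuals = [yi - (intercept + slope * xi) for xi, yi in ordered]

    signed = list(accumulate(residuals))
    absolute = list(accumulate(abs(r) for r in residuals))
    return signed, absolute


# Toy data (assumed): the response curves upward, so a line fits poorly.
signed, absolute = residual_partial_sums([1, 1, 2, 2, 3, 3], [1, 3, 2, 4, 9, 11])
print(signed)    # signed sums return to zero because OLS residuals sum to zero
print(absolute)  # absolute sums can only grow
```

With an intercept in the model the signed residuals always sum to zero, so the signed running sum ends where it started, while the absolute running sum keeps accumulating the misfit.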

11.

Prediction Error Estimation Under Bregman Divergence for Non-Parametric Regression and Classification

CHUNMING ZHANG 《Scandinavian Journal of Statistics》2008,35(3):496-523

Prediction error is critical to assess model fit and evaluate model prediction. We propose the cross-validation (CV) and approximated CV methods for estimating prediction error under the Bregman divergence (BD), which embeds nearly all of the commonly used loss functions in the regression, classification procedures and machine learning literature. The approximated CV formulas are analytically derived, which facilitate fast estimation of prediction error under BD. We then study a data-driven optimal bandwidth selector for local-likelihood estimation that minimizes the overall prediction error or equivalently the covariance penalty. It is shown that the covariance penalty and CV methods converge to the same mean-prediction-error-criterion. We also propose a lower-bound scheme for computing the local logistic regression estimates and demonstrate that the algorithm monotonically enhances the target local likelihood and converges. The idea and methods are extended to the generalized varying-coefficient models and additive models.

12.

Finite Sample Performance of tests for Symmetry of the Errors in a Linear Model

《Journal of Statistical Computation and Simulation》2012,82(11):863-879

The finite sample performance of a number of tests for symmetry of the distribution of the errors of a linear model is considered. The first family of tests is based on the discrepancy between two regression fits. The first fit is appropriate under symmetric errors while the second is appropriate for skewed as well as symmetric error distributions. The second family of procedures consists of tests for the univariate symmetry problem. Thus, in the linear model setting these tests are based on residuals. An extensive empirical study of the finite sample, null behavior of the tests is presented. The results of a power comparison among the tests are also discussed.

13.

Evaluation of three lack of fit tests in linear regression models

Daniel Wang Michael Conerly 《Journal of applied statistics》2003,30(6):683-696

A key diagnostic in the analysis of linear regression models is whether the fitted model is appropriate for the observed data. The classical lack of fit test is used for testing the adequacy of a linear regression model when replicates are available. While many efforts have been made in finding alternative lack of fit tests for models without replicates, this paper focuses on studying the efficacy of three tests: the classical lack of fit test, Utts' (1982) test, and Burn & Ryan's (1983) test. The powers of these tests are computed for a variety of situations. Comments and conclusions on the overall performance of these tests are made, including recommendations for future studies.
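
For readers unfamiliar with the classical lack of fit test referred to here, the following is a minimal sketch for a straight-line model with exact replicates. The function name `lack_of_fit_f` and the toy data are assumptions for illustration. The residual sum of squares is split into pure error (variation within groups of identical x) and lack of fit (the remainder), and the F statistic is the ratio of their mean squares.

```python
from collections import defaultdict


def lack_of_fit_f(x, y):
    """Classical lack-of-fit F statistic for the model y = a + b*x.

    Hypothetical helper: assumes exact replicates (identical x values), at
    least three distinct x levels, and non-zero pure error.
    Returns (F, df_lack_of_fit, df_pure_error).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual sum of squares of the fitted line.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))

    # Pure error: variation of y around its group mean at each distinct x.
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    ss_pure = 0.0
    for ys in groups.values():
        group_mean = sum(ys) / len(ys)
        ss_pure += sum((yi - group_mean) ** 2 for yi in ys)

    levels = len(groups)
    df_lof = levels - 2      # distinct x levels minus the two line parameters
    df_pure = n - levels     # observations minus distinct x levels
    ss_lof = sse - ss_pure
    f_stat = (ss_lof / df_lof) / (ss_pure / df_pure)
    return f_stat, df_lof, df_pure


# Toy data (assumed): two replicates at each of three x levels.
print(lack_of_fit_f([1, 1, 2, 2, 3, 3], [1, 3, 2, 4, 9, 11]))
```

The statistic is referred to an F distribution with `df_lof` and `df_pure` degrees of freedom; the p-value is omitted here to keep the sketch free of third-party dependencies.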

14.

Effects of the estimation of covariance matrix parameters in the generalized multivariate linear model

Gregory C. Reinsel 《Communications in Statistics - Theory and Methods》2013,42(5):639-650

We consider the generalized multivariate linear model and assume the covariance matrix of the p × 1 vector of responses on a given individual can be represented in the general linear structure form described by Anderson (1973). The effects of the use of estimates of the parameters of the covariance matrix on the generalized least squares estimator of the regression coefficients and on the prediction of a portion of a future vector, when only the first portion of the vector has been observed, are investigated. Approximations are derived for the covariance matrix of the generalized least squares estimator and for the mean square error matrix of the usual predictor, for the practical case where estimated parameters are used.

15.

THE CONSTRUCTION OF EQUILEVERAGE DESIGNS FOR MULTIPLE LINEAR REGRESSION

Michael B. Dollinger Robert G. Staudte 《Australian & New Zealand Journal of Statistics》1990,32(1):99-118

The hat matrix is widely used as a diagnostic tool in linear regression because it contains the leverages which the independent variables exert on the fitted values. In some experiments, cases with high leverage may be avoided by judicious choice of design for the independent variables. A variety of methods for constructing equileverage designs for linear regression are discussed. Such designs remove one of the factors, namely large leverage points, which can lead to nonrobust estimators and tests. In addition, a method is given for combining equileverage designs to test for lack of fit of the linear model.
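
As a small illustration of leverage, the sketch below computes the diagonal of the hat matrix for a straight-line model with an intercept, where it has the closed form h_i = 1/n + (x_i - x̄)² / Σ(x_j - x̄)². The function name `leverages` and both example designs are assumptions, and the paper's constructions for multiple regression are not reproduced here.

```python
def leverages(x):
    """Hat-matrix diagonal for the model y = a + b*x (intercept plus one predictor).

    Hypothetical helper using the closed form for simple linear regression.
    """
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]


# A design with one remote point: the last case has much higher leverage.
print(leverages([1, 2, 3, 4, 10]))

# A two-level balanced design: every case has the same leverage, p/n = 2/4.
print(leverages([-1, -1, 1, 1]))
```

The leverages always sum to the number of model parameters (two here), so a design in which they are all equal gives each case leverage p/n.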

16.

Asymptotic distribution of least square estimators for linear models with dependent errors

Emmanuel Caron 《Statistics》2019,53(4):885-902

In this paper, we consider the usual linear regression model in the case where the error process is assumed strictly stationary. We use a result from Hannan (Central limit theorems for time series regression. Probab Theory Relat Fields. 1973;26(2):157–170), who proved a Central Limit Theorem for the usual least squares estimator under general conditions on the design and on the error process. For any design satisfying Hannan's conditions, we define an estimator of the covariance matrix and we prove its consistency under very mild conditions. As an application, we show how to modify the usual tests on the linear model in this dependent context, in such a way that the type-I error rate remains asymptotically correct, and we illustrate the performance of this procedure through different sets of simulations.

17.

EDF tests of goodness of fit for transform-both-sides models

Gemai Chen 《Revue canadienne de statistique》1996,24(3):363-372

Very often in regression analysis, a particular functional form connecting known covariates and unknown parameters is either suggested by previous work or demanded by theoretical considerations so that the deterministic part of the responses has a known form. However, the underlying error structure is often less well understood. In this case, the transform-both-sides (TBS) models are appropriate. In this paper we generalize the usual TBS models and develop tests to assess goodness of fit when fitting TBS or GTBS models. Parameter estimation is discussed, and tests based on the Cramér-von Mises statistic and the Anderson-Darling statistic are presented with a table suitable for finite-sample applications.

18.

Asymptotic tests and Monte Carlo studies associated with the multiplicative interaction model

Mervyn G. Marasinghe 《Communications in Statistics - Theory and Methods》2013,42(9):2219-2231

Several authors have proposed approximations to percentage points required for testing certain hypotheses associated with the multiplicative interaction model. Alternative approximations based on the asymptotic joint distribution of the characteristic roots of a noncentral Wishart matrix are proposed in this paper. The type I error rates of the resulting tests and the existing procedures are then compared using Monte Carlo methods.

19.

Comparisons of tests of distributional assumption in Poisson regression model

Deniz Ozonur Hatice Tul Kubra Akdur Hulya Bayrak 《Communications in Statistics - Simulation and Computation》2017,46(8):6197-6207

Count data consist of discrete non-negative integer values. The Poisson regression model is one of the most popular models used for count data. This model assumes that the response variable has a Poisson distribution. The purpose of this article is to assess the distributional assumption of this model by using some goodness of fit tests. These tests are compared with respect to their type I error rates and power for different samples, parameters and sample sizes. The simulation study suggests that the most powerful tests are generally the Dean–Lawless and Cameron–Trivedi score tests.

20.

Testing lack of fit in regression without replication

E. Richard Shillington 《Revue canadienne de statistique》1979,7(2):137-146

An F-statistic which tests a hypothesized linear regression model against the general alternative is developed. Observations are grouped using “near neighbours” and a generalization of the usual lack of fit test is derived. Two data sets from Daniel and Wood (1971) are used to illustrate the methodology. Power considerations are discussed.