Similar articles
1.
Summary. We define residuals for point process models fitted to spatial point pattern data, and we propose diagnostic plots based on them. The residuals apply to any point process model that has a conditional intensity; the model may exhibit spatial heterogeneity, interpoint interaction and dependence on spatial covariates. Some existing ad hoc methods for model checking (quadrat counts, scan statistic, kernel smoothed intensity and Berman's diagnostic) are recovered as special cases. Diagnostic tools are developed systematically, by using an analogy between our spatial residuals and the usual residuals for (non-spatial) generalized linear models. The conditional intensity λ plays the role of the mean response. This makes it possible to adapt existing knowledge about model validation for generalized linear models to the spatial point process context, giving recommendations for diagnostic plots. A plot of smoothed residuals against spatial location, or against a spatial covariate, is effective in diagnosing spatial trend or covariate effects. Q-Q plots of the residuals are effective in diagnosing interpoint interaction.

2.
This study compares empirical type I error and power of different permutation techniques that can be used for partial correlation analysis involving three data vectors and for partial Mantel tests. The partial Mantel test is a form of first-order partial correlation analysis involving three distance matrices which is widely used in such fields as population genetics, ecology, anthropology, psychometry and sociology. The methods compared are the following: (1) permute the objects in one of the vectors (or matrices); (2) permute the residuals of a null model; (3) correlate residualized vector 1 (or matrix A) to residualized vector 2 (or matrix B); permute one of the residualized vectors (or matrices); (4) permute the residuals of a full model. In the partial correlation study, the results were compared to those of the parametric t-test which provides a reference under normality. Simulations were carried out to measure the type I error and power of these permutation methods, using normal and non-normal data, without and with an outlier. There were 10 000 simulations for each situation (100 000 when n = 5); 999 permutations were produced per test where permutations were used. The recommended testing procedures are the following: (a) In partial correlation analysis, most methods can be used most of the time. The parametric t-test should not be used with highly skewed data. Permutation of the raw data should be avoided only when highly skewed data are combined with outliers in the covariable. Methods implying permutation of residuals, which are known to only have asymptotically exact significance levels, should not be used when highly skewed data are combined with small sample size. (b) In partial Mantel tests, method 2 can always be used, except when highly skewed data are combined with small sample size. (c) With small sample sizes, one should carefully examine the data before partial correlation or partial Mantel analysis. For highly skewed data, permutation of the raw data has correct type I error in the absence of outliers. When highly skewed data are combined with outliers in the covariable vector or matrix, it is still recommended to use the permutation of raw data. (d) Method 3 should never be used.
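As a rough illustration of method (1) above (permutation of the raw data in one vector), the following sketch tests a first-order partial correlation between two vectors while controlling for a third. It is not the authors' code: the function names, the choice to permute `x`, the two-sided comparison and the default of 999 permutations (taken from the abstract) are assumptions made for this example, and only the Python standard library is used.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    s_aa = sum((u - mean_a) ** 2 for u in a)
    s_bb = sum((v - mean_b) ** 2 for v in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def perm_test_partial_corr(x, y, z, n_perm=999, seed=0):
    """Two-sided permutation test that permutes the raw values of x.

    Returns (observed statistic, permutation p-value). The observed
    value is counted as one member of the reference distribution.
    """
    rng = random.Random(seed)
    observed = partial_corr(x, y, z)
    hits = 1  # the observed statistic itself
    x_perm = list(x)
    for _ in range(n_perm):
        rng.shuffle(x_perm)
        # Small tolerance guards against floating-point ties.
        if abs(partial_corr(x_perm, y, z)) >= abs(observed) - 1e-12:
            hits += 1
    return observed, hits / (n_perm + 1)


if __name__ == "__main__":
    z = [float(i) for i in range(1, 13)]
    x = [3.0, 1.0, 4.0, 1.5, 5.0, 9.0, 2.0, 6.0, 5.5, 3.5, 8.0, 7.0]
    y = [2 * xi + zi for xi, zi in zip(x, z)]  # y depends on x given z
    print(perm_test_partial_corr(x, y, z))
```

The sketch covers only raw-data permutation; the residual-based methods (2) to (4) would need a regression step before permuting and are not shown.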

3.
Christensen & Lin (2015) suggested two lack of fit tests to assess the adequacy of a linear model based on partial sums of residuals. In particular, their tests evaluated the adequacy of the mean function. Their tests relied on asymptotic results without requiring small sample normality. We propose four new tests, find their asymptotic distributions, and propose an alternative simulation method for defining tests that is remarkably robust to the distribution of the errors. To assess their strengths and weaknesses, the Christensen & Lin (2015) tests and the new tests were compared in different scenarios by simulation. In particular, the new tests include two based on partial sums of absolute residuals. Previous partial sums of residuals tests have used signed residuals whose values when summed can cancel each other out. The use of absolute residuals requires small sample normality, but allows detection of lack of fit that was previously not possible with partial sums of residuals.

4.
Considered are tests for normality of the errors in ridge regression. If an intercept is included in the model, it is shown that test statistics based on the empirical distribution function of the ridge residuals have the same limiting distribution as in the one-sample test for normality with estimated mean and variance. The result holds with weak assumptions on the behavior of the independent variables; asymptotic normality of the ridge estimator is not required.

5.
In many areas of application mixed linear models serve as a popular tool for analyzing highly complex data sets. For inference about fixed effects and variance components, likelihood-based methods such as (restricted) maximum likelihood estimators, (RE)ML, are commonly pursued. However, it is well-known that these fully efficient estimators are extremely sensitive to small deviations from hypothesized normality of random components as well as to other violations of distributional assumptions. In this article, we propose a new class of robust-efficient estimators for inference in mixed linear models. The new three-step estimation procedure provides truncated generalized least squares and variance components' estimators with hard-rejection weights adaptively computed from the data. More specifically, our data re-weighting mechanism first detects and removes within-subject outliers, then identifies and discards between-subject outliers, and finally it employs maximum likelihood procedures on the “clean” data. Theoretical efficiency and robustness properties of this approach are established.

6.
A new family of statistics is proposed to test for the presence of serial correlation in linear regression models. The tests are based on partial sums of lagged cross-products of regression residuals that define a class of interesting Gaussian processes. These processes are characterized in terms of regressor functions, the serial-correlation structure, the distribution of the noise process, and the order of the lag of the cross-products of residuals. It is shown that these four factors affect the lagged residual processes independently. Large-sample distributional results are presented for test statistics under the null hypothesis of no serial correlation or for alternatives from a range of interesting hypotheses. Some indication of the circumstances to which the asymptotic results apply in finite-sample situations, and of those to which they should be applied with some caution, is obtained through a simulation study. Tables of selected quantiles of the proposed tests are also given. The tests are illustrated with two examples taken from the empirical literature. It is also proposed that plots of lagged residual processes be used as diagnostic tools to gain insight into the correlation structure of residuals derived from regression fits.

7.
We provide the theoretical justification of bootstrapping stationary invertible echelon vector autoregressive moving-average (VARMA) models using linear methods. The asymptotic validity of the bootstrap is established with strong white noise under parametric and nonparametric assumptions. Our methods are practical and useful for building reliable simulation-based inference and forecasting without implementing nonlinear estimation techniques such as ML, which is usually burdensome, time-consuming or impractical, particularly in big or highly persistent systems. The relevance of our procedures is more pronounced in the context of dynamic simulation-based techniques such as maximized Monte Carlo (MMC) tests [see Dufour J-M. Monte Carlo tests with nuisance parameters: a general approach to finite-sample inference and nonstandard asymptotics in econometrics. J Econom. 2006;133(2):443–477 and Dufour J-M, Jouini T. Finite-sample simulation-based tests in VAR models with applications to Granger causality testing. J Econom. 2006;135(1–2):229–254 for the VAR case]. Simulation evidence shows that, compared with conventional asymptotics, our bootstrap methods have good finite-sample properties in approximating the actual distribution of the studentized echelon VARMA parameter estimates, and in providing echelon parameter confidence sets with satisfactory coverage.

8.
Abstract. In this paper, we carry out an in-depth investigation of diagnostic measures for assessing the influence of observations and model misspecification in the presence of missing covariate data for generalized linear models. Our diagnostic measures include case-deletion measures and conditional residuals. We use the conditional residuals to construct goodness-of-fit statistics for testing possible misspecifications in model assumptions, including the sampling distribution. We develop specific strategies for incorporating missing data into goodness-of-fit statistics in order to increase the power of detecting model misspecification. A resampling method is proposed to approximate the p-value of the goodness-of-fit statistics. Simulation studies are conducted to evaluate our methods and a real data set is analysed to illustrate the use of our various diagnostic measures.

9.
The maximum likelihood approach is the most frequently employed approach for inference in linear mixed models. However, it relies on the normal distributional assumption for the random effects and the within-subject errors, and it lacks robustness against outliers. This article proposes a semiparametric estimation approach for linear mixed models. This approach is based on the first two marginal moments of the response variable, and does not require any parametric distributional assumptions on the random effects or error terms. The consistency and asymptotic normality of the estimator are derived under fairly general conditions. In addition, we show that the proposed estimator has a bounded influence function and a redescending property, so it is robust to outliers. The methodology is illustrated through an application to the famed Framingham cholesterol data. The finite sample behavior and the robustness properties of the proposed estimator are evaluated through extensive simulation studies.

10.
Model checking with discrete data regressions can be difficult because the usual methods such as residual plots have complicated reference distributions that depend on the parameters in the model. Posterior predictive checks have been proposed as a Bayesian way to average the results of goodness-of-fit tests in the presence of uncertainty in estimation of the parameters. We try this approach using a variety of discrepancy variables for generalized linear models fitted to a historical data set on behavioural learning. We then discuss the general applicability of our findings in the context of a recent applied example on which we have worked. We find that the following discrepancy variables work well, in the sense of being easy to interpret and sensitive to important model failures: structured displays of the entire data set, general discrepancy variables based on plots of binned or smoothed residuals versus predictors and specific discrepancy variables created on the basis of the particular concerns arising in an application. Plots of binned residuals are especially easy to use because their predictive distributions under the model are sufficiently simple that model checks can often be made implicitly. The following discrepancy variables did not work well: scatterplots of latent residuals defined from an underlying continuous model and quantile–quantile plots of these residuals.
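The binned residuals mentioned in this abstract can be sketched in a few lines. The version below is a generic illustration rather than the authors' procedure: it sorts observations by a predictor, splits them into roughly equal-sized bins and averages the predictor and the residual within each bin. The function name and the equal-count binning rule are assumptions made for this example.

```python
def binned_residuals(predictor, residuals, n_bins):
    """Average residuals within equal-count bins of a predictor.

    Returns a list of (mean predictor, mean residual) pairs, one per
    non-empty bin, ordered by increasing predictor value.
    """
    pairs = sorted(zip(predictor, residuals))
    n = len(pairs)
    bins = []
    for b in range(n_bins):
        # Integer bin edges give bins whose sizes differ by at most one.
        lo = b * n // n_bins
        hi = (b + 1) * n // n_bins
        chunk = pairs[lo:hi]
        if not chunk:
            continue
        mean_x = sum(p for p, _ in chunk) / len(chunk)
        mean_r = sum(r for _, r in chunk) / len(chunk)
        bins.append((mean_x, mean_r))
    return bins


if __name__ == "__main__":
    x = [1, 2, 3, 4]
    res = [0.1, -0.1, 0.3, 0.5]
    for mean_x, mean_r in binned_residuals(x, res, n_bins=2):
        print(mean_x, mean_r)
```

Plotting the mean residual against the mean predictor for each bin gives the kind of display the abstract describes; bin means that drift away from zero would suggest a misfit in that range of the predictor.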

11.
In this paper the most commonly used diagnostic criteria for the identification of outliers or leverage points in the ordinary regression model are reviewed. Their use in the context of the errors-in-variables (e.v.) linear model is discussed and evidence is given that under the e.v. model assumptions the distinction between outliers and leverage points no longer exists.

12.
Commonly applied diagnostic procedures in random-coefficient (multilevel) analysis are based on an inspection of the residuals, motivated by established procedures for ordinary regression. The deficiencies of such procedures are discussed and an alternative based on simulation from the fitted model (parametric bootstrap) is proposed. Although computationally intensive, the method proposed requires little programming effort additional to implementing the model fitting procedure. It can be tailored for specific kinds of outliers. Some computationally less demanding alternatives are described.

13.
A diagnostic technique is proposed to detect major gene effects and other systematic departures from a model for the trait means in the presence of outliers. The technique is based on the examination of residuals from fitting variance components models to quantitative pedigree data using robust statistical procedures. The approach is demonstrated using the total ridge count and ridge count of the middle finger from 54 extended families affected with the Fragile X syndrome, and a sample of 217 normal pedigrees.

14.
Rank tests are known to be robust to outliers and violation of distributional assumptions. Two major issues besetting microarray data are violation of the normality assumption and contamination by outliers. In this article, we formulate the normal theory simultaneous tests and their aligned rank transformation (ART) analog for detecting differentially expressed genes. These tests are based on the least-squares estimates of the effects when data follow a linear model. Application of the two methods is then demonstrated on a real data set. To evaluate the performance of the aligned rank transform method against the corresponding normal theory method, data were simulated according to the characteristics of a real gene expression data set. These simulated data are then used to compare the two methods with respect to their sensitivity to the distributional assumption and to outliers for controlling the family-wise Type I error rate, power, and false discovery rate. It is demonstrated that the ART generally possesses the robustness of validity property even for microarray data with a small number of replications. Although these methods can be applied to more general designs, in this article the simulation study is carried out for a dye-swap design since this design is broadly used in cDNA microarray experiments.

15.
Goodness-of-fit Tests for Mixed Models (total citations: 2; self-citations: 1; citations by others: 1)
Abstract. Mixed linear models have become a very useful tool for modelling experiments with dependent observations within subjects, but to establish their appropriateness several assumptions have to be checked. In this paper, we focus on the normality assumptions, using goodness-of-fit tests that make allowance for possible design imbalance. These tests rely on asymptotic results, which are established via empirical process theory. The power of the tests is explored empirically, and examples illustrate some aspects of the usage of the tests.

16.
In fitting a regression model, one or more observations may have substantial effects on the estimators. These unusual observations are precisely detected by a new diagnostic measure, Pena's statistic. In this article, we introduce a type of Pena's statistic for each point in Liu regression. Using the forecast change property, we simplify Pena's statistic in a numerical sense. It is found that the simplified Pena's statistic behaves quite well as far as detection of influential observations is concerned. We express Pena's statistic in terms of the Liu leverages and residuals. The normality of this statistic is also discussed and it is demonstrated that it can identify a subset of high Liu leverage outliers. For numerical evaluation, simulation studies are given and a real data set has been analysed for illustration.

17.
The investigation of outlier identification in linear regression models can be extended to the circular regression case. In this paper, we propose a new numerical statistic called mean circular error to identify possible outliers in circular regression models by using a row deletion approach. Through intensive simulation studies, the cut-off points of the statistic are obtained and its power is investigated. It is found that the performance improves as the concentration parameter of circular residuals becomes larger or the sample size becomes smaller. As an illustration, the statistic is applied to a wind direction data set.

18.
This paper studies outlier detection and accommodation in general spatial models, including spatial autoregressive models and spatial error models as special cases. Using mean-shift and variance-weight models respectively, test statistics for multiple outliers are derived and the detecting procedures are proposed. In addition, several key diagnostic measures such as standardized residuals and leverage measures are defined in general spatial models. Outlier modified models are proposed to accommodate outliers in the data set. The performance of the test statistics, including size and power, is examined via simulation studies. Three real examples are analyzed and the results show that the proposed methodology is useful for identifying and accommodating outliers in general spatial models.

19.
The stalactite plot for the detection of multivariate outliers (total citations: 1; self-citations: 0; citations by others: 1)
Detection of multiple outliers in multivariate data using Mahalanobis distances requires robust estimates of the means and covariance of the data. We obtain this by sequential construction of an outlier free subset of the data, starting from a small random subset. The stalactite plot provides a cogent summary of suspected outliers as the subset size increases. The dependence on subset size can be virtually removed by a simulation-based normalization. Combined with probability plots and resampling procedures, the stalactite plot, particularly in its normalized form, leads to identification of multivariate outliers, even in the presence of appreciable masking.
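The basic building block of this approach, squared Mahalanobis distances computed from the mean and covariance of a chosen subset, can be sketched for bivariate data as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the subset is supplied by the caller instead of being grown sequentially, and the closed-form 2×2 inverse restricts the sketch to two variables. The stalactite plot itself and the simulation-based normalization are not reproduced.

```python
def mahalanobis_sq(points, subset_idx):
    """Squared Mahalanobis distance of every 2-D point, using the mean
    and sample covariance estimated from points[i] for i in subset_idx.
    """
    sub = [points[i] for i in subset_idx]
    m = len(sub)
    mean_x = sum(p[0] for p in sub) / m
    mean_y = sum(p[1] for p in sub) / m
    # Sample covariance entries (divisor m - 1).
    s_xx = sum((p[0] - mean_x) ** 2 for p in sub) / (m - 1)
    s_yy = sum((p[1] - mean_y) ** 2 for p in sub) / (m - 1)
    s_xy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in sub) / (m - 1)
    det = s_xx * s_yy - s_xy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        # Quadratic form d' S^{-1} d with the explicit 2x2 inverse.
        distances.append((s_yy * dx * dx - 2 * s_xy * dx * dy + s_xx * dy * dy) / det)
    return distances


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5), (10.0, 10.0)]
    clean = [0, 1, 2, 3, 4]  # assumed outlier-free subset
    print(mahalanobis_sq(data, clean))
```

Because the last point is excluded from the estimates, its distance is not deflated by its own influence on the mean and covariance, which is the reason the abstract gives for building the estimates from an outlier-free subset.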

20.
We propose a strongly root-n consistent simulation-based estimator for generalized linear mixed models. This estimator is constructed based on the first two marginal moments of the response variables, and it allows the random effects to have any parametric distribution (not necessarily normal). Consistency and asymptotic normality for the proposed estimator are derived under fairly general regularity conditions. We also demonstrate that this estimator has a bounded influence function and that it is robust against data outliers. A bias correction technique is proposed to reduce the finite sample bias in the estimation of variance components. The methodology is illustrated through an application to the famed seizure count data and some simulation studies.
