Similar Documents
20 similar documents found
1.
When one wants to check a tentatively proposed model for departures that are not well specified, looking at residuals is the most common diagnostic technique. Here, we investigate the use of Bayesian standardized residuals to detect unknown hierarchical structure. Asymptotic theory, also supported by simulations, shows that the use of Bayesian standardized residuals is effective when the within-group correlation, ρ, is large. However, we show that standardized residuals may not detect hierarchical structure when ρ is small. Thus, if it is important to detect modest hierarchical structure (i.e., ρ small), one should use other diagnostic techniques in addition to the standardized residuals. We use “quality of care” data from the Patterns of Care Study, a two-stage cluster sample of patients undergoing radiation therapy for cervix cancer, to illustrate the potential use of these residuals to detect missing hierarchical structure.
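A minimal numpy sketch of the underlying idea, not the paper's Bayesian machinery: standardize residuals from the grand mean and estimate the within-group correlation ρ with a one-way ANOVA intraclass-correlation formula. The function names and the simulated cluster data are illustrative assumptions.

```python
import numpy as np

def standardized_residuals(y):
    """Residuals from the grand mean, scaled to unit variance."""
    r = y - y.mean()
    return r / r.std(ddof=1)

def icc_anova(y, groups):
    """One-way ANOVA estimate of the within-group correlation rho;
    a rough frequentist stand-in for the paper's Bayesian diagnostic."""
    labels = np.unique(groups)
    k = len(labels)
    n_per = len(y) / k                        # assumes balanced groups
    gmeans = np.array([y[groups == g].mean() for g in labels])
    msb = n_per * np.sum((gmeans - y.mean()) ** 2) / (k - 1)
    msw = sum(np.sum((y[groups == g] - y[groups == g].mean()) ** 2)
              for g in labels) / (len(y) - k)
    return (msb - msw) / (msb + (n_per - 1) * msw)

rng = np.random.default_rng(0)
groups = np.repeat(np.arange(20), 10)         # 20 clusters of size 10
y = 5.0 + rng.normal(0, 2.0, 20)[groups] + rng.normal(0, 1.0, 200)
rho_hat = icc_anova(y, groups)                # true rho = 4/5 here
z = standardized_residuals(y)
```

With a large group variance, as here, the estimated ρ is large and the hierarchical structure is easy to see; shrinking the group standard deviation toward zero reproduces the regime where residual-based checks lose power.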

2.
The use of logistic regression modeling has seen a great deal of attention in the literature in recent years. This includes all aspects of the logistic regression model including the identification of outliers. A variety of methods for the identification of outliers, such as the standardized Pearson residuals, are now available in the literature. These methods, however, are successful only if the data contain a single outlier. In the presence of multiple outliers in the data, which is often the case in practice, these methods fail to detect the outliers. This is due to the well-known problems of masking (false negative) and swamping (false positive) effects. In this article, we propose a new method for the identification of multiple outliers in logistic regression. We develop a generalized version of standardized Pearson residuals based on group deletion and then propose a technique for identifying multiple outliers. The performance of the proposed method is then investigated through several examples.
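The group-deletion idea can be sketched as follows: refit the model with a suspect group excluded, then compute Pearson residuals for all cases from that fit, so a cluster of outliers cannot mask each other. This is a simplified sketch, not the paper's exact standardization; the simulated data and planted outliers are illustrative.

```python
import numpy as np

def fit_logistic(X, y, iters=25):
    """Plain Newton-Raphson maximum-likelihood fit (intercept included)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    b = np.zeros(X1.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X1 @ b))
        W = p * (1.0 - p)
        b = b + np.linalg.solve(X1.T @ (W[:, None] * X1), X1.T @ (y - p))
    return b

def deletion_pearson_residuals(X, y, suspect):
    """Pearson residuals for ALL cases, from a fit that excludes the
    suspect group -- the group-deletion idea against masking."""
    keep = np.setdiff1d(np.arange(len(y)), suspect)
    b = fit_logistic(X[keep], y[keep])
    X1 = np.column_stack([np.ones(len(y)), X])
    p = 1.0 / (1.0 + np.exp(-X1 @ b))
    return (y - p) / np.sqrt(p * (1.0 - p))

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = (rng.uniform(size=100) < 1.0 / (1.0 + np.exp(-2.0 * x))).astype(float)
x[:3], y[:3] = 4.0, 0.0                       # three planted outliers
r = deletion_pearson_residuals(x.reshape(-1, 1), y, np.arange(3))
```

With the suspect cases deleted from the fit, their residuals are extreme, whereas a joint fit would shrink them toward the bulk of the data.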

3.
This paper derives a simple ANOVA F-statistic which tests for random individual effects in a one-way error component model, using recursive residuals. Power comparisons are performed for this F-test when it is computed using true disturbances and recursive residuals from a panel data regression. Under the null, both statistics have an exact F distribution. The standardized version of the Breusch and Pagan (1980) Lagrange Multiplier test (SLM), as well as a fixed effects F-statistic (FE) recommended by Moulton and Randolph (1989), are also included in this comparison. The exact power function can be computed in all cases using Imhof's (1961) procedure. Our results suggest that the F-test based on recursive residuals is inferior to the popular SLM and FE tests in terms of computational simplicity and power, and is sensitive to the K observations used to start the recursion.
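Recursive residuals themselves are straightforward to compute: each one is the standardized one-step prediction error from a fit on all earlier observations. The sketch below follows the Brown–Durbin–Evans construction for a simple linear model; the simulated data are illustrative.

```python
import numpy as np

def recursive_residuals(X, y):
    """Brown-Durbin-Evans recursive residuals: standardized one-step
    prediction errors, each from a fit on all earlier cases. Under a
    correct model with iid N(0, sigma^2) errors they are themselves
    iid N(0, sigma^2)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    n, k = X1.shape
    w = np.empty(n - k)
    for t in range(k, n):
        Xt, yt = X1[:t], y[:t]
        b = np.linalg.lstsq(Xt, yt, rcond=None)[0]
        G = np.linalg.inv(Xt.T @ Xt)
        w[t - k] = (y[t] - X1[t] @ b) / np.sqrt(1.0 + X1[t] @ G @ X1[t])
    return w

rng = np.random.default_rng(2)
x = rng.normal(size=200)
y = 1.0 + 2.0 * x + rng.normal(size=200)      # sigma = 1
w = recursive_residuals(x.reshape(-1, 1), y)
```

Only n − k residuals exist, and the first few depend heavily on which K observations start the recursion, which is one of the sensitivities the paper's power study highlights.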

4.
The aim of this paper is to propose conditions for exploring the class of identifiable Gaussian models with one latent variable. In particular, we focus attention on the topological structure of the complementary graph of the residuals. These conditions are mainly based on the presence of odd cycles and bridge edges in the complementary graph. We propose to use the spanning tree representation of the graph and the associated matrix of fundamental cycles. In this way it is possible to obtain an algorithm able to establish in advance whether, after modifying the graph corresponding to an identifiable model, the resulting graph still corresponds to an identifiable model.

5.
Epidemiology research often entails the analysis of failure times subject to grouping. In large cohorts, interval grouping also offers a feasible means of data reduction that can make an analysis practical. Based on an underlying Cox proportional hazards model for the exact failure times, one may deduce a grouped-data version of this model which may then be used to analyse the data. The model bears a strong resemblance to a generalized linear model, yet due to the nature of the data one also needs to incorporate censoring. In the case of non-trivial censoring this precludes model checking procedures based on ordinary residuals, as calculation of these requires knowledge of the censoring distribution. In this paper, we represent interval grouped data in a dynamical way using a counting process approach. This enables us to identify martingale residuals which can be computed without knowledge of the censoring distribution. We use these residuals to construct graphical as well as numerical model checking procedures. An example from epidemiology is provided.

6.
To bootstrap a regression problem, pairs of response and explanatory variables or residuals can be resampled, according to whether we believe that the explanatory variables are random or fixed. In the latter case, different residuals have been proposed in the literature, including the ordinary residuals (Efron 1979), standardized residuals (Bickel & Freedman 1983) and Studentized residuals (Weber 1984). Freedman (1981) has shown that the bootstrap from ordinary residuals is asymptotically valid when the number of cases increases and the number of variables is fixed. Bickel & Freedman (1983) have shown the asymptotic validity for ordinary residuals when the number of variables and the number of cases both increase, provided that the ratio of the two converges to zero at an appropriate rate. In this paper, the authors introduce the use of BLUS (Best Linear Unbiased with Scalar covariance matrix) residuals in bootstrapping regression models. The main advantage of the BLUS residuals, introduced in Theil (1965), is that they are uncorrelated. The main disadvantage is that only n − p residuals can be computed for a regression problem with n cases and p variables. The asymptotic results of Freedman (1981) and Bickel & Freedman (1983) for the ordinary (and standardized) residuals are generalized to the BLUS residuals. A small simulation study shows that even though only n − p residuals are available, in small samples bootstrapping BLUS residuals can be as good as, and sometimes better than, bootstrapping from standardized or Studentized residuals.
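The fixed-design residual bootstrap that these comparisons share can be sketched in a few lines; here with standardized residuals (the Bickel & Freedman variant), since constructing the BLUS transformation itself is more involved. BLUS residuals would slot into the same resampling scheme, except only n − p of them exist. Names and the simulated data are illustrative.

```python
import numpy as np

def residual_bootstrap_slope(x, y, B=500, seed=None):
    """Fixed-design residual bootstrap for the slope of a straight-line
    fit, resampling mean-centred standardized residuals."""
    rng = np.random.default_rng(seed)
    X1 = np.column_stack([np.ones(len(y)), x])
    b = np.linalg.lstsq(X1, y, rcond=None)[0]
    fit = X1 @ b
    h = np.sum(X1 * (X1 @ np.linalg.inv(X1.T @ X1)), axis=1)  # leverages
    r = (y - fit) / np.sqrt(1.0 - h)          # standardized residuals
    r = r - r.mean()
    slopes = np.empty(B)
    for i in range(B):
        y_star = fit + rng.choice(r, size=len(y), replace=True)
        slopes[i] = np.linalg.lstsq(X1, y_star, rcond=None)[0][1]
    return b[1], slopes

rng = np.random.default_rng(3)
x = rng.normal(size=100)
y = 1.0 + 2.0 * x + rng.normal(size=100)
slope, slopes = residual_bootstrap_slope(x, y, seed=4)
```

The empirical standard deviation of `slopes` is the bootstrap estimate of the slope's standard error; swapping in a different residual vector changes only the pool being resampled.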

7.
This paper concerns model selection for autoregressive time series when the observations are contaminated with trend. We propose an adaptive least absolute shrinkage and selection operator (LASSO) type model selection method, in which the trend is estimated by B-splines, the detrended residuals are calculated, and then the residuals are used as if they were observations to optimize an adaptive LASSO-type objective function. The oracle properties of such an adaptive LASSO model selection procedure are established; that is, the proposed method can identify the true model with probability approaching one as the sample size increases, and the asymptotic properties of estimators are not affected by the replacement of observations with detrended residuals. The intensive simulation studies of several constrained and unconstrained autoregressive models also confirm the theoretical results. The method is illustrated by two time series data sets, the annual U.S. tobacco production and annual tree ring width measurements.
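A toy version of the pipeline, under loudly stated simplifications: a polynomial stands in for the B-spline trend, and a one-step soft-threshold with adaptive weights λ/|b_OLS| stands in for the full adaptive LASSO optimization. Everything below (function name, degrees, λ) is an illustrative assumption, not the paper's procedure.

```python
import numpy as np

def detrend_then_select(y, deg=3, max_lag=5, lam=0.1):
    """Estimate a smooth trend, take residuals, fit AR coefficients by
    OLS on the residuals, then soft-threshold with adaptive weights
    lam / |b_ols| -- a crude one-step surrogate for adaptive LASSO."""
    t = np.arange(len(y), dtype=float)
    trend = np.polyval(np.polyfit(t, y, deg), t)
    r = y - trend
    Y = r[max_lag:]
    X = np.column_stack([r[max_lag - j:len(r) - j]
                         for j in range(1, max_lag + 1)])
    b_ols = np.linalg.lstsq(X, Y, rcond=None)[0]
    w = lam / np.abs(b_ols)                   # adaptive penalty weights
    return np.sign(b_ols) * np.maximum(np.abs(b_ols) - w, 0.0)

rng = np.random.default_rng(5)
n = 500
e = rng.normal(size=n)
ar = np.empty(n)
ar[0] = e[0]
for i in range(1, n):                         # AR(1), phi = 0.7
    ar[i] = 0.7 * ar[i - 1] + e[i]
y = 0.02 * np.arange(n) + ar                  # linear trend + AR noise
b = detrend_then_select(y)
```

On this trended AR(1) series, the lag-1 coefficient survives the adaptive shrinkage while the spurious higher-lag coefficients are driven to zero, which is the model-selection behaviour the oracle property formalizes.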

8.
Integer-valued time series models make use of thinning operators for coherency with the nature of count data. However, the thinning operators make residuals unobservable, and this is the main difficulty in developing diagnostic tools for autocorrelated count data. In this regard, we introduce a new residual, which takes the form of predictive distribution functions, to assess probabilistic forecasts, and this new residual is supplemented by a modified version of the usual residuals. Under integer-valued autoregressive (INAR) models, the properties of these two residuals are investigated and used to evaluate the predictive performance and model adequacy of the INAR models. We compare our residuals with the existing residuals through simulation studies and apply our method to select an appropriate INAR model for an over-dispersed real data set.
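A residual built from predictive distribution functions can be illustrated with the randomized probability-integral-transform (PIT) construction for counts: draw a uniform value between F(y−1) and F(y) under the predictive distribution, so a correct model yields iid U(0,1) residuals. This captures the spirit of the idea; the paper's construction differs in detail, and the Poisson predictive model here is an assumption.

```python
import math
import numpy as np

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam); returns 0.0 for k < 0."""
    if k < 0:
        return 0.0
    term, total = math.exp(-lam), math.exp(-lam)
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total

def pit_residuals(y, lam, seed=None):
    """Randomized PIT residuals for count forecasts with predictive
    Poisson means lam: u_t ~ Uniform(F(y_t - 1), F(y_t))."""
    rng = np.random.default_rng(seed)
    u = np.empty(len(y))
    for i, (yt, lt) in enumerate(zip(y, lam)):
        lo, hi = poisson_cdf(yt - 1, lt), poisson_cdf(yt, lt)
        u[i] = lo + rng.uniform() * (hi - lo)
    return u

rng = np.random.default_rng(6)
y = rng.poisson(3.0, size=300)
u = pit_residuals(y, np.full(300, 3.0), seed=7)
```

Departures from uniformity in a histogram of `u` (e.g. a U-shape under over-dispersion) flag model inadequacy without ever needing the unobservable thinning-operator residuals.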

9.
Verifying the existence of a relationship between two multivariate time series represents an important consideration. In this article, the procedure developed by Cheung and Ng [A causality-in-variance test and its application to financial market prices, J. Econom. 72 (1996), pp. 33–48] designed to test causality in variance for univariate time series is generalized in several directions. A first approach proposes test statistics based on residual cross-covariance matrices of squared (standardized) residuals and cross products of (standardized) residuals. In a second approach, transformed residuals are defined for each residual vector time series, and test statistics are constructed based on the cross-correlations of these transformed residuals. Test statistics at individual lags and portmanteau-type test statistics are developed. Conditions are given under which the new test statistics converge in distribution towards chi-square distributions. The proposed methodology can be used to determine the directions of causality in variance, and appropriate test statistics are presented. Monte Carlo simulation results show that the new test statistics offer satisfactory empirical properties. An application with two bivariate financial time series illustrates the methods.  相似文献   
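The univariate Cheung–Ng idea being generalized can be sketched directly: sum squared lagged cross-correlations of centred squared residuals into a portmanteau statistic that is approximately chi-square under no causality in variance. This is a sketch of the basic univariate statistic, not the article's multivariate extensions; the simulated series are illustrative.

```python
import numpy as np

def causality_in_variance_stat(cause, effect, max_lag=5):
    """Portmanteau statistic on lagged cross-correlations of centred
    squared residuals; approximately chi-square with max_lag degrees
    of freedom when 'cause' does not cause 'effect' in variance."""
    u = cause**2 - np.mean(cause**2)
    v = effect**2 - np.mean(effect**2)
    n = len(u)
    stat = 0.0
    for k in range(1, max_lag + 1):
        r = np.sum(v[k:] * u[:-k]) / (n * u.std() * v.std())
        stat += n * r * r
    return stat

rng = np.random.default_rng(8)
n = 1000
e1 = rng.standard_normal(n)
e2 = rng.standard_normal(n)
e2[1:] *= np.sqrt(0.2 + 0.8 * e1[:-1]**2)     # e1 causes e2 in variance
e3 = rng.standard_normal(n)                    # independent of e1
s_dep = causality_in_variance_stat(e1, e2)
s_ind = causality_in_variance_stat(e1, e3)
```

Because only lags k ≥ 1 of the cause enter, swapping the arguments tests the opposite direction of causality in variance, which is how directionality is determined.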

10.
An added variable plot is a commonly used plot in regression diagnostics. The rationale for this plot is to provide information about the addition of a further explanatory variable to the model. In addition, an added variable plot is most often used for detecting high leverage points and influential data. So far as we know, this type of plot involves the least squares residuals which, we suspect, could produce a confusing picture when a group of unusual cases are present in the data. In this situation, added variable plots may not only fail to detect the unusual cases but also may fail to focus on the need for adding a further regressor to the model. We suggest that residuals from deletion should be more convincing and reliable in this type of plot. The usefulness of an added variable plot based on residuals from deletion is investigated through a few examples and a Monte Carlo simulation experiment in a variety of situations.
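Deletion residuals are the standard externally studentized residuals: each case's residual is scaled by an error-variance estimate computed with that case left out, so a gross outlier cannot deflate its own scale. A minimal sketch for a straight-line fit, with simulated data and a planted outlier as illustrative assumptions:

```python
import numpy as np

def deletion_residuals(x, y):
    """Externally studentized (deletion) residuals for a straight-line
    fit, via the leave-one-out variance identity; these can replace
    least-squares residuals on the axes of an added variable plot."""
    X1 = np.column_stack([np.ones(len(y)), x])
    n, p = X1.shape
    h = np.sum(X1 * (X1 @ np.linalg.inv(X1.T @ X1)), axis=1)  # leverages
    b = np.linalg.lstsq(X1, y, rcond=None)[0]
    e = y - X1 @ b
    s2 = e @ e / (n - p)
    s2_i = ((n - p) * s2 - e**2 / (1.0 - h)) / (n - p - 1)    # leave-one-out
    return e / np.sqrt(s2_i * (1.0 - h))

rng = np.random.default_rng(9)
x = rng.normal(size=50)
y = 1.0 + 2.0 * x + rng.normal(size=50)
y[0] += 10.0                                   # plant one gross outlier
t = deletion_residuals(x, y)
```

Plotting these deletion residuals of the response against deletion residuals of a candidate regressor gives the robustified added variable plot the paper advocates.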

11.
In a prevalent cohort study with follow-up, the incidence process is not directly observed since only the onset times of prevalent cases can be ascertained. Assessing the “stationarity” of the underlying incidence process can be important for at least three reasons, including an improvement in efficiency when estimating the survivor function. We propose, for the first time, a formal test for stationarity using data from a prevalent cohort study with follow-up. The test makes use of a characterization of stationarity, an extension of this characterization developed in this paper, and of a test for matched pairs of right censored data. We report the results from a power study assuming varying degrees of departure from the null hypothesis of stationarity. The test is also applied to data obtained as part of the Canadian Study of Health and Aging (CSHA) to verify whether the incidence rate of dementia amongst the elderly in Canada has remained constant.

12.
When genuine panel data samples are not available, repeated cross-sectional surveys can be used to form so-called pseudo panels. In this article, we investigate the properties of linear pseudo panel data estimators with a fixed number of cohorts and time observations. We extend the standard linear pseudo panel data setup to models with factor residuals by adapting the quasi-differencing approach developed for genuine panels. In a Monte Carlo study, we find that the proposed procedure has good finite sample properties in situations with endogeneity, cohort interactive effects, and near nonidentification. Finally, as an illustration the proposed method is applied to data from Ecuador to study labor supply elasticity. Supplementary materials for this article are available online.

13.
A criterion for choosing an estimator in a family of semi-parametric estimators from incomplete data is proposed. This criterion is the expected observed log-likelihood (ELL). Adapted versions of this criterion in the case of censored data and in the presence of explanatory variables are exhibited. We show that likelihood cross-validation (LCV) is an estimator of ELL and we exhibit three bootstrap estimators. A simulation study considering both families of kernel and penalized likelihood estimators of the hazard function (indexed by a smoothing parameter) demonstrates good results of LCV and a bootstrap estimator called ELLboot. We apply the ELLboot criterion to compare the kernel and penalized likelihood estimators to estimate the risk of developing dementia for women using data from a large cohort study.

14.
For binary response models, pseudo-R2 measures which are not based on residuals are considered, while several concepts of residuals have been developed for tests. In this paper the endogenous variable of the latent model corresponding to the binary observable model is substituted by a pseudo variable. Then goodness-of-fit measures and tests can be based on a joint concept of residuals, as for linear models. Different kinds of residuals based on probit ML estimates are employed. The analytical investigations and the simulation results lead to the recommendation to use standardized residuals, where there is no difference between observed and generalized residuals. In none of the investigated situations is this estimator far from the best result. While in large samples all considered estimators are very similar, small sample properties speak in favour of residuals which are modifications of those suggested in the literature. An empirical application demonstrates that it is not necessary to develop new testing procedures for the observable models with dichotomous regressands. Well-known approaches for linear models with continuous endogenous variables which are implemented in usual econometric packages can be used for pseudo latent models. An erratum to this article is available.

15.
In this paper we examine the properties of four types of residual vectors, arising from fitting a linear regression model to a set of data by least squares. The four types of residuals are (i) the Stepwise residuals (Hedayat and Robson, 1970), (ii) the Recursive residuals (Brown, Durbin, and Evans, 1975), (iii) the Sequentially Adjusted residuals (to be defined herein), and (iv) the BLUS residuals (Theil, 1965, 1971). We also study the relationships among the four residual vectors. It is found that, for any given sequence of observations, (i) the first three sets of residuals are identical, and (ii) each of the first three sets, being identical, is a member of Theil's (1965, 1971) family of residuals; specifically, they are Linear Unbiased with a Scalar covariance matrix (LUS) but not Best Linear Unbiased with a Scalar covariance matrix (BLUS). We find the explicit form of the transformation matrix and show that the first three sets of residual vectors can be written as an orthogonal transformation of the BLUS residual vector. These and other properties may prove to be useful in the statistical analysis of residuals.

16.
Although regression estimates are quite robust to slight departures from normality, symmetric prediction intervals assuming normality can be highly unsatisfactory and problematic if the residuals have a skewed distribution. For data with distributions outside the class covered by the Generalized Linear Model, a common way to handle non-normality is to transform the response variable. Unfortunately, transforming the response variable often destroys the theoretical or empirical functional relationship connecting the mean of the response variable to the explanatory variables established on the original scale. Further complication arises if a single transformation cannot both stabilize variance and attain normality. Furthermore, practitioners also find the interpretation of highly transformed data not obvious and often prefer an analysis on the original scale. The present paper presents an alternative approach for handling simultaneously heteroscedasticity and non-normality without resorting to data transformation. Unlike classical approaches, the proposed modeling allows practitioners to formulate the mean and variance relationships directly on the original scale, making data interpretation considerably easier. The modeled variance relationship and form of non-normality in the proposed approach can be easily examined through a certain function of the standardized residuals. The proposed method is seen to remain consistent for estimating the regression parameters even if the variance function is misspecified. The method along with some model checking techniques is illustrated with a real example.

17.
This study considers the problem of testing for a parameter change in integer-valued time series models in which the conditional density of current observations is assumed to follow a Poisson distribution. As a test, we consider the CUSUM of squares test based on the residuals from INGARCH models and find that the test converges weakly to the supremum of a Brownian bridge. A simulation study demonstrates its superiority to the residual and standardized residual-based CUSUM tests of Kang and Lee [Parameter change test for Poisson autoregressive models. Scand J Statist. 2014;41:1136–1152] and Lee and Lee [CUSUM tests for general nonlinear integer-valued GARCH models: comparison study. Ann Inst Stat Math. 2019;71:1033–1057] as well as the CUSUM of squares test based on standardized residuals.
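The basic CUSUM-of-squares construction is easy to sketch: compare the cumulative share of squared residuals with the uniform line k/n and scale the maximal gap. The √(n/2) scaling below assumes roughly normal residuals (Var(e²) = 2σ⁴), so this is a generic sketch, not the paper's INGARCH-specific statistic; under no variance change it behaves like the sup of a Brownian bridge (5% critical value about 1.36).

```python
import numpy as np

def cusum_of_squares(res):
    """CUSUM-of-squares statistic sqrt(n/2) * max_k |S_k - k/n|, where
    S_k is the cumulative share of squared residuals."""
    e2 = np.asarray(res, float) ** 2
    n = len(e2)
    S = np.cumsum(e2) / e2.sum()
    k = np.arange(1, n + 1)
    return np.sqrt(n / 2.0) * np.max(np.abs(S - k / n))

rng = np.random.default_rng(10)
stable = rng.normal(0.0, 1.0, 1000)            # no variance change
broken = np.concatenate([rng.normal(0.0, 1.0, 500),
                         rng.normal(0.0, 2.0, 500)])  # break at t = 500
s0 = cusum_of_squares(stable)
s1 = cusum_of_squares(broken)
```

The location of the maximal gap also gives a natural estimate of the change point.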

18.
We define residuals for point process models fitted to spatial point pattern data, and we propose diagnostic plots based on them. The residuals apply to any point process model that has a conditional intensity; the model may exhibit spatial heterogeneity, interpoint interaction and dependence on spatial covariates. Some existing ad hoc methods for model checking (quadrat counts, scan statistic, kernel smoothed intensity and Berman's diagnostic) are recovered as special cases. Diagnostic tools are developed systematically, by using an analogy between our spatial residuals and the usual residuals for (non-spatial) generalized linear models. The conditional intensity λ plays the role of the mean response. This makes it possible to adapt existing knowledge about model validation for generalized linear models to the spatial point process context, giving recommendations for diagnostic plots. A plot of smoothed residuals against spatial location, or against a spatial covariate, is effective in diagnosing spatial trend or covariate effects. Q–Q plots of the residuals are effective in diagnosing interpoint interaction.
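The quadrat-count special case makes the residual idea concrete: for a fitted homogeneous Poisson model, the Pearson-type residual in each cell is (observed − expected)/√expected. A minimal sketch on the unit square, with the grid size and simulated complete spatial randomness as illustrative assumptions; inhomogeneous models would use the fitted conditional intensity instead of a constant.

```python
import numpy as np

def quadrat_pearson_residuals(px, py, m=4):
    """Pearson-type residuals on an m x m quadrat grid over the unit
    square, for a homogeneous Poisson model whose intensity is
    estimated by the total count."""
    counts, _, _ = np.histogram2d(px, py, bins=m, range=[[0, 1], [0, 1]])
    expected = len(px) / m**2                  # fitted count per cell
    return (counts - expected) / np.sqrt(expected)

rng = np.random.default_rng(11)
px, py = rng.uniform(size=400), rng.uniform(size=400)  # CSR pattern
r = quadrat_pearson_residuals(px, py)
```

Smoothing such residuals over location, rather than binning them, gives the diagnostic surface plots the paper recommends for detecting spatial trend.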

19.
The Cox proportional hazards model, which is widely used for the analysis of treatment and prognostic effects with censored survival data, makes the assumption that the hazard ratio is constant over time. Nonparametric estimators have been developed for an extended model in which the hazard ratio is allowed to change over time. Estimators based on residuals are appealing as they are easy to use and relate in a simple way to the more restricted Cox model estimator. After fitting a Cox model and calculating the residuals, one can obtain a crude estimate of the time-varying coefficients by adding a smooth of the residuals to the initial (constant) estimate. Treating the crude estimate as the fit, one can re-estimate the residuals. Iteration leads to consistent estimation of the nonparametric time-varying coefficients. This approach leads to clear guidelines for residual analysis in applications. The results are illustrated by an analysis of the Medical Research Council's myeloma trials, and by simulation.

20.
We propose a class of state-space models for multivariate longitudinal data where the components of the response vector may have different distributions. The approach is based on the class of Tweedie exponential dispersion models, which accommodates a wide variety of discrete, continuous and mixed data. The latent process is assumed to be a Markov process, and the observations are conditionally independent given the latent process, over time as well as over the components of the response vector. This provides a fully parametric alternative to the quasilikelihood approach of Liang and Zeger. We estimate the regression parameters for time-varying covariates entering either via the observation model or via the latent process, based on an estimating equation derived from the Kalman smoother. We also consider analysis of residuals from both the observation model and the latent process.

