Similar literature
 Found 20 similar articles (search time: 46 ms)
1.
Abstract. In this paper, we carry out an in-depth investigation of diagnostic measures for assessing the influence of observations and model misspecification in the presence of missing covariate data for generalized linear models. Our diagnostic measures include case-deletion measures and conditional residuals. We use the conditional residuals to construct goodness-of-fit statistics for testing possible misspecifications in model assumptions, including the sampling distribution. We develop specific strategies for incorporating missing data into goodness-of-fit statistics in order to increase the power of detecting model misspecification. A resampling method is proposed to approximate the p-value of the goodness-of-fit statistics. Simulation studies are conducted to evaluate our methods and a real data set is analysed to illustrate the use of our various diagnostic measures.

2.
This note discusses a problem that might occur when forward stepwise regression is used for variable selection and among the candidate variables is a categorical variable with more than two categories. Most software packages (such as SAS, SPSSx, BMDP) include special programs for performing stepwise regression. The user of these programs has to code categorical variables with dummy variables. In this case the forward selection might wrongly indicate that a categorical variable with more than two categories is nonsignificant. This is a disadvantage of the forward selection compared with the backward elimination method. A way to avoid the problem would be to test in a single step all dummy variables corresponding to the same categorical variable rather than one dummy variable at a time, such as in the analysis of covariance. This option, however, is not available in forward stepwise procedures, except for stepwise logistic regression in BMDP. A practical possibility is to repeat the forward stepwise regression and change the reference categories each time.
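
The single-step test of all dummy variables mentioned in this abstract can be illustrated with a partial F statistic. The following is a minimal sketch under stated assumptions, not the note's own procedure: the helper names `rss`, `dummies` and `group_f_stat` are hypothetical, the data are simulated, and NumPy is assumed to be available.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from an ordinary least-squares fit."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    r = y - X @ b
    return float(r @ r)

def dummies(codes, reference):
    """One 0/1 column per category, omitting the chosen reference category."""
    levels = [lv for lv in np.unique(codes) if lv != reference]
    return np.column_stack([(codes == lv).astype(float) for lv in levels])

def group_f_stat(X_base, D, y):
    """Partial F statistic for adding all columns of D to X_base in one step."""
    X_full = np.column_stack([X_base, D])
    q = D.shape[1]                        # number of dummy columns tested jointly
    df_resid = len(y) - X_full.shape[1]   # residual degrees of freedom, full model
    rss0, rss1 = rss(X_base, y), rss(X_full, y)
    return ((rss0 - rss1) / q) / (rss1 / df_resid)

# Simulated example (hypothetical data): three categories, intercept-only base model.
codes = np.repeat([0, 1, 2], 20)
rng = np.random.default_rng(0)
y = np.array([0.0, 1.0, -1.0])[codes] + rng.normal(size=codes.size)
intercept = np.ones((codes.size, 1))
print(group_f_stat(intercept, dummies(codes, reference=0), y))
print(group_f_stat(intercept, dummies(codes, reference=2), y))
```

Because the intercept together with the dummy columns spans the same space whichever category is omitted, the joint statistic does not depend on the reference category, whereas the one-dummy-at-a-time statistics do.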

3.
ABSTRACT

This paper analyses the behaviour of the goodness-of-fit tests for regression models. To this end, it uses statistics based on an estimation of the integrated regression function with missing observations either in the response variable or in some of the covariates. It proposes several versions of one empirical process, constructed from a previous estimation, that uses only the complete observations or replaces the missing observations with imputed values. In the case of missing covariates, a link model is used to fill the missing observations with other complete covariates. In all the situations, bootstrap methodology is used to calibrate the distribution of the test statistics. A broad simulation study compares the different procedures based on empirical regression methodology, with smoothed tests previously studied in the literature. The comparison reflects the effect of the correlation between the covariates in the tests based on the imputed sample for missing covariates. In addition, the paper proposes a computational binning strategy to evaluate the tests based on an empirical process for large data sets. Finally, two applications to real data illustrate the performance of the tests.

4.
We introduce directed goodness-of-fit tests for Cox-type regression models in survival analysis. “Directed” means that one may choose against which alternatives the tests are particularly powerful. The tests are based on sums of weighted martingale residuals and their asymptotic distributions. We derive optimal tests against certain competing models which include Cox-type regression models with different covariates and/or a different link function. We report results from several simulation studies and apply our test to a real dataset.

5.
To bootstrap a regression problem, pairs of response and explanatory variables or residuals can be resampled, according to whether we believe that the explanatory variables are random or fixed. In the latter case, different residuals have been proposed in the literature, including the ordinary residuals (Efron 1979), standardized residuals (Bickel & Freedman 1983) and Studentized residuals (Weber 1984). Freedman (1981) has shown that the bootstrap from ordinary residuals is asymptotically valid when the number of cases increases and the number of variables is fixed. Bickel & Freedman (1983) have shown the asymptotic validity for ordinary residuals when the number of variables and the number of cases both increase, provided that the ratio of the two converges to zero at an appropriate rate. In this paper, the authors introduce the use of BLUS (Best Linear Unbiased with Scalar covariance matrix) residuals in bootstrapping regression models. The main advantage of the BLUS residuals, introduced in Theil (1965), is that they are uncorrelated. The main disadvantage is that only n − p residuals can be computed for a regression problem with n cases and p variables. The asymptotic results of Freedman (1981) and Bickel & Freedman (1983) for the ordinary (and standardized) residuals are generalized to the BLUS residuals. A small simulation study shows that even though only n − p residuals are available, in small samples bootstrapping BLUS residuals can be as good as, and sometimes better than, bootstrapping from standardized or Studentized residuals.
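
The fixed-design case described in this abstract, in which residuals rather than pairs are resampled, can be sketched as follows. This is an illustrative sketch that resamples centred ordinary least-squares residuals; it does not compute BLUS residuals, and the function name `residual_bootstrap` and the simulated data are assumptions made for the example.

```python
import numpy as np

def residual_bootstrap(X, y, n_boot=500, seed=0):
    """Fixed-design bootstrap: resample centred OLS residuals, rebuild the
    response, refit, and collect the bootstrap coefficient estimates."""
    rng = np.random.default_rng(seed)
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    fitted = X @ beta_hat
    resid = y - fitted
    resid = resid - resid.mean()          # centre so the bootstrap errors have mean zero
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        # The design matrix X is held fixed; only the errors are redrawn.
        y_star = fitted + rng.choice(resid, size=len(y), replace=True)
        draws[b] = np.linalg.lstsq(X, y_star, rcond=None)[0]
    return beta_hat, draws

# Simulated example (hypothetical data): intercept plus one regressor.
rng = np.random.default_rng(42)
x = rng.normal(size=50)
X = np.column_stack([np.ones(50), x])
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=50)
beta_hat, draws = residual_bootstrap(X, y)
print(beta_hat, draws.std(axis=0, ddof=1))  # estimates and bootstrap standard errors
```

Swapping in standardized, Studentized or BLUS residuals would change only the `resid` vector that is resampled; the rest of the loop stays the same.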

6.
Multivariate mixture regression models can be used to investigate the relationships between two or more response variables and a set of predictor variables by taking into consideration unobserved population heterogeneity. It is common to take multivariate normal distributions as mixing components, but this mixing model is sensitive to heavy-tailed errors and outliers. Although normal mixture models can approximate any distribution in principle, the number of components needed to account for heavy-tailed distributions can be very large. Mixture regression models based on the multivariate t distributions can be considered as a robust alternative approach. Missing data are inevitable in many situations and parameter estimates could be biased if the missing values are not handled properly. In this paper, we propose a multivariate t mixture regression model with missing information to model heterogeneity in regression function in the presence of outliers and missing values. Along with the robust parameter estimation, our proposed method can be used for (i) visualization of the partial correlation between response variables across latent classes and heterogeneous regressions, and (ii) outlier detection and robust clustering even under the presence of missing values. We also propose a multivariate t mixture regression model using MM-estimation with missing information that is robust to high-leverage outliers. The proposed methodologies are illustrated through simulation studies and real data analysis.

7.
A probability property that connects the skew normal (SN) distribution with the normal distribution is used for proposing a goodness-of-fit test for the composite null hypothesis that a random sample follows an SN distribution with unknown parameters. The random sample is transformed to approximately normal random variables, and then the Shapiro–Wilk test is used for testing normality. The implementation of this test requires neither parametric bootstrap nor the use of tables for different values of the slant parameter. An additional test for the same problem, based on a property that relates the gamma and SN distributions, is also introduced. The results of a power study conducted by Monte Carlo simulation show some good properties of the proposed tests in comparison to existing tests for the same problem.

8.
Test procedures are constructed for testing the goodness-of-fit of the error distribution in the regression context. The test statistic is based on an L2-type distance between the characteristic function of the (assumed) error distribution and the empirical characteristic function of the residuals. The asymptotic null distribution as well as the behavior of the test statistic under contiguous alternatives is investigated, while the issue of the choice of suitable estimators has been particularly emphasized. Theoretical results are accompanied by a simulation study.

9.
The Generalized Error Distribution (GED) is a widely used, flexible family of symmetric probability distributions. Thanks to its properties, it is becoming increasingly popular in many scientific fields, so determining whether a sample is drawn from a GED is an important issue, one that is usually addressed with a graphical approach. In this paper we present a new goodness-of-fit test for the GED that performs well at detecting non-GED distributions when the alternative distribution is either skewed or a mixture. A comparison between well-known tests and this new procedure is performed through a simulation study. We have developed a function that performs the analysis described in this paper in the R environment. The computational time required by this procedure is negligible.

10.
Summary. Standard goodness-of-fit tests for a parametric regression model against a series of nonparametric alternatives are based on residuals arising from a fitted model. When a parametric regression model is compared with a nonparametric model, goodness-of-fit testing can be naturally approached by evaluating the likelihood of the parametric model within a nonparametric framework. We employ the empirical likelihood for an α-mixing process to formulate a test statistic that measures the goodness of fit of a parametric regression model. The technique is based on a comparison with kernel smoothing estimators. The empirical likelihood formulation of the test has two attractive features. One is its automatic consideration of the variation that is associated with the nonparametric fit due to empirical likelihood's ability to Studentize internally. The other is that the asymptotic distribution of the test statistic is free of unknown parameters, avoiding plug-in estimation. We apply the test to a discretized diffusion model which has recently been considered in financial market analysis.

11.
This article presents a multiple hypothesis test procedure that combines two well-known tests for structural change in the linear regression model, the CUSUM test and the recursive t test. The CUSUM test is run through the sequence of recursive residuals as usual; if the CUSUM plot does not violate the critical lines, one more step is taken to perform the t test for the hypothesis of zero mean based on all recursive residuals. The asymptotic size of this multiple hypothesis test is derived; power simulation results suggest that it outperforms the traditional CUSUM test and complements other tests that are currently stressed in econometrics.
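
The two steps described in this abstract can be sketched as follows. This is a rough illustration under stated assumptions rather than the article's procedure: it uses the standard recursive residuals and the conventional 5% CUSUM boundary constant 0.948, it does not reproduce the size derivation for the combined test, and the function names `recursive_residuals` and `cusum_then_t` are hypothetical.

```python
import numpy as np

def recursive_residuals(X, y):
    """Standardized one-step-ahead prediction errors for t = k+1, ..., n."""
    n, k = X.shape
    w = np.empty(n - k)
    for t in range(k, n):
        Xp, yp = X[:t], y[:t]                       # data available before case t
        b = np.linalg.lstsq(Xp, yp, rcond=None)[0]
        xt = X[t]
        lev = xt @ np.linalg.solve(Xp.T @ Xp, xt)   # x_t' (X'X)^{-1} x_t
        w[t - k] = (y[t] - xt @ b) / np.sqrt(1.0 + lev)
    return w

def cusum_then_t(X, y, a=0.948):
    """Step 1: does the CUSUM path cross the critical lines?
    Step 2: t statistic for a zero mean of the recursive residuals."""
    n, k = X.shape
    w = recursive_residuals(X, y)
    m = n - k
    sigma = np.sqrt(np.sum(w ** 2) / m)             # scale estimate from the residuals
    cusum = np.cumsum(w) / sigma
    r = np.arange(1, m + 1)
    bound = a * (np.sqrt(m) + 2.0 * r / np.sqrt(m)) # linear critical lines
    crossed = bool(np.any(np.abs(cusum) > bound))
    t_stat = np.sqrt(m) * w.mean() / w.std(ddof=1)  # compare with t on m - 1 d.f.
    return crossed, t_stat

# Simulated example (hypothetical data) with a stable regression relationship.
rng = np.random.default_rng(0)
x = rng.normal(size=80)
X = np.column_stack([np.ones(80), x])
y = 0.5 + 1.5 * x + rng.normal(size=80)
print(cusum_then_t(X, y))
```

In the sequential reading of the abstract, the t statistic would be consulted only when `crossed` is false; the critical value for that second step depends on the article's size calculation, which is not shown here.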

12.
This article presents a multiple hypothesis test procedure that combines two well-known tests for structural change in the linear regression model, the CUSUM test and the recursive t test. The CUSUM test is run through the sequence of recursive residuals as usual; if the CUSUM plot does not violate the critical lines, one more step is taken to perform the t test for the hypothesis of zero mean based on all recursive residuals. The asymptotic size of this multiple hypothesis test is derived; power simulation results suggest that it outperforms the traditional CUSUM test and complements other tests that are currently stressed in econometrics.

13.
Logistic-normal models can be applied for analysis of longitudinal binary data. The aim of this article is to propose a goodness-of-fit test using nonparametric smoothing techniques for checking the adequacy of logistic-normal models. Moreover, the leave-one-out cross-validation method for selecting the suitable bandwidth is developed. The quadratic form of the proposed test statistic based on smoothing residuals provides a global measure for checking the model with categorical and continuous covariates. The formulae of expectation and variance of the proposed statistics are derived, and their asymptotic distribution is approximated by a scaled chi-squared distribution. The power performance of the proposed test for detecting the interaction term or the squared term of continuous covariates is examined by simulation studies. A longitudinal dataset is utilized to illustrate the application of the proposed test.

14.
ABSTRACT

The compound Poisson-exponential distribution is a basic model in risk analysis and stochastic hydrology. Graphical procedures for assessing this distribution are proposed which utilize the residuals from a regression involving the moment generating function. Plots furnished with a 95% simultaneous confidence band are constructed. The band and critical points of the equivalent goodness-of-fit test are found by utilizing asymptotic results and fitted regressions involving the supremum of the standardized residuals, the sample size, and the estimated Poisson mean. Simulation results indicate that the tests have good level stability and appreciable power against competing compound Poisson distributions of a mixed type.

15.
The authors propose a goodness-of-fit test for parametric regression models when the response variable is right-censored. Their test compares an estimation of the error distribution based on parametric residuals to another estimation relying on nonparametric residuals. They call on a bootstrap mechanism in order to approximate the critical values of tests based on Kolmogorov-Smirnov and Cramér-von Mises type statistics. They also present the results of Monte Carlo simulations and use data from a study about quasars to illustrate their work.

16.
It is important to detect variance heterogeneity in regression models because efficient inference requires that heteroscedasticity be taken into consideration if it really exists. For the varying-coefficient partially linear regression models, however, the problem of detecting heteroscedasticity has received very little attention. In this paper, we present two classes of tests of heteroscedasticity for varying-coefficient partially linear regression models. The first test statistic is constructed based on the residuals, in which the error term is from a normal distribution. The second one is motivated by the idea that testing heteroscedasticity is equivalent to testing pseudo-residuals for a constant mean. Asymptotic normality is established with different rates corresponding to the null hypothesis of homoscedasticity and the alternative. Some Monte Carlo simulations are conducted to investigate the finite sample performance of the proposed tests. The test methodologies are illustrated with a real data set example.

17.
ABSTRACT. In this paper, we develop an approximation for the most powerful invariant test of one location-scale family against another one. The approach is based on the Laplace method for integrals and yields a very accurate approximation of the density of a maximal invariant. Moreover, it can be applied to a much wider set of pairs of densities than previously possible. Many examples are worked out. The resulting test is easy to compute and its power is shown to be very close to that of the best test. By using versions of the Laplace method, the approach is extended to goodness-of-fit tests for residuals in regression and to some multivariate distributions. A small simulation study confirms the theoretical results. An example concludes the paper.

18.
Various methods to control the influence of a covariate on a response variable are compared. These methods are ANOVA with or without homogeneity of variances (HOV) of errors and Kruskal–Wallis (K–W) tests on (covariate-adjusted) residuals and analysis of covariance (ANCOVA). Covariate-adjusted residuals are obtained from the overall regression line fit to the entire data set ignoring the treatment levels or factors. It is demonstrated that the methods on covariate-adjusted residuals are only appropriate when the regression lines are parallel and covariate means are equal for all treatments. Empirical size and power performance of the methods are compared by extensive Monte Carlo simulations. We manipulated the conditions such as assumptions of normality and HOV, sample size, and clustering of the covariates. The parametric methods on residuals and ANCOVA exhibited similar size and power when error terms have symmetric distributions with variances having the same functional form for each treatment, and covariates have uniform distributions within the same interval for each treatment. In such cases, parametric tests have higher power compared to the K–W test on residuals. When error terms have asymmetric distributions or have variances that are heterogeneous with different functional forms for each treatment, the tests are liberal, with the K–W test having higher power than the others. The methods on covariate-adjusted residuals are severely affected by the clustering of the covariates relative to the treatment factors when covariate means are very different for treatments. For data clusters, the ANCOVA method exhibits the appropriate level. However, such clustering might suggest dependence between the covariates and the treatment factors, which makes ANCOVA less reliable as well.

19.
In this paper a set of residuals for the multivariate linear regression model is introduced. These residuals are shown to be independent with known distributions which do not depend on the parameters of the model. Transformations of the mentioned residuals may be used to construct exact α goodness-of-fit tests for the multivariate regression model.

20.
The problem of missing values is common in all branches of statistics and especially in regression analysis. Here we consider estimation of the regression parameters in the presence of missingness in the response. The usual method is to replace the missing value by its predicted value based on the available observations without any correction for the disturbance term. Instead we suggest a method which corrects the usual predictor with a guess of the disturbance term based on the available residuals. Comparison between the two methods shows that the latter leads to better results.
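
The contrast between the two imputation methods in this abstract can be sketched as follows. The abstract does not say how the guess of the disturbance term is formed, so this sketch assumes one simple possibility, a random draw from the residuals of the complete cases; the function name `impute_response` and the simulated data are likewise hypothetical.

```python
import numpy as np

def impute_response(X, y, add_residual=True, seed=0):
    """Fill missing responses (NaN) with the OLS prediction from complete cases.
    With add_residual=True, a residual drawn at random from the complete cases
    is added to each prediction (an assumed form of the disturbance guess)."""
    rng = np.random.default_rng(seed)
    obs = ~np.isnan(y)
    b = np.linalg.lstsq(X[obs], y[obs], rcond=None)[0]
    resid = y[obs] - X[obs] @ b
    pred = X[~obs] @ b                       # usual predictor, no correction
    if add_residual:
        pred = pred + rng.choice(resid, size=pred.shape[0], replace=True)
    y_filled = y.copy()
    y_filled[~obs] = pred
    return y_filled, b

# Simulated example (hypothetical data): 20% of the responses set to missing.
rng = np.random.default_rng(1)
x = rng.normal(size=100)
X = np.column_stack([np.ones(100), x])
y = 2.0 - 1.0 * x + rng.normal(size=100)
y[rng.choice(100, size=20, replace=False)] = np.nan
plain, _ = impute_response(X, y, add_residual=False)
corrected, _ = impute_response(X, y, add_residual=True)
print(np.nanvar(plain), np.nanvar(corrected))
```

Plain prediction places every imputed value exactly on the fitted line, which tends to understate the spread of the completed response; adding a residual is one way to restore that variability.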

