Similar Literature
 Found 20 similar documents; search time: 31 ms
1.
Consider the usual linear regression model with two or more explanatory variables. Many methods aim to indicate the relative importance of the explanatory variables, but in general they do not address a fundamental issue: when all of the explanatory variables are included in the model, how strong is the empirical evidence that the first explanatory variable is more or less important than the second? How strong is the empirical evidence that the first two explanatory variables are more important than the third? The paper suggests a robust method for dealing with these issues. The proposed technique is based on a particular version of explanatory power used in conjunction with a modification of the basic percentile method.
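
The basic percentile bootstrap that the abstract builds on can be sketched as follows. This is only a rough illustration, not the paper's method: it assumes, as a stand-in, that importance is measured by the squared Pearson correlation of each explanatory variable with the response, and it omits the paper's modification of the percentile method. The function names and the simulated data are hypothetical.

```python
import random

def pearson_r(a, b):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5

def percentile_ci_diff(x1, x2, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for r(x1, y)^2 - r(x2, y)^2."""
    rng = random.Random(seed)
    n = len(y)
    diffs = []
    for _ in range(n_boot):
        # Resample whole observations (rows) with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        b1 = [x1[i] for i in idx]
        b2 = [x2[i] for i in idx]
        by = [y[i] for i in idx]
        diffs.append(pearson_r(b1, by) ** 2 - pearson_r(b2, by) ** 2)
    diffs.sort()
    lo_idx = int(n_boot * alpha / 2)
    hi_idx = n_boot - 1 - lo_idx
    return diffs[lo_idx], diffs[hi_idx]

# Hypothetical data: x1 is constructed to matter more than x2.
data_rng = random.Random(1)
x1 = [data_rng.gauss(0, 1) for _ in range(40)]
x2 = [data_rng.gauss(0, 1) for _ in range(40)]
y = [2.0 * a + 0.5 * b + data_rng.gauss(0, 1) for a, b in zip(x1, x2)]

lo, hi = percentile_ci_diff(x1, x2, y)
```

An interval that excludes zero would be read as evidence that one variable is more important than the other under this stand-in measure.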

2.
We study the finite-sample properties of White's test for heteroskedasticity in stochastic regression models, where the explanatory variables are random rather than given. We investigate by simulation how non-independence between the explanatory variables and the error term, and heteroskedasticity, affect White's test. A standard bootstrap method in its computationally convenient form is found to work well with respect to size and power.

3.
The analysis of failure time data often involves two strong assumptions. The proportional hazards assumption postulates that hazard rates corresponding to different levels of explanatory variables are proportional. The additive effects assumption specifies that the effect associated with a particular explanatory variable does not depend on the levels of other explanatory variables. A hierarchical Bayes model is presented, under which both assumptions are relaxed. In particular, time-dependent covariate effects are explicitly modelled, and the additivity of effects is relaxed through the use of a modified neural network structure. The hierarchical nature of the model is useful in that it parsimoniously penalizes violations of the two assumptions, with the strength of the penalty being determined by the data.

4.
An alternative graphical method, called the SSR plot, is proposed for use with a multiple regression model. The new method uses the fact that the sum of squares for regression (SSR) of two explanatory variables can be partitioned into the SSR of one variable and the increment in SSR due to the addition of the second variable. The SSR plot represents each explanatory variable as a vector in a half circle. In the SSR plot, explanatory variables whose vectors lie closer to the horizontal axis have stronger effects on the response variable. Furthermore, for a regression model with two explanatory variables, the magnitude of the angle between the two vectors can be used to identify suppression.
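
The SSR partition that the plot relies on can be checked numerically. The sketch below is illustrative only: it computes SSR(x1), the increment due to adding x2, and the full-model SSR for a small made-up data set, and does not attempt to draw the plot itself. All names and data values are hypothetical.

```python
def centered(v):
    """Subtract the mean so that the intercept drops out of the algebra."""
    m = sum(v) / len(v)
    return [a - m for a in v]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def ssr_partition(x1, x2, y):
    """Return (SSR of x1 alone, increment from adding x2, SSR of both)."""
    x1c, x2c, yc = centered(x1), centered(x2), centered(y)
    s11, s22, s12 = dot(x1c, x1c), dot(x2c, x2c), dot(x1c, x2c)
    s1y, s2y = dot(x1c, yc), dot(x2c, yc)
    # Solve the 2x2 normal equations for the full model.
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    ssr_full = b1 * s1y + b2 * s2y
    ssr_x1 = s1y ** 2 / s11
    return ssr_x1, ssr_full - ssr_x1, ssr_full

x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [3, 4, 8, 8, 13]
ssr_x1, increment, ssr_full = ssr_partition(x1, x2, y)
```

The increment is never negative, because adding a variable cannot reduce the regression sum of squares.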

5.
Two diagnostic plots for selecting explanatory variables are introduced to assess the accuracy of a generalized beta-linear model. The added variable plot is developed to examine the need for adding a new explanatory variable to the model. The constructed variable plot is developed to identify nonlinearity in an explanatory variable in the model. The two diagnostic procedures are also useful for detecting unusual observations that may strongly affect the regression. Simulation studies and analyses of two practical examples are conducted to illustrate the performance of the proposed plots.

6.
We investigated CART performance with a unimodal response curve for one continuous response and four continuous explanatory variables, where two variables were important (i.e. directly related to the response) and the other two were not. We explored performance under three relationship strengths and two explanatory variable conditions: equal importance and one variable four times as important as the other. We compared CART variable selection performance using three tree-selection rules ('minimum risk', 'minimum risk complexity', 'one standard error') to stepwise polynomial ordinary least squares (OLS) under four sample size conditions. The one-standard-error and minimum-risk-complexity rules performed about as well as stepwise OLS with large sample sizes when the relationship was strong. With weaker relationships, equally important explanatory variables and larger sample sizes, the one-standard-error and minimum-risk-complexity rules performed better than stepwise OLS. With weaker relationships and explanatory variables of unequal importance, tree-structured methods did not perform as well as stepwise OLS. Comparing performance within tree-structured methods, with a strong relationship and equally important explanatory variables, the one-standard-error rule was more likely to choose the correct model than were the other tree-selection rules. The minimum-risk-complexity rule was more likely to choose the correct model than were the other tree-selection rules (1) with weaker relationships and equally important explanatory variables; and (2) under all relationship strengths when explanatory variables were of unequal importance and sample sizes were lower.

7.
Credit scoring techniques have been developed in recent years to reduce the risk that banks and financial institutions take on the loans they grant. Credit scoring is the problem of classifying individuals into one of two groups: defaulting borrowers or non-defaulting borrowers. The aim of this paper is to propose a new method of discrimination for the case where the dependent variable is categorical and a large number of categorical explanatory variables are retained. This method, Categorical Multiblock Linear Discriminant Analysis, computes components which take into account both the relationships between explanatory categorical variables and the canonical correlation between each explanatory categorical variable and the dependent variable. A comparison with three other techniques and an application to credit scoring data are provided.

8.
The predictive influence of explanatory variables has been studied in both univariate and multivariate distributions. In the Bayesian approach, the same problem is considered in the absence of multicollinearity in the dataset. The aim of this article is to study the same problem in the presence of perfect multicollinearity. To do this, we first derive the predictive distributions for the full model and the reduced model using a vague prior density. Then the discrepancies between these predictive distributions are measured by the Kullback–Leibler (K–L) directed measure of divergence to assess the influence of deleted explanatory variables. Finally, the distribution of the discrepancies is derived and the test procedure is performed.
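
To make the K–L directed divergence concrete, here is a small sketch that assumes, purely for illustration, that both predictive distributions are univariate normal; the paper's actual predictive distributions are not given in this abstract. The function name `kl_normal` and the parameter values are hypothetical.

```python
import math

def kl_normal(mu_p, sigma_p, mu_q, sigma_q):
    """Directed divergence KL(P || Q) for P = N(mu_p, sigma_p^2), Q = N(mu_q, sigma_q^2)."""
    return (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
            - 0.5)

# Hypothetical "full" and "reduced" predictive distributions.
full_to_reduced = kl_normal(0.0, 1.0, 0.0, 2.0)
reduced_to_full = kl_normal(0.0, 2.0, 0.0, 1.0)
```

The measure is directed: swapping the two distributions generally changes the value, so the choice of reference model matters.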

9.
This article considers the unconditional asymptotic covariance matrix of the least squares estimator in the linear regression model with stochastic explanatory variables. The asymptotic covariance matrix of the least squares estimator of regression parameters is evaluated relative to the standard asymptotic covariance matrix when the joint distribution of the dependent and explanatory variables is in the class of elliptically symmetric distributions. An empirical example using financial data is presented. Numerical examples and simulation experiments are given to illustrate the difference between the two asymptotic covariance matrices.

10.
The Hausman test is widely used to examine the endogeneity of explanatory variables in a regression model. To derive a well-defined asymptotic distribution of the Hausman test, the correlation between the instrumental variables and the error term needs to converge to zero. However, considerable correlation between the instruments and the error may remain in finite samples, even though it eventually converges to zero. This article investigates the potential problem that such “pseudo-exogenous” instruments may create. We show through Monte Carlo simulations that the performance of the Hausman test deteriorates when the instruments are asymptotically exogenous but endogenous in finite samples.

11.
In this paper, under the assumption of a linear relationship between two variables, we provide an alternative, simple method of proving the existing result connecting the correlation coefficient with the skewness coefficients of the response and explanatory variables. We further give a relationship between the correlation coefficient and the coefficients of kurtosis of the response and explanatory variables, again assuming a linear relationship between the two variables. A simple alternative way of deriving the formula, which helps in finding the direction of dependence in linear regression, is discussed.

12.
Score test of homogeneity for survival data   Total citations: 3, self-citations: 0, citations by others: 3
If follow-up is made for subjects who are grouped into units, such as familial or spatial units, then it may be interesting to test whether the groups are homogeneous (or independent for given explanatory variables). The effect of the groups is modelled as random and we consider a frailty proportional hazards model which allows adjustment for explanatory variables. We derive the score test of homogeneity from the marginal partial likelihood and it turns out to be the sum of a pairwise correlation term of martingale residuals and an overdispersion term. In the particular case where the sizes of the groups are equal to one, this statistic can be used for testing overdispersion. The asymptotic variance of this statistic is derived using counting process arguments. An extension to the case of several strata is given. The resulting test is computationally simple; its use is illustrated using both simulated and real data. In addition, a decomposition of the score statistic is proposed as a sum of a pairwise correlation term and an overdispersion term. The pairwise correlation term can be used for constructing a statistic more robust to departure from the proportional hazards model, and the overdispersion term for constructing a test of fit of the proportional hazards model.

13.
Most of the available literature on accelerated life testing deals with tests that use only one accelerating variable and no other explanatory variables. Frequently, however, there is a need to use more than one accelerating or other experimental variable. Examples include a test of capacitors at higher than usual levels of temperature and voltage, and a test of circuit boards at higher than usual levels of temperature, humidity, and voltage. M-step, step-stress models are extended to include k stress variables. Optimum M-step, step-stress designs with k stress variables are found. The polynomial model is considered as a special case, and a lack-of-fit test is discussed. A goodness-of-fit test is also proposed, and the appropriateness of using its asymptotic chi-square distribution for small samples is shown.

14.
Consider a vector-valued response variable related to a vector-valued explanatory variable through a normal multivariate linear model. The multivariate calibration problem deals with statistical inference on unknown values of the explanatory variable. The problem addressed is the construction of joint confidence regions for several unknown values of the explanatory variable. The problem is investigated when the variance-covariance matrix is a scalar multiple of the identity matrix and also when it is a completely unknown positive definite matrix. The problem is solved in only two cases: (i) the response and explanatory variables have the same dimensions, and (ii) the explanatory variable is a scalar. In the former case, exact joint confidence regions are derived based on a natural pivot statistic. In the latter case, the joint confidence regions are only conservative. Computational aspects and the practical implementation of the confidence regions are discussed and illustrated using an example.

15.
We propose a method of comparing two functional linear models in which explanatory variables are functions (curves) and responses can be either scalars or functions. In such models, the role of parameter vectors (or matrices) is played by integral operators acting on a function space. We test the null hypothesis that these operators are the same in two independent samples. The complexity of the test statistics increases as we move from scalar to functional responses and relax assumptions on the covariance structure of the regressors. They all, however, have an asymptotic chi-squared distribution with a number of degrees of freedom that depends on the specific setting. The test statistics are readily computable using the R package fda, and have good finite sample properties. The test is applied to egg-laying curves of Mediterranean flies and to data from terrestrial magnetic observatories. The Canadian Journal of Statistics © 2009 Statistical Society of Canada

16.
Ridge regression addresses multicollinearity problems by introducing a biasing parameter called the ridge parameter; it shrinks the estimates and their standard errors in order to reach acceptable results. The ridge parameter has been selected using several subjective and objective techniques, each concerned with certain criteria. In this study, selection of the ridge parameter depends on other important statistical measures to reach a better value of the ridge parameter. The proposed ridge parameter selection technique depends on a mathematical programming model, and the results are evaluated using a simulation study. The performance of the proposed method is good when the error variance is greater than or equal to one, the sample consists of 20 observations, the number of explanatory variables in the model is 2, and there is a very strong correlation between the two explanatory variables.
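
The shrinkage effect of the ridge parameter can be seen in a two-variable sketch. This illustrates ordinary ridge regression with a fixed, hand-picked parameter `k`; it does not implement the mathematical programming selection technique the abstract proposes. The data are made up, already centered, and strongly correlated by construction.

```python
def ridge_2var(x1, x2, y, k):
    """Ridge estimate (X'X + kI)^{-1} X'y for two centered regressors; k = 0 gives OLS."""
    s11 = sum(a * a for a in x1) + k
    s22 = sum(b * b for b in x2) + k
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)

# Hypothetical centered data with two nearly collinear regressors.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-1.9, -1.1, 0.1, 0.9, 2.0]
y = [-4.1, -1.9, 0.2, 1.8, 4.0]

b_ols = ridge_2var(x1, x2, y, k=0.0)
b_ridge = ridge_2var(x1, x2, y, k=1.0)
```

With `k = 1.0` the coefficient vector is shorter than the least squares one, which is the bias the method trades for stability.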

17.
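Dropping the intercept term and omitting explanatory variables are two common errors in estimating linear regression models. The intercept is typically dropped because it is found to be insignificant during testing, and removing it distorts both parameter estimation and hypothesis testing. Explanatory variables are omitted because people mistakenly believe that regression analysis is justified whenever variables are correlated and causally linked, and so fail to consider other important explanatory variables; a model built this way cannot be used for economic structural analysis or policy evaluation, and at most can serve for prediction.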
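
The distortion caused by dropping the intercept can be seen in a tiny noise-free example. The sketch below uses made-up data generated exactly from y = 2 + 3x and compares the slope estimated with and without an intercept; the function names are hypothetical.

```python
def ols_with_intercept(x, y):
    """Simple linear regression; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

def ols_through_origin(x, y):
    """Regression forced through the origin; returns the slope only."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

x = [1, 2, 3, 4]
y = [2 + 3 * a for a in x]  # true intercept 2, true slope 3

intercept, slope = ols_with_intercept(x, y)
slope_no_intercept = ols_through_origin(x, y)
```

Here the full model recovers the slope of 3, while the no-intercept fit gives 110/30, roughly 3.67, because the slope has to absorb the missing constant.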

18.
The so-called “fixed effects” approach to the estimation of panel data models suffers from the limitation that it is not possible to estimate the coefficients on explanatory variables that are time-invariant. This is in contrast to a “random effects” approach, which achieves this by making much stronger assumptions on the relationship between the explanatory variables and the individual-specific effect. In a linear model, it is possible to obtain the best of both worlds by making random effects-type assumptions on the time-invariant explanatory variables while maintaining the flexibility of a fixed effects approach when it comes to the time-varying covariates. This article attempts to do the same for some popular nonlinear models.
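
Why fixed effects cannot estimate coefficients on time-invariant variables is easy to see from the within (demeaning) transformation used in the linear case. The sketch below applies it to a made-up two-person panel; the function name and the data are hypothetical.

```python
from collections import defaultdict

def within_transform(ids, values):
    """Subtract each individual's own mean from that individual's observations."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i, v in zip(ids, values):
        sums[i] += v
        counts[i] += 1
    return [v - sums[i] / counts[i] for i, v in zip(ids, values)]

ids = [1, 1, 1, 2, 2, 2]             # two individuals, three periods each
time_invariant = [0, 0, 0, 1, 1, 1]  # e.g. a fixed personal characteristic
time_varying = [1, 2, 3, 2, 4, 6]

demeaned_invariant = within_transform(ids, time_invariant)
demeaned_varying = within_transform(ids, time_varying)
```

The time-invariant column becomes all zeros after demeaning, so it carries no variation from which a coefficient could be estimated, while the time-varying column keeps its within-person variation.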

19.
Dependence in outcome variables may pose a formidable difficulty in analyzing data from longitudinal studies. In the past, most studies attempted to address this problem using marginal models. However, using marginal models alone, it is difficult to specify the measures of dependence in outcomes due to association between outcomes as well as between outcomes and explanatory variables. In this paper, a generalized approach is demonstrated using both conditional and marginal models. This model uses link functions to test for dependence in outcome variables. The estimation and test procedures are illustrated with an application to the mobility index data from the Health and Retirement Survey, and simulations are performed for correlated binary data generated from bivariate Bernoulli distributions. The results indicate the usefulness of the proposed method.

20.
Suppose that the conditional density of a response variable given a vector of explanatory variables is parametrically modelled, and that data are collected by a two-phase sampling design. First, a simple random sample is drawn from the population. The stratum membership in a finite number of strata of the response and explanatory variables is recorded for each unit. Second, a subsample is drawn from the phase-one sample such that the selection probability is determined by the stratum membership. The response and explanatory variables are fully measured at this phase. We synthesize existing results on nonparametric likelihood estimation and present a streamlined approach for the computation and the large sample theory of profile likelihood in four different situations. The amount of information in terms of data and assumptions varies depending on whether the phase-one data are retained, the selection probabilities are known, and/or the stratum probabilities are known. We establish and illustrate numerically the order of efficiency among the maximum likelihood estimators, according to the amount of information utilized, in the four situations.
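
The two-phase design described above can be sketched as a short simulation. This covers only the sampling steps, not the likelihood estimation; the population, the stratum rule (response above 2.0), and the selection probabilities are all hypothetical choices made for illustration.

```python
import random

def two_phase_sample(population, stratum_of, n1, select_prob, seed=0):
    """Phase one: simple random sample of size n1, recording strata only.
    Phase two: keep each unit with a probability set by its stratum."""
    rng = random.Random(seed)
    phase_one = rng.sample(population, n1)
    strata = [stratum_of(u) for u in phase_one]
    phase_two = [u for u, s in zip(phase_one, strata)
                 if rng.random() < select_prob[s]]
    return phase_one, strata, phase_two

# Hypothetical population of (x, y) pairs.
data_rng = random.Random(42)
population = []
for _ in range(1000):
    x = data_rng.gauss(0, 1)
    y = 1.0 + 0.5 * x + data_rng.gauss(0, 1)
    population.append((x, y))

def stratum_of(unit):
    """Assumed rule: stratum 1 if the response exceeds 2.0, else stratum 0."""
    x, y = unit
    return int(y > 2.0)

select_prob = {0: 0.2, 1: 1.0}  # oversample the rarer high-response stratum
phase_one, strata, phase_two = two_phase_sample(
    population, stratum_of, 300, select_prob)
```

In this sketch every phase-one unit in stratum 1 is fully measured, while only about a fifth of stratum 0 is, which is the kind of outcome-dependent subsampling the estimators in the abstract must account for.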
