Similar Articles
20 similar articles found (search time: 46 ms)
1.
Dynamic regression models are widely used because they express and model the behaviour of a system over time. In this article, two dynamic regression models, the distributed lag (DL) model and the autoregressive distributed lag model, are evaluated with a focus on their lag lengths. From a classical statistics point of view, there are various methods for determining the number of lags, but none of them is best in all situations. This is a serious issue, since a wrong choice yields poor estimates of the effects of the regressors on the response variable. We present an alternative to these methods by taking a Bayesian approach. The posterior distributions of the numbers of lags are derived under an improper prior for the model parameters. The fractional Bayes factor technique [A. O'Hagan, Fractional Bayes factors for model comparison (with discussion), J. R. Statist. Soc. B 57 (1995), pp. 99–138] is used to handle the indeterminacy in the likelihood function caused by the improper prior, and the zero-one loss function is used to penalize wrong decisions. A naive method using the specified maximum number of DLs is also presented. Both the proposed and the naive methods are verified on simulated data, with promising results for the proposed method. An illustrative example with a real data set is provided.
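
The paper's fractional Bayes factor machinery is more involved than can be shown here, but the basic object is easy to sketch: a distributed lag design matrix enumerated over candidate lag lengths. A minimal illustration, assuming Gaussian errors and using BIC as a crude stand-in for the posterior model probabilities the authors derive:

```python
import numpy as np

def dl_design(y, x, q):
    """Build the design matrix of a distributed lag model of order q:
    y_t = a + b_0 x_t + ... + b_q x_{t-q} + e_t."""
    T = len(y)
    lags = np.column_stack([x[q - j : T - j] for j in range(q + 1)])
    return y[q:], np.column_stack([np.ones(T - q), lags])

def bic_for_lag(y, x, q):
    """BIC of the OLS fit of the DL(q) model (stand-in selection criterion)."""
    yq, X = dl_design(y, x, q)
    beta, *_ = np.linalg.lstsq(X, yq, rcond=None)
    resid = yq - X @ beta
    n, k = X.shape
    return n * np.log(resid @ resid / n) + k * np.log(n)

rng = np.random.default_rng(0)
x_full = rng.normal(size=301)
y = 1.0 + 0.8 * x_full[1:] + 0.4 * x_full[:-1] + rng.normal(scale=0.5, size=300)
x = x_full[1:]  # aligned so the true model is y_t = 1 + 0.8 x_t + 0.4 x_{t-1}

scores = {q: bic_for_lag(y, x, q) for q in range(6)}
print(min(scores, key=scores.get))  # expected: 1
```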

2.
This paper introduces a Bayesian robust errors-in-variables regression model in which the dependent variable is censored. We extend previous work by assuming a multivariate t distribution for jointly modelling the behaviour of the errors and the latent explanatory variable. Inference is done under the Bayesian paradigm: we use a data augmentation approach and develop a Markov chain Monte Carlo algorithm to sample from the posterior distributions. We run a Monte Carlo study to evaluate the efficiency of the posterior estimators in different settings, and we compare the proposed model with three other models previously discussed in the literature. As a by-product we also provide a Bayesian analysis of the t-tobit model. We fit all four models to the 2001 Medical Expenditure Panel Survey data.
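
Data augmentation for t-distributed errors typically rests on the standard scale-mixture identity: if w ~ Gamma(nu/2, rate nu/2) and x | w ~ N(mu, Sigma/w), then marginally x follows a multivariate t with nu degrees of freedom. A quick Monte Carlo check of that identity (not the paper's full sampler):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
nu, mu, Sigma = 5.0, np.zeros(2), np.array([[1.0, 0.5], [0.5, 2.0]])

# Draw via the augmentation: w ~ Gamma(nu/2, rate=nu/2), x | w ~ N(mu, Sigma/w)
w = rng.gamma(shape=nu / 2, scale=2 / nu, size=100_000)
z = rng.multivariate_normal(np.zeros(2), Sigma, size=100_000)
x = mu + z / np.sqrt(w)[:, None]

# The first margin should match a univariate t with the same df and scale
t_marginal = stats.t(df=nu, loc=mu[0], scale=np.sqrt(Sigma[0, 0]))
print(np.quantile(x[:, 0], [0.1, 0.5, 0.9]))
print(t_marginal.ppf([0.1, 0.5, 0.9]))
```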

3.
A Bayesian method is proposed for estimating a time-varying regression model subject to structural breaks. Heteroskedastic dynamics, via both GARCH and stochastic volatility specifications, and an autoregressive factor, itself subject to breaks, are added to generalize the standard return prediction model, so that the relationship and how it changes over time can be estimated and examined efficiently. A Bayesian computational method is employed to identify the locations of the structural breaks and to carry out estimation and inference, simultaneously accounting for heteroskedasticity and autocorrelation. The proposed methods are illustrated on simulated data. An empirical study of the Taiwan and Hong Kong stock markets, using oil and gas price returns as a state variable, then provides strong support for oil prices being an important explanatory variable for stock returns.

4.
In this article, to reduce the computational load of Bayesian variable selection, we use a variant of reversible jump Markov chain Monte Carlo methods, the Holmes and Held (HH) algorithm, to sample model index variables in logistic mixed models involving a large number of explanatory variables. We also propose a simple proposal distribution for the model index variables, and use a simulation study and a real example to compare the performance of the HH algorithm under our proposed and existing proposal distributions. The results show that the HH algorithm with the proposed proposal distribution is a computationally efficient and reliable selection method.
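
The abstract does not spell out the proposal, so the sketch below shows only the generic building block such samplers use: a symmetric single-flip proposal on a binary inclusion vector inside a Metropolis step. The names and the toy target are illustrative, not the authors' construction.

```python
import numpy as np

def propose_flip(gamma, rng):
    """Propose a new model by flipping one inclusion indicator,
    chosen uniformly at random (a symmetric proposal)."""
    j = rng.integers(len(gamma))
    new = gamma.copy()
    new[j] = 1 - new[j]
    return new

def mh_model_search(log_post, p, n_iter=5000, seed=0):
    """Generic Metropolis search over {0,1}^p model indices.
    log_post(gamma) must return the (unnormalised) log posterior."""
    rng = np.random.default_rng(seed)
    gamma = rng.integers(0, 2, size=p)
    lp = log_post(gamma)
    for _ in range(n_iter):
        cand = propose_flip(gamma, rng)
        lp_cand = log_post(cand)
        if np.log(rng.random()) < lp_cand - lp:  # symmetric proposal cancels
            gamma, lp = cand, lp_cand
    return gamma

# Toy target: strongly prefers the model containing exactly the first two variables
toy = lambda g: -5.0 * np.sum(np.abs(g - np.r_[1, 1, np.zeros(3)]))
print(mh_model_search(toy, p=5))  # expected: [1 1 0 0 0]
```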

5.
We investigated CART performance with a unimodal response curve for one continuous response and four continuous explanatory variables, where two of the variables were important (i.e. directly related to the response) and the other two were not. We explored performance under three relationship strengths and two explanatory-variable conditions: equal importance, and one variable four times as important as the other. We compared CART variable selection performance under three tree-selection rules ('minimum risk', 'minimum risk-complexity', 'one standard error') with stepwise polynomial ordinary least squares (OLS) under four sample size conditions. The one-standard-error and minimum-risk-complexity rules performed about as well as stepwise OLS with large sample sizes when the relationship was strong. With weaker relationships, equally important explanatory variables and larger sample sizes, the one-standard-error and minimum-risk-complexity rules performed better than stepwise OLS. With weaker relationships and explanatory variables of unequal importance, the tree-structured methods did not perform as well as stepwise OLS. Comparing the tree-structured methods with one another: with a strong relationship and equally important explanatory variables, the one-standard-error rule was the most likely to choose the correct model, whereas the minimum-risk-complexity rule was the most likely to choose the correct model (1) with weaker relationships and equally important explanatory variables, and (2) under all relationship strengths when the explanatory variables were of unequal importance and sample sizes were lower.
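
The one-standard-error rule is easy to state concretely: among the candidate subtrees in the pruning sequence, take the smallest one whose cross-validated risk is within one standard error of the minimum. A minimal sketch on a hypothetical cross-validation table (not tied to any particular package):

```python
import numpy as np

def one_se_rule(n_leaves, cv_risk, cv_se):
    """Pick the simplest subtree whose CV risk is within one
    standard error of the best subtree's risk."""
    n_leaves, cv_risk, cv_se = map(np.asarray, (n_leaves, cv_risk, cv_se))
    best = np.argmin(cv_risk)
    threshold = cv_risk[best] + cv_se[best]
    return n_leaves[cv_risk <= threshold].min()  # smallest eligible tree

# Hypothetical pruning sequence: risk keeps dropping slightly, SE ~ 0.02
leaves = [1, 2, 3, 5, 8, 13]
risk   = [0.90, 0.55, 0.41, 0.40, 0.39, 0.39]
se     = [0.02, 0.02, 0.02, 0.02, 0.02, 0.02]
print(one_se_rule(leaves, risk, se))  # -> 3: within 1 SE of the 0.39 minimum
```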

6.
Bayesian model building techniques are developed for data with a strong time series structure and possibly exogenous explanatory variables that have strong explanatory and predictive power. The emphasis is on determining whether, when the data have a strong time series structure, there are explanatory variables that should also be included in the model. We use a time series model that is linear in past observations and that can capture stochastic and deterministic trend, seasonality and serial correlation. We propose plotting absolute predictive error against predictive standard deviation; a series of such plots is used to determine which of several nested and non-nested models is optimal in terms of minimizing the dispersion of the predictive distribution and restricting predictive outliers. We apply the techniques to modelling monthly counts of fatal road crashes in Australia, where economic, consumption and weather variables are available, and find that three such variables should be included in addition to the time series filter. The approach leads to graphical techniques for determining the strength of the relationships between the dependent variable and the covariates, detecting model inadequacy, and deriving useful numerical summaries.
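
The diagnostic plot itself is simple to reproduce: for each out-of-sample point, plot the absolute one-step predictive error against the predictive standard deviation; points far above a reference line (here 2 SDs, an illustrative choice) flag predictive outliers. A generic sketch, where the predictive means and SDs would come from whichever competing model is being assessed:

```python
import numpy as np
import matplotlib.pyplot as plt

def predictive_diagnostic_plot(y_obs, pred_mean, pred_sd, label):
    """Absolute predictive error vs predictive SD; a good model keeps
    points low (small dispersion) and below the outlier reference line."""
    abs_err = np.abs(np.asarray(y_obs) - np.asarray(pred_mean))
    plt.scatter(pred_sd, abs_err, s=12, label=label)
    grid = np.linspace(0, float(np.max(pred_sd)), 50)
    plt.plot(grid, 2 * grid, "k--", label="2 SD line")
    plt.xlabel("predictive standard deviation")
    plt.ylabel("|predictive error|")
    plt.legend()

# Fake predictive output for illustration only
rng = np.random.default_rng(2)
sd = rng.uniform(0.5, 2.0, size=100)
y = rng.normal(0.0, sd)  # errors scale with predictive SD, as they should
predictive_diagnostic_plot(y, np.zeros(100), sd, "model A")
plt.show()
```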

7.
Consider the usual linear regression model consisting of two or more explanatory variables. There are many methods aimed at indicating the relative importance of the explanatory variables. But in general these methods do not address a fundamental issue: when all of the explanatory variables are included in the model, how strong is the empirical evidence that the first explanatory variable is more or less important than the second explanatory variable? How strong is the empirical evidence that the first two explanatory variables are more important than the third explanatory variable? The paper suggests a robust method for dealing with these issues. The proposed technique is based on a particular version of explanatory power used in conjunction with a modification of the basic percentile method.
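
To make the question concrete, here is a plain (non-robust) analogue of the idea: measure each variable's explanatory power by the drop in fit when it is removed from the full model, and put a percentile bootstrap interval on the difference between two variables. The paper replaces both ingredients with robust versions; this sketch only illustrates the logic.

```python
import numpy as np

def r2(X, y):
    """R^2 of an OLS fit with intercept."""
    Z = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return 1 - np.sum((y - Z @ beta) ** 2) / np.sum((y - y.mean()) ** 2)

def importance_gap(X, y, j, k):
    """Difference in explanatory power of variables j and k:
    drop in R^2 when each is deleted from the full model."""
    full = r2(X, y)
    drop_j = full - r2(np.delete(X, j, axis=1), y)
    drop_k = full - r2(np.delete(X, k, axis=1), y)
    return drop_j - drop_k

rng = np.random.default_rng(3)
n = 200
X = rng.normal(size=(n, 3))
y = 1.0 * X[:, 0] + 0.4 * X[:, 1] + rng.normal(size=n)

# Percentile bootstrap CI for the gap between variables 0 and 1
gaps = []
for _ in range(1000):
    idx = rng.integers(n, size=n)
    gaps.append(importance_gap(X[idx], y[idx], 0, 1))
print(np.percentile(gaps, [2.5, 97.5]))  # excludes 0 -> evidence var 0 matters more
```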

8.
In the causal analysis of survival data, a time-based response is related to a set of explanatory variables. Defining the relation between the time and the covariates may become a difficult task, particularly in the preliminary stage, when information is limited. Through a nonparametric approach, we propose to estimate the survival function in a way that allows the relative importance of each potential explanatory variable to be evaluated in a simple, interpretable fashion. To achieve this aim, each of the explanatory variables is used to partition the observed survival times, and the observations are assumed to be partially exchangeable according to this partition. We then consider, conditionally on each partition, a hierarchical nonparametric Bayesian model on the hazard functions. We define and compare different prior distributions for the hazard functions.

9.
In this contribution we aim at improving ordinal variable selection in the context of causal models for credit risk estimation. We propose an approach that provides a formal inferential tool for comparing the explanatory power of each covariate and, therefore, for selecting an effective model for classification purposes. The proposed model is Bayesian nonparametric, and thus keeps the amount of model specification to a minimum. We consider the case in which information from the covariates is at the ordinal level. A notable instance is the situation in which ordinal variables result from rankings of companies evaluated according to different macro- and microeconomic aspects, leading to ordinal covariates corresponding to various ratings that entail different magnitudes of the probability of default. For each given covariate, we suggest partitioning the statistical units into as many groups as the number of observed levels of the covariate. We then assume individual defaults to be homogeneous within each group and heterogeneous across groups. Our aim is to compare, and thereby select among, the partition structures induced by the different explanatory covariates. The metric we choose for variable comparison is the posterior probability of each partition. An application to a European credit risk database shows that the proposal performs well, leading to a coherent and clear method for averaging the estimated default probabilities across variables.
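
The scoring idea can be sketched with a simple conjugate stand-in: treat defaults as Bernoulli, exchangeable within each level of an ordinal covariate, with independent Beta(a, b) priors per group. The log marginal likelihood of the partition induced by each covariate then has a closed form, and covariates can be ranked by it. The paper's nonparametric prior is richer; this only illustrates the comparison metric.

```python
import numpy as np
from scipy.special import betaln

def log_marginal_partition(defaults, levels, a=1.0, b=1.0):
    """Log marginal likelihood of Bernoulli defaults under a
    Beta(a, b)-Bernoulli model, independent across covariate levels."""
    total = 0.0
    for lev in np.unique(levels):
        d = defaults[levels == lev]
        s, n = d.sum(), len(d)
        total += betaln(a + s, b + n - s) - betaln(a, b)
    return total

rng = np.random.default_rng(4)
n = 500
rating = rng.integers(0, 4, size=n)   # informative ordinal covariate
noise  = rng.integers(0, 4, size=n)   # uninformative one
p_default = np.array([0.02, 0.05, 0.12, 0.30])[rating]
defaults = (rng.random(n) < p_default).astype(int)

for name, cov in [("rating", rating), ("noise", noise)]:
    print(name, log_marginal_partition(defaults, cov))
# The informative covariate should receive the higher score
```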

10.
Generalized linear models are used to describe the dependence of data on explanatory variables when the binary outcome is subject to misclassification. Both probit and t-link regressions for misclassified binary data are proposed under a Bayesian methodology. Computational difficulties are avoided by data augmentation: a framework with two types of latent variables is exploited to derive efficient Gibbs sampling and expectation–maximization algorithms. Moreover, this formulation yields the probit model as a particular case of the t-link model. Simulation examples illustrate the model's performance compared with standard methods that do not account for misclassification. To show the potential of the proposed approaches, a real data problem arising in the study of hearing loss caused by exposure to occupational noise is analysed.
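
The backbone of such samplers is the classic probit data augmentation of Albert and Chib (1993): latent z_i ~ N(x_i'beta, 1) with y_i = 1{z_i > 0}, so z is drawn from truncated normals and beta from its conditional normal. The sketch below implements only that backbone with a flat prior; the misclassification layer (the second type of latent variable) is omitted.

```python
import numpy as np
from scipy.stats import truncnorm

def probit_gibbs(X, y, n_iter=2000, seed=0):
    """Albert-Chib Gibbs sampler for probit regression, flat prior on beta."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    chol = np.linalg.cholesky(XtX_inv)
    beta = np.zeros(p)
    draws = np.empty((n_iter, p))
    for it in range(n_iter):
        mu = X @ beta
        # z_i is N(mu_i, 1) truncated to (0, inf) if y_i = 1, to (-inf, 0) if y_i = 0;
        # bounds below are expressed in standard-normal units around mu
        lo = np.where(y == 1, -mu, -np.inf)
        hi = np.where(y == 1, np.inf, -mu)
        z = mu + truncnorm.rvs(lo, hi, size=n, random_state=rng)
        # beta | z ~ N((X'X)^{-1} X'z, (X'X)^{-1}) under the flat prior
        beta = XtX_inv @ (X.T @ z) + chol @ rng.normal(size=p)
        draws[it] = beta
    return draws

rng = np.random.default_rng(5)
X = np.column_stack([np.ones(400), rng.normal(size=400)])
y = (X @ np.array([-0.3, 1.0]) + rng.normal(size=400) > 0).astype(int)
print(probit_gibbs(X, y)[1000:].mean(axis=0))  # should be near (-0.3, 1.0)
```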

11.
In this paper, we propose a methodology for analyzing longitudinal data through distances between pairs of observations (or individuals) with respect to the explanatory variables used to fit continuous response variables. Restricted maximum likelihood and generalized least squares are used to estimate the parameters of the model. We applied the new approach to study the effect of gender and exposure on deviant behaviour with respect to tolerance for a group of youths followed over a period of 5 years. We also performed simulations comparing our distance-based method with classical longitudinal analysis under both AR(1) and compound symmetry correlation structures, judged by the Akaike and Bayesian information criteria and by the relative efficiency of the generalized variance of the errors of each model. We found small gains in fit for the proposed model over the classical methodology, particularly in small samples, regardless of the variance, correlation and autocorrelation structure and the number of time measurements.

12.
Semiparametric models provide a more flexible form for the relationship between the response and the explanatory variables. In the literature on modelling missing variables, by contrast, a canonical form of the probability that a variable is missing (p) is typically modelled by a fully parametric approach. Here we consider a regression-spline-based semiparametric approach to modelling the missingness mechanism of nonignorably missing covariates, in which the relationship between a suitable canonical form of p (e.g. probit p) and the missing covariate is modelled through several splines. A Bayesian procedure is developed to estimate the parameters efficiently, with a computationally advantageous prior construction proposed for the parameters of the semiparametric part. A WinBUGS implementation applies Gibbs sampling to obtain the posterior distributions. We show through an extensive Monte Carlo simulation experiment that the response model coefficient estimators maintain better (when the true missingness mechanism is nonlinear) or equivalent (when it is linear) bias and efficiency properties under the proposed semiparametric missingness model than under the conventional model.

13.
Approximate Bayesian computation (ABC) methods permit approximate inference for intractable likelihoods when it is possible to simulate from the model. However, they perform poorly for high-dimensional data and in practice must usually be used in conjunction with dimension reduction methods, resulting in a loss of accuracy which is hard to quantify or control. We propose a new ABC method for high-dimensional data based on rare event methods which we refer to as RE-ABC. This uses a latent variable representation of the model. For a given parameter value, we estimate the probability of the rare event that the latent variables correspond to data roughly consistent with the observations. This is performed using sequential Monte Carlo and slice sampling to systematically search the space of latent variables. In contrast, standard ABC can be viewed as using a more naive Monte Carlo estimate. We use our rare event probability estimator as a likelihood estimate within the pseudo-marginal Metropolis–Hastings algorithm for parameter inference. We provide asymptotics showing that RE-ABC has a lower computational cost for high-dimensional data than standard ABC methods. We also illustrate our approach empirically, on a Gaussian distribution and an application in infectious disease modelling.
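
The outer loop of RE-ABC is a pseudo-marginal Metropolis–Hastings: wherever an exact likelihood would appear in the acceptance ratio, a non-negative unbiased estimate (here, the rare-event probability estimator) is plugged in instead. A generic skeleton with the estimator left as a user-supplied function; the toy target below cheats by using an exact log-likelihood purely to exercise the code.

```python
import numpy as np

def pseudo_marginal_mh(log_lik_hat, log_prior, theta0, step, n_iter=5000, seed=0):
    """Metropolis-Hastings with an estimated log-likelihood. Crucially,
    the estimate for the *current* state is recycled, not recomputed --
    this is what makes the chain target the exact posterior."""
    rng = np.random.default_rng(seed)
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    ll = log_lik_hat(theta, rng)
    chain = np.empty((n_iter, len(theta)))
    for it in range(n_iter):
        cand = theta + step * rng.normal(size=len(theta))
        ll_cand = log_lik_hat(cand, rng)
        log_alpha = ll_cand + log_prior(cand) - ll - log_prior(theta)
        if np.log(rng.random()) < log_alpha:
            theta, ll = cand, ll_cand
        chain[it] = theta
    return chain

data = np.random.default_rng(6).normal(1.5, 1.0, size=50)
loglik = lambda th, rng: -0.5 * np.sum((data - th[0]) ** 2)  # exact, for the demo
logprior = lambda th: -0.5 * th[0] ** 2 / 100                # vague normal prior
print(pseudo_marginal_mh(loglik, logprior, [0.0], 0.3)[2000:].mean())  # near 1.5
```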

14.
An alternative graphical method, called the SSR plot, is proposed for use with a multiple regression model. The method uses the fact that the sum of squares for regression (SSR) of two explanatory variables can be partitioned into the SSR of one variable and the increment in SSR due to the addition of the second variable. The SSR plot represents each explanatory variable as a vector in a half circle; explanatory variables whose vectors lie closer to the horizontal axis have stronger effects on the response variable. Furthermore, for a regression model with two explanatory variables, the magnitude of the angle between the two vectors can be used to identify suppression.
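
The decomposition behind the plot is elementary to compute: SSR(X1, X2) = SSR(X1) + SSR(X2 | X1), where the increment is the extra sum of squares from adding X2. A minimal sketch of the decomposition (the half-circle vector rendering is left aside); suppression shows up when the increment SSR(X2 | X1) exceeds SSR(X2) alone.

```python
import numpy as np

def ssr(y, *cols):
    """Regression sum of squares of an OLS fit with intercept."""
    X = np.column_stack([np.ones(len(y))] + list(cols))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fit = X @ beta
    return np.sum((fit - y.mean()) ** 2)

rng = np.random.default_rng(7)
n = 300
x1 = rng.normal(size=n)
x2 = 0.7 * x1 + rng.normal(scale=0.5, size=n)   # correlated regressors
y = 1.0 * x1 + 0.2 * x2 + rng.normal(size=n)

ssr1, ssr2, ssr12 = ssr(y, x1), ssr(y, x2), ssr(y, x1, x2)
print("SSR(X1)      :", round(ssr1, 1))
print("SSR(X2|X1)   :", round(ssr12 - ssr1, 1))  # increment from adding X2
print("SSR(X2) alone:", round(ssr2, 1))          # compare with the increment
```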

15.
We propose a general latent variable model for multivariate ordinal categorical variables, in which both the responses and the covariates are ordinal, to assess the effect of the covariates on the responses and to model the covariance structure of the response variables. A fully Bayesian approach is employed to analyze the model. The Gibbs sampler is used to simulate the joint posterior distribution of the latent variables and the parameters, and parameter expansion and reparameterization techniques are used to speed up convergence. The proposed model and method are demonstrated by simulation studies and a real data example.

16.
In many studies a large number of variables is measured, and identifying the relevant variables that influence an outcome is an important task. Several procedures are available for variable selection; however, focusing on a single model neglects the fact that other, equally appropriate models usually exist. Bayesian and frequentist model averaging approaches have been proposed to improve the development of a predictor. With a larger number of variables (say, more than ten) the resulting class of models can be very large. For Bayesian model averaging, Occam's window is a popular approach to reducing the model space. As this approach may not eliminate any variables, a variable screening step was proposed for a frequentist model averaging procedure: based on the results of models selected in bootstrap samples, variables are eliminated before deriving a model averaging predictor. Backward elimination can be used as a simple alternative screening procedure. Through two examples and by means of simulation, we investigate properties of the screening step. In the simulation study we consider situations with 15 and 25 variables, respectively, of which seven influence the outcome. The screening step eliminates most of the uninfluential variables, but also some variables with a weak effect. Variable screening leads to more applicable models without eliminating models that are more strongly supported by the data. We also give recommendations for the important parameters of the screening step.
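
The screening step can be sketched generically: run a selection procedure in each bootstrap sample and keep only variables whose inclusion frequency clears a threshold. Below, greedy backward elimination on t-statistics serves as a simple stand-in for the selection procedure, and the 50% cut-off is an illustrative choice, not the authors' setting.

```python
import numpy as np

def backward_eliminate(X, y, t_crit=2.0):
    """Greedy backward elimination: drop the variable with the smallest
    |t| until all remaining |t| values exceed t_crit. Returns kept indices."""
    keep = list(range(X.shape[1]))
    while keep:
        Z = np.column_stack([np.ones(len(y)), X[:, keep]])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        sigma2 = resid @ resid / (len(y) - Z.shape[1])
        se = np.sqrt(sigma2 * np.diag(np.linalg.inv(Z.T @ Z)))[1:]
        t = np.abs(beta[1:]) / se
        if t.min() > t_crit:
            break
        keep.pop(int(np.argmin(t)))
    return keep

rng = np.random.default_rng(8)
n, p = 150, 15
X = rng.normal(size=(n, p))
y = X[:, :3] @ np.array([1.0, 0.7, 0.5]) + rng.normal(size=n)

freq = np.zeros(p)
for _ in range(200):                    # bootstrap resamples
    idx = rng.integers(n, size=n)
    for j in backward_eliminate(X[idx], y[idx]):
        freq[j] += 1
print(np.where(freq / 200 >= 0.5)[0])   # screened-in variables (expect 0, 1, 2)
```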

17.
This paper presents a Bayesian analysis of partially linear additive models for quantile regression. We develop a semiparametric Bayesian approach to quantile regression models using a spectral representation of the nonparametric regression functions and a Dirichlet process (DP) mixture for the error distribution. We also consider Bayesian variable selection procedures for both the parametric and nonparametric components of the partially linear additive model structure, based on Bayesian shrinkage priors and a stochastic search algorithm. Based on the proposed Bayesian semiparametric additive quantile regression model, referred to as BSAQ, Bayesian inference is considered for estimation and model selection. For the posterior computation, we design a simple and efficient Gibbs sampler based on a location-scale mixture of exponential and normal distributions for the asymmetric Laplace distribution, which accommodates the commonly used collapsed Gibbs sampling algorithms for DP mixture models. Additionally, we discuss the asymptotic properties of the semiparametric quantile regression model in terms of consistency of the posterior distribution. Simulation studies and real data examples illustrate the proposed method and compare it with Bayesian quantile regression methods in the literature.
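
The mixture representation behind such Gibbs samplers is checkable in a few lines. For quantile level p, if z ~ Exp(1) and u ~ N(0, 1), then xi*z + tau*sqrt(z)*u follows the asymmetric Laplace distribution with skewness p, where xi = (1-2p)/(p(1-p)) and tau^2 = 2/(p(1-p)); in particular its p-quantile is zero. A Monte Carlo check, assuming this standard (Kozumi-Kobayashi style) parameterisation:

```python
import numpy as np

def ald_mixture_draws(p, size, seed=0):
    """Draw ALD(p) errors via the exponential-normal mixture used to
    build Gibbs samplers for Bayesian quantile regression."""
    rng = np.random.default_rng(seed)
    xi = (1 - 2 * p) / (p * (1 - p))
    tau = np.sqrt(2 / (p * (1 - p)))
    z = rng.exponential(1.0, size)
    u = rng.normal(size=size)
    return xi * z + tau * np.sqrt(z) * u

for p in (0.25, 0.5, 0.9):
    eps = ald_mixture_draws(p, 200_000)
    # The p-quantile of an ALD(p) error is zero, so P(eps <= 0) should equal p
    print(p, round(np.mean(eps <= 0), 3))
```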

18.
Adaptive Lasso Quantile Regression for Panel Data (cited 1 time: 0 self-citations, 1 by others)
How to automatically select the important explanatory variables while estimating the parameters has long been one of the central questions in panel data quantile regression. We construct a Bayesian hierarchical quantile regression model with multiple random effects and, assuming a new conditional Laplace prior for the fixed-effect coefficients, derive a Gibbs sampling algorithm for estimating the model parameters. Because coefficients of explanatory variables of differing importance should be shrunk to differing degrees, the constructed prior is adaptive: it accurately and automatically selects the important explanatory variables in the model, and the slice Gibbs sampling algorithm we design quickly and effectively delivers the posterior mean estimates of all model parameters. Simulation results show that the new method outperforms the methods commonly used in the literature in both parameter estimation accuracy and variable selection accuracy. A panel data analysis of several macroeconomic indicators across Chinese regions demonstrates the new method's ability to estimate parameters and select variables.

19.
A novel fully Bayesian approach is proposed for modelling survival data with explanatory variables using the Piecewise Exponential Model (PEM) with a random time grid. We consider a class of correlated Gamma prior distributions for the failure rates; this prior specification is obtained via the dynamic generalized modelling approach jointly with a random time grid for the PEM. A product distribution is used to model the prior uncertainty about the random time grid, making it possible to use the structure of the Product Partition Model (PPM) to handle the problem. A unifying notation for the construction of the PEM likelihood function, suitable for both static and dynamic modelling approaches, is introduced, and procedures to evaluate the performance of the proposed model are provided. Two case studies exemplify the methodology. For comparison, the data sets are also fitted using the dynamic model with a fixed time grid established in the literature. The results show the superiority of the proposed model.
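
The likelihood construction rests on a standard computation: given a grid 0 = s_0 < s_1 < ... and one hazard rate per interval, each subject contributes its exposure time in every interval times that interval's rate, plus the log rate of the interval where its failure occurs. A minimal sketch for a fixed grid (the random-grid and correlated-prior layers of the paper are omitted):

```python
import numpy as np

def pem_loglik(times, events, grid, rates):
    """Piecewise exponential log-likelihood for a fixed time grid.
    `grid` holds the interior cut points; `rates` one hazard per interval."""
    edges = np.concatenate([[0.0], grid, [np.inf]])
    ll = 0.0
    for t, d in zip(times, events):
        j = np.searchsorted(edges, t, side="right") - 1          # interval of t
        exposure = np.clip(np.minimum(edges[1:], t) - edges[:-1], 0.0, None)
        ll += d * np.log(rates[j]) - np.sum(rates * exposure)
    return ll

rng = np.random.default_rng(9)
times = rng.exponential(2.0, size=200)   # true constant hazard 0.5
events = np.ones(200, dtype=int)         # no censoring in this toy example
grid = np.array([1.0, 2.5])              # three intervals
for rates in ([0.5, 0.5, 0.5], [0.2, 0.8, 1.5]):
    print(rates, round(pem_loglik(times, events, grid, np.array(rates)), 1))
# The constant-hazard rates should attain the higher log-likelihood
```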

20.
Joint damage in psoriatic arthritis can be measured by clinical and radiological methods, the former being applied more frequently during longitudinal follow-up of patients. Motivated by the need to compare findings based on different methods with different observation patterns, we consider longitudinal data where the outcome variable is a cumulative total of counts that can be unobserved when other, informative, explanatory variables are recorded. We demonstrate how to calculate the likelihood for such data when the increment in the cumulative total is assumed to follow a discrete distribution with a location parameter that depends on a linear function of explanatory variables, and we suggest an approach to incorporating informative observation. We present analyses based on an observational database from a psoriatic arthritis clinic. Although the new statistical methodology has relatively little effect in this example, simulation studies indicate that the method can provide substantial improvements in bias and coverage in some situations where there is an important time-varying explanatory variable.

