Similar Literature
Found 20 similar documents.
1.
Principal component regression uses principal components (PCs) as regressors. It is particularly useful in prediction settings with high-dimensional covariates. The existing literature on Bayesian approaches is relatively sparse. We introduce a Bayesian approach that is robust to outliers in both the dependent variable and the covariates. Outliers can be thought of as observations that are not in line with the general trend. The proposed approach automatically penalises these observations so that their impact on the posterior gradually vanishes as they move further and further away from the general trend, corresponding to a concept in Bayesian statistics called whole robustness. The predictions produced are thus consistent with the bulk of the data. The approach also exploits the geometry of PCs to efficiently identify those that are significant. Individual predictions obtained from the resulting models are consolidated according to model-averaging mechanisms to account for model uncertainty. The approach is evaluated on real data and compared to its nonrobust Bayesian counterpart, the traditional frequentist approach and a commonly employed robust frequentist method. Detailed guidelines to automate the entire statistical procedure are provided. All required code is made available; see arXiv:1711.06341.
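For orientation, a minimal sketch of classical (nonrobust, non-Bayesian) principal component regression in base R; the function name pcr_fit and the choice of k are illustrative, not from the paper:

    # Classical principal component regression: regress y on the first k PC scores.
    pcr_fit <- function(X, y, k) {
      pca <- prcomp(X, center = TRUE, scale. = TRUE)  # PCs of the covariates
      Z <- pca$x[, 1:k, drop = FALSE]                 # first k scores as regressors
      list(pca = pca, fit = lm(y ~ Z))                # ordinary least squares on the PCs
    }

    # Example with simulated data
    set.seed(1)
    X <- matrix(rnorm(200), nrow = 50)
    y <- X[, 1] + rnorm(50)
    m <- pcr_fit(X, y, k = 2)
    summary(m$fit)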

2.
Circular data are observations that are represented as points on a unit circle. Times of day and directions of wind are two such examples. In this work, we present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is useful especially when the likelihood surface is ill behaved. Markov chain Monte Carlo techniques are used to fit the proposed model and to generate predictions. The method is illustrated using an environmental data set.
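The abstract does not state the link function; a common choice in circular regression (due to Fisher and Lee) maps a linear predictor onto the circle via

\[ \mu(x) \;=\; \mu_0 + 2\arctan(x^{\top}\beta) \pmod{2\pi}, \]

so that the circular mean direction \(\mu(x)\) is a smooth, bounded function of the covariates.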

3.
This article examines the forecasting accuracy of various methods used by Federal Reserve Banks to estimate real value added by regional manufacturing industries. Using Texas manufacturing data and weighted forecasting accuracy measures consistent with index number construction for Texas, the results support the use of very simple methods based on the assumption of product exhaustion, allowing for technical change. More complex methods using Cobb-Douglas production functions estimated by Bayesian techniques did not perform as well, not because of a lack of conceptual sophistication or appropriate prior information but probably because of the small number of observations and the collinearity of the data available when constructing regional production indices. These results must be qualified. The weighted forecasting accuracy measures tend to obscure the fact that no one method is uniformly superior to the others for all industries. Given industry weights different from those for Texas, the results presented here could be reversed. Confirmation of the conclusions drawn awaits the results of other regional manufacturing studies.

4.
Bayesian calibration of computer models
We consider prediction and uncertainty analysis for systems which are approximated using complex mathematical models. Such models, implemented as computer codes, are often generic in the sense that by a suitable choice of some of the model's input parameters the code can be used to predict the behaviour of the system in a variety of specific applications. However, in any specific application the values of necessary parameters may be unknown. In this case, physical observations of the system in the specific context are used to learn about the unknown parameters. The process of fitting the model to the observed data by adjusting the parameters is known as calibration. Calibration is typically effected by ad hoc fitting, and after calibration the model is used, with the fitted input values, to predict the future behaviour of the system. We present a Bayesian calibration technique which improves on this traditional approach in two respects. First, the predictions allow for all sources of uncertainty, including the remaining uncertainty over the fitted parameters. Second, they attempt to correct for any inadequacy of the model which is revealed by a discrepancy between the observed data and the model predictions from even the best-fitting parameter values. The method is illustrated by using data from a nuclear radiation release at Tomsk, and from a more complex simulated nuclear accident exercise.
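In this framework a field observation \(z_i\) is typically related to the computer code \(\eta\) by

\[ z_i \;=\; \rho\,\eta(x_i, \theta) + \delta(x_i) + e_i, \]

where \(\theta\) denotes the unknown calibration parameters, \(\delta(\cdot)\) is a model-discrepancy term (given a Gaussian-process prior) and \(e_i\) is observation error; this is the general form of the approach, and the scale \(\rho\) and priors vary by application.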

5.
We present a Bayesian analysis of a piecewise linear model constructed by using basis functions which generalizes the univariate linear spline to higher dimensions. Prior distributions are adopted on both the number and the locations of the splines, which leads to a model averaging approach to prediction with predictive distributions that take into account model uncertainty. Conditioning on the data produces a Bayes local linear model with distributions on both predictions and local linear parameters. The method is spatially adaptive and covariate selection is achieved by using splines of lower dimension than the data.
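One way to write such a multivariate generalization of the linear spline, as a sketch rather than the authors' exact basis, is

\[ f(x) \;=\; \beta_0 + \sum_{k=1}^{K} \beta_k \,(x^{\top}\gamma_k - \tau_k)_{+}, \]

where \((u)_{+} = \max(u, 0)\); priors on \(K\) and on the knot parameters \((\gamma_k, \tau_k)\) then induce the model-averaged predictive distribution described above.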

6.
This paper deals with Bayes, robust Bayes, and minimax predictions in a subfamily of scale parameters under an asymmetric precautionary loss function. In Bayesian statistical inference, the goal is to obtain optimal rules under a specified loss function and an explicit prior distribution over the parameter space. In practice, however, we may not be able to specify the prior completely, or, when a problem must be solved by two statisticians, they may agree on the choice of the prior but not on the values of the hyperparameters. A common approach to prior uncertainty in Bayesian analysis is to choose a class of prior distributions and compute some functional quantity. This is known as robust Bayesian analysis, which encodes prior knowledge through a class of priors Γ as a global safeguard against bad choices of hyperparameters. Under a scale-invariant precautionary loss function, we derive robust Bayes predictions of Y based on X. We carry out a simulation study and a real data analysis to illustrate the practical utility of the prediction procedure.
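For concreteness, one widely cited precautionary loss (due to Norstrøm) is

\[ L(\theta, \delta) \;=\; \frac{(\delta - \theta)^{2}}{\delta}, \]

under which the Bayes estimator is \(\delta^{*} = \{E(\theta^{2} \mid x)\}^{1/2}\); the asymmetric, scale-invariant variant used in the paper need not coincide with this exact form.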

7.
Bayesian Additive Regression Trees (BART) is a statistical sum-of-trees model. It can be considered a Bayesian version of machine learning tree ensemble methods in which the individual trees are the base learners. However, for datasets where the number of variables p is large, the algorithm can become inefficient and computationally expensive. Another method popular for high-dimensional data is random forests, a machine learning algorithm which grows trees using a greedy search for the best split points. However, its default implementation does not produce probabilistic estimates or predictions. We propose an alternative fitting algorithm for BART called BART-BMA, which uses Bayesian model averaging and a greedy search algorithm to obtain a posterior distribution more efficiently than BART for datasets with large p. BART-BMA incorporates elements of both BART and random forests to offer a model-based algorithm which can deal with high-dimensional data. We have found that BART-BMA can be run in a reasonable time on a standard laptop for the “small n, large p” scenario common in many areas of bioinformatics. We showcase the method using simulated data and data from two real proteomic experiments: one to distinguish between patients with cardiovascular disease and controls, and another to classify aggressive from non-aggressive prostate cancer. We compare our results with those of the main competing methods. Open source code written in R and Rcpp to run BART-BMA can be found at: https://github.com/BelindaHernandez/BART-BMA.git.
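As a generic illustration of the model-averaging ingredient (not the BART-BMA algorithm itself), predictions from a set of fitted models can be combined with BIC-based approximate posterior model weights; the function bma_predict below is an illustrative sketch:

    # Approximate Bayesian model averaging: weight each model's predictions
    # by exp(-BIC/2), a standard approximation to posterior model probability.
    bma_predict <- function(models, newdata) {
      bics <- sapply(models, BIC)
      w <- exp(-0.5 * (bics - min(bics)))   # unnormalized model weights
      w <- w / sum(w)                       # normalize to sum to one
      preds <- sapply(models, predict, newdata = newdata)
      drop(preds %*% w)                     # model-averaged prediction
    }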

8.
Previously, a Bayesian anomaly was reported when estimating reliability from subsystem failure data and system failure data obtained over the same time period, and a practical method for mitigating the anomaly was developed. In the first part of this paper, however, we show that the Bayesian anomaly can be avoided as long as the same failure information is incorporated in the model. In the second part, we consider the problem of estimating Bayesian reliability when the failure count data on subsystems and systems are obtained from the same time period. We show that no Bayesian anomaly arises when using the multinomial distribution with a Dirichlet prior distribution. A numerical example is given to compare the proposed method with previous methods.
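The conjugacy behind this result is standard: if the failure counts \(n = (n_1, \dots, n_m)\) follow a multinomial distribution with cell probabilities \(p\), and \(p \sim \mathrm{Dirichlet}(\alpha_1, \dots, \alpha_m)\), then

\[ p \mid n \;\sim\; \mathrm{Dirichlet}(\alpha_1 + n_1, \dots, \alpha_m + n_m), \]

so subsystem and system failure information enters the posterior coherently through the observed counts.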

9.
Both knowledge-based systems and statistical models are typically concerned with making predictions about future observables. Here we focus on assessment of predictive performance and provide two techniques for improving the predictive performance of Bayesian graphical models. First, we present Bayesian model averaging, a technique for accounting for model uncertainty. Second, we describe a technique for eliciting a prior distribution for competing models from domain experts. We explore the predictive performance of both techniques in the context of a urological diagnostic problem.
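Bayesian model averaging combines the candidate models \(M_1, \dots, M_K\) as

\[ p(\Delta \mid D) \;=\; \sum_{k=1}^{K} p(\Delta \mid M_k, D)\, p(M_k \mid D), \]

where \(\Delta\) is the quantity of interest, \(D\) the data, and \(p(M_k \mid D) \propto p(D \mid M_k)\, p(M_k)\); the elicited prior enters through \(p(M_k)\).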

10.
Practical Bayesian data analysis involves manipulating and summarizing simulations from the posterior distribution of the unknown parameters. By manipulation we mean computing posterior distributions of functions of the unknowns, and generating posterior predictive distributions. The results need to be summarized both numerically and graphically. We introduce, and implement in R, an object-oriented programming paradigm based on a random variable object type that is implicitly represented by simulations. This makes it possible to define vector and array objects that may contain both random and deterministic quantities, and syntax rules that allow these objects to be treated like any numeric vectors or arrays, providing a solution to various problems encountered in Bayesian computing involving posterior simulations. We illustrate the use of this new programming environment with examples of Bayesian computing, demonstrating missing-value imputation, nonlinear summaries of regression predictions, and posterior predictive checking.
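A minimal sketch of the idea in base R, with the class name rvsim and all function names illustrative rather than the authors' actual implementation: a random variable is stored as a vector of posterior draws, and binary arithmetic operators act draw by draw.

    # A random variable implicitly represented by simulation draws.
    rv <- function(sims) structure(list(sims = sims), class = "rvsim")

    # Binary arithmetic operators act componentwise on the draws.
    Ops.rvsim <- function(e1, e2) {
      s1 <- if (inherits(e1, "rvsim")) e1$sims else e1
      s2 <- if (inherits(e2, "rvsim")) e2$sims else e2
      rv(get(.Generic)(s1, s2))
    }

    mean.rvsim     <- function(x, ...) mean(x$sims)
    quantile.rvsim <- function(x, ...) quantile(x$sims, ...)

    # Usage: posterior draws of a parameter, then a posterior predictive quantity.
    theta  <- rv(rnorm(4000, mean = 1, sd = 0.5))
    y_pred <- theta + rv(rnorm(4000))   # deterministic-looking syntax, random object
    mean(y_pred)
    quantile(y_pred, c(0.025, 0.975))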

11.
12.
In this article we develop a class of stochastic boosting (SB) algorithms, which build upon the work of Holmes and Pintore (Bayesian Stat. 8, Oxford University Press, Oxford, 2007). They introduce boosting algorithms which correspond to standard boosting (e.g. Bühlmann and Hothorn, Stat. Sci. 22:477–505, 2007) except that the optimization algorithms are randomized; this idea is placed within a Bayesian framework. We show that the inferential procedure in Holmes and Pintore (Bayesian Stat. 8, Oxford University Press, Oxford, 2007) is incorrect and further develop interpretational, computational and theoretical results which allow one to assess SB's potential for classification and regression problems. To use SB, sequential Monte Carlo (SMC) methods are applied. It is found that SB can provide better predictions for classification problems than the corresponding boosting algorithm. A theoretical result is also given, which shows that the predictions of SB are not significantly worse than those of boosting, when the latter provides the best prediction. We also investigate the method on a real case study from machine learning.

13.
We develop a Bayesian estimation method for non-parametric mixed-effects models under shape constraints. The approach uses a hierarchical Bayesian framework and characterizations of shape-constrained Bernstein polynomials (BPs). We employ Markov chain Monte Carlo methods for model fitting, using a truncated normal distribution as the prior for the coefficients of the BPs to ensure the desired shape constraints. The small-sample properties of the Bayesian shape-constrained estimators across a range of functions are provided via simulation studies. Two real data analyses are given to illustrate the application of the proposed method.
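A Bernstein polynomial of degree \(n\) on \([0,1]\) has the form

\[ f(t) \;=\; \sum_{k=0}^{n} \beta_k \binom{n}{k} t^{k} (1-t)^{n-k}, \]

and it is nondecreasing whenever \(\beta_0 \le \beta_1 \le \cdots \le \beta_n\); a truncated normal prior supported on such an ordered region is one way the shape constraint can be imposed on the coefficients.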

14.
Multiple-membership logit models with random effects are models for clustered binary data, where each statistical unit can belong to more than one group. The likelihood function of these models is analytically intractable. We propose two different approaches for parameter estimation: indirect inference and data cloning (DC). The former is a non-likelihood-based method which uses an auxiliary model to select reasonable estimates. We propose an auxiliary model whose parameter space has the same dimension as that of the target model, which makes it particularly convenient for reaching good estimates quickly. The latter method computes maximum likelihood estimates through the posterior distribution of an adequate Bayesian model, fitted to cloned data. We implement a DC algorithm specifically for multiple-membership models. A Monte Carlo experiment compares the two methods on simulated data. For further comparison, we also report Bayesian posterior mean and Integrated Nested Laplace Approximation hybrid DC estimates. Simulations show a negligible loss of efficiency for the indirect inference estimator, offset by a substantial computational gain. The approaches are then illustrated with two real examples on matched paired data.
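Data cloning works with the K-fold-cloned posterior

\[ \pi_K(\theta \mid y) \;\propto\; \{f(y \mid \theta)\}^{K}\, \pi(\theta), \]

which concentrates around the maximum likelihood estimate as \(K\) grows; the posterior mean then approximates the MLE, and \(K\) times the posterior variance approximates its asymptotic variance.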

15.
This paper is motivated by a neurophysiological study of muscle fatigue, in which biomedical researchers are interested in understanding the time-dependent relationship between handgrip force and electromyography measures. A varying coefficient model is appealing here to investigate the dynamic pattern in the longitudinal data. The response variable in the study is continuous but bounded on the standard unit interval (0, 1) over time, while the longitudinal covariates are contaminated with measurement errors. We propose a generalization of varying coefficient models for longitudinal proportional data with errors in covariates. We describe two estimation methods with penalized splines, which are formalized under a Bayesian inferential perspective. The first method is an adaptation of the popular regression calibration approach. The second method is based on a joint likelihood under the hierarchical Bayesian model. A simulation study is conducted to evaluate the efficacy of the proposed methods under different scenarios. The analysis of the neurophysiological data is presented to demonstrate the use of the methods.
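In its simplest form, such a model for a response \(y_{ij}\) of subject \(i\) at time \(t_{ij}\) can be sketched as

\[ g\{E(y_{ij})\} \;=\; \beta_0(t_{ij}) + \beta_1(t_{ij})\, x_{ij}, \qquad w_{ij} = x_{ij} + u_{ij}, \]

where \(g\) is a link appropriate for a (0, 1)-valued response, the smooth coefficient functions \(\beta_0(\cdot)\) and \(\beta_1(\cdot)\) are represented by penalized splines, and the covariate \(x_{ij}\) is observed only through the error-contaminated surrogate \(w_{ij}\); the paper's exact formulation may differ.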

16.
In this paper, we adopt the Bayesian approach to expectile regression, employing a likelihood function based on an asymmetric normal distribution. We demonstrate that improper uniform priors for the unknown model parameters yield a proper joint posterior. Three simulated data sets were generated to evaluate the proposed method; the results show that Bayesian expectile regression performs well and has different characteristics compared with Bayesian quantile regression. We also apply the approach to two real data analyses.
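The asymmetric normal working likelihood referred to here is proportional to \(\exp\{-\rho_{\tau}(y - x^{\top}\beta)/(2\sigma^{2})\}\) with the asymmetric squared-error check function

\[ \rho_{\tau}(u) \;=\; \lvert \tau - I(u < 0) \rvert\, u^{2}, \]

whose minimization over \(\beta\) yields the \(\tau\)-th expectile, just as the asymmetric absolute loss yields quantiles.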

17.
18.
When biological or physiological variables change over time, we are often interested in making predictions either of future measurements or of the time taken to reach some threshold value. On the basis of longitudinal data for multiple individuals, we develop Bayesian hierarchical models for making these predictions together with their associated uncertainty. Particular aspects addressed, which include some novel components, are handling curvature in individuals' trends over time, making predictions for both underlying and measured levels, making predictions from a single baseline measurement, making predictions from a series of measurements, allowing flexibility in the error and random-effects distributions, and including covariates. In the context of data on the expansion of abdominal aortic aneurysms over time, where reaching a certain threshold leads to referral for surgery, we discuss the practical application of these models to the planning of monitoring intervals in a national screening programme. Prediction of the time to reach a threshold was too imprecise to be practically useful, and we focus instead on limiting the probability of exceeding the threshold after given time intervals. Although more complex models can be shown to fit the data better, we find that relatively simple models seem to be adequate for planning monitoring intervals.
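The simplest member of this model family, ignoring curvature and covariates, is a hierarchical linear growth model of the form

\[ y_{ij} \;=\; (\alpha + a_i) + (\beta + b_i)\, t_{ij} + \varepsilon_{ij}, \qquad (a_i, b_i) \sim N(0, \Sigma), \]

from which both the underlying level \((\alpha + a_i) + (\beta + b_i)t\) and a future measurement can be predicted, with threshold-exceedance probabilities obtained from their posterior predictive distributions.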

19.
In this paper, we consider the prediction of a future observation based on a type-I hybrid censored sample when the lifetimes of the experimental units are assumed to follow a Weibull distribution. Different classical and Bayesian point predictors are obtained. Bayesian predictors are obtained under squared error and linear-exponential (LINEX) loss functions. We also provide a simulation-consistent method for computing Bayesian prediction intervals. Monte Carlo simulations are performed to compare the performances of the different methods, and a real data analysis is presented for illustrative purposes.
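Under the LINEX loss \(L(\delta, y) = e^{a(\delta - y)} - a(\delta - y) - 1\), the Bayes point predictor of a future observation \(Y\) is

\[ \delta^{*} \;=\; -\frac{1}{a} \log E\!\left( e^{-aY} \,\middle|\, \text{data} \right), \]

while under squared error it is the posterior predictive mean \(E(Y \mid \text{data})\).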

20.
We formulate a traditional growth and yield model as a Bayes model. We attempt to introduce as few new assumptions as possible. Zellner's Bayesian method of moments procedure is used, because the published model did not include any distributional assumptions. We generate posterior predictive samples for a number of stand variables using the Gibbs sampler. The means of the samples compare favorably with the predictions from the published model. In addition, our model delivers distributions of outcomes, from which it is easy to establish measures of uncertainty, such as highest posterior density regions.
