Similar Articles (20 results)
1.
Utilizing the notion of matching predictives as in Berger and Pericchi, we show that for the conjugate family of prior distributions in the normal linear model, the symmetric Kullback-Leibler divergence between two particular predictive densities is minimized when the prior hyperparameters are taken to be those corresponding to the predictive priors proposed in Ibrahim and Laud and in Laud and Ibrahim. The main application of this result is to Bayesian variable selection.
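The symmetric Kullback-Leibler divergence used as the matching criterion above has a closed form when both densities are univariate normal. A minimal sketch of the criterion (an illustration only; the paper works with predictive densities in the normal linear model, and the function names here are my own):

```python
import math

def kl_normal(m1, s1, m2, s2):
    """KL(N(m1, s1^2) || N(m2, s2^2)): closed form for univariate normals."""
    return math.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2 * s2 ** 2) - 0.5

def symmetric_kl_normal(m1, s1, m2, s2):
    """Symmetric (Jeffreys) Kullback-Leibler divergence: sum of both directions."""
    return kl_normal(m1, s1, m2, s2) + kl_normal(m2, s2, m1, s1)
```

The divergence vanishes exactly when the two densities coincide, which is what makes it usable as a matching criterion for choosing hyperparameters.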

2.
In the case of prior knowledge about the unknown parameter, the Bayesian predictive density coincides with the Bayes estimator for the true density in the sense of the Kullback-Leibler divergence, but this is no longer true if we consider another loss function. In this paper we present a generalized Bayes rule to obtain Bayes density estimators with respect to any α-divergence, including the Kullback-Leibler divergence and the Hellinger distance. For curved exponential models, we study the asymptotic behaviour of these predictive densities. We show that, whatever prior we use, the generalized Bayes rule improves (in a non-Bayesian sense) the estimative density corresponding to a bias modification of the maximum likelihood estimator. It gives rise to a correspondence between choosing a prior density for the generalized Bayes rule and fixing a bias for the maximum likelihood estimator in the classical setting. A criterion for comparing and selecting prior densities is also given.

3.
A family of Viterbi Bayesian predictive classifiers has recently been popularized for speech recognition applications with continuous acoustic signals modeled by finite mixture densities embedded in a hidden Markov framework. Here we generalize such classifiers to sequentially observed data from multiple finite alphabets and derive the optimal predictive classifier under exchangeability of the emitted symbols. We demonstrate that the optimal predictive classifier, which learns from unlabelled test items, improves considerably upon the marginal maximum a posteriori rule in the presence of sparse training data. It is shown that the learning process saturates as the amount of test data tends to infinity, so that no further gain in classification accuracy is possible upon arrival of new test items in the long run.

4.
In this paper, a novel Bayesian framework is used to derive the posterior density function, predictive density for a single future response, a bivariate future response, and several future responses from the exponentiated Weibull model (EWM). We study three related types of models, the exponentiated exponential, exponentiated Weibull, and beta generalized exponential, which are all utilized to determine the goodness of fit of two real data sets. The statistical analysis indicates that the EWM best fits both data sets. We determine the predictive means, standard deviations, highest predictive density intervals, and the shape characteristics for a single future response. We also consider a new parameterization method to determine the posterior kernel densities for the parameters. The summary results of the parameters are calculated by using the Markov chain Monte Carlo method.

5.
This article is concerned with making predictive inference on the basis of a doubly censored sample from a two-parameter Rayleigh life model. We derive the predictive distributions for a single future response, the ith future response, and several future responses. We use the Bayesian approach in conjunction with an improper flat prior for the location parameter and an independent proper conjugate prior for the scale parameter to derive the predictive distributions. We conclude with a numerical example in which the effect of the hyperparameters on the mean and standard deviation of the predictive density is assessed.

6.
Construction methods for prior densities are investigated from a predictive viewpoint. Predictive densities for future observables are constructed by using observed data. The simultaneous distribution of future observables and observed data is assumed to belong to a parametric submodel of a multinomial model. Future observables and data are possibly dependent. The discrepancy of a predictive density from the true conditional density of future observables given observed data is evaluated by the Kullback-Leibler divergence. It is proved that limits of Bayesian predictive densities form an essentially complete class. Latent information priors are defined as priors maximizing the conditional mutual information between the parameter and the future observables given the observed data. Minimax predictive densities are constructed as limits of Bayesian predictive densities based on prior sequences converging to the latent information priors.

7.
The predictive distribution is a mixture of the original distribution model and is used for predicting a future observation. The mixing distribution is the posterior distribution of the distribution parameters in the Bayesian inference. The mixture can also be computed for the frequentist inference because the Bayesian posterior distribution has the same meaning as a frequentist confidence interval. I present arguments against the concept of the predictive distribution, illustrated with examples. The most important argument is that the predictive distribution can depend on the parameterization. An improvement of the theory of the predictive distribution is recommended.
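A concrete instance of the predictive distribution discussed above: for a normal model with known variance and a conjugate normal prior on the mean, mixing the sampling density over the posterior gives a normal predictive whose variance adds the posterior uncertainty to the sampling variance. A sketch using the standard conjugate formulas (not code from the paper):

```python
def normal_posterior_predictive(data, sigma2, m0, t0sq):
    """Posterior predictive for one future draw from N(mu, sigma2), sigma2 known,
    with conjugate prior mu ~ N(m0, t0sq).
    Returns (mean, variance) of the predictive density N(mn, sigma2 + tnsq)."""
    n = len(data)
    xbar = sum(data) / n
    tnsq = 1.0 / (1.0 / t0sq + n / sigma2)        # posterior variance of mu
    mn = tnsq * (m0 / t0sq + n * xbar / sigma2)   # posterior mean of mu
    return mn, sigma2 + tnsq
```

With a nearly flat prior, the predictive mean approaches the sample mean while the predictive variance stays strictly larger than sigma2, reflecting the remaining parameter uncertainty.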

8.
Within the context of the multivariate general linear model, and using a Bayesian formulation and Kullback-Leibler divergences, this paper provides a framework and the resultant methods for the problem of detecting and characterizing influential subsets of observations when the goal is to estimate parameters. It is further indicated how these influence measures inherently depend upon one's exact estimative intent. The relationship to previous work on observations influential in estimation is discussed. The estimative influence measures obtained here are also compared with predictive influence functions previously obtained. Several examples are presented illustrating the methodology.

9.
We propose a Bayesian nonparametric procedure for density estimation, for data in a closed, bounded interval, say [0,1]. To this end, we use a prior based on Bernstein polynomials. This corresponds to expressing the density of the data as a mixture of given beta densities, with random weights and a random number of components. The density estimate is then obtained as the corresponding predictive density function. Comparison with classical and Bayesian kernel estimates is provided. The proposed procedure is illustrated in an example; an MCMC algorithm for approximating the estimate is also discussed.
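The Bernstein-polynomial construction can be sketched without the Bayesian machinery: for a fixed order k, weight the beta densities Beta(j, k−j+1) by increments of the empirical distribution function. This frequentist plug-in version is my own simplification for illustration; the paper instead puts a prior on the weights and on the number of components:

```python
import math

def bernstein_density(x, data, k):
    """Bernstein polynomial density estimate on [0, 1] of order k:
    a mixture of Beta(j, k - j + 1) densities weighted by increments
    of the empirical CDF of `data`."""
    n = len(data)
    def ecdf(t):
        return sum(1 for d in data if d <= t) / n
    dens = 0.0
    for j in range(1, k + 1):
        w = ecdf(j / k) - ecdf((j - 1) / k)   # mixture weight for component j
        # Beta(j, k - j + 1) pdf; 1/B(j, k-j+1) simplifies to k * C(k-1, j-1)
        beta_pdf = k * math.comb(k - 1, j - 1) * x ** (j - 1) * (1 - x) ** (k - j)
        dens += w * beta_pdf
    return dens
```

Because the weights sum to one and each beta component integrates to one, the estimate is itself a proper density on [0, 1].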

10.
Consistency of Bernstein polynomial posteriors
A Bernstein prior is a probability measure on the space of all the distribution functions on [0, 1]. Under very general assumptions, it selects absolutely continuous distribution functions, whose densities are mixtures of known beta densities. The Bernstein prior is of interest in Bayesian nonparametric inference with continuous data. We study the consistency of the posterior from a Bernstein prior. We first show that, under mild assumptions, the posterior is weakly consistent for any distribution function P0 on [0, 1] with continuous and bounded Lebesgue density. With slightly stronger assumptions on the prior, the posterior is also Hellinger consistent. This implies that the predictive density from a Bernstein prior, which is a Bayesian density estimate, converges in the Hellinger sense to the true density (assuming that it is continuous and bounded). We also study a sieve maximum likelihood version of the density estimator and show that it is also Hellinger consistent under weak assumptions. When the order of the Bernstein polynomial, i.e. the number of components in the beta distribution mixture, is truncated, we show that under mild restrictions the posterior concentrates on the set of pseudotrue densities. Finally, we study the behaviour of the predictive density numerically and we also study a hybrid Bayes–maximum likelihood density estimator.

11.
Cook (1986) presented the idea of local influence to study the sensitivity of inferences to model assumptions: introduce a vector δ of perturbations to the model; choose a discrepancy function D to measure differences between the original inference and the inference under the perturbed model; study the behavior of D near δ = 0, the original model, usually by taking derivatives. Johnson and Geisser (1983) measure influence in Bayesian inference by the Kullback-Leibler divergence between predictive distributions. McCulloch (1989) is a synthesis of Cook and Johnson and Geisser, using Kullback-Leibler divergence between posterior or predictive distributions as the discrepancy function in Bayesian local influence analyses. We analyze a special case for which McCulloch gives the general theory, namely the linear model with conjugate prior. We present specific formulae for local influence measures for (1) changes in the parameters of the gamma prior for the precision, (2) changes in the mean of the normal prior for the regression coefficients, (3) changes in the covariance matrix of the normal prior for the regression coefficients, and (4) changes in the case weights. Our method is an easy way to find locally influential subsets of points without knowing in advance the sizes of the subsets. The techniques are illustrated with a regression example.

12.
We propose a Bayesian approach to obtaining control charts when there is parameter uncertainty. Our approach consists of two stages: (i) construction of the control chart, where we use a predictive distribution based on a Bayesian approach to derive the rejection region, and (ii) evaluation of the control chart, where we use a sampling-theory approach to examine the performance of the control chart under various hypothetical specifications for the data generation model.

13.
The authors show how saddlepoint techniques lead to highly accurate approximations for Bayesian predictive densities and cumulative distribution functions in stochastic model settings where the prior is tractable, but not necessarily the likelihood or the predictand distribution. They consider more specifically models involving predictions associated with waiting times for semi-Markov processes whose distributions are indexed by an unknown parameter θ. Bayesian prediction for such processes when they are not stationary is also addressed, and the inverse-Gaussian-based saddlepoint approximation of Wood, Booth & Butler (1993) is shown to deal accurately with the nonstationarity, whereas the normal-based Lugannani & Rice (1980) approximation cannot. Their methods are illustrated by predicting various waiting times associated with M/M/q and M/G/1 queues. They also discuss modifications to the matrix renewal theory needed for computing the moment generating functions that are used in the saddlepoint methods.

14.
This paper describes the Bayesian inference and prediction of the two-parameter Weibull distribution when the data are Type-II censored. The aim of this paper is twofold. First, we consider the Bayesian inference of the unknown parameters under different loss functions. The Bayes estimates cannot be obtained in closed form, so we use a Gibbs sampling procedure to draw Markov chain Monte Carlo (MCMC) samples, which are used to compute the Bayes estimates and to construct symmetric credible intervals. Further, we consider the Bayes prediction of the future order statistics based on the observed sample. We consider the posterior predictive density of the future observations and also construct a predictive interval with a given coverage probability. Monte Carlo simulations are performed to compare the different methods, and one data analysis is performed for illustration purposes.
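The Type-II censored likelihood underlying such computations has a simple form: with the first r of n ordered failure times observed, the remaining n − r units are known only to have survived past the r-th failure. A sketch of the log-likelihood (the function name and the simulation below are my own illustration, not the paper's code):

```python
import math
import random

def weibull_type2_loglik(x_ordered, n, shape, scale):
    """Log-likelihood (up to the constant n!/(n-r)!) of the first r order
    statistics from a Weibull(shape, scale) sample of size n under
    Type-II censoring."""
    r = len(x_ordered)
    ll = 0.0
    for x in x_ordered:
        z = x / scale
        ll += math.log(shape / scale) + (shape - 1) * math.log(z) - z ** shape
    # each of the n - r censored units survived past the largest observed time
    ll += (n - r) * (-((x_ordered[-1] / scale) ** shape))
    return ll

# simulate n = 200 Weibull(shape=2, scale=1) lifetimes, observe the first 120
random.seed(0)
sample = sorted(random.weibullvariate(1.0, 2.0) for _ in range(200))[:120]
```

This log-likelihood surface, combined with a prior, is what a Gibbs or Metropolis sampler would explore to produce the posterior draws.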

15.
We extend the standard approach to Bayesian forecast combination by forming the weights for the model averaged forecast from the predictive likelihood rather than the standard marginal likelihood. The use of predictive measures of fit offers greater protection against in-sample overfitting when uninformative priors on the model parameters are used and improves forecast performance. For the predictive likelihood we argue that the forecast weights have good large- and small-sample properties. This is confirmed in a simulation study and in an application to forecasts of the Swedish inflation rate, where forecast combination using the predictive likelihood outperforms standard Bayesian model averaging using the marginal likelihood.
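The weighting step itself can be sketched generically: score each model by the log predictive likelihood of a hold-out sample, then exponentiate and normalize. A sketch with made-up candidate densities (the paper's models and priors are more involved):

```python
import math

def combination_weights(log_pred_liks):
    """Turn hold-out log predictive likelihoods into normalized combination
    weights, subtracting the max first for numerical stability."""
    m = max(log_pred_liks)
    unnorm = [math.exp(l - m) for l in log_pred_liks]
    s = sum(unnorm)
    return [u / s for u in unnorm]

def normal_logpdf(x, mu, sigma):
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

# two candidate forecast densities scored on a hold-out sample
holdout = [0.1, -0.2, 0.3, 0.0, -0.1]
score_a = sum(normal_logpdf(x, 0.0, 1.0) for x in holdout)  # well-located model
score_b = sum(normal_logpdf(x, 3.0, 1.0) for x in holdout)  # badly located model
weights = combination_weights([score_a, score_b])
```

The model whose predictive density tracks the hold-out data receives nearly all of the weight, which is the overfitting protection the abstract refers to.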

16.
We consider an efficient Bayesian approach to estimating integration-based posterior summaries from a separate Bayesian application. In Bayesian quadrature we model an intractable posterior density function f(·) as a Gaussian process, using an approximating function g(·), and find a posterior distribution for the integral of f(·), conditional on a few evaluations of f(·) at selected design points. Bayesian quadrature using normal g(·) is called Bayes-Hermite quadrature. We extend this theory by allowing g(·) to be chosen from two wider classes of functions. One is a family of skew densities and the other is the family of finite mixtures of normal densities. For the family of skew densities we describe an iterative updating procedure to select the most suitable approximation and apply the method to two simulated posterior density functions.

17.
Sequences of independent random variables are observed and on the basis of these observations future values of the process are forecast. The Bayesian predictive density of k future observations for normal, exponential, and binomial sequences which change exactly once are analyzed for several cases. It is seen that the Bayesian predictive densities are mixtures of standard probability distributions. For example, with normal sequences the Bayesian predictive density is a mixture of either normal or t-distributions, depending on whether or not the common variance is known. The mixing probabilities are the same as those occurring in the corresponding posterior distribution of the mean(s) of the sequence. The predictive mass function of the number of future successes that will occur in a changing Bernoulli sequence is computed and point and interval predictors are illustrated.
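For the Bernoulli case, the mixture structure is easy to make explicit: with uniform Beta(1, 1) priors on the success probabilities before and after the change and a uniform prior on the changepoint, the predictive probability of a future success is the posterior-weighted average of the second segment's posterior means. A small sketch (my own minimal setup, not the paper's general treatment):

```python
import math

def log_beta(a, b):
    """Log of the Beta function via log-gamma, for numerical stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def bernoulli_changepoint_predictive(y):
    """Predictive P(next observation = 1) for a 0/1 sequence with exactly one
    change at an unknown time tau (uniform over 1..n-1), with independent
    Beta(1, 1) priors on the success probabilities of both segments."""
    n = len(y)
    log_ml, post_mean2 = [], []
    for tau in range(1, n):
        s1, s2 = sum(y[:tau]), sum(y[tau:])
        f1, f2 = tau - s1, (n - tau) - s2
        # marginal likelihood of this split: product of beta-binomial terms
        log_ml.append(log_beta(s1 + 1, f1 + 1) + log_beta(s2 + 1, f2 + 1))
        post_mean2.append((s2 + 1) / (n - tau + 2))  # E[p2 | data, tau]
    m = max(log_ml)
    w = [math.exp(l - m) for l in log_ml]
    s = sum(w)
    probs = [wi / s for wi in w]                     # posterior over tau
    return sum(p * mu for p, mu in zip(probs, post_mean2)), probs
```

The returned predictive probability is exactly a mixture over changepoint locations, mirroring the mixture form the abstract describes.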

18.
Recent results in information theory, see Soofi (1996; 2001) for a review, include derivations of optimal information processing rules, including Bayes' theorem, for learning from data based on minimizing a criterion functional, namely output information minus input information, as shown in Zellner (1988; 1991; 1997; 2002). Herein, optimal post-data densities for parameters are obtained and studied for cases in which the input information is that in (1) a likelihood function and a prior density; (2) only a likelihood function; and (3) neither a prior nor a likelihood function but only input information in the form of post-data moments of parameters, as in the Bayesian method of moments approach. It is then shown how optimal output densities can be employed to obtain predictive densities and optimal, finite sample structural coefficient estimates using three alternative loss functions. Such optimal estimates are compared with usual estimates, e.g., maximum likelihood, two-stage least squares, ordinary least squares, etc. Some Monte Carlo experimental results in the literature are discussed and implications for the future are provided.

19.
We investigate Bayesian optimal designs for changepoint problems. We find robust optimal designs which allow for arbitrary distributions before and after the change, arbitrary prior densities on the parameters before and after the change, and any log-concave prior density on the changepoint. We define a new design measure for Bayesian optimal design problems as a means of finding the optimal design. Our results apply to any design criterion function concave in the design measure. We illustrate our results by finding the optimal design in a problem motivated by a previous clinical trial. The Canadian Journal of Statistics 37: 495–513; 2009 © 2009 Statistical Society of Canada

20.
A class of predictive densities is derived by weighting the observed samples in maximizing the log-likelihood function. This approach is effective in cases such as sample surveys or design of experiments, where the observed covariate follows a different distribution than that in the whole population. Under misspecification of the parametric model, the optimal choice of the weight function is asymptotically shown to be the ratio of the density function of the covariate in the population to that in the observations. This is the pseudo-maximum likelihood estimation of sample surveys. The optimality is defined by the expected Kullback–Leibler loss, and the optimal weight is obtained by considering the importance sampling identity. Under correct specification of the model, however, the ordinary maximum likelihood estimate (i.e. the uniform weight) is shown to be optimal asymptotically. For moderate sample size, the situation is in between the two extreme cases, and the weight function is selected by minimizing a variant of the information criterion derived as an estimate of the expected loss. The method is also applied to a weighted version of the Bayesian predictive density. Numerical examples as well as Monte Carlo simulations are shown for polynomial regression. A connection with the robust parametric estimation is discussed.
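The density-ratio weighting can be sketched in the simplest possible setting: estimating a population mean when one covariate stratum is over-represented in the sample, so each observation gets weight p(x)/q(x), the ratio of the population to the observed covariate distribution. A toy sketch of the idea (my own example; the paper's experiments use polynomial regression):

```python
def weighted_mle_mean(ys, xs, pop_prob, obs_prob):
    """Weighted maximum likelihood estimate of a population mean when the
    covariate x is over- or under-sampled: each observation is weighted by
    the density ratio pop_prob[x] / obs_prob[x]."""
    weights = [pop_prob[x] / obs_prob[x] for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

# stratum 'a' (mean response 1.0) is half the population but 80% of the sample;
# stratum 'b' (mean response 3.0) is half the population but 20% of the sample
xs = ['a'] * 8 + ['b'] * 2
ys = [1.0] * 8 + [3.0] * 2
pop = {'a': 0.5, 'b': 0.5}
obs = {'a': 0.8, 'b': 0.2}
```

Here the unweighted sample mean is biased toward the over-sampled stratum, while the density-ratio weights recover the population mean of 2.0, which is the pseudo-maximum-likelihood correction the abstract describes.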
