Found 20 similar articles (search time: 906 ms)
Even though the literature on nonparametric density estimation is large, the literature on Bayesian estimation of the density function is relatively small. The reason is the lack of a suitable prior over the space of probability density functions. There have been attempts to define priors over the space of probability measures, but they have not yielded any workable prior for the purpose of density estimation. Dubins & Freedman (1963) have defined random distribution functions which are singular with probability one. Kraft (1964) has defined a class of distribution functions which have derivatives but not continuous derivatives and hence are not suitable for density estimation. The only really convenient prior is the Dirichlet process prior due to Ferguson (1973), but unfortunately this prior concentrates all its mass on discrete distributions with a dense set of jumps. Recently Lo (1978) has overcome this difficulty by taking the convolution of the Dirichlet process with a fixed continuous kernel. In Section 2, the existence of a version of the posterior distribution and the conditional expectation for an arbitrary prior over the space of continuous density functions are discussed. The Bayes risk consistency of the Bayes estimator is discussed in Section 3. The Bayes estimator and its properties with respect to two specific prior distributions are discussed in Section 4. In Section 5 some negative results are presented. Finally a numerical example is given in Section 6.

Abstract. The modelling process in Bayesian Statistics constitutes the fundamental stage of the analysis, since depending on the chosen probability laws the inferences may vary considerably. This is particularly true when conflicts arise between two or more sources of information. For instance, inference in the presence of an outlier (which conflicts with the information provided by the other observations) can be highly dependent on the assumed sampling distribution. When heavy‐tailed (e.g. t) distributions are used, outliers may be rejected whereas this kind of robust inference is not available when we use light‐tailed (e.g. normal) distributions. A long literature has established sufficient conditions on location‐parameter models to resolve conflict in various ways. In this work, we consider a location–scale parameter structure, which is more complex than the single parameter cases because conflicts can arise between three sources of information, namely the likelihood, the prior distribution for the location parameter and the prior for the scale parameter. We establish sufficient conditions on the distributions in a location–scale model to resolve conflicts in different ways as a single observation tends to infinity. In addition, for each case, we explicitly give the limiting posterior distributions as the conflict becomes more extreme.

Modelling Heterogeneity With and Without the Dirichlet Process (cited 4 times in total: 0 self-citations, 4 citations by others)
We investigate the relationships between Dirichlet process (DP) based models and allocation models for a variable number of components, based on exchangeable distributions. It is shown that the DP partition distribution is a limiting case of a Dirichlet–multinomial allocation model. Comparisons of posterior performance of DP and allocation models are made in the Bayesian paradigm and illustrated in the context of univariate mixture models. It is shown in particular that the unbalancedness of the allocation distribution, present in the prior DP model, persists a posteriori. Exploiting the model connections, a new MCMC sampler for general DP based models is introduced, which uses split/merge moves in a reversible jump framework. Performance of this new sampler relative to that of some traditional samplers for DP models is then explored.
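The limiting relationship mentioned in this abstract can be illustrated with the standard one-step predictive allocation rules. The sketch below is not taken from the paper: it assumes a symmetric Dirichlet(alpha/K, ..., alpha/K) prior on the K component weights, and the function names `dma_predictive` and `dp_predictive` are hypothetical. As K grows, the allocation probabilities approach those of the DP (Chinese restaurant) rule.

```python
def dma_predictive(counts, alpha, n_components):
    """One-step allocation probabilities under a Dirichlet-multinomial model.

    Assumes symmetric Dirichlet(alpha / K) weights over K components
    (an illustrative assumption, not the paper's exact setup).
    counts: sizes of the currently occupied components.
    Returns (probabilities of joining each occupied component,
             probability of opening any of the empty components).
    """
    n = sum(counts)
    occupied = len(counts)
    share = alpha / n_components
    join = [(c + share) / (n + alpha) for c in counts]
    new = (n_components - occupied) * share / (n + alpha)
    return join, new


def dp_predictive(counts, alpha):
    """One-step allocation probabilities under the DP partition rule."""
    n = sum(counts)
    join = [c / (n + alpha) for c in counts]
    new = alpha / (n + alpha)
    return join, new


if __name__ == "__main__":
    counts, alpha = [3, 1], 1.0
    for k in (2, 10, 1000, 10**6):
        print(k, dma_predictive(counts, alpha, k))
    print("DP", dp_predictive(counts, alpha))
```

With the counts fixed, the probability of opening a new component under the finite model tends to alpha / (n + alpha) as K increases, which is the DP value.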

This paper studies the case where the observations come from a unimodal and skew density function with an unknown mode. The skew-symmetric representation of such a density has a symmetric component which can be written as a scale mixture of uniform densities. A Dirichlet process (DP) prior is assigned to the mixing distribution. We also assume prior distributions for the mode and the skewed component. A computational approach is used to obtain the Bayes estimate of the components. An example is given to illustrate the approach.

Optimizing criteria for choosing a confidence set for a parameter are formulated as mathematical programming problems. The two optimizing criteria, probability of coverage and size of set, give rise to a pair of inverse programming problems. Several examples are worked out. The programming problems are then formulated to allow the incorporation of partial information about the parameter. By varying the family of prior distributions, a continuum of problems from the frequency approach to a Bayesian approach is obtained. Some examples are considered in which the family of priors contains more than one but not all prior distributions.

David R. Bickel 《Statistics》2018,52(3):552-570
Learning from model diagnostics that a prior distribution must be replaced by one that conflicts less with the data raises the question of which prior should instead be used for inference and decision. The same problem arises when a decision maker learns that one or more reliable experts express unexpected beliefs. In both cases, coherence of the solution would be guaranteed by applying Bayes's theorem to a distribution of prior distributions that effectively assigns the initial prior distribution a probability arbitrarily close to 1. The new distribution for inference would then be the distribution of priors conditional on the insight that the prior distribution lies in a closed convex set that does not contain the initial prior. A readily available distribution of priors needed for such conditioning is the law of the empirical distribution of a sufficiently large number of independent parameter values drawn from the initial prior. According to the Gibbs conditioning principle from the theory of large deviations, the resulting new prior distribution minimizes the entropy relative to the initial prior. While minimizing relative entropy accommodates the necessity of going beyond the initial prior without departing from it any more than the insight demands, the large-deviation derivation also ensures the advantages of Bayesian coherence. This approach is generalized to uncertain insights by allowing the closed convex set of priors to be random.

The incorporation of prior information about θ, where θ is the success probability in a binomial sampling model, is an essential feature of Bayesian statistics. Methodology based on information-theoretic concepts is introduced which (a) quantifies the amount of information provided by the sample data relative to that provided by the prior distribution and (b) allows for a ranking of prior distributions with respect to conservativeness, where conservatism refers to restraint of extraneous information about θ which is embedded in any prior distribution. In effect, the most conservative prior distribution from a specified class (each member of which carries the available prior information about θ) is that prior distribution within the class over which the likelihood function has the greatest average domination. The most conservative prior distributions from five different families of prior distributions over the interval (0,1) including the beta distribution are determined and compared for three situations: (1) no prior estimate of θ is available, (2) a prior point estimate of θ is available, and (3) a prior interval estimate of θ is available. The results of the comparisons not only advocate the use of the beta prior distribution in binomial sampling but also indicate which particular one to use in the three aforementioned situations.

This paper formulates a theory of probabilistic parametric inference and explores the limits of its applicability. Unlike Bayesian statistical models, the system does not comprise prior probability distributions. Objectivity is imposed on the theory: a particular direct probability density should always result in the same posterior probability distribution. For calibrated posterior probability distributions it is possible to construct credible regions with posterior-probability content equal to the coverage of the regions, but the calibration is not generally preserved under marginalization. As an application of the theory, the paper also constructs a filter for linear Gauss–Markov stochastic processes with unspecified initial conditions.

ApEn, approximate entropy, is a recently developed family of parameters and statistics quantifying regularity (complexity) in data, providing an information-theoretic quantity for continuous-state processes. We provide the motivation for ApEn development, and indicate the superiority of ApEn to the K-S entropy for statistical application, and for discrimination of both correlated stochastic and noisy deterministic processes. We study the variation of ApEn with input parameter choices, reemphasizing that ApEn is a relative measure of regularity. We study the bias in the ApEn statistic, and present evidence for asymptotic normality in the ApEn distributions, assuming weak dependence. We provide a new test for the hypothesis that an underlying time-series is generated by i.i.d. variables, which does not require distribution specification. We introduce randomized ApEn, which derives an empirical significance probability that two processes differ, based on one data set from each process.
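For readers unfamiliar with the statistic, the following is a minimal sketch of the commonly used ApEn(m, r) definition: templates of length m and m + 1 are compared under the maximum norm, and ApEn is the difference of the averaged log match frequencies. The function name `approximate_entropy` and the defaults m = 2, r = 0.2 are illustrative assumptions; here r is an absolute tolerance, whereas in practice it is often chosen as a fraction of the series' standard deviation.

```python
import math
import random


def approximate_entropy(series, m=2, r=0.2):
    """ApEn(m, r) = Phi^m(r) - Phi^(m+1)(r) for a one-dimensional series.

    Self-matches are counted, so every log argument is positive.
    """
    n = len(series)

    def phi(length):
        count = n - length + 1
        templates = [series[i:i + length] for i in range(count)]
        total = 0.0
        for a in templates:
            # Number of templates within tolerance r of `a` (max norm).
            matches = sum(
                1 for b in templates
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            total += math.log(matches / count)
        return total / count

    return phi(m) - phi(m + 1)


# A perfectly regular series versus seeded uniform noise.
periodic = [0.0, 1.0] * 100
rng = random.Random(0)
noisy = [rng.random() for _ in range(200)]

if __name__ == "__main__":
    print("periodic:", approximate_entropy(periodic))
    print("noisy:   ", approximate_entropy(noisy))
```

The regular series yields a value near zero and the noisy one a clearly larger value, in line with the abstract's point that ApEn is a relative measure of regularity that depends on the choice of m and r.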

In this paper, we propose novel methods of quantifying expert opinion about prior distributions for multinomial models. Two different multivariate priors are elicited using median and quartile assessments of the multinomial probabilities. First, we start by eliciting a univariate beta distribution for the probability of each category. Then we elicit the hyperparameters of the Dirichlet distribution, as a tractable conjugate prior, from those of the univariate betas through various forms of reconciliation using least-squares techniques. However, a multivariate copula function will give a more flexible correlation structure between multinomial parameters if it is used as their multivariate prior distribution. So, second, we use beta marginal distributions to construct a Gaussian copula as a multivariate normal distribution function that binds these marginals and expresses the dependence structure between them. The proposed method elicits a positive-definite correlation matrix of this Gaussian copula. The two proposed methods are designed to be used through interactive graphical software written in Java.

Kontkanen  P.  Myllymäki  P.  Silander  T.  Tirri  H.  Grünwald  P. 《Statistics and Computing》2000,10(1):39-54
In this paper we are interested in discrete prediction problems for a decision-theoretic setting, where the task is to compute the predictive distribution for a finite set of possible alternatives. This question is first addressed in a general Bayesian framework, where we consider a set of probability distributions defined by some parametric model class. Given a prior distribution on the model parameters and a set of sample data, one possible approach for determining a predictive distribution is to fix the parameters to the instantiation with the maximum a posteriori probability. A more accurate predictive distribution can be obtained by computing the evidence (marginal likelihood), i.e., the integral over all the individual parameter instantiations. As an alternative to these two approaches, we demonstrate how to use Rissanen's new definition of stochastic complexity for determining predictive distributions, and show how the evidence predictive distribution with Jeffreys' prior approaches the new stochastic complexity predictive distribution in the limit with increasing amount of sample data. To compare the alternative approaches in practice, each of the predictive distributions discussed is instantiated in the Bayesian network model family case. In particular, to determine Jeffreys' prior for this model family, we show how to compute the (expected) Fisher information matrix for a fixed but arbitrary Bayesian network structure. In the empirical part of the paper the predictive distributions are compared by using the simple tree-structured Naive Bayes model, which is used in the experiments for computational reasons. The experimentation with several public domain classification datasets suggests that the evidence approach produces the most accurate predictions in the log-score sense. The evidence-based methods are also quite robust in the sense that they predict surprisingly well even when only a small fraction of the full training set is used.

We introduce a general Monte Carlo method based on Nested Sampling (NS), for sampling complex probability distributions and estimating the normalising constant. The method uses one or more particles, which explore a mixture of nested probability distributions, each successive distribution occupying ∼e⁻¹ times the enclosed prior mass of the previous distribution. While NS technically requires independent generation of particles, Markov Chain Monte Carlo (MCMC) exploration fits naturally into this technique. We illustrate the new method on a test problem and find that it can achieve four times the accuracy of classic MCMC-based Nested Sampling, for the same computational effort; equivalent to a factor of 16 speedup. An additional benefit is that more samples and a more accurate evidence value can be obtained simply by continuing the run for longer, as in standard MCMC.
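As background for the abstract above, here is a minimal sketch of classic Nested Sampling, not the new method the paper introduces. It is a toy problem chosen for illustration: a uniform prior on [-5, 5] and a standard normal likelihood, so the constrained prior can be sampled exactly and the true normalising constant is close to 0.1. The function name and all parameter values are hypothetical.

```python
import math
import random


def nested_sampling_evidence(n_live=200, n_iter=2000, half_width=5.0, seed=0):
    """Classic Nested Sampling estimate of Z = integral of L(x) * prior(x) dx.

    Toy setup (an assumption for illustration): prior Uniform(-half_width,
    half_width), likelihood = standard normal density in x.
    """
    rng = random.Random(seed)

    def likelihood(x):
        return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

    live = [rng.uniform(-half_width, half_width) for _ in range(n_live)]
    z = 0.0
    x_prev = 1.0  # enclosed prior mass, starts at 1
    for i in range(1, n_iter + 1):
        # Discard the live point with the lowest likelihood.
        worst = min(range(n_live), key=lambda k: likelihood(live[k]))
        l_star = likelihood(live[worst])
        # Expected shrinkage: the prior mass contracts by exp(-1/n_live) per step.
        x_now = math.exp(-i / n_live)
        z += l_star * (x_prev - x_now)
        x_prev = x_now
        # Draw a replacement from the prior restricted to L > l_star.
        # Here that region is simply |x| < |discarded point|.
        bound = abs(live[worst])
        live[worst] = rng.uniform(-bound, bound)
    # Contribution of the remaining live points.
    z += x_prev * sum(likelihood(x) for x in live) / n_live
    return z


if __name__ == "__main__":
    print("estimated Z:", nested_sampling_evidence())
```

In realistic problems the exact constrained draw is unavailable, which is where MCMC exploration enters, as the abstract notes.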

We propose a semiparametric modeling approach for mixtures of symmetric distributions. The mixture model is built from a common symmetric density with different components arising through different location parameters. This structure ensures identifiability for mixture components, which is a key feature of the model as it allows applications to settings where primary interest is inference for the subpopulations comprising the mixture. We focus on the two-component mixture setting and develop a Bayesian model using parametric priors for the location parameters and for the mixture proportion, and a nonparametric prior probability model, based on Dirichlet process mixtures, for the random symmetric density. We present an approach to inference using Markov chain Monte Carlo posterior simulation. The performance of the model is studied with a simulation experiment and through analysis of a rainfall precipitation data set as well as with data on eruptions of the Old Faithful geyser.

I propose a method for inference in dynamic discrete choice models (DDCM) that utilizes Markov chain Monte Carlo (MCMC) and artificial neural networks (ANNs). MCMC is intended to handle high-dimensional integration in the likelihood function of richly specified DDCMs. ANNs approximate the dynamic-program (DP) solution as a function of the parameters and state variables prior to estimation to avoid having to solve the DP on each iteration. Potential applications of the proposed methodology include inference in DDCMs with random coefficients, serially correlated unobservables, and dependence across individual observations. The article discusses MCMC estimation of DDCMs, provides relevant background on ANNs, and derives a theoretical justification for the method. Experiments suggest this to be a promising approach.

Summary. This expository paper provides a framework for analysing de Finetti's representation theorem for exchangeable finitely additive probabilities. Such an analysis is justified by reasoning of statistical nature, since it is shown that the abandonment of the axiom of σ-additivity has some noteworthy consequences on the common interpretation of the Bayesian paradigm. The usual (strong) formulation of de Finetti's theorem is deduced from the finitely additive (weak) formulation, and it is used to solve the problem of stating the existence of a stochastic process, with given finite-dimensional probability distributions, whose sample paths are probability distributions. It is of importance, in particular, to specify prior distributions for nonparametric inferential problems in a Bayesian setting. Research partially supported by MPI (40% 1990, Gruppo Nazionale “Modelli Probabilistici e Statistica Matematica”).

We review the distributional transform of a random variable, some of its applications, and some related multivariate distributional transformations. The distributional transform is a useful tool which, in many respects, allows one to deal with general distributions in the same way as with continuous distributions. In particular, it allows one to give a simple proof of Sklar's theorem in the general case. It has been used in the literature for stochastic ordering results. It is also useful for an adequate definition of the conditional value at risk measure and for many further purposes. We also discuss the multivariate quantile transform as well as the multivariate extension of the distributional transform and some of their applications. In the final section we consider an application to an extension of a limit theorem for the empirical copula process, also called empirical dependence function, to general not necessarily continuous distributions. This is useful for constructing and analyzing tests of dependence properties for general distributions.
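A small sketch may help make the transform concrete. It uses the usual definition U = F(X−) + V · (F(X) − F(X−)), where V is Uniform(0, 1) and independent of X, which makes U uniform even when F has jumps. The Bernoulli example and the helper names `distributional_transform` and `bernoulli_cdfs` are assumptions chosen for illustration, not taken from the paper.

```python
import random


def distributional_transform(x, v, cdf, cdf_left):
    """Return F(x-) + v * (F(x) - F(x-)) for a randomiser v in [0, 1]."""
    return cdf_left(x) + v * (cdf(x) - cdf_left(x))


def bernoulli_cdfs(p):
    """Return (F, F-) for a Bernoulli(p) variable: F(x) = P(X <= x), F-(x) = P(X < x)."""
    def cdf(x):
        if x < 0:
            return 0.0
        return 1.0 - p if x < 1 else 1.0

    def cdf_left(x):
        if x <= 0:
            return 0.0
        return 1.0 - p if x <= 1 else 1.0

    return cdf, cdf_left


if __name__ == "__main__":
    p = 0.3
    cdf, cdf_left = bernoulli_cdfs(p)
    rng = random.Random(0)
    draws = []
    for _ in range(10000):
        x = 1 if rng.random() < p else 0
        draws.append(distributional_transform(x, rng.random(), cdf, cdf_left))
    # For a Uniform(0, 1) variable the sample mean should be near 0.5.
    print("sample mean of U:", sum(draws) / len(draws))
```

For X = 0 the transformed value falls in [0, 1 − p] and for X = 1 in [1 − p, 1], so the jumps of F are spread out uniformly by the randomiser.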

Testing for differences between two groups is a fundamental problem in statistics, and due to developments in Bayesian nonparametrics and semiparametrics there has been renewed interest in approaches to this problem. Here we describe a new approach to developing such tests and introduce a class of such tests that take advantage of developments in Bayesian nonparametric computing. This class of tests uses the connection between the Dirichlet process (DP) prior and the Wilcoxon rank sum test but extends this idea to the DP mixture prior. Here tests are developed that have appropriate frequentist sampling procedures for large samples but have the potential to outperform the usual frequentist tests. Extensions to interval and right censoring are considered and an application to a high-dimensional data set obtained from an RNA-Seq investigation demonstrates the practical utility of the method.

Abstract. We propose an objective Bayesian method for the comparison of all Gaussian directed acyclic graphical models defined on a given set of variables. The method, which is based on the notion of fractional Bayes factor (BF), requires a single default (typically improper) prior on the space of unconstrained covariance matrices, together with a prior sample size hyper‐parameter, which can be set to its minimal value. We show that our approach produces genuine BFs. The implied prior on the concentration matrix of any complete graph is a data‐dependent Wishart distribution, and this in turn guarantees that Markov equivalent graphs are scored with the same marginal likelihood. We specialize our results to the smaller class of Gaussian decomposable undirected graphical models and show that in this case they coincide with those recently obtained using limiting versions of hyper‐inverse Wishart distributions as priors on the graph‐constrained covariance matrices.

Models incorporating “latent” variables have been commonplace in financial, social, and behavioral sciences. The factor model, the most popular latent model, explains the continuous observed variables through a smaller set of latent variables (factors) by means of a linear relationship. However, complex data often simultaneously display asymmetric dependence, asymptotic dependence, and positive (negative) dependence between random variables, which linearity, Gaussian distributions, and many other extant distributions are not capable of modeling. This article proposes a nonlinear factor model that can model the above-mentioned variable dependence features but still possesses a simple form of factor structure. The random variables, marginally distributed as unit Fréchet distributions, are decomposed into max linear functions of underlying Fréchet idiosyncratic risks, transformed from Gaussian copula, and independent shared external Fréchet risks. By allowing the random variables to share underlying (latent) pervasive risks with random impact parameters, various dependence structures are created. This yields a promising new technique to generate families of distributions with simple interpretations. We examine the multivariate extreme value properties of the proposed model and investigate maximum composite likelihood methods for the impact parameters of the latent risks. The estimates are shown to be consistent. The estimation schemes are illustrated on several sets of simulated data, where comparisons of performance are addressed. We employ a bootstrap method to obtain standard errors in real data analysis. Real application to financial data reveals inherent dependencies that previous work has not disclosed and demonstrates the model’s interpretability to real data. Supplementary materials for this article are available online.

A strictly stationary time series is modelled directly, once the variables' realizations fit into a table: no knowledge of a distribution is required other than the prior discretization. A multiplicative model with combined random ‘Auto-Regressive’ and ‘Moving-Average’ parts is considered for the serial dependence. Based on a multi-sequence of unobserved series that serve as differences and differences of differences from the main building block, a causal version is obtained; a condition that secures an exponential rate of convergence for its expected random coefficients is presented. For the remainder, writing the conditional probability as a function of past conditional probabilities, is within reach: subject to the presence of the moving-average segment in the original equation, what could be a long process of elimination with mathematical arguments concludes with a new derivation that does not support a simplistic linear dependence on the lagged probability values.
