
1.

Fitting finite mixture models using iterative Monte Carlo classification

Jing Xu Jun Ma 《Communications in Statistics - Theory and Methods》2017,46(13):6684-6693

Parameters of a finite mixture model are often estimated by the expectation–maximization (EM) algorithm, which maximizes the observed-data log-likelihood function. This paper proposes an alternative approach for fitting finite mixture models. Our method, called iterative Monte Carlo classification (IMCC), is also an iterative fitting procedure. Within each iteration, it first estimates the membership probabilities for each data point, namely the conditional probability that the data point belongs to a particular mixing component given its observed value. It then classifies each data point into a component distribution using the estimated conditional probabilities and the Monte Carlo method. Finally, it updates the parameters of each component distribution based on the classified data. Simulation studies were conducted to compare IMCC with some other algorithms for fitting normal-mixture and t-mixture densities.
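
The three steps described in this abstract (estimate membership probabilities, classify by a random draw, re-estimate from the classified data) can be sketched roughly as follows for a univariate normal mixture. This is only an illustrative reading of the abstract, not the authors' implementation: the function names `normal_pdf` and `imcc_fit`, the quantile-based initialisation, the handling of near-empty components, and returning the last iterate are all assumptions.

```python
import math
import random
import statistics


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def imcc_fit(data, k=2, n_iter=100, seed=0):
    """Hypothetical IMCC-style fit of a k-component univariate normal mixture."""
    rng = random.Random(seed)
    n = len(data)
    ordered = sorted(data)
    # Assumed initialisation: means at evenly spaced sample quantiles.
    mus = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    sigmas = [max(statistics.pstdev(data), 1e-6)] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for x in data:
            # Step 1: unnormalised membership probabilities for this point.
            dens = [w * normal_pdf(x, m, s)
                    for w, m, s in zip(weights, mus, sigmas)]
            if sum(dens) <= 0.0:
                dens = [1.0] * k  # numerical underflow: fall back to uniform
            # Step 2: Monte Carlo classification, one random draw per point.
            j = rng.choices(range(k), weights=dens)[0]
            groups[j].append(x)
        # Step 3: update each component from the points assigned to it.
        for j, group in enumerate(groups):
            if len(group) < 2:
                continue  # assumed safeguard: keep the previous parameters
            weights[j] = len(group) / n
            mus[j] = statistics.fmean(group)
            sigmas[j] = max(statistics.pstdev(group), 1e-6)
        total = sum(weights)
        weights = [w / total for w in weights]
    return weights, mus, sigmas
```

Because the classification step is random, the returned parameters fluctuate from iteration to iteration; averaging the later iterates would be one way to stabilise them, which is again an assumption rather than something the abstract states.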

2.

Quantifying uncertainty in transdimensional Markov chain Monte Carlo using discrete Markov models

Heck Daniel W. Overstall Antony M. Gronau Quentin F. Wagenmakers Eric-Jan 《Statistics and Computing》2019,29(4):631-643

Bayesian analysis often concerns an evaluation of models with different dimensionality as is necessary in, for example, model selection or mixture models. To facilitate this evaluation, transdimensional Markov chain Monte Carlo (MCMC) relies on sampling a discrete indexing variable to estimate the posterior model probabilities. However, little attention has been paid to the precision of these estimates. If only few switches occur between the models in the transdimensional MCMC output, precision may be low and assessment based on the assumption of independent samples misleading. Here, we propose a new method to estimate the precision based on the observed transition matrix of the model-indexing variable. Assuming a first-order Markov model, the method samples from the posterior of the stationary distribution. This allows assessment of the uncertainty in the estimated posterior model probabilities, model ranks, and Bayes factors. Moreover, the method provides an estimate for the effective sample size of the MCMC output. In two model selection examples, we show that the proposed approach provides a good assessment of the uncertainty associated with the estimated posterior model probabilities.
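
As a rough illustration of the idea in this abstract, the sketch below counts the observed transitions of the model-indexing variable, draws each row of the transition matrix from a Dirichlet posterior, and computes the stationary distribution of every draw. It is not the authors' code: the function names, the symmetric Dirichlet prior parameter `eps`, and the use of power iteration to find the stationary distribution are assumptions made for this example.

```python
import random


def transition_counts(z, n_models):
    """Count observed transitions i -> j in the model-index sequence z."""
    counts = [[0] * n_models for _ in range(n_models)]
    for a, b in zip(z, z[1:]):
        counts[a][b] += 1
    return counts


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) vector by normalising gamma variates."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    return [x / total for x in g]


def stationary(P, n_steps=1000):
    """Approximate the stationary distribution of P by power iteration.

    Assumes the chain mixes well enough for n_steps to suffice.
    """
    m = len(P)
    pi = [1.0 / m] * m
    for _ in range(n_steps):
        pi = [sum(pi[i] * P[i][j] for i in range(m)) for j in range(m)]
    total = sum(pi)
    return [p / total for p in pi]


def stationary_posterior(z, n_models, n_draws=500, eps=1.0, seed=0):
    """Posterior draws of the stationary distribution (model probabilities).

    eps is an assumed symmetric Dirichlet prior parameter for each row.
    """
    rng = random.Random(seed)
    counts = transition_counts(z, n_models)
    draws = []
    for _ in range(n_draws):
        P = [sample_dirichlet([c + eps for c in row], rng) for row in counts]
        draws.append(stationary(P))
    return draws
```

The spread of the returned draws for a given model gives a sense of how uncertain its estimated posterior probability is; a chain that rarely switches models yields few off-diagonal counts and therefore widely dispersed draws.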

3.

Bayesian analysis of mixture modelling using the multivariate t distribution

Lin Tsung I. Lee Jack C. Ni Huey F. 《Statistics and Computing》2004,14(2):119-130

A finite mixture model using the multivariate t distribution has been shown to be a robust extension of normal mixtures. In this paper, we present a Bayesian approach for inference about parameters of t-mixture models. The specifications of prior distributions are weakly informative to avoid causing nonintegrable posterior distributions. We present two efficient EM-type algorithms for computing the joint posterior mode with the observed data and an incomplete future vector as the sample. Markov chain Monte Carlo sampling schemes are also developed to obtain the target posterior distribution of parameters. The advantages of the Bayesian approach over the maximum likelihood method are demonstrated via a set of real data.

4.

Alternatives to post‐processing posterior predictive p values

Jørund Gåsemyr Ida Scheel 《Scandinavian Journal of Statistics》2019,46(4):1252-1273

The posterior predictive p value (ppp) was invented as a Bayesian counterpart to classical p values. The methodology can be applied to discrepancy measures involving both data and parameters and can, hence, be targeted to check for various modeling assumptions. The interpretation can, however, be difficult since the distribution of the ppp value under modeling assumptions varies substantially between cases. A calibration procedure has been suggested, treating the ppp value as a test statistic in a prior predictive test. In this paper, we suggest that a prior predictive test may instead be based on the expected posterior discrepancy, which is somewhat simpler, both conceptually and computationally. Since both these methods require the simulation of a large posterior parameter sample for each of an equally large prior predictive data sample, we further suggest looking for ways to match the given discrepancy by a computation-saving conflict measure. This approach is also based on simulations but only requires sampling from two different distributions representing two contrasting information sources about a model parameter. The conflict measure methodology is also more flexible in that it handles non-informative priors without difficulty. We compare the different approaches theoretically in some simple models and in a more complex applied example.

5.

Bayesian Accelerated Life Testing under Competing Weibull Causes of Failure

Soumya Roy Chiranjit Mukhopadhyay 《Communications in Statistics - Theory and Methods》2014,43(10-12):2429-2451

Consider a J-component series system which is put on Accelerated Life Test (ALT) involving K stress variables. First, a general formulation of ALT is provided for the log-location-scale family of distributions. A general stress translation function of the location parameter of the component log-lifetime distribution is proposed which can accommodate standard ones like Arrhenius, power-rule, log-linear model, etc., as special cases. Later, the component lives are assumed to be independent Weibull random variables with a common shape parameter. A full Bayesian methodology is then developed by letting only the scale parameters of the Weibull component lives depend on the stress variables through the general stress translation function. Priors on all the parameters, namely the stress coefficients and the Weibull shape parameter, are assumed to be log-concave and independent of each other. This assumption is to facilitate Gibbs sampling from the joint posterior. The samples thus generated from the joint posterior are then used to obtain the Bayesian point and interval estimates of the system reliability at usage condition.

6.

Analyzing Quantitative Trait Loci for the Arabidopsis thaliana using Markov Chain Monte Carlo Model Composition with restricted and unrestricted model spaces

Edward L. Boone Susan J. Simmons Keying Ye Ann E. Stapleton 《Statistical Methodology》2006,3(1):69

Quantitative Trait Loci (QTL) mapping is a growing field in statistical genetics. However, dealing with this type of data from a statistical perspective is often perilous. In this paper we extend and apply a Markov Chain Monte Carlo Model Composition (MC³) technique to a data set of the Arabidopsis thaliana plant for locating the QTL mapping associated with cotyledon opening. The posterior model probabilities as well as the marginal posterior probabilities of each locus belonging to the model are presented. Furthermore, we show how the MC³ method can be used to deal with the situation where the sample size is less than the number of parameters in a model using a restricted model space approach.

7.

A Simulation Comparison of Approximate Tests for Fixed Effects in Random Coefficients Growth Curve Models

Julia Volaufova Lynn Roy Lamotte 《Communications in Statistics - Simulation and Computation》2013,42(2):344-359

Often, the response variables on sampling units are observed repeatedly over time. The sampling units may come from different populations, such as treatment groups. This setting is routinely modeled by a random coefficients growth curve model, and the techniques of general linear mixed models are applied to address the primary research aim. An alternative approach is to reduce each subject’s data to summary measures, such as within-subject averages or regression coefficients. One may then test for equality of means of the summary measures (or functions of them) among treatment groups. Here, we compare by simulation the performance characteristics of three approximate tests based on summary measures and one based on the full data, focusing mainly on accuracy of p-values. We find that performances of these procedures can be quite different for small samples in several different configurations of parameter values. The summary-measures approach performed at least as well as the full-data mixed models approach.

8.

An Objective Bayesian Criterion to Determine Model Prior Probabilities

Cristiano Villa Stephen Walker 《Scandinavian Journal of Statistics》2015,42(4):947-966

We discuss the problem of selecting among alternative parametric models within the Bayesian framework. For model selection problems that involve non-nested models, the common objective choice of a prior on the model space is the uniform distribution. The same applies to situations where the models are nested. It is our contention that assigning equal prior probability to each model is oversimplistic. Consequently, we introduce a novel approach to objectively determine model prior probabilities, conditionally on the choice of priors for the parameters of the models. The idea is based on the notion of the worth of having each model within the selection process. At the heart of the procedure is the measure of this worth using the Kullback–Leibler divergence between densities from different models.

9.

Bayesian analysis of threshold autoregressions

Lyle D. Broemeling Peyton Cook 《Communications in Statistics - Theory and Methods》2013,42(9):2459-2482

A nonasymptotic Bayesian approach is developed for analysis of data from threshold autoregressive processes with two regimes. Using the conditional likelihood function, the marginal posterior distribution for each of the parameters is derived along with posterior means and variances. A test for linear functions of the autoregressive coefficients is presented. The approach presented uses a posterior p-value averaged over the values of the threshold. The one-step ahead predictive distribution is derived along with the predictive mean and variance. In addition, equivalent results are derived conditional upon a value of the threshold. A numerical example is presented to illustrate the approach.

10.

Remedying the Neyman–Scott phenomenon in model discrimination

《Journal of Statistical Computation and Simulation》2012,82(6):749-757

The objective of this paper is to investigate through simulation the possible presence of the incidental parameters problem when performing frequentist model discrimination with stratified data. In this context, model discrimination amounts to considering a structural parameter taking values in a finite space with k points, k≥2. This setting seems not yet to have been considered in the literature on the Neyman–Scott phenomenon. Here we provide Monte Carlo evidence of the severity of the incidental parameters problem in the model discrimination setting as well, and propose a remedy for a special class of models. In particular, we focus on models that are scale families in each stratum. We consider traditional model selection procedures, such as the Akaike and Takeuchi information criteria, together with the best frequentist selection procedure based on maximization of the marginal likelihood induced by the maximal invariant, or of its Laplace approximation. Results of two Monte Carlo experiments indicate that when the sample size in each stratum is fixed and the number of strata increases, correct selection probabilities for traditional model selection criteria may approach zero, unlike what happens for model discrimination based on exact or approximate marginal likelihoods. Finally, two examples with real data sets are given.

11.

R package rjmcmc: reversible jump MCMC using post‐processing

Nicholas Gelling Matthew R. Schofield Richard J. Barker 《Australian & New Zealand Journal of Statistics》2019,61(2):189-212

The rjmcmc package for R implements the post-processing reversible jump Markov chain Monte Carlo (MCMC) algorithm of Barker & Link. MCMC output from each of the models is used to estimate posterior model probabilities and Bayes factors. Automatic differentiation is used to simplify implementation. The package is demonstrated on two examples.

12.

A Multiple Comparison Procedure Based on a Variant of the Schwarz Information Criterion in a Mixed Model

Junfeng Shang 《Communications in Statistics - Theory and Methods》2013,42(6):1095-1109

Repeated measurements are collected in a variety of situations and are generally characterized by a mixed model where the correlation within the subject is specified by the random effects. In such a mixed model, we propose a multiple comparison procedure based on a variant of the Schwarz information criterion (SIC; Schwarz, G. (1978). Estimating the dimension of a model. Ann. Statist. 6: 461–464). The derivation of SIC indicates that SIC serves as an asymptotic approximation to a transformation of the Bayesian posterior probability of a candidate model. Therefore, an approximated posterior probability for a candidate model can be calculated based upon SIC. We suggest a variant of SIC which includes the terms which are asymptotically negligible in the derivation of SIC. The variant improves upon the performance of SIC in small and moderate sample-size applications. Based upon the proposed variant, the corresponding posterior probability can be calculated for each candidate model. A hypothesis test for multiple comparisons involves one or more models in the candidate class; the posterior probability of the hypothesis is therefore evaluated as the sum of the posterior probabilities for the models associated with the test. The approximated posterior probability based on the variant accommodates the effect of the prior on each model in the candidate class, and therefore is more effectively approximated than that based on SIC for conducting multiple comparisons. We derive the computational formula of the approximated posterior probability based on the variant in the mixed model. The applications in two real data sets demonstrate that the proposed procedure based on the SIC variant can perform effectively in multiple comparisons.
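
For orientation, the sketch below shows the standard SIC-based approximation that this abstract builds on, in which the posterior probability of a model is taken as proportional to its prior probability times exp(-SIC/2), and the probability of a hypothesis is the sum over its associated models. It uses the plain SIC, not the paper's variant, whose extra terms are not given in the abstract; the function names and the uniform default prior are assumptions for this example.

```python
import math


def sic_posterior_probs(sic_values, prior=None):
    """Approximate posterior model probabilities from SIC values.

    Uses P(M_k | y) proportional to prior_k * exp(-SIC_k / 2), with a
    uniform prior assumed when none is supplied.
    """
    m = len(sic_values)
    if prior is None:
        prior = [1.0 / m] * m
    log_w = [-0.5 * s + math.log(p) for s, p in zip(sic_values, prior)]
    # Subtract the maximum before exponentiating to avoid underflow.
    top = max(log_w)
    w = [math.exp(lw - top) for lw in log_w]
    total = sum(w)
    return [x / total for x in w]


def hypothesis_prob(probs, model_indices):
    """Posterior probability of a hypothesis: sum over its associated models."""
    return sum(probs[i] for i in model_indices)
```

For example, with three candidate models, `hypothesis_prob(sic_posterior_probs([100.0, 102.0, 110.0]), [0, 1])` gives the approximate probability of a hypothesis that is consistent with the first two models.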

13.

Clustering time-course microarray data using functional Bayesian infinite mixture model

Claudia Angelini Marianna Pensky 《Journal of applied statistics》2012,39(1):129-149

This paper presents a new Bayesian, infinite mixture model based, clustering approach, specifically designed for time-course microarray data. The problem is to group together genes which have “similar” expression profiles, given the set of noisy measurements of their expression levels over a specific time interval. In order to capture temporal variations of each curve, a non-parametric regression approach is used. Each expression profile is expanded over a set of basis functions and the sets of coefficients of each curve are subsequently modeled through a Bayesian infinite mixture of Gaussian distributions. The task of finding clusters of genes with similar expression profiles is thereby reduced to the problem of grouping together genes whose coefficients are sampled from the same distribution in the mixture. A Dirichlet process prior is naturally employed in such models, since it allows one to deal automatically with the uncertainty about the number of clusters. The posterior inference is carried out by a split and merge MCMC sampling scheme which integrates out parameters of the component distributions and updates only the latent vector of the cluster membership. The final configuration is obtained via the maximum a posteriori estimator. The performance of the method is studied using synthetic and real microarray data and is compared with the performances of competitive techniques.

14.

Model based labeling for mixture models

Weixin Yao 《Statistics and Computing》2012,22(2):337-347

Label switching is one of the fundamental problems for Bayesian mixture model analysis. Due to the permutation invariance of the mixture posterior, we can consider that the posterior of an m-component mixture model is a mixture distribution with m! symmetric components and therefore the object of labeling is to recover one of the components. In order to do labeling, we propose to first fit a symmetric m!-component mixture model to the Markov chain Monte Carlo (MCMC) samples and then choose the label for each sample by maximizing the corresponding classification probabilities, which are the probabilities of all possible labels for each sample. Both parametric and semi-parametric ways are proposed to fit the symmetric mixture model for the posterior. Compared to the existing labeling methods, our proposed method aims to approximate the posterior directly and provides the labeling probabilities for all possible labels and thus has a model explanation and theoretical support. In addition, we introduce a situation in which the “ideally” labeled samples are available and thus can be used to compare different labeling methods. We demonstrate the success of our new method in dealing with the label switching problem using two examples.

15.

Posterior sampling in two classes of multivariate fractionally integrated models: corrigendum to Ravishanker, N. and B. K. Ray (1997) Australian Journal of Statistics 39 (3), 295–311

Ross Doppelt Keith O'Hara 《Australian & New Zealand Journal of Statistics》2019,61(1):85-87

We discuss posterior sampling for two distinct multivariate generalisations of the univariate autoregressive integrated moving average (ARIMA) model with fractional integration. The existing approach to Bayesian estimation, introduced by Ravishanker & Ray, claims to provide a posterior-sampling algorithm for fractionally integrated vector autoregressive moving averages (FIVARMAs). We show that this algorithm produces posterior draws for vector autoregressive fractionally integrated moving averages (VARFIMAs), a model of independent interest that has not previously received attention in the Bayesian literature.

16.

A general long-term aging model with different underlying activation mechanisms: Modeling, Bayesian estimation, and case influence diagnostics

Adriano K. Suzuki Gladys D. C. Barriga Francisco Louzada Vicente G. Cancho 《Communications in Statistics - Theory and Methods》2017,46(6):3080-3098

In this paper we propose a general cure rate aging model. Our approach enables different underlying activation mechanisms which lead to the event of interest. The number of competing causes of the event of interest is assumed to follow a logarithmic distribution. The model is parameterized in terms of the cured fraction, which is then linked to covariates. We explore the use of Markov chain Monte Carlo methods to develop a Bayesian analysis for the proposed model. Moreover, model selection for comparing the fitted models is discussed, and case-deletion influence diagnostics are developed for the joint posterior distribution based on the ψ-divergence, which has several divergence measures as particular cases, such as the Kullback–Leibler (K-L), J-distance, L₁ norm, and χ² divergence measures. Simulation studies are performed and experimental results are illustrated based on a real malignant melanoma data set.

17.

Predicting Turning Points Through the Integration of Multiple Models

David T. Li Jeffrey H. Dorfman 《Journal of Business & Economic Statistics》2013,31(4):421-428

A new method for forming composite turning-point (or other qualitative) forecasts is proposed. Rather than forming composite forecasts by the standard Bayesian approach with weights proportional to each model's posterior odds, weights are assigned to the individual models in proportion to the probability of each model's having the correct turning-point prediction. These probabilities are generated by logit models estimated with data on the models' past turning-point forecasts. An empirical application to gross national product/gross domestic product forecasting for 18 Organization for Economic Cooperation and Development countries demonstrates the potential benefits of the procedure.

18.

Two-stage approaches to the analysis of occupancy data I: the homogeneous case (analysis of occupancy data)

Natalie Karavarsamis Richard M. Huggins 《Communications in Statistics - Theory and Methods》2020,49(19):4751-4761

Occupancy models are used in statistical ecology to estimate species dispersion. The two components of an occupancy model are the detection and occupancy probabilities, with the main interest being in the occupancy probabilities. We show that for the homogeneous occupancy model there is an orthogonal transformation of the parameters that gives a natural two-stage inference procedure based on a conditional likelihood. We then extend this to a partial likelihood that gives explicit estimators of the model parameters. By allowing the separate modeling of the detection and occupancy probabilities, the extension of the two-stage approach to more general models has the potential to simplify the computational routines used there.

19.

Bayesian inference and diagnostics in zero-inflated generalized power series regression model

Gladys D. Cacsire Barriga Dipak K. Dey 《Communications in Statistics - Theory and Methods》2013,42(22):6553-6568

The paper provides a Bayesian analysis for the zero-inflated regression models based on the generalized power series distribution. The approach is based on Markov chain Monte Carlo methods. The residual analysis is discussed and case-deletion influence diagnostics are developed for the joint posterior distribution, based on the ψ-divergence, which includes several divergence measures such as the Kullback–Leibler, J-distance, L₁ norm, and χ² divergences, in zero-inflated general power series models. The methodology is illustrated with a data set collected by wildlife biologists in a state park in California.

20.

Prior Density Selection as a Particular Case of Bayesian Model Selection: A Predictive Approach

Julián de la Horra María Teresa Rodríguez-Bernal 《Communications in Statistics - Theory and Methods》2013,42(8):1387-1396

A Bayesian model consists of two elements: a sampling model and a prior density. The problem of selecting a prior density is nothing but the problem of selecting a Bayesian model where the sampling model is fixed. A predictive approach is used through a decision problem where the loss function is the squared L² distance between the sampling density and the posterior predictive density, because the aim of the method is to choose the prior whose posterior predictive density is as close as possible to the sampling density. An algorithm is developed for solving the problem; this algorithm is based on Lavine's linearization technique.