Similar literature
 Found 20 similar documents; search took 46 ms
1.
Categorical data frequently arise in applications in the Social Sciences. In such applications, the class of log-linear models, based on either a Poisson or (product) multinomial response distribution, is a flexible model class for inference and prediction. In this paper we consider the Bayesian analysis of both Poisson and multinomial log-linear models. It is often convenient to model multinomial or product multinomial data as observations of independent Poisson variables. For multinomial data, Lindley (1964) [20] showed that this approach leads to valid Bayesian posterior inferences when the prior density for the Poisson cell means factorises in a particular way. We develop this result to provide a general framework for the analysis of multinomial or product multinomial data using a Poisson log-linear model. Valid finite population inferences are also available, which can be particularly important in modelling social data. We then focus particular attention on multivariate normal prior distributions for the log-linear model parameters. Here, an improper prior distribution for certain Poisson model parameters is required for valid multinomial analysis, and we derive conditions under which the resulting posterior distribution is proper. We also consider the construction of prior distributions across models, and for model parameters, when uncertainty exists about the appropriate form of the model. We present classes of Poisson and multinomial models, invariant under certain natural groups of permutations of the cells. We demonstrate that, if prior belief concerning the model parameters is also invariant, as is the case in a ‘reference’ analysis, then the choice of prior distribution is considerably restricted. The analysis of multivariate categorical data in the form of a contingency table is considered in detail. We illustrate the methods with two examples.
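The Poisson device mentioned in this abstract rests on a standard identity, sketched here in generic notation. The symbols y_i, μ_i, π_i, N and k are illustrative and are not taken from the paper. For independent Poisson counts, the joint probability splits into a Poisson term for the total and a multinomial term for the cells given the total:

```latex
\prod_{i=1}^{k} \frac{e^{-\mu_i}\,\mu_i^{y_i}}{y_i!}
  \;=\;
  \underbrace{\frac{e^{-\mu}\,\mu^{N}}{N!}}_{\text{Poisson}(\mu)\ \text{for}\ N}
  \;\times\;
  \underbrace{\frac{N!}{\prod_{i=1}^{k} y_i!}\,\prod_{i=1}^{k}\pi_i^{y_i}}_{\text{Multinomial}(N,\pi)\ \text{for}\ y \mid N},
\qquad
\mu=\sum_{i=1}^{k}\mu_i,\quad N=\sum_{i=1}^{k}y_i,\quad \pi_i=\frac{\mu_i}{\mu}.
```

Because the likelihood separates into a factor in μ and a factor in π, a prior that treats μ and π as independent gives a posterior for π that agrees with a direct multinomial analysis. The abstract does not state whether this is exactly the factorisation condition the paper uses.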

2.
This paper studies the multinomial model for 2×2 contingency table data with some cell counts missing. Various hypotheses of interest, including row-column independence, are tested using Bayes factors, which represent the ratio of the posterior odds to the prior odds for the null hypothesis. The Dirichlet-Beta family of prior distributions is considered for the multinomial parameters conditional on the complement of the null hypothesis. The Bayes factor for the incomplete data is a mixture of the Bayes factors for different possibilities for the full data.

3.
This article describes a convenient method of selecting Metropolis–Hastings proposal distributions for multinomial logit models. There are two key ideas involved. The first is that multinomial logit models have a latent variable representation similar to that exploited by Albert and Chib (J Am Stat Assoc 88:669–679, 1993) for probit regression. Augmenting the latent variables replaces the multinomial logit likelihood function with the complete data likelihood for a linear model with extreme value errors. While no conjugate prior is available for this model, a least squares estimate of the parameters is easily obtained. The asymptotic sampling distribution of the least squares estimate is Gaussian with known variance. The second key idea in this paper is to generate a Metropolis–Hastings proposal distribution by conditioning on the estimator instead of the full data set. The resulting sampler has many of the benefits of so-called tailored or approximation Metropolis–Hastings samplers. However, because the proposal distributions are available in closed form, they can be implemented without numerical methods for exploring the posterior distribution. The algorithm is geometrically ergodic, its computational burden is minor, and it requires minimal user input. Improvements to the sampler’s mixing rate are investigated. The algorithm is also applied to partial credit models describing ordinal item response data from the 1998 National Assessment of Educational Progress. Its application to hierarchical models and Poisson regression is briefly discussed.

4.
A Bayesian method is proposed for estimating the cell probabilities of several multinomial distributions. Parameters of different distributions are taken to be a priori exchangeable. The prior specification is based upon mixtures of a hierarchical distribution, referred to as the multivariate “Dirichlet-Dirichlet” distribution. The analysis is facilitated by a multinomial approximation relating to the multinomial-Dirichlet distribution. The posterior estimates depend upon measures of entropy for the various distributions and shrink the individual observed proportions towards values obtained by pooling the data across the distributions. As well as incorporating prior information they are particularly useful when some of the cell frequencies are zero. We use them to investigate a numerical classification of males of various vocations, according to cause of death.

5.
In this paper, we propose novel methods of quantifying expert opinion about prior distributions for multinomial models. Two different multivariate priors are elicited using median and quartile assessments of the multinomial probabilities. First, we start by eliciting a univariate beta distribution for the probability of each category. Then we elicit the hyperparameters of the Dirichlet distribution, as a tractable conjugate prior, from those of the univariate betas through various forms of reconciliation using least-squares techniques. However, a multivariate copula function will give a more flexible correlation structure between multinomial parameters if it is used as their multivariate prior distribution. So, second, we use beta marginal distributions to construct a Gaussian copula as a multivariate normal distribution function that binds these marginals and expresses the dependence structure between them. The proposed method elicits a positive-definite correlation matrix of this Gaussian copula. The two proposed methods are designed to be used through interactive graphical software written in Java.

6.
This article presents a Bayesian analysis of a multinomial probit model by building on previous work that specified priors on identified parameters. The main contribution of our article is to propose a prior on the covariance matrix of the latent utilities that permits elements of the inverse of the covariance matrix to be identically zero. This allows a parsimonious representation of the covariance matrix when such parsimony exists. The methodology is applied to both simulated and real data, and its ability to obtain more efficient estimators of the covariance matrix and regression coefficients is assessed using simulated data.

7.
The objective of this article is to propose a method of exploring the mechanism of expectation formation based on qualitative survey data. The survey data are regarded as a sample from a multinomial distribution whose parameters are time-variant functions of inflation expectations. The parameters are estimated using a Bayesian recursive approach, which is a generalization of the Kalman filtering technique. For illustrative purposes, the method is applied to Japanese data. One notable finding from the empirical analysis is that the expectation formation process of Japanese enterprises has varied greatly over time.

8.
When available data comprise a number of sampled households in each of a number of income classes, the likelihood function is obtained from a multinomial distribution with the income class population proportions as the unknown parameters. Two methods for going from this likelihood function to a posterior distribution on the Gini coefficient are investigated. In the first method, two alternative assumptions about the underlying income distribution are considered, namely a lognormal distribution and the Singh–Maddala (1976) income distribution. In these cases the likelihood function is reparameterized and the Gini coefficient is a nonlinear function of the income distribution parameters. The Metropolis algorithm is used to find the corresponding posterior distributions of the Gini coefficient from a sample of Bangkok households. The second method does not require an assumption about the nature of the income distribution, but uses (a) triangular prior distributions, and (b) beta prior distributions, on the location of mean income within each income class. By sampling from these distributions, and the Dirichlet posterior distribution of the income class proportions, alternative posterior distributions of the Gini coefficient are calculated.
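The second method in this abstract can be illustrated with a minimal Python sketch, which is not the authors' implementation. It makes several assumptions that the abstract does not supply: a symmetric Dirichlet prior with parameter `prior_alpha`, symmetric triangular distributions for the class means, finite bounds on every income class (including the top one), and a piecewise-linear Lorenz curve that ignores inequality within classes. All function names are hypothetical.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) vector by normalising independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def grouped_gini(props, class_means):
    """Gini coefficient from class proportions and class mean incomes.

    Classes must be in ascending income order. The Lorenz curve is taken to be
    piecewise linear, so inequality within a class is ignored (an assumption).
    """
    total_income = sum(p * m for p, m in zip(props, class_means))
    gini, cum_share = 1.0, 0.0
    for p, m in zip(props, class_means):
        new_share = cum_share + p * m / total_income
        gini -= p * (cum_share + new_share)  # twice the trapezoid area
        cum_share = new_share
    return gini


def posterior_gini_draws(counts, bounds, prior_alpha=1.0, n_draws=2000, seed=0):
    """Monte Carlo draws of the Gini coefficient.

    counts: sampled households per income class (multinomial data).
    bounds: (low, high) income limits per class; finite limits are assumed.
    """
    rng = random.Random(seed)
    post_alpha = [c + prior_alpha for c in counts]  # Dirichlet posterior
    draws = []
    for _ in range(n_draws):
        props = sample_dirichlet(post_alpha, rng)
        # Symmetric triangular draw for the mean income within each class.
        means = [rng.triangular(lo, hi) for lo, hi in bounds]
        draws.append(grouped_gini(props, means))
    return draws


if __name__ == "__main__":
    # Hypothetical data: four income classes with made-up counts and limits.
    example = posterior_gini_draws(
        counts=[120, 200, 90, 30],
        bounds=[(0, 10), (10, 20), (20, 40), (40, 100)],
    )
    print(sum(example) / len(example))  # posterior mean of the Gini coefficient
```

Replacing `rng.triangular` with a rescaled `rng.betavariate` draw would give a beta-prior variant corresponding to option (b) in the abstract.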

9.
Stylometry refers to the statistical analysis of literary style of authors based on the characteristics of expression in their writings. We propose an approach to stylometry based on a Bayesian Dirichlet process mixture model using multinomial word frequency data. The parameters of the multinomial distribution of word frequency data are the “word prints” of the author. Our approach is based on model-based clustering of the vectors of probability values of the multinomial distribution. The resultant clusters identify different writing styles that assist in author attribution for disputed works in a corpus. As a test case, the methodology is applied to the problem of authorship attribution involving the Federalist papers. Our results are consistent with previous stylometric analyses of these papers.

10.
In this article, we present a Bayesian modeling approach for response variables restricted to the interval (0, 1), such as proportions and rates, using the simplex distribution for cases in which data have a longitudinal form, taking random effects into account. In order to investigate the stability of the posterior distribution, we study, through sensitivity analysis, the effect on the final estimates of three different uniparametric prior distributions for the variance parameters of the random effects. For this purpose, we consider homogeneous and heterogeneous structures for parameters in location and dispersion submodels. Models and results are illustrated with applications to simulated and real data.

11.
We extend the standard multivariate mixed model by incorporating a smooth time effect and relaxing distributional assumptions. We propose a semiparametric Bayesian approach to multivariate longitudinal data using a mixture of Polya trees prior distribution. Usually, the distribution of random effects in a longitudinal data model is assumed to be Gaussian. However, the normality assumption may be suspect, particularly if the estimated longitudinal trajectory parameters exhibit multimodality and skewness. In this paper we propose a mixture of Polya trees prior density to address the limitations of the parametric random effects distribution. We illustrate the methodology by analyzing data from a recent HIV-AIDS study.

12.
The mixed random-effects model is commonly used in longitudinal data analysis within either a frequentist or a Bayesian framework. Here we consider the case in which prior knowledge is available for some of the parameters but not for the rest. We therefore apply a hybrid approach to the random-effects model: the parameters with prior information are estimated via a Bayesian procedure, and the remaining parameters by frequentist maximum likelihood estimation (MLE), simultaneously on the same model. In practice, partial prior information is often available, for example on covariates such as age and gender. This information can be used to obtain accurate estimates in the mixed random-effects model. A series of simulation studies was performed to compare the results with those of the commonly used random-effects model, with and without partial prior information. The results of hybrid estimation (HYB) and MLE were very close to each other. The θ values estimated with partial prior information (HYB) were closer to the true θ values and showed smaller variances than those estimated without partial prior information (MLE). Measured against the true θ values, the mean squared errors are much smaller for HYB than for MLE. This advantage of HYB is most evident in longitudinal data with a small sample size. The HYB and MLE methods are applied to a real longitudinal data set for illustration.

13.
A method of regularized discriminant analysis for discrete data, denoted DRDA, is proposed. This method is related to the regularized discriminant analysis conceived by Friedman (1989) in a Gaussian framework for continuous data. Here, we are concerned with discrete data and consider the classification problem using the multinomial distribution. DRDA has been conceived for the small-sample, high-dimensional setting. This method occupies an intermediate position between multinomial discrimination, the first-order independence model and kernel discrimination. DRDA is characterized by two parameters, the values of which are calculated by minimizing a sample-based estimate of future misclassification risk by cross-validation. The first parameter is a complexity parameter which provides class-conditional probabilities as a convex combination of those derived from the full multinomial model and the first-order independence model. The second parameter is a smoothing parameter associated with the discrete kernel of Aitchison and Aitken (1976). The optimal complexity parameter is calculated first; then, holding this parameter fixed, the optimal smoothing parameter is determined. A modified approach, in which the smoothing parameter is chosen first, is discussed. The efficiency of the method is compared with that of other classical methods through application to data.

14.
In this paper, we consider joint modelling of repeated measurements and competing risks failure time data. For the competing risks failure time data, we consider a semiparametric mixture model in which proportional hazards models are specified for the failure times conditional on cause, and a multinomial model is specified for the marginal distribution of cause conditional on covariates. We also derive a score test based on joint modelling of repeated measurements and competing risks failure time data to identify longitudinal biomarkers or surrogates for a time to event outcome in competing risks data.

15.
The paper subjectively elicits the Dirichlet prior hyperparameters based on realistic opinions collected from experts. The elicitation procedure has several stages: the choice of experts, the formulation of relevant questions to put to the experts, the pooling of opinions, the quantification of information, and then the formation of an exact prior distribution through quantile assessment based on an iterative procedure. The resulting prior distribution is used to provide the Bayes analysis assuming a multinomial sampling plan. The results are illustrated by means of a data set involving two lifestyle factors of gallbladder carcinoma patients. The results closely match the opinion given by the medical experts.

16.
The paper investigates a Bayesian hierarchical model for the analysis of categorical longitudinal data from a large social survey of immigrants to Australia. Data for each subject are observed on three separate occasions, or waves, of the survey. One of the features of the data set is that observations for some variables are missing for at least one wave. A model for the employment status of immigrants is developed by introducing, at the first stage of a hierarchical model, a multinomial model for the response and then subsequent terms are introduced to explain wave and subject effects. To estimate the model, we use the Gibbs sampler, which allows missing data for both the response and the explanatory variables to be imputed at each iteration of the algorithm, given some appropriate prior distributions. After accounting for significant covariate effects in the model, results show that the relative probability of remaining unemployed diminished with time following arrival in Australia.

17.
We consider inference in randomized longitudinal studies with missing data that is generated by skipped clinic visits and loss to follow-up. In this setting, it is well known that full data estimands are not identified unless unverified assumptions are imposed. We assume a non-future dependence model for the drop-out mechanism and partial ignorability for the intermittent missingness. We posit an exponential tilt model that links non-identifiable distributions and distributions identified under partial ignorability. This exponential tilt model is indexed by non-identified parameters, which are assumed to have an informative prior distribution, elicited from subject-matter experts. Under this model, full data estimands are shown to be expressed as functionals of the distribution of the observed data. To avoid the curse of dimensionality, we model the distribution of the observed data using a Bayesian shrinkage model. In a simulation study, we compare our approach to a fully parametric and a fully saturated model for the distribution of the observed data. Our methodology is motivated by, and applied to, data from the Breast Cancer Prevention Trial.

18.
Recent work by Miller and Landis (1991) discusses generalized variance component models for polytomous responses. This work is adapted to longitudinal models for repeated measures of individuals having polytomous responses. In this setting, individuals are considered to be “clusters”. The resulting simplifications are discussed. First, each response has a multinomial distribution with N=1. Second, observed cluster proportions in the variance component estimates must be replaced by their expectations. This technique accommodates patients with missing data in a sequence of repeated observations.

19.
Robust Bayesian analysis deals simultaneously with a class of possible prior distributions, instead of a single distribution. This paper concentrates on the surprising results that can be obtained when applying the theory to problems of testing precise hypotheses when the “objective” class of prior distributions is assumed. First, an example is given demonstrating the serious inadequacy of P-values for this problem. Next, it is shown how the approach can provide statistical quantification of Occam's Razor, the famous principle of science that advocates choice of the simpler of two hypothetical explanations of data. Finally, the theory is applied to multinomial testing. Research supported by the National Science Foundation, Grant DMS-8923071, and by NASA Contract NAS5-29285 for the Hubble Space Telescope.

20.
This article considers the problem of testing the validity of the assumption that the underlying distribution of life is Pareto. For complete and censored samples, the relationship between the Pareto and the exponential distributions could be of vital importance in testing the validity of this assumption. For grouped uncensored data, the classical Pearson χ² test based on the multinomial model can be used. Attention in this article is confined to grouped data with withdrawals within intervals. Graphical as well as analytical procedures will be presented. Maximum likelihood estimators for the parameters of the Pareto distribution based on grouped data will be derived.
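For the grouped uncensored case mentioned in this abstract, the classical Pearson statistic can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes the two-parameter Pareto form F(x) = 1 − (scale/x)^shape with known parameters, it does not cover the withdrawals within intervals that the article addresses, and the function names and data are hypothetical.

```python
def pareto_cdf(x, scale, shape):
    """CDF of the assumed Pareto form: F(x) = 1 - (scale / x) ** shape for x >= scale."""
    if x < scale:
        return 0.0
    return 1.0 - (scale / x) ** shape


def pearson_chi2(observed, edges, scale, shape):
    """Pearson chi-square statistic for grouped counts under a Pareto model.

    observed: counts per interval.
    edges: interval boundaries, one more than len(observed); the last edge may
           be float("inf") for an open-ended top interval.
    """
    n = sum(observed)
    stat = 0.0
    for count, lo, hi in zip(observed, edges[:-1], edges[1:]):
        # Multinomial cell probability implied by the Pareto model.
        prob = pareto_cdf(hi, scale, shape) - pareto_cdf(lo, scale, shape)
        expected = n * prob
        stat += (count - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    # Hypothetical grouped lifetimes with scale = 1 and shape = 1.
    print(pearson_chi2([60, 20, 20], [1.0, 2.0, 4.0, float("inf")], 1.0, 1.0))
```

With fully specified parameters, the statistic is usually referred to a χ² distribution with one fewer degree of freedom than the number of intervals. When the parameters are estimated from the grouped data, one further degree of freedom is normally subtracted per estimated parameter.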
