Similar Documents
20 similar documents found.
1.
We propose Bayesian methods with five types of priors to estimate cell probabilities in an incomplete multi-way contingency table under nonignorable nonresponse. In this situation, the maximum likelihood (ML) estimates often fall on the boundary of the parameter space, which makes them unstable. To deal with such a multi-way table, we present an EM algorithm that generalizes the algorithm previously used for incomplete one-way tables. Three of the five types of priors were introduced previously, while the other two are newly proposed to reflect different response patterns between respondents and nonrespondents. Data analysis and simulation studies show that Bayesian estimates based on the three existing priors can be worse than the ML estimates regardless of whether a boundary solution occurs, contrary to previous studies. The Bayesian estimates from the two new priors are the most preferable when a boundary solution occurs. We provide an illustrative example using data from a study of the relationship between a mother's smoking and her newborn's weight.
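For intuition, here is a minimal sketch of the EM skeleton for a two-way table with supplemental margins under ignorable nonresponse; the paper's method augments this scheme with parameters for the nonignorable response mechanism. All function and variable names are illustrative.

```python
import numpy as np

def em_incomplete_table(full, row_only, col_only, n_iter=200, tol=1e-10):
    """EM for a two-way table with supplemental margins (ignorable case).

    full     : (R, C) counts classified on both variables
    row_only : (R,)   counts with the column variable missing
    col_only : (C,)   counts with the row variable missing
    Returns the estimated cell probabilities, shape (R, C).
    """
    p = np.ones_like(full, dtype=float)
    p /= p.sum()
    n = full.sum() + row_only.sum() + col_only.sum()
    for _ in range(n_iter):
        # E-step: allocate partially classified counts to cells in
        # proportion to the current cell probabilities.
        row_alloc = row_only[:, None] * p / p.sum(axis=1, keepdims=True)
        col_alloc = col_only[None, :] * p / p.sum(axis=0, keepdims=True)
        completed = full + row_alloc + col_alloc
        # M-step: re-estimate cell probabilities from the completed table.
        p_new = completed / n
        if np.max(np.abs(p_new - p)) < tol:
            return p_new
        p = p_new
    return p
```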

2.
We develop strategies for Bayesian modelling as well as model comparison, averaging and selection for compartmental models, with particular emphasis on those that occur in the analysis of positron emission tomography (PET) data. Both modelling and computational issues are considered. Biophysically inspired informative priors are developed for the problem at hand, and a comparison with default vague priors shows that the proposed modelling is not overly sensitive to prior specification. It is also shown that an additive normal error structure does not describe measured PET data well, despite being very widely used, and that within a simple Bayesian framework simultaneous parameter estimation and model comparison can be performed with a more general noise model. The proposed approach is compared with standard techniques using both simulated and real data. In addition to good, robust estimation performance, the proposed technique automatically provides a characterisation of the uncertainty in the resulting estimates, which can be considerable in applications such as PET.

3.
Algebraic Markov Bases and MCMC for Two-Way Contingency Tables
ABSTRACT. The Diaconis–Sturmfels algorithm is a method for sampling from conditional distributions, based on the algebraic theory of toric ideals. This algorithm is applied to categorical data analysis through the notion of a Markov basis. One application of this algorithm is a non-parametric Monte Carlo approach to goodness-of-fit tests for contingency tables. In this paper, we characterize or compute the Markov bases for some log-linear models for two-way contingency tables using techniques from computational commutative algebra, namely Gröbner bases. This applies to a large set of cases including independence, quasi-independence, symmetry, and quasi-symmetry. Three examples of quasi-symmetry and quasi-independence from Fingleton (Models of Category Counts, Cambridge University Press, Cambridge, 1984) and Agresti (An Introduction to Categorical Data Analysis, Wiley, New York, 1996) illustrate the practical applicability and relevance of this algebraic methodology.
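As a concrete illustration, here is a minimal sketch of one step of the Diaconis–Sturmfels walk for the independence model, where the Markov basis is the set of 2×2 basic moves and the target is the conditional distribution given the margins, proportional to 1/∏ n_ij!. The code and names are illustrative, not the authors' implementation.

```python
import numpy as np
from math import lgamma

def ds_step(table, rng):
    """One Metropolis step of the Diaconis-Sturmfels walk.

    Stationary distribution: the conditional distribution of the table
    given its margins under independence (hypergeometric), which is
    proportional to 1 / prod(n_ij!).
    """
    R, C = table.shape
    i1, i2 = rng.choice(R, size=2, replace=False)
    j1, j2 = rng.choice(C, size=2, replace=False)
    eps = rng.choice([-1, 1])
    move = np.zeros_like(table)
    move[i1, j1] = move[i2, j2] = eps
    move[i1, j2] = move[i2, j1] = -eps
    prop = table + move
    if (prop < 0).any():          # move leaves the fiber: stay put
        return table
    # log acceptance ratio: sum log n_ij! - sum log n'_ij!
    log_r = sum(lgamma(n + 1) for n in table.flat) - \
            sum(lgamma(n + 1) for n in prop.flat)
    if np.log(rng.uniform()) < min(0.0, log_r):
        return prop
    return table
```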

4.
Bivariate exponential models have often been used for the analysis of competing risks data involving two correlated risk components. Competing risks data consist only of the time to failure and the cause of failure. In situations where there is positive probability of simultaneous failure, possibly the most widely used model is the Marshall–Olkin (J. Amer. Statist. Assoc. 62 (1967) 30) bivariate lifetime model. This distribution is not absolutely continuous, as it involves a singular component. However, the likelihood function based on the competing risks data is then identifiable, and any inference, Bayesian or frequentist, can be carried out in a straightforward manner. For the analysis of absolutely continuous bivariate exponential models, standard approaches often run into difficulty due to the lack of a fully identifiable likelihood (Basu and Ghosh, Commun. Statist. Theory Methods 9 (1980) 1515). To overcome the nonidentifiability, the usual frequentist approach is based on an integrated likelihood. Such an approach is implicit in Wada et al. (Calcutta Statist. Assoc. Bull. 46 (1996) 197), who proved some related asymptotic results. We offer in this paper an alternative Bayesian approach. Since systematic prior elicitation is often difficult, the present study focuses on Bayesian analysis with noninformative priors. It turns out that with an appropriate reparameterization, standard noninformative priors such as Jeffreys' prior and its variants can be applied directly even though the likelihood is not fully identifiable. Two noninformative priors are developed, consisting of Laplace's prior for nonidentifiable parameters and Laplace's and Jeffreys' priors for identifiable parameters. The resulting Bayesian procedures possess some frequentist optimality properties as well. Finally, these Bayesian methods are illustrated with analyses of a data set from a lung cancer clinical trial conducted by the Eastern Cooperative Oncology Group.
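The Marshall–Olkin construction is easy to simulate, which makes the singular component visible: a common exponential shock produces ties X = Y with positive probability. A minimal sketch, with hypothetical rate arguments.

```python
import numpy as np

def marshall_olkin(lam1, lam2, lam12, size, rng=None):
    """Simulate (X, Y) from the Marshall-Olkin bivariate exponential.

    X = min(E1, E12), Y = min(E2, E12) with independent exponential
    shocks; the common shock E12 puts positive probability on X == Y,
    the singular component mentioned in the abstract.
    """
    rng = rng or np.random.default_rng()
    e1 = rng.exponential(1.0 / lam1, size)
    e2 = rng.exponential(1.0 / lam2, size)
    e12 = rng.exponential(1.0 / lam12, size)
    return np.minimum(e1, e12), np.minimum(e2, e12)
```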

5.
Structural models, or dynamic linear models as they are known in the Bayesian literature, have been widely used to model and predict time series via a decomposition into unobservable components. Owing to the direct interpretation of their parameters, structural models are a powerful and simple methodology for analyzing time series in areas such as economics, climatology, and the environmental sciences. The parameters of such models can be estimated either by maximum likelihood or by Bayesian procedures, the latter generally implemented using conjugate priors, and there are plenty of works in the literature employing both methods. But are there situations in which one of these approaches should be preferred? In this work, instead of conjugate priors for the hyperparameters, the Jeffreys prior is used in the Bayesian approach, along with the uniform prior, and the results are compared with the maximum likelihood method in an extensive Monte Carlo study. Interval estimation is also evaluated; to this end, bootstrap confidence intervals are introduced in the context of structural models and their performance is compared with that of the asymptotic and credibility intervals. A real time series from a Brazilian electric company is used as an illustration.

6.
In a 2 × 2 contingency table, when the sample size is small there may be a number of cells that contain few or no observations, usually referred to as sparse data. In such cases, a common recommendation in conventional frequentist methods is to add a small constant to every cell of the observed table before estimating the unknown parameters. However, this approach relies on asymptotic properties of the estimates and may work poorly for small samples. An alternative is to use Bayesian methods, which provide better insight into the problem of sparse data coupled with few centers, a setting in which the analysis would otherwise be difficult to carry out. In this article, a hierarchical Bayesian model is applied to multicenter data on the effect of a surgical treatment with standard foot care among leprosy patients with posterior tibial nerve damage, summarized as seven 2 × 2 tables. Markov chain Monte Carlo (MCMC) techniques are applied to estimate the parameters of interest under the sparse data setup.
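For reference, the conventional frequentist fix mentioned above is essentially a one-liner; a sketch assuming a 2×2 array of counts, where the constant 0.5 gives the Haldane–Anscombe correction.

```python
import numpy as np

def corrected_odds_ratio(table, const=0.5):
    """Odds ratio for a 2x2 count table with a small constant added to
    every cell (Haldane-Anscombe correction), the conventional
    frequentist fix for sparse cells that the abstract contrasts with
    the hierarchical Bayesian approach."""
    t = np.asarray(table, dtype=float) + const
    return (t[0, 0] * t[1, 1]) / (t[0, 1] * t[1, 0])
```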

7.
The implementation of the Bayesian paradigm for model comparison can be problematic. In particular, prior distributions on the parameter space of each candidate model require special care. While it is well known that improper priors cannot be routinely used for Bayesian model comparison, we argue that the use of proper conventional priors under each model should also be regarded with suspicion, especially when comparing models of different dimensions. The basic idea is that priors should not be assigned separately under each model; rather, they should be related across models in order to achieve some degree of compatibility, and thus allow fairer and more robust comparisons. In this connection, the intrinsic prior as well as the expected posterior prior (EPP) methodology represents a useful tool. In this paper we develop a procedure based on EPPs to perform Bayesian model comparison for discrete undirected decomposable graphical models, although our method could be adapted to deal with directed acyclic graph models as well. We present two possible approaches: one based on imaginary data and one that makes use of a limited number of actual observations. The methodology is illustrated through the analysis of a 2×3×4 contingency table.

8.
The Bayesian CART (classification and regression tree) approach proposed by Chipman, George and McCulloch (1998) entails putting a prior distribution on the set of all CART models and then using stochastic search to select a model. The main thrust of this paper is to propose a new class of hierarchical priors that enhance the potential of this Bayesian approach. These priors indicate a preference for smooth local mean structure, resulting in tree models that shrink predictions from adjacent terminal nodes toward each other. Past methods for tree shrinkage have searched for trees without shrinking and applied shrinkage to the identified tree only after the search. By using hierarchical priors in the stochastic search, the proposed method searches directly for shrunk trees that fit well, improving the tree through shrinkage of predictions.

9.
The analysis of incomplete contingency tables is a practical and interesting problem. In this paper, we provide characterizations of the various missing mechanisms of a variable in terms of response and nonresponse odds for two- and three-dimensional incomplete tables. Log-linear parametrization and some distinctive properties of the missing data models for these tables are discussed. All possible cases in which data on one, two or all variables may be missing are considered. We study the missingness of each variable in a model, which is more insightful for analyzing cross-classified data than the missingness of the outcome vector. For sensitivity analysis of incomplete tables, we propose easily verifiable procedures to evaluate the missing at random (MAR), missing completely at random (MCAR) and not missing at random (NMAR) assumptions of the missing data models. These methods depend only on joint and marginal odds computed from the fully and partially observed counts in the tables, respectively. Finally, some real-life datasets are analyzed to illustrate our results, which are confirmed by simulation studies.

10.
Summary. We deal with contingency table data that are used to examine the relationships between a set of categorical variables or factors. We assume that such relationships can be adequately described by the conditional independence structure imposed by an undirected graphical model. If the contingency table is large, a desirable simplified interpretation can be achieved by combining some categories, or levels, of the factors. We introduce conditions under which such an operation does not alter the Markov properties of the graph. Implementation of these conditions leads to Bayesian model uncertainty procedures based on reversible jump Markov chain Monte Carlo methods. The methodology is illustrated on contingency tables ranging from 2×3×4 up to 4×5×5×2×2.

11.
Summary. We propose an approach for assessing the risk of individual identification in the release of categorical data. This requires the accurate calculation of predictive probabilities for those cells in a contingency table that have small sample frequencies, making the problem somewhat different from usual contingency table estimation, where interest generally focuses on regions of high probability. Our approach is Bayesian and provides posterior predictive probabilities of identification risk. By incorporating model uncertainty in our analysis, we can provide more realistic estimates of disclosure risk for individual cell counts than are provided by methods that ignore the multivariate structure of the data set.

12.
In data sets with many predictors, algorithms for identifying a good subset of predictors are often used. Most such algorithms do not allow for any relationships between predictors. For example, stepwise regression might select a model containing an interaction AB but neither main effect A nor B. This paper develops mathematical representations of this and other relations between predictors, which may then be incorporated into a model selection procedure. A Bayesian approach that goes beyond the standard independence prior for variable selection is adopted, and preference for certain models is interpreted as prior information. Priors relevant to arbitrary interactions and polynomials, dummy variables for categorical factors, competing predictors, and restrictions on the size of the models are developed. Since the relations developed are for priors, they may be incorporated into any Bayesian variable selection algorithm for any type of linear model. The application of the methods is illustrated via the stochastic search variable selection algorithm of George and McCulloch (1993), which is modified to utilize the new priors. The performance of the approach is illustrated with two constructed examples and a computer performance dataset.
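As a minimal sketch of how such a relation can be encoded as a prior, consider strong heredity for a single interaction: the indicator for AB can be active only when both main effects are included. The inclusion probabilities here are illustrative, not the paper's.

```python
def heredity_prior(gamma_a, gamma_b, gamma_ab,
                   p_main=0.5, p_int_both=0.5, p_int_else=0.0):
    """Prior probability of the inclusion indicators under strong
    heredity: the interaction AB may enter only when both main
    effects A and B are in the model (p_int_else = 0)."""
    p = (p_main if gamma_a else 1 - p_main) * \
        (p_main if gamma_b else 1 - p_main)
    p_int = p_int_both if (gamma_a and gamma_b) else p_int_else
    return p * (p_int if gamma_ab else 1 - p_int)

# Example: a model with AB but without B gets prior probability 0.
assert heredity_prior(True, False, True) == 0.0
```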

13.
Summary. A method of incorporating prior opinion in contingency tables is described. The method can be used to incorporate beliefs of independence or symmetry, and extensions are straightforward. Logistic normal distributions expressing such beliefs are used as priors for the cell probabilities, and posterior estimates are derived. Empirical Bayes methods are also discussed, and approximate posterior variances are provided. The methods are illustrated by a numerical example.
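The logistic normal construction is compact: draw a normal vector and map it to the probability simplex with a softmax. A minimal sketch; choosing the normal mean from an independence fit is one way to express the independence belief mentioned above, and the mean and covariance arguments are assumptions.

```python
import numpy as np

def logistic_normal_cells(mu, cov, size, rng=None):
    """Draw cell-probability vectors from a logistic normal prior:
    gamma ~ N(mu, cov), p = softmax(gamma). Setting mu to the
    log-expected counts of an independence fit expresses a prior
    belief in independence."""
    rng = rng or np.random.default_rng()
    g = rng.multivariate_normal(mu, cov, size)
    g -= g.max(axis=1, keepdims=True)      # stabilize the softmax
    e = np.exp(g)
    return e / e.sum(axis=1, keepdims=True)
```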

14.
Most models for incomplete data are formulated within the selection model framework. Pattern-mixture models are increasingly seen as a viable alternative, from both an interpretational and a computational point of view (Little 1993; Hogan and Laird 1997; Ekholm and Skinner 1998). Whereas most applications involve either continuous normally distributed data or simplified categorical settings such as contingency tables, we show how a multivariate odds ratio model (Molenberghs and Lesaffre 1994, 1998) can be used to fit pattern-mixture models to repeated binary outcomes with continuous covariates. Apart from point estimation, useful methods for interval estimation are presented, and data from a clinical study are analyzed to illustrate the methods.

15.
Influence diagnostics are investigated in this study. In particular, an approach based on the generalized linear mixed model setting is presented for modelling ordered categorical counts in stratified contingency tables. Deletion diagnostics and their first-order approximations are developed for assessing the stratum-specific influence on parameter estimates in the models. To illustrate the proposed diagnostic technique, the method is applied to two data sets: a clinical trial and a survey study. The two examples demonstrate that the presence of influential strata may substantially change the results of an ordinal contingency table analysis.

16.
The estimation of the parameters of a mixed geometric lifetime model with two subpopulations is investigated using a Bayesian approach and Type I group-censored samples. The Bayes estimates are derived under squared error, minimum expected, general entropy, and LINEX loss functions, with both informative and diffuse priors. A practical example given by Nelson (W.B. Nelson, Hazard plotting methods for analysis of life data with different failure modes, J. Qual. Technol. 2 (1970), pp. 126–149) is considered. A simulation study, including a comparison of risks, is carried out.
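For reference, under the LINEX loss L(d, θ) = exp{a(d − θ)} − a(d − θ) − 1, the Bayes estimate has the closed form d* = −(1/a) log E[e^{−aθ} | data]. A sketch computing it from posterior draws; how the draws are obtained is assumed.

```python
import numpy as np

def bayes_linex(theta_draws, a):
    """Bayes estimate under LINEX loss from posterior samples:
    d* = -(1/a) * log E[exp(-a * theta) | data]."""
    theta = np.asarray(theta_draws, dtype=float)
    # log-mean-exp for numerical stability
    m = (-a * theta).max()
    return -(m + np.log(np.mean(np.exp(-a * theta - m)))) / a
```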

17.
In this paper, we develop a methodology for the dynamic Bayesian analysis of generalized odds ratios in contingency tables. It is standard practice to assume a normal distribution for the random effects in the dynamic system equations. Nevertheless, the normality assumption may be unrealistic in some applications, and hence the validity of the resulting inferences can be dubious. We therefore assume a multivariate skew-normal distribution for the error terms in the system equation at each step. Moreover, we introduce a moving-average approach to eliciting the hyperparameters. Both simulated and real data are analyzed to illustrate the application of this methodology.
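A univariate version of the key ingredient is easy to state: the standard skew-normal SN(α) has the stochastic representation X = δ|Z0| + sqrt(1 − δ²) Z1 with δ = α/sqrt(1 + α²). A minimal sketch of this representation (the paper uses its multivariate generalization); the shape parameter is an assumption.

```python
import numpy as np

def rskewnorm(alpha, size, rng=None):
    """Draws from the standard skew-normal SN(alpha) via
    X = delta * |Z0| + sqrt(1 - delta^2) * Z1,
    with delta = alpha / sqrt(1 + alpha^2)."""
    rng = rng or np.random.default_rng()
    delta = alpha / np.sqrt(1.0 + alpha ** 2)
    z0 = np.abs(rng.standard_normal(size))
    z1 = rng.standard_normal(size)
    return delta * z0 + np.sqrt(1.0 - delta ** 2) * z1
```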

18.
In recent years, there has been considerable interest in regression models based on zero-inflated distributions. These models are commonly encountered in many disciplines, such as medicine, public health, and the environmental sciences. The zero-inflated Poisson (ZIP) model has typically been considered for these problems. However, the ZIP model can fail if the non-zero counts are overdispersed relative to the Poisson distribution, in which case the zero-inflated negative binomial (ZINB) model may be more appropriate. In this paper, we present a Bayesian approach to fitting the ZINB regression model. This model assumes that an observed zero may come either from a point mass at zero or from the negative binomial component. The likelihood function is used not only to compute Bayesian model selection measures but also to develop Bayesian case-deletion influence diagnostics based on q-divergence measures. The approach can be easily implemented using standard Bayesian software such as WinBUGS. The performance of the proposed method is evaluated in a simulation study. Further, a real data set is analyzed, where we show that ZINB regression models seem to fit the data better than their Poisson counterparts.
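A minimal sketch of the ZINB likelihood underlying the model: a zero arises from the point mass with probability pi or from the negative binomial otherwise. The (r, p) parameterization of the negative binomial is an assumption; a regression version would link pi and the NB mean to covariates.

```python
import numpy as np
from scipy.stats import nbinom

def zinb_loglik(y, pi, r, p):
    """Log-likelihood of the zero-inflated negative binomial:
    P(Y = 0) = pi + (1 - pi) * NB(0; r, p)
    P(Y = k) = (1 - pi) * NB(k; r, p),  k > 0."""
    y = np.asarray(y)
    ll = np.where(
        y == 0,
        np.log(pi + (1.0 - pi) * nbinom.pmf(0, r, p)),
        np.log1p(-pi) + nbinom.logpmf(y, r, p),
    )
    return ll.sum()
```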

19.
Cell counts in contingency tables can be smoothed using log-linear models. Recently, sampling-based methods such as Markov chain Monte Carlo (MCMC) have been introduced, making it possible to sample from posterior distributions. The novelty of the approach presented here is that all conditional distributions can be specified directly, so that straightforward Gibbs sampling is possible. Thus, the model is constructed in a way that makes burn-in and convergence checking a relatively minor issue. The emphasis of this paper is on smoothing cell counts in contingency tables rather than on estimating regression parameters. The prior distribution therefore consists of two stages: a normal nonconjugate prior at the first stage and a vague prior for the hyperparameters at the second stage. The smoothed counts tend to compromise between the observed data and a log-linear model. The methods are demonstrated on a sparse data table taken from a multi-center clinical trial.
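The compromise behaviour described above can be illustrated with a one-line normal-normal shrinkage on the log scale; this is only a caricature of the paper's Gibbs sampler, and all arguments are hypothetical.

```python
import numpy as np

def smooth_counts(observed, fitted, tau2, sigma2):
    """Illustrative normal-normal compromise on the log scale: the
    smoothed log-counts shrink the (log) observed data toward a
    log-linear fit, with the weight set by the variance ratio.
    This mimics the compromise behaviour described in the abstract,
    not the paper's full Gibbs sampler."""
    w = tau2 / (tau2 + sigma2)          # weight on the observed data
    obs = np.log(np.asarray(observed, dtype=float) + 0.5)
    fit = np.log(np.asarray(fitted, dtype=float))
    return np.exp(w * obs + (1.0 - w) * fit)
```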

20.
In this paper, the Bayesian analysis of incomplete categorical data under informative general censoring proposed by Paulino and Pereira (1995) is revisited. That analysis is based on Dirichlet priors and can be applied to any missing-data pattern. However, few properties of the posterior distributions are known, and therefore severe limitations on the posterior computations remain. Here it is shown how a Monte Carlo simulation approach based on an alternative parameterisation can be used to overcome these computational difficulties. The proposed simulation approach makes available the approximate estimation of general parametric functions and can be implemented in a very straightforward way.
