Similar Articles
20 similar articles found (search time: 0 ms)
1.
Cell counts in contingency tables can be smoothed using log-linear models. Recently, sampling-based methods such as Markov chain Monte Carlo (MCMC) have been introduced, making it possible to sample from posterior distributions. The novelty of the approach presented here is that all conditional distributions can be specified directly, so that straightforward Gibbs sampling is possible. Thus, the model is constructed in a way that makes burn-in and checking convergence a relatively minor issue. The emphasis of this paper is on smoothing cell counts in contingency tables, and not so much on estimation of regression parameters. Therefore, the prior distribution consists of two stages: a normal nonconjugate prior at the first stage, and a vague prior for hyperparameters at the second stage. The smoothed counts tend to compromise between the observed data and a log-linear model. The methods are demonstrated with a sparse data table taken from a multi-center clinical trial. The research for the first author was supported by the Brain Pool program of the Korean Federation of Science and Technology Societies. The research for the second author was partially supported by KOSEF through the Statistical Research Center for Complex Systems at Seoul National University.
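The compromise between observed counts and a log-linear fit that this abstract describes can be sketched without the full Gibbs sampler. The following is a minimal illustration only: the table is hypothetical, and a hand-picked weight w stands in for the weighting that the paper's hierarchical prior would produce.

```python
# Observed sparse 2x3 contingency table (hypothetical data).
obs = [[5, 0, 2],
       [1, 4, 3]]

N = sum(sum(row) for row in obs)
row_tot = [sum(row) for row in obs]
col_tot = [sum(obs[i][j] for i in range(len(obs))) for j in range(len(obs[0]))]

# Fitted counts under the independence log-linear model.
fit = [[row_tot[i] * col_tot[j] / N for j in range(len(obs[0]))]
       for i in range(len(obs))]

# Smoothed counts: a convex compromise between data and model.
# w is fixed by hand here, purely for illustration; the paper lets
# the posterior determine how far counts shrink toward the model.
w = 0.7
smooth = [[w * obs[i][j] + (1 - w) * fit[i][j] for j in range(len(obs[0]))]
          for i in range(len(obs))]
```

Note that the zero cell receives a positive smoothed count, which is the practical point of smoothing sparse tables.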

2.
This paper suggests a Bayesian approach to the reconstruction of a 2 × 2 contingency table where some of the observations are only partially categorized and others are fully categorized. In contrast, most previous Bayesian and non-Bayesian analyses of the partially categorized data problem have been concerned with estimation of the parameters that generated the data. We show in an example that estimates may not be extremely sensitive to the weight placed on prior information relative to the sample data.

3.
4.
Measures of association are often used to describe the relationship between row and column variables in two-dimensional contingency tables. It is not uncommon in biomedical research to categorize continuous variables to obtain a two-dimensional table. In these situations it is desirable that the measure of association not be too sensitive to changes in the number of categories or to the choice of cut points. To accomplish this objective we attempt to find a measure of association that closely approximates the corresponding measure of association for the underlying distribution. Measures that are close to the underlying measure for various table sizes and cut points are called stable measures.

5.
A representation based on sums and differences of terms of the form 2n log n (the lnn function) is introduced to express likelihood-ratio chi-square test statistics in contingency table analysis. This provides a concise, explicit form for partitioning chi-square statistics in accordance with hierarchical models. The lnn representation gives students insights into the construction of test statistics, and assists in relating identical forms under differing model sets. Hierarchies are presented for independence and equi-probability in two-way tables, for symmetry in correlated square tables, for independence-and-homogeneity of two-way responses across levels of a factor, and for mutual independence in three-way tables, along with relevant partitions of chi-square.
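The lnn representation can be checked numerically. A small sketch with hypothetical counts, verifying that the independence G-squared statistic for a two-way table equals a signed sum of lnn terms over cells, margins, and the grand total:

```python
import math

def lnn(x):
    """The lnn function: 2 * x * log(x), with lnn(0) = 0."""
    return 2.0 * x * math.log(x) if x > 0 else 0.0

# Two-way table (hypothetical counts).
n = [[10, 20], [30, 40]]
r = [sum(row) for row in n]                              # row totals
c = [sum(n[i][j] for i in range(2)) for j in range(2)]   # column totals
N = sum(r)

# G^2 for independence, written directly ...
G2_direct = sum(2 * n[i][j] * math.log(n[i][j] / (r[i] * c[j] / N))
                for i in range(2) for j in range(2))

# ... and as sums and differences of lnn terms.
G2_lnn = (sum(lnn(n[i][j]) for i in range(2) for j in range(2))
          - sum(lnn(ri) for ri in r)
          - sum(lnn(cj) for cj in c)
          + lnn(N))
```

The two expressions agree term by term after expanding the logarithm, which is what makes the lnn form convenient for partitioning.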

6.
In this paper we present a simulation and graphics-based model checking and model comparison methodology for the Bayesian analysis of contingency tables. We illustrate the approach by testing the hypotheses of independence and symmetry on complete and incomplete simulated tables.

7.
The correspondence analysis (CA) method appears to be an effective tool for analysis of interrelations between rows and columns in two-way contingency data. A discrete version of the method, box clustering, is developed in the paper using an approximation version of the CA model extended to the case when CA factor values are required to be Boolean. Several properties of the proposed SEFIT-BOX algorithm are proved to facilitate interpretation of its output. It is also shown that two known partitioning algorithms (applied within row or column sets only) could be considered as locally optimal algorithms for fitting the model, and extensions of these algorithms to a simultaneous row and column partitioning problem are proposed.

8.
This paper extends an analysis of variance for categorical data (CATANOVA) procedure to multidimensional contingency tables involving several factors and a response variable measured on a nominal scale. Using an appropriate measure of total variation for multinomial data, partial and multiple association measures are developed as R2 quantities which parallel the analogous statistics in multiple linear regression for quantitative data. In addition, test statistics are derived in terms of these R2 criteria. Finally, this CATANOVA approach is illustrated within the context of a three-way contingency table from a multicenter clinical trial.
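The Gini-based measure of variation underlying CATANOVA can be illustrated in the one-factor case (the paper's multi-factor extension is not reproduced here). A sketch with a hypothetical 2×3 table, using Light and Margolin's total and within-group sums to form an R2-type quantity:

```python
# Rows = factor levels, columns = nominal response categories (hypothetical).
tab = [[20, 10, 5],
       [8, 15, 12]]

n_i = [sum(row) for row in tab]                                      # group sizes
n_j = [sum(tab[i][j] for i in range(len(tab))) for j in range(len(tab[0]))]
N = sum(n_i)

# Gini-based total and within-group variation (Light-Margolin CATANOVA).
TSS = N / 2 - sum(nj ** 2 for nj in n_j) / (2 * N)
WSS = sum(ni / 2 - sum(tab[i][j] ** 2 for j in range(len(tab[0]))) / (2 * ni)
          for i, ni in enumerate(n_i))
BSS = TSS - WSS

# R2-type association measure, paralleling multiple linear regression.
R2 = BSS / TSS
```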

9.
In this article, a Bayesian approach is proposed for the estimation of log odds ratios and intraclass correlations over a two-way contingency table, including intraclass correlated cells. Required likelihood functions of log odds ratios are obtained, and determination of prior structures is discussed. Hypothesis testing for log odds ratios and intraclass correlations by using the posterior simulations is outlined. Because the proposed approach includes no asymptotic theory, it is useful for the estimation and hypothesis testing of log odds ratios in the presence of certain intraclass correlation patterns. A family health status and limitations data set is analyzed by using the proposed approach in order to figure out the impact of intraclass correlations on the estimates and hypothesis tests of log odds ratios. Although intraclass correlations are small in the data set, we find that even small intraclass correlations can significantly affect the estimates and test results, and our approach is useful for the estimation and testing of log odds ratios in the presence of intraclass correlations.
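Setting aside the paper's intraclass-correlation structure, posterior simulation of a log odds ratio can be sketched with a plain Dirichlet posterior on the 2×2 cell probabilities (hypothetical counts, uniform prior, standard library only):

```python
import math
import random

random.seed(1)

# 2x2 table (hypothetical): exposure by disease status.
n11, n12, n21, n22 = 30, 10, 15, 25

def dirichlet_sample(alphas):
    """Sample from a Dirichlet via independent gammas (stdlib only)."""
    g = [random.gammavariate(a, 1.0) for a in alphas]
    s = sum(g)
    return [x / s for x in g]

# Posterior under a uniform Dirichlet(1,1,1,1) prior on cell probabilities.
draws = []
for _ in range(4000):
    p11, p12, p21, p22 = dirichlet_sample([n11 + 1, n12 + 1, n21 + 1, n22 + 1])
    draws.append(math.log((p11 * p22) / (p12 * p21)))

# Equal-tailed 95% credible interval for the log odds ratio.
draws.sort()
ci = (draws[int(0.025 * len(draws))], draws[int(0.975 * len(draws))])
```

A credible interval excluding zero plays the role of the posterior-simulation hypothesis test outlined in the abstract.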

10.
For high-dimensional data, it is a tedious task to determine anomalies such as outliers. We present a novel outlier detection method for high-dimensional contingency tables. We use the class of decomposable graphical models to model the relationship among the variables of interest, which can be depicted by an undirected graph called the interaction graph. Given an interaction graph, we derive a closed-form expression of the likelihood ratio test (LRT) statistic and an exact distribution for efficient simulation of the test statistic. An observation is declared an outlier if it deviates significantly from the approximated distribution of the test statistic under the null hypothesis. We demonstrate the use of the LRT outlier detection framework on genetic data modeled by Chow–Liu trees.
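As a crude stand-in for the decomposable-model LRT (not reproduced here), the underlying idea of flagging entries that deviate from a fitted model can be sketched with standardized Pearson residuals under independence, on a hypothetical table with one aberrant cell:

```python
import math

# Hypothetical 3x3 table with an aberrant cell at (0, 2).
tab = [[40, 30, 2],
       [35, 28, 30],
       [30, 25, 60]]

R, C = len(tab), len(tab[0])
row = [sum(r) for r in tab]
col = [sum(tab[i][j] for i in range(R)) for j in range(C)]
N = sum(row)

# Standardized Pearson residuals under the independence model;
# cells with |z| > 3 are flagged as candidate outliers.
outliers = []
for i in range(R):
    for j in range(C):
        e = row[i] * col[j] / N
        z = (tab[i][j] - e) / math.sqrt(e * (1 - row[i] / N) * (1 - col[j] / N))
        if abs(z) > 3:
            outliers.append((i, j))
```

One aberrant cell can distort the fitted margins enough to drag neighbouring cells over the threshold, which is one motivation for the paper's exact, model-based treatment.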

11.
12.
In the context of the Cardiovascular Health Study, a comprehensive investigation into the risk factors for strokes, we apply Bayesian model averaging to the selection of variables in Cox proportional hazard models. We use an extension of the leaps-and-bounds algorithm for locating the models that are to be averaged over and make available S-PLUS software to implement the methods. Bayesian model averaging provides a posterior probability that each variable belongs in the model, a more directly interpretable measure of variable importance than a P-value. P-values from models preferred by stepwise methods tend to overstate the evidence for the predictive value of a variable and do not account for model uncertainty. We introduce the partial predictive score to evaluate predictive performance. For the Cardiovascular Health Study, Bayesian model averaging predictively outperforms standard model selection and does a better job of assessing who is at high risk for a stroke.

13.
The multinomial logistic regression model (MLRM) can be interpreted as a natural extension of the binomial model with logit link function to situations where the response variable can have three or more possible outcomes. In addition, when the categories of the response variable are nominal, the MLRM can be expressed in terms of two or more logistic models and analyzed in both frequentist and Bayesian approaches. However, few discussions of post-modeling diagnostics for categorical data models are found in the literature, and they mainly use Bayesian inference. The objective of this work is to present classical and Bayesian diagnostic measures for categorical data models. These measures are applied to a dataset on the status of patients undergoing kidney transplantation.

14.
In 1991 Marsh and co-workers made the case for a sample of anonymized records (SAR) from the 1991 census of population. The case was accepted by the Office for National Statistics (then the Office of Population Censuses and Surveys) and a request was made by the Economic and Social Research Council to purchase the SARs. Two files were released for Great Britain: a 2% sample of individuals and a 1% sample of households. Subsequently similar samples were released for Northern Ireland. Since their release, the files have been heavily used for research and there has been no known breach of confidentiality. There is a considerable demand for similar files from the 2001 census, with specific requests for a larger sample size and lower population threshold for the individual SAR. This paper reassesses the analysis of Marsh and co-workers of the risk of identification of an individual or household in a sample of microdata from the 1991 census and also uses alternative ways of assessing risks with the 1991 SARs. The results of both the reassessment and the new analyses are reassuring and allow us to take the 1991 SARs as a baseline against which to assess proposals for changes to the size and structure of samples from the 2001 census.

15.
The paper establishes a correspondence between statistical disclosure control and forensic statistics regarding their common use of the concept of 'probability of identification'. The paper then seeks to investigate what lessons for disclosure control can be learnt from the forensic identification literature. The main lesson that is considered is that disclosure risk assessment cannot, in general, ignore the search method that is employed by an intruder seeking to achieve disclosure. The effects of using several search methods are considered. Through consideration of the plausibility of assumptions and 'worst case' approaches, the paper suggests how the impact of search method can be handled. The paper focuses on foundations of disclosure risk assessment, providing some justification for some modelling assumptions underlying some existing record level measures of disclosure risk. The paper illustrates the effects of using various search methods in a numerical example based on microdata from a sample from the 2001 UK census.

16.
Standard methods for analyzing binomial regression data rely on asymptotic inferences. Bayesian methods can be performed using simple computations, and they apply for any sample size. We provide a relatively complete discussion of Bayesian inferences for binomial regression with emphasis on inferences for the probability of "success." Furthermore, we illustrate diagnostic tools, perform model selection among nonnested models, and examine the sensitivity of the Bayesian methods.
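The simple, any-sample-size computations the abstract mentions can be illustrated in the intercept-only special case, where a conjugate Beta prior yields the posterior for the probability of "success" directly (hypothetical data; the paper itself treats full binomial regression):

```python
import random

random.seed(0)

# Hypothetical data: y successes out of n trials.
y, n = 17, 40

# Beta(a, b) prior; conjugacy gives a Beta(a + y, b + n - y) posterior.
a, b = 1.0, 1.0
post_a, post_b = a + y, b + n - y

post_mean = post_a / (post_a + post_b)

# Posterior simulation for an equal-tailed 95% credible interval.
draws = sorted(random.betavariate(post_a, post_b) for _ in range(4000))
ci = (draws[int(0.025 * 4000)], draws[int(0.975 * 4000)])
```

No asymptotic approximation is involved: the interval is exact up to Monte Carlo error, whatever the sample size.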

17.
In the mid-1950s S.N. Roy and his students contributed two landmark articles to the contingency table literature [Roy, S.N., Kastenbaum, M.A., 1956. On the hypothesis of no "interaction" in a multiway contingency table. Ann. Math. Statist. 27, 749–757; Roy, S.N., Mitra, S.K., 1956. An introduction to some nonparametric generalizations of analysis of variance and multivariate analysis. Biometrika 43, 361–376]. The first article generalized concepts of interaction from 2×2×2 contingency tables to three-way tables of arbitrary size and to larger tables. In the second article, which is the source of our primary focus, various notions of independence were clarified for three-way contingency tables, Roy's union–intersection test was applied to construct chi-squared tests of hypotheses about the structure of such tables, and the chi-squared statistics were shown not to depend on the distinction between response and explanatory variables. This work pre-dates by many years later developments that expressed such results in the context of loglinear models. It pre-dates by a quarter century the development of graphical models. We summarize the main results in these key articles and discuss the connection between them and the later developments of loglinear modeling and of graphical modeling. We also mention ways in which these later developments have themselves been further generalized.

18.
To gain regulatory approval, a new medicine must demonstrate that its benefits outweigh any potential risks, i.e., that the benefit-risk balance is favourable towards the new medicine. For transparency and clarity of the decision, a structured and consistent approach to benefit-risk assessment that quantifies uncertainties and accounts for underlying dependencies is desirable. This paper proposes two approaches to benefit-risk evaluation, both based on the idea of joint modelling of mixed outcomes that are potentially dependent at the subject level. Using Bayesian inference, the two approaches offer interpretability and efficiency to enhance qualitative frameworks. Simulation studies show that accounting for correlation leads to a more accurate assessment of the strength of evidence to support benefit-risk profiles of interest. Several graphical approaches are proposed that can be used to communicate the benefit-risk balance to project teams. Finally, the two approaches are illustrated in a case study using real clinical trial data.

19.
This simulation study aims at investigating the performance of maximum likelihood and weighted least-squares estimation approaches in growth curve models with categorical data. The goodness-of-fit indices were compared across a number of scenarios (different trajectories, sample sizes, replications, and numbers of categories). The results show that when the number of categories and replications are small, weighted least-squares estimation leads to better goodness-of-fit indices. However, when the number of categories and replications are large, both maximum likelihood and weighted least-squares estimation result in similar fit indices.

20.
This paper presents a Bayesian technique for the estimation of a logistic regression model including variable selection. As in Ou & Penman (1989), the model is used to predict the direction of company earnings, one year ahead, from a large set of accounting variables from financial statements. To estimate the model, the paper presents a Markov chain Monte Carlo sampling scheme that includes the variable selection technique of Smith & Kohn (1996) and the non-Gaussian estimation method of Mira & Tierney (2001). The technique is applied to data for companies in the United States and Australia. The results obtained compare favourably to the technique used by Ou & Penman (1989) for both regions.
