Found 20 similar articles (search time: 15 ms)
Graphical methods have been previously proposed for studying the cell contributions to the chi-square statistic in two-way contingency tables. Clustering techniques are suggested for analyzing the differences among the frequency distributions of either the columns or the rows of the contingency tables. Modifications are proposed to the other methods: first, varying the widths of the bars according to the expected cell counts, then the gamma probability plot of the individual chi-square terms.
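As a minimal sketch of the quantity such plots display, the per-cell contributions to the Pearson chi-square statistic under independence can be computed as follows. The function name and the example counts are illustrative assumptions, not taken from the paper.

```python
def chi_square_contributions(table):
    """Return (expected counts, per-cell chi-square terms) for a two-way table.

    Expected counts are computed under row-column independence:
    E[i][j] = row_total[i] * col_total[j] / n.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Each term (O - E)^2 / E is one cell's contribution to the statistic.
    terms = [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(table, expected)
    ]
    return expected, terms


# Hypothetical 2 x 2 table of counts.
expected, terms = chi_square_contributions([[10, 20], [30, 40]])
chi_square = sum(sum(row) for row in terms)
```

The expected counts returned here are what a plot would use to set bar widths, and the individual terms are the values that would go into a probability plot.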

We describe and illustrate approaches to data augmentation in multi-way contingency tables for which partial information, in the form of subsets of marginal totals, is available. In such problems, interest lies in questions of inference about the parameters of models underlying the table together with imputation for the individual cell entries. We discuss questions of structure related to the implications for inference on cell counts arising from assumptions about log-linear model forms, and a class of simple and useful prior distributions on the parameters of log-linear models. We then discuss “local move” and “global move” Metropolis–Hastings simulation methods for exploring the posterior distributions for parameters and cell counts, focusing particularly on higher-dimensional problems. As a by-product, we note potential uses of the “global move” approach for inference about numbers of tables consistent with a prescribed subset of marginal counts. Illustrations and comparisons of MCMC approaches are given, and we conclude with a discussion of areas for further development and current open issues.


The display of data by means of contingency tables is used in different approaches to statistical inference, for example, to approach the test of homogeneity of independent multinomial distributions. We develop a Bayesian procedure to test simple null hypotheses versus bilateral alternatives in contingency tables. Given independent samples from two binomial distributions and taking a mixed prior distribution, we calculate the posterior probability that the proportion of successes in the first population is the same as in the second. This posterior probability is compared with the p-value of the classical method, obtaining a reconciliation between the classical and Bayesian results. The results obtained are generalized to r × s tables.
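To make the idea of a posterior probability of equal proportions concrete, here is a rough sketch under assumptions that are not from the paper: the mixed prior puts mass `prior_h0` on the hypothesis that both proportions are equal (with a uniform prior on the common value) and spreads the remaining mass over independent uniform priors on the two proportions. The function names are hypothetical.

```python
from math import comb, exp, lgamma, log


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def posterior_prob_equal(x1, n1, x2, n2, prior_h0=0.5):
    """Posterior probability that two binomial proportions are equal.

    Assumed prior (illustrative only): point mass prior_h0 on p1 == p2
    with a uniform common value, otherwise independent uniform p1, p2.
    """
    log_choose = log(comb(n1, x1)) + log(comb(n2, x2))
    # Marginal likelihood under H0: one shared proportion.
    log_m0 = log_choose + log_beta(x1 + x2 + 1, n1 + n2 - x1 - x2 + 1)
    # Marginal likelihood under H1: two independent proportions.
    log_m1 = (log_choose
              + log_beta(x1 + 1, n1 - x1 + 1)
              + log_beta(x2 + 1, n2 - x2 + 1))
    bayes_factor_01 = exp(log_m0 - log_m1)
    weighted = prior_h0 * bayes_factor_01
    return weighted / (weighted + (1.0 - prior_h0))


# Hypothetical data: 5 successes out of 10 in each sample.
p_equal = posterior_prob_equal(5, 10, 5, 10)
```

A different choice of mixed prior would change the numbers, so this should be read only as an illustration of how a point mass on the null turns into a posterior probability.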

A general methodology is presented for finding suitable Poisson log-linear models with applications to multiway contingency tables. Mixtures of multivariate normal distributions are used to model prior opinion when a subset of the regression vector is believed to be nonzero. This prior distribution is studied for two- and three-way contingency tables, in which the regression coefficients are interpretable in terms of odds ratios in the table. Efficient and accurate schemes are proposed for calculating the posterior model probabilities. The methods are illustrated for a large number of two-way simulated tables and for two three-way tables. These methods appear to be useful in selecting the best log-linear model and in estimating parameters of interest that reflect uncertainty in the true model.

Summary. A method of inputting prior opinion in contingency tables is described. The method can be used to incorporate beliefs of independence or symmetry, but extensions are straightforward. Logistic normal distributions that express such beliefs are used as priors of the cell probabilities, and posterior estimates are derived. Empirical Bayes methods are also discussed and approximate posterior variances are provided. The methods are illustrated by a numerical example.

The analysis of incomplete contingency tables is a practical and interesting problem. In this paper, we provide characterizations for the various missing mechanisms of a variable in terms of response and non-response odds for two- and three-dimensional incomplete tables. Log-linear parametrization and some distinctive properties of the missing data models for the above tables are discussed. All possible cases in which data on one, two or all variables may be missing are considered. We study the missingness of each variable in a model, which is more insightful for analyzing cross-classified data than the missingness of the outcome vector. For sensitivity analysis of the incomplete tables, we propose easily verifiable procedures to evaluate the missing at random (MAR), missing completely at random (MCAR) and not missing at random (NMAR) assumptions of the missing data models. These methods depend only on joint and marginal odds computed from fully and partially observed counts in the tables, respectively. Finally, some real-life datasets are analyzed to illustrate our results, which are confirmed based on simulation studies.

Algebraic Markov Bases and MCMC for Two-Way Contingency Tables (total citations: 3; self-citations: 0; citations by others: 3)
ABSTRACT. The Diaconis–Sturmfels algorithm is a method for sampling from conditional distributions, based on the algebraic theory of toric ideals. This algorithm is applied to categorical data analysis through the notion of Markov basis. An application of this algorithm is a non-parametric Monte Carlo approach to the goodness of fit tests for contingency tables. In this paper, we characterize or compute the Markov bases for some log-linear models for two-way contingency tables using techniques from Computational Commutative Algebra, namely Gröbner bases. This applies to a large set of cases including independence, quasi-independence, symmetry, quasi-symmetry. Three examples of quasi-symmetry and quasi-independence from Fingleton (Models of category counts, Cambridge University Press, Cambridge, 1984) and Agresti (An Introduction to categorical data analysis, Wiley, New York, 1996) illustrate the practical applicability and the relevance of this algebraic methodology.
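For the simplest case mentioned above, the independence model, the well-known Markov basis consists of the basic moves that add +1 to two cells on one diagonal of a 2 × 2 minor and −1 to the two cells on the other diagonal, which leaves all row and column totals unchanged. The sketch below runs a Metropolis walk with these moves, targeting the hypergeometric distribution of tables with fixed margins. It is an illustrative sketch rather than the paper's implementation; the function name, the example table, and the step count are assumptions.

```python
import random


def basic_move_walk(table, n_steps, seed=0):
    """Metropolis walk over two-way tables with fixed row and column totals.

    Uses the basic +1/-1 moves on 2 x 2 minors and targets the
    hypergeometric distribution, P(table) proportional to 1 / prod(n_ij!).
    Requires at least two rows and two columns.
    """
    rng = random.Random(seed)
    t = [list(row) for row in table]
    n_rows, n_cols = len(t), len(t[0])
    for _ in range(n_steps):
        # Picking ordered pairs at random covers both signs of each move.
        i1, i2 = rng.sample(range(n_rows), 2)
        j1, j2 = rng.sample(range(n_cols), 2)
        down_a, down_b = t[i1][j2], t[i2][j1]
        if down_a == 0 or down_b == 0:
            continue  # the move would create a negative count; stay put
        # Ratio of target probabilities, proposed table over current table.
        ratio = (down_a * down_b) / ((t[i1][j1] + 1) * (t[i2][j2] + 1))
        if rng.random() < ratio:
            t[i1][j1] += 1
            t[i2][j2] += 1
            t[i1][j2] -= 1
            t[i2][j1] -= 1
    return t


# Hypothetical 3 x 3 starting table.
start = [[5, 2, 1], [3, 4, 2], [1, 2, 6]]
sampled = basic_move_walk(start, n_steps=2000, seed=1)
```

In a Monte Carlo goodness-of-fit test, a fit statistic would be evaluated on many tables visited by such a walk and compared with its value on the observed table.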

One important component of model selection using generalized linear models (GLM) is the choice of a link function. We propose using approximate Bayes factors to assess the improvement in fit over a GLM with canonical link when a parametric link family is used. The approximate Bayes factors are calculated using the Laplace approximations given in [32], together with a reference set of prior distributions. This methodology can be used to differentiate between different parametric link families, as well as allowing one to jointly select the link family and the independent variables. This involves comparing nonnested models and so standard significance tests cannot be used. The approach also accounts explicitly for uncertainty about the link function. The methods are illustrated using parametric link families studied in [12] for two data sets involving binomial responses. The first author was supported by Sonderforschungsbereich 386 Statistische Analyse Diskreter Strukturen, and the second author by NIH Grant 1R01CA094212-01 and ONR Grant N00014-01-10745.

Summary. In this paper we introduce a class of prior distributions for contingency tables with given marginals. We are interested in the structure of concordance/discordance of such tables. There is a minor limitation in that the marginals are required to assume only rational values. We argue, though, that this is not a serious drawback for practical purposes. The posterior and predictive distributions given an M-sample are computed. Examples of Bayesian estimates of some classical indices of concordance are also given. Moreover, we show how to use simulation in order to overcome some difficulties which arise in the computation of the posterior distribution.

Summary. We analyse input–output tables to see what structural changes have occurred in the Irish economy over time. First we produce a consistent set of input–output tables by aligning classifications and deriving a sequence of supply tables. The resulting tables are then smoothed to make the underlying distributions symmetric. We then compare the smoothed tables by using biproportional adjustment. We identify and analyse structural change that has taken place in the Irish economy since 1975.
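Biproportional adjustment is commonly carried out by alternately rescaling the rows and columns of a base table until its margins match a set of target totals (often called the RAS method or iterative proportional fitting). The following is a minimal sketch of that generic procedure; the function name, tolerance, and example figures are assumptions, and the abstract does not say exactly how the authors applied the adjustment.

```python
def biproportional_fit(base, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Rescale rows and columns of `base` until its margins match the targets.

    Assumes sum(row_targets) == sum(col_targets) and that `base` is
    non-negative with no all-zero row or column that has a positive target.
    """
    m = [[float(v) for v in row] for row in base]
    for _ in range(max_iter):
        # Row step: scale each row to its target total.
        for i, target in enumerate(row_targets):
            s = sum(m[i])
            if s > 0:
                m[i] = [v * target / s for v in m[i]]
        # Column step: scale each column to its target total.
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in m)
            if s > 0:
                for row in m:
                    row[j] *= target / s
        # Columns now match; stop once the rows match as well.
        gap = max(abs(sum(row) - t) for row, t in zip(m, row_targets))
        if gap < tol:
            break
    return m


# Hypothetical base table and target margins.
fitted = biproportional_fit([[1, 2], [3, 4]], row_targets=[4, 6], col_targets=[5, 5])
```

The row and column scaling factors produced along the way are what make the method useful for comparison: they separate changes in the margins from changes in the interior structure of the table.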

We consider a likelihood ratio test of independence for large two-way contingency tables having both structural (non-random) and sampling (random) zeros in many cells. The solution of this problem is not available using standard likelihood ratio tests. One way to bypass this problem is to remove the structural zeros from the table and implement a test on the remaining cells that incorporates the randomness in the sampling zeros; the resulting test is a test of quasi-independence of the two categorical variables. This test is based only on the positive counts in the contingency table and is valid when there is at least one sampling (random) zero. The proposed (likelihood ratio) test is an alternative to the commonly used ad hoc procedures of converting the zero cells to positive ones by adding a small constant. One practical advantage of our procedure is that there is no need to know whether a zero cell is a structural zero or a sampling zero. We model the positive counts using a truncated multinomial distribution. In fact, we have two truncated multinomial distributions: one for the null hypothesis of independence and the other for the unrestricted parameter space. We use Monte Carlo methods to obtain the maximum likelihood estimators of the parameters and also the p-value of our proposed test. To obtain the sampling distribution of the likelihood ratio test statistic, we use bootstrap methods. We discuss many examples, and also empirically compare the power function of the likelihood ratio test relative to those of some well-known test statistics.

This paper studies the multinomial model for 2 × 2 contingency table data with some cell counts missing. Various hypotheses of interest, including row-column independence, are tested by using Bayes factors, which represent the ratio of the posterior odds to the prior odds for the null hypothesis. The Dirichlet-Beta family of prior distributions is considered for the multinomial parameters conditional on the complement of the null hypothesis. The Bayes factor for the incomplete data is a mixture of the Bayes factors for different possibilities for the full data.

We develop two methods to construct confidence bands for the receiver operating characteristic (ROC) curve without estimating the densities of the underlying distributions. The first method is based on the smoothed bootstrap while the second method uses the Bonferroni inequality. As an illustration, we provide confidence bands for the ROC curve using data on Duchenne Muscular Dystrophy.

A new technique for the detection of outliers in contingency tables is introduced, where outliers are unusual cell counts with respect to classical loglinear Poisson models. Subsets of cell counts called minimal patterns are defined, corresponding to non-singular design matrices and leading to potentially uncontaminated maximum-likelihood estimates of the model parameters and thereby the expected cell counts. A criterion to easily produce minimal patterns in the two-way case under independence is derived, based on the analysis of the positions of the chosen cells. A simulation study and a couple of real-data examples are presented to illustrate the performance of the newly developed outlier identification algorithm, and to compare it with other existing methods.

One-sided two-stage prediction intervals for a normal population are extended to a third sampling stage. Procedures and tables are given for two situations. In the first situation, methods for obtaining such intervals are presented, and tables for calculating such prediction intervals are provided. In the second situation, a two-stage prediction interval has been applied, and a third stage is now required. Sample sizes are given for the third stage.

A new approach is presented for testing independence in contingency tables with clustered observations. The approach is based on the framework of generalized linear mixed models. Under the multinomial logistic link function, the category counts are modelled with random cluster effects and a modified likelihood ratio statistic is used for testing independence. The method is applicable to multi-way tables, and can accommodate multiple levels of clustering. It is illustrated using a benchmark dataset.

A log-linear model is defined for multiway contingency tables with negative multinomial frequency counts. The maximum likelihood estimator of the model parameters and the covariance matrix of the estimator are given. The likelihood ratio test for the general log-linear hypothesis is also presented.

Testing for the difference in the strength of bivariate association in two independent contingency tables is an important issue that finds applications in various disciplines. Currently, many of the commonly used tests are based on single-index measures of association. More specifically, one obtains single-index measurements of association from two tables and compares them based on asymptotic theory. Although they are usually easy to understand and use, often much of the information contained in the data is lost with single-index measures. Accordingly, they fail to fully capture the association in the data. To remedy this shortcoming, we introduce a new summary statistic measuring various types of association in a contingency table. Based on this new summary statistic, we propose a likelihood ratio test comparing the strength of association in two independent contingency tables. The proposed test examines the stochastic order between summary statistics. We derive its asymptotic null distribution and demonstrate that the least favorable distributions are chi-bar distributions. We numerically compare the power of the proposed test to that of the tests based on single-index measures. Finally, we provide two examples illustrating the new summary statistics and the related tests.

A Monte Carlo exact conditional test of quasi-independence in two-way incomplete contingency tables is proposed. The null distribution of a random table under quasi-independence is derived. This distribution depends only on the counts in the cells of interest and not on the counts in the remaining cells. This result is used to improve the efficiency of a proposed simulate-and-reject Monte Carlo procedure for estimating the attained significance level.

We consider the square contingency tables which arise when the same method of classification is applied twice. The hypothesis of marginal homogeneity is then relevant, and can be tested by various methods. Models are discussed which contain marginal homogeneity as a special case. They include a class based on univariate and bivariate Dirichlet distributions. The question of ordered categories is briefly discussed. Applications are made to data on unaided distance vision.
