
1.

Collapsibility of contingency tables based on conditional models

P. Vellaisamy V. Vijay 《Journal of statistical planning and inference》2010

Strict collapsibility and model collapsibility are two important concepts associated with the dimension reduction of a multidimensional contingency table, without losing the relevant information. In this paper, we obtain some necessary and sufficient conditions for the strict collapsibility of the full model, with respect to an interaction factor or a set of interaction factors, based on the interaction parameters of the conditional/layer log-linear models. For hierarchical log-linear models, we also present necessary and sufficient conditions for the full model to be model collapsible, based on the conditional interaction parameters. We discuss both the cases where one variable or a set of variables is conditioned. The connections between the strict collapsibility and the model collapsibility are also pointed out. Our results are illustrated through suitable examples, including a real-life application.
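To make the idea of collapsing concrete, the following minimal Python sketch compares the layer-wise (conditional) odds ratios of a 2×2×2 table with the odds ratio of the table obtained by summing over the layer variable. It does not implement the paper's necessary and sufficient conditions; the counts and the helper names `odds_ratio` and `collapse` are made up for illustration.

```python
def odds_ratio(t):
    """Odds ratio of a 2x2 table given as [[a, b], [c, d]]."""
    (a, b), (c, d) = t
    return (a * d) / (b * c)


def collapse(layers):
    """Sum a list of 2x2 layers cell by cell to obtain the marginal table."""
    return [[sum(layer[i][j] for layer in layers) for j in range(2)]
            for i in range(2)]


# Hypothetical counts: two layers of a 2x2x2 table.
layers = [
    [[10, 20], [30, 60]],  # layer 1
    [[60, 30], [20, 10]],  # layer 2
]

conditional = [odds_ratio(layer) for layer in layers]  # odds ratio within each layer
marginal = odds_ratio(collapse(layers))                # odds ratio after collapsing

print(conditional, marginal)
```

In this made-up table both conditional odds ratios equal 1 while the collapsed table has an odds ratio of 1.96, so summing over the layer variable changes the apparent row-column association. That loss of information is the kind of situation collapsibility conditions are meant to rule out.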

2.

Marginal inhomogeneity models for square contingency tables with nominal categories

Nobuko Miyamoto Kouji Tahata Hirokazu Ebie Sadao Tomizawa 《Journal of applied statistics》2006,33(2):203-215

For the analysis of square contingency tables with nominal categories, this paper proposes two kinds of models that indicate the structure of marginal inhomogeneity. One model states that the absolute values of log odds of the row marginal probability to the corresponding column marginal probability for each category i are constant for every i. The other model states that, on the condition that an observation falls in one of the off-diagonal cells in the square table, the absolute values of log odds of the conditional row marginal probability to the corresponding conditional column marginal probability for each category i are constant for every i. These models are used when the marginal homogeneity model does not hold, and the values of parameters in the models are useful for seeing the degree of departure from marginal homogeneity for the data on a nominal scale. Examples are given.

3.

A probabilistic model for nonsymmetric correspondence analysis and prediction in contingency tables

Roberta Siciliano Ab Mooijart Peter G. M. van der Heijden 《Statistical Methods and Applications》1993,2(1):85-106

Summary. Nonsymmetric correspondence analysis is a model meant for the analysis of the dependence in a two-way contingency table, and is an alternative to correspondence analysis. Correspondence analysis is based on the decomposition of Pearson's Φ²-index; nonsymmetric correspondence analysis is based on the decomposition of Goodman-Kruskal's τ-index for predictability. In this paper, we approach nonsymmetric correspondence analysis as a statistical model based on a probability distribution. We provide algorithms for the maximum likelihood and the least-squares estimation with linear constraints upon model parameters. The nonsymmetric correspondence analysis model has many properties that can be useful for prediction analysis in contingency tables. Predictability measures are introduced to identify the categories of the response variable that can be best predicted, as well as the categories of the explanatory variable having the highest predictability power. We describe the interpretation of model parameters in two examples. Finally, we discuss the relations of nonsymmetric correspondence analysis with other reduced-rank models.
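The Goodman-Kruskal τ-index mentioned in this abstract can be computed directly from a two-way table. The sketch below is only the index itself, not the paper's model or estimation algorithms. It assumes rows hold the explanatory variable, columns hold the response, and the response takes at least two observed categories; the function name is hypothetical.

```python
def goodman_kruskal_tau(table):
    """Goodman-Kruskal tau for predicting the column variable from the row variable."""
    n = sum(sum(row) for row in table)
    p = [[x / n for x in row] for row in table]   # joint proportions p_ij
    row_tot = [sum(row) for row in p]             # p_i.
    col_tot = [sum(col) for col in zip(*p)]       # p_.j
    base = sum(c * c for c in col_tot)            # sum_j p_.j^2
    # sum_i sum_j p_ij^2 / p_i. (empty rows are skipped)
    explained = sum(
        p_ij * p_ij / r
        for row, r in zip(p, row_tot) if r > 0
        for p_ij in row
    )
    # Undefined (division by zero) if the response has a single observed category.
    return (explained - base) / (1 - base)


tau_independent = goodman_kruskal_tau([[10, 20], [30, 60]])  # rows proportional
tau_perfect = goodman_kruskal_tau([[5, 0], [0, 7]])          # row determines column
```

For the first table the rows are proportional, so knowing the row does not help predict the column and τ is 0 up to rounding; for the second the row determines the column and τ is 1.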

4.

Quasi local odds symmetry model for square contingency table with ordinal categories

Gökçen Altun 《Journal of Statistical Computation and Simulation》2019,89(15):2899-2913

This paper proposes a new model for square contingency tables. The proposed model tests the equality of local odds ratios between one side of the main diagonal and the corresponding other side, and it represents the non-symmetric structure of the square contingency table. The proposed model is compared with twenty-five models introduced for analysing square contingency tables with both symmetric and non-symmetric structures. The results show that the proposed model provides a better fit than the other existing models for square contingency tables.
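As a rough illustration of the quantities involved, the sketch below computes the local odds ratios of a square table and checks whether each ratio equals its mirror image across the main diagonal. It is not the proposed model or its fitting procedure, whose exact parameterization the abstract does not give; the counts and function name are hypothetical.

```python
def local_odds_ratios(table):
    """Local odds ratios theta[i][j] built from adjacent rows i, i+1 and columns j, j+1."""
    r, c = len(table), len(table[0])
    return [
        [table[i][j] * table[i + 1][j + 1] / (table[i][j + 1] * table[i + 1][j])
         for j in range(c - 1)]
        for i in range(r - 1)
    ]


# Hypothetical symmetric 3x3 table with strictly positive counts.
counts = [
    [50, 10, 5],
    [10, 40, 8],
    [5, 8, 30],
]
theta = local_odds_ratios(counts)

# Compare each local odds ratio with its mirror image across the diagonal.
mirrored_equal = all(
    abs(theta[i][j] - theta[j][i]) < 1e-12
    for i in range(len(theta)) for j in range(i + 1, len(theta))
)
```

Because the example table is symmetric, every local odds ratio coincides with its mirror image; in an asymmetric table the differences between `theta[i][j]` and `theta[j][i]` are the departures that a test of equality would examine.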

5.

Model determination for categorical data with factor level merging

Petros Dellaportas Claudia Tarantola 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2005,67(2):269-283

Summary. We deal with contingency table data that are used to examine the relationships between a set of categorical variables or factors. We assume that such relationships can be adequately described by the conditional independence structure that is imposed by an undirected graphical model. If the contingency table is large, a desirable simplified interpretation can be achieved by combining some categories, or levels, of the factors. We introduce conditions under which such an operation does not alter the Markov properties of the graph. Implementation of these conditions leads to Bayesian model uncertainty procedures based on reversible jump Markov chain Monte Carlo methods. The methodology is illustrated on a 2×3×4 and up to a 4×5×5×2×2 contingency table.

6.

Visualizing main effects and interaction in multiple non-symmetric correspondence analysis

Luigi D'Ambra Antonello D'Ambra 《Journal of applied statistics》2012,39(10):2165-2175

Non-symmetric correspondence analysis (NSCA) is a useful technique for analysing a two-way contingency table. Frequently there is more than one predictor variable; in this paper, we consider two categorical variables as predictor variables and one response variable. Interaction represents the joint effects of predictor variables on the response variable. When interaction is present, the interpretation of the main effects is incomplete or misleading. To separate the main effects and the interaction term, we introduce a method that, starting from the coordinates of multiple NSCA and using a two-way analysis of variance without interaction, allows a better interpretation of the impact of the predictor variables on the response variable. The proposed method has been applied to a well-known three-way contingency table proposed by Bockenholt and Bockenholt in which they cross-classify subjects by a person's attitude towards abortion, number of years of education and religion. We analyse the case where the variables education and religion influence a person's attitude towards abortion.

7.

Classical multilevel and Bayesian approaches to population size estimation using multiple lists (cited by: 3; self-citations: 0; cited by others: 3)

S. E. Fienberg M. S. Johnson & B. W. Junker 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》1999,162(3):383-405

One of the major objections to the standard multiple-recapture approach to population estimation is the assumption of homogeneity of individual 'capture' probabilities. Modelling individual capture heterogeneity is complicated by the fact that it shows up as a restricted form of interaction among lists in the contingency table cross-classifying list memberships for all individuals. Traditional log-linear modelling approaches to capture–recapture problems are well suited to modelling interactions among lists but ignore the special dependence structure that individual heterogeneity induces. A random-effects approach, based on the Rasch model from educational testing and introduced in this context by Darroch and co-workers and Agresti, provides one way to introduce the dependence resulting from heterogeneity into the log-linear model; however, previous efforts to combine the Rasch-like heterogeneity terms additively with the usual log-linear interaction terms suggest that a more flexible approach is required. In this paper we consider both classical multilevel approaches and fully Bayesian hierarchical approaches to modelling individual heterogeneity and list interactions. Our framework encompasses both the traditional log-linear approach and various elements from the full Rasch model. We compare these approaches on two examples, the first arising from an epidemiological study of a population of diabetics in Italy, and the second a study intended to assess the 'size' of the World Wide Web. We also explore extensions allowing for interactions between the Rasch and log-linear portions of the models in both the classical and the Bayesian contexts.

8.

A NOTE ON MULTIVARIATE LOGISTIC MODELS FOR CONTINGENCY TABLES

Goran Kauermann 《Australian & New Zealand Journal of Statistics》1997,39(3):261-276

The log-linear model is a tool widely accepted for modelling discrete data given in a contingency table. Although its parameters reflect the interaction structure in the joint distribution of all variables, it does not give information about structures appearing in the margins of the table. This is in contrast to multivariate logistic parameters, recently introduced by Glonek & McCullagh (1995), which have as parameters the highest order log odds ratios derived from the joint table and from each marginal table. Glonek & McCullagh give the link between the cell probabilities and the multivariate logistic parameters, in an algebraic fashion. The present paper focuses on this link, showing that it is derived by general parameter transformations in exponential families. In particular, the connection between the natural, the expectation and the mixed parameterization in exponential families (Barndorff-Nielsen, 1978) is used; this also yields the derivatives of the likelihood equation and shows properties of the Fisher matrix. The paper emphasises the analysis of independence hypotheses in margins of a contingency table.

9.

Decomposition of the main effects and interaction term by using orthogonal polynomials in multiple non symmetrical correspondence analysis

Antonello D'Ambra Pietro Amenta Anna Crisci 《Communications in Statistics - Theory and Methods》2017,46(20):10179-10188

Multiple non-symmetric correspondence analysis (MNSCA) is a useful technique for analyzing a two-way contingency table. In more complex cases, there is more than one predictor variable. In this paper, the MNSCA, along with the decomposition of the Gray–Williams Tau index into main effects and an interaction term, is used to analyze a contingency table with two predictor categorical variables and an ordinal response variable. The Multiple-Tau index is a measure of association that contains both main effects and an interaction term. The main effects represent the change in the response variable due to the change in the levels/categories of the predictor variables, considering the effects of their addition, while the interaction effect represents the combined effect of the predictor categorical variables on the ordinal response variable. Moreover, for ordinal scale variables, we propose a further decomposition in order to check the existence of power components by using Emerson's orthogonal polynomials.

10.

Bayesian Model Choice in Exponential Survival Models

《Communications in Statistics - Theory and Methods》2013,42(12):2311-2330

ABSTRACT

Log-linear models for the distribution on a contingency table are represented as the intersection of only two kinds of log-linear models: one assuming that a certain group of the variables, if conditioned on all other variables, has a jointly independent distribution, and another assuming that a certain group of the variables, if conditioned on all other variables, has no highest order interaction. The subsets entering into these models are uniquely determined by the original log-linear model. This canonical representation suggests considering joint conditional independence and conditional no highest order association as the elementary building blocks of log-linear models.

11.

Bayesian selection of log-linear models

James H. Albert 《Revue canadienne de statistique》1996,24(3):327-347

A general methodology is presented for finding suitable Poisson log-linear models with applications to multiway contingency tables. Mixtures of multivariate normal distributions are used to model prior opinion when a subset of the regression vector is believed to be nonzero. This prior distribution is studied for two- and three-way contingency tables, in which the regression coefficients are interpretable in terms of odds ratios in the table. Efficient and accurate schemes are proposed for calculating the posterior model probabilities. The methods are illustrated for a large number of two-way simulated tables and for two three-way tables. These methods appear to be useful in selecting the best log-linear model and in estimating parameters of interest that reflect uncertainty in the true model.

12.

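Design and application of network community detection algorithms in mobility table modelling

Sun Xu et al. 《Statistical Research》2019,36(7):119-128

An intergenerational mobility table records the cross-classified frequencies of paired data on the social status of children and their parents, and reflects how advantages and disadvantages in the possession of social resources compare between the two generations. Empirical studies of the evolution of basic social characteristics such as wealth, class and privilege all rely on the quantitative analysis of intergenerational mobility tables. The log-linear model is the basic tool for modelling mobility tables: by fitting the cell frequencies of the contingency table, it can identify strong and weak interaction effects between the row and column classifications of the mobility table and characterize the interaction structure between the social statuses of father and child. This paper uses community detection algorithms from complex network analysis to analyse the association structure between the social statuses of father and child. To address the insufficient goodness of fit of parsimonious log-linear models, it proposes a new modelling approach: a community detection algorithm is used to mine the associations in the residual contingency table of the parsimonious log-linear model, and the detected community effect is introduced into the original log-linear model as an additional parameter constraint to improve the fit to the data. Because the method adds only one parameter constraint to the original parsimonious log-linear model, it preserves the parsimony and theoretical meaning of the modelling results, while the community effect supplements the original model's interpretation of the structure of the empirical data. The paper applies the method to an empirical intergenerational occupational mobility table drawn from the Chinese General Social Survey, and explains the association pattern between the occupational classes of children and parents well.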

13.

Bayesian model comparison based on expected posterior priors for discrete decomposable graphical models

Guido Consonni Monia Lupparelli 《Journal of statistical planning and inference》2009,139(12):4154-4164

The implementation of the Bayesian paradigm to model comparison can be problematic. In particular, prior distributions on the parameter space of each candidate model require special care. While it is well known that improper priors cannot be routinely used for Bayesian model comparison, we claim that the use of proper conventional priors under each model should also be regarded as suspicious, especially when comparing models having different dimensions. The basic idea is that priors should not be assigned separately under each model; rather they should be related across models, in order to acquire some degree of compatibility, and thus allow fairer and more robust comparisons. In this connection, the intrinsic prior as well as the expected posterior prior (EPP) methodology represent useful tools. In this paper we develop a procedure based on EPP to perform Bayesian model comparison for discrete undirected decomposable graphical models, although our method could also be adapted to deal with directed acyclic graph models. We present two possible approaches: one based on imaginary data, and one which makes use of a limited number of actual data. The methodology is illustrated through the analysis of a 2×3×4 contingency table.

14.

Geoadditive models (cited by: 7; self-citations: 0; cited by others: 7)

E. E. Kammann M. P. Wand 《Journal of the Royal Statistical Society. Series C, Applied statistics》2003,52(1):1-18

Summary. A study into geographical variability of reproductive health outcomes (e.g. birth weight) in Upper Cape Cod, Massachusetts, USA, benefits from geostatistical mapping or kriging. However, also observed are some continuous covariates (e.g. maternal age) that exhibit pronounced non-linear relationships with the response variable. To account for such effects properly we merge kriging with additive models to obtain what we call geoadditive models. The merging becomes effortless by expressing both as linear mixed models. The resulting mixed model representation for the geoadditive model allows for fitting and diagnosis using standard methodology and software.

15.

Exact tests for two-way symmetric contingency tables

McDONALD JOHN W. DeROURE DAVID C. MICHAELIDES DANIUS T. 《Statistics and Computing》1998,8(4):391-399

A two-way contingency table in which both variables have the same categories is termed a symmetric table. In many applications, because of the social processes involved, most of the observations lie on the main diagonal and the off-diagonal counts are small. For these tables, the model of independence is implausible and interest is then focussed on the off-diagonal cells and the models of quasi-independence and quasi-symmetry. For ordinal variables, a linear-by-linear association model can be used to model the interaction structure. For sparse tables, large-sample goodness-of-fit tests are often unreliable and one should use an exact test. In this paper, we review exact tests and the computing problems involved. We propose new recursive algorithms for exact goodness-of-fit tests of quasi-independence, quasi-symmetry, linear-by-linear association and some related models. We propose that all computations be carried out using symbolic computation and rational arithmetic in order to calculate the exact p-values accurately and describe how we implemented our proposals. Two examples are presented.

16.

Log-linear model analysis of the semi-symmetric intraclass contingency table

H. J. Khamiss 《Communications in Statistics - Theory and Methods》2013,42(23):2723-2752

This paper addresses the problem of analyzing a three-way contingency table that is upper-triangular, and a priori symmetric within layers. The log-linear model is modified to handle this kind of table, and maximum likelihood estimation is carried out for the modified log-linear model. This leads to an expression of the maximum likelihood estimates exclusively in terms of the observed cell counts. It is shown that this analysis is equivalent to an application of the usual log-linear model to an artificially complete table, obtained by splitting the off-diagonal cells in half within layers. This analysis is used in analyzing the results of a study done to determine the effect of the sex-linked dwarfing gene in male chickens on resistance to E. coli infection; the conclusion differs from that of a previous analysis of the same data (see Norwood and Hinkelmann 1978). It is found, in fact, that the structure of association among the two allele variables and the disease variable is somewhat more complex than previously proposed. A second example is taken from Ishii (1960). Finally, collapsibility conditions for the modified log-linear model, as well as various other sampling plans and limitations to the testing procedure, are discussed.

17.

Independence in Contingency Tables Using Simplicial Geometry

Juan José Egozcue Vera Pawlowsky-Glahn Matthias Templ Karel Hron 《Communications in Statistics - Theory and Methods》2013,42(18):3978-3996

Frequently, contingency tables are generated in a multinomial sampling. Multinomial probabilities are then organized in a table assigning probabilities to each cell. A probability table can be viewed as an element in the simplex. The Aitchison geometry of the simplex identifies independent probability tables as a linear subspace. An important consequence is that, given a probability table, the nearest independent table is obtained by orthogonal projection onto the independent subspace. The nearest independent table is identified as that obtained by the product of geometric marginals, which do not coincide with the standard marginals, except in the independent case. The original probability table is decomposed into orthogonal tables, the independent and the interaction tables. The underlying model is log-linear, and a procedure to test independence of a contingency table, based on a multinomial simulation, is developed. Its performance is studied on an illustrative example.
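The "product of geometric marginals" described in this abstract can be sketched in a few lines of Python. This is a minimal illustration assuming a two-way table with strictly positive cells (logarithms are taken); the function names are hypothetical, and the simulation-based independence test from the paper is not reproduced.

```python
import math


def closure(table):
    """Rescale a positive table so that its cells sum to 1."""
    total = sum(sum(row) for row in table)
    return [[x / total for x in row] for row in table]


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def nearest_independent(table):
    """Independent table formed from the product of geometric marginals."""
    p = closure(table)
    row_g = [geometric_mean(row) for row in p]        # geometric row marginals
    col_g = [geometric_mean(col) for col in zip(*p)]  # geometric column marginals
    return closure([[r * c for c in col_g] for r in row_g])


# An independent table (outer product of (0.2, 0.8) and (0.3, 0.7)) is returned unchanged.
independent = [[0.06, 0.14], [0.24, 0.56]]
projected = nearest_independent(independent)
```

Applied to a table that is already independent, the function returns the same table up to rounding, consistent with the abstract's remark that geometric and standard marginals coincide only in the independent case.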

18.

Influence diagnostics for stratified ordinal contingency tables

《Journal of Statistical Computation and Simulation》2012,82(5):405-415

Influence diagnostics are investigated in this study. In particular, an approach based on the generalized linear mixed model setting is presented for formulating ordered categorical counts in stratified contingency tables. Deletion diagnostics and their first-order approximations are developed for assessing the stratum-specific influence on parameter estimates in the models. To illustrate the proposed model diagnostic technique, the method is applied to analyze two sets of data: a clinical trial and a survey study. The two examples demonstrate that the presence of influential strata may substantially change the results in ordinal contingency table analysis.

19.

Measuring and estimating the interaction between exposures on a dichotomous outcome for observational studies

Xiaoqin Wang Weimin Ye 《Journal of applied statistics》2017,44(14):2483-2498

In observational studies for the interaction between exposures on a dichotomous outcome of a certain population, usually one parameter of a regression model is used to describe the interaction, leading to one measure of the interaction. In this article we use the conditional risk of an outcome given exposures and covariates to describe the interaction and obtain five different measures of the interaction, that is, difference between the marginal risk differences, ratio of the marginal risk ratios, ratio of the marginal odds ratios, ratio of the conditional risk ratios, and ratio of the conditional odds ratios. These measures reflect different aspects of the interaction. By using only one regression model for the conditional risk, we obtain the maximum-likelihood (ML)-based point and interval estimates of these measures, which are most efficient due to the nature of ML. We use the ML estimates of the model parameters to obtain the ML estimates of these measures. We use the approximate normal distribution of the ML estimates of the model parameters to obtain approximate non-normal distributions of the ML estimates of these measures and then confidence intervals of these measures. The method can be easily implemented and is presented via a medical example.
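The kinds of contrast listed in this abstract can be illustrated with plain arithmetic on four risks. The sketch below assumes two binary exposures and no covariates, in which case the marginal and conditional versions coincide, so only three distinct quantities appear. The risks are made-up numbers, the function name is hypothetical, and the paper's exact definitions and ML estimation are not reproduced.

```python
def odds(p):
    """Odds corresponding to a probability p (0 < p < 1)."""
    return p / (1 - p)


def interaction_measures(risk):
    """risk[(a, b)] is the risk of the outcome under exposure levels a and b (0 or 1)."""
    # Effect of the first exposure within each level of the second exposure.
    rd = {b: risk[1, b] - risk[0, b] for b in (0, 1)}              # risk differences
    rr = {b: risk[1, b] / risk[0, b] for b in (0, 1)}              # risk ratios
    orr = {b: odds(risk[1, b]) / odds(risk[0, b]) for b in (0, 1)}  # odds ratios
    return {
        "difference_of_risk_differences": rd[1] - rd[0],
        "ratio_of_risk_ratios": rr[1] / rr[0],
        "ratio_of_odds_ratios": orr[1] / orr[0],
    }


# Hypothetical risks for the four exposure combinations.
risks = {(0, 0): 0.1, (1, 0): 0.2, (0, 1): 0.3, (1, 1): 0.6}
measures = interaction_measures(risks)
```

With these numbers the risk-ratio scale shows no interaction (ratio 1), while the risk-difference and odds-ratio scales do. This illustrates the abstract's point that the measures reflect different aspects of the interaction.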

20.

On the Use of a Log-Rate Model for Survey-Weighted Categorical Data

Thomas M. Loughin Christopher R. Bilder 《Communications in Statistics - Theory and Methods》2013,42(15):2661-2669

For the analysis of survey-weighted categorical data, one recommended method of analysis is a log-rate model. For each cell in a contingency table, the survey weights are averaged across subjects and incorporated into an offset for a loglinear model. Supposedly, one can then proceed with the analysis of unweighted observed cell counts. We provide theoretical and simulation-based evidence to show that the log-rate analysis is not an effective statistical analysis method and should not be used in general. The root of the problem is in its failure to properly account for variability in the individual weights within cells of a contingency table. This results in goodness-of-fit tests that have higher-than-nominal error rates and confidence intervals for odds ratios that have lower-than-nominal coverage.