Similar literature (20 results)
1.
In a 2 × 2 contingency table, when the sample size is small, there may be a number of cells that contain few or no observations, usually referred to as sparse data. In such cases, a common recommendation in conventional frequentist methods is to add a small constant to every cell of the observed table before estimating the unknown parameters. However, this approach relies on asymptotic properties of the estimates and may work poorly for small samples. An alternative is to use Bayesian methods, which provide better insight into the problem of sparse data coupled with few centers, for which the analysis would otherwise be difficult to carry out. In this article, an attempt has been made to apply a hierarchical Bayesian model to multicenter data on the effect of a surgical treatment with standard foot care among leprosy patients with posterior tibial nerve damage, summarized as seven 2 × 2 tables. Markov chain Monte Carlo (MCMC) techniques are applied to estimate the parameters of interest under the sparse-data setup.
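The frequentist fix criticized above can be made concrete. A minimal sketch of the add-a-constant (Haldane-Anscombe-style) correction for a sparse 2 × 2 table, computing an odds ratio with a Wald-type 95% interval; the function name and the example counts are illustrative, not taken from the paper:

```python
import math

def odds_ratio_ci(a, b, c, d, add=0.5):
    """Add a small constant to every cell of a 2x2 table, then estimate
    the odds ratio with a Wald-type 95% confidence interval."""
    a, b, c, d = (x + add for x in (a, b, c, d))
    or_hat = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_hat) - 1.96 * se_log)
    hi = math.exp(math.log(or_hat) + 1.96 * se_log)
    return or_hat, (lo, hi)

# A sparse table with a zero cell, where the raw odds ratio is undefined:
est, (lo, hi) = odds_ratio_ci(5, 0, 3, 7)
```

The correction yields a finite estimate where the raw table would give a division by zero, at the cost of exactly the small-sample bias the abstract warns about.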

2.
Strict collapsibility and model collapsibility are two important concepts associated with the dimension reduction of a multidimensional contingency table without losing the relevant information. In this paper, we obtain some necessary and sufficient conditions for the strict collapsibility of the full model, with respect to an interaction factor or a set of interaction factors, based on the interaction parameters of the conditional/layer log-linear models. For hierarchical log-linear models, we also present necessary and sufficient conditions for the full model to be model collapsible, based on the conditional interaction parameters. We discuss both the case where a single variable is conditioned on and the case where a set of variables is. The connections between strict collapsibility and model collapsibility are also pointed out. Our results are illustrated through suitable examples, including a real-life application.

3.
The marginal totals of a contingency table can be rearranged to form a new table. If at least two of these totals include the same cell of the original table, the new table cannot be treated as an ordinary contingency table. An iterative method is proposed to calculate maximum likelihood estimators for the expected cell frequencies of the original table under the assumption that some marginal totals (or, more generally, some linear functions) of these expected frequencies satisfy a log-linear model. In some cases, a table of correlated marginal totals is treated as if it were an ordinary contingency table. The effects of ignoring the special structure of the marginal table on the distribution of the goodness-of-fit test statistics are discussed and illustrated, with special reference to stationary Markov chains.
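For contrast with the correlated-margins case, the ordinary situation can be handled by iterative proportional fitting (IPF), which alternately rescales rows and columns until the margins match. A minimal sketch (the function name and starting table are illustrative, and this is not the authors' iterative method for correlated totals):

```python
def ipf(table, row_targets, col_targets, iters=100, tol=1e-9):
    """Iterative proportional fitting: alternately rescale rows and columns
    of a two-way table until its margins match the target totals."""
    t = [row[:] for row in table]
    for _ in range(iters):
        # scale each row to its target total
        for i, target in enumerate(row_targets):
            s = sum(t[i])
            t[i] = [x * target / s for x in t[i]]
        # scale each column to its target total
        for j, target in enumerate(col_targets):
            s = sum(t[i][j] for i in range(len(t)))
            for i in range(len(t)):
                t[i][j] *= target / s
        if max(abs(sum(t[i]) - row_targets[i]) for i in range(len(t))) < tol:
            break
    return t

# Starting from a uniform table, IPF recovers the independence fit.
fitted = ipf([[1.0, 1.0], [1.0, 1.0]], [30.0, 70.0], [40.0, 60.0])
```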

4.
Bayesian models for relative archaeological chronology building
For many years, archaeologists have postulated that the numbers of various artefact types found within excavated features should give insight about their relative dates of deposition even when stratigraphic information is not present. A typical data set used in such studies can be reported as a cross-classification table (often called an abundance matrix or, equivalently, a contingency table) of excavated features against artefact types. Each entry of the table represents the number of a particular artefact type found in a particular archaeological feature. Methodologies for attempting to identify temporal sequences on the basis of such data are commonly referred to as seriation techniques. Several different procedures for seriation, including both parametric and non-parametric statistics, have been used in an attempt to reconstruct relative chronological orders on the basis of such contingency tables. We develop some possible model-based approaches that might be used to aid in relative archaeological chronology building. We use the recently developed Markov chain Monte Carlo method based on Langevin diffusions to fit some of the models proposed. Predictive Bayesian model choice techniques are then employed to ascertain which of the models that we develop are most plausible. We analyse two data sets taken from the literature on archaeological seriation.

5.
Summary.  The data that are analysed are from a monitoring survey which was carried out in 1994 in the forests of Baden-Württemberg, a federal state in the south-western region of Germany. The survey is part of a large monitoring scheme that has been carried out since the 1980s at different spatial and temporal resolutions to observe the increase in forest damage. One indicator of tree vitality is tree defoliation, which is mainly caused by intrinsic factors, age and stand conditions, but also by biotic (e.g. insects) and abiotic stresses (e.g. industrial emissions). In the survey, needle loss of pine trees and many potential covariates are recorded at about 580 grid points of a 4 km × 4 km grid. The aim is to identify a set of predictors for needle loss and to investigate the relationships between needle loss and the predictors. The response variable needle loss is recorded as a percentage in 5% steps, estimated by eye using binoculars, and categorized into healthy trees (10% or less), intermediate trees (10–25%) and damaged trees (25% or more). We use a Bayesian cumulative threshold model with non-linear functions of continuous variables and a random effect for spatial heterogeneity. For both the non-linear functions and the spatial random effect we use Bayesian versions of P-splines as priors. Our method is novel in that it deals with several non-standard data requirements: the ordinal response variable (the categorized version of needle loss), non-linear effects of covariates, spatial heterogeneity and prediction with missing covariates. The model is a special case of models with a geoadditive or, more generally, structured additive predictor. Inference can be based on Markov chain Monte Carlo techniques or mixed model technology.

6.
Summary.  We consider joint spatial modelling of areal multivariate categorical data assuming a multiway contingency table for the variables, modelled by using a log-linear model, and connected across units by using spatial random effects. With no distinction regarding whether variables are response or explanatory, we do not limit inference to conditional probabilities, as in customary spatial logistic regression. With joint probabilities we can calculate arbitrary marginal and conditional probabilities without having to refit models to investigate different hypotheses. Flexible aggregation allows us to investigate subgroups of interest; flexible conditioning enables not only the study of outcomes given risk factors but also retrospective study of risk factors given outcomes. A benefit of joint spatial modelling is the opportunity to reveal disparities in health in a richer fashion, e.g. across space for any particular group of cells, across groups of cells at a particular location, and, hence, potential space–group interaction. We illustrate with an analysis of birth records for the state of North Carolina and compare with spatial logistic regression.

7.
Analysis of a high-dimensional contingency table is quite involved. Models corresponding to layers of a contingency table are easier to analyze than the full model. Relationships between the interaction parameters of the full log-linear model and those of its corresponding layer models are obtained. These relationships not only simplify the analysis but also help interpret various hierarchical models. We obtain these relationships for layers of one variable, and extend the results to the case when layers of more than one variable are considered. We also establish, under conditional independence, relationships between the interaction parameters of the full model and those of the corresponding marginal models. We discuss the concept of merging of factor levels based on these interaction parameters. Finally, we use the relationships between layer models and the full model to obtain conditions for level merging based on layer interaction parameters. Several examples are discussed to illustrate the results.

8.
A data set in the form of a 2 × 2 × 2 contingency table is presented and analyzed in detail. For instructional purposes, the analysis of the data can be used to illustrate some basic concepts in the loglinear model approach to the analysis of multidimensional contingency tables.

9.
Abstract. We propose an extension of graphical log-linear models to allow for symmetry constraints on some interaction parameters that represent homologous factors. The conditional independence structure of such quasi-symmetric (QS) graphical models is described by an undirected graph with coloured edges, in which a particular colour corresponds to a set of equality constraints on a set of parameters. Unlike standard QS models, the proposed models apply to contingency tables in which only some variables, or sets of variables, have the same categories. We study the graphical properties of such models, including conditions for decomposition of model parameters and of maximum likelihood estimates.

10.
Monte Carlo methods for exact inference have received much attention recently in complete or incomplete contingency table analysis. However, conventional Markov chain Monte Carlo methods, such as the Metropolis–Hastings algorithm, and importance sampling methods sometimes perform poorly, failing to produce valid tables. In this paper, we apply an adaptive Monte Carlo algorithm, the stochastic approximation Monte Carlo algorithm (SAMC; Liang, Liu, & Carroll, 2007), to exact goodness-of-fit tests in complete or incomplete contingency tables containing some structural zero cells. The numerical results favor our method in terms of quality of estimates.

11.
Many follow-up studies involve categorical data measured on the same individual at different times. Frequently, some of the individuals are missing one or more of the measurements. This results in a contingency table with both fully and partially cross-classified data. Two models can be used to analyze data of this type: (i) the multiple-sample model, in which all the study subjects with the same configuration of missing observations are considered a separate sample; (ii) the single-sample model, which assumes that the missing observations are the result of a mechanism causing subjects to lose the information from one or some of the measurements. In this work we compare the two approaches and show that, under certain conditions, the two models yield the same maximum likelihood estimates of the cell probabilities in the underlying contingency table.

12.
We introduce a new definition of generalized marginal interactions, called marginal nested interactions, which includes baseline, local, continuation and reverse-continuation logits and odds ratios as special cases. The significant aspect of this definition is the inclusion of new types of logits and odds ratios that can handle non-ordinal, ordinal and partially ordered categorical variables in a flexible and appropriate way. It is also proved that the marginal nested interactions define a saturated model of a multi-way contingency table.

13.
We consider conditional exact tests of factor effects in design of experiments for discrete response variables. Similarly to the analysis of contingency tables, Markov chain Monte Carlo methods can be used to perform exact tests, especially when large-sample approximations of the null distributions are poor and the enumeration of the conditional sample space is infeasible. In order to construct a connected Markov chain over the appropriate sample space, one approach is to compute a Markov basis. Theoretically, a Markov basis can be characterized as a generator of a well-specified toric ideal in a polynomial ring and can be computed by computational algebraic software. However, the computation of a Markov basis sometimes becomes infeasible, even for problems of moderate size. In the present article, we obtain the closed-form expression of minimal Markov bases for the main effect models of 2^(p-1) fractional factorial designs of resolution p.

14.
We consider data-generating structures that can be represented as a Markov-switching nonlinear autoregressive model with skew-symmetric innovations, in which switching between the states is controlled by a hidden Markov chain. We propose semi-parametric estimators for the nonlinear functions of the proposed model based on a maximum likelihood (ML) approach and study sufficient conditions for geometric ergodicity of the process. An Expectation-Maximization-type optimization for obtaining the ML estimators is also presented. A simulation study and a real-world application are performed to illustrate and evaluate the proposed methodology.

15.
The classical chi-square test for independence does not convey the degree of association between the two factors of a two-way contingency table. Moreover, association measures based on the contingency coefficient must be used with care, because they depend on the number of rows r and the number of columns c. This article proposes a multistage chi-square test to measure the degree of association between the two factors in a two-way contingency table. We also give simulated and real examples to assess the performance of the proposed method. The results show that the proposed method can effectively investigate the degree of association between the two factors in a two-way contingency table.
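To see why such measures need care, the contingency-coefficient family can be computed directly. A minimal sketch of the Pearson chi-square statistic together with Cramér's V, whose denominator n*(min(r, c) - 1) makes the dependence on the table dimensions explicit; the function name and example table are illustrative, not the paper's multistage test:

```python
import math

def chi2_and_cramers_v(table):
    """Pearson chi-square statistic and Cramer's V for an r x c table.
    V divides chi-square by n * (min(r, c) - 1), so the normalization
    depends on the numbers of rows and columns."""
    r, c = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    rs = [sum(row) for row in table]
    cs = [sum(table[i][j] for i in range(r)) for j in range(c)]
    chi2 = sum((table[i][j] - rs[i] * cs[j] / n) ** 2 / (rs[i] * cs[j] / n)
               for i in range(r) for j in range(c))
    v = math.sqrt(chi2 / (n * (min(r, c) - 1)))
    return chi2, v

chi2_stat, v = chi2_and_cramers_v([[20, 10], [10, 20]])
```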

16.
Cell counts in contingency tables can be smoothed using log-linear models. Recently, sampling-based methods such as Markov chain Monte Carlo (MCMC) have been introduced, making it possible to sample from posterior distributions. The novelty of the approach presented here is that all conditional distributions can be specified directly, so that straightforward Gibbs sampling is possible. Thus, the model is constructed in a way that makes burn-in and checking convergence a relatively minor issue. The emphasis of this paper is on smoothing cell counts in contingency tables, and not so much on estimation of regression parameters. Therefore, the prior distribution consists of two stages. We rely on a normal nonconjugate prior at the first stage, and a vague prior for hyperparameters at the second stage. The smoothed counts tend to compromise between the observed data and a log-linear model. The methods are demonstrated with a sparse data table taken from a multi-center clinical trial. The research for the first author was supported by the Brain Pool program of the Korean Federation of Science and Technology Societies. The research for the second author was partially supported by KOSEF through the Statistical Research Center for Complex Systems at Seoul National University.

17.
Abstract.  Necessary and sufficient conditions for collapsibility of a directed acyclic graph (DAG) model for a contingency table are derived. By applying the conditions, we can easily check collapsibility over any variable in a given model either by using the joint probability distribution or by using the graph of the model structure. It is shown that collapsibility over a set of variables can be checked in a sequential manner. Furthermore, a DAG is compared with its moral graph in the context of collapsibility.

18.
This paper addresses the problem of comparing the fit of latent class and latent trait models when the indicators are binary and the contingency table is sparse. This problem is common in the analysis of data from large surveys, where many items are associated with an unobservable variable. A study of human resource data illustrates: (1) how the usual goodness-of-fit tests, model selection and cross-validation criteria can be inconclusive; (2) how model selection and evaluation procedures from time series and economic forecasting can be applied to extend residual analysis in this context.

19.
Summary.  Structured additive regression models are perhaps the most commonly used class of models in statistical applications. This class includes, among others, (generalized) linear models, (generalized) additive models, smoothing spline models, state space models, semiparametric regression, spatial and spatiotemporal models, log-Gaussian Cox processes, and geostatistical and geoadditive models. We consider approximate Bayesian inference in a popular subset of structured additive regression models, latent Gaussian models, where the latent field is Gaussian, controlled by a few hyperparameters and with non-Gaussian response variables. The posterior marginals are not available in closed form owing to the non-Gaussian response variables. For such models, Markov chain Monte Carlo methods can be implemented, but they are not without problems, in terms of both convergence and computational time. In some practical applications, the extent of these problems is such that Markov chain Monte Carlo sampling is simply not an appropriate tool for routine analysis. We show that, by using an integrated nested Laplace approximation and its simplified version, we can directly compute very accurate approximations to the posterior marginals. The main benefit of these approximations is computational: where Markov chain Monte Carlo algorithms need hours or days to run, our approximations provide more precise estimates in seconds or minutes. Another advantage of our approach is its generality, which makes it possible to perform Bayesian analysis in an automatic, streamlined way, and to compute model comparison criteria and various predictive measures so that models can be compared and the model under study can be challenged.
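The core idea of a Laplace approximation (the building block of INLA, though not the nested scheme itself) can be shown in one dimension. A toy sketch, assuming i.i.d. Poisson counts with a flat prior on the rate: the posterior is approximated by a Gaussian centred at its mode, with variance from the inverse negative Hessian of the log posterior:

```python
def laplace_poisson(counts):
    """Laplace approximation in one dimension: for i.i.d. Poisson counts
    with a flat prior on the rate, the log posterior is
    s * log(rate) - n * rate, maximized at the sample mean; the Gaussian
    approximation uses the inverse negative Hessian as its variance."""
    n, s = len(counts), sum(counts)
    mode = s / n                 # argmax of s * log(rate) - n * rate
    var = mode ** 2 / s          # -1 / (second derivative at the mode)
    return mode, var

mode, var = laplace_poisson([3, 5, 4, 6, 2])
```

INLA nests such approximations over the hyperparameters and the latent field, which is why it can replace hours of sampling with a handful of optimizations and quadratures.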

20.
We consider testing the quasi-independence hypothesis for two-way contingency tables which contain some structural zero cells. For sparse contingency tables where the large-sample approximation is not adequate, Markov chain Monte Carlo exact tests are powerful tools. To construct a connected chain over the two-way contingency tables with fixed sufficient statistics and an arbitrary configuration of structural zero cells, an algebraic algorithm proposed by Diaconis and Sturmfels [Diaconis, P. and Sturmfels, B. (1998). The Annals of Statistics, 26, pp. 363–397] can be used. However, their algorithm is not a fully satisfactory answer, because the Markov basis it produces often contains many redundant elements and is hard to interpret. We derive an explicit characterization of a minimal Markov basis, prove its uniqueness, and present an algorithm for obtaining the unique minimal basis. A computational example and a discussion of further basis reduction for the case of positive sufficient statistics are also given.
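For an ordinary two-way table without structural zeros, the minimal Markov basis consists of the familiar +1/-1 "square" moves, and an MCMC exact test can be sketched directly. Below, a Metropolis sampler targets the hypergeometric distribution of tables with fixed margins and estimates the exact p-value of the chi-square statistic; this is a simplified illustration, not the authors' basis for tables with structural zeros:

```python
import math
import random

def chi2(t):
    """Pearson chi-square statistic of a two-way table."""
    r, c = len(t), len(t[0])
    n = sum(map(sum, t))
    rs = [sum(row) for row in t]
    cs = [sum(t[i][j] for i in range(r)) for j in range(c)]
    return sum((t[i][j] - rs[i] * cs[j] / n) ** 2 / (rs[i] * cs[j] / n)
               for i in range(r) for j in range(c))

def exact_pvalue(table, steps=20000, seed=1):
    """Estimate the exact p-value of the chi-square test by a Metropolis
    walk over tables with the observed margins, using the basic +1/-1
    square moves (structural zeros are not handled here)."""
    rng = random.Random(seed)
    t = [row[:] for row in table]
    obs = chi2(table)
    r, c = len(t), len(t[0])
    hits = 0
    for _ in range(steps):
        i1, i2 = rng.sample(range(r), 2)
        j1, j2 = rng.sample(range(c), 2)
        eps = rng.choice([1, -1])
        move = {(i1, j1): eps, (i1, j2): -eps, (i2, j1): -eps, (i2, j2): eps}
        if all(t[i][j] + d >= 0 for (i, j), d in move.items()):
            # hypergeometric weight ratio over the four changed cells
            ratio = 1.0
            for (i, j), d in move.items():
                ratio *= math.factorial(t[i][j]) / math.factorial(t[i][j] + d)
            if rng.random() < min(1.0, ratio):
                for (i, j), d in move.items():
                    t[i][j] += d
        if chi2(t) >= obs - 1e-12:
            hits += 1
    return hits / steps

p = exact_pvalue([[10, 2], [3, 11]])
```

With structural zeros, these square moves alone need not connect the set of tables, which is exactly the gap the paper's minimal-basis characterization addresses.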
