Similar literature
1.
We propose an extension of graphical log‐linear models to allow for symmetry constraints on some interaction parameters that represent homologous factors. The conditional independence structure of such quasi‐symmetric (QS) graphical models is described by an undirected graph with coloured edges, in which a particular colour corresponds to a set of equality constraints on a set of parameters. Unlike standard QS models, the proposed models apply to contingency tables for which only some variables or sets of variables have the same categories. We study the graphical properties of such models, including conditions for decomposition of model parameters and of maximum likelihood estimates.

2.
Combining statistical models is a useful approach in all research areas where a global picture of the problem needs to be constructed by binding together evidence from different sources [M.S. Massa and S.L. Lauritzen, Combining Statistical Models, M. Viana and H. Wynn, eds., American Mathematical Society, Providence, RI, 2010, pp. 239–259]. In this paper, we investigate the effectiveness of combining a fixed number of Gaussian graphical models, respecting some consistency assumptions, in problems of model building. In particular, we use the meta-Markov combination of Gaussian graphical models as detailed in Massa and Lauritzen and compare model selection results obtained by combining selections over smaller sets of variables with selection results over all variables of interest. To do so, we carry out simulation studies in which different criteria are considered for the selection procedures. We conclude that the combination generally performs better than global estimation, is computationally simpler by virtue of having fewer and simpler models to work on, and has intuitive appeal in a wide variety of contexts.

3.
In this paper we consider the regression problem for random sets of the Boolean-model type. Regression models for Boolean random sets with explanatory variables are classified, according to the type of these variables, as propagation, growth, or propagation-growth models. The maximum likelihood estimation of the parameters for the propagation model is explained in detail for some specific link functions using three methods. These three methods of estimation are also compared in a simulation study.

4.
Graphical models capture the conditional independence structure among random variables via the existence of edges among vertices. One way of inferring a graph is to identify zero partial correlation coefficients, which is an effective way of finding conditional independence under a multivariate Gaussian setting. For more general settings, we propose kernel partial correlation, which extends partial correlation with a combination of two kernel methods. First, nonparametric function estimation is employed to remove effects from other variables, and then the dependence between the remaining random components is assessed through a nonparametric association measure. The proposed approach is not only flexible but also robust under high levels of noise, owing to the robustness of the nonparametric approaches.
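
The Gaussian baseline mentioned in this abstract can be illustrated with a minimal sketch. It shows only the classical partial correlation computed from the inverse covariance (precision) matrix, not the kernel partial correlation the abstract proposes. The function name `partial_correlations` and the three-variable chain are hypothetical and chosen for illustration only.

```python
import numpy as np

def partial_correlations(cov):
    """Partial correlation of each pair given all other variables.

    Uses the standard identity rho_ij.rest = -k_ij / sqrt(k_ii * k_jj),
    where K = inverse(cov) is the precision matrix.
    """
    precision = np.linalg.inv(np.asarray(cov, dtype=float))
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Hypothetical chain X0 - X1 - X2: the (0, 2) precision entry is zero,
# so X0 and X2 are conditionally independent given X1 (Gaussian case).
chain_precision = np.array([[2.0, -1.0, 0.0],
                            [-1.0, 2.0, -1.0],
                            [0.0, -1.0, 2.0]])
chain_cov = np.linalg.inv(chain_precision)
chain_pcor = partial_correlations(chain_cov)
```

In this toy example the marginal covariance between X0 and X2 is nonzero while their partial correlation vanishes, which is the missing-edge pattern a Gaussian graphical model encodes.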

5.
Markov networks are popular models for discrete multivariate systems where the dependence structure of the variables is specified by an undirected graph. To allow for more expressive dependence structures, several generalizations of Markov networks have been proposed. Here, we consider the class of contextual Markov networks which takes into account possible context‐specific independences among pairs of variables. Structure learning of contextual Markov networks is very challenging due to the extremely large number of possible structures. One of the main challenges has been to design a score, by which a structure can be assessed in terms of model fit related to complexity, without assuming chordality. Here, we introduce the marginal pseudo‐likelihood as an analytically tractable criterion for general contextual Markov networks. Our criterion is shown to yield a consistent structure estimator. Experiments demonstrate the favourable properties of our method in terms of predictive accuracy of the inferred models.

6.
Log-linear models for the distribution on a contingency table are represented as the intersection of only two kinds of log-linear models: one assuming that a certain group of the variables, if conditioned on all other variables, has a jointly independent distribution, and another assuming that a certain group of the variables, if conditioned on all other variables, has no highest order interaction. The subsets entering into these models are uniquely determined by the original log-linear model. This canonical representation suggests considering joint conditional independence and conditional no highest order association as the elementary building blocks of log-linear models.
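
As a small illustration of the first building block, the sketch below fits the simplest conditional independence model, A independent of B given C, to a three-way table using the standard closed-form maximum likelihood fitted counts n(a,+,c) n(+,b,c) / n(+,+,c). It is not the canonical representation described in the abstract. The function name and the example counts are hypothetical, and every layer total n(+,+,c) is assumed to be positive.

```python
import numpy as np

def fit_conditional_independence(counts):
    """Fitted counts under A independent of B given C for counts[a, b, c]."""
    n = np.asarray(counts, dtype=float)
    n_ac = n.sum(axis=1, keepdims=True)       # A-by-C margin, shape (A, 1, C)
    n_bc = n.sum(axis=0, keepdims=True)       # B-by-C margin, shape (1, B, C)
    n_c = n.sum(axis=(0, 1), keepdims=True)   # C margin, shape (1, 1, C)
    return n_ac * n_bc / n_c

# Hypothetical 2 x 3 x 2 table of counts.
observed = np.arange(1, 13).reshape(2, 3, 2)
fitted = fit_conditional_independence(observed)
```

The fitted table reproduces the observed A-by-C and B-by-C margins, and within each level of C the fitted A-by-B slice is an outer product of its margins.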

7.
As new technologies permit the generation of hitherto unprecedented volumes of data (e.g. genome-wide association study data), researchers struggle to keep up with the added complexity and time commitment required for its analysis. For this reason, model selection commonly relies on machine learning and data-reduction techniques, which tend to afford models with obscure interpretations. Even in cases with straightforward explanatory variables, the so-called ‘best’ model produced by a given model-selection technique may fail to capture information of vital importance to the domain-specific questions at hand. Herein we propose a new concept for model selection, feasibility, for use in identifying multiple models that are in some sense optimal and may unite to provide a wider range of information relevant to the topic of interest, including (but not limited to) interaction terms. We further provide an R package and associated Shiny Applications for use in identifying or validating feasible models, the performance of which we demonstrate on both simulated and real-life data.

8.
The log-linear model is a tool widely accepted for modelling discrete data given in a contingency table. Although its parameters reflect the interaction structure in the joint distribution of all variables, it does not give information about structures appearing in the margins of the table. This is in contrast to multivariate logistic parameters, recently introduced by Glonek & McCullagh (1995), which have as parameters the highest order log odds ratios derived from the joint table and from each marginal table. Glonek & McCullagh give the link between the cell probabilities and the multivariate logistic parameters, in an algebraic fashion. The present paper focuses on this link, showing that it is derived by general parameter transformations in exponential families. In particular, the connection between the natural, the expectation and the mixed parameterization in exponential families (Barndorff-Nielsen, 1978) is used; this also yields the derivatives of the likelihood equation and shows properties of the Fisher matrix. The paper emphasises the analysis of independence hypotheses in margins of a contingency table.

9.
Some general remarks are made about likelihood factorizations, distinguishing parameter-based factorizations and concentration-graph factorizations. Two parametric families of distributions for mixed discrete and continuous variables are discussed. Conditions on graphs are given for the circumstances under which their joint analysis can be split into separate analyses, each involving a reduced set of component variables and parameters. The result shows marked differences between the two families although both involve the same necessary condition on prime graphs. This condition is both necessary and sufficient for simplified estimation in Gaussian models and in discrete log-linear models.

10.
We extend the log‐mean linear parameterization for binary data to discrete variables with an arbitrary number of levels and show that in this case, too, it can be used to parameterize bi‐directed graph models. Furthermore, we show that the log‐mean linear parameterization allows one to simultaneously represent marginal independencies among variables and marginal independencies that only appear when certain levels are collapsed into a single one. We illustrate the application of this property by means of an example based on genetic association studies involving single‐nucleotide polymorphisms. More generally, this feature provides a natural way to reduce the parameter count, while preserving the independence structure, by means of substantive constraints that give additional insight into the association structure of the variables. © 2014 Board of the Foundation of the Scandinavian Journal of Statistics

11.
Graphical Markov models use undirected graphs (UDGs), acyclic directed graphs (ADGs), or (mixed) chain graphs to represent possible dependencies among random variables in a multivariate distribution. Whereas a UDG is uniquely determined by its associated Markov model, this is not true for ADGs or for general chain graphs (which include both UDGs and ADGs as special cases). This paper addresses three questions regarding the equivalence of graphical Markov models: when is a given chain graph Markov equivalent (1) to some UDG? (2) to some (at least one) ADG? (3) to some decomposable UDG? The answers are obtained by means of an extension of Frydenberg’s (1990) elegant graph-theoretic characterization of the Markov equivalence of chain graphs.

12.
Strict collapsibility and model collapsibility are two important concepts associated with reducing the dimension of a multidimensional contingency table without losing the relevant information. In this paper, we obtain some necessary and sufficient conditions for the strict collapsibility of the full model, with respect to an interaction factor or a set of interaction factors, based on the interaction parameters of the conditional/layer log-linear models. For hierarchical log-linear models, we also present necessary and sufficient conditions for the full model to be model collapsible, based on the conditional interaction parameters. We discuss both the cases where one variable or a set of variables is conditioned. The connections between strict collapsibility and model collapsibility are also pointed out. Our results are illustrated through suitable examples, including a real-life application.

13.
Maximal correlation has several desirable properties as a measure of dependence, including the fact that it vanishes if and only if the variables are independent. Except for a few special cases, it is hard to evaluate maximal correlation explicitly. We focus on two-dimensional contingency tables and discuss a procedure for estimating maximal correlation, which we use for constructing a test of independence. We compare the maximal correlation test with other tests of independence by Monte Carlo simulations. When the underlying continuous variables are dependent but uncorrelated, we point out some cases for which the new test is more powerful.
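
For a two-way table, one standard way to compute maximal correlation is as the second-largest singular value of the matrix with entries p_ij / sqrt(p_i+ p_+j), whose largest singular value is always 1. The sketch below applies this as a plug-in estimate to observed counts. It is an illustration under that assumption, not necessarily the estimation procedure or test used in the abstract. The function name and both example tables are hypothetical, and the table is assumed to be at least 2 x 2 with strictly positive row and column totals.

```python
import numpy as np

def maximal_correlation(table):
    """Plug-in maximal correlation of a two-way contingency table."""
    p = np.asarray(table, dtype=float)
    p = p / p.sum()                      # cell proportions
    row = p.sum(axis=1)                  # row margins p_i+
    col = p.sum(axis=0)                  # column margins p_+j
    q = p / np.sqrt(np.outer(row, col))  # normalised table
    singular_values = np.linalg.svd(q, compute_uv=False)
    return singular_values[1]            # the largest one is always 1

# Rows are X in {-1, 0, 1}, columns are Y in {0, 1}, with Y = X**2:
# X and Y are uncorrelated yet perfectly dependent.
dependent_uncorrelated = [[0, 10], [10, 0], [0, 10]]
mc_dependent = maximal_correlation(dependent_uncorrelated)

# Counts proportional to an outer product of margins: independence.
independent = [[10, 20], [30, 60]]
mc_independent = maximal_correlation(independent)
```

The first table is a case of the kind the abstract mentions: the Pearson correlation of the scores is zero, but the maximal correlation is at its upper bound because Y is a function of X.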

14.
We focus on the problem of selection of a subset of the variables so as to preserve the multivariate data structure that a principal-components analysis of the initial variables would reveal. We propose a new method based on some adapted Gaussian graphical models. This method is then compared with those developed by Bonifas et al. (1984) and Krzanowski (1987a, b). It appears that the criteria for all methods consider the same correlation submatrices and often lead to similar results. The proposed approach offers some guidance as to the number of variables to be selected. In particular, Akaike's information criterion is used.

15.
We propose a class of multidimensional Item Response Theory models for polytomously-scored items with ordinal response categories. This class extends an existing class of multidimensional models for dichotomously-scored items in which the latent abilities are represented by a random vector assumed to have a discrete distribution, with support points corresponding to different latent classes in the population. In the proposed approach, we allow for different parameterizations for the conditional distribution of the response variables given the latent traits, which depend on the type of link function and the constraints imposed on the item parameters. Moreover, we suggest a strategy for model selection that is based on a series of steps consisting of selecting specific features, such as the dimension of the model (number of latent traits), the number of latent classes, and the specific parameterization. In order to illustrate the proposed approach, we analyze a dataset from a study on anxiety and depression on a sample of oncological patients.

16.
A bank offering unsecured personal loans may be interested in several related outcome variables, including defaulting on the repayments, early repayment or failing to take up an offered loan. Current predictive models used by banks typically consider such variables individually. However, the fact that they are related to each other, and to many interrelated potential predictor variables, suggests that graphical models may provide an attractive alternative solution. We developed such a model for a data set of 15 variables measured on a set of 14 000 applications for unsecured personal loans. The resulting global model of behaviour enabled us to identify several previously unsuspected relationships of considerable interest to the bank. For example, we discovered important but obscure relationships between taking out insurance, prior delinquency with a credit card and delinquency with the loan.

17.
In this paper, the Rayleigh–Lindley (RL) distribution is introduced, obtained by compounding the Rayleigh and Lindley discrete distributions, where the compounding procedure follows an approach similar to the one previously studied by Adamidis and Loukas in some other contexts. The resulting distribution is a two-parameter model, which is competitive with other parsimonious models such as the gamma and Weibull distributions. We study some properties of this new model, such as the moments and the mean residual life. Estimation is carried out via the EM algorithm. The behavior of these estimators is studied in finite samples through a simulation study. Finally, we report two real data illustrations in order to show the performance of the proposed model versus other common two-parameter models in the literature. The main conclusion is that the proposed model can be a valid alternative to other competing models well established in the literature.

18.
Extended log-linear models (ELMs) are the natural generalization of log-linear models when the positivity assumption is relaxed. The hypergraph language, which is currently used to specify the syntax of ELMs, both provides an insight into key notions of the theory of ELMs, such as collapsibility and decomposability, and allows one to work out efficient algorithms to solve some problems of inference. This is the case for the three search problems addressed in this paper and referred to as the approximation problem, the selective-reduction problem and the synthesis problem. The approximation problem consists in finding the smallest decomposable ELM that contains a given ELM and is such that the given ELM is collapsible onto each of its generators. The selective-reduction problem consists in deleting the maximum number of generators of a given ELM in such a way that the resulting ELM is a submodel and none of certain variables of interest is missing. The synthesis problem consists in finding a minimal ELM containing the intersection of ELMs specified by given independence relations. We show that each of the three search problems above can be reduced to an equivalent search problem on hypergraphs, which can be solved in polynomial time.

19.
In this article we obtain some novel results on pairwise quasi-asymptotically independent (pQAI) random variables. Specifically, let X1, …, Xn be n real-valued pQAI random variables, and let W1, …, Wn be another n nonnegative and arbitrarily dependent random variables that are independent of X1, …, Xn. Under some mild conditions, we prove that W1X1, …, WnXn are still pQAI. Our result holds in a general setting, whether or not the primary random variables X1, …, Xn are heavy-tailed. Finally, a special case of the above result is applied to risk theory to investigate the finite-time ruin probability for a discrete-time risk model with a wide type of dependence structure.

20.
Gaussian graphical models represent the backbone of the statistical toolbox for analyzing continuous multivariate systems. However, due to the intrinsic properties of the multivariate normal distribution, use of this model family may hide certain forms of context-specific independence that are natural to consider from an applied perspective. Such independencies were introduced earlier to generalize discrete graphical models and Bayesian networks into more flexible model families. Here, we adapt the idea of context-specific independence to Gaussian graphical models by introducing a stratification of the Euclidean space such that a conditional independence may hold in certain segments but be absent elsewhere. It is shown that the stratified models define a curved exponential family, which retains considerable tractability for parameter estimation and model selection.
