Similar literature
1.
We propose a new class of prior distributions for the analysis of discrete graphical models. Such a class, obtained following a conditional approach, generalizes the hyper Dirichlet distributions of Dawid and Lauritzen (1993), since it can be extended to non-decomposable graphical models. The two classes are compared in terms of model selection, with an application to a medical data set illustrating the performance of the two resulting procedures. The proposed class turns out to select simpler, more parsimonious structures.

2.
We focus on the problem of selection of a subset of the variables so as to preserve the multivariate data structure that a principal-components analysis of the initial variables would reveal. We propose a new method based on some adapted Gaussian graphical models. This method is then compared with those developed by Bonifas et al. (1984) and Krzanowski (1987a, b). It appears that the criteria for all methods consider the same correlation submatrices and often lead to similar results. The proposed approach offers some guidance as to the number of variables to be selected. In particular, Akaike's information criterion is used.

3.
The graphical lasso has now become a useful tool to estimate high-dimensional Gaussian graphical models, but its practical applications suffer from the problem of choosing regularization parameters in a data-dependent way. In this article, we propose a model-averaged method for estimating sparse inverse covariance matrices for Gaussian graphical models. We consider the graphical lasso regularization path as the model space for Bayesian model averaging and use Markov chain Monte Carlo techniques for the regularization path point selection. Numerical performance of our method is investigated using both simulated and real datasets, in comparison with some state-of-the-art model selection procedures.

4.
This paper analyzes the impact of some kinds of contaminant on model selection in graphical Gaussian models. We investigate four different kinds of contaminants, in order to consider the effect of gross errors, model deviations, and model misspecification. The aim of the work is to assess against which kinds of contaminant a model selection procedure for graphical Gaussian models has a more robust behavior. The analysis is based on simulated data. The simulation study shows that relatively few contaminated observations in even just one of the variables can have a significant impact on correct model selection, especially when the contaminated variable is a node in a separating set of the graph.

5.
This article describes a propagation scheme for Bayesian networks with conditional Gaussian distributions that does not have the numerical weaknesses of the scheme derived in Lauritzen (Journal of the American Statistical Association 87: 1098–1108, 1992). The propagation architecture is that of Lauritzen and Spiegelhalter (Journal of the Royal Statistical Society, Series B 50: 157–224, 1988). In addition to the means and variances provided by the previous algorithm, the new propagation scheme yields full local marginal distributions. The new scheme also handles linear deterministic relationships between continuous variables in the network specification. The computations involved in the new propagation scheme are simpler than those in the previous scheme and the method has been implemented in the most recent version of the HUGIN software.

6.
Conditional Gaussian graphical models are a reparametrization of the multivariate linear regression model which explicitly exhibits (i) the partial covariances between the predictors and the responses, and (ii) the partial covariances between the responses themselves. Such models are particularly suitable for interpretability since partial covariances describe direct relationships between variables. In this framework, we propose a regularization scheme to enhance the learning strategy of the model by driving the selection of the relevant input features by prior structural information. It comes with an efficient alternating optimization procedure which is guaranteed to converge to the global minimum. On top of showing competitive performance on artificial and real datasets, our method demonstrates capabilities for fine interpretation, as illustrated on three high-dimensional datasets from spectroscopy, genetics, and genomics.

7.
Covariance estimation and selection for multivariate datasets in a high-dimensional regime is a fundamental problem in modern statistics. Gaussian graphical models are a popular class of models used for this purpose. Current Bayesian methods for inverse covariance matrix estimation under Gaussian graphical models require the underlying graph and hence the ordering of variables to be known. However, in practice, such information on the true underlying model is often unavailable. We therefore propose a novel permutation-based Bayesian approach to tackle the unknown variable ordering issue. In particular, we utilize multiple maximum a posteriori estimates under the DAG-Wishart prior for each permutation, and subsequently construct the final estimate of the inverse covariance matrix. The proposed estimator has smaller variability and is order-invariant. We establish posterior convergence rates under mild assumptions and illustrate that our method outperforms existing approaches in estimating the inverse covariance matrices via simulation studies.

8.
While conjugate Bayesian inference in decomposable Gaussian graphical models is largely solved, the non-decomposable case still poses difficulties concerned with the specification of suitable priors and the evaluation of normalizing constants. In this paper we derive the DY-conjugate prior (Diaconis & Ylvisaker, 1979) for non-decomposable models and show that it can be regarded as a generalization to an arbitrary graph G of the hyper inverse Wishart distribution (Dawid & Lauritzen, 1993). In particular, if G is an incomplete prime graph it constitutes a non-trivial generalization of the inverse Wishart distribution. Inference based on marginal likelihood requires the evaluation of a normalizing constant and we propose an importance sampling algorithm for its computation. Examples of structural learning involving non-decomposable models are given. In order to deal efficiently with the set of all positive definite matrices with non-decomposable zero-pattern we introduce the operation of triangular completion of an incomplete triangular matrix. Such a device turns out to be extremely useful both in the proof of theoretical results and in the implementation of the Monte Carlo procedure.

9.
The Gaussian graphical model (GGM) is one of the well-known modelling approaches to describe biological networks under the steady-state condition via the precision matrix of data. In the literature there are different methods to infer model parameters based on GGM. The neighbourhood selection with the lasso regression and the graphical lasso method are the most common techniques among these alternative estimation methods. But they can be computationally demanding when the system's dimension increases. Here, we suggest a non-parametric statistical approach, called the multivariate adaptive regression splines (MARS), as an alternative to GGM. To compare the performance of both models, we evaluate the findings of normal and non-normal data via the specificity, precision, F-measures and their computational costs. From the outputs, we see that MARS performs well, making it a plausible alternative to GGM in the construction of complex biological systems.
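
As a rough illustration of how a precision matrix encodes a network, the sketch below uses the standard identity that the partial correlation between variables i and j given all others is -ω_ij / sqrt(ω_ii ω_jj), so a zero entry means no edge. The function name `partial_correlation` and the example matrix are hypothetical and are not taken from the abstract above.

```python
from math import sqrt

def partial_correlation(omega, i, j):
    """Partial correlation of variables i and j given the rest,
    read off a precision (inverse covariance) matrix omega."""
    return -omega[i][j] / sqrt(omega[i][i] * omega[j][j])

# Hypothetical 3-variable precision matrix for a chain graph 0 - 1 - 2.
omega = [
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 2.0],
]

# An edge is drawn exactly when the partial correlation is non-zero.
edges = [
    (i, j)
    for i in range(3)
    for j in range(i + 1, 3)
    if abs(partial_correlation(omega, i, j)) > 1e-12
]
print(edges)
```

In practice the precision matrix has to be estimated from data, which is where methods such as neighbourhood selection or the graphical lasso come in; this sketch only covers the step from a given matrix to a graph.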

10.
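Multiple graphical models represent the dependence relationships among the same set of random variables observed in different classes: nodes represent random variables, edges represent direct links between variables, and the graphical model for each class reflects both its own dependence structure and the information shared across classes. Using a joint estimation method for multiple graphical models, data from different individuals are grouped into classes according to their characteristics; the dependence structure among the variables within each class is assumed to follow a single Gaussian graphical model, and the group lasso and graphical lasso are applied to jointly estimate the graph structure of each class. Numerical simulations confirm the effectiveness of the joint estimation method. The multiple graphical models and the joint estimation method are then used to analyze the dependence structure of 13 macroeconomic indicators for 15 Chinese provinces. The results show that the macroeconomic variables of provinces at different levels of economic development share common dependence relationships, which reflect the characteristics of China's current stage of economic development, while the dependence structure of each class reflects features unique to the economic development of the provinces in that class.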

11.
Gaussian graphical models represent the backbone of the statistical toolbox for analyzing continuous multivariate systems. However, due to the intrinsic properties of the multivariate normal distribution, use of this model family may hide certain forms of context-specific independence that are natural to consider from an applied perspective. Such independencies have been earlier introduced to generalize discrete graphical models and Bayesian networks into more flexible model families. Here, we adapt the idea of context-specific independence to Gaussian graphical models by introducing a stratification of the Euclidean space such that a conditional independence may hold in certain segments but be absent elsewhere. It is shown that the stratified models define a curved exponential family, which retains considerable tractability for parameter estimation and model selection.

12.
13.
A new methodology for model determination in decomposable graphical Gaussian models (Dawid and Lauritzen in Ann. Stat. 21(3), 1272–1317, 1993) is developed. The Bayesian paradigm is used and, for each given graph, a hyper-inverse Wishart prior distribution on the covariance matrix is considered. This prior distribution depends on hyper-parameters. It is well known that the models' posterior distribution is sensitive to the specification of these hyper-parameters and no completely satisfactory method has been reported. In order to avoid this problem, we suggest adopting an empirical Bayes strategy, that is, a strategy for which the values of the hyper-parameters are determined using the data. Typically, the hyper-parameters are fixed to their maximum likelihood estimates. In order to calculate these maximum likelihood estimates, we suggest a Markov chain Monte Carlo version of the Stochastic Approximation EM algorithm. Moreover, we introduce a new sampling scheme in the space of graphs that improves the add and delete proposal of Armstrong et al. (Stat. Comput. 19(3), 303–316, 2009). We illustrate the efficiency of this new scheme on simulated and real datasets.

14.
We present an objective Bayes method for covariance selection in Gaussian multivariate regression models having a sparse regression and covariance structure, the latter being Markov with respect to a directed acyclic graph (DAG). Our procedure can be easily complemented with a variable selection step, so that variable and graphical model selection can be performed jointly. In this way, we offer a solution to a problem of growing importance especially in the area of genetical genomics (eQTL analysis). The input of our method is a single default prior, essentially involving no subjective elicitation, while its output is a closed form marginal likelihood for every covariate-adjusted DAG model, which is constant over each class of Markov equivalent DAGs; our procedure thus naturally encompasses covariate-adjusted decomposable graphical models. In realistic experimental studies, our method is highly competitive, especially when the number of responses is large relative to the sample size.

15.
A model company
HUGIN Expert is a small company writing software that can be used to create expert systems, using probability in the guise of graphical models. Steffen Lauritzen describes his part in the genesis and development of the company.

16.
High-dimensional data arise in diverse fields of science, engineering and the humanities. Variable selection plays an important role in high-dimensional statistical modelling. In this article, we study the variable selection of quadratic approximation via the smoothly clipped absolute deviation (SCAD) penalty with a diverging number of parameters. We provide a unified method to select variables and estimate parameters for various high-dimensional models. Under appropriate conditions and with a proper regularization parameter, we show that the estimator has consistency and sparsity, and the estimators of nonzero coefficients enjoy the asymptotic normality as they would have if the zero coefficients were known in advance. In addition, under some mild conditions, we can obtain the global solution of the penalized objective function with the SCAD penalty. Numerical studies and a real data analysis are carried out to confirm the performance of the proposed method.
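
For readers unfamiliar with the SCAD penalty named above, the following is a minimal sketch of its usual piecewise form: linear (lasso-like) near zero, quadratic in a transition zone, and constant for large coefficients. The function name `scad_penalty` is hypothetical, and the default a = 3.7 is the value commonly suggested in the SCAD literature, not something stated in the abstract.

```python
def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty for a single coefficient theta.

    lam > 0 is the regularization parameter; a > 2 controls where the
    penalty flattens out (a = 3.7 is an assumed conventional default).
    """
    t = abs(theta)
    if t <= lam:
        # Lasso-like region: penalty grows linearly.
        return lam * t
    if t <= a * lam:
        # Quadratic transition that joins the two outer pieces continuously.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients receive a constant penalty, hence little shrinkage.
    return lam * lam * (a + 1) / 2

for value in (0.5, 2.0, 10.0):
    print(value, scad_penalty(value, lam=1.0))
```

Because the penalty is flat beyond a·lam, large coefficients are not shrunk further, which is the intuition behind the near-unbiasedness that motivates SCAD over the plain lasso.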

17.
Most existing reduced-form macroeconomic multivariate time series models employ elliptical disturbances, so that the forecast densities produced are symmetric. In this article, we use a copula model with asymmetric margins to produce forecast densities with the scope for severe departures from symmetry. Empirical and skew t distributions are employed for the margins, and a high-dimensional Gaussian copula is used to jointly capture cross-sectional and (multivariate) serial dependence. The copula parameter matrix is given by the correlation matrix of a latent stationary and Markov vector autoregression (VAR). We show that the likelihood can be evaluated efficiently using the unique partial correlations, and estimate the copula using Bayesian methods. We examine the forecasting performance of the model for four U.S. macroeconomic variables between 1975:Q1 and 2011:Q2 using quarterly real-time data. We find that the point and density forecasts from the copula model are competitive with those from a Bayesian VAR. During the recent recession the forecast densities exhibit substantial asymmetry, avoiding some of the pitfalls of the symmetric forecast densities from the Bayesian VAR. We show that the asymmetries in the predictive distributions of GDP growth and inflation are similar to those found in the probabilistic forecasts from the Survey of Professional Forecasters. Last, we find that unlike the linear VAR model, our fitted Gaussian copula models exhibit nonlinear dependencies between some macroeconomic variables. This article has online supplementary material.

18.
One of the most basic topics in many introductory statistical methods texts is inference for a population mean, μ. The primary tool for confidence intervals and tests is the Student t sampling distribution. Although the derivation requires independent identically distributed normal random variables with constant variance, σ², most authors reassure the readers about some robustness to the normality and constant variance assumptions. Some point out that if one is concerned about assumptions, one may statistically test these prior to reliance on the Student t. Most software packages provide optional test results for both (a) the Gaussian assumption and (b) homogeneity of variance. Many textbooks advise only informal graphical assessments, such as certain scatterplots for independence, others for constant variance, and normal quantile–quantile plots for the adequacy of the Gaussian model. We concur with this recommendation. As convincing evidence against formal tests of (a), such as the Shapiro–Wilk, we offer a simulation study of the tails of the resulting conditional sampling distributions of the Studentized mean. We analyze the results of systematically screening all samples from normal, uniform, exponential, and Cauchy populations. This pretest does not correct the erroneous significance levels and makes matters worse for the exponential. In practice, we conclude that graphical diagnostics are better than a formal pretest. Furthermore, rank or permutation methods are recommended for exact validity in the symmetric case.

19.
Although the t-type estimator is a kind of M-estimator with scale optimization, it has some advantages over the M-estimator. In this article, we first propose a t-type joint generalized linear model as a robust extension of the classical joint generalized linear models for modeling data containing extreme or outlying observations. Next, we develop a t-type pseudo-likelihood (TPL) approach, which can be viewed as a robust version of the existing pseudo-likelihood (PL) approach. To determine which variables significantly affect the variance of the response variable, we then propose a unified penalized maximum TPL method to simultaneously select significant variables for the mean and dispersion models in t-type joint generalized linear models. Thus, the proposed variable selection method can simultaneously perform parameter estimation and variable selection in the mean and dispersion models. With appropriate selection of the tuning parameters, we establish the consistency and the oracle property of the regularized estimators. Simulation studies are conducted to illustrate the proposed methods.

20.
The Andersson–Madigan–Perlman (AMP) Markov property is a recently proposed alternative Markov property for chain graphs. In the case of continuous variables with a joint multivariate Gaussian distribution, it is the AMP rather than the earlier introduced Lauritzen–Wermuth–Frydenberg Markov property that is coherent with data-generation by natural block-recursive regressions. In this paper, we show that maximum likelihood estimates in Gaussian AMP chain graph models can be obtained by combining generalized least squares and iterative proportional fitting into an iterative algorithm. In an appendix, we give useful convergence results for iterative partial maximization algorithms that apply in particular to the described algorithm.
