Similar Articles
20 similar articles found (search time: 31 ms)
1.
For high-dimensional data, it is a tedious task to determine anomalies such as outliers. We present a novel outlier detection method for high-dimensional contingency tables. We use the class of decomposable graphical models to model the relationship among the variables of interest, which can be depicted by an undirected graph called the interaction graph. Given an interaction graph, we derive a closed-form expression of the likelihood ratio test (LRT) statistic and an exact distribution for efficient simulation of the test statistic. An observation is declared an outlier if it deviates significantly from the approximated distribution of the test statistic under the null hypothesis. We demonstrate the use of the LRT outlier detection framework on genetic data modeled by Chow–Liu trees.
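A minimal sketch of the simulate-then-threshold idea this abstract describes: a score for each observation is compared against a simulated null distribution. The toy two-variable independence model and the deviance-style score below are illustrative assumptions, not the paper's closed-form LRT.

```python
# Simulation-based outlier flagging on a toy 2x2 contingency model.
# The score and the independence model are illustrative, not the paper's LRT.
import numpy as np

rng = np.random.default_rng(0)

# Toy model: two independent binary variables with known cell probabilities.
p_x, p_y = 0.7, 0.4
cell_prob = np.outer([1 - p_x, p_x], [1 - p_y, p_y])

def score(x, y):
    """Deviance-style score of an observed cell: low probability -> high score."""
    return -2.0 * np.log(cell_prob[x, y])

# Approximate the null distribution of the score by Monte Carlo simulation.
sims = (rng.random((100_000, 2)) < [p_x, p_y]).astype(int)
null_scores = score(sims[:, 0], sims[:, 1])
threshold = np.quantile(null_scores, 0.99)   # flag the top 1% as outliers

# An observation is declared an outlier if its score exceeds the threshold.
print(score(1, 1), threshold, score(1, 1) > threshold)
```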

2.
The late-2000s financial crisis stressed the need to understand the world financial system as a network of countries, where cross-border financial linkages play a fundamental role in the spread of systemic risks. Financial network models, which take into account the complex interrelationships between countries, seem to be an appropriate tool in this context. To improve the statistical performance of financial network models, we propose to generate them by means of multivariate graphical models. We then introduce Bayesian graphical models, which can take model uncertainty into account, and dynamic Bayesian graphical models, which provide a convenient framework for modelling temporal cross-border data by decomposing the model into autoregressive and contemporaneous networks. The article shows how applying the proposed models to the Bank for International Settlements locational banking statistics allows the identification of four distinct groups of countries that can be considered central in systemic risk contagion.
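A minimal sketch of the kind of contemporaneous network the abstract refers to, estimated here via partial correlations from the inverse sample covariance. This is a simple frequentist stand-in on synthetic data, not the paper's Bayesian graphical models; the country labels and covariance matrix are made up.

```python
# Contemporaneous network among countries from partial correlations.
# Synthetic data and a frequentist estimate, for illustration only.
import numpy as np

rng = np.random.default_rng(1)
countries = ["US", "UK", "DE", "JP"]
cov = np.array([[1.0, 0.6, 0.3, 0.0],
                [0.6, 1.0, 0.4, 0.0],
                [0.3, 0.4, 1.0, 0.2],
                [0.0, 0.0, 0.2, 1.0]])
flows = rng.multivariate_normal(np.zeros(4), cov, size=200)  # fake cross-border series

precision = np.linalg.inv(np.cov(flows, rowvar=False))       # inverse covariance
d = np.sqrt(np.diag(precision))
partial_corr = -precision / np.outer(d, d)                   # partial correlations
np.fill_diagonal(partial_corr, 1.0)

# Keep an edge when the partial correlation is non-negligible.
edges = [(countries[i], countries[j], round(partial_corr[i, j], 2))
         for i in range(len(countries)) for j in range(i + 1, len(countries))
         if abs(partial_corr[i, j]) > 0.1]
print(edges)
```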

3.
This paper considers a hierarchical Bayesian analysis of regression models using a class of Gaussian scale mixtures. This class provides a robust alternative to the common use of the Gaussian distribution as a prior, in particular for estimating a regression function subject to uncertainty about the constraint. For this purpose, we use a family of rectangular screened multivariate scale mixtures of Gaussian distributions as a prior for the regression function, which is flexible enough to reflect the degree of uncertainty about the functional constraint. Specifically, we propose a hierarchical Bayesian regression model for the constrained regression function with uncertainty, built on a three-stage prior hierarchy with Gaussian scale mixtures and referred to as the hierarchical screened scale mixture of Gaussian regression model (HSMGRM). We describe distributional properties of HSMGRM and an efficient Markov chain Monte Carlo algorithm for posterior inference, and apply the proposed model to real applications involving constrained regression subject to uncertainty.

4.
In this paper we discuss graphical models for mixed types of continuous and discrete variables with incomplete data. We use a set of hyperedges to represent an observed data pattern, where a hyperedge is a set of variables observed for a group of individuals. In a mixed graph with two types of vertices and two types of edges, dots and circles represent discrete and continuous variables, respectively. A normal graph represents a graphical model and a hypergraph represents an observed data pattern. In terms of the mixed graph, we discuss the decomposition of mixed graphical models with incomplete data, and we present a partial imputation method that can be used in the EM algorithm and the Gibbs sampler to speed their convergence. For a given mixed graphical model and an observed data pattern, we try to decompose a large graph into several small ones so that the original likelihood can be factored into a product of likelihoods, with distinct parameters, for the small graphs. When a graph cannot be decomposed because of its observed data pattern, we can impute missing data partially so that the graph becomes decomposable.

5.
Gaussian graphical models represent the backbone of the statistical toolbox for analyzing continuous multivariate systems. However, due to the intrinsic properties of the multivariate normal distribution, use of this model family may hide certain forms of context-specific independence that are natural to consider from an applied perspective. Such independencies were introduced earlier to generalize discrete graphical models and Bayesian networks into more flexible model families. Here, we adapt the idea of context-specific independence to Gaussian graphical models by introducing a stratification of the Euclidean space such that a conditional independence may hold in certain segments but be absent elsewhere. It is shown that the stratified models define a curved exponential family, which retains considerable tractability for parameter estimation and model selection.

6.
7.
8.
Statistical model learning problems are traditionally solved using either heuristic greedy optimization or stochastic simulation, such as Markov chain Monte Carlo or simulated annealing. Recently, there has been increasing interest in the use of combinatorial search methods, including those based on computational logic. Some of these methods are particularly attractive since they can also prove the global optimality of solutions, in contrast to stochastic algorithms that only guarantee optimality in the limit. Here we improve and generalize a recently introduced constraint-based method for learning undirected graphical models. The new method combines perfect elimination orderings with various strategies for solution pruning and offers a dramatic improvement in both time and memory complexity. We also show that the method can efficiently handle a more general class of models, called stratified/labeled graphical models, which have an astronomically larger model space.
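A minimal sketch of the perfect elimination ordering concept the method builds on: an ordering of the vertices is a perfect elimination ordering exactly when each vertex's later neighbours form a clique, and decomposable (chordal) graphs are the graphs that admit one. The brute-force check over all orderings below is for illustration only, not the paper's search procedure.

```python
# Check whether a vertex ordering is a perfect elimination ordering (PEO).
# Decomposable (chordal) graphs are exactly the graphs admitting a PEO.
from itertools import combinations, permutations

def is_perfect_elimination_ordering(adj, order):
    """adj: dict vertex -> set of neighbours; order: list of vertices."""
    position = {v: i for i, v in enumerate(order)}
    for v in order:
        later = [u for u in adj[v] if position[u] > position[v]]
        # Later neighbours of v must be pairwise adjacent (form a clique).
        if any(b not in adj[a] for a, b in combinations(later, 2)):
            return False
    return True

# A 4-cycle with a chord (chordal) versus a plain 4-cycle (not chordal).
chordal = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}
cycle   = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}

print(is_perfect_elimination_ordering(chordal, [1, 4, 2, 3]))           # True
print(any(is_perfect_elimination_ordering(cycle, list(p))
          for p in permutations(cycle)))                                # False
```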

9.
The importance of interval forecasts is reviewed. Several general approaches to calculating such forecasts are described and compared. They include the use of theoretical formulas based on a fitted probability model (with or without a correction for parameter uncertainty), various “approximate” formulas (which should be avoided), and empirically based, simulation, and resampling procedures. The latter are useful when theoretical formulas are not available or there are doubts about some model assumptions. The distinction between a forecasting method and a forecasting model is expounded. For large groups of series, a forecasting method may be chosen in a fairly ad hoc way. With appropriate checks, it may be possible to base interval forecasts on the model for which the method is optimal. It is certainly unsound to use a model for which the method is not optimal, but, strangely, this is sometimes done. Some general comments are made as to why prediction intervals tend to be too narrow in practice to encompass the required proportion of future observations. An example demonstrates the overriding importance of careful model specification. In particular, when data are “nearly nonstationary,” the difference between fitting a stationary and a nonstationary model is critical.
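A minimal sketch of the resampling route to interval forecasts mentioned above, for an AR(1) fitted by least squares: simulate future paths by resampling fitted residuals and read off empirical quantiles. The series, horizon and coverage level are illustrative, and a real application would also account for parameter uncertainty.

```python
# Bootstrap prediction intervals for an AR(1), by resampling fitted residuals.
import numpy as np

rng = np.random.default_rng(2)

# Synthetic AR(1) series.
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.6 * y[t - 1] + rng.normal()

# Least-squares fit of y_t = c + phi * y_{t-1} + e_t.
X = np.column_stack([np.ones(299), y[:-1]])
c, phi = np.linalg.lstsq(X, y[1:], rcond=None)[0]
resid = y[1:] - X @ [c, phi]

# Bootstrap h-step-ahead forecast paths by resampling residuals.
h, B = 5, 2000
paths = np.empty((B, h))
for b in range(B):
    last = y[-1]
    for step in range(h):
        last = c + phi * last + rng.choice(resid)
        paths[b, step] = last

lower, upper = np.quantile(paths, [0.025, 0.975], axis=0)
print(np.round(lower, 2), np.round(upper, 2))   # 95% prediction intervals
```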

10.
Combining statistical models is a useful approach in all research areas where a global picture of the problem needs to be constructed by binding together evidence from different sources [M.S. Massa and S.L. Lauritzen, Combining Statistical Models, M. Viana and H. Wynn, eds., American Mathematical Society, Providence, RI, 2010, pp. 239–259]. In this paper, we investigate the effectiveness of combining a fixed number of Gaussian graphical models that respect some consistency assumptions in problems of model building. In particular, we use the meta-Markov combination of Gaussian graphical models, as detailed in Massa and Lauritzen, and compare model selection results obtained by combining selections over smaller sets of variables with selection results over all variables of interest. To do so, we carry out simulation studies in which different criteria are considered for the selection procedures. We conclude that the combination generally performs better than global estimation, is computationally simpler because it has fewer and simpler models to work with, and has intuitive appeal in a wide variety of contexts.

11.
In this paper we derive general formulae for the biases to order n^{-1} of the parameter estimates in a general class of nonlinear regression models, where n is the sample size. The formulae are related to those of Cordeiro and McCullagh (1991) and Paula (1992) and may be viewed as extensions of their results. Correction factors are derived for the score and deviance component residuals in these models. The practical use of such corrections is illustrated for the log-gamma model.

12.
In statistical modeling, we strive to specify models that resemble data collected in studies or observed from processes. Consequently, distributional specification and parameter estimation are central to parametric models. Graphical procedures, such as the quantile–quantile (QQ) plot, are arguably the most widely used method of distributional assessment, though critics find their interpretation to be overly subjective. Formal goodness-of-fit tests are available and are quite powerful, but they only indicate whether there is a lack of fit, not why. In this article, we explore the use of the lineup protocol to inject rigor into graphical distributional assessment and compare its power to that of formal distributional tests. We find that lineup tests are considerably more powerful than traditional tests of normality. A further investigation into the design of QQ plots shows that de-trended QQ plots are more powerful than the standard approach, as long as the plot keeps distances in x and y on the same scale. While we focus on diagnosing nonnormality, our approach is general and can be extended directly to the assessment of other distributions.
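A minimal sketch of the lineup protocol for distributional assessment: the QQ plot of the observed data is hidden among panels simulated from the fitted normal, and the viewer is asked to pick the odd panel out. The panel count, layout, and the exponential example data are illustrative assumptions, not the article's study design.

```python
# QQ-plot lineup: one panel shows the real data, the rest show data simulated
# from the fitted normal. If the real panel is identifiable, normality is suspect.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(3)
data = rng.exponential(size=100)           # observed (actually non-normal) data
mu, sigma = data.mean(), data.std(ddof=1)  # fitted normal parameters

n_panels = 20
true_panel = rng.integers(n_panels)        # where the real data are hidden

fig, axes = plt.subplots(4, 5, figsize=(12, 9))
for i, ax in enumerate(axes.ravel()):
    sample = data if i == true_panel else rng.normal(mu, sigma, size=len(data))
    stats.probplot(sample, dist="norm", plot=ax)   # QQ plot against the normal
    ax.set_title(str(i))
    ax.set_xlabel("")
    ax.set_ylabel("")

plt.tight_layout()
plt.show()
print("real data are in panel", true_panel)   # reveal after the viewer guesses
```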

13.
In ecological studies, individual-level inference is made based on results from ecological models. Interpretation of the results requires caution, since an ecological analysis at the group level may not hold at the individual level within the groups, leading to the ecological fallacy. Using an ecological regression example analyzing voting behavior, we highlight that the explicit use of individual-level models is crucial for understanding the results of ecological studies. In particular, we clarify three relevant statistical issues for individual-level models: assessment of the uncertainty of parameter estimates obtained from a wrong model, the use of shrinkage estimation for simultaneous estimation of many parameters, and the necessity of sensitivity analysis rather than adherence to one seemingly most compelling assumption.

14.
In this article, we develop a new method, called regenerative randomization, for the transient analysis of continuous time Markov models with absorbing states. The method has the same good properties as standard randomization: numerical stability, well-controlled computation error, and ability to specify the computation error in advance. The method has a benign behavior for large t and is significantly less costly than standard randomization for large enough models and large enough t. For a class of models, class C, including typical failure/repair reliability models with exponential failure and repair time distributions and repair in every state with failed components, stronger theoretical results are available assessing the efficiency of the method in terms of “visible” model characteristics. A large example belonging to that class is used to illustrate the performance of the method and to show that it can indeed be much faster than standard randomization.
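A minimal sketch of standard randomization (uniformization), the baseline the regenerative method improves upon: the CTMC transient distribution is written as a Poisson-weighted sum of powers of a randomized discrete-time chain. The three-state generator, truncation rule, and time point are illustrative assumptions, not the article's example.

```python
# Standard randomization (uniformization) for a CTMC transient distribution:
# pi(t) = sum_k Poisson(k; Lambda*t) * pi0 * P^k, with P = I + Q/Lambda.
import numpy as np
from scipy.stats import poisson

# Generator of a tiny failure/repair-style model (states 0, 1, 2).
Q = np.array([[-0.10,  0.10,  0.00],
              [ 1.00, -1.05,  0.05],
              [ 0.00,  2.00, -2.00]])
Lambda = max(-np.diag(Q))               # uniformization rate
P = np.eye(3) + Q / Lambda              # randomized discrete-time chain

def transient(pi0, t, eps=1e-10):
    """Transient distribution at time t, truncating the Poisson sum at mass 1-eps."""
    K = int(poisson.ppf(1.0 - eps, Lambda * t)) + 1
    weights = poisson.pmf(np.arange(K + 1), Lambda * t)
    out, vec = np.zeros_like(pi0, dtype=float), np.asarray(pi0, dtype=float)
    for k in range(K + 1):
        out += weights[k] * vec
        vec = vec @ P                   # advance the randomized DTMC one step
    return out

print(transient(np.array([1.0, 0.0, 0.0]), t=5.0))
```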

15.
Model selection can be defined as the task of estimating the performance of different models in order to choose the most parsimonious one among a potentially very large set of candidate statistical models. We propose a graphical representation, to be considered an extension to the class of mixed models of the deviance plot proposed in the literature within the framework of classical and generalized linear models. Once a reduced number of models has been selected, this graphical representation allows important covariates to be identified by focusing only on the fixed-effects component, assuming the random part is properly specified. We also suggest a standalone figure representing the residual random variance ratio: cross-evaluation of the two graphical representations allows conclusions to be drawn about the specification of the random part of the model and supports a more accurate selection of the final model.

16.
Point process models are a natural approach for modelling data that arise as point events. In the case of Poisson counts, these may be fitted easily as a weighted Poisson regression. Point processes, however, lack the notion of sample size. This is problematic for model selection, because various classical criteria such as the Bayesian information criterion (BIC) are a function of the sample size, n, and are derived in an asymptotic framework where n tends to infinity. In this paper, we develop an asymptotic result for Poisson point process models in which the observed number of point events, m, plays the role that the sample size does in the classical regression context. Following from this result, we derive a version of BIC for point process models and, when models are fitted via penalised likelihood, conditions on the LASSO penalty that ensure consistency in estimation and the oracle property. We discuss the challenges of extending these results to the wider class of Gibbs models, of which the Poisson point process model is a special case.
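A minimal sketch of the role m plays above, for the simplest case of a homogeneous Poisson process on a window: the BIC penalty is assumed here to take the familiar k·log(m) form, with m replacing the classical sample size n. The window area and simulated count are illustrative, and the general (inhomogeneous, penalised) case is the paper's subject, not this sketch's.

```python
# BIC for a homogeneous Poisson point process, using the event count m in the penalty.
import numpy as np

def bic_homogeneous_poisson(m, window_area):
    """BIC for a homogeneous Poisson process with a single intensity parameter."""
    lam_hat = m / window_area                      # MLE of the intensity
    loglik = m * np.log(lam_hat) - lam_hat * window_area
    n_params = 1
    return -2.0 * loglik + n_params * np.log(m)    # penalty uses log(m), not log(n)

rng = np.random.default_rng(4)
area = 10.0
m = rng.poisson(3.0 * area)                        # simulated number of events
print(m, bic_homogeneous_poisson(m, area))
```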

17.
In this work, we propose a generalization of the classical Markov-switching ARMA models to the periodic time-varying case. Specifically, we propose a Markov-switching periodic ARMA (MS-PARMA) model. In addition to capturing the regime switching often encountered in the study of many economic time series, this new model also captures the periodicity feature in the autocorrelation structure. We first provide some probabilistic properties of this class of models, namely strict periodic stationarity and the existence of higher-order moments. We then propose a procedure for computing the autocovariance function, showing that the autocovariances of the MS-PARMA model satisfy a system of equations similar to the PARMA Yule–Walker equations. We also propose an easily implemented algorithm that can be used to obtain parameter estimates for the MS-PARMA model. Finally, a simulation study of the performance of the proposed estimation method is provided.
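A minimal sketch simulating a Markov-switching periodic AR(1), a simple pure-AR member of the MS-PARMA class described above: the regime follows a two-state Markov chain and the AR coefficient depends on both the regime and the season. All parameter values are made up for illustration; this is not the paper's estimation algorithm.

```python
# Simulate a Markov-switching periodic AR(1): regime-switching plus seasonality
# in the autoregressive coefficient (period S = 4). Parameters are illustrative.
import numpy as np

rng = np.random.default_rng(5)

trans = np.array([[0.95, 0.05],          # regime transition probabilities
                  [0.10, 0.90]])
phi = np.array([[0.2, 0.5, -0.3, 0.7],   # AR coefficients: regime 0, seasons 0..3
                [0.8, 0.1,  0.4, -0.2]]) # AR coefficients: regime 1
sigma = np.array([1.0, 2.0])             # regime-specific innovation scales

T, S = 400, 4
y, regime = np.zeros(T), 0
for t in range(1, T):
    regime = rng.choice(2, p=trans[regime])   # Markov regime switch
    season = t % S                            # periodic position within the year
    y[t] = phi[regime, season] * y[t - 1] + sigma[regime] * rng.normal()

print(y[:8].round(2))
```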

18.
A reparameterisation procedure is investigated for embedded model problems. The procedure is given by solving differential equations determined by indeterminate limit forms. Some properties concerning the existence of an embedded model are provided. Note that an embedded model may itself include another embedded model; we therefore introduce the concept of an embedded model of the kth generation and discuss the use of a one-by-one elimination procedure to construct graphs of embedded models. As examples, we derive embedded models for some distributions to which existing methods cannot be applied. Our method includes the method of Cheng, Evans, and Iles (1992, Embedded models in non-linear regression, J. R. Statist. Soc. B, 54: 877–888) as a special case.

19.
Econometric Reviews, 2012, 31(1): 27–53
Transformed diffusions (TDs) have become increasingly popular in financial modeling for their flexibility and tractability. While existing TD models are predominantly one-factor models, empirical evidence often favors models with multiple factors. We propose a novel distribution-driven nonlinear multifactor TD model with latent components. Our model is a transformation of an underlying multivariate Ornstein–Uhlenbeck (MVOU) process, where the transformation function is endogenously specified by a flexible parametric stationary distribution of the observed variable. Computationally efficient exact likelihood inference can be implemented for our model using a modified Kalman filter algorithm, and the transformed affine structure also allows us to price derivatives in semi-closed form. We compare the proposed multifactor model with existing TD models for modeling the VIX and pricing VIX futures. Our results show that the proposed model outperforms all existing TD models both in sample and out of sample, consistently across all categories and scenarios of our comparison.

20.
Both knowledge-based systems and statistical models are typically concerned with making predictions about future observables. Here we focus on the assessment of predictive performance and provide two techniques for improving the predictive performance of Bayesian graphical models. First, we present Bayesian model averaging, a technique for accounting for model uncertainty. Second, we describe a technique for eliciting a prior distribution for competing models from domain experts. We explore the predictive performance of both techniques in the context of a urological diagnostic problem.
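A minimal sketch of Bayesian model averaging for prediction: each candidate model's predictions are weighted by an approximate posterior model probability. BIC-based weights, polynomial candidate models, and the synthetic data are illustrative stand-ins, not the techniques or the diagnostic application studied in the paper.

```python
# Bayesian model averaging sketch: weight model predictions by approximate
# posterior model probabilities (BIC weights). Data and models are illustrative.
import numpy as np

rng = np.random.default_rng(6)
x = np.linspace(0, 1, 50)
y = 1.0 + 2.0 * x + rng.normal(scale=0.3, size=50)

def fit_polynomial(degree):
    """Least-squares polynomial fit; return fitted values and a Gaussian BIC."""
    X = np.vander(x, degree + 1)
    beta, rss = np.linalg.lstsq(X, y, rcond=None)[:2]
    n, k = len(y), degree + 1
    bic = n * np.log(rss[0] / n) + k * np.log(n)
    return X @ beta, bic

preds, bics = zip(*(fit_polynomial(d) for d in (1, 2, 3)))
bics = np.array(bics)

# Posterior model probabilities approximated from BIC differences.
weights = np.exp(-0.5 * (bics - bics.min()))
weights /= weights.sum()

bma_prediction = sum(w * p for w, p in zip(weights, preds))
print(weights.round(3), bma_prediction[:3].round(2))
```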
