首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
The Gaussian graphical model (GGM) is one of the well-known modelling approaches to describe biological networks under the steady-state condition via the precision matrix of data. In literature there are different methods to infer model parameters based on GGM. The neighbourhood selection with the lasso regression and the graphical lasso method are the most common techniques among these alternative estimation methods. But they can be computationally demanding when the system's dimension increases. Here, we suggest a non-parametric statistical approach, called the multivariate adaptive regression splines (MARS) as an alternative of GGM. To compare the performance of both models, we evaluate the findings of normal and non-normal data via the specificity, precision, F-measures and their computational costs. From the outputs, we see that MARS performs well, resulting in, a plausible alternative approach with respect to GGM in the construction of complex biological systems.  相似文献   

2.
We consider the problem of constructing nonlinear regression models with Gaussian basis functions, using lasso regularization. Regularization with a lasso penalty is an advantageous in that it estimates some coefficients in linear regression models to be exactly zero. We propose imposing a weighted lasso penalty on a nonlinear regression model and thereby selecting the number of basis functions effectively. In order to select tuning parameters in the regularization method, we use a deviance information criterion proposed by Spiegelhalter et al. (2002), calculating the effective number of parameters by Gibbs sampling. Simulation results demonstrate that our methodology performs well in various situations.  相似文献   

3.
Regularization and variable selection via the elastic net   总被引:2,自引:0,他引:2  
Summary.  We propose the elastic net, a new regularization and variable selection method. Real world data and a simulation study show that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation. In addition, the elastic net encourages a grouping effect, where strongly correlated predictors tend to be in or out of the model together. The elastic net is particularly useful when the number of predictors ( p ) is much bigger than the number of observations ( n ). By contrast, the lasso is not a very satisfactory variable selection method in the p ≫ n case. An algorithm called LARS-EN is proposed for computing elastic net regularization paths efficiently, much like algorithm LARS does for the lasso.  相似文献   

4.
Gaussian graphical models represent the backbone of the statistical toolbox for analyzing continuous multivariate systems. However, due to the intrinsic properties of the multivariate normal distribution, use of this model family may hide certain forms of context-specific independence that are natural to consider from an applied perspective. Such independencies have been earlier introduced to generalize discrete graphical models and Bayesian networks into more flexible model families. Here, we adapt the idea of context-specific independence to Gaussian graphical models by introducing a stratification of the Euclidean space such that a conditional independence may hold in certain segments but be absent elsewhere. It is shown that the stratified models define a curved exponential family, which retains considerable tractability for parameter estimation and model selection.  相似文献   

5.
We present an objective Bayes method for covariance selection in Gaussian multivariate regression models having a sparse regression and covariance structure, the latter being Markov with respect to a directed acyclic graph (DAG). Our procedure can be easily complemented with a variable selection step, so that variable and graphical model selection can be performed jointly. In this way, we offer a solution to a problem of growing importance especially in the area of genetical genomics (eQTL analysis). The input of our method is a single default prior, essentially involving no subjective elicitation, while its output is a closed form marginal likelihood for every covariate‐adjusted DAG model, which is constant over each class of Markov equivalent DAGs; our procedure thus naturally encompasses covariate‐adjusted decomposable graphical models. In realistic experimental studies, our method is highly competitive, especially when the number of responses is large relative to the sample size.  相似文献   

6.
We consider the problem of selecting variables in factor analysis models. The $L_1$ regularization procedure is introduced to perform an automatic variable selection. In the factor analysis model, each variable is controlled by multiple factors when there are more than one underlying factor. We treat parameters corresponding to the multiple factors as grouped parameters, and then apply the group lasso. Furthermore, the weight of the group lasso penalty is modified to obtain appropriate estimates and improve the performance of variable selection. Crucial issues in this modeling procedure include the selection of the number of factors and a regularization parameter. Choosing these parameters can be viewed as a model selection and evaluation problem. We derive a model selection criterion for evaluating the factor analysis model via the weighted group lasso. Monte Carlo simulations are conducted to investigate the effectiveness of the proposed procedure. A real data example is also given to illustrate our procedure. The Canadian Journal of Statistics 40: 345–361; 2012 © 2012 Statistical Society of Canada  相似文献   

7.
Conditional Gaussian graphical models are a reparametrization of the multivariate linear regression model which explicitly exhibits (i) the partial covariances between the predictors and the responses, and (ii) the partial covariances between the responses themselves. Such models are particularly suitable for interpretability since partial covariances describe direct relationships between variables. In this framework, we propose a regularization scheme to enhance the learning strategy of the model by driving the selection of the relevant input features by prior structural information. It comes with an efficient alternating optimization procedure which is guaranteed to converge to the global minimum. On top of showing competitive performance on artificial and real datasets, our method demonstrates capabilities for fine interpretation, as illustrated on three high-dimensional datasets from spectroscopy, genetics, and genomics.  相似文献   

8.
The multivariate adaptive regression splines (MARS) model is one of the well-known, additive non-parametric models that can deal with highly correlated and nonlinear datasets successfully. From our previous analyses, we have seen that lasso-type MARS (LMARS) can be a strong alternative of the Gaussian graphical model (GGM) which is a well-known probabilistic method to describe the steady-state behaviour of the complex biological systems via the lasso regression. In this study, we extend our original LMARS model by taking into account the second-order interaction effects of genes as the representative of the feed-forward loop in biological networks. By this way, we can describe both linear and nonlinear activations of the genes in the same model. We evaluate the performance of our new model under different dimensional simulated and real systems, and then compare the accuracy of the estimates with GGM and LMARS outputs. The results show the advantage of this new model over its close alternatives.  相似文献   

9.
The fused lasso penalizes a loss function by the L1 norm for both the regression coefficients and their successive differences to encourage sparsity of both. In this paper, we propose a Bayesian generalized fused lasso modeling based on a normal-exponential-gamma (NEG) prior distribution. The NEG prior is assumed into the difference of successive regression coefficients. The proposed method enables us to construct a more versatile sparse model than the ordinary fused lasso using a flexible regularization term. Simulation studies and real data analyses show that the proposed method has superior performance to the ordinary fused lasso.  相似文献   

10.
This article develops the adaptive elastic net generalized method of moments (GMM) estimator in large-dimensional models with potentially (locally) invalid moment conditions, where both the number of structural parameters and the number of moment conditions may increase with the sample size. The basic idea is to conduct the standard GMM estimation combined with two penalty terms: the adaptively weighted lasso shrinkage and the quadratic regularization. It is a one-step procedure of valid moment condition selection, nonzero structural parameter selection (i.e., model selection), and consistent estimation of the nonzero parameters. The procedure achieves the standard GMM efficiency bound as if we know the valid moment conditions ex ante, for which the quadratic regularization is important. We also study the tuning parameter choice, with which we show that selection consistency still holds without assuming Gaussianity. We apply the new estimation procedure to dynamic panel data models, where both the time and cross-section dimensions are large. The new estimator is robust to possible serial correlations in the regression error terms.  相似文献   

11.
12.
Abstract

Covariance estimation and selection for multivariate datasets in a high-dimensional regime is a fundamental problem in modern statistics. Gaussian graphical models are a popular class of models used for this purpose. Current Bayesian methods for inverse covariance matrix estimation under Gaussian graphical models require the underlying graph and hence the ordering of variables to be known. However, in practice, such information on the true underlying model is often unavailable. We therefore propose a novel permutation-based Bayesian approach to tackle the unknown variable ordering issue. In particular, we utilize multiple maximum a posteriori estimates under the DAG-Wishart prior for each permutation, and subsequently construct the final estimate of the inverse covariance matrix. The proposed estimator has smaller variability and yields order-invariant property. We establish posterior convergence rates under mild assumptions and illustrate that our method outperforms existing approaches in estimating the inverse covariance matrices via simulation studies.  相似文献   

13.
This paper analyzes the impact of some kinds of contaminant on model selection in graphical Gaussian models. We investigate four different kinds of contaminants, in order to consider the effect of gross errors, model deviations, and model misspecification. The aim of the work is to assess against which kinds of contaminant a model selection procedure for graphical Gaussian models has a more robust behavior. The analysis is based on simulated data. The simulation study shows that relatively few contaminated observations in even just one of the variables can have a significant impact on correct model selection, especially when the contaminated variable is a node in a separating set of the graph.  相似文献   

14.
We study the problem of selecting a regularization parameter in penalized Gaussian graphical models. When the goal is to obtain a model with good predictive power, cross-validation is the gold standard. We present a new estimator of Kullback–Leibler loss in Gaussian Graphical models which provides a computationally fast alternative to cross-validation. The estimator is obtained by approximating leave-one-out-cross-validation. Our approach is demonstrated on simulated data sets for various types of graphs. The proposed formula exhibits superior performance, especially in the typical small sample size scenario, compared to other available alternatives to cross-validation, such as Akaike's information criterion and Generalized approximate cross-validation. We also show that the estimator can be used to improve the performance of the Bayesian information criterion when the sample size is small.  相似文献   

15.
Combining statistical models is an useful approach in all the research area where a global picture of the problem needs to be constructed by binding together evidence from different sources [M.S. Massa and S.L. Lauritzen Combining Statistical Models, M. Viana and H. Wynn, eds., American Mathematical Society, Providence, RI, 2010, pp. 239–259]. In this paper, we investigate the effectiveness of combining a fixed number of Gaussian graphical models respecting some consistency assumptions in problems of model building. In particular, we use the meta-Markov combination of Gaussian graphical models as detailed in Massa and Lauritzen and compare model selection results obtained by combining selections over smaller sets of variables with selection results over all variables of interest. In order to do so, we carry out some simulation studies in which different criteria are considered for the selection procedures. We conclude that the combination performs, generally, better than global estimation, is computationally simpler by virtue of having fewer and simpler models to work on, and has an intuitive appeal to a wide variety of contexts.  相似文献   

16.
Bayesian model averaging (BMA) is an effective technique for addressing model uncertainty in variable selection problems. However, current BMA approaches have computational difficulty dealing with data in which there are many more measurements (variables) than samples. This paper presents a method for combining ?1 regularization and Markov chain Monte Carlo model composition techniques for BMA. By treating the ?1 regularization path as a model space, we propose a method to resolve the model uncertainty issues arising in model averaging from solution path point selection. We show that this method is computationally and empirically effective for regression and classification in high-dimensional data sets. We apply our technique in simulations, as well as to some applications that arise in genomics.  相似文献   

17.
This paper focuses on the variable selection for semiparametric varying coefficient partially linear model when the covariates are measured with additive errors and the response is missing. An adaptive lasso estimator and the smoothly clipped absolute deviation estimator as a comparison for the parameters are proposed. With the proper selection of regularization parameter, the sampling properties including the consistency of the two procedures and the oracle properties are established. Furthermore, the algorithms and corresponding standard error formulas are discussed. A simulation study is carried out to assess the finite sample performance of the proposed methods.  相似文献   

18.
Abstract

In this paper we introduce continuous tree mixture model that is the mixture of undirected graphical models with tree structured graphs and is considered as multivariate analysis with a non parametric approach. We estimate its parameters, the component edge sets and mixture proportions through regularized maximum likalihood procedure. Our new algorithm, which uses expectation maximization algorithm and the modified version of Kruskal algorithm, simultaneosly estimates and prunes the mixture component trees. Simulation studies indicate this method performs better than the alternative Gaussian graphical mixture model. The proposed method is also applied to water-level data set and is compared with the results of Gaussian mixture model.  相似文献   

19.
Lasso proved to be an extremely successful technique for simultaneous estimation and variable selection. However lasso has two major drawbacks. First, it does not enforce any grouping effect and secondly in some situation lasso solutions are inconsistent for variable selection. To overcome this inconsistency adaptive lasso is proposed where adaptive weights are used for penalizing different coefficients. Recently a doubly regularized technique namely elastic net is proposed which encourages grouping effect i.e. either selection or omission of the correlated variables together. However elastic net is also inconsistent. In this paper we study adaptive elastic net which does not have this drawback. In this article we specially focus on the grouped selection property of adaptive elastic net along with its model selection complexity. We also shed some light on the bias-variance tradeoff of different regularization methods including adaptive elastic net. An efficient algorithm was proposed in the line of LARS-EN, which is then illustrated with simulated as well as real life data examples.  相似文献   

20.
Regularization methods for simultaneous variable selection and coefficient estimation have been shown to be effective in quantile regression in improving the prediction accuracy. In this article, we propose the Bayesian bridge for variable selection and coefficient estimation in quantile regression. A simple and efficient Gibbs sampling algorithm was developed for posterior inference using a scale mixture of uniform representation of the Bayesian bridge prior. This is the first work to discuss regularized quantile regression with the bridge penalty. Both simulated and real data examples show that the proposed method often outperforms quantile regression without regularization, lasso quantile regression, and Bayesian lasso quantile regression.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号