Similar literature
 Found 20 similar articles (search time: 149 ms)
1.
The optimal learner for prediction modeling varies depending on the underlying data-generating distribution. Super Learner (SL) is a generic ensemble learning algorithm that uses cross-validation to select among a ‘library’ of candidate prediction models. While SL has been widely studied in a number of settings, it has not been thoroughly evaluated in large electronic healthcare databases that are common in pharmacoepidemiology and comparative effectiveness research. In this study, we applied and evaluated the performance of SL in its ability to predict the propensity score (PS), the conditional probability of treatment assignment given baseline covariates, using three electronic healthcare databases. We considered a library of algorithms that consisted of both nonparametric and parametric models. We also proposed a novel strategy for prediction modeling that combines SL with the high-dimensional propensity score (hdPS) variable selection algorithm. Predictive performance was assessed using three metrics: the negative log-likelihood, area under the curve (AUC), and time complexity. Results showed that the best individual algorithm, in terms of predictive performance, varied across datasets. The SL was able to adapt to the given dataset and optimize predictive performance relative to any individual learner. Combining the SL with the hdPS was the most consistent prediction method and may be promising for PS estimation and prediction modeling in electronic healthcare databases.  相似文献   
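The selection step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it shows only the "discrete" variant that picks the candidate with the lowest cross-validated negative log-likelihood, whereas the full Super Learner can also form a weighted combination of candidates. The two toy candidate learners, the simulated data, and all function names are assumptions made for the example.

```python
import math
import random


def neg_log_lik(y, p, eps=1e-9):
    """Mean negative Bernoulli log-likelihood of labels y under probabilities p."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return total / len(y)


def fit_marginal(x, y):
    """Toy candidate 1: ignore the covariate and predict the overall treatment rate."""
    rate = sum(y) / len(y)
    return lambda xs: [rate for _ in xs]


def fit_stratified(x, y):
    """Toy candidate 2: predict the treatment rate within each level of a binary covariate."""
    overall = sum(y) / len(y)
    rates = {}
    for level in (0, 1):
        ys = [yi for xi, yi in zip(x, y) if xi == level]
        rates[level] = sum(ys) / len(ys) if ys else overall
    return lambda xs: [rates[xi] for xi in xs]


def cv_risk(fit, x, y, n_folds=5):
    """V-fold cross-validated negative log-likelihood of one candidate learner."""
    n = len(y)
    fold_risks = []
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        test = [i for i in range(n) if i % n_folds == k]
        model = fit([x[i] for i in train], [y[i] for i in train])
        preds = model([x[i] for i in test])
        fold_risks.append(neg_log_lik([y[i] for i in test], preds))
    return sum(fold_risks) / n_folds


def discrete_super_learner(library, x, y):
    """Return the name of the candidate with the lowest CV risk, plus all risks."""
    risks = {name: cv_risk(fit, x, y) for name, fit in library.items()}
    return min(risks, key=risks.get), risks


# Simulated data (an assumption): treatment probability depends on a binary covariate.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(400)]
y = [1 if rng.random() < (0.8 if xi else 0.2) else 0 for xi in x]

library = {"marginal": fit_marginal, "stratified": fit_stratified}
best, risks = discrete_super_learner(library, x, y)
print(best, risks)
```

Because the simulated treatment probability depends on the covariate, the stratified candidate should obtain the lower cross-validated risk here; on other data the ranking can differ, which is the point of selecting by cross-validation.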

2.
Roughly speaking, there is one main model of support vector machine for pattern recognition, with several less popular variants. In contrast, among the different multi-class support vector machines found in the literature, none is clearly favoured. On the one hand, they exhibit distinct statistical properties. On the other hand, multiple comparative studies of multi-class support vector machines and decomposition methods have highlighted that each model has its advantages and drawbacks. These observations call for the evaluation of combinations of multi-class support vector machines. In this article, we study the combination of multi-class support vector machines with linear ensemble methods. Their sample complexity is low, which should prevent them from overfitting, and the outputs of two of them are estimates of the class posterior probabilities.

3.
Frequentist and Bayesian methods differ in many aspects but share some basic optimal properties. In real-life prediction problems, there are situations in which a model based on one of these paradigms is preferable, depending on some subjective criteria. Nonparametric classification and regression techniques, such as decision trees and neural networks, have both frequentist versions (classification and regression trees (CARTs) and artificial neural networks) and Bayesian counterparts (Bayesian CART and Bayesian neural networks) for learning from data. In this paper, we present two hybrid models combining the Bayesian and frequentist versions of CART and neural networks, which we call the Bayesian neural tree (BNT) models. BNT models can simultaneously perform feature selection and prediction, are highly flexible, and generalise well in settings with limited training observations. We study the statistical consistency of the proposed approaches and derive the optimal value of a vital model parameter. The excellent performance of the newly proposed BNT models is shown using simulation studies. We also provide illustrative examples using a wide variety of standard regression datasets from a publicly available machine learning repository to show the superiority of the proposed models over the commonly used Bayesian CART and Bayesian neural network models.

4.
Ensemble learning has shown great power in stabilizing and enhancing the performance of traditional variable selection methods such as the lasso and genetic algorithms. In this paper, a novel bagging ensemble method called BSSW is developed to implement variable ranking and selection in linear regression models. Its main idea is to execute a stepwise search algorithm on multiple bootstrap samples. In each trial, a mixed importance measure is assigned to each variable according to the order in which it is selected into the final model as well as the improvement in model fit resulting from its inclusion. Based on the importance measure averaged across the bootstrap trials, all candidate variables are ranked and then classified as important or not. To widen its scope of application, BSSW is extended to generalized linear models. Experiments with simulated and real data indicate that BSSW achieves better performance in most of the studied cases than several existing methods.

5.
This paper presents a novel ensemble classifier generation method by integrating the ideas of bootstrap aggregation and Principal Component Analysis (PCA). To create each individual member of an ensemble classifier, PCA is applied to every out-of-bag sample and the computed coefficients of all principal components are stored, and then the principal components calculated on the corresponding bootstrap sample are taken as additional elements of the original feature set. A classifier is trained with the bootstrap sample and some features randomly selected from the new feature set. The final ensemble classifier is constructed by majority voting of the trained base classifiers. The results obtained by empirical experiments and statistical tests demonstrate that the proposed method performs better than or as well as several other ensemble methods on some benchmark data sets publicly available from the UCI repository. Furthermore, the diversity-accuracy patterns of the ensemble classifiers are investigated by kappa-error diagrams.  相似文献   

6.
Several methods based on smoothing or statistical criteria have been used to derive disaggregated values compatible with observed annual totals. This article evaluates the use of artificial neural networks (ANNs) for disaggregating annual US GDP data into quarterly increments. A feed-forward neural network trained with the back-propagation algorithm was used. The proposed method is a temporal disaggregation method without related series, and it is compared with previous temporal disaggregation methods of that kind. The disaggregated quarterly GDP data compared well with observed quarterly data. In addition, they preserved the basic statistics, such as summing to the annual value and the cross-correlation structure among quarterly flows.

7.
Feed-forward neural networks—also known as multi-layer perceptrons—are now widely used for regression and classification. In parallel but slightly earlier, a family of methods for flexible regression and discrimination were developed in multivariate statistics, and tree-induction methods have been developed in both machine learning and statistics. We expound and compare these approaches in the context of a number of examples.  相似文献   

8.
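Selective ensemble algorithms are currently one of the hot topics in machine learning. Based on a case study of algal proliferation, a fast selective Bagging Trees ensemble algorithm based on k-means clustering is proposed. Compared with traditional statistical methods and several commonly used machine learning methods, the algorithm is found to have smaller generalization error and higher prediction accuracy, and its running efficiency is also considerably improved.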

9.
Bagging, boosting, and random subspace methods are the three most commonly used approaches for constructing ensemble classifiers. In this article, the effect of randomly selected feature subsets (intersecting or disjoint) on bagging and boosting is investigated. The performance of the related ensemble methods is compared through experiments on some UCI benchmark datasets. The results demonstrate that bagging can generally be improved by using randomly selected feature subsets, whereas boosting can only be improved in some cases. Furthermore, the diversity between classifiers in an ensemble is also discussed and related to the prediction accuracy of the ensemble classifier.

10.
11.
Although the effect of missing data on regression estimates has received considerable attention, their effect on predictive performance has been neglected. We studied the performance of three missing data strategies (omission of records with missing values, replacement with a mean, and imputation based on regression) on the predictive performance of logistic regression (LR), classification tree (CT) and neural network (NN) models in the presence of data missing completely at random (MCAR). Models were constructed using datasets of size 500 simulated from a joint distribution of binary and continuous predictors including nonlinearities, collinearity and interactions between variables. Though omission produced models that fit better on the data from which the models were developed, imputation was superior on average to omission for all models when evaluating the receiver operating characteristic (ROC) curve area, mean squared error (MSE), pooled variance across outcome categories and calibration χ² on an independently generated test set. However, in about one-third of simulations, omission performed better. Performance was also more variable with omission, including quite a few instances of extremely poor performance. Replacement and imputation generally produced similar results except with neural networks, for which replacement, the strategy typically used in neural network algorithms, was inferior to imputation. Missing data affected simpler models much less than they did more complex models, such as generalized additive models, that focus on local structure. For moderate-sized datasets, logistic regressions that use simple nonlinear structures such as quadratic terms and piecewise linear splines appear to be at least as robust to randomly missing values as neural networks and classification trees.
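The three strategies compared in this abstract can be sketched on a toy dataset as follows. This is only an illustration under stated assumptions: the two-variable linear data, the 30% MCAR rate, and every function name are hypothetical, and the sketch measures how well the missing predictor values are recovered rather than the downstream predictive performance evaluated in the study.

```python
import random
import statistics


def omit(rows):
    """Omission: drop every record whose x2 value is missing."""
    return [(x1, x2) for x1, x2 in rows if x2 is not None]


def mean_replace(rows):
    """Replacement: fill missing x2 with the mean of the observed x2 values."""
    observed_mean = statistics.mean(x2 for _, x2 in rows if x2 is not None)
    return [(x1, observed_mean if x2 is None else x2) for x1, x2 in rows]


def regression_impute(rows):
    """Imputation: fill missing x2 with a least-squares prediction from x1."""
    obs = omit(rows)
    mean_x1 = statistics.mean(x1 for x1, _ in obs)
    mean_x2 = statistics.mean(x2 for _, x2 in obs)
    sxx = sum((x1 - mean_x1) ** 2 for x1, _ in obs)
    sxy = sum((x1 - mean_x1) * (x2 - mean_x2) for x1, x2 in obs)
    slope = sxy / sxx
    intercept = mean_x2 - slope * mean_x1
    return [(x1, (intercept + slope * x1) if x2 is None else x2) for x1, x2 in rows]


# Toy complete data (an assumption): x2 depends linearly on x1 plus noise.
rng = random.Random(42)
complete = []
for _ in range(500):
    x1 = rng.uniform(0.0, 10.0)
    complete.append((x1, 2.0 * x1 + rng.gauss(0.0, 1.0)))

# MCAR mechanism: each x2 is deleted with probability 0.3, independently of the data.
masked = [(x1, None if rng.random() < 0.3 else x2) for x1, x2 in complete]


def imputation_mse(filled):
    """Mean squared error of the filled-in values against the true deleted values."""
    errors = [
        (f[1] - c[1]) ** 2
        for f, c, m in zip(filled, complete, masked)
        if m[1] is None
    ]
    return sum(errors) / len(errors)


print("records kept by omission:", len(omit(masked)), "of", len(masked))
print("mean replacement MSE:", imputation_mse(mean_replace(masked)))
print("regression imputation MSE:", imputation_mse(regression_impute(masked)))
```

In this toy setting the two predictors are strongly related, so regression imputation recovers the deleted values more closely than mean replacement; with weakly related predictors the gap would shrink.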

12.
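Tang Xiaobin et al. 《统计研究》 (Statistical Research), 2020, 37(7): 104-115
Macroeconomic indicators such as the consumer confidence index exhibit lagged effects over time and multidimensional dynamic variation, which makes them difficult to forecast accurately. Based on the long short-term memory (LSTM) neural network model from machine learning, this paper uses big data techniques to mine web search data (User Search, US) related to the consumer confidence index and constructs an LSTM&US forecasting model. The model is applied to long-, medium-, and short-term forecasting of China's consumer confidence index, and several benchmark forecasting models are introduced for comparison. The results show that introducing web search data improves the predictive performance and accuracy of the LSTM neural network model; that the LSTM&US model generalizes well, with stable forecasts across different horizons, and that its predictive performance and accuracy are better than those of six benchmark models (LSTM, SVR&US, RFR&US, BP&US, XGB&US, and LGB&US); and that the proposed LSTM&US model has practical value. The method provides a new line of research for forecasting and anticipating the consumer confidence index, and enriches the theoretical study of machine learning methods for forecasting macroeconomic indicators.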

13.
Ensemble methods using the same underlying algorithm trained on different subsets of observations have recently received increased attention as practical prediction tools for massive data sets. We propose Subsemble: a general subset ensemble prediction method, which can be used for small, moderate, or large data sets. Subsemble partitions the full data set into subsets of observations, fits a specified underlying algorithm on each subset, and uses a clever form of V-fold cross-validation to output a prediction function that combines the subset-specific fits. We give an oracle result that provides a theoretical performance guarantee for Subsemble. Through simulations, we demonstrate that Subsemble can be a beneficial tool for small- to moderate-sized data sets, and often has better prediction performance than the underlying algorithm fit just once on the full data set. We also describe how to include Subsemble as a candidate in a SuperLearner library, providing a practical way to evaluate the performance of Subsemble relative to the underlying algorithm fit just once on the full data set.  相似文献   

14.
Many tasks in image analysis can be formulated as problems of discrimination or, generally, of pattern recognition. A pattern-recognition system is normally considered to comprise two processing stages: the feature selection and extraction stage, which attempts to reduce the dimensionality of the pattern to be classified, and the classification stage, the purpose of which is to assign the pattern into its perceptually meaningful category. This paper gives an overview of the various approaches to designing statistical pattern recognition schemes. The problem of feature selection and extraction is introduced. The discussion then focuses on statistical decision theoretic rules and their implementation. Both parametric and non-parametric classification methods are covered. The emphasis then switches to decision making in context. Two basic formulations of contextual pattern classification are put forward, and the various methods developed from these two formulations are reviewed. These include the method of hidden Markov chains, the Markov random field approach, Markov meshes, and probabilistic and discrete relaxation.  相似文献   

16.
SubBag is a technique that combines bagging and random subspace methods to generate ensemble classifiers with good generalization capability. In practice, a hyperparameter K of SubBag, the number of randomly selected features used to create each base classifier, must be specified beforehand. In this article, we propose to employ the out-of-bag instances to determine the optimal value of K in SubBag. Experiments conducted with some UCI real-world data sets show that the proposed method allows SubBag to achieve optimal performance in nearly all the considered cases. Moreover, it requires fewer computational resources than a cross-validation procedure.
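The idea of choosing K from out-of-bag instances can be sketched as follows. This is a rough illustration, not the procedure from the article: the nearest-centroid base classifier, the simulated data, the ensemble size, and all function names are assumptions chosen to keep the example self-contained.

```python
import random


def fit_centroids(X, y, feats):
    """Toy base classifier: nearest class centroid, using only the features in `feats`."""
    centroids = {}
    for label in set(y):
        rows = [x for x, yi in zip(X, y) if yi == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in feats]

    def predict(x):
        return min(
            centroids,
            key=lambda c: sum((x[f] - m) ** 2 for f, m in zip(feats, centroids[c])),
        )

    return predict


def oob_error(X, y, k, n_estimators, rng):
    """Out-of-bag error of a bagging + random-subspace ensemble using k features per member."""
    n, d = len(X), len(X[0])
    votes = [dict() for _ in range(n)]
    for _ in range(n_estimators):
        boot = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of instances
        in_bag = set(boot)
        feats = rng.sample(range(d), k)              # random feature subset of size k
        predict = fit_centroids([X[i] for i in boot], [y[i] for i in boot], feats)
        for i in range(n):
            if i not in in_bag:                      # only out-of-bag instances vote
                label = predict(X[i])
                votes[i][label] = votes[i].get(label, 0) + 1
    scored = [(i, v) for i, v in enumerate(votes) if v]
    wrong = sum(1 for i, v in scored if max(v, key=v.get) != y[i])
    return wrong / len(scored)


# Simulated data (an assumption): 2 informative features followed by 4 noise features.
rng = random.Random(1)
y = [rng.randint(0, 1) for _ in range(200)]
X = [
    [2.0 * yi + rng.gauss(0.0, 1.0) for _ in range(2)]
    + [rng.gauss(0.0, 1.0) for _ in range(4)]
    for yi in y
]

n_features = len(X[0])
errors = {k: oob_error(X, y, k, 25, rng) for k in range(1, n_features + 1)}
best_k = min(errors, key=errors.get)
print(best_k, errors)
```

Each candidate K is scored using only instances left out of a member's bootstrap sample, so no separate validation split or repeated cross-validation fit is needed.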

17.
It is now possible to carry out Bayesian image segmentation from a continuum parametric model with an unknown number of regions. However, few suitable parametric models exist. We set out to model processes which have realizations that are naturally described by coloured planar triangulations. Triangulations are already used to represent image structure in machine vision and for domain decomposition in finite element analysis. However, no normalizable parametric model with realizations that are coloured triangulations has been specified to date. We show how this must be done, and in particular we prove that a normalizable measure on the space of triangulations in the interior of a fixed simple polygon derives from a Poisson point process of vertices. We show how such models may be analysed by using Markov chain Monte Carlo methods and we present two case-studies, including convergence analysis.

18.
Recently, a new ensemble classification method named Canonical Forest (CF) has been proposed by Chen et al. [Canonical forest. Comput Stat. 2014;29:849–867]. CF has been shown to give consistently good results on many data sets and to be comparable to other widely used classification ensemble methods. However, CF requires adopting a feature reduction method before classifying high-dimensional data. Here, we extend CF to a high-dimensional classifier by incorporating a random feature subspace algorithm [Ho TK. The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Mach Intell. 1998;20:832–844]. This extended algorithm is called HDCF (high-dimensional CF) as it is specifically designed for high-dimensional data. We conducted an experiment using three data sets (gene imprinting, oestrogen, and leukaemia) to compare the performance of HDCF with several popular and successful classification methods on high-dimensional data sets, including Random Forest [Breiman L. Random forest. Mach Learn. 2001;45:5–32], CERP [Ahn H, et al. Classification by ensembles from random partitions of high-dimensional data. Comput Stat Data Anal. 2007;51:6166–6179], and support vector machines [Vapnik V. The nature of statistical learning theory. New York: Springer; 1995]. Besides classification accuracy, we also investigated the balance between sensitivity and specificity for all four classification methods.

19.
In this study, we combined a Poisson regression model with neural networks (neural network Poisson regression) to relax the traditional Poisson regression assumption of linearity of the Poisson mean as a function of covariates, while including it as a special case. In four simulated examples, we found that the neural network Poisson regression improved on the performance of simple Poisson regression when the Poisson mean was nonlinearly related to covariates. We also illustrated the performance of the model in predicting five-year changes in cognitive scores in association with age and education level; we found that the proposed approach was more accurate than conventional linear Poisson regression. As neural networks are often difficult to interpret, combining them with conventional and more readily interpretable approaches under the generalized linear model can benefit applications in biomedicine.

20.
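A multiple graphical model represents the dependence relationships among the same set of random variables drawn from different classes: nodes represent random variables, edges represent direct relationships between variables, and the graphical model of each class reflects both its own dependence structure and the information shared across classes. Using the joint estimation method for multiple graphical models, data from different individuals are classified according to their characteristics; the dependence structure among the variables within each class is assumed to follow a common Gaussian graphical model, and the group Lasso and graphical Lasso methods are applied to jointly estimate the graph structure of each class. Numerical simulations verify the effectiveness of the joint estimation method. The multiple graphical model and the joint estimation method are then used to analyze the dependence structure of 13 macroeconomic indicators for 15 Chinese provinces. The results show that macroeconomic variables in provinces at different levels of economic development share common dependence relationships, reflecting the characteristics of China's economic development at the present stage, while the dependence structure of each class reflects the characteristics unique to the economic development of the provinces in that class.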
