Similar Articles
20 similar articles found (search time: 31 ms)
1.
A new data science tool named wavelet-based gradient boosting is proposed and tested. The approach is a special case of componentwise linear least-squares gradient boosting and involves wavelet functions of the original predictors. Wavelet-based gradient boosting takes advantage of the approximate \(\ell_1\) penalization induced by gradient boosting to give appropriately penalized additive fits. The method is readily implemented in R and produces parsimonious and interpretable regression fits and classifiers.
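The componentwise selection step underlying the method can be sketched in plain Python. This is a minimal sketch of ordinary componentwise L2Boost on the raw predictors, not the wavelet expansion the paper adds on top; the toy data and all parameter values are illustrative assumptions.

```python
import random

def componentwise_l2_boost(X, y, n_steps=100, nu=0.1):
    """Componentwise linear least-squares gradient boosting (L2Boost).

    Each step fits every single predictor to the current residuals by
    simple least squares, keeps only the best one, and moves its
    coefficient by a small step nu; repeated selection of few
    predictors yields the sparse, l1-like fits described above."""
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    offset = sum(y) / n                    # start from the mean response
    resid = [yi - offset for yi in y]
    for _ in range(n_steps):
        best = None                        # (rss, predictor index, coefficient)
        for j in range(p):
            xj = [row[j] for row in X]
            sxx = sum(v * v for v in xj)
            if sxx == 0.0:
                continue
            b = sum(v * r for v, r in zip(xj, resid)) / sxx
            rss = sum((r - b * v) ** 2 for r, v in zip(resid, xj))
            if best is None or rss < best[0]:
                best = (rss, j, b)
        _, j, b = best
        coef[j] += nu * b
        resid = [r - nu * b * row[j] for r, row in zip(resid, X)]
    return offset, coef

# Hypothetical toy data: y depends only on the first of three predictors.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [2.0 * row[0] + rng.gauss(0, 0.1) for row in X]
offset, coef = componentwise_l2_boost(X, y, n_steps=300)
```

With enough steps the active coefficient approaches its least-squares value while the irrelevant predictors are rarely selected, which is the source of the penalized, parsimonious fits mentioned above.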

2.
In this article we develop a class of stochastic boosting (SB) algorithms, which build upon the work of Holmes and Pintore (Bayesian Stat. 8, Oxford University Press, Oxford, 2007). They introduce boosting algorithms which correspond to standard boosting (e.g. Bühlmann and Hothorn, Stat. Sci. 22:477–505, 2007) except that the optimization algorithms are randomized; this idea is placed within a Bayesian framework. We show that the inferential procedure in Holmes and Pintore (Bayesian Stat. 8, Oxford University Press, Oxford, 2007) is incorrect and further develop interpretational, computational and theoretical results which allow one to assess SB’s potential for classification and regression problems. To use SB, sequential Monte Carlo (SMC) methods are applied. As a result, it is found that SB can provide better predictions for classification problems than the corresponding boosting algorithm. A theoretical result is also given, which shows that the predictions of SB are not significantly worse than boosting, when the latter provides the best prediction. We also investigate the method on a real case study from machine learning.

3.
Stochastic Models, 2013, 29(3): 341-368
Abstract

We consider a flow of data packets from one source to many destinations in a communication network represented by a random oriented tree. Multicast transmission is characterized by the ability of some tree vertices to replicate received packets depending on the number of destinations downstream. We are interested in characteristics of multicast flows on Galton–Watson trees and trees generated by point aggregates of a Poisson process. Such stochastic settings are intended to represent tree shapes arising in the Internet and in some ad hoc networks. The main result in the branching process case is a functional equation for the joint probability generating function of flow volumes through a given vertex and in the whole tree. We provide conditions for the existence and uniqueness of the solution and a method to compute it using Picard iterations. In the point process case, we provide bounds on flow volumes using the technique of stochastic comparison from the theory of continuous percolation. We use these results to derive a number of characteristics of random trees and discuss their applications to the analytical evaluation of the load induced on a network by a multicast session.
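As a purely illustrative companion to the branching-process setting (a direct simulation, not the paper's functional-equation method), the flow volume in the whole tree can be sampled by generating Galton-Watson trees. With Poisson(m) offspring and m < 1, the expected tree size is 1/(1-m); when every vertex is a destination, the flow volume equals the number of edges, size - 1. All names and parameter values below are assumptions for the sketch.

```python
import math
import random

def poisson(lam, rng):
    """Poisson(lam) sample via Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def multicast_tree_size(offspring_mean, rng, max_nodes=10**6):
    """Number of vertices in a Galton-Watson tree with Poisson
    offspring, generated generation by generation."""
    size = frontier = 1
    while frontier and size < max_nodes:
        children = sum(poisson(offspring_mean, rng) for _ in range(frontier))
        size += children
        frontier = children
    return size

rng = random.Random(7)
m = 0.5
sizes = [multicast_tree_size(m, rng) for _ in range(20000)]
mean_size = sum(sizes) / len(sizes)   # theory: 1 / (1 - m) = 2 for m = 0.5
```

The Monte Carlo mean should sit close to the theoretical value 2, matching the elementary expectation formula for subcritical Galton-Watson trees.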

4.
In this paper we investigate the application of stochastic complexity theory to classification problems. In particular, we define the notion of admissible models as a function of problem complexity, the number of data points N, and prior belief. This allows us to derive general bounds relating classifier complexity with data-dependent parameters such as sample size, class entropy and the optimal Bayes error rate. We discuss the application of these results to a variety of problems, including decision tree classifiers, Markov models for image segmentation, and feedforward multilayer neural network classifiers.

5.
Reply     
ABSTRACT

In the class of stochastic volatility (SV) models, leverage effects are typically specified through the direct correlation between the innovations in both returns and volatility, resulting in the dynamic leverage (DL) model. Recently, two asymmetric SV models based on threshold effects have been proposed in the literature. As such models consider only the sign of the previous return and neglect its magnitude, this paper proposes a dynamic asymmetric leverage (DAL) model that accommodates the direct correlation as well as the sign and magnitude of the threshold effects. A special case of the DAL model with zero direct correlation between the innovations is the asymmetric leverage (AL) model. The dynamic asymmetric leverage models are estimated by the Monte Carlo likelihood (MCL) method. Monte Carlo experiments are presented to examine the finite sample properties of the estimator. For a sample size of T = 2000 with 500 replications, the sample means, standard deviations, and root mean squared errors of the MCL estimators indicate only a small finite sample bias. The empirical estimates for S&P 500 and TOPIX financial returns, and USD/AUD and YEN/USD exchange rates, indicate that the DAL class, including the DL and AL models, is generally superior to threshold SV models with respect to AIC and BIC, with AL typically providing the best fit to the data.
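A minimal simulation sketch of the DL specification may help fix ideas (illustrative parameter values, not the MCL estimation itself): a negative correlation rho between the return and volatility innovations produces the leverage effect, visible as a negative sample correlation between today's return and tomorrow's log-volatility.

```python
import math
import random

def simulate_dl_sv(n, mu=-1.0, phi=0.95, sigma_eta=0.2, rho=-0.5, seed=42):
    """Dynamic leverage (DL) stochastic volatility model:
        r_t     = exp(h_t / 2) * eps_t
        h_{t+1} = mu + phi * (h_t - mu) + eta_t
    with corr(eps_t, eta_t) = rho; rho < 0 encodes leverage."""
    rng = random.Random(seed)
    h, rs, hs = mu, [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        eps = z1
        eta = sigma_eta * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        rs.append(math.exp(h / 2) * eps)
        hs.append(h)
        h = mu + phi * (h - mu) + eta
    return rs, hs

rs, hs = simulate_dl_sv(50000)
x, y = rs[:-1], hs[1:]                  # return vs next-period log-volatility
mx, my = sum(x) / len(x), sum(y) / len(y)
cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
sx = math.sqrt(sum((a - mx) ** 2 for a in x) / len(x))
sy = math.sqrt(sum((b - my) ** 2 for b in y) / len(y))
leverage_corr = cov / (sx * sy)         # negative under leverage
```

With rho = -0.5 the sample correlation between r_t and h_{t+1} is clearly negative, which is exactly the stylized fact the DL and DAL classes are built to capture.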

6.
ABSTRACT

The class of bivariate copulas that are invariant under truncation with respect to one variable is considered. A simulation algorithm for the members of the class and a novel construction method are presented. Moreover, inspired by a stochastic interpretation of the members of such a class, a procedure is suggested to check whether the dependence structure of a given data set is truncation invariant. The overall performance of the procedure is illustrated on both simulated and real data.

7.
The gradient boosting interpretation of boosting's mechanism assumes that the space spanned by the base learners is a continuous function space, but the base-learner space formed under finite samples is not necessarily continuous. To address this problem, starting from the additive-model viewpoint and using squared loss, a new method of boosting regression trees by resampling is proposed. The method is a stepwise updating algorithm for a weighted additive model. Experimental results show that it substantially improves on a single regression tree, reduces prediction error, and attains lower prediction error than the L2Boost algorithm.

8.
This article presents a new strategy to construct classification trees. In the proposed scheme, we keep a record of each constructed classification tree, both its sequence of splitting predictors and their splitting values, in an array, so that overall there are as many arrays as drawn samples. A three-step strategy is then introduced to search for the optimum classification tree. For many real-life datasets, the proposed strategy provides comparable or improved generalized error rates relative to tree and rpart (packages available for classification purposes in R), using four well-known evaluation functions, namely the Gini, the Entropy, the Twoing, and the Exponent-based function, to split nodes.
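Two of the node-splitting evaluation functions named above are standard and can be sketched directly (the Twoing and Exponent-based functions follow the same impurity-gain pattern); the class counts below are made up for illustration.

```python
import math

def gini(counts):
    """Gini impurity of a node given per-class counts."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

def entropy(counts):
    """Shannon entropy (in bits) of a node given per-class counts."""
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)

def split_gain(parent, left, right, impurity):
    """Impurity decrease achieved by a candidate binary split."""
    n, nl, nr = sum(parent), sum(left), sum(right)
    return (impurity(parent)
            - (nl / n) * impurity(left)
            - (nr / n) * impurity(right))

# Made-up counts: a balanced parent split into two fairly pure children.
g = split_gain([50, 50], [40, 10], [10, 40], gini)      # 0.5 - 0.32 = 0.18
e = split_gain([50, 50], [40, 10], [10, 40], entropy)
```

A split is scored by how much it reduces the chosen impurity; tree growers such as tree and rpart pick, at each node, the predictor and cut with the largest gain.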

9.
Most classification models learn poorly when trained on imbalanced datasets. This article proposes a novel approach for learning from imbalanced datasets based on an improved SMOTE (Synthetic Minority Over-sampling TEchnique) algorithm. By organically combining over-sampling and under-sampling, the approach chooses neighbors in a targeted way and synthesizes samples under different strategies. Experiments show that, after rebalancing imbalanced datasets with our algorithm, most classifiers achieve good performance on the two-class (positive versus negative) classification problem.
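The core step of the baseline SMOTE algorithm that the article improves on can be sketched as follows; the data, parameter names, and seed are illustrative assumptions.

```python
import random

def smote(minority, k=5, n_new=None, seed=0):
    """SMOTE: for each synthetic sample, pick a random minority point,
    find its k nearest minority-class neighbours, and interpolate a
    random fraction of the way towards one of them."""
    rng = random.Random(seed)
    n_new = n_new if n_new is not None else len(minority)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest neighbours by squared Euclidean distance, excluding x
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()                 # position along the segment
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic

# Illustrative minority cluster in two dimensions.
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2), (1.0, 0.8), (0.8, 1.0)]
synthetic = smote(minority, k=3, n_new=10)
```

Because each synthetic point lies on a segment between two real minority points, the new samples stay inside the minority region rather than being exact duplicates, which is what distinguishes SMOTE from naive over-sampling.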

10.
One of the major issues in the medical field is correct diagnosis, including the limitations of human expertise in diagnosing disease manually. Nowadays, the use of machine learning classifiers, such as support vector machines (SVM), in medical diagnosis is increasing gradually. However, traditional classification algorithms can be limited in their performance when applied to highly imbalanced data sets, in which negative examples (i.e. negative for a disease) outnumber the positive examples (i.e. positive for a disease). SVM constitutes a significant improvement, and its mathematical formulation allows the incorporation of different weights so as to deal with the problem of imbalanced data. In the present work an extensive study of four medical data sets is conducted using a variant of SVM, called the proximal support vector machine (PSVM), proposed by Fung and Mangasarian [G.M. Fung and O.L. Mangasarian, Proximal support vector machine classifiers, in Proceedings KDD-2001: Knowledge Discovery and Data Mining, F. Provost and R. Srikant, eds., Association for Computing Machinery, New York, 2001, pp. 77–86. Available at ftp://ftp.cs.wisc.edu/pub/dmi/tech-reports/01-02.ps]. Additionally, in order to deal with the imbalanced nature of the medical data sets, we applied both a variant of SVM referred to as the two-cost support vector machine and a modification of PSVM referred to as the modified PSVM. Both algorithms incorporate a different weight for each class of examples.

11.
In semi-competing risks one considers a terminal event, such as death of a person, and a non-terminal event, such as disease recurrence. We present a model where the time to the terminal event is the first passage time to a fixed level c in a stochastic process, while the time to the non-terminal event is represented by the first passage time of the same process to a stochastic threshold S, assumed to be independent of the stochastic process. In order to be explicit, we let the stochastic process be a gamma process, but other processes with independent increments may alternatively be used. For semi-competing risks this appears to be a new modeling approach, being an alternative to traditional approaches based on illness-death models and copula models. In this paper we consider a fully parametric approach. The likelihood function is derived and statistical inference in the model is illustrated on both simulated and real data.
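A minimal discrete-grid simulation of this construction (illustrative parameters; the paper's inference is fully parametric, not simulation-based) shows its key structural property: because the gamma process is increasing, whenever the random threshold satisfies S < c the non-terminal event always precedes the terminal one.

```python
import random

def first_passage_times(c, s, rate=1.0, scale=1.0, dt=0.01, t_max=100.0, rng=None):
    """Simulate a gamma process X(t) with independent Gamma(rate*dt, scale)
    increments on a grid of step dt; return
    (first grid time X >= s  -> non-terminal event,
     first grid time X >= c  -> terminal event),
    with None if a level is not reached before t_max."""
    rng = rng or random.Random(1)
    x = t = 0.0
    t_nonterm = t_term = None
    while t < t_max and t_term is None:
        x += rng.gammavariate(rate * dt, scale)   # stationary independent increments
        t += dt
        if t_nonterm is None and x >= s:
            t_nonterm = t
        if t_term is None and x >= c:
            t_term = t
    return t_nonterm, t_term

# Illustrative run: fixed level c = 2.0, threshold value s = 0.5 < c.
rng = random.Random(3)
results = [first_passage_times(c=2.0, s=0.5, rng=rng) for _ in range(100)]
```

In the model itself S is random; drawing s above c would instead yield runs where the terminal event censors the non-terminal one, which is the semi-competing-risks feature.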

12.

We incorporate new techniques for obtaining unbiased estimators of gradients from single simulations of stochastic systems in optimization procedures. We develop an "enhanced" least squares estimator of the optimum which incorporates information about both the function and its gradient and improves substantially on techniques which use only the function. We also propose a sequential design to use with the enhanced least squares estimator to optimize a regression function when it is evaluated by simulation.

13.
The Bayesian CART (classification and regression tree) approach proposed by Chipman, George and McCulloch (1998) entails putting a prior distribution on the set of all CART models and then using stochastic search to select a model. The main thrust of this paper is to propose a new class of hierarchical priors which enhance the potential of this Bayesian approach. These priors indicate a preference for smooth local mean structure, resulting in tree models which shrink predictions from adjacent terminal nodes towards each other. Past methods for tree shrinkage have searched for trees without shrinking, and applied shrinkage to the identified tree only after the search. By using hierarchical priors in the stochastic search, the proposed method searches for shrunk trees that fit well and improves the tree through shrinkage of predictions.

14.
In this paper, we perform an empirical comparison of the classification error of several ensemble methods based on classification trees. The comparison uses 14 publicly available data sets that were used by Lim, Loh and Shih [Lim, T., Loh, W. and Shih, Y.-S., 2000, A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms. Machine Learning, 40, 203–228]. The methods considered are a single tree, Bagging, Boosting (Arcing) and random forests (RF). They are compared from different perspectives. More precisely, we look at the effects of noise and of allowing linear combinations in the construction of the trees, the differences between some splitting criteria and, specifically for RF, the effect of the number of variables from which to choose the best split at a given node. Moreover, we compare our results with those obtained by Lim et al. In this study, the best overall results are obtained with RF. In particular, RF are the most robust against noise. The effect of allowing linear combinations and the differences between splitting criteria are small on average, but can be substantial for some data sets.

15.
Classification and regression trees have been useful in medical research for constructing algorithms for disease diagnosis or prognostic prediction. Jin et al. [Jin, H., Lu, Y., Harris, R. T., Black, D., Stone, K., Hochberg, M. and Genant, H., 2004. Classification algorithms for hip fracture prediction based on recursive partitioning methods. Med. Decis. Mak., 24: 386–398. doi:10.1177/0272989X04267009] developed a robust and cost-saving tree (RACT) algorithm with application to classification of hip fracture risk after 5-year follow-up based on data from the Study of Osteoporotic Fractures (SOF). Although conventional recursive partitioning algorithms are well developed, they still have some limitations. Binary splits may generate a big tree with many layers, but trinary splits may produce too many nodes. In this paper, we propose a classification approach combining trinary splits and binary splits to generate a trinary–binary tree. A new non-inferiority test of entropy is used to select the binary or trinary splits. We apply the modified method to the SOF data to construct a trinary–binary classification rule for predicting risk of osteoporotic hip fracture. The new classification tree has good statistical utility: it is statistically non-inferior to the optimum binary tree and to the RACT on the testing sample, and is also cost-saving. It may be useful in clinical applications: femoral neck bone mineral density, age, height loss and weight gain since age 25 can identify subjects with elevated 5-year hip fracture risk without loss of statistical efficiency.

16.
Abstract

This paper concerns a class of stochastic recursive zero-sum differential game problems with recursive utility related to a backward stochastic differential equation (BSDE) with double obstacles. A sufficient condition is provided to obtain the saddle-point strategy under some assumptions. By virtue of the correspondence between doubly reflected BSDEs and mixed game problems, a stochastic linear recursive mixed differential game problem is studied to apply our theoretical result, and the explicit saddle-point strategy as well as the saddle-point stopping time for the mixed game problem are obtained. Besides, a numerical example is given to demonstrate the result via a partial differential equation (PDE) computational method.

17.
Kernel density classification and boosting: an L2 analysis
Kernel density estimation is a commonly used approach to classification. However, most of the theoretical results for kernel methods apply to estimation per se and not necessarily to classification. In this paper we show that when estimating the difference between two densities, the optimal smoothing parameters are increasing functions of the sample size of the complementary group, and we provide a small simulation study which examines the relative performance of kernel density methods when the final goal is classification. A relative newcomer to the classification portfolio is boosting, and this paper proposes an algorithm for boosting kernel density classifiers. We note that boosting is closely linked to a previously proposed method of bias reduction in kernel density estimation and indicate how it will enjoy similar properties for classification. We show that boosting kernel classifiers reduces the bias whilst only slightly increasing the variance, with an overall reduction in error. Numerical examples and simulations are used to illustrate the findings, and we also suggest further areas of research.
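A bare-bones one-dimensional kernel density classifier may clarify the object being boosted; the data and bandwidths below are illustrative, and the paper's point about each class's optimal bandwidth depending on the other group's sample size is reflected only in allowing separate h0 and h1 inputs.

```python
import math

def gaussian_kde(point, data, h):
    """One-dimensional Gaussian kernel density estimate at `point`."""
    return sum(math.exp(-0.5 * ((point - x) / h) ** 2) for x in data) / (
        len(data) * h * math.sqrt(2 * math.pi)
    )

def kde_classify(point, class0, class1, h0, h1):
    """Assign the class whose prior-weighted density estimate is larger."""
    n0, n1 = len(class0), len(class1)
    p0 = n0 / (n0 + n1) * gaussian_kde(point, class0, h0)
    p1 = n1 / (n0 + n1) * gaussian_kde(point, class1, h1)
    return 0 if p0 >= p1 else 1

# Illustrative well-separated samples.
class0 = [-2.1, -1.9, -2.0, -2.2, -1.8]
class1 = [1.8, 2.0, 2.2, 1.9, 2.1]
label_a = kde_classify(-2.0, class0, class1, h0=0.5, h1=0.5)
label_b = kde_classify(2.0, class0, class1, h0=0.5, h1=0.5)
```

What matters for classification is only the sign of p0 - p1 near the decision boundary, which is why smoothing choices tuned for density estimation per se need not be optimal here.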

18.
We propose a general framework for regression models with functional response containing a potentially large number of flexible effects of functional and scalar covariates. Special emphasis is put on historical functional effects, where functional response and functional covariate are observed over the same time interval and the response is only influenced by covariate values up to the current grid point, so that chronology is respected. Our formulation allows for flexible integration limits including, e.g., lead or lag times. The functional responses can be observed on irregular curve-specific grids. Additionally, we introduce different parameterizations for historical effects and discuss identifiability issues. The models are estimated by a component-wise gradient boosting algorithm which is suitable for models with a potentially high number of covariate effects, even more than observations, and inherently does model selection. By minimizing corresponding loss functions, different features of the conditional response distribution can be modeled, including generalized and quantile regression models as special cases. The methods are implemented in the open-source R package FDboost. The methodological developments are motivated by biotechnological data on Escherichia coli fermentations, but cover a much broader model class.

19.
It is well known that statistical classifiers trained from imbalanced data lead to low true positive rates and select inconsistent significant variables. In this article, an improved method is proposed to enhance the classification accuracy for the minority class by differentiating misclassification cost for each group. The overall error rate is replaced by an alternative composite criterion. Furthermore, we propose an approach to estimate the tuning parameter, the composite criterion, and the cut-point simultaneously. Simulations show that the proposed method achieves a high true positive rate on prediction and a good performance on variable selection for both continuous and categorical predictors, even with highly imbalanced data. An illustrative example of the analysis of the suboptimal health state data in traditional Chinese medicine is discussed to show the reasonable application of the proposed method.
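One of the moving parts, choosing a cut-point under asymmetric misclassification costs, can be sketched on hypothetical classifier scores. This illustrates the general idea of a composite cost criterion, not the article's simultaneous estimation procedure; all scores and cost values are made up.

```python
def cost_sensitive_cutpoint(scores_pos, scores_neg, c_fn, c_fp):
    """Pick the score cut-point minimising the composite cost
    c_fn * (missed positives) + c_fp * (false alarms)."""
    best_t, best_cost = None, float("inf")
    for t in sorted(set(scores_pos) | set(scores_neg)):
        fn = sum(1 for s in scores_pos if s < t)    # positives missed
        fp = sum(1 for s in scores_neg if s >= t)   # negatives flagged
        cost = c_fn * fn + c_fp * fp
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost

# Hypothetical scores; missing a positive is five times as costly as a
# false alarm, so the chosen cut-point moves below the default 0.5.
scores_pos = [0.6, 0.7, 0.8, 0.4]
scores_neg = [0.1, 0.2, 0.3, 0.5]
best_t, best_cost = cost_sensitive_cutpoint(scores_pos, scores_neg, c_fn=5, c_fp=1)
```

Weighting false negatives more heavily pulls the cut-point down, which is the mechanism by which such methods raise the true positive rate on the minority class.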

20.
A sub-threshold signal is transmitted through a channel and may be detected when noise, with known structure and proportional to some level, is added to the data. There is an optimal noise level, called the stochastic resonance level, that corresponds to the minimum variance of the estimators in the problem of recovering unobservable signals. For several noise structures, evidence of the stochastic resonance effect has been shown. Here we study the case where the noise is a Markovian process. We propose consistent estimators of the sub-threshold signal and further solve a problem of hypothesis testing. We also discuss evidence of stochastic resonance for both the estimation and hypothesis-testing problems via examples.
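A toy version of the estimation problem, using Gaussian rather than Markovian noise (so purely illustrative of the resonance effect, not the paper's setting): a binary detector fires when signal plus noise exceeds a threshold, and the firing rate is inverted to estimate the sub-threshold signal. Too little noise leaves the detector silent and the signal unrecoverable; a moderate noise level makes it estimable. All parameter values are assumptions.

```python
import random
import statistics

def estimate_subthreshold(s_true, theta, sigma, n, seed):
    """The detector fires when s + N(0, sigma) noise exceeds theta;
    invert the observed firing rate through the normal CDF to
    recover the sub-threshold signal s < theta."""
    rng = random.Random(seed)
    fires = sum(1 for _ in range(n) if s_true + rng.gauss(0, sigma) > theta)
    p_hat = min(max(fires / n, 1 / n), 1 - 1 / n)   # keep inv_cdf defined
    return theta - sigma * statistics.NormalDist().inv_cdf(1 - p_hat)

# Signal 0.8 never crosses the threshold 1.0 on its own.
est_low_noise = estimate_subthreshold(0.8, 1.0, sigma=0.02, n=100_000, seed=5)
est_mid_noise = estimate_subthreshold(0.8, 1.0, sigma=0.30, n=100_000, seed=5)
```

With near-zero noise the detector almost never fires and the estimate is badly biased, while a moderate noise level recovers the signal accurately; sweeping sigma traces out the variance minimum that defines the stochastic resonance level.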


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号