Similar Documents
20 similar documents found
1.
The gradient boosting view explains the mechanism of boosting algorithms by assuming that the space spanned by the base learners is a continuous functional space; in practice, however, the base-learner space formed under a finite sample is not necessarily continuous. To address this problem, starting from the additive-model perspective and working with the squared loss, a new method of boosting regression trees with resampling is proposed. The method is a stagewise updating algorithm for a weighted additive model. Experimental results show that it can significantly improve the performance of a single regression tree, reduce prediction error, and achieve lower prediction error than the L2Boost algorithm.
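
As a minimal sketch of this kind of procedure (not the authors' exact algorithm), the following Python code boosts shallow regression trees under the squared loss, redrawing a bootstrap resample of the residuals at every stage; the shrinkage rate, resampling fraction, and tree depth are illustrative choices, and NumPy/scikit-learn are assumed to be available.

    # Sketch: L2 boosting of regression trees with per-stage bootstrap resampling.
    # Hyperparameters are illustrative, not taken from the paper.
    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    def resampled_l2_boost(X, y, n_stages=100, shrinkage=0.1, subsample=0.5,
                           max_depth=2, random_state=0):
        rng = np.random.default_rng(random_state)
        n = len(y)
        intercept = y.mean()
        f = np.full(n, intercept)            # start from the sample mean
        trees = []
        for _ in range(n_stages):
            resid = y - f                    # negative gradients of the squared loss
            idx = rng.choice(n, size=int(subsample * n), replace=True)  # resample
            tree = DecisionTreeRegressor(max_depth=max_depth)
            tree.fit(X[idx], resid[idx])     # fit a base tree to resampled residuals
            f += shrinkage * tree.predict(X)
            trees.append(tree)
        return intercept, trees

    def boost_predict(intercept, trees, X, shrinkage=0.1):
        return intercept + shrinkage * sum(t.predict(X) for t in trees)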

2.
Recent results in information theory (see Soofi 1996; 2001 for a review) include derivations of optimal information processing rules, including Bayes' theorem, for learning from data based on minimizing a criterion functional, namely output information minus input information, as shown in Zellner (1988; 1991; 1997; 2002). Herein, solution post-data densities for parameters are obtained and studied for cases in which the input information is that in (1) a likelihood function and a prior density; (2) only a likelihood function; and (3) neither a prior nor a likelihood function but only input information in the form of post-data moments of parameters, as in the Bayesian method of moments approach. It is then shown how optimal output densities can be employed to obtain predictive densities and optimal, finite sample structural coefficient estimates using three alternative loss functions. Such optimal estimates are compared with usual estimates, e.g., maximum likelihood, two-stage least squares, ordinary least squares, etc. Some Monte Carlo experimental results in the literature are discussed and implications for the future are provided.
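
For case (1), the optimization behind this result can be stated compactly (in our notation, which may differ from the paper's). With prior \pi(\theta), likelihood \ell(\theta\mid y), and marginal h(y)=\int \pi(\theta)\,\ell(\theta\mid y)\,d\theta, the output-minus-input information criterion is

    \Delta(g) \;=\; \int g(\theta\mid y)\,
        \log\frac{g(\theta\mid y)\,h(y)}{\pi(\theta)\,\ell(\theta\mid y)}\,d\theta,
    \qquad \text{subject to } \int g(\theta\mid y)\,d\theta = 1,

and minimizing over proper densities g yields g^{*}(\theta\mid y)=\pi(\theta)\,\ell(\theta\mid y)/h(y), i.e. Bayes' theorem, with \Delta(g^{*})=0 (no information is lost in processing).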

3.
The main objective of this paper is to develop an exact Bayesian technique that can be used to assign a multivariate time series realization to one of several autoregressive sources, with unknown coefficients and precision, that might have different orders. The foundation of the proposed technique is to develop the posterior mass function of a classification vector, in an easy form, using the conditional likelihood function. A multivariate time series realization is assigned to the multivariate autoregressive source with the largest posterior probability. A simulation study, with uniform prior mass function, is carried out to demonstrate the performance of the proposed technique and to test its adequacy in handling the multivariate classification problems. The analysis of the numerical results supports the adequacy of the proposed technique in solving the classification problems with multivariate autoregressive sources.

4.
The traditional Cox proportional hazards regression model uses an exponential relative risk function. We argue that under various plausible scenarios, the relative risk part of the model should be bounded, suggesting also that the traditional model often might overdramatize the hazard rate assessment for individuals with unusual covariates. This motivates our working with proportional hazards models where the relative risk function takes a logistic form. We provide frequentist methods, based on the partial likelihood, and then go on to semiparametric Bayesian constructions. These involve a Beta process for the cumulative baseline hazard function and any prior with a density, for example that dictated by a Jeffreys-type argument, for the regression coefficients. The posterior is derived using machinery for Lévy processes, and a simulation recipe is devised for sampling from the posterior distribution of any quantity. Our methods are illustrated on real data. A Bernshteĭn–von Mises theorem is reached for our class of semiparametric priors, guaranteeing asymptotic normality of the posterior processes.
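
A sketch of the frequentist side only: maximizing the partial likelihood with a bounded relative risk, here assuming the logistic form r(z) = exp(z)/(1 + exp(z)). The paper's exact parametrization may differ, tied event times are ignored, and NumPy/SciPy are assumed.

    # Sketch: maximum partial likelihood for a proportional hazards model with a
    # bounded, logistic-form relative risk. Assumes no tied event times; the
    # relative risk parametrization is an illustrative choice.
    import numpy as np
    from scipy.optimize import minimize

    def neg_log_partial_likelihood(beta, X, time, event):
        order = np.argsort(time)                 # process subjects in time order
        X, event = X[order], event[order]
        z = X @ beta
        r = np.exp(z) / (1.0 + np.exp(z))        # bounded relative risk
        nll = 0.0
        for i in np.flatnonzero(event):
            nll -= np.log(r[i]) - np.log(r[i:].sum())   # r[i:] = risk set at t_i
        return nll

    def fit_logistic_relative_risk(X, time, event):
        beta0 = np.zeros(X.shape[1])
        res = minimize(neg_log_partial_likelihood, beta0,
                       args=(X, time, event), method="BFGS")
        return res.x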

5.
Bayesian optimal designs have received increasing attention in recent years, especially in biomedical and clinical trials. Bayesian design procedures can utilize the available prior information of the unknown parameters so that a better design can be achieved. With this in mind, this article considers the Bayesian A- and D-optimal designs of the two- and three-parameter Gamma regression model. In this regard, we first obtain the Fisher information matrix of the proposed model and then calculate the Bayesian A- and D-optimal designs assuming various prior distributions such as normal, half-normal, gamma, and uniform distribution for the unknown parameters. All of the numerical calculations are handled in R software. The results of this article are useful in medical and industrial research.
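
Once the Fisher information matrix is available, the Bayesian D-criterion is easy to approximate by Monte Carlo. Below is a sketch for a two-parameter Gamma regression under the inverse (canonical) link, for which the GLM weight is proportional to mu^2; the paper works in R and considers several priors, whereas the link, prior, design points, and weights here are illustrative assumptions.

    # Sketch: Monte Carlo approximation of the Bayesian D-optimality criterion
    # E_theta[log det M(xi, theta)] for a two-parameter Gamma regression with
    # inverse link mu = 1/(b0 + b1*x). Prior and design below are illustrative.
    import numpy as np

    def info_matrix(design_x, weights, beta):
        X = np.column_stack([np.ones_like(design_x), design_x])
        mu = 1.0 / (X @ beta)
        W = np.diag(weights * mu**2)        # GLM weights for the inverse link
        return X.T @ W @ X                  # the shape parameter only rescales M

    def bayesian_d_criterion(design_x, weights, prior_draws):
        return np.mean([np.linalg.slogdet(info_matrix(design_x, weights, b))[1]
                        for b in prior_draws])

    rng = np.random.default_rng(0)
    prior_draws = rng.normal(loc=[2.0, 1.0], scale=0.2, size=(2000, 2))
    xi = np.array([0.1, 0.5, 1.0])          # candidate support points
    w = np.array([1/3, 1/3, 1/3])           # equal design weights
    print(bayesian_d_criterion(xi, w, prior_draws))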

6.
The concavity of some Bayesian D-optimality criteria is investigated and is found in some cases to depend on the prior distribution. In the case of a non-concave criterion, the standard equivalence theorem may fail, but a local version continues to apply.

7.
In this paper we present decomposable priors, a family of priors over structure and parameters of tree belief nets for which Bayesian learning with complete observations is tractable, in the sense that the posterior is also decomposable and can be completely determined analytically in polynomial time. Our result is the first where computing the normalization constant and averaging over a super-exponential number of graph structures can be performed in polynomial time. This follows from two main results: First, we show that factored distributions over spanning trees in a graph can be integrated in closed form. Second, we examine priors over tree parameters and show that a set of assumptions similar to Heckerman, Geiger and Chickering (1995) constrain the tree parameter priors to be a compactly parametrized product of Dirichlet distributions. Besides allowing for exact Bayesian learning, these results permit us to formulate a new class of tractable latent variable models in which the likelihood of a data point is computed through an ensemble average over tree structures.
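
The first of these results rests on the Matrix-Tree theorem: a sum over all spanning trees of products of edge weights equals a cofactor of the weighted graph Laplacian, which is what makes the normalization constant computable in polynomial time. A small numerical sketch, with arbitrary illustrative weights:

    # Sketch: sum over all spanning trees of the product of edge weights,
    # computed via the Matrix-Tree theorem. Weights are illustrative.
    import numpy as np

    def sum_over_spanning_trees(W):
        # W: symmetric nonnegative edge-weight matrix with zero diagonal.
        L = np.diag(W.sum(axis=1)) - W       # weighted graph Laplacian
        return np.linalg.det(L[1:, 1:])      # any cofactor gives the tree sum

    W = np.array([[0.0, 2.0, 1.0],
                  [2.0, 0.0, 3.0],
                  [1.0, 3.0, 0.0]])
    # For three nodes the spanning trees are {12,13}, {12,23}, {13,23}:
    # 2*1 + 2*3 + 1*3 = 11, which the determinant reproduces.
    print(sum_over_spanning_trees(W))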

8.
We discuss Bayesian analyses of traditional normal-mixture models for classification and discrimination. The development involves application of an iterative resampling approach to Monte Carlo inference, commonly called Gibbs sampling, and demonstrates routine application. We stress the benefits of exact analyses over traditional classification and discrimination techniques, including the ease with which such analyses may be performed in a quite general setting, with possibly several normal-mixture components having different covariance matrices, the computation of exact posterior classification probabilities for observed data and for future cases to be classified, and posterior distributions for these probabilities that allow for assessment of second-level uncertainties in classification.
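
A deliberately simplified sketch of such a Gibbs sampler in the univariate case (known common variance, conjugate normal priors on the component means, Dirichlet prior on the weights); the paper treats the general multivariate setting with unequal covariance matrices, and all prior settings below are illustrative.

    # Sketch: Gibbs sampling for a univariate K-component normal mixture with
    # known common variance sigma2; returns Monte Carlo estimates of posterior
    # classification probabilities. Priors and sigma2 are illustrative.
    import numpy as np

    def gibbs_mixture(y, K=2, sigma2=1.0, m0=0.0, v0=10.0, alpha=1.0,
                      n_iter=2000, seed=0):
        rng = np.random.default_rng(seed)
        n = len(y)
        mu = rng.normal(m0, np.sqrt(v0), size=K)
        w = np.full(K, 1.0 / K)
        class_prob = np.zeros((n, K))
        for it in range(n_iter):
            # 1. allocation probabilities and sampled allocations
            logp = np.log(w) - 0.5 * (y[:, None] - mu[None, :])**2 / sigma2
            p = np.exp(logp - logp.max(axis=1, keepdims=True))
            p /= p.sum(axis=1, keepdims=True)
            z = np.array([rng.choice(K, p=pi) for pi in p])
            # 2. weights and means given the allocations (conjugate updates)
            counts = np.bincount(z, minlength=K)
            w = rng.dirichlet(alpha + counts)
            for k in range(K):
                vk = 1.0 / (1.0 / v0 + counts[k] / sigma2)
                mk = vk * (m0 / v0 + y[z == k].sum() / sigma2)
                mu[k] = rng.normal(mk, np.sqrt(vk))
            if it >= n_iter // 2:                # accumulate after burn-in
                class_prob += p
        return class_prob / (n_iter - n_iter // 2)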

9.
A family of Viterbi Bayesian predictive classifiers has recently been popularized for speech recognition applications with continuous acoustic signals modeled by finite mixture densities embedded in a hidden Markov framework. Here we generalize such classifiers to sequentially observed data from multiple finite alphabets and derive the optimal predictive classifier under exchangeability of the emitted symbols. We demonstrate that the optimal predictive classifier, which learns from unlabelled test items, improves considerably upon the marginal maximum a posteriori rule in the presence of sparse training data. It is shown that the learning process saturates as the amount of test data tends to infinity, so that no further gain in classification accuracy is possible upon arrival of new test items in the long run.

10.
In this article we suggest a definition for the notion of L1-distance that combines probability density functions and prior probabilities. We also obtain upper and lower bounds for this distance as well as its relation to other measures. In addition, the relationship between the proposed distance and quantities involved in the classification problem under the Bayesian approach is established. In practice, calculations are performed by Matlab procedures. As an illustration of the obtained results, the article estimates the ability of some companies in Can Tho City, Vietnam, to repay bank debt.
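
In the two-class case, one concrete version of such a distance (our notation, not necessarily the article's exact definition) is d = integral of |q1 f1(x) - q2 f2(x)| dx, and because min(a, b) = (a + b - |a - b|)/2 it determines the Bayes error directly. The article performs its calculations in Matlab; the numerical sketch below uses Python purely for illustration, with illustrative priors and densities and a grid approximation of the integral.

    # Sketch: prior-weighted L1 distance between two class densities and the
    # implied Bayes error. Priors, distributions, and the grid are illustrative.
    import numpy as np
    from scipy.stats import norm

    q1, q2 = 0.6, 0.4                        # prior class probabilities
    f1 = norm(loc=0.0, scale=1.0).pdf
    f2 = norm(loc=2.0, scale=1.5).pdf

    x = np.linspace(-10, 12, 20001)
    g1, g2 = q1 * f1(x), q2 * f2(x)
    l1 = np.trapz(np.abs(g1 - g2), x)        # distance, between 0 and 1
    bayes_error = 0.5 * (1.0 - l1)           # since g1 + g2 integrates to 1
    print(l1, bayes_error)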

11.
A common problem in medical statistics is the discrimination between two groups on the basis of diagnostic information. Information on patient characteristics is used to classify individuals into one of two groups: diseased or disease-free. This classification is often with respect to a particular disease. This discrimination has two probabilistic components: (1) the discrimination is not without error, and (2) in many cases the a priori chance of disease can be estimated. Logistic models (Cox 1970; Anderson 1972) provide methods for incorporating both of these components. The a posteriori probability of disease may be estimated for a patient on the basis of both current measurement of patient characteristics and prior information. The parameters of the logistic model may be estimated on the basis of a calibration trial. In practice, not one but several sets of measurements of one characteristic of the patient may be made on a questionable case. These measurements typically are correlated; they are far from independent. How should these correlated measurements be used? This paper presents a method for incorporating several sets of measurements in the classification of a case.

12.
Here we consider a multinomial probit regression model where the number of variables substantially exceeds the sample size and only a subset of the available variables is associated with the response; selecting a small number of relevant variables for classification has therefore received a great deal of attention. Generally, when the number of variables is substantial, sparsity-enforcing priors for the regression coefficients are called for on grounds of predictive generalization and computational ease. In this paper, we propose a sparse Bayesian variable selection method for the multinomial probit regression model for multi-class classification. The performance of our proposed method is demonstrated with one simulated data set and three well-known gene expression profiling data sets: breast cancer, leukemia, and small round blue-cell tumors. The results show that, compared with other methods, our method is able to select the relevant variables and can obtain competitive classification accuracy with a small subset of relevant genes.

13.
In this paper, we propose an adaptive stochastic gradient boosting tree for classification studies with imbalanced data. The adjustment of cost-sensitivity and the predictive threshold are integrated together with a composite criterion into the original stochastic gradient boosting tree to deal with the issues of the imbalanced data structure. A numerical study shows that the proposed method can significantly enhance the classification accuracy for the minority class with only a small loss in the true negative rate for the majority class. We discuss the relation of the cost-sensitivity to the threshold manipulation using simulations. An illustrative example of the analysis of suboptimal health-state data in traditional Chinese medicine is discussed.
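
The following sketch is not the authors' adaptive composite-criterion algorithm; it only illustrates, with off-the-shelf scikit-learn tools, the two levers the abstract combines: cost-sensitive weighting of the minority class and a tuned predictive threshold chosen on a validation split. The cost ratio, threshold grid, and composite score are illustrative.

    # Sketch: cost-sensitive stochastic gradient boosting plus threshold tuning
    # for imbalanced binary data (labels 0 = majority, 1 = minority).
    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.metrics import recall_score
    from sklearn.model_selection import train_test_split

    def fit_imbalanced_gbt(X, y, cost_ratio=5.0,
                           thresholds=np.linspace(0.05, 0.5, 10)):
        X_tr, X_val, y_tr, y_val = train_test_split(X, y, stratify=y,
                                                    random_state=0)
        w = np.where(y_tr == 1, cost_ratio, 1.0)   # up-weight the minority class
        clf = GradientBoostingClassifier(subsample=0.5, random_state=0)
        clf.fit(X_tr, y_tr, sample_weight=w)
        p = clf.predict_proba(X_val)[:, 1]
        scores = []
        for t in thresholds:
            pred = (p >= t).astype(int)
            sens = recall_score(y_val, pred, pos_label=1)   # true positive rate
            spec = recall_score(y_val, pred, pos_label=0)   # true negative rate
            scores.append(sens * spec)                      # a simple composite
        best_t = thresholds[int(np.argmax(scores))]
        return clf, best_t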

14.
15.
We introduce a new notion of positive dependence of survival times of system components using the multivariate arrangement increasing property. Following the spirit of Barlow and Mendel (J. Amer. Statist. Assoc. 87, 1116–1122), who introduced a new univariate aging notion relative to exchangeable populations of components, we characterize a multivariate positive dependence with respect to exchangeable multicomponent systems. Closure properties of such a class of distributions under some reliability operations are discussed. For an infinite population of systems our definition of multivariate positive dependence can be considered in the frequentist’s paradigm as multivariate totally positive of order 2 with an independence condition. de Finetti(-type) representations for a particular class of survival functions are also given.

16.
This article reviews Bayesian inference from the perspective that the designated model is misspecified. This misspecification has implications for the interpretation of objects, such as the prior distribution, and has been the cause of recent questioning of the appropriateness of Bayesian inference in this scenario. The main focus of this article is to establish the suitability of applying the Bayes update to a misspecified model, relying on representation theorems for sequences of symmetric distributions, the identification of parameter values of interest, and the construction of sequences of distributions which act as the guesses as to where the next observation is coming from. The article concludes by clearly identifying the fundamental starting point for the Bayesian.

17.
We develop a Bayesian framework for estimating the means of two random variables when only the sum of those random variables can be observed. Mixture models are proposed for establishing conjugacy between the joint prior distribution and the distribution for observations. Among other desirable features, conjugate distributions allow Bayesian methods to be applied in sequential decision problems.
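
A brute-force sketch that only makes the set-up concrete (it does not reproduce the paper's conjugate mixture-prior construction): two independent normal components with independent normal priors on their means, observed only through their sums, with the joint posterior evaluated on a grid. All distributions and settings are illustrative.

    # Sketch: grid posterior for (mu1, mu2) when only sums s_i = x_i + y_i are
    # observed, with x ~ N(mu1, sigma1^2), y ~ N(mu2, sigma2^2) and independent
    # normal priors on mu1 and mu2. Priors and variances are illustrative.
    import numpy as np
    from scipy.stats import norm

    def grid_posterior(s, sigma1=1.0, sigma2=1.0,
                       prior1=(0.0, 2.0), prior2=(3.0, 2.0), grid_size=200):
        mu1 = np.linspace(prior1[0] - 4 * prior1[1], prior1[0] + 4 * prior1[1],
                          grid_size)
        mu2 = np.linspace(prior2[0] - 4 * prior2[1], prior2[0] + 4 * prior2[1],
                          grid_size)
        M1, M2 = np.meshgrid(mu1, mu2, indexing="ij")
        log_post = norm.logpdf(M1, *prior1) + norm.logpdf(M2, *prior2)
        sd_sum = np.sqrt(sigma1**2 + sigma2**2)   # the sum is N(mu1+mu2, sd_sum^2)
        for si in np.atleast_1d(s):
            log_post += norm.logpdf(si, M1 + M2, sd_sum)
        post = np.exp(log_post - log_post.max())
        return mu1, mu2, post / post.sum()        # marginalize by summing an axis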

18.
Gradient Boosting (GB) was introduced as a powerful approach to both classification and regression problems. Boosting with the L2 loss has been studied intensively in both theory and practice. However, the L2 loss is not appropriate for learning distributional functionals beyond the conditional mean, such as conditional quantiles. A large literature studies conditional quantile prediction with various methods, including machine learning techniques such as random forests and boosting. Simulation studies reveal that the weakness of random forests lies in predicting central quantiles, while that of GB lies in predicting extremes. Is there an algorithm that enjoys the advantages of both random forests and boosting, so that it performs well over all quantiles? In this article, we propose such a boosting algorithm, called random GB, which embraces the merits of both random forests and GB. Empirical results are presented to support the superiority of this algorithm in predicting conditional quantiles.
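
The random GB algorithm itself is not reproduced here; as a rough stand-in for the idea of mixing boosting with random-forest-style randomization for conditional quantiles, one can boost under the pinball (quantile) loss with row subsampling and random feature subsets per split. A scikit-learn sketch with illustrative settings:

    # Sketch: quantile gradient boosting with randomization; a stand-in, not the
    # authors' random GB algorithm. Hyperparameters are illustrative.
    from sklearn.ensemble import GradientBoostingRegressor

    def quantile_booster(tau, n_estimators=500):
        return GradientBoostingRegressor(
            loss="quantile", alpha=tau,     # target quantile level tau
            n_estimators=n_estimators,
            learning_rate=0.05,
            max_depth=3,
            subsample=0.5,                  # row subsampling (stochastic GB)
            max_features="sqrt",            # random feature subsets per split
            random_state=0,
        )

    # Usage: fit one model per quantile level of interest, e.g.
    # models = {tau: quantile_booster(tau).fit(X_train, y_train)
    #           for tau in (0.1, 0.5, 0.9)}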

19.
In the high-dimensional setting, componentwise L2 boosting has been used to construct sparse models that perform well, but it tends to select many ineffective variables. Several sparse boosting methods, such as SparseL2Boosting and Twin Boosting, have been proposed to improve the variable selection of the L2 boosting algorithm. In this article, we propose a new general sparse boosting method (GSBoosting). Relations are established between GSBoosting and other well-known regularized variable selection methods in the orthogonal linear model, such as the adaptive Lasso and hard thresholding. Simulation results show that GSBoosting performs well in both prediction and variable selection.
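
For reference, the base procedure that these sparse variants modify, componentwise L2 boosting, selects at every step the single covariate whose univariate least-squares fit to the current residuals reduces the residual sum of squares most, and takes a shrunken step in that coordinate. A minimal sketch on standardized covariates, with an illustrative step size and a fixed number of steps (in practice chosen by AIC or cross-validation):

    # Sketch: componentwise L2 boosting. Shrinkage nu and the number of steps
    # are illustrative; covariates are assumed to be centred/standardized.
    import numpy as np

    def componentwise_l2_boost(X, y, n_steps=200, nu=0.1):
        n, p = X.shape
        coef = np.zeros(p)
        intercept = y.mean()
        resid = y - intercept
        col_ss = (X**2).sum(axis=0)             # per-column sums of squares
        for _ in range(n_steps):
            b = X.T @ resid / col_ss            # univariate LS coefficients
            rss_drop = b**2 * col_ss            # RSS reduction per covariate
            j = int(np.argmax(rss_drop))        # best single covariate
            coef[j] += nu * b[j]
            resid -= nu * b[j] * X[:, j]
        return intercept, coef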

20.
The main object of Bayesian statistical inference is the determination of posterior distributions. Sometimes these laws are given for quantities devoid of empirical value. This serious drawback vanishes when one confines oneself to considering a finite horizon framework. However, assuming infinite exchangeability gives rise to fairly tractable a posteriori quantities, which is very attractive in applications. Hence, with a view to a reconciliation between these two aspects of the Bayesian way of reasoning, in this paper we provide quantitative comparisons between posterior distributions of finitary parameters and posterior distributions of allied parameters appearing in usual statistical models.
