Similar literature
 Found 20 similar documents; search took 31 ms
1.
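A review of random forest methods   Total citations: 47, self-citations: 0, citations by others: 47
Random forest (RF) is a statistical learning method that uses bootstrap resampling to draw multiple samples from the original sample, fits a decision tree to each bootstrap sample, and then combines the predictions of the trees, obtaining the final prediction by voting. It has high predictive accuracy, tolerates outliers and noise well, and is not prone to overfitting, and it is widely applied in medicine, bioinformatics, management science, and other fields. The paper therefore introduces the principles of random forests and their properties, and discusses recent developments and some important application areas.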
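The bootstrap-then-vote procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the reviewed paper's implementation: one-split "stumps" on a randomly chosen feature stand in for full unpruned trees, and every name (`bootstrap_sample`, `fit_stump`, `fit_forest`, `predict_forest`) is hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (one bootstrap resample)."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(rows, rng):
    """Fit a one-split tree on a randomly chosen feature.

    Assumption: a stump replaces the full decision tree to keep the sketch short.
    Each row is (features, label).
    """
    feature = rng.randrange(len(rows[0][0]))
    best = None
    for x, _ in rows:
        threshold = x[feature]
        left = [y for xi, y in rows if xi[feature] <= threshold]
        right = [y for xi, y in rows if xi[feature] > threshold]
        if not left or not right:
            continue  # this threshold does not split the sample
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = sum(y != left_label for y in left)
        errors += sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:
        # No usable split: always predict the majority label.
        majority = Counter(y for _, y in rows).most_common(1)[0][0]
        return (feature, 0.0, majority, majority)
    _, threshold, left_label, right_label = best
    return (feature, threshold, left_label, right_label)


def predict_stump(stump, x):
    feature, threshold, left_label, right_label = stump
    return left_label if x[feature] <= threshold else right_label


def fit_forest(rows, n_trees=25, seed=0):
    """Fit one stump per bootstrap resample of the training rows."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(rows, rng), rng) for _ in range(n_trees)]


def predict_forest(forest, x):
    """Combine the individual predictions by majority vote."""
    votes = Counter(predict_stump(stump, x) for stump in forest)
    return votes.most_common(1)[0][0]


# Two well-separated classes in two features.
train = [((float(i), float(i)), 0) for i in range(10)]
train += [((100.0 + i, 100.0 + i), 1) for i in range(10)]
forest = fit_forest(train)
print(predict_forest(forest, (0.0, 0.0)), predict_forest(forest, (109.0, 109.0)))
```

A real random forest would grow deep trees and re-draw the candidate features at every split; the sketch keeps only the two ingredients the abstract names, resampling and voting.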

2.
With a view to the study of, for instance, arterial trees, this paper presents some exact distributional results on finite trees with (reciprocal) inverse Gaussian and gamma resistances. In particular, it is shown that under the specified model the conditional distribution of the minimal sufficient statistic given the total resistance of the tree is a convolution of gamma distributions and two-dimensional reciprocal inverse Gaussian distributions.

3.
This paper investigates the second-order properties of a stationary process after random sampling. While a short-memory process always gives rise to a short-memory one, we prove that long memory can disappear when the sampling law has heavy enough tails. We prove that under rather general conditions the existence of the spectral density is preserved by random sampling. We also investigate the effects of deterministic sampling on seasonal long memory.

4.
The stated goal of this paper is to propose the uniformly minimum variance unbiased estimator of the odds ratio in case–control studies under an inverse sampling design. The problem of estimating the odds ratio plays a central role in case–control studies. However, the traditional sampling schemes appear inadequate when the expected frequencies of unexposed cases and exposed controls can be very low. In such a case, it is convenient to use the inverse sampling design, which requires that random drawings be continued until a given number of relevant events has emerged. In this paper we prove that a uniformly minimum variance unbiased estimator of the odds ratio does not exist under the usual binomial sampling, while the standard odds ratio estimator is uniformly minimum variance unbiased under inverse sampling. In addition, we compare these two sampling schemes by means of large-sample theory and small-sample simulation.

5.
In this article, an unbiased estimator for the finite population variance is developed under linear systematic sampling with two random starts, and an explicit expression for its variance is also obtained. The study is supported by two real-life situations. A detailed numerical comparative study has been carried out to compare its average variance with the average variance of the conventional unbiased estimator for the finite population variance under simple random sampling for a wide variety of populations. Results based on the study strongly favor the use of the developed estimator for such populations.

6.
For a stratified population under inverse sampling, we propose and study an unbiased estimator for the mean of units belonging to a domain with specific features. An alternative, simpler, ratio-type estimator is also considered. Empirical studies show that strategies based on inverse sampling can be superior to a more traditional strategy based on stratified simple random sampling with a fixed number of draws in each stratum.

7.
We introduce a class of random fields that can be understood as discrete versions of multicolour polygonal fields built on regular linear tessellations. We focus first on a subclass of consistent polygonal fields, for which we show Markovianity and solvability by means of a dynamic representation. This representation is used to design new sampling techniques for Gibbsian modifications of such fields, a class which covers lattice‐based random fields. A flux‐based modification is applied to the extraction of the field tracks network from a Synthetic Aperture Radar image of a rural area.

8.
A general family of estimators, which use the information of two auxiliary variables in stratified random sampling, is proposed to estimate the population mean of the variable under study. Under a stratified random sampling without replacement scheme, the expressions of bias and mean square error (MSE) up to the first- and second-order approximations are derived. The family of estimators in its optimum case is discussed. Also, an empirical study is carried out to show the properties of the proposed estimators.

9.
We consider the issue of sampling from the posterior distribution of exponential random graph (ERG) models and other statistical models with intractable normalizing constants. Existing methods based on exact sampling are either infeasible or require very long computing time. We study a class of approximate Markov chain Monte Carlo (MCMC) sampling schemes that deal with this issue. We also develop a new Metropolis–Hastings kernel to sample sparse large networks from ERG models. We illustrate the proposed methods on several examples.

10.
In many industrial quality control experiments and destructive stress testing, the only available data are successive minima (or maxima), i.e., record-breaking data. There are two sampling schemes used to collect record-breaking data: random sampling and inverse sampling. For random sampling, the total sample size is predetermined and the number of records is a random variable, while in inverse sampling the number of records to be observed is predetermined; thus the sample size is a random variable. The purpose of this paper is to determine, via simulations, which of the two schemes, if any, is more efficient. Since the two schemes are equivalent asymptotically, the simulations were carried out for small to moderate-sized record-breaking samples. Simulated biases and mean square errors of the maximum likelihood estimators of the parameters using the two sampling schemes were compared. In general, it was found that if the estimators were well behaved, then there was no significant difference between the mean square errors of the estimates for the two schemes. However, for certain distributions described by both a shape and a scale parameter, random sampling led to estimators that were inconsistent. On the other hand, the estimates obtained from inverse sampling were always consistent. Moreover, for moderate-sized record-breaking samples, the total sample size that needs to be observed is smaller for inverse sampling than for random sampling.
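The difference between the two collection schemes can be illustrated with a short sketch. It covers only how the records (successive minima) are gathered, not the paper's maximum likelihood estimation or its simulation design; the function names and the `draw` callback are hypothetical.

```python
def records_random_sampling(n, draw):
    """Random sampling: the sample size n is fixed, the number of records is random."""
    records = []
    for _ in range(n):
        x = draw()
        if not records or x < records[-1]:
            records.append(x)  # a new successive minimum
    return records


def records_inverse_sampling(k, draw):
    """Inverse sampling: the number of records k is fixed, the sample size is random."""
    records, n_drawn = [], 0
    while len(records) < k:
        x = draw()
        n_drawn += 1
        if not records or x < records[-1]:
            records.append(x)
    return records, n_drawn


# A fixed sequence keeps the demonstration deterministic.
observations = [5, 3, 4, 2, 6, 1]
print(records_random_sampling(6, iter(observations).__next__))   # [5, 3, 2, 1]
print(records_inverse_sampling(3, iter(observations).__next__))  # ([5, 3, 2], 4)
```

With a genuinely random `draw`, new records become rarer as sampling proceeds, so the inverse-sampling loop can take many draws when `k` is large.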

11.
Within the context of non-parametric Bayesian inference, Dykstra and Laud (1981) define an extended gamma (EG) process and use it as a prior on increasing hazard rates. The attractive features of the EG process, among them its capability to index distribution functions that are absolutely continuous, are offset by the intractable nature of the computation that needs to be performed. Sampling-based approaches such as the Gibbs sampler can alleviate these difficulties, but the EG processes then give rise to the problem of efficient random variate generation from a class of distributions called D-distributions. In this paper, we describe a novel technique for sampling from such distributions, thereby providing an efficient computation procedure for non-parametric Bayesian inference with a rich class of priors for hazard rates.

12.
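Combining the adaptive reweighting of the AdaBoost algorithm with the unpruned, random-variable-split tree base models of the random forest algorithm, the paper proposes a quasi-adaptive random forest algorithm. Experimental data show that when the training set is large and the Bayes error is small, the simulated adaptive reweighting takes effect, so the quasi-adaptive random forest algorithm outperforms the random forest algorithm.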

13.
This paper presents a new random weighting-based adaptive importance resampling method to estimate the sampling distribution of a statistic. A random weighting-based cross-entropy procedure is developed to iteratively calculate the optimal resampling probability weights by minimizing the Kullback-Leibler distance between the optimal importance resampling distribution and a family of parameterized distributions. Subsequently, the random weighting estimation of the sampling distribution is constructed from the obtained optimal importance resampling distribution. The convergence of the proposed method is rigorously proved. Simulation and experimental results demonstrate that the proposed method can effectively estimate the sampling distribution of a statistic.

14.
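This paper systematically introduces the theory of sample rotation by the permanent random number method, discusses its concrete application under equal-probability and unequal-probability sampling, and compares it with the traditional subsample rotation method, in the hope of promoting the application and wider adoption of permanent-random-number sample rotation in recurring sample surveys.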

15.
The efficiency of schemes for sampling an alternating Poisson process (0,1 observations) is evaluated by the inverse ratio of the variance of the proportion estimate, p, to the binomial variance. The variance ratio presented by D.R. Cox (in Renewal Theory) for fixed interval sampling is generalized to accommodate random sampling and random sampling after a time delay equal to a fixed proportion, γ, of the mean time between observations, δ. The result is a sampling design tool that provides quantifications for the effect of various spacings between observations and of fixed vs. random sampling. Direct application is made to the field of work sampling.

16.
In this article, a chain ratio-product type exponential estimator is proposed for estimating finite population mean in stratified random sampling with two auxiliary variables under double sampling design. Theoretical and empirical results show that the proposed estimator is more efficient than the existing estimators, i.e., usual stratified random sample mean estimator, Chand (1975) chain ratio estimator, Choudhary and Singh (2012) estimator, chain ratio-product-type estimator, Sahoo et al. (1993) difference type estimator, and Kiregyera (1984) regression-type estimator. Two data sets are used to illustrate the performances of different estimators.

17.
A monitoring scheme is proposed to sequentially detect a structural change in random coefficient autoregressive time series of order p (RCA(p)) after a training period of size T. It extends structural change monitoring to RCA(p) time series. The asymptotic properties of our monitoring statistic are established under both the null of no change in parameters and the alternative of a change in coefficient. The finite sample properties are investigated by a simulation study.

18.
The Bertrand paradox is that, whereas we can define in a unique way a point uniformly at random in the interior of a circle, uniformly random chords can be given a variety of competing specifications. This is generalized to spheres, and the distributions of the uniformly random line sections (chords) and plane sections (disks) are tabulated. This includes the large class which are constructed as uniformly random chords of uniformly random disk sections.
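The competing specifications of a "uniformly random chord" can be made concrete with a small simulation on the unit circle. The sketch below covers only the classical planar case, not the paper's generalization to spheres, and the function names are hypothetical. It estimates the probability that a chord is longer than the side of the inscribed equilateral triangle, for which the three classical constructions give 1/3, 1/2, and 1/4.

```python
import math
import random


def chord_random_endpoints(rng):
    """Chord joining two independent uniform points on the circumference."""
    a = rng.uniform(0.0, 2.0 * math.pi)
    b = rng.uniform(0.0, 2.0 * math.pi)
    return 2.0 * abs(math.sin((a - b) / 2.0))


def chord_random_radius(rng):
    """Chord perpendicular to a radius, at a uniform distance from the centre."""
    d = rng.random()
    return 2.0 * math.sqrt(1.0 - d * d)


def chord_random_midpoint(rng):
    """Chord whose midpoint is uniform over the disk."""
    d_squared = rng.random()  # squared distance of a uniform point in the unit disk
    return 2.0 * math.sqrt(1.0 - d_squared)


def prob_longer_than_side(method, n=20000, seed=0):
    """Monte Carlo estimate of P(chord length > sqrt(3))."""
    rng = random.Random(seed)
    side = math.sqrt(3.0)  # side of the inscribed equilateral triangle
    return sum(method(rng) > side for _ in range(n)) / n


for method in (chord_random_endpoints, chord_random_radius, chord_random_midpoint):
    print(method.__name__, prob_longer_than_side(method))
```

The three estimates differ because each construction induces a different distribution on the chord's distance from the centre.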

19.
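A comparison of random forests and Bagging classification trees for classification   Total citations: 1, self-citations: 0, citations by others: 1
Using experimental data, the paper explains from two theoretical perspectives why the random forest algorithm outperforms the Bagging classification tree algorithm. Expressing the two algorithms within two different frameworks removes some ambiguities in their analysis. In particular, the second framework shows more clearly that the random forest algorithm outperforms the Bagging classification tree algorithm because it corresponds to a smaller bias.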

20.
An envelope-rejection method is used to generate random variates from the Watson distribution. The method is compact and is competitive with, if not superior to, the existing sampling algorithms. For the girdle form of the Watson distribution, a faster algorithm is proposed. As a result, Johnson's sampling algorithm for the Bingham distribution is improved.
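The general idea of envelope rejection for this distribution can be sketched as follows. This is not the paper's algorithm: it assumes the Watson distribution on the ordinary sphere in three dimensions with axis (0, 0, 1), where the cosine t of the angle to the axis has density proportional to exp(kappa * t**2) on [-1, 1], and it uses a plain uniform envelope, which becomes inefficient for large |kappa|. All names are hypothetical.

```python
import math
import random


def sample_watson_cosine(kappa, rng):
    """Rejection-sample t with density proportional to exp(kappa * t**2) on [-1, 1]."""
    # Log of the largest value of exp(kappa * t**2) on [-1, 1]:
    # kappa at t = +/-1 if kappa > 0 (bipolar form), 0 at t = 0 otherwise (girdle form).
    log_bound = max(kappa, 0.0)
    while True:
        t = rng.uniform(-1.0, 1.0)      # proposal from the uniform envelope
        u = 1.0 - rng.random()          # uniform on (0, 1], so log(u) is defined
        if math.log(u) <= kappa * t * t - log_bound:
            return t


def sample_watson_sphere(kappa, rng):
    """Return a unit vector whose third coordinate is the sampled cosine."""
    t = sample_watson_cosine(kappa, rng)
    phi = rng.uniform(0.0, 2.0 * math.pi)  # the azimuth is uniform by symmetry
    r = math.sqrt(max(0.0, 1.0 - t * t))
    return (r * math.cos(phi), r * math.sin(phi), t)


rng = random.Random(0)
bipolar = [sample_watson_cosine(5.0, rng) for _ in range(4000)]
girdle = [sample_watson_cosine(-5.0, rng) for _ in range(4000)]
print(sum(t * t for t in bipolar) / len(bipolar))  # above 1/3: mass near the poles
print(sum(t * t for t in girdle) / len(girdle))    # below 1/3: mass near the equator
```

A uniformly distributed direction has mean squared cosine 1/3, which is why that value separates the bipolar and girdle cases in the demonstration.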
