1.

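A Review of Random Forest Methods (total citations: 47; self-citations: 0; citations by others: 47)

Fang Kuangnan (方匡南), Wu Jianbin (吴见彬), Zhu Jianping (朱建平), Xie Bangchang (谢邦昌) 《统计与信息论坛》 (Statistics & Information Forum) 2011,26(3):32-38

Random forest (RF) is a statistical learning method that uses bootstrap resampling to draw multiple samples from the original sample, fits a decision tree to each bootstrap sample, and then combines the predictions of the trees, obtaining the final prediction by voting. It has high predictive accuracy, tolerates outliers and noise well, and is not prone to overfitting, and it is widely applied in medicine, bioinformatics, management science, and other fields. The paper therefore introduces the principles of random forests and their properties, and discusses recent developments and several important application areas.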
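The procedure this abstract describes (bootstrap resampling, one tree per bootstrap sample, a final vote) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single numeric feature, uses one-split "stumps" as a stand-in for full decision trees, and leaves out the random feature selection that a full random forest applies at each split. All function names are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) rows with replacement (one bootstrap sample)."""
    return [rng.choice(data) for _ in data]


def fit_stump(sample):
    """Fit a one-split tree on rows of the form (x, label).

    Tries a threshold at every observed x and keeps the first one with
    the fewest training errors. A stand-in for a full decision tree.
    """
    labels = [y for _, y in sample]
    default = Counter(labels).most_common(1)[0][0]
    best = (len(sample) + 1, None, default, default)
    for t in sorted({x for x, _ in sample}):
        left = [y for x, y in sample if x <= t]
        right = [y for x, y in sample if x > t]
        if not left or not right:
            continue
        l_lab = Counter(left).most_common(1)[0][0]
        r_lab = Counter(right).most_common(1)[0][0]
        errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if errors < best[0]:
            best = (errors, t, l_lab, r_lab)
    return best[1:]  # (threshold, left_label, right_label)


def predict_stump(stump, x):
    """Predict a label for x from a fitted stump."""
    t, l_lab, r_lab = stump
    if t is None:  # no valid split was found; fall back to the majority label
        return l_lab
    return l_lab if x <= t else r_lab


def fit_ensemble(data, n_trees, seed=0):
    """Fit one stump per bootstrap sample."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(data, rng)) for _ in range(n_trees)]


def predict_vote(ensemble, x):
    """Combine the individual predictions by majority vote."""
    votes = Counter(predict_stump(s, x) for s in ensemble)
    return votes.most_common(1)[0][0]
```

Each stump sees a different bootstrap sample, so individual thresholds vary. The vote averages out that variation, which is the mechanism the abstract credits for robustness to noise.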

2.

Exact Distributional Results for Random Resistance Trees

Ole E. Barndorff-Nielsen & Tina Hviid Rydberg 《Scandinavian Journal of Statistics》2000,27(1):129-141

With a view to the study of, for instance, arterial trees, this paper presents some exact distributional results on finite trees with (reciprocal) inverse Gaussian and gamma resistances. In particular, it is shown that under the specified model the conditional distribution of the minimal sufficient statistic given the total resistance of the tree is a convolution of gamma distributions and two-dimensional reciprocal inverse Gaussian distributions.

3.

Random sampling of long-memory stationary processes

Anne Philippe, Marie-Claude Viano 《Journal of Statistical Planning and Inference》2010

This paper investigates the second-order properties of a stationary process after random sampling. While a short-memory process always gives rise to a short-memory one, we prove that long memory can disappear when the sampling law has heavy enough tails. We prove that under rather general conditions the existence of the spectral density is preserved by random sampling. We also investigate the effects of deterministic sampling on seasonal long memory.

4.

The uniformly minimum variance unbiased estimator of odds ratio in case–control studies under inverse sampling

Piero Quatto, Antonella Zambon 《Statistical Papers》2012,53(2):305-309

This paper proposes the uniformly minimum variance unbiased estimator of the odds ratio in case–control studies under an inverse sampling design. Estimating the odds ratio plays a central role in case–control studies. However, the traditional sampling schemes appear inadequate when the expected frequencies of non-exposed cases and exposed controls can be very low. In such cases it is convenient to use the inverse sampling design, which requires that random drawings continue until a given number of relevant events has emerged. We prove that a uniformly minimum variance unbiased estimator of the odds ratio does not exist under the usual binomial sampling, while the standard odds ratio estimator is uniformly minimum variance unbiased under inverse sampling. In addition, we compare these two sampling schemes by means of large-sample theory and small-sample simulation.

5.

Finite Population Variance Estimation Under LSS with Multiple Random Starts

S. Sampath 《Communications in Statistics - Theory and Methods》2013,42(19):3596-3607

In this article, an unbiased estimator for the finite population variance is developed under linear systematic sampling with two random starts, and an explicit expression for its variance is obtained. The study is supported by two real-life situations. A detailed numerical study compares its average variance with that of the conventional unbiased estimator for the finite population variance under simple random sampling, for a wide variety of populations. The results strongly favor the use of the developed estimator for such populations.

6.

INVERSE SAMPLING FOR DOMAIN ESTIMATION IN A STRATIFIED POPULATION

Rahul Mukerjee, Sujit K. Basu 《Australian & New Zealand Journal of Statistics》1993,35(3):293-302

For a stratified population under inverse sampling, we propose and study an unbiased estimator for the mean of units belonging to a domain with specific features. An alternative, simpler, ratio-type estimator is also considered. Empirical studies show that strategies based on inverse sampling can be superior to a more traditional strategy based on stratified simple random sampling with a fixed number of draws in each stratum.

7.

Discrete Multicolour Random Mosaics with an Application to Network Extraction

M.N.M. van Lieshout 《Scandinavian Journal of Statistics》2013,40(4):734-751

We introduce a class of random fields that can be understood as discrete versions of multicolour polygonal fields built on regular linear tessellations. We focus first on a subclass of consistent polygonal fields, for which we show Markovianity and solvability by means of a dynamic representation. This representation is used to design new sampling techniques for Gibbsian modifications of such fields, a class which covers lattice‐based random fields. A flux‐based modification is applied to the extraction of the field tracks network from a Synthetic Aperture Radar image of a rural area.

8.

Family of Estimators of Population Mean Using Two Auxiliary Variables in Stratified Random Sampling

Nursel Koyuncu, Cem Kadilar 《Communications in Statistics - Theory and Methods》2013,42(14):2398-2417

A general family of estimators that uses the information of two auxiliary variables in stratified random sampling is proposed to estimate the population mean of the variable under study. Under stratified random sampling without replacement, expressions for the bias and mean square error (MSE) are derived up to the first- and second-order approximations. The optimum case of the family of estimators is discussed. An empirical study is also carried out to show the properties of the proposed estimators.

9.

Approximate Bayesian Computation for Exponential Random Graph Models for Large Social Networks

Jing Wang, Yves F. Atchadé 《Communications in Statistics - Simulation and Computation》2013,42(2):359-377

We consider the issue of sampling from the posterior distribution of exponential random graph (ERG) models and other statistical models with intractable normalizing constants. Existing methods based on exact sampling are either infeasible or require very long computing time. We study a class of approximate Markov chain Monte Carlo (MCMC) sampling schemes that deal with this issue. We also develop a new Metropolis–Hastings kernel to sample sparse large networks from ERG models. We illustrate the proposed methods on several examples.

10.

Record-breaking data: a parametric comparison of the inverse-sampling and the random-sampling schemes

《Journal of Statistical Computation and Simulation》2012,82(3):225-238

In many industrial quality control experiments and destructive stress testing, the only available data are successive minima (or maxima), i.e., record-breaking data. There are two sampling schemes used to collect record-breaking data: random sampling and inverse sampling. For random sampling, the total sample size is predetermined and the number of records is a random variable, while in inverse sampling the number of records to be observed is predetermined, so the sample size is a random variable. The purpose of this paper is to determine, via simulations, which of the two schemes, if any, is more efficient. Since the two schemes are equivalent asymptotically, the simulations were carried out for small to moderate-sized record-breaking samples. Simulated biases and mean square errors of the maximum likelihood estimators of the parameters under the two sampling schemes were compared. In general, it was found that if the estimators were well behaved, then there was no significant difference between the mean square errors of the estimates for the two schemes. However, for certain distributions described by both a shape and a scale parameter, random sampling led to estimators that were inconsistent. On the other hand, the estimates obtained from inverse sampling were always consistent. Moreover, for moderate-sized record-breaking samples, the total sample size that needs to be observed is smaller for inverse sampling than for random sampling.
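The two data-collection schemes contrasted in this abstract can be sketched as follows. This is only an illustration of the definitions given there (lower records, i.e. successive minima), not the simulation design of the paper. The function names, the `draw` callback, and the `max_draws` safety limit are all assumptions introduced here.

```python
import random


def records_random_sampling(n, draw):
    """Random sampling: observe exactly n draws and keep the successive minima.

    The sample size n is fixed; the number of records is random.
    """
    records = []
    for _ in range(n):
        x = draw()
        if not records or x < records[-1]:
            records.append(x)
    return records


def records_inverse_sampling(k, draw, max_draws=1_000_000):
    """Inverse sampling: keep drawing until k successive minima are seen.

    The number of records k is fixed; the sample size is random.
    Returns (records, sample_size). max_draws is a hypothetical guard,
    because the waiting time for a new record can be very long.
    """
    records, n = [], 0
    while len(records) < k:
        if n >= max_draws:
            raise RuntimeError("max_draws reached before observing k records")
        x = draw()
        n += 1
        if not records or x < records[-1]:
            records.append(x)
    return records, n
```

A usage example under an assumed exponential lifetime model: with `rng = random.Random(42)` and `draw = lambda: rng.expovariate(1.0)`, calling `records_random_sampling(50, draw)` returns a list of random length, whereas `records_inverse_sampling(3, draw)` always returns three records together with the random number of draws that were needed.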

11.

Random variate generation from D-distributions

Purushottam W. Laud, Paul Ramgopal, Adrian F. M. Smith 《Statistics and Computing》1993,3(3):109-112

Within the context of non-parametric Bayesian inference, Dykstra and Laud (1981) define an extended gamma (EG) process and use it as a prior on increasing hazard rates. The attractive features of the EG process, among them its capability to index distribution functions that are absolutely continuous, are offset by the intractable nature of the computation that needs to be performed. Sampling-based approaches such as the Gibbs sampler can alleviate these difficulties, but the EG processes then give rise to the problem of efficient random variate generation from a class of distributions called D-distributions. In this paper, we describe a novel technique for sampling from such distributions, thereby providing an efficient computation procedure for non-parametric Bayesian inference with a rich class of priors for hazard rates.

12.

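Quasi-Adaptive Reweighted Classification Random Forests

Ma Jingyi (马景义), Xie Bangchang (谢邦昌) 《统计与信息论坛》 (Statistics & Information Forum) 2010,25(3):13-16

Combining the adaptive reweighting of the AdaBoost algorithm with the unpruned, random-variable-split tree base model of the random forest algorithm, this article proposes a quasi-adaptive random forest algorithm. Experimental data show that when the training set is large and the Bayes error is small, the quasi-adaptive reweighting takes effect, so that the quasi-adaptive random forest algorithm outperforms the random forest algorithm.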

13.

Random weighting estimation of sampling distributions via importance resampling

Bingbing Gao, Shesheng Gao, Yongmin Zhong, Chengfan Gu 《Communications in Statistics - Simulation and Computation》2017,46(1):640-654

This paper presents a new random weighting-based adaptive importance resampling method to estimate the sampling distribution of a statistic. A random weighting-based cross-entropy procedure is developed to iteratively calculate the optimal resampling probability weights by minimizing the Kullback-Leibler distance between the optimal importance resampling distribution and a family of parameterized distributions. Subsequently, the random weighting estimation of the sampling distribution is constructed from the obtained optimal importance resampling distribution. The convergence of the proposed method is rigorously proved. Simulation and experimental results demonstrate that the proposed method can effectively estimate the sampling distribution of a statistic.

14.

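A Preliminary Study of Sample Rotation by the Permanent Random Number Method

Jin Yongjin (金勇进), Luan Wenying (栾文英) 《统计教育》 (Statistical Education) 2004,(2):14-16

This paper systematically introduces the theory of sample rotation by the permanent random number method, discusses its concrete application under equal-probability and unequal-probability sampling, and compares it with the traditional subsample rotation method, in the hope of promoting the application and wider adoption of permanent-random-number sample rotation in recurring sample surveys.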

15.

Sampling efficiency for an alternating Poisson process

Elinor S. Pape 《Communications in Statistics - Theory and Methods》2013,42(9):3175-3178

The efficiency of schemes for sampling an alternating Poisson process (0,1 observations) is evaluated by the inverse ratio of the variance of the proportion estimate, p, to the binomial variance. The variance ratio presented by D.R. Cox (in Renewal Theory) for fixed interval sampling is generalized to accommodate random sampling and random sampling after a time delay equal to a fixed proportion, γ, of the mean time between observations, δ. The result is a sampling design tool that quantifies the effect of various spacings between observations and of fixed vs. random sampling. Direct application is made to the field of work sampling.

16.

Estimation of Finite Population Mean in Stratified Random Sampling with Two Auxiliary Variables under Double Sampling Design

Sat Gupta 《Communications in Statistics - Theory and Methods》2013,42(13):2798-2808

In this article, a chain ratio-product type exponential estimator is proposed for estimating the finite population mean in stratified random sampling with two auxiliary variables under a double sampling design. Theoretical and empirical results show that the proposed estimator is more efficient than the existing estimators, namely the usual stratified random sample mean estimator, the Chand (1975) chain ratio estimator, the Choudhary and Singh (2012) estimator, the chain ratio-product-type estimator, the Sahoo et al. (1993) difference-type estimator, and the Kiregyera (1984) regression-type estimator. Two data sets are used to illustrate the performance of the different estimators.

17.

Structural Change Monitoring for Random Coefficient Autoregressive Time Series

Fuxiao Li, Zheng Tian, Peiyan Qi 《Communications in Statistics - Simulation and Computation》2015,44(4):996-1009

A monitoring scheme is proposed to sequentially detect a structural change in random coefficient autoregressive time series of order p (RCA(p)) after a training period of size T. It extends structural change monitoring to RCA(p) time series. The asymptotic properties of our monitoring statistic are established under both the null of no change in parameters and the alternative of a change in coefficient. The finite sample properties are investigated by a simulation study.

18.

Random sections of a sphere

Rodney Coleman 《Revue canadienne de statistique》1989,17(1):27-39

The Bertrand paradox is that, whereas we can define in a unique way a point uniformly at random in the interior of a circle, uniformly random chords can be given a variety of competing specifications. This is generalized to spheres, and the distributions of the uniformly random line sections (chords) and plane sections (disks) are tabulated. This includes the large class which are constructed as uniformly random chords of uniformly random disk sections.
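The circle case of the Bertrand paradox mentioned in this abstract can be checked numerically. The sketch below is a Monte Carlo illustration of the three classical chord constructions on the unit circle, not the sphere tabulations of the paper. The function names, sample size, and seed are assumptions.

```python
import math
import random


def chord_random_endpoints(rng):
    """Chord joining two independent uniform points on the unit circle."""
    a = rng.uniform(0.0, 2.0 * math.pi)
    b = rng.uniform(0.0, 2.0 * math.pi)
    return 2.0 * abs(math.sin((a - b) / 2.0))


def chord_random_radius(rng):
    """Chord perpendicular to a radius, at a uniform distance from the centre."""
    d = rng.random()
    return 2.0 * math.sqrt(1.0 - d * d)


def chord_random_midpoint(rng):
    """Chord whose midpoint is uniform over the area of the unit disk."""
    d_squared = rng.random()  # squared distance of a uniform point in the disk
    return 2.0 * math.sqrt(1.0 - d_squared)


def mean_chord_length(sampler, n=100_000, seed=0):
    """Monte Carlo estimate of the mean chord length under one construction."""
    rng = random.Random(seed)
    return sum(sampler(rng) for _ in range(n)) / n
```

The theoretical mean chord lengths are 4/π for random endpoints, π/2 for the random-radius construction, and 4/3 for random midpoints. Three reasonable meanings of "uniformly random chord" therefore give three different distributions, which is the ambiguity the paper generalizes to spheres.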

19.

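A Comparison of Random Forests and Bagging Classification Trees for Classification (total citations: 1; self-citations: 0; citations by others: 1)

Ma Jingyi (马景义), Xie Bangchang (谢邦昌) 《统计与信息论坛》 (Statistics & Information Forum) 2010,25(10):18-22

Using experimental data, this paper explains from two theoretical perspectives why the random forest algorithm outperforms the Bagging classification tree algorithm. Expressing the two algorithms within two different frameworks removes some ambiguities in their analysis. Under the second framework in particular, it becomes clear that random forests outperform Bagging classification trees because the random forest algorithm corresponds to a smaller bias.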

20.

Random sampling from the Watson distribution

Kim-Hung Li, Carl Ka-Fai Wong 《Communications in Statistics - Simulation and Computation》2013,42(4):997-1009

An envelope-rejection method is used to generate random variates from the Watson distribution. The method is compact and is competitive with, if not superior to, the existing sampling algorithms. For the girdle form of the Watson distribution, a faster algorithm is proposed. As a result, Johnson's sampling algorithm for the Bingham distribution is improved.