Similar literature
 Found 20 similar articles; search took 15 ms
1.
A simple proof is given to show that there always exists a neighborhood of zero in which a moment generating function has a power series expansion. Thus, the relation between moments and derivatives of the moment generating function at zero can be obtained without resorting to postcalculus theorems.
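The relation referred to in this abstract can be written out as follows. This is the standard textbook statement rather than the paper's own proof, and the symbols M_X, t and h are notation assumed here: if the moment generating function is finite for |t| < h with some h > 0, then

```latex
M_X(t) = \mathbb{E}\left[e^{tX}\right]
       = \sum_{k=0}^{\infty} \frac{\mathbb{E}\left[X^k\right]}{k!}\, t^k
       \quad (|t| < h),
\qquad
\left.\frac{d^k}{dt^k} M_X(t)\right|_{t=0} = \mathbb{E}\left[X^k\right].
```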

2.
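This paper proves that the distribution of the optimal value of the comprehensive DEA model is non-defective when the reference set is finite.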

3.
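Selective ensemble algorithms are currently one of the hot topics in machine learning. Based on a case study of algae reproduction, a fast selective Bagging Trees ensemble algorithm based on k-means clustering is proposed. Compared with traditional statistical methods and several commonly used machine learning methods, the algorithm has a smaller generalization error and higher prediction accuracy, and its running efficiency is also considerably improved.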

4.
An exact confidence interval for the number or proportion of successes in a finite population is developed using the standard technique of inverting a family of tests. The resulting procedure is compared with two methods available in the sampling literature and is shown to be equivalent to one of the methods and superior to the other method.

5.
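A review of research on the random forest method   Total citations: 57, self-citations: 5, citations by others: 57
Random forest (RF) is a statistical learning method. It uses bootstrap resampling to draw multiple samples from the original sample, builds a decision tree on each bootstrap sample, and then combines the predictions of the trees, obtaining the final prediction by voting. It has high prediction accuracy, tolerates outliers and noise well, and is not prone to overfitting, and it is widely applied in medicine, bioinformatics, management science, and other fields. This paper therefore introduces the principle of random forests and their relevant properties, and discusses recent developments and some important application areas.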
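The bootstrap-then-vote procedure described in this abstract can be sketched as follows. This is a minimal illustration, not the reviewed paper's implementation: it assumes one-split decision stumps in place of full decision trees, one randomly chosen feature per stump, and hypothetical function names (fit_stump, fit_forest, predict_forest).

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """Fit a one-split tree (stump) on a single feature by minimizing training errors."""
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        # If nothing falls to the right, reuse the left label.
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, feature, threshold, left_label, right_label)
    return best[1:]


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample and random feature."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap: draw n rows with replacement
        feature = rng.randrange(len(rows[0]))       # assumption: one random feature per stump
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return forest


def predict_forest(forest, row):
    """Combine the individual predictions by majority vote."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Toy data (made up for illustration): two well-separated classes in two dimensions.
rows = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
        (1.0, 1.1), (1.2, 0.9), (0.9, 1.3), (1.1, 1.0)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(rows, labels)
print(predict_forest(forest, (0.0, 0.0)), predict_forest(forest, (1.5, 1.5)))
```

Stumps keep the sketch short. A practical random forest grows deep trees and re-draws the candidate features at every split.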

6.
This paper presents a Bayesian non-parametric approach to survival analysis based on arbitrarily right censored data. The analysis is based on posterior predictive probabilities using a Polya tree prior distribution on the space of probability measures on [0, ∞). In particular we show that the estimate generalizes the classical Kaplan–Meier non-parametric estimator, which is obtained in the limiting case as the weight of prior information tends to zero.
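For orientation, the classical Kaplan-Meier (product-limit) estimator that this abstract names as the limiting case can be sketched as below. The sketch covers only that classical estimator, not the Polya tree analysis. The function name kaplan_meier and the toy data are assumptions made here, and observations censored at a time t are treated as still at risk at t.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    events[i] is True for an observed event and False for right censoring.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all observations that share the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Events and censored observations both leave the risk set after t.
        n_at_risk -= removed
    return curve


# Toy data (made up): five subjects, two of them right censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
print(curve)
```

On this toy input the survival estimate steps down at times 1, 2 and 3 to about 0.8, 0.6 and 0.3. The censored observation at time 4 adds no step.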

7.
A statistical distribution of a random variable is uniquely represented by its normal-based quantile function. For a symmetrical distribution it is S-shaped (for negative kurtosis) and inverted S-shaped (otherwise). As skewness departs from zero, the quantile function gradually transforms into a monotone convex function (positive skewness) or concave function (otherwise). Recently, a new general modeling platform has been introduced, response modeling methodology, which delivers good representation to monotone convex relationships due to its unique “continuous monotone convexity” property. In this article, this property is exploited to model the normal-based quantile function, and explored using a set of 27 distributions.

8.
Communications in Statistics - Theory and Methods, 2012, 41(16-17): 3244-3258
An extension of soft classification trees to multinomial outcomes is presented. Estimates of the method's predictive accuracy, as well as average tree size and tree depths, are systematically compared to those of the conventional Classification and Regression Tree (CART) approach by means of simulations. A similar comparison is performed on real datasets. Results point to an advantage in favor of the soft tree.

9.
The authors derive the analytic expressions for the mean and variance of the log-likelihood ratio for testing equality of k (k ≥ 2) normal populations, and suggest a chi-square approximation and a gamma approximation to the exact null distribution. Numerical comparisons show that the two approximations and the original beta approximation of Neyman and Pearson (1931) are all accurate, and the gamma approximation is the most accurate.

10.
11.
The mean and variance of a sum of a random number of random variables are well known when the number of summands is independent of each summand and when the summands are independent and identically distributed (iid), or when all summands are identical. In scientific and financial applications, the preceding conditions are often too restrictive. Here, we calculate the mean and variance of a sum of a random number of random summands when the mean and variance of each summand depend on the number of summands and when every pair of summands has the same correlation. This article shows that the variance increases with the correlation between summands and equals the variance in the iid or identical cases when the correlation is zero or one.
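A worked special case may make the last sentence of this abstract concrete. It is a sketch under simplifying assumptions that are not the paper's general setting: every summand X_i has the same mean \mu and variance \sigma^2 whatever the value of N, every pair of summands has correlation \rho, and N is independent of the summands. The symbols S, N, \mu, \sigma^2 and \rho are notation assumed here. Conditioning on N and applying the law of total variance gives

```latex
S = \sum_{i=1}^{N} X_i, \qquad
\mathbb{E}[S] = \mu\, \mathbb{E}[N], \qquad
\operatorname{Var}(S \mid N) = N \sigma^2 + N (N - 1)\, \rho\, \sigma^2,
\\[4pt]
\operatorname{Var}(S)
  = \mathbb{E}\left[\operatorname{Var}(S \mid N)\right] + \operatorname{Var}\left(\mathbb{E}[S \mid N]\right)
  = \sigma^2\, \mathbb{E}[N] + \rho\, \sigma^2\, \mathbb{E}\left[N (N - 1)\right] + \mu^2 \operatorname{Var}(N).
```

Setting \rho = 0 gives \sigma^2 E[N] + \mu^2 Var(N), the familiar iid formula. Setting \rho = 1 gives \sigma^2 E[N^2] + \mu^2 Var(N), the formula for identical summands. Because E[N(N - 1)] \geq 0 for a non-negative integer N, the variance is non-decreasing in \rho.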

12.
With a view to the study of, for instance, arterial trees, this paper presents some exact distributional results on finite trees with (reciprocal) inverse Gaussian and gamma resistances. In particular, it is shown that under the specified model the conditional distribution of the minimal sufficient statistic given the total resistance of the tree is a convolution of gamma distributions and two-dimensional reciprocal inverse Gaussian distributions.

13.
It is well known that it is difficult to obtain an accurate optimal design for a mixture experimental design with complex constraints. In this article, we construct a random search algorithm that can be used to find the optimal design for a mixture model with complex constraints. First, we generate an initial set by the Monte Carlo method, and then run the random search algorithm to obtain the optimal set of points. We then demonstrate the effectiveness of this method using two examples.

14.
In this paper, we show that a hypergeometric random variable can be represented as a sum of independent Bernoulli random variables that are, except in degenerate cases, not identically distributed. In the proof, we use the factorial moment generating function. An asymptotic result on the probabilities of the Bernoulli random variables in the sum is also presented. Numerical examples are used to illustrate the results.

15.
Consider observations (representing lifelengths) taken on a random field indexed by lattice points. Estimating the distribution function F(x) = P(X_i ≤ x) is an important problem in survival analysis. We propose to estimate F(x) by kernel estimators, which take into account the smoothness of the distribution function. Under some general mixing conditions, our estimators are shown to be asymptotically unbiased and consistent. In addition, the proposed estimator is shown to be strongly consistent and sharp rates of convergence are obtained.
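A generic kernel estimator of a distribution function averages an integrated kernel centred at each observation. The sketch below shows that basic form only. It assumes a Gaussian kernel, a fixed bandwidth and a flat list of observations, and it ignores the lattice indexing and mixing conditions that the paper studies. The names normal_cdf and kernel_cdf are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function, used here as the integrated kernel."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kernel_cdf(x, sample, bandwidth):
    """Smooth estimate of F(x) = P(X <= x): mean of integrated kernels at (x - X_i) / h."""
    return sum(normal_cdf((x - xi) / bandwidth) for xi in sample) / len(sample)


# Toy lifelengths (made up for illustration).
sample = [-1.0, 0.0, 1.0]
print(kernel_cdf(0.0, sample, bandwidth=0.5))
```

As the bandwidth shrinks towards zero the estimate approaches the empirical distribution function. A larger bandwidth gives a smoother curve at the cost of more bias.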

16.
Teratological experiments are controlled dose-response studies in which impregnated animals are randomly assigned to various exposure levels of a toxic substance. Subsequently, both continuous and discrete responses are recorded on the litters of fetuses that these animals produce. Discrete responses are usually binary in nature, such as the presence or absence of some fetal anomaly. These clustered binary data usually exhibit over-dispersion (or under-dispersion), which can be interpreted as either variation between litter response probabilities or intralitter correlation. To model the correlation and/or variation, the beta-binomial distribution has been assumed for the number of positive fetal responses within a litter. Although the mean of the beta-binomial model has been linked to dose-response functions, in terms of measuring over-dispersion, it may be a restrictive method in modeling data from teratological studies. Also, for certain toxins, a threshold effect has been observed in the dose-response pattern of the data. We propose to incorporate a random effect into a general threshold dose-response model to account for the variation in responses, while at the same time estimating the threshold effect. We fit this model to a well-known data set in the field of teratology. Simulation studies are performed to assess the validity of the random effects threshold model in these types of studies.

17.
In clinical research, early and prompt detection of the risk class of a new patient may play a crucial role in determining the effectiveness of the treatment and, consequently, in achieving a satisfactory prognosis for the patient. There exist a number of popular rule-based algorithms for classification, whose performance is very attractive whenever data on a large number of patients are available. However, when datasets include data on only a few hundred patients, the most common approaches give unstable results, and developing effective decision-support systems becomes scientifically challenging. Since rules can be derived from different models as well as from expert knowledge resources, each having its advantages and weaknesses, this article suggests a “hybrid” approach to address the classification problem when the number of patients is too small to use a single technique effectively. The hybrid strategy was applied to a case study and its predictive performance was compared with the performance of each single approach: due to the seriousness of misclassifying high-risk patients, special attention was paid to the specificity. The results show that the hybrid strategy outperforms each single strategy involved.

18.
In this article, a novel technique IRUSRT (inverse random under sampling and random tree) by combining inverse random under sampling and random tree is proposed to implement imbalanced learning. The main idea is to severely under sample the majority class thus creating multiple distinct training sets. With each training set, a random tree is trained to separate the minority class from the majority class. By combining these random trees through fusion, a composite classifier is constructed. The experimental analysis on 23 real-world datasets assessed over area under the ROC curve (AUC), F-measure, and G-mean indicates that IRUSRT performs significantly better when compared with many existing class imbalance learning methods.
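The sampling step described in this abstract, severe under sampling of the majority class to create several distinct training sets, can be sketched as follows. The sketch covers only that step and omits training and fusing the random trees. The function name inverse_under_samples and the parameters n_sets and ratio are hypothetical. Drawing a majority subset smaller than the minority class is one reading of "inverse" and is an assumption of this sketch.

```python
import random


def inverse_under_samples(majority, minority, n_sets=10, ratio=0.5, seed=0):
    """Build n_sets training sets of (row, label) pairs.

    Each set keeps every minority row (label 1) and a random subset of the
    majority rows (label 0) whose size is ratio * len(minority).
    """
    rng = random.Random(seed)
    # Assumed rule: the majority subset is a fraction of the minority class size.
    k = max(1, min(len(majority), int(ratio * len(minority))))
    training_sets = []
    for _ in range(n_sets):
        subset = rng.sample(majority, k)  # sample without replacement within one set
        training_sets.append([(row, 0) for row in subset] + [(row, 1) for row in minority])
    return training_sets


# Toy data (made up): 100 majority rows and 10 minority rows.
majority = list(range(100))
minority = list(range(100, 110))
training_sets = inverse_under_samples(majority, minority, n_sets=5)
print([len(s) for s in training_sets])
```

One classifier would then be trained per set and the outputs combined. The abstract does not state the fusion rule, so it is left out here.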

19.
Sliced Inverse Regression (SIR; 1991) is a dimension reduction method for reducing the dimension of the predictors without losing regression information. The implementation of SIR requires inverting the covariance matrix of the predictors, which has hindered its use to analyze high-dimensional data where the number of predictors exceeds the sample size. We propose random sliced inverse regression (rSIR) by applying SIR to many bootstrap samples, each using a subset of randomly selected candidate predictors. The final rSIR estimate is obtained by aggregating these estimates. A simple variable selection procedure is also proposed using these bootstrap estimates. The performance of the proposed estimates is studied via extensive simulation. Application to a dataset concerning myocardial perfusion diagnosis from cardiac Single Photon Emission Computed Tomography (SPECT) images is presented.

20.
Generalized degrees of freedom (GDF), as defined by Ye (1998, JASA 93:120–131), represent the sensitivity of model fits to perturbations of the data. Such GDF can be computed for any statistical model, making it possible, in principle, to derive the effective number of parameters in machine-learning approaches and thus compute information-theoretical measures of fit. We compare GDF with cross-validation and find that the latter provides a less computer-intensive and more robust alternative. For Bernoulli-distributed data, GDF estimates were unstable and inconsistently sensitive to the number of data points perturbed simultaneously. Cross-validation, in contrast, performs well also for binary data, and for very different machine-learning approaches.
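The sensitivity that this abstract describes is usually written as below. This is a sketch of the commonly quoted Gaussian form of the definition, stated from general knowledge rather than taken from the abstract. The symbols y_i, \mu_i, \hat{\mu}_i, \sigma^2 and H are notation assumed here.

```latex
\mathrm{GDF}
  = \sum_{i=1}^{n} \frac{\partial\, \mathbb{E}_{\mu}\left[\hat{\mu}_i(y)\right]}{\partial \mu_i}
  = \frac{1}{\sigma^2} \sum_{i=1}^{n} \operatorname{Cov}\left(\hat{\mu}_i(y),\, y_i\right),
\qquad y \sim N\left(\mu,\, \sigma^2 I_n\right).
```

For a linear fit \hat{\mu} = H y this reduces to the trace of H, the usual number of parameters. For other procedures it is typically estimated by perturbing the data and refitting, which is the computational cost the abstract contrasts with cross-validation.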
