Found 20 similar articles; search took 31 ms
1.
A fast splitting procedure for classification trees (total citations: 1, self-citations: 0, citations by others: 1)
This paper provides a faster method to find the best split at each node when using the CART methodology. The predictability index is proposed as a splitting rule for growing the same classification tree as CART does when using the Gini index of heterogeneity as an impurity measure. A theorem is introduced to show a new property of the index: the index for a given predictor has a value not lower than the index for any split generated by that predictor. This property is used to make a substantial saving in the time required to generate a classification tree. Three simulation studies are presented to show the computational gain in terms of both the number of splits analysed at each node and the CPU time. An example shows that the proposed splitting algorithm is also computationally efficient on real data sets.
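For orientation, the baseline that the abstract above sets out to speed up can be sketched as an exhaustive search over the thresholds of one numeric predictor, scored by the decrease in Gini impurity. This is a minimal illustration and not the paper's fast procedure; the function names `gini` and `best_split` and the toy data are assumptions made for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(x, y):
    """Try every midpoint between consecutive distinct values of x.

    Returns (threshold, impurity_decrease) for the best binary split
    'x <= threshold' versus 'x > threshold', or (None, 0.0) if no split helps.
    """
    n = len(y)
    parent = gini(y)
    best_threshold, best_gain = None, 0.0
    values = sorted(set(x))
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        gain = parent - (len(left) / n) * gini(left) - (len(right) / n) * gini(right)
        if gain > best_gain:
            best_threshold, best_gain = threshold, gain
    return best_threshold, best_gain


# Toy data (hypothetical): the classes separate cleanly between 2 and 3.
threshold, gain = best_split([1, 2, 3, 4], ["a", "a", "b", "b"])
```

Every candidate threshold is evaluated here, which is the cost the abstract's bounding property is meant to reduce.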
2.
Zailei Cheng 《Communications in Statistics - Theory and Methods》2013,42(23):5850-5861
3.
4.
This paper deals with the construction of optimum partitions for a clustering criterion that is based on a convex function of the class centroids, as a generalization of the classical SSQ clustering criterion for n data points. We formulate a dual optimality problem involving two sets of variables and derive a maximum-support-plane (MSP) algorithm for constructing a (sub-)optimum partition as a generalized k-means algorithm. We present various modifications of the basic criterion and describe the corresponding MSP algorithm. It is shown that the method can also be used for solving optimality problems in classical statistics (maximizing Csiszár's φ-divergence) and for simultaneous classification of the rows and columns of a contingency table.
5.
6.
7.
Kulasekera K.B., Williams Calvin L., Coffin Marie, Manatunga Amita 《Lifetime Data Analysis》2001,7(4):415-433
Problems with censored data arise quite frequently in reliability applications. Estimation of the reliability function is usually of concern. The reliability function estimators proposed by Kaplan and Meier (1958) and Breslow (1972) are generally used when dealing with censored data. These estimators have the known properties of being asymptotically unbiased, uniformly strongly consistent, and weakly convergent to the same Gaussian process when properly normalized. In this paper we study the properties of the smoothed Kaplan-Meier estimator with a suitable kernel function. The smooth estimator is compared with the Kaplan-Meier and Breslow estimators for large sample sizes, giving an exact expression for an appropriately normalized difference of the mean square error (MSE) of the two estimators. This quantifies the deficiency of the Kaplan-Meier estimator in comparison to the smoothed version. We also obtain a non-asymptotic bound on an expected L1-type error under weak conditions. Some simulations are carried out to examine the performance of the suggested method.
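As background to the abstract above, the unsmoothed Kaplan-Meier (product-limit) estimator can be sketched in a few lines. This is only the classical estimator, not the kernel-smoothed version the paper studies; the function name `kaplan_meier` and the sample data are assumptions made for the example.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the reliability (survival) function.

    times  : observed times (failure or censoring)
    events : 1 if the time is an observed failure, 0 if it is censored
    Returns a list of (t, S(t)) pairs, one per distinct failure time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        leaving = 0
        # Gather all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            failures += data[i][1]
            leaving += 1
            i += 1
        if failures > 0:
            survival *= 1.0 - failures / at_risk
            curve.append((t, survival))
        # Failures and censored units both leave the risk set after t.
        at_risk -= leaving
    return curve


# Hypothetical sample: the unit observed at time 2 is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

The estimate is a step function that only drops at failure times; a kernel-smoothed variant would replace these jumps with a continuous curve.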
8.
9.
10.
11.
12.
M. A. A. Cox 《Communications in Statistics - Theory and Methods》2013,42(20):5050-5057
The adoption of control charts can be traced to the classic text by Shewhart (1931) and has been championed by many writers since then, including Deming (1982). Numerous other texts and publications stress the continuing importance of this area. While tables of key Shewhart control chart parameters are extremely useful, they are easily lost or mislaid and can sometimes be difficult to interpret. To address this issue, spreadsheet code is implemented to produce all the key control chart factors.
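To illustrate the kind of computation the abstract above refers to, the following sketch derives a few standard Shewhart factors for the x-bar and s charts from the subgroup size n. It is written in Python rather than the spreadsheet code the paper describes, it covers only the factors with a closed form in terms of the gamma function (c4, A3, B3, B4), and the function names are assumptions made for the example.

```python
import math


def c4(n):
    """Bias-correction factor for the sample standard deviation, subgroup size n >= 2."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)


def s_chart_factors(n):
    """Return the 3-sigma factors A3, B3, B4 for x-bar and s charts."""
    c = c4(n)
    spread = 3.0 * math.sqrt(1.0 - c * c) / c
    return {
        "c4": c,
        "A3": 3.0 / (c * math.sqrt(n)),   # x-bar limits: xbarbar +/- A3 * sbar
        "B3": max(0.0, 1.0 - spread),     # lower s-chart limit: B3 * sbar
        "B4": 1.0 + spread,               # upper s-chart limit: B4 * sbar
    }


factors = s_chart_factors(5)
```

For n = 5 the values agree with the commonly tabulated ones (c4 about 0.9400, A3 about 1.427, B3 = 0, B4 about 2.089).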
13.
14.
M. Kukuk 《Statistical Papers》1994,35(1):231-242
For observable indicators with ordered categories one can assume underlying latent variables following certain marginal distributions. Transforming the latent variables changes their marginal distributions but not the observable qualitative indicators. The joint distribution of the latent variables can be constructed from the marginal distributions. There is a broad class of multivariate distributions for which the observable indicators are equivalent. By choosing the multivariate normal distribution from this class we can analyse a linear relationship between the transformed latent variables. This leads to latent structural equation models. Estimation of these models is therefore more general than the distributional assumption might initially suggest. Robustness of the estimation procedure is also discussed for deviations from this distribution family. Using ordinal business survey data of the German Ifo-institute we test the efficiency of firms' price expectations implied by the rational expectation hypothesis.
15.
16.
17.
18.
For a sequence of strictly stationary random fields that are uniformly ρ′-mixing and satisfy a Lindeberg condition, a central limit theorem is obtained for sequences of “rectangular” sums from the given random fields. The “Lindeberg CLT” is then used to prove a CLT for some kernel estimators of probability density for some strictly stationary random fields satisfying ρ′-mixing, and whose probability density and joint densities are absolutely continuous.
19.