Similar literature
 Found 20 similar documents; search took 31 ms
1.
A fast splitting procedure for classification trees   Total citations: 1, self-citations: 0, cited by others: 1
This paper provides a faster method to find the best split at each node when using the CART methodology. The predictability index is proposed as a splitting rule for growing the same classification tree as CART does when using the Gini index of heterogeneity as an impurity measure. A theorem is introduced to show a new property of the index: its value for a given predictor is not lower than its value for any split generated by that predictor. This property is used to make a substantial saving in the time required to generate a classification tree. Three simulation studies are presented to show the computational gain in terms of both the number of splits analysed at each node and the CPU time. The proposed splitting algorithm also proves computationally efficient on real data sets, as shown in an example.
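As a point of reference for the abstract above, the following is a minimal sketch of the baseline exhaustive search that CART performs with the Gini index on a single numeric predictor. It is not the fast procedure the paper proposes; the function names `gini` and `best_threshold_split` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini index of heterogeneity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold_split(x, y):
    """Exhaustively scan thresholds on one numeric predictor (illustrative only).

    Returns (threshold, impurity_decrease); threshold is None if no split helps.
    """
    pairs = sorted(zip(x, y), key=lambda p: p[0])
    n = len(pairs)
    parent = gini([label for _, label in pairs])
    best_gain, best_threshold = 0.0, None
    for i in range(1, n):
        # A threshold can only sit between two distinct predictor values.
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        gain = parent - (len(left) / n) * gini(left) - (len(right) / n) * gini(right)
        if gain > best_gain:
            best_gain = gain
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_threshold, best_gain
```

Every candidate threshold is evaluated here, which is the per-node cost that a bound on the achievable index value per predictor could help avoid.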

2.
3.
4.
This paper deals with the construction of optimum partitions for a clustering criterion which is based on a convex function of the class centroids, as a generalization of the classical SSQ clustering criterion for n data points. We formulate a dual optimality problem involving two sets of variables and derive a maximum-support-plane (MSP) algorithm for constructing a (sub-)optimum partition as a generalized k-means algorithm. We present various modifications of the basic criterion and describe the corresponding MSP algorithm. It is shown that the method can also be used for solving optimality problems in classical statistics (maximizing Csiszár's divergence) and for simultaneous classification of the rows and columns of a contingency table.

5.
6.
7.
Problems with censored data arise quite frequently in reliability applications, where estimation of the reliability function is usually of concern. The reliability function estimators proposed by Kaplan and Meier (1958) and Breslow (1972) are generally used when dealing with censored data. These estimators are known to be asymptotically unbiased, uniformly strongly consistent, and weakly convergent to the same Gaussian process when properly normalized. In this paper we study the properties of the smoothed Kaplan-Meier estimator with a suitable kernel function. The smooth estimator is compared with the Kaplan-Meier and Breslow estimators for large sample sizes, giving an exact expression for an appropriately normalized difference of the mean square error (MSE) of the two estimators. This quantifies the deficiency of the Kaplan-Meier estimator in comparison to the smoothed version. We also obtain a non-asymptotic bound on an expected 1-type error under weak conditions. Some simulations are carried out to examine the performance of the suggested method.
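For orientation, here is a minimal sketch of the ordinary (unsmoothed) Kaplan-Meier product-limit estimator for right-censored data. The kernel-smoothed version studied in the paper is not reproduced; the function name `kaplan_meier` and the 1/0 event coding are assumptions made for this illustration.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the reliability (survival) function.

    times  : observed times (failure or censoring)
    events : 1 if the time is an observed failure, 0 if it is censored
    Returns a list of (t, S(t)) pairs at each distinct failure time.
    """
    data = sorted(zip(times, events), key=lambda p: p[0])
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        removed = 0
        # Group all observations that share the same time t.
        while i < len(data) and data[i][0] == t:
            failures += data[i][1]
            removed += 1
            i += 1
        if failures > 0:
            survival *= 1.0 - failures / at_risk
            curve.append((t, survival))
        # Failed and censored units both leave the risk set after t.
        at_risk -= removed
    return curve
```

With times `[1, 2, 3, 4]` and events `[1, 0, 1, 1]`, the censored unit at time 2 reduces the risk set without producing a step in the curve.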

8.
9.
10.
11.
12.
The adoption of control charts can be traced to the classic text by Shewhart (1931) and has been championed by many writers since then, including Deming (1982). Numerous other texts and publications stress the continuing importance of this area. While tables of key Shewhart control chart parameters are extremely useful, they are easily lost or mislaid and can sometimes be difficult to interpret. To address this issue, spreadsheet code is implemented to produce all the key control chart factors.
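To illustrate what computing such factors can look like, the sketch below derives a few of the standard three-sigma constants for X-bar and S charts from the usual closed-form expression for c4. It is written in Python rather than the spreadsheet code the abstract refers to, covers only a subset of the factors, and uses the hypothetical function names `c4` and `xbar_s_factors`.

```python
import math


def c4(n):
    """Bias-correction constant for the sample standard deviation, subgroup size n >= 2."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2.0) / math.gamma((n - 1) / 2.0)


def xbar_s_factors(n):
    """Three-sigma factors for X-bar and S charts (illustrative subset)."""
    c = c4(n)
    spread = 3.0 * math.sqrt(1.0 - c * c) / c
    return {
        "c4": c,
        "A3": 3.0 / (c * math.sqrt(n)),   # X-bar limits: xbarbar +/- A3 * sbar
        "B3": max(0.0, 1.0 - spread),     # lower S-chart limit factor, floored at 0
        "B4": 1.0 + spread,               # upper S-chart limit factor
    }
```

For a subgroup size of 5 this gives values that round to the commonly tabulated c4 = 0.9400, A3 = 1.427, B3 = 0 and B4 = 2.089.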

13.
14.
For observable indicators with ordered categories one can assume underlying latent variables following certain marginal distributions. Transforming the latent variables changes their marginal distributions but not the observable qualitative indicators. The joint distribution of the latent variables can be constructed from the marginal distributions. There is a broad class of multivariate distributions for which the observable indicators are equivalent. By choosing the multivariate normal distribution from this class we can analyse a linear relationship between the transformed latent variables. This leads to latent structural equation models. Estimation of these latter models is therefore more general than the distributional assumption might initially suggest. Robustness of the estimation procedure is also discussed for deviations from this distribution family. Using ordinal business survey data of the German Ifo-institute we test the efficiency of firms' price expectations implied by the rational expectation hypothesis.

15.
16.
17.
18.
For a sequence of strictly stationary random fields that are uniformly ρ′-mixing and satisfy a Lindeberg condition, a central limit theorem is obtained for sequences of “rectangular” sums from the given random fields. The “Lindeberg CLT” is then used to prove a CLT for some kernel estimators of probability density for some strictly stationary random fields satisfying ρ′-mixing, and whose probability density and joint densities are absolutely continuous.

19.
20.