1.

PERFORMANCE OF A LOCALIZED TREE SPLITTING CRITERION IN TREE AVERAGING

Alexandra P. Bremner Ross H. Taplin 《Australian & New Zealand Journal of Statistics》2004,46(4):583-599

This paper explores the performance of the local splitting criterion devised by Bremner & Taplin for classification and regression trees when multiple trees are averaged to improve performance. The criterion is compared with the deviance used by Clark & Pregibon's method, which is a global splitting criterion typically used to grow trees. The paper considers multiple trees generated by randomly selecting splits with probability proportional to the likelihood for the split, and by bagging where bootstrap samples from the data are used to grow trees. The superiority of the localized splitting criterion often persists when multiple trees are grown and averaged for six datasets. Tree averaging is known to be advantageous when the trees being averaged produce different predictions, and this can be achieved by choosing splits where the splitting criterion is locally optimal. The paper shows that use of locally optimal splits gives promising results in conjunction with both local and global splitting criteria, and with and without random selection of splits. The paper also extends the local splitting criterion to accommodate categorical predictors.
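As a rough illustration of the bagging setup mentioned in this abstract, the sketch below grows one-split regression trees (stumps) on bootstrap samples and averages their predictions. It uses the global deviance (residual sum of squares) as the splitting criterion; it does not implement the localized Bremner & Taplin criterion, whose definition is not given here. All function names are hypothetical.

```python
import random

def sse(values):
    """Residual sum of squares around the mean (Gaussian deviance of one node)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def fit_stump(xs, ys):
    """Return (threshold, left_mean, right_mean) minimizing the total deviance."""
    pairs = sorted(zip(xs, ys))
    best = None  # (deviance, threshold, left_mean, right_mean)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between tied x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        dev = sse(left) + sse(right)
        if best is None or dev < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (dev, threshold, sum(left) / len(left), sum(right) / len(right))
    if best is None:
        # All x values tied: no split is possible, so predict the overall mean.
        mean = sum(ys) / len(ys)
        return (float("inf"), mean, mean)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def bagged_stumps(xs, ys, n_trees=25, seed=0):
    """Grow one stump per bootstrap sample (sampling rows with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps

def predict_bagged(stumps, x):
    """Tree averaging: the mean of the individual stump predictions."""
    return sum(predict_stump(s, x) for s in stumps) / len(stumps)
```

Because each bootstrap sample can place the split at a different threshold, the averaged prediction is smoother than that of any single stump, which is the source of the benefit the abstract attributes to averaging trees that produce different predictions.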

2.

Block diagrams and splitting criteria for classification trees

Paul C. Taylor Bernard W. Silverman 《Statistics and Computing》1993,3(4):147-161

Various aspects of the classification tree methodology of Breiman et al. (1984) are discussed. A method of displaying classification trees, called block diagrams, is developed. Block diagrams give a clear presentation of the classification, and are useful both to point out features of the particular data set under consideration and also to highlight deficiencies in the classification method being used. Various splitting criteria are discussed; the usual Gini-Simpson criterion presents difficulties when there is a relatively large number of classes and improved splitting criteria are obtained. One particular improvement is the introduction of adaptive anti-end-cut factors that take advantage of highly asymmetrical splits where appropriate. They use the number and mix of classes in the current node of the tree to identify whether or not it is likely to be advantageous to create a very small offspring node. A number of data sets are used as examples.

3.

Performance of localized regression tree splitting criteria on data with discontinuities

Alexandra P. Bremner Ross H. Taplin 《Australian & New Zealand Journal of Statistics》2004,46(3):367-381

Properties of the localized regression tree splitting criterion, described in Bremner & Taplin (2002) and referred to as the BT method, are explored in this paper and compared to those of Clark & Pregibon's (1992) criterion (the CP method). These properties indicate why the BT method can result in superior trees. This paper shows that the BT method exhibits a weak bias towards edge splits, and the CP method exhibits a strong bias towards central splits in the presence of main effects. A third criterion, called the SM method, that exhibits no bias towards a particular split position is introduced. The SM method is a modification of the BT method that uses more symmetric local means. The BT and SM methods are more likely to split at a discontinuity than the CP method because of their relatively low bias towards particular split positions. The paper shows that the BT and SM methods can be used to discover discontinuities in the data, and that they offer a way of producing a variety of different trees for examination or for tree averaging methods.

4.

Probit and Logit Model Selection

Guo Chen Hiroki Tsurumi 《Communications in Statistics - Theory and Methods》2013,42(1):159-175

Monte Carlo experiments are conducted to compare the Bayesian and sample theory model selection criteria in choosing the univariate probit and logit models. We use five criteria: the deviance information criterion (DIC), predictive deviance information criterion (PDIC), Akaike information criterion (AIC), and weighted and unweighted sums of squared errors. The first two criteria are Bayesian while the others are sample theory criteria. The results show that if data are balanced, none of the model selection criteria considered in this article can distinguish the probit and logit models. If data are unbalanced and the sample size is large, the DIC and AIC choose the correct models better than the other criteria. We show that if unbalanced binary data are generated by a leptokurtic distribution, the logit model is preferred over the probit model. The probit model is preferred if unbalanced data are generated by a platykurtic distribution. We apply the model selection criteria to the probit and logit models that link the ups and downs of the returns on S&P500 to the crude oil price.

5.

A class of closeness criteria

Robert L. Fountain 《Communications in Statistics - Theory and Methods》2013,42(8):1865-1883

A generalized class of closeness criteria for the pairwise comparison of estimators is defined. This class contains an infinite number of members including Pitman's measure of closeness and at least one transitive criterion. Several specific members of the class are examined, and their relationships to Rao concentration and stochastic domination are shown. Graphical and analytical characterizations are shown for these members of the class. Examples are given which illustrate the behavior of some of these criteria.

6.

Fisher Information in Record Values and Their Concomitants: A Comparison of Two Sampling Schemes

Morteza Amini M. Razmkhah 《Communications in Statistics - Theory and Methods》2013,42(7):1298-1314

Two sampling designs via inverse sampling for generating record data and their concomitants are considered: single sample and multisample. The purpose here is to compare the Fisher information in these two sampling schemes. It is shown that the comparison criterion depends on the underlying distribution. Several general results are established for some parametric families and their well-known subclasses such as location-scale and shape families, the exponential family and the proportional (reversed) hazard model. The Farlie-Gumbel-Morgenstern (FGM) family, the bivariate normal distribution, and some other common bivariate distributions are considered as examples for illustration and are classified according to this criterion.

7.

A reflected feature space for CART

D. C. Wickramarachchi B. L. Robertson M. Reale C. J. Price J. A. Brown 《Australian & New Zealand Journal of Statistics》2019,61(3):380-391

We present an algorithm for learning oblique decision trees, called HHCART(G). Our decision tree combines learning concepts from two classification trees, HHCART and Geometric Decision Tree (GDT). HHCART(G) is a simplified HHCART algorithm that uses linear structure in the training examples, captured by a modified GDT angle bisector, to define splitting directions. At each node, we reflect the training examples with respect to the modified angle bisector to align this linear structure with the coordinate axes. Searching axis parallel splits in this reflected feature space provides an efficient and effective way of finding oblique splits in the original feature space. Our method is much simpler than HHCART because it only considers one reflected feature space for node splitting. HHCART considers multiple reflected feature spaces for node splitting making it more computationally intensive to build. Experimental results show that HHCART(G) is an effective classifier, producing compact trees with similar or better results than several other decision trees, including GDT and HHCART trees.

8.

Three optimal cut-point selection criteria based on sensitivity and specificity with user-defined weights

Dan-Ling Li Jun-Xiang Peng Chong-Yang Duan Ju-Min Deng 《Communications in Statistics - Theory and Methods》2019,48(3):742-754

Methods: Based on the index S (S = SENSITIVITY (SEN) × SPECIFICITY (SPE)), the new weighted product index S_w is defined as S_w = SEN^{2w} × SPE^{2(1-w)}, where 0 ≤ w ≤ 1. S_w is developed as a new tool for selecting the optimal cut point in ROC analysis and is compared with two other commonly used criteria.

Results: Comparing the optimal cut point for the three criteria, the range of variation of the optimal cut point is widest for the maximized weighted Youden index criterion and narrowest for the weighted closest-to-(0,1) criterion, while the range for the weighted product index S_w criterion lies between those of the other two.
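The weighted product index can be sketched in a few lines. The example below is a minimal illustration, assuming that a score at or above the cut point is called positive and that the candidate cut points are the observed scores; the function names and the toy data are hypothetical and are not taken from the paper.

```python
def sens_spec(scores, labels, cut):
    """Sensitivity and specificity when 'score >= cut' is called positive.

    Assumes labels are 0/1 and that both classes are present.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cut)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cut)
    return tp / (tp + fn), tn / (tn + fp)

def weighted_product_index(sen, spe, w):
    """S_w = SEN^(2w) * SPE^(2(1-w)); w = 0.5 recovers S = SEN * SPE."""
    return sen ** (2 * w) * spe ** (2 * (1 - w))

def best_cut(scores, labels, w=0.5):
    """Observed score that maximizes S_w (the first maximizer wins ties)."""
    candidates = sorted(set(scores))
    return max(
        candidates,
        key=lambda c: weighted_product_index(*sens_spec(scores, labels, c), w),
    )

# Toy data: higher scores tend to belong to the positive class.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
sensitivity_first = best_cut(scores, labels, w=0.8)  # favors sensitivity
specificity_first = best_cut(scores, labels, w=0.2)  # favors specificity
```

On this toy data a larger w moves the chosen cut point lower (protecting sensitivity), while a smaller w moves it higher (protecting specificity).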

9.

On Sample Allocation in Multivariate Surveys

Marcin Kozak 《Communications in Statistics - Simulation and Computation》2013,42(4):901-910

The problem of a sample allocation between strata in the case of multiparameter surveys is considered in this article. There are several multivariate sample allocation methods and, moreover, several criteria to deal with in such a case. A maximum coefficient of variation of estimators of the population mean of characters under study is taken as the optimality criterion. This article contains a study on a group of the methods that are easy to implement and do not need complex numerical computation; however, they all are approximate. Five such methods are presented and compared using a simulation study. Finally, it is shown which methods should be considered when designing a survey in which the multivariate sample allocation is to be involved.

10.

A PRESS statistic for working correlation structure selection in generalized estimating equations

Gul Inan Mahbub A. H. M. Latif John Preisser 《Journal of applied statistics》2019,46(4):621-637

Generalized estimating equations (GEE) is one of the most commonly used methods for regression analysis of longitudinal data, especially with discrete outcomes. The GEE method accounts for the association among the responses of a subject through a working correlation matrix and its correct specification ensures efficient estimation of the regression parameters in the marginal mean regression model. This study proposes a predicted residual sum of squares (PRESS) statistic as a working correlation selection criterion in GEE. A simulation study is designed to assess the performance of the proposed GEE PRESS criterion and to compare its performance with its counterpart criteria in the literature. The results show that the GEE PRESS criterion has better performance than the weighted error sum of squares SC criterion in all cases but is surpassed in performance by the Gaussian pseudo-likelihood criterion. Lastly, the working correlation selection criteria are illustrated with data from the Coronary Artery Risk Development in Young Adults study.

11.

Boosting with Bayesian stumps

David G. T. Denison 《Statistics and Computing》2001,11(2):171-178

Boosting is a new, powerful method for classification. It is an iterative procedure which successively classifies a weighted version of the sample, and then reweights this sample dependent on how successful the classification was. In this paper we review some of the commonly used methods for performing boosting and show how they can be fit into a Bayesian setup at each iteration of the algorithm. We demonstrate how this formulation gives rise to a new splitting criterion when using a domain-partitioning classification method such as a decision tree. Further we can improve the predictive performance of simple decision trees, known as stumps, by using a posterior weighted average of them to classify at each step of the algorithm, rather than just a single stump. The main advantage of this approach is to reduce the number of boosting iterations required to produce a good classifier with only a minimal increase in the computational complexity of the algorithm.
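The "classify, then reweight" loop described in this abstract can be illustrated with a single reweighting step of standard discrete AdaBoost. This is a sketch of the classical update only, assuming labels and predictions coded as -1/+1; it is not the Bayesian formulation or the posterior-weighted stump average proposed in the paper, and the function name is hypothetical.

```python
import math

def adaboost_reweight(weights, labels, predictions):
    """One discrete AdaBoost update; returns (alpha, normalized new weights)."""
    total = sum(weights)
    # Weighted misclassification rate of the current weak classifier.
    err = sum(w for w, y, p in zip(weights, labels, predictions) if y != p) / total
    if err <= 0.0 or err >= 0.5:
        raise ValueError("weak classifier must have weighted error in (0, 0.5)")
    # Vote of this classifier in the final weighted majority.
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Misclassified points (y * p = -1) gain weight; correct ones lose weight.
    raw = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, labels, predictions)]
    norm = sum(raw)
    return alpha, [w / norm for w in raw]

# Four equally weighted points; the weak classifier gets the third one wrong.
alpha, new_weights = adaboost_reweight(
    [0.25, 0.25, 0.25, 0.25], [1, 1, -1, -1], [1, 1, 1, -1]
)
```

After the update the misclassified point carries half of the total weight, so the next weak classifier is pushed to get it right.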

12.

Asymptotic biases of information and cross-validation criteria under canonical parametrization

Haruhiko Ogasawara 《Communications in Statistics - Theory and Methods》2019,48(4):964-985

An asymptotic expansion of the cross-validation criterion (CVC) using the Kullback-Leibler distance is derived when the leave-k-out method is used and when parameters are estimated by the weighted score method. By this expansion, the asymptotic bias of the Takeuchi information criterion (TIC) is derived as well as that of the CVC. Under canonical parametrization in the exponential family of distributions when maximum likelihood estimation is used, the magnitudes of the asymptotic biases of the Akaike information criterion (AIC) and CVC are shown to be smaller than that of the TIC. Examples in typical statistical distributions are shown.

13.

Regression model selection—a residual likelihood approach

Peide Shi Chih-Ling Tsai 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2002,64(2):237-252

Summary. We obtain the residual information criterion RIC, a selection criterion based on the residual log-likelihood, for regression models including classical regression models, Box–Cox transformation models, weighted regression models and regression models with autoregressive moving average errors. We show that RIC is a consistent criterion, and that simulation studies for each of the four models indicate that RIC provides better model order choices than the Akaike information criterion, corrected Akaike information criterion, final prediction error, C_p and R_adj², except when the sample size is small and the signal-to-noise ratio is weak. In this case, none of the criteria performs well. Monte Carlo results also show that RIC is superior to the consistent Bayesian information criterion BIC when the signal-to-noise ratio is not weak, and it is comparable with BIC when the signal-to-noise ratio is weak and the sample size is large.
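For reference, three of the benchmark criteria named in this abstract have simple closed forms in terms of the maximized log-likelihood. The sketch below computes AIC, corrected AIC and BIC using their standard textbook definitions; RIC itself is not implemented because its formula is not given in the abstract. The helper for the Gaussian log-likelihood assumes the error variance is estimated by maximum likelihood, and all function names are hypothetical.

```python
import math

def gaussian_loglik(residuals):
    """Maximized Gaussian log-likelihood given the fitted residuals."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n  # ML estimate of the error variance
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)

def aic(loglik, k):
    """Akaike information criterion; k counts all estimated parameters."""
    return 2.0 * k - 2.0 * loglik

def aicc(loglik, k, n):
    """Corrected AIC; requires n > k + 1."""
    return aic(loglik, k) + 2.0 * k * (k + 1) / (n - k - 1)

def bic(loglik, k, n):
    """Bayesian information criterion."""
    return k * math.log(n) - 2.0 * loglik
```

For all three criteria the candidate model with the smallest value is selected; BIC penalizes extra parameters more heavily than AIC once n exceeds about 7, since log(n) > 2 there.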

14.

A new criterion of confidence set estimation: Improvement of the Neyman shortness

《Journal of statistical planning and inference》1998,69(2):329-338

In past studies various criteria have been proposed for evaluating the performance of a confidence set. However, each of these criteria often causes some unsatisfactory results even for the standard models such as location model, scale model and multinormal model. In this article, we propose a new criterion so that the procedure of the confidence set estimation based on the criterion can lead to a desirable confidence set at least for the above models. The approach is on the basis of an improvement of the Neyman shortness according to two steps. The first step is some kind of theoretical improvement, referring to a proposal of Pratt. As a result, we get a solution to Pratt's paradox. In the second step, we adopt a kind of robust or minimax procedure without sticking to the uniform optimality. In conclusion, it is shown that the procedure based on our criterion produces a desirable and acceptable confidence set.

15.

OPTIMAL EXPERIMENTAL DESIGNS FOR MULTILEVEL MODELS WITH COVARIATES

《Communications in Statistics - Theory and Methods》2013,42(12):2683-2697

In this paper optimal experimental designs for multilevel models with covariates and two levels of nesting are considered. Multilevel models are used to describe the relationship between an outcome variable and a treatment condition and covariate. It is assumed that the outcome variable is measured on a continuous scale. D-optimality and L-optimality are chosen as the optimality criteria. It is shown that pre-stratification on the covariate leads to a more efficient design and that the person level is the optimal level of randomization. Furthermore, optimal sample sizes are given and it is shown that these do not depend on the optimality criterion when randomization is done at the group level.

16.

On minimax designs when there are two candidate models

《Journal of Statistical Computation and Simulation》2012,82(11):841-862

This work is motivated by the need to find experimental designs which are robust under different model assumptions. We measure robustness by calculating a measure of design efficiency with respect to a design optimality criterion and say that a design is robust if it is reasonably efficient under different model scenarios. We discuss two design criteria and an algorithm which can be used to obtain robust designs. The first criterion employs a Bayesian-type approach by putting a prior or weight on each candidate model and possibly priors on the corresponding model parameters. We define the first criterion as the expected value of the design efficiency over the priors. The second design criterion we study is the minimax design which minimizes the worst value of a design criterion over all candidate models. We establish conditions when these two criteria are equivalent when there are two candidate models. We apply our findings to the area of accelerated life testing and perform sensitivity analysis of designs with respect to priors and misspecification of planning values.

17.

Performances of Bayesian model selection criteria for generalized linear models with non-ignorably missing covariates

《Journal of Statistical Computation and Simulation》2012,82(8):1670-1691

This article deals with model comparison as an essential part of generalized linear modelling in the presence of covariates missing not at random (MNAR). We provide an evaluation of the performances of some of the popular model selection criteria, particularly of deviance information criterion (DIC) and weighted L (WL) measure, for comparison among a set of candidate MNAR models. In addition, we seek to provide deviance and quadratic loss-based model selection criteria with alternative penalty terms targeting directly the MNAR models. This work is motivated by the need in the literature to understand the performances of these important model selection criteria for comparison among a set of MNAR models. A Monte Carlo simulation experiment is designed to assess the finite sample performances of these model selection criteria in the context of interest under different scenarios for missingness amounts. Some naturally driven DIC and WL extensions are also discussed and evaluated.

18.

An improved C_p criterion for spline smoothing

Chun-Shu Chen Hsin-Cheng Huang 《Journal of statistical planning and inference》2011,141(1):445-452

Spline smoothing is a popular technique for curve fitting, in which selection of the smoothing parameter is crucial. Many methods such as Mallows’ C_p, generalized maximum likelihood (GML), and the extended exponential (EE) criterion have been proposed to select this parameter. Although C_p is shown to be asymptotically optimal, it is usually outperformed by other selection criteria for small to moderate sample sizes due to its high variability. On the other hand, GML and EE are more stable than C_p, but they do not possess the same asymptotic optimality as C_p. Instead of selecting this smoothing parameter directly using C_p, we propose to select among a small class of selection criteria based on Stein's unbiased risk estimate (SURE). Due to the selection effect, the spline estimate obtained from a criterion in this class is nonlinear. Thus, the effective degrees of freedom in SURE contains an adjustment term in addition to the trace of the smoothing matrix, which cannot be ignored in small to moderate sample sizes. The resulting criterion, which we call adaptive C_p, is shown to have an analytic expression, and hence can be efficiently computed. Moreover, adaptive C_p is not only demonstrated to be superior and more stable than commonly used selection criteria in a simulation study, but also shown to possess the same asymptotic optimality as C_p.

19.

On estimating the common mean in two normal distributions after a preliminary test for equality of variances

Kazuhiro Ohtani 《Communications in Statistics - Theory and Methods》2013,42(7):1977-1993

Given two random samples of equal size from two normal distributions with common mean but possibly different variances, we examine the sampling performance of the pre-test estimator for the common mean after a preliminary test for equality of variances. It is shown that when the alternative in the pre-test is one-sided, the Graybill-Deal estimator is dominated by the pre-test estimator if the critical value is chosen appropriately. It is also shown that all estimators, the grand mean, the Graybill-Deal estimator and the pre-test estimator, are admissible when the alternative in the pre-test is two-sided. The optimal critical values in the two-sided pre-test are sought based on the minimax regret and the minimum average risk criteria, and it is shown that the Graybill-Deal estimator is most preferable under the minimum average risk criterion when the alternative in the pre-test is two-sided.
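The three estimators compared in this abstract can be sketched as follows. The Graybill-Deal estimator weights each sample mean by the inverse of its estimated variance; the pre-test estimator shown here uses a simplified two-sided rule with a single user-supplied critical value for the variance ratio, which is an assumption made for illustration and not necessarily the exact test used in the paper. Function names are hypothetical.

```python
from statistics import mean, variance

def grand_mean(x, y):
    """Pooled mean of both samples, appropriate when the variances are equal."""
    return mean(list(x) + list(y))

def graybill_deal(x, y):
    """Weight each sample mean by the inverse of its estimated variance s^2 / n."""
    wx = len(x) / variance(x)
    wy = len(y) / variance(y)
    return (wx * mean(x) + wy * mean(y)) / (wx + wy)

def pretest_estimator(x, y, critical_value):
    """Use the grand mean unless the variance-ratio pre-test rejects equality.

    Assumption: equal sample sizes and a symmetric two-sided rule, so the larger
    of the two variance ratios is compared with one critical value (> 1).
    """
    ratio = variance(x) / variance(y)
    f_stat = max(ratio, 1.0 / ratio)
    if f_stat > critical_value:
        return graybill_deal(x, y)
    return grand_mean(x, y)

x = [1, 2, 3]  # sample variance 1
y = [0, 4, 8]  # sample variance 16, so the variance ratio is 16
keep_equal = pretest_estimator(x, y, critical_value=19.0)    # falls back to the grand mean
reject_equal = pretest_estimator(x, y, critical_value=4.0)   # switches to Graybill-Deal
```

The critical value controls how readily the procedure abandons the grand mean, which is why the abstract treats its choice (minimax regret or minimum average risk) as part of the estimator's design.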

20.

Comment on “Issues Involved With the Seasonal Adjustment of Economic Time Series” by William R. Bell and Steven C. Hillmer

Christopher A. Sims 《Journal of Business & Economic Statistics》2013,31(1):92-94

This article investigates the existence of multiple regimes in the U.S. economy during the 1923–1991 period. A technique known as regression tree analysis is applied to search for splits in the data, if any exist, rather than choosing a splitting point a priori as has been done in previous work. Using this technique, strong evidence for the existence of nonlinear behavior of U.S. output is found over this period. Monte Carlo results are presented to assess the significance of the regime changes that are found.