期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Aggregating classifiers via Rademacher–Walsh polynomials

《Journal of Statistical Computation and Simulation》2012,82(6):1187-1199

Methods are proposed to combine several individual classifiers in order to develop more accurate classification rules. The proposed algorithm uses Rademacher–Walsh polynomials to combine M (≥2) individual classifiers in a nonlinear way. The resulting classifier is optimal in the sense that its misclassification error rate is always less than, or equal to, that of each constituent classifier. A number of numerical examples (based on both real and simulated data) are also given. These examples demonstrate some new, and far-reaching, benefits of working with combined classifiers. 相似文献

2.

On Maximum Depth and Related Classifiers 总被引：1，自引：0，他引：1

ANIL K. GHOSH PROBAL CHAUDHURI 《Scandinavian Journal of Statistics》2005,32(2):327-350

Abstract. Over the last couple of decades, data depth has emerged as a powerful exploratory and inferential tool for multivariate data analysis with wide-spread applications. This paper investigates the possible use of different notions of data depth in non-parametric discriminant analysis. First, we consider the situation where the prior probabilities of the competing populations are all equal and investigate classifiers that assign an observation to the population with respect to which it has the maximum location depth. We propose a different depth-based classification technique for unequal prior problems, which is also useful for equal prior cases, especially when the populations have different scatters and shapes. We use some simulated data sets as well as some benchmark real examples to evaluate the performance of these depth-based classifiers. Large sample behaviour of the misclassification rates of these depth-based non-parametric classifiers have been derived under appropriate regularity conditions. 相似文献

3.

Ideal bootstrap estimation of expected prediction error for k-nearest neighbor classifiers: Applications for classification and error assessment

Steele Brian M. Patterson David A. 《Statistics and Computing》2000,10(4):349-355

Euclidean distance k-nearest neighbor (k-NN) classifiers are simple nonparametric classification rules. Bootstrap methods, widely used for estimating the expected prediction error of classification rules, are motivated by the objective of calculating the ideal bootstrap estimate of expected prediction error. In practice, bootstrap methods use Monte Carlo resampling to estimate the ideal bootstrap estimate because exact calculation is generally intractable. In this article, we present analytical formulae for exact calculation of the ideal bootstrap estimate of expected prediction error for k-NN classifiers and propose a new weighted k-NN classifier based on resampling ideas. The resampling-weighted k-NN classifier replaces the k-NN posterior probability estimates by their expectations under resampling and predicts an unclassified covariate as belonging to the group with the largest resampling expectation. A simulation study and an application involving remotely sensed data show that the resampling-weighted k-NN classifier compares favorably to unweighted and distance-weighted k-NN classifiers. 相似文献

4.

An ensemble method for concept drift in nonstationary environment

Dhouha Mejri Riadh Khanchel Mohamed Limam 《Journal of Statistical Computation and Simulation》2013,83(6):1115-1128

Most statistical and data-mining algorithms assume that data come from a stationary distribution. However, in many real-world classification tasks, data arrive over time and the target concept to be learned from the data stream may change accordingly. Many algorithms have been proposed for learning drifting concepts. To deal with the problem of learning when the distribution generating the data changes over time, dynamic weighted majority was proposed as an ensemble method for concept drift. Unfortunately, this technique considers neither the age of the classifiers in the ensemble nor their past correct classification. In this paper, we propose a method that takes into account expert's age as well as its contribution to the global algorithm's accuracy. We evaluate the effectiveness of our proposed method by using m classifiers and training a collection of n-fold partitioning of the data. Experimental results on a benchmark data set show that our method outperforms existing ones. 相似文献

5.

Discriminant analysis of multivariate repeated measures data with a Kronecker product structured covariance matrices

Mirosław Krzyśko Michał Skorzybut 《Statistical Papers》2009,50(4):817-835

This paper proposes new classifiers under the assumption of multivariate normality for multivariate repeated measures data with Kronecker product covariance structures. These classifiers are especially effective when the number of observations is not large enough to estimate the covariance matrices, and thus the traditional classifiers fail. Computational scheme for maximum likelihood estimates of required class parameters are also given. The quality of these new classifiers are examined on some real data. 相似文献

6.

Monitoring Variation in a Multivariate Process When the Dimension is Large Relative to the Sample Size

Robert L. Mason Youn-Min Chou John C. Young 《统计学通讯:理论与方法》2013,42(6):939-951

A control procedure is presented for monitoring changes in variation for a multivariate normal process in a Phase II operation where the subgroup size, m, is less than p, the number of variates. The methodology is based on a form of Wilk' statistic, which can be expressed as a function of the ratio of the determinants of two separate estimates of the covariance matrix. One estimate is based on the historical data set from Phase I and the other is based on an augmented data set including new data obtained in Phase II. The proposed statistic is shown to be distributed as the product of independent beta distributions that can be approximated using either a chi-square or F-distribution. An ARL study of the statistic is presented for a range of conditions for the population covariance matrix. Cases are considered where a p-variate process is being monitored using a sample of m observations per subgroup and m < p. Data from an industrial multivariate process is used to illustrate the proposed technique. 相似文献

7.

Estimating a Single Shelf‐Life for Multiple Batches

Andreas Kiermeier Arūnas P. Verbyla Richard G. Jarrett 《Australian & New Zealand Journal of Statistics》2012,54(3):343-358

Pharmaceutical companies and manufacturers of food products are legally required to label the product's shelf‐life on the packaging. For pharmaceutical products the requirements for how to determine the shelf‐life are highly regulated. However, the regulatory documents do not specifically define the shelf‐life. Instead, the definition is implied through the estimation procedure. In this paper, the focus is on the situation where multiple batches are used to determine a label shelf‐life that is applicable to all future batches. Consequently, the short‐comings of existing estimation approaches are discussed. These are then addressed by proposing a new definition of shelf‐life and label shelf‐life, where greater emphasis is placed on within and between batch variability. Furthermore, an estimation approach is developed and the properties of this approach are illustrated using a simulation study. Finally, the approach is applied to real data. 相似文献

8.

Some small-sample properties of some recently proposed multivariate outlier detection techniques

《Journal of Statistical Computation and Simulation》2012,82(8):701-712

Recently, several new robust multivariate estimators of location and scatter have been proposed that provide new and improved methods for detecting multivariate outliers. But for small sample sizes, there are no results on how these new multivariate outlier detection techniques compare in terms of p _n, their outside rate per observation (the expected proportion of points declared outliers) under normality. And there are no results comparing their ability to detect truly unusual points based on the model that generated the data. Moreover, there are no results comparing these methods to two fairly new techniques that do not rely on some robust covariance matrix. It is found that for an approach based on the orthogonal Gnanadesikan–Kettenring estimator, p _n can be very unsatisfactory with small sample sizes, but a simple modification gives much more satisfactory results. Similar problems were found when using the median ball algorithm, but a modification proved to be unsatisfactory. The translated-biweights (TBS) estimator generally performs well with a sample size of n≥20 and when dealing with p-variate data where p≤5. But with p=8 it can be unsatisfactory, even with n=200. A projection method as well the minimum generalized variance method generally perform best, but with p≤5 conditions where the TBS method is preferable are described. In terms of detecting truly unusual points, the methods can differ substantially depending on where the outliers happen to be, the number of outliers present, and the correlations among the variables. 相似文献

9.

Have I seen you before? Principles of Bayesian predictive classification revisited

Jukka Corander Yaqiong Cui Timo Koski Jukka Sirén 《Statistics and Computing》2013,23(1):59-73

A general inductive Bayesian classification framework is considered using a simultaneous predictive distribution for test items. We introduce a principle of generative supervised and semi-supervised classification based on marginalizing the joint posterior distribution of labels for all test items. The simultaneous and marginalized classifiers arise under different loss functions, while both acknowledge jointly all uncertainty about the labels of test items and the generating probability measures of the classes. We illustrate for data from multiple finite alphabets that such classifiers achieve higher correct classification rates than a standard marginal predictive classifier which labels all test items independently, when training data are sparse. In the supervised case for multiple finite alphabets the simultaneous and the marginal classifiers are proven to become equal under generalized exchangeability when the amount of training data increases. Hence, the marginal classifier can be interpreted as an asymptotic approximation to the simultaneous classifier for finite sets of training data. It is also shown that such convergence is not guaranteed in the semi-supervised setting, where the marginal classifier does not provide a consistent approximation. 相似文献

10.

Nonparametric empirical Bayes for the Dirichlet process mixture model

Jon D. McAuliffe David M. Blei Michael I. Jordan 《Statistics and Computing》2006,16(1):5-14

The Dirichlet process prior allows flexible nonparametric mixture modeling. The number of mixture components is not specified in advance and can grow as new data arrive. However, analyses based on the Dirichlet process prior are sensitive to the choice of the parameters, including an infinite-dimensional distributional parameter G ₀. Most previous applications have either fixed G ₀ as a member of a parametric family or treated G ₀ in a Bayesian fashion, using parametric prior specifications. In contrast, we have developed an adaptive nonparametric method for constructing smooth estimates of G ₀. We combine this method with a technique for estimating α, the other Dirichlet process parameter, that is inspired by an existing characterization of its maximum-likelihood estimator. Together, these estimation procedures yield a flexible empirical Bayes treatment of Dirichlet process mixtures. Such a treatment is useful in situations where smooth point estimates of G ₀ are of intrinsic interest, or where the structure of G ₀ cannot be conveniently modeled with the usual parametric prior families. Analysis of simulated and real-world datasets illustrates the robustness of this approach. 相似文献

11.

Admissible stochastic complexity models for classification problems

Padhraic Smyth 《Statistics and Computing》1992,2(2):97-104

In this paper we investigate the application of stochastic complexity theory to classification problems. In particular, we define the notion of admissible models as a function of problem complexity, the number of data pointsN, and prior belief. This allows us to derive general bounds relating classifier complexity with data-dependent parameters such as sample size, class entropy and the optimal Bayes error rate. We discuss the application of these results to a variety of problems, including decision tree classifiers, Markov models for image segmentation, and feedforward multilayer neural network classifiers. 相似文献

12.

On the estimation of the proportion of positives in a sequence of screening experiments

Mikelis Bickis Susana Bleuer Daniel Krewski 《Revue canadienne de statistique》1996,24(1):1-15

A metaanalytic estimator of the proportion of positives in a sequence of screening experiments is proposed. The distribution-free estimator is based on the empirical distribution of P-values from individual experiments, which is uniform under the global null hypotheses of no positives in the sequence of experiments performed. Under certain regularity conditions, the proportion of positives corresponds to the derivative of this distribution under the alternative hypothesis of the existence of some positives. The statistical properties of the estimator are established, including its bias, variance, and rate of convergence to normality. Optimal estimators with minimum mean squared error are also developed under specific alternative hypotheses. The application of the proposed methods is illustrated using data from a sequence of screening experiments with chemicals to determine their carcinogenic potential. 相似文献

13.

Solving Survo puzzles using matrix combinatorial products

Kimmo Vehkalahti Reijo Sund 《Journal of Statistical Computation and Simulation》2015,85(13):2666-2681

We investigate combinatorial matrix problems that are related to restricted integer partitions. They arise from Survo puzzles, where the basic task is to fill an m×n table by integers 1, 2,?…?, mn, so that each number appears only once, when the column sums and the row sums are fixed. We present a new computational method for solving Survo puzzles with binary matrices that are recoded and combined using the Hadamard, Kronecker, and Khatri–Rao products. The idea of our method is based on using the matrix interpreter and other data analytic tools of Survo R, which represents the newest generation of the Survo computing environment, recently implemented as a multiplatform, open source R package. We illustrate our method with detailed examples. 相似文献

14.

A Positive False Discovery Rate Convergence Result

Igor Melnykov 《统计学通讯:理论与方法》2013,42(23):4239-4246

The positive false discovery rate (pFDR) is the average proportion of false rejections given that the overall number of rejections is greater than zero. Assuming that the proportion of true null hypotheses, proportion of false positives, and proportion of true positives all converge pointwise, the pFDR converges to a continuous limit uniformly over all significance levels. We are showing that the uniform convergence still holds given a weaker assumption that the proportion of true positives converges in L ¹. 相似文献

15.

SEQUENTIAL,BOTTOM‐UP VARIABLE SELECTION FOR HIGH‐DIMENSIONAL CLASSIFICATION

Peter Hall Hugh Miller 《Australian & New Zealand Journal of Statistics》2010,52(4):403-421

Most methods for variable selection work from the top down and steadily remove features until only a small number remain. They often rely on a predictive model, and there are usually significant disconnections in the sequence of methodologies that leads from the training samples to the choice of the predictor, then to variable selection, then to choice of a classifier, and finally to classification of a new data vector. In this paper we suggest a bottom‐up approach that brings the choices of variable selector and classifier closer together, by basing the variable selector directly on the classifier, removing the need to involve predictive methods in the classification decision, and enabling the direct and transparent comparison of different classifiers in a given problem. Specifically, we suggest ‘wrapper methods’, determined by classifier type, for choosing variables that minimize the classification error rate. This approach is particularly useful for exploring relationships among the variables that are chosen for the classifier. It reveals which variables have a high degree of leverage for correct classification using different classifiers; it shows which variables operate in relative isolation, and which are important mainly in conjunction with others; it permits quantification of the authority with which variables are selected; and it generally leads to a reduced number of variables for classification, in comparison with alternative approaches based on prediction. 相似文献

16.

Statistical considerations in bioequivalence of two area under the concentration–time curves obtained from serial sampling data

Steven Y. Hua D. L. Hawkins Jihao Zhou 《Journal of applied statistics》2013,40(5):1140-1154

In this paper, we study the bioequivalence (BE) inference problem motivated by pharmacokinetic data that were collected using the serial sampling technique. In serial sampling designs, subjects are independently assigned to one of the two drugs; each subject can be sampled only once, and data are collected at K distinct timepoints from multiple subjects. We consider design and hypothesis testing for the parameter of interest: the area under the concentration–time curve (AUC). Decision rules in demonstrating BE were established using an equivalence test for either the ratio or logarithmic difference of two AUCs. The proposed t-test can deal with cases where two AUCs have unequal variances. To control for the type I error rate, the involved degrees-of-freedom were adjusted using Satterthwaite's approximation. A power formula was derived to allow the determination of necessary sample sizes. Simulation results show that, when the two AUCs have unequal variances, the type I error rate is better controlled by the proposed method compared with a method that only handles equal variances. We also propose an unequal subject allocation method that improves the power relative to that of the equal and symmetric allocation. The methods are illustrated using practical examples. 相似文献

17.

Inference and Decision Making for 21st-Century Drug Development and Approval

Stephen J. Ruberg Frank E. Harrell Jr. Margaret Gamalo-Siebers Lisa LaVange J. Jack Lee Karen Price 《The American statistician》2019,73(1):319-327

ABSTRACT

The cost and time of pharmaceutical drug development continue to grow at rates that many say are unsustainable. These trends have enormous impact on what treatments get to patients, when they get them and how they are used. The statistical framework for supporting decisions in regulated clinical development of new medicines has followed a traditional path of frequentist methodology. Trials using hypothesis tests of “no treatment effect” are done routinely, and the p-value < 0.05 is often the determinant of what constitutes a “successful” trial. Many drugs fail in clinical development, adding to the cost of new medicines, and some evidence points blame at the deficiencies of the frequentist paradigm. An unknown number effective medicines may have been abandoned because trials were declared “unsuccessful” due to a p-value exceeding 0.05. Recently, the Bayesian paradigm has shown utility in the clinical drug development process for its probability-based inference. We argue for a Bayesian approach that employs data from other trials as a “prior” for Phase 3 trials so that synthesized evidence across trials can be utilized to compute probability statements that are valuable for understanding the magnitude of treatment effect. Such a Bayesian paradigm provides a promising framework for improving statistical inference and regulatory decision making. 相似文献

18.

Logistic regression and latent class models for estimating positivities in diagnostic assays with poor resolution

Tom Smith Penelope Vounatsou 《统计学通讯:理论与方法》2013,42(7):1677-1700

In biomedical research and diagnostic practice it is common to classify objects dichotomously based on continuous observations (x) measuring some form of biological activity, where some proportion of the objects have a level of activity above background. In this paper, we consider the problem of estimating the proportion of positive objects for a typical assay where:(i) the distribution of x for positive objects is unknown. although (ii) the risk of positivity is known to be a monotonic function of x:and (iii) x has been measured for a set of negative control objects. Monte Carlo simulations evaluating four alternative estimators of the positivity, including novel non-parametric mixture decompositions, indicate that where the positives and negatives have distributions of x with a moderate degree of overlap, a non-parametric decomposition using a latent class model provides precise and close to unbiased estimates. The methods are illustrated using data from an autoradiography assay used in cell biology. 相似文献

19.

A robust prediction error criterion for pareto modelling of upper tails

Debbie J. Dupuis Maria‐Pia Victoria‐Feser 《Revue canadienne de statistique》2006,34(4):639-658

Estimation of the Pareto tail index from extreme order statistics is an important problem in many settings. The upper tail of the distribution, where data are sparse, is typically fitted with a model, such as the Pareto model, from which quantities such as probabilities associated with extreme events are deduced. The success of this procedure relies heavily not only on the choice of the estimator for the Pareto tail index but also on the procedure used to determine the number k of extreme order statistics that are used for the estimation. The authors develop a robust prediction error criterion for choosing k and estimating the Pareto index. A Monte Carlo study shows the good performance of the new estimator and the analysis of real data sets illustrates that a robust procedure for selection, and not just for estimation, is needed. 相似文献

20.

A nonparametric allocation scheme for classification based on transvariation probabilities

《Journal of Statistical Computation and Simulation》2012,82(8):977-987

In this paper, a nonparametric discriminant analysis procedure that is less sensitive than traditional procedures to deviations from the usual assumptions is proposed. The procedure uses the projection pursuit methodology where the projection index is the two-group transvariation probability. Montanari [A. Montanari, Linear discriminant analysis and transvariation, J. Classification 21 (2004), pp. 71–88] proposed and used this projection index to measure group separation but allocated the new observation using distances. Our procedure employs a method of allocation based on group–group transvariation probability to classify the new observation. A simulation study shows that the procedure proposed in this paper provides lower misclassification error rates than classical procedures like linear discriminant analysis and quadratic discriminant analysis and recent procedures like maximum depth and Montanari's transvariation-based classifiers, when the underlying distributions are skewed and/or the prior probabilities are unequal. 相似文献