1.

Group testing with test error as a function of concentration

Kevin C. Burns Carl A. Mauro 《Communications in Statistics - Theory and Methods》2013,42(10):2821-2837

A common assumption in group testing applications is that there is no test error, i.e., misclassification of a single item or a group of items cannot occur. Graff and Roeloffs (1972) have proposed a procedure applicable when there is a known probability of misclassification. We generalize their results to the situation where the probability of misclassification depends on the proportion of defective items in the group.
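
As a rough illustration of this setting (not the authors' procedure), the sketch below computes the probability that a group of k items tests positive when the chance of detecting a defective group depends on the proportion of defectives in it. The function names, the default specificity, and the linear `sensitivity` model are all assumptions made for this example.

```python
from math import comb

def sensitivity(fraction_defective, floor=0.6, ceiling=0.99):
    """Hypothetical dilution model: the detection probability rises
    linearly with the proportion of defective items in the group."""
    return floor + (ceiling - floor) * fraction_defective

def prob_group_positive(k, p, specificity=0.98, sens=sensitivity):
    """P(a group of k items tests positive) when each item is defective
    independently with probability p."""
    total = 0.0
    for d in range(k + 1):
        weight = comb(k, d) * p**d * (1 - p)**(k - d)  # P(exactly d defectives)
        if d == 0:
            total += weight * (1 - specificity)        # false positive on a clean group
        else:
            total += weight * sens(d / k)              # detection depends on d / k
    return total
```

Passing a constant function for `sens` recovers the usual fixed-error case, which makes it easy to compare the two assumptions for a given group size.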

2.

A Modified One-Sided Sequential Screening Procedure Based on Individual Misclassification Error

Shu-Fei Wu Ying-Po Lin Huei-Jiuan Lin 《Communications in Statistics - Simulation and Computation》2013,42(9):1754-1778

In this article, a modified procedure is proposed that simplifies the procedure of Tsai and Wu (2002; "Sequential screening procedure based on individual misclassification error", IIE Trans. 34:1079–1085) by weighting the screening variables only once instead of twice. The numerical comparison of the modified procedure with the old procedure shows that the total inspection cost of the modified procedure is very close to that of the old one. A theorem is derived to simplify the calculation of all desired probabilities and the expected costs when k screening variables are allocated into r stages. Finally, an example investigating the cycles to failure of silver-zinc batteries is given to illustrate the modified screening procedure.

3.

Optimal cut-off for rare events and unbalanced misclassification costs

Raffaella Calabrese 《Journal of applied statistics》2014,41(8):1678-1693

This paper develops a method for handling two-class classification problems with highly unbalanced class sizes and misclassification costs. When the class sizes are highly unbalanced and the minority class represents a rare event, conventional classification methods tend to strongly favour the majority class, resulting in very low detection of the minority class. A method is proposed to determine the optimal cut-off for asymmetric misclassification costs and for unbalanced class sizes. Monte Carlo simulations show that this proposal performs better than the method based on the notion of classification accuracy. Finally, the proposed method is applied to empirical data on Italian small and medium enterprises to classify them into default and non-default groups.
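
To make the idea of a cost-sensitive cut-off concrete, here is a minimal sketch of the textbook decision-theoretic threshold. It is not the method proposed in the paper; it assumes calibrated probabilities for the minority class, and every name in it is hypothetical.

```python
def cost_based_cutoff(cost_false_positive, cost_false_negative):
    """Cut-off on P(minority class | x) that minimizes expected cost,
    assuming the predicted probabilities are well calibrated."""
    return cost_false_positive / (cost_false_positive + cost_false_negative)

def classify(prob_minority, cutoff):
    """Predict the minority class (1) when the probability reaches the cut-off."""
    return 1 if prob_minority >= cutoff else 0

def total_cost(probs, labels, cutoff, cost_false_positive, cost_false_negative):
    """Sum the misclassification costs incurred at a given cut-off."""
    cost = 0.0
    for prob, label in zip(probs, labels):
        predicted = classify(prob, cutoff)
        if predicted == 1 and label == 0:
            cost += cost_false_positive
        elif predicted == 0 and label == 1:
            cost += cost_false_negative
    return cost
```

When a missed minority case costs nine times as much as a false alarm, the cut-off drops from 0.5 to 0.1, which is why an accuracy-driven cut-off can be a poor choice for rare events.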

4.

Smooth Nonparametric Allocation of Classification

Asheber Abebe Sai V. Nudurupati 《Communications in Statistics - Simulation and Computation》2013,42(5):694-709

A nonparametric discriminant analysis procedure that is robust to deviations from the usual assumptions is proposed. The procedure uses the projection pursuit methodology where the projection index is the two-group transvariation probability. We use allocation based on the centrality of the new point measured using a smooth version of point-group transvariation. It is shown that the new procedure provides lower misclassification error rates than competing methods for data from skewed heavy-tailed and skewed distributions as well as unequal training data sizes.

5.

The application of bias to discriminant analysis

Pasquale J. Di Pillo 《Communications in Statistics - Theory and Methods》2013,42(9):843-854

When classification rules are constructed using sample estimates, it is known that the probability of misclassification is not minimized. This article introduces a biased minimum X² rule to classify items from a multivariate normal population. Using the principle of variance reduction, the probability of misclassification is reduced when the biased procedure is employed. Results of sampling experiments over a broad range of conditions are provided to demonstrate this improvement.

6.

A three-population constrained discrimination procedure

David Patterson 《Communications in Statistics - Theory and Methods》2013,42(16):4771-4787

Classification rules with a reserve judgment option provide a way to satisfy constraints on the misclassification probabilities when there is a high degree of overlap among the populations. Constructing rules which maximize the probability of correct classification while satisfying such constraints is a difficult optimization problem. This paper uses the form of the optimal solution to develop a relatively simple and computationally fast method for three populations which has a nonparametric quality in controlling the misclassification probabilities. Simulations demonstrate that this procedure performs well.

7.

A classifier under the strongly spiked eigenvalue model in high-dimension, low-sample-size context

Aki Ishii 《Communications in Statistics - Theory and Methods》2020,49(7):1561-1577

We consider the classification of high-dimensional data under the strongly spiked eigenvalue (SSE) model. We create a new classification procedure on the basis of the high-dimensional eigenstructure in the high-dimension, low-sample-size context. We propose a distance-based classification procedure by using a data transformation. We also prove that our proposed classification procedure has the consistency property for misclassification rates. We discuss the performance of our classification procedure in simulations and real data analyses using microarray data sets.

8.

On a Multiple Three-Decision Problem for Comparing Several Treatments with the Best Control

Amar Nath Gill Anju Goyal Parminder Singh 《Communications in Statistics - Theory and Methods》2013,42(19):3432-3454

In this article, a multiple three-decision procedure is proposed to classify p (≥2) treatments as better or worse than the best of q (≥2) control treatments in a one-way layout. Critical constants required for the implementation of the proposed procedure are tabulated for some pre-specified values of the probability of no misclassification. The power function of the proposed procedure is defined, and the common sample sizes necessary to guarantee various pre-specified power levels are tabulated under two optimal allocation schemes. Finally, the implementation of the proposed methodology is demonstrated through numerical examples based on real-life data.

9.

Classification with discrete and continuous variables via general mixed-data models

A. R. de Leon A. Soo T. Williamson 《Journal of applied statistics》2011,38(5):1021-1032

We study the problem of classifying an individual into one of several populations based on mixed nominal, continuous, and ordinal data. Specifically, we obtain a classification procedure as an extension to the so-called location linear discriminant function, by specifying a general mixed-data model for the joint distribution of the mixed discrete and continuous variables. We outline methods for estimating misclassification error rates. Results of simulations of the performance of proposed classification rules in various settings vis-à-vis a robust mixed-data discrimination method are reported as well. We give an example utilizing data on croup in children.

10.

Goodness of fit tests with misclassified data

K.F. Cheng H.M. Hsueh T.H. Chien 《Communications in Statistics - Theory and Methods》2013,42(6):1379-1393

The most popular goodness of fit test for a multinomial distribution is the chi-square test. But this test is generally biased if observations are subject to misclassification. In this paper we shall discuss how to define a new test procedure when we have double sample data obtained from the true and fallible devices. An adjusted chi-square test based on the imputation method and the likelihood ratio test are considered. Asymptotically, these two procedures are equivalent. However, an example and simulation results show that the former procedure is not only computationally simpler but also more powerful under finite sample situations.

11.

On triple sampling schemes for estimating from binomial data with misclassification errors

Yosef Hochberg Aaron Tenenbein 《Communications in Statistics - Theory and Methods》2013,42(13):1523-1533

Previous work has been carried out on the use of double sampling schemes for inference from binomial data which are subject to misclassification. The double sampling scheme utilizes a sample of n units which are classified by both a fallible and a true device and another sample of n₂ units which are classified only by a fallible device. A triple sampling scheme incorporates an additional sample of n₁ units which are classified only by the true device. In this paper we apply this triple sampling to estimation from binomial data. First, estimation of a binomial proportion is discussed under different misclassification structures. Then, the problem of optimal allocation of sample sizes is discussed.

12.

Does the supplemental nutrition assistance program really increase obesity? The importance of accounting for misclassification errors

Achilleas Vassilopoulos Andreas C. Drichoutis Rodolfo M. Nayga Jr. Panagiotis Lazaridis 《Journal of applied statistics》2018,45(12):2269-2278

The prevalence of obesity among US citizens has grown rapidly over the last few decades, especially among low-income individuals. This has led to questions about the effectiveness of nutritional assistance programs such as the Supplemental Nutrition Assistance Program (SNAP). Previous results on the effect of SNAP participation on obesity are mixed. These findings are however based on the assumption that participation status can be accurately observed, despite significant misclassification errors reported in the literature. Using propensity score matching, we conclude that there seems to be a positive effect of SNAP participation on obesity rates for female participants and no such effect for males, a result that is consistent with several previous studies. However, an extensive sensitivity analysis reveals that the positive effect for females is sensitive to misclassification errors and to the conditional independence assumption. Thus analogous findings should also be used with caution unless examined under the prism of classification errors and of other assumptions used for the identification of causal parameters.

13.

Comparison of combinatoric and likelihood ratio procedures for classifying samples

Charles L Dunn 《Communications in Statistics - Theory and Methods》2013,42(21):2361-2377

A random sample is to be classified as coming from one of two normally distributed populations with known parameters. Combinatoric procedures which classify the sample based upon the sample mean(s) and variance(s) are described for the univariate and multivariate problems. Comparisons of misclassification probabilities are made between the combinatoric and the likelihood ratio procedure in the univariate case and between two alternative combinatoric procedures in the bivariate case.

14.

Inference for misclassified multinomial data with covariates

Shijia Wang Liangliang Wang Tim B. Swartz 《Revue canadienne de statistique》2020,48(4):655-669

This article considers multinomial data subject to misclassification in the presence of covariates which affect both the misclassification probabilities and the true classification probabilities. A subset of the data may be subject to a secondary measurement according to an infallible classifier. Computations are carried out in a Bayesian setting where it is seen that the prior has an important role in driving the inference. In addition, a new and less problematic definition of nonidentifiability is introduced and is referred to as hierarchical nonidentifiability.

15.

Weighted Support Vector Machine Using k-Means Clustering

Sungwan Bang 《Communications in Statistics - Simulation and Computation》2013,42(10):2307-2324

The support vector machine (SVM) has been successfully applied to various classification areas with great flexibility and a high level of classification accuracy. However, the SVM is not suitable for the classification of large or imbalanced datasets because of significant computational problems and a classification bias toward the dominant class. The SVM combined with the k-means clustering (KM-SVM) is a fast algorithm developed to accelerate both the training and the prediction of SVM classifiers by using the cluster centers obtained from the k-means clustering. In the KM-SVM algorithm, however, the penalty of misclassification is treated equally for each cluster center even though the contributions of different cluster centers to the classification can be different. In order to improve classification accuracy, we propose the WKM–SVM algorithm which imposes different penalties for the misclassification of cluster centers by using the number of data points within each cluster as a weight. As an extension of the WKM–SVM, the recovery process based on WKM–SVM is suggested to incorporate the information near the optimal boundary. Furthermore, the proposed WKM–SVM can be successfully applied to imbalanced datasets with an appropriate weighting strategy. Experiments show the effectiveness of our proposed methods.

16.

Posterior probability estimators in classification simulations

Gregory T. Schwemer Olive Jean Dunn 《Communications in Statistics - Simulation and Computation》2013,42(2):133-140

Class specific stratified posterior probability estimators of misclassification probabilities in discriminant analysis simulations are introduced. These estimators afford a significant variance reduction over the usual count estimators. Sufficient conditions for a variance reduction are given. The stratified posterior probability estimator is generalized to other class specific expectations.

17.

Rank procedures in many population forced discrimination problems

Tie-Hua Ng Ronald H. Randles 《Communications in Statistics - Theory and Methods》2013,42(17):1943-1959

In this paper the rank method for forced discrimination in two population problems, introduced by Randles, Broffitt, Ramberg and Hogg (1978), is extended to cover settings involving more than two populations. Several methods of ranking are compared to the normal theory procedure in a Monte Carlo study. Asymptotic theory is included which confirms that the rank method does balance the limiting probabilities of misclassification in a two population setting.

18.

Item reliability based on a paucity of item failures

Alan J. Gross Philip F. Rust 《Communications in Statistics - Theory and Methods》2013,42(10):2981-2990

When item reliability is high, lot acceptance sampling is likely to uncover no defective units. Nevertheless, the reliability lower confidence limits are often disappointing. The use of a Bayesian procedure with certain beta priors produces tighter confidence limits. The case where one observed defective is permitted in accepting a lot is also examined. Numerical results for both cases are presented. In addition, the impact of item misclassification is addressed.
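
The zero-failure case can be sketched in closed form. The example below compares the classical one-sided lower limit with a Bayesian lower credible limit under an assumed Beta(a, 1) prior on reliability. The abstract does not say which beta priors the authors use, so the prior family and the function names here are illustrative assumptions only.

```python
def classical_lower_limit(n, alpha=0.05):
    """Exact one-sided lower confidence limit for reliability after
    n trials with zero failures: the value R solving R**n = alpha."""
    return alpha ** (1.0 / n)

def bayes_lower_limit(n, a, alpha=0.05):
    """Lower credible limit under an assumed Beta(a, 1) prior.
    With zero failures in n trials the posterior is Beta(a + n, 1),
    whose CDF is R**(a + n), so the alpha-quantile has a closed form."""
    return alpha ** (1.0 / (a + n))
```

With n = 59 and alpha = 0.05 the classical limit is just above 0.95, and any prior with a > 0 raises the limit because the prior acts like a additional failure-free trials.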

19.

Bayesian sample size determination for estimating binomial parameters from data subject to misclassification (cited by: 1; self-citations: 0; citations by others: 1)

E. Rahme L. Joseph & T. W. Gyorkos 《Journal of the Royal Statistical Society. Series C, Applied statistics》2000,49(1):119-128

We investigate the sample size problem when a binomial parameter is to be estimated, but some degree of misclassification is possible. The problem is especially challenging when the degree to which misclassification occurs is not exactly known. Motivated by a Canadian survey of the prevalence of toxoplasmosis infection in pregnant women, we examine the situation where it is desired that a marginal posterior credible interval for the prevalence of width w has coverage 1−α, using a Bayesian sample size criterion. The degree to which the misclassification probabilities are known a priori can have a very large effect on sample size requirements, and in some cases achieving a coverage of 1−α is impossible, even with an infinite sample size. Therefore, investigators must carefully evaluate the degree to which misclassification can occur when estimating sample size requirements.
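
One way to see why an infinite sample may not be enough is to look at how the observed positive rate depends on prevalence, sensitivity, and specificity together. The sketch below uses the standard frequentist correction (often called the Rogan-Gladen estimator), not the Bayesian criterion of the paper, and its function names are hypothetical.

```python
def apparent_prevalence(p, sensitivity, specificity):
    """Probability that a unit is classified positive by a fallible test."""
    return p * sensitivity + (1 - p) * (1 - specificity)

def corrected_prevalence(q, sensitivity, specificity):
    """Invert apparent_prevalence for a known sensitivity and specificity,
    clipping the result to the interval [0, 1]."""
    denom = sensitivity + specificity - 1
    if denom <= 0:
        raise ValueError("the test must perform better than chance")
    return min(1.0, max(0.0, (q - (1 - specificity)) / denom))
```

An observed positive rate of 0.2 corresponds to a prevalence of 0.125 when sensitivity and specificity are both 0.9, but to a prevalence of essentially zero when specificity is 0.8. More data pins down the observed rate, not which of these scenarios is true, so uncertainty about the misclassification probabilities carries straight through to the prevalence.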

20.

Dynamic latent trait models with mixed hidden Markov structure for mixed longitudinal outcomes

Yue Zhang Kiros Berhane 《Journal of applied statistics》2016,43(4):704-720

We propose a general Bayesian joint modeling approach to model mixed longitudinal outcomes from the exponential family for taking into account any differential misclassification that may exist among categorical outcomes. Under this framework, outcomes observed without measurement error are related to latent trait variables through generalized linear mixed effect models. The misclassified outcomes are related to the latent class variables, which represent unobserved real states, using mixed hidden Markov models (MHMMs). In addition to enabling the estimation of parameters in prevalence, transition and misclassification probabilities, MHMMs capture cluster level heterogeneity. A transition modeling structure allows the latent trait and latent class variables to depend on observed predictors at the same time period and also on latent trait and latent class variables at previous time periods for each individual. Simulation studies are conducted to make comparisons with traditional models in order to illustrate the gains from the proposed approach. The new approach is applied to data from the Southern California Children's Health Study to jointly model questionnaire-based asthma state and multiple lung function measurements in order to gain better insight about the underlying biological mechanism that governs the inter-relationship between asthma state and lung function development.