1.

A Subset Selection Procedure for Selecting the Exponential Population Having the Longest Mean Lifetime When the Guarantee Times are the Same

《Communications in Statistics - Theory and Methods》2013,42(7):1555-1569

ABSTRACT

Consider k (≥ 2) independent exponential populations Π₁, Π₂, …, Π_k, having the common unknown location parameter μ ∈ (−∞, ∞) (also called the guarantee time) and unknown scale parameters σ₁, σ₂, …, σ_k, respectively (also called the remaining mean lifetimes after the completion of guarantee times), σ_i > 0, i = 1, 2, …, k. Assume that the correct ordering between σ₁, σ₂, …, σ_k is not known a priori and let σ_[i], i = 1, 2, …, k, denote the ith smallest of the σ_j's, so that σ_[1] ≤ σ_[2] ≤ ··· ≤ σ_[k]. Then Θ_i = μ + σ_i is the mean lifetime of Π_i, i = 1, 2, …, k. Let Θ_[1] ≤ Θ_[2] ≤ ··· ≤ Θ_[k] denote the ranked values of the Θ_j's, so that Θ_[i] = μ + σ_[i], i = 1, 2, …, k, and let Π_(i) denote the unknown population associated with the ith smallest mean lifetime Θ_[i] = μ + σ_[i], i = 1, 2, …, k. Based on independent random samples from the k populations, we propose a selection procedure for the goal of selecting the population having the longest mean lifetime Θ_[k] (called the “best” population), under the subset selection formulation. Tables for the implementation of the proposed selection procedure are provided. It is established that the proposed subset selection procedure is monotone for a general k (≥ 2). For k = 2, we consider the loss measured by the size of the selected subset and establish that the proposed subset selection procedure is minimax among selection procedures that satisfy a certain probability requirement (called the P*-condition) for the inclusion of the best population in the selected subset.
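As a rough illustration of the model in this abstract, the Python sketch below simulates k two-parameter exponential populations with a common guarantee time μ and checks that the mean lifetime of each is close to μ + σ_i. The estimators used (the overall sample minimum for the common guarantee time, and the mean excess over it for each scale) and all numeric values are assumptions made for illustration only; the abstract does not state which statistics the proposed procedure is built on.

```python
import random


def sample_lifetimes(mu, sigma, n, rng):
    """Draw n lifetimes: guarantee time mu plus an exponential with mean sigma."""
    # random.expovariate takes the rate, i.e. 1 / mean.
    return [mu + rng.expovariate(1.0 / sigma) for _ in range(n)]


def estimate_parameters(samples_by_population):
    """Assumed estimators: overall minimum for mu, mean excess over it for each sigma_i."""
    mu_hat = min(min(s) for s in samples_by_population)
    sigma_hats = [sum(s) / len(s) - mu_hat for s in samples_by_population]
    return mu_hat, sigma_hats


rng = random.Random(0)
mu = 5.0                      # hypothetical common guarantee time
sigmas = [1.0, 2.0, 3.0]      # hypothetical scale parameters
samples = [sample_lifetimes(mu, s, 50_000, rng) for s in sigmas]

mu_hat, sigma_hats = estimate_parameters(samples)
mean_lifetimes = [mu_hat + s for s in sigma_hats]   # estimates of Theta_i = mu + sigma_i
best_index = max(range(len(sigmas)), key=lambda i: mean_lifetimes[i])
```

Because μ is common to all populations, ranking by estimated mean lifetime is the same as ranking by estimated scale, which is why the "best" population here is simply the one with the largest σ_i.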

2.

A Restricted Subset Selection Rule for Selecting at Least One of the t Best Normal Populations in Terms of Their Means When Their Common Variance is Known,Case II

Pinyuen Chen Lifang Hsu 《Communications in Statistics - Theory and Methods》2014,43(10-12):2250-2259

Consider k (≥ 2) normal populations with unknown means μ₁, …, μ_k, and a common known variance σ². Let μ_[1] ≤ ··· ≤ μ_[k] denote the ordered μ_i. The populations associated with the t (1 ≤ t ≤ k − 1) largest means are called the t best populations. Hsu and Panchapakesan (2004) proposed and investigated a procedure R_HP for selecting a nonempty subset of the k populations whose size is at most m (1 ≤ m ≤ k − t) so that at least one of the t best populations is included in the selected subset with a minimum guaranteed probability P* whenever μ_[k − t + 1] − μ_[k − t] ≥ δ*, where P* and δ* are specified in advance of the experiment. This probability requirement is known as the indifference-zone probability requirement. In the present article, we investigate the same procedure R_HP for the same goal as before but when k − t < m ≤ k − 1 so that at least one of the t best populations is included in the selected subset with a minimum guaranteed probability P* whatever be the configuration of the unknown μ_i. The probability requirement in this latter case is termed the subset selection probability requirement. Santner (1976) proposed and investigated a different procedure (R_S) based on samples of size n from each of the populations, considering both cases, 1 ≤ m ≤ k − t and k − t < m ≤ k. The special case of t = 1 was earlier studied by Gupta and Santner (1973) and Hsu and Panchapakesan (2002) for their respective procedures.

3.

New procedures for selecting the k-best exponential populations

Allen C. K. Ng 《Communications in Statistics - Simulation and Computation》2017,46(3):2176-2184

Suppose exponential populations π_i with parameters (μ_i,σ_i) (i = 1, 2, …, K) are given. The σ_i can be unknown and unequal. This article discusses how to select the k (≥1) best populations. Under the subset selection formulation, a one-stage procedure is proposed. Under the indifference zone formulation, a two-stage procedure is proposed. An appealing feature of these procedures is that no statistical tables are needed for their implementation.

4.

Subset Selection Toward Optimizing the Best Performance at a Second Stage

Chaim Meyer Ehrman Abba Krieger Klaus J. Miescke 《Journal of Business & Economic Statistics》2013,31(2):295-303

In the search for the best of n candidates, two-stage procedures of the following type are in common use. In a first stage, weak candidates are removed, and the subset of promising candidates is then further examined. At a second stage, the best of the candidates in the subset is selected. In this article, optimization is not aimed at the parameter with largest value but rather at the best performance of the selected candidates at Stage 2. Under a normal model, a new procedure based on posterior percentiles is derived using a Bayes approach, where nonsymmetric normal (proper and improper) priors are applied. Comparisons are made with two other procedures frequently used in selection decisions. The three procedures and their performances are illustrated with data from a recent recruitment process at a Midwestern university.

5.

A nonparametric procedure for testing partially ranked data

Jyh-Shyang Wu 《Journal of nonparametric statistics》2017,29(2):213-230

In consumer preference studies, it is common to seek a complete ranking of a variety of, say N, alternatives or treatments. Unfortunately, as N increases, it becomes progressively more confusing and undesirable for respondents to rank all N alternatives simultaneously. Moreover, the investigators may only be interested in consumers’ top few choices. Therefore, it is desirable to accommodate the setting where each survey respondent ranks only her/his most preferred k (k?N) alternatives. In this paper, we propose a simple procedure to test the independence of N alternatives and the top-k ranks, such that the value of k can be predetermined before securing a set of partially ranked data or be at the discretion of the investigator in the presence of complete ranking data. The asymptotic distribution of the proposed test under root-n local alternatives is established. We demonstrate our procedure with two real data sets.

6.

SELECTION PROCEDURES FOR SCALE PARAMETERS USING TWO-SAMPLE U-STATISTICS

A.N. Gill G.P. Mehta 《Australian & New Zealand Journal of Statistics》1991,33(3):347-362

Let π₁, …, π_k be k independent populations having the same known quantile of order p (0 ≤ p ≤ 1) and let F_i(x) = F(x/θ_i) be the absolutely continuous cumulative distribution function of the ith population indexed by the scale parameter θ_i, i = 1, …, k. We propose subset selection procedures based on two-sample U-statistics for selecting a subset of k populations containing the one associated with the smallest scale parameter. These procedures are compared with the subset selection procedures based on two-sample linear rank statistics given by Gill & Mehta (1989) in the sense of Pitman asymptotic relative efficiency, with interesting results.

7.

Selecting the normal population with the smallest variance: A restricted subset selection rule

Elena M. Buzaianu Pinyuen Chen S. Panchapakesan 《Communications in Statistics - Theory and Methods》2017,46(16):7887-7901

Consider k (≥ 2) normal populations whose means are all known or unknown and whose variances are unknown. Let σ²_[1] ≤ ··· ≤ σ²_[k] denote the ordered variances. Our goal is to select a nonempty subset of the k populations whose size is at most m (1 ≤ m ≤ k − 1) so that the population associated with the smallest variance (called the best population) is included in the selected subset with a guaranteed minimum probability P* whenever σ²_[2]/σ²_[1] ≥ δ* > 1, where P* and δ* are specified in advance of the experiment. Based on samples of size n from each of the populations, we propose and investigate a procedure called R_BCP. We also derive some asymptotic results for our procedure. Some comparisons with an earlier available procedure are presented in terms of the average subset sizes for selected slippage configurations based on simulations. The results are illustrated by an example.
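The goal described in this abstract (a nonempty subset of size at most m that should contain the smallest-variance population) can be sketched with a generic restricted subset rule. This is not the R_BCP procedure itself, whose definition the abstract does not give: the rule below, the function name `restricted_subset`, and the constant `c` are all assumptions chosen only to show how the size cap m and a closeness threshold can interact.

```python
import statistics


def restricted_subset(samples, m, c):
    """Select at most m populations whose sample variance is within a factor c of the smallest.

    Hypothetical rule for illustration: c >= 1 would in practice be chosen to meet
    the P* requirement; here it is just a free parameter.
    """
    k = len(samples)
    if not 1 <= m <= k - 1:
        raise ValueError("m must satisfy 1 <= m <= k - 1")
    if c < 1:
        raise ValueError("c must be at least 1")
    variances = [statistics.variance(s) for s in samples]
    # Indices ordered from smallest to largest sample variance.
    order = sorted(range(k), key=lambda i: variances[i])
    smallest = variances[order[0]]
    # Keep only the m smallest, and of those only the ones close to the minimum.
    return [i for i in order[:m] if variances[i] <= c * smallest]


# Hand-made data with clearly separated spreads.
samples = [[0, 1, 2, 3], [0, 2, 4, 6], [0, 10, 20, 30]]
loose = restricted_subset(samples, m=2, c=5)
tight = restricted_subset(samples, m=2, c=2)
```

The population with the smallest sample variance always passes the threshold, so the selected subset is never empty, and slicing to the first m indices enforces the size restriction.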

8.

Single-stage sampling procedure of the t best populations under heteroscedasticity

Miin-Jye Wen Li-Ching Huang 《Communications in Statistics - Theory and Methods》2017,46(18):9265-9273

Given k (≥ 3) independent normal populations with unknown means and unknown and unequal variances, a single-stage sampling procedure to select the best t out of k populations is proposed and the procedure is completely independent of the unknown means and the unknown variances. For various combinations of k and probability requirement, tables of procedure parameters are provided for practitioners.

9.

An essentially complete class of multiple decision procedures

Shanti S. Gupta Deng-Yuang Huang 《Journal of statistical planning and inference》1980,4(2):115-121

Let π₁, …, π_k represent k (≥ 2) independent populations. The quality of the ith population π_i is characterized by a real-valued parameter θ_i, usually unknown. We define the best population in terms of a measure of separation between θ_i's. A selection of a subset containing the best population is called a correct selection (CS). We restrict attention to rules for which the size of the selected subset is controlled at a given point and the infimum of the probability of correct selection over the parameter space is maximized. The main theorem deals with construction of an essentially complete class of selection rules of the above type. Some classical subset selection rules are shown to belong to this class.

10.

Group selection for production yield among k manufacturing lines

Chen-ju Lin Wen Lea Pearn 《Journal of statistical planning and inference》2011,141(4):1510-1518

Producing qualified items or products is essential to meet the requirements preset by customers. Evaluation and selection of desired manufacturing lines become challenging tasks for decision makers. Production yield is one of the important factors in measuring production performance. The goal of this paper is to screen a group of manufacturing lines and identify the best one with the highest yield. For production lines with an extremely low fraction of defectives, the yield index, S_pk, is an efficient indicator of quality level. This paper considers the production selection problem by using S_pk to compare k (k>2) manufacturing lines. A subset is constructed to contain the production lines with the highest yield. A systematic approach that compares selected pairs of manufacturing lines in test order k, along with the Bonferroni method, is proposed to solve this problem. Each pair of production yields is compared by taking their ratio. The paper provides critical values and required sample sizes of the group selection procedure. An application example on evaluating four power inductor productions is presented to illustrate the practicality of the proposed approach.

11.

A new distribution-free k-sample test: Analysis of kernel density functionals

Su Chen 《Revue canadienne de statistique》2020,48(2):167-186

A novel distribution-free k-sample test of differences in location shifts based on the analysis of kernel density functional estimation is introduced and studied. The proposed test parallels one-way analysis of variance and the Kruskal–Wallis (KW) test aiming at testing locations of unknown distributions. In contrast to the rank (score)-transformed non-parametric approach, such as the KW test, the proposed F-test uses the measurement responses along with well-known kernel density estimation (KDE) to estimate the locations and construct the test statistic. A practical optimal bandwidth selection procedure is also provided. Our simulation studies and real data example indicate that the proposed analysis of kernel density functional estimate (ANDFE) test is superior to existing competitors for fat-tailed or heavy-tailed distributions when the k groups differ mainly in location rather than shape, especially with unbalanced data. ANDFE is also highly recommended when it is unclear whether test groups differ mainly in shape or location. The Canadian Journal of Statistics 48: 167–186; 2020 © 2019 Statistical Society of Canada

12.

Variable Selection for Support Vector Machines

Surette Bierman Sarel Steel 《Communications in Statistics - Simulation and Computation》2013,42(8):1640-1658

Consider using values of variables X₁, X₂, …, X_p to classify entities into one of two classes. Kernel-based procedures such as support vector machines (SVMs) are well suited for this task. In general, the classification accuracy of SVMs can be substantially improved if instead of all p candidate variables, a smaller subset of (say m) variables is used. A new two-step approach to variable selection for SVMs is therefore proposed: best variable subsets of size k = 1, 2, …, p are first identified, and then a new data-dependent criterion is used to determine a value for m. The new approach is evaluated in a Monte Carlo simulation study, and on a sample of data sets.

13.

A Resampling Approach for Under-estimating a Finite Population Total from a Censored Sample

《Communications in Statistics - Theory and Methods》2013,42(12):2305-2320

Abstract

Suppose a finite population of N objects, each of which has an unknown value μ_i ≥ 0, i = 1, …, N, of a nonnegative characteristic of interest. A random sample has been drawn, but the μ-values have been observed only for a selected subset of the sample. The subset selection procedure has been somewhat obscure, and thus the subsample is censored rather than random. Despite that, a reliable lower bound for the population total (the sum of all μ_i) is required which uses the statistical information contained in the data. We propose a resampling procedure to construct an under-estimate of the population total. We also consider the case when the objects of the population have unequal sampling probabilities, in particular when the population is divided into a small number of strata with constant probabilities within each stratum. A real data example illustrates the method.

14.

Some multiple decision problems in analysis of variance

Shanti S. Gupta Deng-Yuan Huang 《Communications in Statistics - Theory and Methods》2013,42(11):1035-1054

In most practical situations to which the analysis of variance tests are applied, they do not supply the information that the experimenter aims at. If, for example, in one-way ANOVA the hypothesis is rejected in actual application of the F-test, the resulting conclusion that the true means θ₁,…,θ_k are not all equal, would by itself usually be insufficient to satisfy the experimenter. In fact his problems would begin at this stage. The experimenter may desire to select the “best” population or a subset of the “good” populations; he may like to rank the populations in order of “goodness” or he may like to draw some other inferences about the parameters of interest.

The extensive literature on selection and ranking procedures depends heavily on the use of independence between populations (block, treatments, etc.) in the analysis of variance. In practical applications, it is desirable to drop this assumption of independence and consider cases more general than the normal.

In the present paper, we derive a method to construct optimal (in some sense) selection procedures to select a nonempty subset of the k populations containing the best population as ranked in terms of θ_i’s, which control the size of the selected subset and which maximize the minimum average probability of selecting the best. We also consider the usual selection procedures in one-way ANOVA based on the generalized least squares estimates and apply the method to the two-way layout case. Some examples are discussed and some results on comparisons with other procedures are also obtained.

15.

The simultaneous confidence intervals for all distances from the extreme populations for two-parameter exponential populations based on the multiply type II censored samples

《Journal of Statistical Computation and Simulation》2012,82(2):137-165

Among k independent two-parameter exponential distributions which have the common scale parameter, the lower extreme population (LEP) is the one with the smallest location parameter and the upper extreme population (UEP) is the one with the largest location parameter. Given a multiply type II censored sample from each of these k independent two-parameter exponential distributions, 14 estimators for the unknown location parameters and the common unknown scale parameter are considered. Fourteen simultaneous confidence intervals (SCIs) for all distances from the extreme populations (UEP and LEP) and from the UEP from these k independent exponential distributions under the multiply type II censoring are proposed. The critical values are obtained by the Monte Carlo method. The optimal SCIs among 14 methods are identified based on the criteria of minimum confidence length for various censoring schemes. The subset selection procedures of extreme populations are also proposed and two numerical examples are given for illustration.

16.

An outlier detection scheme for dynamical sequential datasets

Shiliang Zhang Zonglin Ye Yanbin Zhang Xiali Hei 《Communications in Statistics - Simulation and Computation》2019,48(5):1450-1502

Outlier detection plays an important role in the pre-treatment of sequential datasets to obtain pure valuable data. This paper proposes an outlier detection scheme for dynamical sequential datasets. First, the concepts of forward outlier factor (FOF) and backward outlier factor (BOF) are employed to measure an object’s similarity shared with its sequentially adjacent objects. An object that shows no similarity with its sequential neighbors is labeled as a suspicious outlier, which will be treated subsequently to judge whether it is really an outlier in the dataset. Second, the sequentially adjacent suspicious outliers are defined as a suspicious outlier series (SOS), then the expected path representing the ideal transition path through the suspicious outliers in the SOS and the measured path representing the real path through all the objects in the SOS are employed, and the ratio of the length of the expected path to that of the measured path indicates whether there exist outliers in the SOS. Third, in the case that there exist outliers in the SOS, if there are N suspicious outliers in the SOS, then 2^N − 2 remaining paths will be generated by removing k (0 < k < N) suspicious outliers and sequentially connecting the remaining ones. The dynamical sequential outlier factor (DSOF) is employed to represent the ratio of the length of the measured path of the considered remaining path to that of the expected path of the corresponding SOS, and the degree to which the objects removed in a remaining path are outliers is indicated by the DSOF. The proposed outlier detection scheme is conducted from a dynamical perspective, and breaks the tight relation between being an outlier and being not similar to adjacent objects. Experiments are conducted to evaluate the effectiveness of the proposed scheme, and the experimental results verify that the proposed scheme has higher detection quality for sequential datasets. In addition, the proposed outlier detection scheme is not dependent on the size of the dataset and needs no prior information about the distribution of the data.
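The count of 2^N − 2 remaining paths in the third step follows from removing every nonempty proper subset of the N suspicious outliers. The Python sketch below enumerates those removals and measures the length of each remaining path. Representing objects as 2-D points and using Euclidean length are assumptions for illustration; the abstract does not define the expected path precisely, so the DSOF ratio itself is not computed here.

```python
import math
from itertools import combinations


def remaining_paths(sos):
    """Yield (removed, remaining) for every removal of k suspicious outliers, 0 < k < N."""
    n = len(sos)
    for k in range(1, n):
        for removed_idx in combinations(range(n), k):
            removed = set(removed_idx)
            # Keep the original sequential order of the objects that are left.
            remaining = [p for i, p in enumerate(sos) if i not in removed]
            yield [sos[i] for i in removed_idx], remaining


def path_length(points):
    """Length of the path that connects the points in sequence (Euclidean, assumed)."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


# Hypothetical suspicious outlier series with N = 4 objects.
sos = [(0.0, 0.0), (1.0, 5.0), (2.0, 0.0), (3.0, 0.0)]
paths = list(remaining_paths(sos))
lengths = [path_length(remaining) for _, remaining in paths]
```

For N = 4 this enumerates 14 candidate paths, so the exhaustive search grows exponentially with the length of the SOS.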

17.

A comparison of two procedures to select the best binomial population with sequential elimination of inferior populations

Bruce Levin Cheng-Shiun Leu 《Journal of statistical planning and inference》2007

We compare the selection procedure of Levin and Robbins [1981. Selecting the highest probability in binomial or multinomial trials. Proc. Nat. Acad. Sci. USA 78, 4663–4666.] with the procedure of Paulson [1994. Sequential procedures for selecting the best one of k Koopman–Darmois populations. Sequential Analysis 13, 207–220.] to identify the best of several binomial populations with sequential elimination of unlikely candidates. We point out situations in which the Levin–Robbins procedure dominates the Paulson procedure in terms of the duration of the experiment, the expected total number of observations, and the expected number of failures. Because the Levin–Robbins procedure is also easier to implement than Paulson's procedure and gives a tighter guarantee for the probability of correct selection, we conclude that it holds a competitive edge over Paulson's procedure.

18.

Strong Consistency of Reduced K‐means Clustering

Yoshikazu Terada 《Scandinavian Journal of Statistics》2014,41(4):913-931

Reduced k‐means clustering is a method for clustering objects in a low‐dimensional subspace. The advantage of this method is that both clustering of objects and low‐dimensional subspace reflecting the cluster structure are simultaneously obtained. In this paper, the relationship between conventional k‐means clustering and reduced k‐means clustering is discussed. Conditions ensuring almost sure convergence of the estimator of reduced k‐means clustering as unboundedly increasing sample size have been presented. The results for a more general model considering conventional k‐means clustering and reduced k‐means clustering are provided in this paper. Moreover, a consistent selection of the numbers of clusters and dimensions is described.

19.

A novel Hybrid RBF Neural Networks model as a forecaster

Oguz Akbilgic Hamparsum Bozdogan M. Erdal Balaban 《Statistics and Computing》2014,24(3):365-375

We introduce a novel predictive statistical modeling technique called Hybrid Radial Basis Function Neural Networks (HRBF-NN) as a forecaster. HRBF-NN is a flexible forecasting technique that integrates regression trees, ridge regression, with radial basis function (RBF) neural networks (NN). We develop a new computational procedure using model selection based on information-theoretic principles as the fitness function using the genetic algorithm (GA) to carry out subset selection of best predictors. Due to the dynamic and chaotic nature of the underlying stock market process, as is well known, the task of generating economically useful stock market forecasts is difficult, if not impossible. HRBF-NN is well suited for modeling complex non-linear relationships and dependencies between the stock indices. We propose HRBF-NN as our forecaster and a predictive modeling tool to study the daily movements of stock indices. We show numerical examples to determine a predictive relationship between the Istanbul Stock Exchange National 100 Index (ISE100) and seven other international stock market indices. We select the best subset of predictors by minimizing the information complexity (ICOMP) criterion as the fitness function within the GA. Using the best subset of variables we construct out-of-sample forecasts for the ISE100 index to determine the daily directional movements. Our results obtained demonstrate the utility and the flexibility of HRBF-NN as a clever predictive modeling tool for highly dependent and nonlinear data.

20.

Variable selection in linear regression based on ridge estimator

《Journal of Statistical Computation and Simulation》2012,82(11):1211-1224

In this article, we consider the problem of variable selection in linear regression when multicollinearity is present in the data. It is well known that in the presence of multicollinearity, the performance of the least squares (LS) estimator of regression parameters is not satisfactory. Consequently, subset selection methods, such as Mallows' Cp, which are based on LS estimates lead to selection of inadequate subsets. To overcome the problem of multicollinearity in subset selection, a new subset selection algorithm based on the ridge estimator is proposed. It is shown that the new algorithm is a better alternative to Mallows' Cp when the data exhibit multicollinearity.
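To make the contrast between the LS and ridge estimators concrete, the Python sketch below computes the standard ridge estimator (XᵀX + kI)⁻¹Xᵀy on two nearly collinear predictors and compares it with the LS fit (k = 0). It illustrates only the estimator; the subset selection algorithm built on it is not described in the abstract and is not reproduced here. The data, the ridge constant, and the helper names are assumptions for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ridge(X, y, k):
    """Ridge estimator (X'X + k I)^(-1) X'y; k = 0 gives the least squares estimator."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) + (k if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical data: the second predictor is almost a copy of the first.
X = [[1.0, 1.01], [2.0, 1.99], [3.0, 3.02], [4.0, 3.98], [5.0, 5.0]]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

beta_ls = ridge(X, y, k=0.0)
beta_ridge = ridge(X, y, k=1.0)
norm_ls = sum(b * b for b in beta_ls)
norm_ridge = sum(b * b for b in beta_ridge)
```

With collinear columns the LS coefficients can be large and unstable, while the added kI term shrinks the coefficient vector, which is the property a ridge-based selection criterion can exploit.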