Similar literature: 20 matching records found.
1.
This paper studies a sequential procedure R for selecting a random-size subset that contains the multinomial cell which has the smallest cell probability. The stopping rule of the proposed procedure R is the composite of the stopping rules of curtailed sampling, inverse sampling, and Ramey-Alam sampling. A result on the worst configuration is shown, and it is employed in computing the procedure parameters that guarantee certain probability requirements. Tables of these procedure parameters, the corresponding probability of correct selection, the expected sample size, and the expected subset size are given for comparison purposes.

2.
The problem of treatment-related withdrawals has not been fully addressed in the statistical literature. Statistical procedures are discussed which compare the efficacies of treatments when withdrawals may occur before the conclusion of a study. With the help of data-dependent scoring schemes, a unified test statistic has been developed which incorporates all available data, not just the last values, and adjusts for withdrawal patterns, the proportion of withdrawals, and the level of response prior to withdrawal. A randomization technique has been developed to compute an empirical significance level for each scoring system, under both the full data and the endpoint data, for a specified parameter configuration. The proposed methods have been applied to a subset (Allen Park Hospital) of VA study 127.
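Where the abstract's randomization technique is concrete enough to sketch, a hedged illustration helps: the sketch below permutes treatment labels to obtain an empirical significance level for a generic per-subject score. The score (a group mean difference) and all names are illustrative stand-ins, not the paper's scoring schemes.

```python
import numpy as np

def randomization_pvalue(scores, labels, n_perm=4999, seed=0):
    """Empirical significance level via randomization: permute the
    group labels and recompute the score difference each time.

    A sketch for a two-treatment comparison; `scores` stands in for
    whatever data-dependent score a scheme assigns to each subject.
    """
    rng = np.random.default_rng(seed)
    scores, labels = np.asarray(scores, float), np.asarray(labels)
    obs = scores[labels == 1].mean() - scores[labels == 0].mean()
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(labels)
        diff = scores[perm == 1].mean() - scores[perm == 0].mean()
        count += abs(diff) >= abs(obs)
    return (count + 1) / (n_perm + 1)   # empirical two-sided p-value

rng = np.random.default_rng(11)
scores = np.concatenate([rng.normal(0, 1, 20), rng.normal(0.9, 1, 20)])
labels = np.array([0] * 20 + [1] * 20)
print(randomization_pvalue(scores, labels))
```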

3.
In the past decade a number of fixed-sample methods have been developed for selecting the "best", or at least a "good", subset of variables in regression analysis. We are interested in deriving a sequential selection procedure that selects a subset of a random size including the best equations. Tables for an example are given at the end of this paper.

4.
This paper presents a selection procedure that combines Bechhofer's indifference zone selection and Gupta's subset selection approaches, by using a preference threshold. For normal populations with common known variance, a subset is selected of all populations that have sample sums within the distance of this threshold from the largest sample sum. We derive the minimal necessary sample size and the value for the preference threshold, in order to satisfy two probability requirements for correct selection, one related to indifference zone selection, the other to subset selection. Simulation studies are used to illustrate the method.
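The selection step itself is simple to illustrate. The sketch below selects every population whose sample sum lies within a preference threshold of the largest sample sum; the function name is ours, and the sample size and threshold that satisfy the paper's two probability requirements are assumed given rather than derived.

```python
import numpy as np

def preference_threshold_subset(samples, d):
    """Select all populations whose sample sum lies within a
    preference threshold d of the largest sample sum.

    A minimal sketch of the selection step only; the derivation of
    the sample size and of d is in the paper and is not reproduced.
    """
    sums = np.array([s.sum() for s in samples])
    best = sums.max()
    # Population i enters the subset when its sum is within d of the max.
    return [i for i, t in enumerate(sums) if t >= best - d]

# Illustrative use with k = 4 normal populations of common known variance.
rng = np.random.default_rng(0)
samples = [rng.normal(loc=mu, scale=1.0, size=20) for mu in (0.0, 0.1, 0.5, 0.6)]
print(preference_threshold_subset(samples, d=5.0))
```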

5.
Consider k (k ≥ 2) two-parameter Weibull populations. We want to select a subset of the populations, not exceeding m in size, such that the subset contains at least ? of the t best populations. We propose a procedure which uses either the maximum likelihood estimators or 'simplified' linear estimators of the parameters. The estimators are based on type II censored data. The populations are ranked by comparing their reliabilities at a certain fixed time. In selected cases the constants for the procedure are tabulated using Monte Carlo methods.
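The ranking step, comparing estimated reliabilities at a fixed time, can be sketched as follows, assuming complete (uncensored) data and scipy's Weibull MLE in place of the paper's type II censored estimators; the selection constants are not reproduced.

```python
import numpy as np
from scipy.stats import weibull_min

def reliability_ranking(samples, t0):
    """Rank two-parameter Weibull populations by estimated
    reliability R(t0) = P(T > t0) at a fixed time t0.

    A sketch using maximum likelihood fits on complete data.
    """
    rel = []
    for x in samples:
        # Fit shape c and scale, with the location fixed at 0.
        c, loc, scale = weibull_min.fit(x, floc=0)
        rel.append(weibull_min.sf(t0, c, loc=loc, scale=scale))
    return np.argsort(rel)[::-1], np.array(rel)   # best first

rng = np.random.default_rng(1)
samples = [weibull_min.rvs(c, scale=s, size=50, random_state=rng)
           for c, s in [(1.5, 2.0), (1.5, 2.5), (2.0, 2.0)]]
order, rel = reliability_ranking(samples, t0=1.0)
print(order, rel.round(3))
```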

6.
The stalactite plot for the detection of multivariate outliers
Detection of multiple outliers in multivariate data using Mahalanobis distances requires robust estimates of the means and covariance of the data. We obtain these by sequential construction of an outlier-free subset of the data, starting from a small random subset. The stalactite plot provides a cogent summary of suspected outliers as the subset size increases. The dependence on subset size can be virtually removed by a simulation-based normalization. Combined with probability plots and resampling procedures, the stalactite plot, particularly in its normalized form, leads to identification of multivariate outliers, even in the presence of appreciable masking.
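A minimal sketch of the sequential construction is given below: grow the subset one observation at a time, always keeping the points with the smallest Mahalanobis distances from the current subset's mean and covariance, and record all distances at each subset size (the raw material of a stalactite plot). The simulation-based normalization is omitted.

```python
import numpy as np

def forward_search_outliers(X, m0=5, seed=0):
    """Sequentially grow an outlier-free subset, recording the
    Mahalanobis distances of all points at each subset size.

    A minimal sketch of the construction described above.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    subset = rng.choice(n, size=m0, replace=False)   # small random start
    history = []
    for m in range(m0, n):
        mu = X[subset].mean(axis=0)
        cov = np.cov(X[subset], rowvar=False)
        # Squared Mahalanobis distance of every point from the subset fit.
        d2 = np.einsum('ij,jk,ik->i', X - mu, np.linalg.inv(cov), X - mu)
        history.append(d2)
        subset = np.argsort(d2)[:m + 1]              # grow by one point
    return np.array(history)  # rows: subset sizes; columns: observations

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(size=(95, 3)), rng.normal(loc=6.0, size=(5, 3))])
d2 = forward_search_outliers(X)
# Points whose distances stay large as the subset grows are suspect.
print(np.argsort(d2[-1])[-5:])
```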

7.

Dominance analysis is a procedure for measuring the importance of predictors in multiple regression analysis. We show that dominance analysis can be enhanced using a dynamic programming approach for the rank-ordering of predictors. Using customer satisfaction data from a call center operation, we demonstrate how the integration of dominance analysis with dynamic programming can provide a better understanding of predictor importance. As a cautionary note, we recommend careful reflection on the relationship between predictor importance and variable subset selection. We observed that slight changes in the selected predictor subset can have an impact on the importance rankings produced by a dominance analysis.
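As a hedged illustration of the dominance computation itself (not the paper's dynamic-programming enhancement), the sketch below computes general dominance weights by averaging each predictor's incremental R² over all subsets of the other predictors, first within each subset size and then across sizes.

```python
import numpy as np
from itertools import combinations

def general_dominance(X, y):
    """General dominance weights: average the incremental R^2 of
    each predictor over all subsets of the remaining predictors,
    within each subset size and then across sizes.

    A minimal sketch; feasible only for small predictor counts.
    """
    n, p = X.shape

    def r2(cols):
        Z = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
        resid = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
        return 1.0 - resid.var() / y.var()

    weights = np.zeros(p)
    for j in range(p):
        others = [c for c in range(p) if c != j]
        by_size = []
        for k in range(p):
            incs = [r2(list(s) + [j]) - r2(list(s))
                    for s in combinations(others, k)]
            by_size.append(np.mean(incs))
        weights[j] = np.mean(by_size)
    return weights

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 4))
y = 2 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=200)
print(general_dominance(X, y).round(3))
```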

8.
The problem of selecting a subset containing the largest of several location parameters is considered, and a Gupta-type selection rule based on sample medians is investigated for normal and double exponential populations. Numerical comparisons between rules based on medians and means of small samples are made for normal and contaminated normal populations, assuming the population means to be equally spaced. It appears that the rule based on sample means loses its superiority over the rule based on sample medians when the samples are heavily contaminated. The asymptotic relative efficiency (ARE) of the medians procedure relative to the means procedure is also computed, assuming the normal means to be in a slippage configuration. The means procedure is found to be superior to the medians procedure in the sense of ARE. As in the small-sample case, the situation is reversed if the normal populations are highly contaminated.
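The Gupta-type median rule is easy to state in code. The sketch below keeps every population whose sample median is within a constant d of the largest sample median; d must be chosen (e.g., from tables) to meet the P* requirement and is treated here as given, and the contaminated-normal generator is our own illustration.

```python
import numpy as np

def gupta_median_rule(samples, d):
    """Gupta-type subset selection based on sample medians: keep
    population i when its median is within d of the largest median.

    A minimal sketch; the constant d is assumed given.
    """
    med = np.array([np.median(s) for s in samples])
    return [i for i, m in enumerate(med) if m >= med.max() - d]

# Contaminated-normal illustration: heavy contamination favours medians.
rng = np.random.default_rng(4)
def contaminated(mu, n=15, eps=0.3, scale=5.0):
    z = rng.random(n) < eps
    return np.where(z, rng.normal(mu, scale, n), rng.normal(mu, 1.0, n))

samples = [contaminated(mu) for mu in (0.0, 0.5, 1.0)]
print(gupta_median_rule(samples, d=0.8))
```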

9.
We restrict attention to a class of Bernoulli subset selection procedures which take observations one at a time and can be compared directly to the Gupta-Sobel single-stage procedure. For the criterion of minimizing the expected total number of observations required to terminate experimentation, we show that optimal sampling rules within this class are not of practical interest. We thus turn to procedures which, although not optimal, exhibit desirable behavior with regard to this criterion. A procedure which employs a modification of the so-called least-failures sampling rule is proposed, and is shown to possess many desirable properties among a restricted class of Bernoulli subset selection procedures. Within this class, it is optimal for minimizing the number of observations taken from populations excluded from consideration following a subset selection experiment, and asymptotically optimal for minimizing the expected total number of observations required. In addition, it can result in substantial savings in the expected total number of observations required compared to a single-stage procedure, and thus may be desirable to a practitioner if sampling is costly or the sample size is limited.

10.
Consider k (k ≥ 2) Weibull populations. We derive a method of constructing optimal selection procedures to select a subset of the k populations containing the best population, which controls the size of the selected subset and maximises the minimum probability of making a correct selection. Procedures and results are derived for the case of unequal sample sizes. Some tables and figures are given at the end of this paper.

11.
The problem of selecting the best population from among a finite number of populations in the presence of uncertainty arises in many scientific investigations and has been studied extensively. Many selection procedures have been derived for different selection goals. However, most of these selection procedures, being frequentist in nature, do not tell us how to incorporate the information in a particular sample to give a data-dependent measure of the correct selection achieved for this particular sample. They often assign the same decision and probability of correct selection to two different sample values, one of which actually seems intuitively much more conclusive than the other. The methodology of conditional inference offers an approach which achieves both frequentist interpretability and a data-dependent measure of conclusiveness. By partitioning the sample space into a family of subsets, the achieved probability of correct selection is computed by conditioning on which subset the sample falls in. In this paper, the partition considered is the so-called continuum partition, while the selection rules are both fixed-size and random-size subset selection rules. Under the distributional assumption of a monotone likelihood ratio, results on the least favourable configuration and alpha-correct selection are established. These results are not only useful in themselves but are also used to design a new sequential procedure with elimination for selecting the best of k binomial populations. Comparisons between this new procedure and some other sequential selection procedures with regard to total expected sample size and some risk functions are carried out by simulations.

12.
We extend a basic result of Huber's on least favorable distributions to the setting of conditional inference, using an approach based on the notion of log-Gâteaux differentiation and perturbed models. Whereas Huber considered intervals of fixed width for location parameters and their average coverage rates, we study error models having longest confidence intervals, conditional on the location configuration of the sample. Our version of the problem does not have a global solution, but one that changes from configuration to configuration. Asymptotically, the conditionally least-informative shape minimizes the conditional Fisher information. We characterize the asymptotic solution within Huber's contamination model.

13.
Yu, Tingting; Wu, Lang; Gilbert, Peter. Lifetime Data Analysis (2019) 25(2): 229-258.

In HIV vaccine studies, longitudinal immune response biomarker data are often left-censored due to lower limits of quantification of the employed immunological assays. The censoring information is important for predicting HIV infection, the failure event of interest. We propose two approaches to addressing left censoring in longitudinal data: one that makes no distributional assumptions for the censored data, treating left-censored values as a "point mass" subgroup, and one that makes a distributional assumption for a subset of the censored data but not for the remaining subset. We develop these two approaches to handling censoring for joint modelling of longitudinal and survival data via a Cox proportional hazards model fit by h-likelihood. We evaluate the new methods via simulation and analyze an HIV vaccine trial data set, finding that longitudinal characteristics of the immune response biomarkers are highly associated with the risk of HIV infection.


14.
Let π1,…, πk represent k (k ≥ 2) independent populations. The quality of the ith population πi is characterized by a real-valued parameter θi, usually unknown. We define the best population in terms of a measure of separation between the θi's. A selection of a subset containing the best population is called a correct selection (CS). We restrict attention to rules for which the size of the selected subset is controlled at a given point and the infimum of the probability of correct selection over the parameter space is maximized. The main theorem deals with the construction of an essentially complete class of selection rules of the above type. Some classical subset selection rules are shown to belong to this class.

15.
We introduce a novel predictive statistical modeling technique called Hybrid Radial Basis Function Neural Networks (HRBF-NN) as a forecaster. HRBF-NN is a flexible forecasting technique that integrates regression trees and ridge regression with radial basis function (RBF) neural networks (NNs). We develop a new computational procedure that uses the genetic algorithm (GA), with model selection based on information-theoretic principles as the fitness function, to carry out subset selection of the best predictors. As is well known, the dynamic and chaotic nature of the underlying stock market process makes the task of generating economically useful stock market forecasts difficult, if not impossible; HRBF-NN is well suited for modeling such complex non-linear relationships and dependencies between stock indices. We propose HRBF-NN as our forecaster and predictive modeling tool to study the daily movements of stock indices. We show numerical examples that determine a predictive relationship between the Istanbul Stock Exchange National 100 Index (ISE100) and seven other international stock market indices. We select the best subset of predictors by minimizing the information complexity (ICOMP) criterion as the fitness function within the GA. Using the best subset of variables, we construct out-of-sample forecasts for the ISE100 index to determine its daily directional movements. Our results demonstrate the utility and flexibility of HRBF-NN as a predictive modeling tool for highly dependent and nonlinear data.
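A hedged sketch of the GA-driven subset selection step appears below, with plain least squares standing in for the HRBF-NN and AIC standing in for ICOMP; the population size, mutation rate, and all function names are illustrative choices.

```python
import numpy as np

def ga_subset_selection(X, y, pop_size=30, gens=40, pmut=0.1, seed=0):
    """Genetic-algorithm subset selection with an information
    criterion as the fitness function.

    A sketch only: linear regression stands in for the HRBF-NN,
    and AIC stands in for ICOMP.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape

    def aic(mask):
        Z = np.column_stack([np.ones(n), X[:, mask]])
        resid = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
        return n * np.log(resid @ resid / n) + 2 * (mask.sum() + 1)

    pop = rng.random((pop_size, p)) < 0.5            # random bit strings
    for _ in range(gens):
        fit = np.array([aic(ind) for ind in pop])
        order = np.argsort(fit)                      # lower AIC is better
        parents = pop[order[:pop_size // 2]]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, p)                 # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(p) < pmut            # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, children])
    best = min(pop, key=aic)
    return np.flatnonzero(best)

rng = np.random.default_rng(5)
X = rng.normal(size=(300, 8))
y = X[:, 1] - 2 * X[:, 4] + rng.normal(size=300)
print(ga_subset_selection(X, y))   # typically recovers columns 1 and 4
```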

16.
A common problem in analysis of variance is testing for heterogeneity of different subsets of the full set of k population means. A step-down procedure tests a given subset of p means only after rejecting homogeneity for all sets that contain it. The Peritz and Gabriel closed procedure rejects homogeneity for the subset if every partition of the k means that includes the subset includes some rejected set. The Begun and Gabriel closure algorithm reduces the computation, but the number of tests still increases exponentially with the number of complementary means, m = k − p. We propose a new algorithm that tests only the m − 1 pairs of adjacent ordered complementary sample means. Our algorithm may be used with analysis-of-variance test statistics in balanced and unbalanced designs, and with Studentized ranges except in extremely unbalanced designs. Seaman, Levin, and Serlin proposed a more powerful closure criterion that cannot exploit the Begun and Gabriel algorithm; we propose a new algorithm for this case as well.
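The reduction to adjacent ordered pairs can be sketched directly: order the complementary sample means and test each of the m − 1 neighbouring pairs. The two-sample t test below is an illustrative stand-in; the paper embeds these comparisons in the closed step-down procedure, which is not reproduced.

```python
import numpy as np
from scipy import stats

def adjacent_pair_tests(groups):
    """Test only the m - 1 pairs of adjacent ordered sample means,
    the reduction the new algorithm exploits.

    A sketch using two-sample t tests on each adjacent pair.
    """
    order = np.argsort([np.mean(g) for g in groups])
    pvals = []
    for lo, hi in zip(order[:-1], order[1:]):
        t, p = stats.ttest_ind(groups[lo], groups[hi])
        pvals.append(((lo, hi), p))
    return pvals

rng = np.random.default_rng(6)
groups = [rng.normal(mu, 1.0, 12) for mu in (0.0, 0.1, 1.5)]
for pair, p in adjacent_pair_tests(groups):
    print(pair, round(p, 4))
```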

17.
Ensemble methods using the same underlying algorithm trained on different subsets of observations have recently received increased attention as practical prediction tools for massive data sets. We propose Subsemble: a general subset ensemble prediction method, which can be used for small, moderate, or large data sets. Subsemble partitions the full data set into subsets of observations, fits a specified underlying algorithm on each subset, and uses a clever form of V-fold cross-validation to output a prediction function that combines the subset-specific fits. We give an oracle result that provides a theoretical performance guarantee for Subsemble. Through simulations, we demonstrate that Subsemble can be a beneficial tool for small- to moderate-sized data sets, and often has better prediction performance than the underlying algorithm fit just once on the full data set. We also describe how to include Subsemble as a candidate in a SuperLearner library, providing a practical way to evaluate the performance of Subsemble relative to the underlying algorithm fit just once on the full data set.
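A minimal Subsemble-style sketch, under our own illustrative choices (a regression tree as the underlying algorithm and a linear metalearner), is given below: partition the data, fit the base algorithm per subset, and learn the combination from cross-validated predictions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import KFold
from sklearn.linear_model import LinearRegression

def subsemble(X, y, n_subsets=3, V=5, seed=0):
    """Partition the data into subsets, fit the underlying algorithm
    on each subset, and use V-fold cross-validation to learn a linear
    combination of the subset-specific fits.

    A sketch in the spirit of the method above, not its exact recipe.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    part = rng.integers(n_subsets, size=n)           # random partition

    # Cross-validated predictions of each subset-specific fit.
    Z = np.zeros((n, n_subsets))
    for train, test in KFold(V, shuffle=True, random_state=seed).split(X):
        for j in range(n_subsets):
            idx = train[part[train] == j]
            fit = DecisionTreeRegressor(max_depth=4).fit(X[idx], y[idx])
            Z[test, j] = fit.predict(X[test])

    meta = LinearRegression().fit(Z, y)              # combine the fits
    fits = [DecisionTreeRegressor(max_depth=4).fit(X[part == j], y[part == j])
            for j in range(n_subsets)]

    def predict(Xnew):
        Znew = np.column_stack([f.predict(Xnew) for f in fits])
        return meta.predict(Znew)
    return predict

rng = np.random.default_rng(7)
X = rng.normal(size=(600, 5))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.3, size=600)
pred = subsemble(X[:500], y[:500])
print(np.mean((pred(X[500:]) - y[500:]) ** 2).round(3))  # test MSE
```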

18.
We give a necessary and sufficient condition for the equality of the OLS and GLS estimators of a subset of the regression coefficients in a linear model.
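The condition itself is the paper's contribution and is not reproduced here; the sketch below merely shows how agreement between the OLS and GLS estimates of a coefficient subset can be inspected numerically under a known error covariance.

```python
import numpy as np

# Compare OLS and GLS estimates of a coefficient subset under a known
# error covariance V; with heteroscedastic errors they generally differ.
rng = np.random.default_rng(8)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
V = np.diag(rng.uniform(0.5, 2.0, n))          # heteroscedastic errors
y = X @ np.array([1.0, 2.0, -1.0]) + rng.multivariate_normal(np.zeros(n), V)

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
Vi = np.linalg.inv(V)
beta_gls = np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ y)

subset = [1, 2]                                # coefficients of interest
print(beta_ols[subset].round(3), beta_gls[subset].round(3))
```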

19.
The parameters of a finite mixture model cannot be consistently estimated when the data come from an embedded distribution with fewer components than the model being fitted, because the distribution is then represented by a subset of the parameter space, not by a single point. Feng & McCulloch (1996) give conditions, not easily verified, under which the maximum likelihood (ML) estimator converges to an arbitrary point in this subset. We show that these conditions can be considerably weakened. Even though embedded distributions may not be uniquely represented in the parameter space, estimators of quantities of interest, such as the mean or variance of the distribution, may nevertheless be consistent in the conventional sense. We give an example of practical interest where the ML estimators are root-n consistent. Similarly, consistent statistics can usually be found to test a simpler model against the full model. We suggest a test statistic suitable for a general class of models and propose a parameter-based bootstrap test, based on this statistic, for when the simpler model is correct.
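A hedged sketch of a parameter-based bootstrap test of a one-component model against a two-component Gaussian mixture is given below, using the likelihood-ratio statistic as an illustrative stand-in for the paper's general test statistic.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def bootstrap_mixture_test(x, B=200):
    """Parameter-based bootstrap test of 1 vs 2 mixture components:
    refit the statistic on data simulated from the fitted simpler
    model and compare with its observed value.

    A sketch; the LRT stands in for the paper's general statistic.
    """
    x = x.reshape(-1, 1)

    def lrt(data):
        g1 = GaussianMixture(1, random_state=0).fit(data)
        g2 = GaussianMixture(2, n_init=3, random_state=0).fit(data)
        # score() is the mean log-likelihood, so scale by n.
        return 2 * len(data) * (g2.score(data) - g1.score(data))

    obs = lrt(x)
    g1 = GaussianMixture(1).fit(x)                  # fitted simpler model
    boot = [lrt(g1.sample(len(x))[0]) for _ in range(B)]
    return (1 + sum(b >= obs for b in boot)) / (B + 1)  # bootstrap p-value

rng = np.random.default_rng(9)
x = np.concatenate([rng.normal(0, 1, 100), rng.normal(3, 1, 100)])
print(bootstrap_mixture_test(x, B=99))
```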

20.
The paper is concerned with static search on a finite set. An unknown subset of cardinality k of the finite set is to be found by testing its subsets. We investigate two problems: in the first, the number of common elements of the tested and the unknown subset is given; in the second, only the information whether the tested and the unknown subsets are disjoint is given. Both problems correspond to problems on false coins. If the unknown subset is drawn from the family of k-element sets with uniform distribution, we determine the minimum length of strategies that find the unknown subset with small error probability. The strategies are constructed by probabilistic means.
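The second (disjoint / not disjoint) problem admits a compact randomized sketch: any tested subset reported disjoint from the unknown set eliminates all of its elements, and the survivors form the estimate. The test sizes and counts below are illustrative, and the strategy can err, matching the small-error-probability setting.

```python
import numpy as np

def random_disjointness_search(n, k, n_tests, rng):
    """Static (non-adaptive) search with disjoint / not-disjoint
    answers: elements of any test set reported disjoint from the
    unknown set are eliminated; the survivors form the estimate.

    A probabilistic sketch with randomly chosen test sets.
    """
    unknown = set(rng.choice(n, size=k, replace=False))
    alive = np.ones(n, dtype=bool)
    for _ in range(n_tests):
        test = rng.choice(n, size=max(1, n // k), replace=False)
        if not (set(test) & unknown):        # answer: disjoint
            alive[test] = False              # eliminate the whole test set
    estimate = set(np.flatnonzero(alive))
    return estimate == unknown, estimate

rng = np.random.default_rng(10)
hits = sum(random_disjointness_search(60, 3, 400, rng)[0] for _ in range(50))
print(f"exact recovery in {hits}/50 runs")
```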
