Similar Articles
1.
A common problem in analysis of variance is testing for heterogeneity of different subsets of the full set of k population means. A step-down procedure tests a given subset of p means only after rejecting homogeneity for all sets that contain it. The Peritz and Gabriel closed procedure rejects homogeneity for the subset if every partition of the k means that includes the subset includes some rejected set. The Begun and Gabriel closure algorithm reduces computations, but the number of tests still increases exponentially with respect to the number of complementary means, m = k − p. We propose a new algorithm that tests only the m − 1 pairs of adjacent ordered complementary sample means. Our algorithm may be used with analysis-of-variance test statistics in balanced and unbalanced designs, and with Studentized ranges except in extremely unbalanced designs. Seaman, Levin, and Serlin proposed a more powerful closure criterion that cannot exploit the Begun and Gabriel algorithm; we propose a new algorithm for this case as well.
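As a rough sketch of the combinatorial saving described above, the following Python fragment tests only the m − 1 pairs of adjacent ordered complementary sample means. The two-sample t-test is a stand-in assumption for the paper's ANOVA subset statistics and critical values; the function name and interface are hypothetical.

```python
import numpy as np
from scipy import stats

def adjacent_pair_rejections(comp_samples, alpha=0.05):
    """Test only the m-1 pairs of adjacent ordered complementary sample
    means, instead of exponentially many partitions. A two-sample
    t-test stands in for the actual subset test statistic."""
    means = np.array([np.mean(s) for s in comp_samples])
    order = np.argsort(means)
    rejections = []
    for i in range(len(order) - 1):
        _, p = stats.ttest_ind(comp_samples[order[i]],
                               comp_samples[order[i + 1]])
        rejections.append(p < alpha)
    return rejections  # fed into the closure criterion for the subset
```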

2.
Consider k independent random samples such that the ith sample is drawn from a two-parameter exponential population with location parameter μi and scale parameter θi, i = 1, …, k. For simultaneously testing differences between location parameters of successive exponential populations, closed testing procedures are proposed separately for two cases: (i) when the scale parameters are unknown and equal, and (ii) when the scale parameters are unknown and unequal. The critical constants required for the proposed procedures are obtained numerically, and selected values are tabulated. A simulation study revealed that the proposed procedures are better able to detect significant differences and have greater power than existing procedures. The proposed procedures are illustrated using real data.
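The closed procedures themselves require the tabulated critical constants, but the quantities they act on are easy to sketch. Below is an assumed illustration in Python: the MLEs for a two-parameter exponential sample (location = sample minimum, scale = mean minus minimum) and the successive location differences being tested.

```python
import numpy as np

def two_param_exp_mle(x):
    """MLEs for a two-parameter exponential sample:
    location = sample minimum, scale = sample mean minus minimum."""
    x = np.asarray(x)
    mu_hat = x.min()
    theta_hat = x.mean() - mu_hat
    return mu_hat, theta_hat

rng = np.random.default_rng(0)
# assumed configuration of three successive populations
samples = [mu + rng.exponential(th, size=30)
           for mu, th in [(0.0, 1.0), (0.5, 1.0), (0.6, 2.0)]]
mu_hats = [two_param_exp_mle(s)[0] for s in samples]
diffs = np.diff(mu_hats)  # differences between successive location estimates
```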

3.
We propose optimal procedures for partitioning k multivariate normal populations into two disjoint subsets with respect to a given standard vector. A multivariate normal population is defined as good or bad according to whether its Mahalanobis distance to a known standard vector is small or large. Partitioning the k populations then reduces to partitioning k non-central chi-square or non-central F distributions with respect to the corresponding non-centrality parameters, depending on whether the covariance matrices are known or unknown. The minimum required sample size for each population is determined to ensure that the probability of correct decision attains a specified level. An example illustrates the procedures.
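A minimal sketch of the known-covariance case, with assumed names and an assumed cut-off; the paper's procedures instead derive the cut-off and the minimum sample sizes from the non-central chi-square distribution.

```python
import numpy as np

def partition_by_mahalanobis(means, cov, standard, threshold):
    """Partition population indices into 'good' (small squared
    Mahalanobis distance to the standard vector) and 'bad' (large)."""
    cov_inv = np.linalg.inv(cov)
    good, bad = [], []
    for i, m in enumerate(means):
        d = np.asarray(m) - standard
        dist2 = float(d @ cov_inv @ d)
        (good if dist2 <= threshold else bad).append(i)
    return good, bad
```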

4.
An extension of a result of Karlin and Rubin on estimation is given for the following case: the sample space, the parameter space, and the decision space are subsets of a multi-dimensional Euclidean space; a suitable partial ordering is defined on each of these spaces; and the probability distribution has a monotone likelihood ratio with respect to these partial orderings (see Ishii, 1976). In the special case of a quadratic loss function, a simple proof of the Karlin and Rubin result is given. Stein's estimators are discussed as examples.

5.
For discrete statistical models, it is shown that any statistic induces a partition of the set of all possible distributions defined on the sample space. This partition identifies the subsets of the parameter space for which the statistic is sufficient.

6.
The problem of clustering individuals is considered within the context of a mixture of distributions, using a modification of the usual approach to population mixtures. As usual, a parametric family of distributions is considered, with a set of parameter values associated with each population. In addition, an identification parameter is associated with each observation, indicating the population from which the observation arose. The resulting likelihood function is interpreted in terms of the conditional probability density of a sample from a mixture of populations, given the identification parameter of each observation. Clustering algorithms are obtained by applying a method of iterated maximum likelihood to this likelihood function.
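The iterated maximum likelihood idea corresponds to what is now often called classification EM. A univariate normal sketch, with assumed names and no guard against empty groups:

```python
import numpy as np
from scipy.stats import norm

def classification_ml(x, k, n_iter=50, seed=0):
    """Alternate (1) assigning each observation the identification
    parameter of the population with the highest current density and
    (2) re-estimating each population's parameters from its assigned
    observations. x is a 1-D numpy array."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(k, size=len(x))
    for _ in range(n_iter):
        mu = np.array([x[labels == j].mean() for j in range(k)])
        sd = np.array([x[labels == j].std() + 1e-9 for j in range(k)])
        loglik = norm.logpdf(x[:, None], mu[None, :], sd[None, :])
        new_labels = loglik.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```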

7.
The problem of selecting the best population from among a finite number of populations in the presence of uncertainty arises in many scientific investigations and has been studied extensively; many selection procedures have been derived for different selection goals. However, most of these procedures, being frequentist in nature, do not indicate how to use the information in a particular sample to give a data-dependent measure of the correct selection achieved for that sample. They often assign the same decision and probability of correct selection to two different sample values, one of which may intuitively be much more conclusive than the other. The methodology of conditional inference offers an approach that achieves both frequentist interpretability and a data-dependent measure of conclusiveness. By partitioning the sample space into a family of subsets, the achieved probability of correct selection is computed by conditioning on the subset into which the sample falls. In this paper, the partition considered is the so-called continuum partition, and the selection rules are both fixed-size and random-size subset selection rules. Under the assumption of a monotone likelihood ratio, results on the least favourable configuration and alpha-correct selection are established. These results are not only useful in themselves but are also used to design a new sequential procedure with elimination for selecting the best of k binomial populations. Comparisons between this new procedure and other sequential selection procedures, with regard to total expected sample size and some risk functions, are carried out by simulation.
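The conditional idea can be illustrated by simulation with an assumed configuration: estimate the probability of correct selection both unconditionally and on a subset of more "conclusive" samples. This is only a toy version of the continuum-partition construction, not the paper's procedure.

```python
import numpy as np

rng = np.random.default_rng(1)
k, n = 3, 50
p = np.array([0.5, 0.5, 0.6])               # assumed slippage configuration
reps = 20_000
counts = rng.binomial(n, p, size=(reps, k))
correct = counts.argmax(axis=1) == p.argmax()
print(correct.mean())                        # unconditional P(correct selection)

# conditioning on conclusiveness: margin between the two largest counts
top2 = np.sort(counts, axis=1)[:, -2:]
margin = top2[:, 1] - top2[:, 0]
print(correct[margin >= 5].mean())           # achieved PCS on conclusive samples
```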

8.
This paper is concerned primarily with subset selection procedures based on the sample medians of logistic populations. A procedure is given that chooses a non-empty subset from among k independent logistic populations, having a common known variance, so that the population with the largest location parameter is contained in the subset with a pre-specified probability. The constants required to apply the median procedure with small sample sizes (≤ 19) are tabulated and can also be used to construct simultaneous confidence intervals. Asymptotic formulae are provided for application with larger sample sizes. It is shown that, in certain situations, rules based on the median are substantially more efficient than analogous procedures based either on sample means or on the sum of joint ranks.
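A sketch of a Gupta-type subset rule built on medians; the constant d would come from the tabulated values mentioned above and is an assumed input here.

```python
import numpy as np

def median_subset_selection(samples, d):
    """Retain population i if its sample median is within d of the
    largest sample median; the retained subset contains the best
    population with the pre-specified probability."""
    medians = np.array([np.median(s) for s in samples])
    return np.flatnonzero(medians >= medians.max() - d)
```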

9.
A random sample is to be classified as coming from one of two normally distributed populations with known parameters. Combinatoric procedures that classify the sample based upon the sample mean(s) and variance(s) are described for the univariate and multivariate problems. Misclassification probabilities are compared between the combinatoric and the likelihood ratio procedures in the univariate case, and between two alternative combinatoric procedures in the bivariate case.
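The likelihood ratio benchmark used in the comparison is straightforward to state; the combinatoric procedures themselves are not reproduced here. An illustrative sketch for the univariate case with known parameters:

```python
import numpy as np
from scipy.stats import norm

def classify_sample(x, mu0, sd0, mu1, sd1):
    """Assign the whole sample to the population with the larger total
    log-likelihood (likelihood ratio rule, known parameters)."""
    ll0 = norm.logpdf(x, mu0, sd0).sum()
    ll1 = norm.logpdf(x, mu1, sd1).sum()
    return 0 if ll0 >= ll1 else 1
```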

10.
A unit ω is to be classified into one of two correlated homoskedastic normal populations by the linear discriminant function known as the W classification statistic [T.W. Anderson, An asymptotic expansion of the distribution of the studentized classification statistic, Ann. Statist. 1 (1973), pp. 964–972; T.W. Anderson, An Introduction to Multivariate Statistical Analysis, 2nd edn, Wiley, New York, 1984; G.J. McLachlan, Discriminant Analysis and Statistical Pattern Recognition, John Wiley and Sons, New York, 1992]. The two populations studied here are two different states of the same population, such as two different states of a disease, where the population is the population of diseased patients. When a sample unit is observed in both states (populations), the observations made on it form a correlated pair. A training sample is unbalanced when not all sample units are observed in both states. Paired, and also unbalanced, samples are natural in studies related to correlated populations. S. Bandyopadhyay and S. Bandyopadhyay [Choosing better training sample for classifying an individual into one of two correlated normal populations, Calcutta Statist. Assoc. Bull. 54(215–216) (2003), pp. 167–180] studied the effect of an unbalanced training-sample structure on the performance of the W statistic in the univariate correlated normal set-up, seeking the optimal sampling strategy for a better classification rate. In this study, those results are extended to the multivariate case, with a discussion of application in a real scenario.
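For reference, Anderson's W statistic itself is easy to state. The paper's contribution concerns how the means and pooled covariance are estimated from paired, possibly unbalanced training data; that estimation is not reproduced in this sketch, and the names here are assumed.

```python
import numpy as np

def w_statistic(x, xbar1, xbar2, pooled_cov_inv):
    """Anderson's W classification statistic: classify x into
    population 1 when W > 0, into population 2 otherwise."""
    return float((x - 0.5 * (xbar1 + xbar2)) @ pooled_cov_inv @ (xbar1 - xbar2))
```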

11.
Empirical Bayes procedures have been developed extensively in the literature under the assumption that the underlying parameter space (or the sample space) is Euclidean in nature. However, almost no research has addressed the case in which the data come from a different space. We develop empirical Bayes techniques to estimate the mean direction of the Fisher-von Mises distribution, for which the underlying space is non-Euclidean. The special case in which the data are angles on the unit circle is illustrated with an example.

12.
We introduce two types of graphical log‐linear models: label‐ and level‐invariant models for triangle‐free graphs. These models generalise symmetry concepts in graphical log‐linear models and provide a tool with which to model symmetry in the discrete case. A label‐invariant model is category‐invariant and is preserved after permuting some of the vertices according to transformations that maintain the graph, whereas a level‐invariant model equates expected frequencies according to a given set of permutations. These new models can both be seen as instances of a new type of graphical log‐linear model termed the restricted graphical log‐linear model, or RGLL, in which equality restrictions on subsets of main effects and first‐order interactions are imposed. Their likelihood equations and graphical representation can be obtained from those derived for the RGLL models.

13.
The analysis of complex networks is a rapidly growing topic with many applications in different domains. The analysis of large graphs is often carried out via unsupervised classification of the vertices of the graph. Community detection is the main way to divide a large graph into smaller ones that can be studied separately. However, another definition of a cluster is possible, based on the structural distance between vertices. This definition includes the case of community clusters but is more general, in the sense that two vertices may be in the same group even if they are not connected. Methods for detecting communities in undirected graphs have recently been reviewed by Fortunato. In this paper we expand on Fortunato's work and review methods and algorithms for detecting essentially structurally homogeneous subsets of vertices in binary or weighted, directed or undirected graphs.
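A minimal sketch of the structural-distance notion, assuming a binary undirected graph given by its adjacency matrix: vertices with similar neighbourhood patterns are grouped even when they are not connected, in contrast with community detection.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

def structural_clusters(adj, n_clusters):
    """Cluster vertices by the Euclidean distance between their rows
    of the adjacency matrix, i.e. between neighbourhood patterns."""
    d = pdist(np.asarray(adj, dtype=float))  # pairwise row distances
    z = linkage(d, method="average")
    return fcluster(z, t=n_clusters, criterion="maxclust")
```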

14.
Successive tests of hypotheses, as exemplified by an analysis of variance table, impose a set-theoretic structure on the parameter space and yet allow much arbitrariness in the definition of nuisance parameters. Two major types of statistical model, the exponential and the transformation model, are shown by basic theory to have well-defined conditional testing procedures. The two types of testing procedure are then shown to have opposite forms of set-theoretic structure on the sample space, and to differ sharply from the commonly used deviance or likelihood-drop methods. The two types of model have the normal linear model as their intersection model, and there the two opposite forms of testing procedure coincide by virtue of product-space structure and independence. Details of the two types of testing procedure are discussed, related to the arbitrariness in the definition of nuisance parameters, and organized to provide a general-case pattern for the development of conditional procedures as an alternative to the default likelihood-drop methods.

15.
Sample coordination maximizes or minimizes the overlap of two or more samples selected from overlapping populations. It can be applied to designs with simultaneous or sequential selection of samples; we propose a method for sample coordination in the former case. We consider the case where units are to be selected with maximum overlap using two designs with given unit inclusion probabilities. The degree of coordination is measured by the expected sample overlap, which is bounded above by a theoretical bound, called the absolute upper bound, that depends on the unit inclusion probabilities. If the expected overlap equals the absolute upper bound, the sample coordination is maximal. Most of the methods given in the literature consider fixed marginal sampling designs, but in many cases the absolute upper bound is not achieved. We propose to construct optimal sampling designs for given unit inclusion probabilities in order to realize maximal coordination. Our method is based on theoretical conditions on the joint selection probability of two samples and on the controlled selection method with a linear programming implementation. The method can also be applied to minimize the sample overlap.
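The absolute upper bound itself depends only on the unit inclusion probabilities: the expected overlap cannot exceed the sum of the unit-wise minima. A small numeric illustration with assumed probabilities:

```python
import numpy as np

p1 = np.array([0.2, 0.5, 0.8, 0.5])  # assumed inclusion probabilities, design 1
p2 = np.array([0.4, 0.3, 0.8, 0.5])  # assumed inclusion probabilities, design 2
absolute_upper_bound = np.minimum(p1, p2).sum()
print(absolute_upper_bound)  # coordination is maximal when E[overlap] attains this
```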

16.
This paper compares minimum distance estimation with best linear unbiased estimation to determine which technique provides the more accurate estimates of the location and scale parameters of the three-parameter Pareto distribution. Two minimum distance estimators are developed for each of the three distance measures used (Kolmogorov, Cramér-von Mises, and Anderson-Darling), resulting in six new estimators. For sample sizes of 6 and 18 and shape parameter 1(1)4, the location and scale parameters are estimated. A Monte Carlo technique is used to generate the sample sets; the best linear unbiased estimator and the six minimum distance estimators provide parameter estimates based on each sample set. These estimates are compared using mean square error as the evaluation tool. Results show that the best linear unbiased estimator provided more accurate estimates of location and scale than did the minimum distance estimators tested.
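A sketch of one of the six estimators, minimum Kolmogorov distance with the shape parameter known, using scipy; the starting values and interface are assumptions, not the paper's implementation.

```python
import numpy as np
from scipy import optimize, stats

def min_ks_pareto(x, shape):
    """Choose location and scale of a Pareto (known shape) minimising
    the Kolmogorov distance between fitted and empirical CDFs."""
    def ks(params):
        loc, scale = params
        if scale <= 0:
            return np.inf
        return stats.kstest(
            x, lambda t: stats.pareto.cdf(t, shape, loc, scale)).statistic
    res = optimize.minimize(ks, x0=[x.min() - 1.0, 1.0], method="Nelder-Mead")
    return res.x  # (location, scale) estimates
```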

17.
This article considers the problem of classifying an observation consisting of both binary and continuous variables, based on two general incomplete training samples, one from each of the two given populations. The location linear model adopted by Krzanowski (1975) forms the basis of our investigation. For a given location, when the common dispersion matrix as well as the corresponding cell probabilities for the underlying populations are known, the exact distribution of the conditional maximum likelihood classification rule is derived. The overall error rate can be obtained and is based on linear combinations of independent non-central chi-square distributions. A large-sample result for the case where the cell probabilities are unknown is also available.

18.
Assume that a number of individuals are to be classified into one of two populations and that, at the same time, the proportion of members of each population needs to be estimated. The allocated proportions given by the Bayes classification rule are not consistent estimates of the true proportions, so a different classification rule is proposed; this rule yields consistent estimates with only a small increase in the probability of misclassification. As an illustration, the case of two normal distributions with equal covariance matrices is dealt with in detail.
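The inconsistency of the allocated proportions is easy to see by simulation. A sketch with an assumed configuration (two univariate normals, equal priors in the allocation rule):

```python
import numpy as np

rng = np.random.default_rng(0)
n, pi1 = 100_000, 0.3                 # assumed true proportion of population 1
from_pop1 = rng.random(n) < pi1
x = rng.normal(np.where(from_pop1, 0.0, 1.0), 1.0)  # N(0,1) vs N(1,1)
# Bayes-style allocation with equal priors: population 1 if x < 0.5
allocated_pi1 = (x < 0.5).mean()
print(allocated_pi1)  # ≈ 0.42, not 0.3: misclassification biases the proportion
```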

19.
Several mathematical programming approaches to the classification problem in discriminant analysis have recently been introduced. This paper empirically compares these newly introduced classification techniques with Fisher's linear discriminant analysis (FLDA), quadratic discriminant analysis (QDA), logit analysis, and several rank-based procedures for a variety of symmetric and skewed distributions. The percentage of correctly classified observations by each procedure in a holdout sample indicates that, while under some experimental conditions the linear programming approaches compete well with the classical procedures, overall their performance lags behind that of the classical procedures.

20.
Ranked set samples, and median ranked set samples in particular, have been used extensively in the literature for many reasons. In some situations the experimenter may not be able to quantify or measure the response variable because of the high cost of data collection, yet it may be easy to rank the subjects of interest. The purpose of this article is to study the asymptotic distribution of the parameter estimators of the simple linear regression model. We show that these estimators, under the median ranked set sampling scheme, converge in distribution to the normal distribution under weak conditions. Moreover, we derive large-sample confidence intervals for the regression parameters, as well as a large-sample prediction interval for a new observation. We also study the properties of these estimators in the small-sample setting and conduct a simulation study to investigate the behaviour of their distributions.
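A sketch of how a median ranked set sample is generated, assuming perfect ranking and an odd set size m; the regression estimators studied in the article are not reproduced.

```python
import numpy as np

def mrss(draw, m, cycles, rng):
    """Median ranked set sample: each cycle draws m sets of m units,
    ranks each set, and quantifies only the median of each set."""
    out = []
    for _ in range(cycles):
        for _ in range(m):
            ranked_set = np.sort(draw(m, rng))
            out.append(ranked_set[m // 2])  # the sample median (m odd)
    return np.array(out)

rng = np.random.default_rng(0)
y = mrss(lambda size, r: r.normal(size=size), m=5, cycles=4, rng=rng)
```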

