Similar articles
20 similar articles found.
1.
This paper introduces a new class of distribution-free tests for testing the homogeneity of several location parameters against ordered alternatives. The proposed class of test statistics is based on a linear combination of two-sample U-statistics based on subsample extremes. The mean and variance of the test statistic are obtained under the null hypothesis as well as under a sequence of local alternatives. The optimal weights are also determined. Pitman ARE comparisons show that the proposed class of test statistics performs better than its competitors for heavy-tailed and long-tailed distributions.
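The subsample-extremes statistic itself is not spelled out in the abstract, but the construction it generalizes can be illustrated with the classic Jonckheere-Terpstra statistic, which is exactly a linear combination (here, an unweighted sum) of pairwise two-sample Mann-Whitney U-statistics. A minimal stdlib sketch, not the authors' proposed statistic:

```python
from itertools import combinations

def mann_whitney_u(x, y):
    """Two-sample U-statistic count: number of pairs (xi, yj) with xi < yj."""
    return sum(1 for xi in x for yj in y if xi < yj)

def jonckheere_terpstra(samples):
    """Sum of pairwise Mann-Whitney counts over ordered pairs of samples.

    Large values support the ordered alternative mu_1 <= ... <= mu_k
    with at least one strict inequality.
    """
    return sum(mann_whitney_u(samples[i], samples[j])
               for i, j in combinations(range(len(samples)), 2))

# Three ordered samples: a monotone trend pushes the statistic to its maximum.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(jonckheere_terpstra(groups))  # 27: all 3 sample pairs x 9 concordant comparisons
```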

2.
In 1960 Levene suggested a potentially robust test of homogeneity of variance based on an ordinary least squares analysis of variance of the absolute values of mean-based residuals. Levene's test has since been shown to have inflated levels of significance when based on the F-distribution, and tests a hypothesis other than homogeneity of variance when treatments are unequally replicated, but the incorrect formulation is now standard output in several statistical packages. This paper develops a weighted least squares analysis of variance of the absolute values of both mean-based and median-based residuals. It shows how to adjust the residuals so that tests using the F-statistic focus on homogeneity of variance for both balanced and unbalanced designs. It shows how to modify the F-statistics currently produced by statistical packages so that the distribution of the resultant test statistic is closer to an F-distribution than is currently the case. The weighted least squares approach also produces component mean squares that are unbiased irrespective of which variable is used in Levene's test. To complete this aspect of the investigation the paper derives exact second-order moments of the component sums of squares used in the calculation of the mean-based test statistic. It shows that, for large samples, both ordinary and weighted least squares test statistics are equivalent; however, they are over-dispersed compared to an F variable.
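As a concrete reference point, Levene's ordinary least squares construction is just a one-way ANOVA F-statistic computed on absolute residuals; swapping the mean for the median gives the median-based (Brown-Forsythe) variant the paper also analyses. A plain-Python sketch of the unadjusted, unweighted form only:

```python
import statistics

def levene_statistic(groups, center=statistics.mean):
    """One-way ANOVA F-statistic on absolute residuals |x - center(group)|.

    center=statistics.mean reproduces Levene's 1960 proposal;
    center=statistics.median gives the median-based variant.
    """
    z = [[abs(x - center(g)) for x in g] for g in groups]
    k = len(z)
    n = sum(len(g) for g in z)
    group_means = [statistics.mean(g) for g in z]
    grand_mean = sum(sum(g) for g in z) / n
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(z, group_means))
    ss_within = sum((x - m) ** 2 for g, m in zip(z, group_means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

When the groups have equal spread the absolute residuals have equal means, so the statistic is near zero; heavier spread in one group inflates it.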

3.
The effectiveness of Bartlett adjustment, using one of several methods of deriving a Bartlett factor, in improving the chi-squared approximation to the distribution of the log likelihood ratio statistic is investigated by computer simulation in three situations of practical interest: tests of equality of exponential distributions, equality of normal distributions and equality of coefficients of variation of normal distributions.
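The Bartlett factors compared in the paper are derived analytically, but the underlying idea can be demonstrated by Monte Carlo: estimate E[-2 log Λ] under the null and rescale the statistic so its mean matches the chi-squared degrees of freedom. A sketch for the equal-exponential-means case; the simulation route here is an illustration, not one of the paper's derivation methods:

```python
import math
import random
import statistics

def lrt_equal_exponential_means(samples):
    """-2 log likelihood ratio statistic for H0: all exponential means equal."""
    n_tot = sum(len(s) for s in samples)
    pooled = sum(sum(s) for s in samples) / n_tot
    return 2 * (n_tot * math.log(pooled)
                - sum(len(s) * math.log(statistics.mean(s)) for s in samples))

def bartlett_factor(sizes, df, reps=2000, seed=0):
    """Monte Carlo estimate of the Bartlett factor E[-2 log Lambda] / df.

    Dividing the observed statistic by this factor brings its distribution
    closer to the chi-squared reference with the given df.
    """
    rng = random.Random(seed)
    sims = [lrt_equal_exponential_means(
                [[rng.expovariate(1.0) for _ in range(n)] for n in sizes])
            for _ in range(reps)]
    return statistics.mean(sims) / df
```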

4.
Summary.  Traditionally, the use of Bayes factors has required the specification of proper prior distributions on model parameters that are implicit to both null and alternative hypotheses. I describe an approach to defining Bayes factors based on modelling test statistics. Because the distributions of test statistics do not depend on unknown model parameters, this approach eliminates much of the subjectivity that is normally associated with the definition of Bayes factors. For standard test statistics, including the χ 2-, F -, t - and z -statistics, the values of Bayes factors that result from this approach have simple, closed form expressions.  相似文献   

5.
The performance of the balanced half-sample, jackknife and linearization methods for estimating the variance of the combined ratio estimate is studied by means of a computer simulation using artificially generated non-normally distributed populations.

The results of this investigation demonstrate that the variance estimates for the combined ratio estimate may be highly biased and unstable when the underlying distributions are non-normal. This is particularly true when the number of observations available from each stratum is small.
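For a single unstratified sample, the delete-one jackknife variance estimate of a ratio has a compact form; this sketch illustrates the mechanism, though the paper's setting is the stratified combined ratio estimator:

```python
def ratio_estimate(y, x):
    """Ratio estimate R = sum(y) / sum(x)."""
    return sum(y) / sum(x)

def jackknife_variance(y, x):
    """Delete-one jackknife variance estimate for the ratio estimator."""
    n = len(y)
    replicates = [ratio_estimate(y[:i] + y[i + 1:], x[:i] + x[i + 1:])
                  for i in range(n)]
    r_bar = sum(replicates) / n
    return (n - 1) / n * sum((r - r_bar) ** 2 for r in replicates)
```

When y is an exact multiple of x every delete-one replicate equals the full-sample ratio, so the variance estimate is zero; any departure from exact proportionality makes it positive.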

6.
When modelling two-way analysis of variance interactions by a multiplicative term λγiδj, asymptotic variances and covariances are derived for the parameters λ, γi and δj using maximum likelihood theory. The asymptotic framework is defined by σ²/K, where K is the number of observations per combination of the two factors and σ² is the common variance of the eijk values. The results can be applied when K = 1. Two Monte Carlo studies were carried out to check the validity of the formulae for small values of σ²/K and to assess their usefulness when the unknown parameters are replaced by their estimates. The formulae fit well, but the confidence regions produced are too narrow if the interaction term is small. The procedure is illustrated with two examples.

7.
The derivation of the distributions of linear combinations of order statistics, or L-statistics, and the computation of their moments has been approached in the literature in several ways. In this paper we use the properties of divided differences to obtain expressions for moments of some order statistics that arise as special cases of L-statistics. Expectations of some well-known L-statistics such as the trimmed mean and the winsorised mean for the Pareto distribution are computed. The study also undertakes the computation of L-moments, which are expectations of certain linear combinations of order statistics. The algorithms have been implemented using some well-known continuous distributions as examples.
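Trimmed and winsorised means are the two L-statistics named above; both are fixed linear combinations of the order statistics. A small sketch of the estimators themselves (the divided-difference moment computations are not reproduced here):

```python
def trimmed_mean(data, k):
    """Average after discarding the k smallest and k largest order statistics."""
    s = sorted(data)
    return sum(s[k:len(s) - k]) / (len(s) - 2 * k)

def winsorized_mean(data, k):
    """Average after replacing each of the k extreme order statistics on
    either end with the nearest retained order statistic."""
    s = sorted(data)
    core = s[k:len(s) - k]
    w = [core[0]] * k + core + [core[-1]] * k
    return sum(w) / len(w)

# One outlier (100): both estimators shrug it off where the plain mean cannot.
print(trimmed_mean([1, 2, 3, 4, 100], 1))     # 3.0
print(winsorized_mean([1, 2, 3, 4, 100], 1))  # 3.0
```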

8.
ABSTRACT

Sharp bounds on expected values of L-statistics based on a sample of possibly dependent, identically distributed random variables are given in the case when the sample size is a random variable with values in the set {0, 1, 2,…}. The dependence among observations is modeled by copulas and mixing. The bounds are attainable and provide characterizations of some non-trivial distributions.

9.
Multivariate hypothesis testing in studies of vegetation is likely to be hindered by unrealistic assumptions when based on conventional statistical methods. This can be overcome by randomization tests. In this paper, the accuracy and power of a MANOVA randomization test are evaluated for one and two factors with interaction, using simulated data from three distributions. The randomization test is based on the partitioning of sums of squares computed from Euclidean distances. In one-factor designs, sample size and variance inequality were evaluated. The results showed a high level of accuracy. The power curve was higher with the normal distribution, lower with the uniform, intermediate with the lognormal, and was sensitive to variance inequality. In two-factor designs, three methods of permutation and two statistics were compared. The results showed that permutation of the residuals with the pseudo-F statistic is accurate and gives good power for testing the interaction, while restricted permutation is appropriate for testing the main factors.
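The partitioning described (between- and within-group sums of squares computed from Euclidean distances, with a permutation reference distribution) can be sketched for the one-factor case as follows; function names are illustrative, and the residual- and restricted-permutation schemes for two factors are not shown:

```python
import random

def total_ss(points):
    """Sum of squared Euclidean distances from each point to the centroid."""
    dim = len(points[0])
    centroid = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    return sum(sum((p[d] - centroid[d]) ** 2 for d in range(dim))
               for p in points)

def pseudo_f(groups):
    """Pseudo-F from the Euclidean sum-of-squares partition."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    ss_within = sum(total_ss(g) for g in groups)
    ss_between = total_ss([p for g in groups for p in g]) - ss_within
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def permutation_p(groups, reps=999, seed=0):
    """One-factor permutation p-value: shuffle observations across groups."""
    rng = random.Random(seed)
    sizes = [len(g) for g in groups]
    pool = [p for g in groups for p in g]
    observed = pseudo_f(groups)
    hits = 0
    for _ in range(reps):
        rng.shuffle(pool)
        perm, start = [], 0
        for s in sizes:
            perm.append(pool[start:start + s])
            start += s
        if pseudo_f(perm) >= observed:
            hits += 1
    return (hits + 1) / (reps + 1)
```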

10.
This article considers the Phase I analysis of data when the quality of a process or product is characterized by a multiple linear regression model. This is usually referred to as the analysis of linear profiles in the statistical quality control literature. The literature includes several approaches for the analysis of simple linear regression profiles. Little work, however, has been done on the analysis of multiple linear regression profiles. This article proposes a new approach for the analysis of Phase I multiple linear regression profiles. Using this approach, regardless of the number of explanatory variables, the profile response is monitored using only three parameters: an intercept, a slope, and a variance. Using simulation, the performance of the proposed method is compared to that of the existing methods for monitoring multiple linear profile data in terms of the probability of a signal. The advantage of the proposed method over the existing methods is greatly improved detection of changes in the process parameters of linear profiles in high-dimensional settings. The article also proposes useful diagnostic aids based on F-statistics to help in identifying the source of profile variation and the locations of out-of-control samples. Finally, the use of multiple linear profile methods is illustrated by a data set from a calibration application at National Aeronautics and Space Administration (NASA) Langley Research Center.

11.
Classes of distribution-free tests are proposed for testing homogeneity against order restricted as well as unrestricted alternatives in randomized block designs with multiple observations per cell. Allowing for different interblock scoring schemes, these tests are constructed based on the method of within block rankings. Asymptotic distributions (cell sizes tending to infinity) of these tests are derived under the assumption of homogeneity. The Pitman asymptotic relative efficiencies relative to the least squares statistics are studied. It is shown that when blocks are governed by different distributions, adaptive choice of scores within each block results in asymptotically more efficient tests as compared with methods that ignore such information. Monte Carlo simulations of selected designs indicate that the method of within block rankings is more power robust with respect to differing block distributions.

12.
We study a factor analysis model with two normally distributed observations and one factor. In the case when the errors have equal variance, the maximum likelihood estimate of the factor loading is given in closed form. Exact and approximate distributions of the maximum likelihood estimate are considered. The exact distribution function is given in a complex form that involves the incomplete Beta function. Approximations to the distribution function are given for the cases of large sample sizes and small error variances. The accuracy of the approximations is discussed.

13.
Methods for analysing unbalanced factorial designs can be traced back to the work of Frank Yates in the 1930s. Yet the question of how his methods of fitting constants (Type II) and weighted squares of means (Type III) behave when negligible or insignificant interactions exist is still unanswered. In this paper, by means of a simulation study, Type II and Type III ANOVA results are examined for all unbalanced structures originating from a 2×3 balanced factorial design within homogeneous groups (design types), accounting for structure, the number of observations lost and which cells contained the missing observations. The two-level factor is further analysed to test the null hypothesis, for both Type II and Type III analyses, that the unbalanced structures within each design type provide comparable F values. These results are summarised, and the conclusion shows that this work agrees with statements made by Yates, Burdick and Herr, and Shaw and Mitchell-Olds, but there are some results which require further investigation.

14.

The problem of comparing several samples to decide whether the means and/or variances are significantly different is considered. It is shown that with very non-normal distributions even a very robust test to compare the means has poor properties when the distributions have different variances, and therefore a new testing scheme is proposed. This starts by using an exact randomization test for any significant difference (in means or variances) between the samples. If a non-significant result is obtained then testing stops. Otherwise, an approximate randomization test for mean differences (but allowing for variance differences) is carried out, together with a bootstrap procedure to assess whether this test is reliable. A randomization version of Levene's test is also carried out for differences in variation between samples. The five possible conclusions are then that (i) there is no evidence of any differences, (ii) evidence for mean differences only, (iii) evidence for variance differences only, (iv) evidence for mean and variance differences, or (v) evidence for some indeterminate differences. A simulation experiment to assess the properties of the proposed scheme is described. From this it is concluded that the scheme is useful as a robust, conservative method for comparing samples in cases where they may be from very non-normal distributions.
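The first step of the scheme, an exact randomization test, can be written down directly for two samples: enumerate every reassignment of the pooled observations and count those at least as extreme as the observed one. A sketch using the absolute mean difference as the statistic; the paper's omnibus statistic for means and variances would replace it:

```python
from itertools import combinations

def exact_randomization_p(x, y):
    """Exact two-sided randomization p-value for a difference in means.

    Enumerates every reassignment of the pooled observations to the two
    sample sizes; the p-value is the fraction of reassignments whose
    absolute mean difference is at least the observed one.
    """
    pool = x + y
    n, total_n = len(x), len(x) + len(y)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    hits = total = 0
    for idx in combinations(range(total_n), n):
        chosen = set(idx)
        xs = [pool[i] for i in chosen]
        ys = [pool[i] for i in range(total_n) if i not in chosen]
        total += 1
        if abs(sum(xs) / n - sum(ys) / len(ys)) >= observed - 1e-12:
            hits += 1
    return hits / total
```

Full enumeration is only feasible for small samples (C(n+m, n) assignments); the "approximate randomization test" in the scheme samples reassignments instead.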

15.
This paper considers distributed inference for two-sample U-statistics under the massive data setting. In order to reduce the computational complexity, this paper proposes distributed two-sample U-statistics and blockwise linear two-sample U-statistics. The blockwise linear two-sample U-statistic, which requires less communication cost, is more computationally efficient especially when the data are stored in different locations. The asymptotic properties of both types of distributed two-sample U-statistics are established. In addition, this paper proposes bootstrap algorithms to approximate the distributions of distributed two-sample U-statistics and blockwise linear two-sample U-statistics for both nondegenerate and degenerate cases. The distributed weighted bootstrap for the distributed two-sample U-statistic is new in the literature. The proposed bootstrap procedures are computationally efficient and are suitable for distributed computing platforms with theoretical guarantees. Extensive numerical studies illustrate that the proposed distributed approaches are feasible and effective.
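The contrast between the two constructions can be sketched in miniature: the full U-statistic touches every cross-pair, while a blockwise linear version pairs only co-located blocks and averages the per-block results, which is what keeps communication cost low. A toy single-machine illustration; the exact block pairing here is an assumption about the data layout, not the paper's construction:

```python
def two_sample_u(x, y, kernel=lambda a, b: float(a < b)):
    """Full two-sample U-statistic: average of the kernel over all (x, y) pairs."""
    return sum(kernel(a, b) for a in x for b in y) / (len(x) * len(y))

def blockwise_u(x_blocks, y_blocks, kernel=lambda a, b: float(a < b)):
    """Blockwise linear two-sample U-statistic: each machine computes the
    U-statistic on its own (x, y) block; only the scalar per-block results
    are communicated and averaged, and no cross-machine pairs are formed."""
    stats = [two_sample_u(xb, yb, kernel) for xb, yb in zip(x_blocks, y_blocks)]
    return sum(stats) / len(stats)
```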

16.
ABSTRACT

The concordance statistic (C-statistic) is commonly used to assess the predictive performance (discriminatory ability) of a logistic regression model. Although there are several approaches for the C-statistic, their performance in quantifying the subsequent improvement in predictive accuracy due to inclusion of novel risk factors or biomarkers in the model has been strongly criticized in the literature. This paper proposes a model-based concordance-type index, CK, for use with the logistic regression model. The CK and its asymptotic sampling distribution are derived following Gonen and Heller's approach for the Cox PH model for survival data, with the necessary modifications for use with binary data. Unlike the existing C-statistics for the logistic model, it quantifies the concordance probability by taking the difference in the predicted risks between two subjects in a pair rather than ranking them, and hence is able to quantify the equivalent incremental value from the new risk factor or marker. The simulation study revealed that CK performs well when the model parameters are correctly estimated in large samples, and shows greater improvement in quantifying the additional predictive value from the new risk factor or marker than the existing C-statistics. Furthermore, the illustration using three datasets supports the findings from the simulation study.
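The ordinary C-statistic that CK is designed to improve upon is simple to state: the proportion of event/non-event pairs in which the event receives the higher predicted risk, with ties counted as one half. A sketch of that baseline only; the model-based CK itself is not reproduced here:

```python
def c_statistic(probs, labels):
    """Concordance (C-) statistic for binary outcomes.

    Probability that a randomly chosen event (label 1) receives a higher
    predicted risk than a randomly chosen non-event (label 0), with ties
    scored as 1/2.
    """
    events = [p for p, y in zip(probs, labels) if y == 1]
    nonevents = [p for p, y in zip(probs, labels) if y == 0]
    score = 0.0
    for pe in events:
        for pn in nonevents:
            score += 1.0 if pe > pn else 0.5 if pe == pn else 0.0
    return score / (len(events) * len(nonevents))
```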

17.
A multivariate normal mean–variance mixture based on a Birnbaum–Saunders (NMVMBS) distribution is introduced and several properties of this new distribution are discussed. A new robust non-Gaussian ARCH-type model is proposed in which there exists a relation between the variance of the observations, and the marginal distributions are NMVMBS. A simple EM-based maximum likelihood estimation procedure to estimate the parameters of this normal mean–variance mixture distribution is given. A simulation study and some real data are used to demonstrate the modelling strength of this new model.  相似文献   

18.
Various computational methods exist for generating sums of squares in an analysis of variance table. When the ANOVA design is balanced, most of these computational methods will produce equivalent sums of squares for testing the significance of the ANOVA model parameters. However, when the design is unbalanced, as is frequently the case in practice, these sums of squares depend on the computational method used. The basic reason for the difference in these sums of squares is that different hypotheses are being tested. The purpose of this paper is to describe these hypotheses in terms of population or cell means. A numerical example is given for the two factor model with interaction. The hypotheses that are tested by the four computational methods of the SAS general linear model procedure are specified.

Although the ultimate choice of hypotheses should be made by the researcher before conducting the experiment, this paper presents the following guidelines in selecting these hypotheses:

When the design is balanced, all of the SAS procedures will agree.

In unbalanced ANOVA designs with no missing cells, SAS Type III should be used. SAS Type III tests an unweighted hypothesis about cell means. SAS Types I and II test hypotheses that are functions of the cell frequencies. These frequencies are often merely artifacts of the experimental process and not reflective of any underlying frequencies in the population.

When there are missing cells, i.e. no observations for some factor level combinations, Type IV should be used with caution. SAS Type IV tests hypotheses which depend …
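The "unweighted versus frequency-weighted" distinction behind Type III versus Types I/II can be seen without any ANOVA machinery: compare a marginal mean formed as a simple average of cell means with one pooled across observations. A toy sketch (function names are illustrative):

```python
def unweighted_marginal_mean(cells):
    """Type III-style marginal mean: simple average of the cell means,
    so every cell counts equally regardless of its sample size."""
    means = [sum(c) / len(c) for c in cells]
    return sum(means) / len(means)

def weighted_marginal_mean(cells):
    """Frequency-weighted marginal mean: pooled over observations, so cells
    are weighted by their (possibly accidental) sample sizes."""
    return sum(sum(c) for c in cells) / sum(len(c) for c in cells)

# Two cells with means 1 and 3 but sizes 4 and 1: the estimates diverge.
cells = [[1, 1, 1, 1], [3]]
print(unweighted_marginal_mean(cells))  # 2.0
print(weighted_marginal_mean(cells))    # 1.4
```

In a balanced design the two coincide, which is why the computational methods only disagree for unbalanced data.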

19.
In this paper, we develop Bayes factor based testing procedures for the presence of a correlation or a partial correlation. The proposed Bayesian tests are obtained by restricting the class of the alternative hypotheses to maximize the probability of rejecting the null hypothesis when the Bayes factor is larger than a specified threshold. It turns out that they depend simply on the frequentist t-statistics with the associated critical values, and can thus be easily calculated in a spreadsheet such as Excel, in fact by just adding one more step after performing the frequentist correlation tests. In addition, they are able to yield decisions identical to those of the frequentist paradigm, provided that the evidence threshold of the Bayesian tests is determined by the significance level of the frequentist paradigm. We illustrate the performance of the proposed procedures through simulated and real-data examples.

20.
Stein's two-sample procedure for a general linear model is studied and derived in terms of matrices in which the error terms are distributed as multivariate Student t error terms. Tests and confidence regions are constructed in a similar way to classical linear models, involving percentage points of the Student t and F distributions. The advantages of taking two samples are that the variance of the error terms is known, and that the power of tests and the size of confidence regions are controllable. A new distribution, called the noncentral F-type distribution, different from the noncentral F, is found when considering the power of the test of the general linear hypothesis.
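Stein's two-stage idea for a single mean, which the paper extends to the general linear model, fixes the total sample size from the first-stage variance so that a confidence interval of prescribed half-width is achieved. A sketch of the classical univariate rule N = max(n1 + 1, ceil(s1² t² / d²)), not the matrix form derived in the paper:

```python
import math
import statistics

def stein_total_sample_size(first_stage, half_width, t_crit):
    """Stein two-stage rule for a fixed-width confidence interval.

    After a first stage of size n1, the total sample size is
    N = max(n1 + 1, ceil(s1^2 * t^2 / d^2)), where s1^2 is the
    first-stage sample variance, d the target half-width and t the
    Student-t critical value with n1 - 1 degrees of freedom.
    """
    n1 = len(first_stage)
    s2 = statistics.variance(first_stage)
    return max(n1 + 1, math.ceil(s2 * t_crit ** 2 / half_width ** 2))
```

Because the second-stage size is chosen from the first-stage variance, the resulting interval has exactly the prescribed width regardless of the unknown variance, which is the "controllable" property the abstract refers to.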


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号