1.

Two separate effects of variance heterogeneity on the validity and power of significance tests of location

Donald W. Zimmerman 《Statistical Methodology》2006,3(4):351-374

Heterogeneity of variances of treatment groups influences the validity and power of significance tests of location in two distinct ways. First, if sample sizes are unequal, the Type I error rate and power are depressed if a larger variance is associated with a larger sample size, and elevated if a larger variance is associated with a smaller sample size. This well-established effect, which occurs in t and F tests, and to a lesser degree in nonparametric rank tests, results from unequal contributions of pooled estimates of error variance in the computation of test statistics. It is observed in samples from normal distributions, as well as non-normal distributions of various shapes. Second, transformation of scores from skewed distributions with unequal variances to ranks produces differences in the means of the ranks assigned to the respective groups, even if the means of the initial groups are equal, and a subsequent inflation of Type I error rates and power. This effect occurs for all sample sizes, equal and unequal. For the t test, the discrepancy diminishes, and for the Wilcoxon–Mann–Whitney test, it becomes larger, as sample size increases. The Welch separate-variance t test overcomes the first effect but not the second. Because of interaction of these separate effects, the validity and power of both parametric and nonparametric tests performed on samples of any size from unknown distributions with possibly unequal variances can be distorted in unpredictable ways.
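
The first effect can be made concrete with a small sketch. Assuming the usual textbook formulas for the pooled and the Welch (separate-variance) standard errors and for the Satterthwaite degrees of freedom, the Python code below compares the two standard errors on made-up data. The function names and the data values are illustrative assumptions and are not taken from the paper.

```python
def mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    # Unbiased sample variance (divisor n - 1).
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def pooled_se2(x, y):
    # Squared standard error of the mean difference using a pooled variance.
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * sample_variance(x) + (n2 - 1) * sample_variance(y)) / (n1 + n2 - 2)
    return sp2 * (1 / n1 + 1 / n2)


def welch_se2(x, y):
    # Squared standard error using separate variances (Welch).
    return sample_variance(x) / len(x) + sample_variance(y) / len(y)


def welch_df(x, y):
    # Satterthwaite approximation to the degrees of freedom.
    a = sample_variance(x) / len(x)
    b = sample_variance(y) / len(y)
    return (a + b) ** 2 / (a ** 2 / (len(x) - 1) + b ** 2 / (len(y) - 1))


# Hypothetical data: the larger variance sits in the larger group.
small_low_var = [1, 2, 3]
large_high_var = [0, 2, 4, 6, 8]

# Hypothetical data: the larger variance sits in the smaller group.
small_high_var = [0, 5, 10]
large_low_var = [1, 2, 3, 4, 5]

print(pooled_se2(small_low_var, large_high_var), welch_se2(small_low_var, large_high_var))
print(pooled_se2(small_high_var, large_low_var), welch_se2(small_high_var, large_low_var))
```

With the first pair of samples the pooled standard error exceeds the Welch one, so the pooled t statistic shrinks, in line with the depressed Type I error rate described above. With the second pair the inequality reverses.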

2.

Student's Distribution Under Non-Normal Situations

M. L. Tiku 《Australian & New Zealand Journal of Statistics》1971,13(3):142-148

The distribution of Student's t statistic under non-normal situations is obtained. The effect of non-normality on type I error and the power of the two-sided t-test is studied in some detail.

3.

The split sample permutation t-tests

Shunpu Zhang 《Journal of statistical planning and inference》2009

Without the exchangeability assumption, permutation tests for comparing two population means do not provide exact control of the probability of making a Type I error. Another drawback of permutation tests is that they cannot be used to test hypotheses about one population. In this paper, we propose a new type of permutation test for testing the difference between two population means: the split sample permutation t-tests. We show that the split sample permutation t-tests do not require the exchangeability assumption, are asymptotically exact and can be easily extended to testing hypotheses about one population. Extensive simulations were carried out to evaluate the performance of two specific split sample permutation t-tests: the split in the middle permutation t-test and the split in the end permutation t-test. The simulation results show that the split in the middle permutation t-test has comparable performance to the permutation test if the population distributions are symmetric and satisfy the exchangeability assumption. Otherwise, the split in the end permutation t-test has significantly more accurate control of level of significance than the split in the middle permutation t-test and other existing permutation tests.

4.

A comparative study of tests for paired lifetime data

Wang Z, Ng HK 《Lifetime data analysis》2006,12(4):505-522

In this paper, we investigate different procedures for testing the equality of two mean survival times in paired lifetime studies. We consider Owen’s M-test and Q-test, a likelihood ratio test, the paired t-test, the Wilcoxon signed rank test and a permutation test based on log-transformed survival times in the comparative study. We also consider the paired t-test, the Wilcoxon signed rank test and a permutation test based on original survival times for the sake of comparison. The size and power characteristics of these tests are studied by means of Monte Carlo simulations under a frailty Weibull model. For less skewed marginal distributions, the Wilcoxon signed rank test based on original survival times is found to be desirable. Otherwise, the M-test and the likelihood ratio test are the best choices in terms of power. In general, one can choose a test procedure based on information about the correlation between the two survival times and the skewness of the marginal survival distributions.

5.

A non-parametric maximum test for the Behrens–Fisher problem

Anke Welz, Graeme D. Ruxton 《Journal of Statistical Computation and Simulation》2018,88(7):1336-1347

Non-normality and heteroscedasticity are common in applications. For the comparison of two samples in the non-parametric Behrens–Fisher problem, different tests have been proposed, but no single test can be recommended for all situations. Here, we propose combining two tests, the Welch t test based on ranks and the Brunner–Munzel test, within a maximum test. Simulation studies indicate that this maximum test, performed as a permutation test, controls the type I error rate and stabilizes the power. That is, it has good power characteristics for a variety of distributions, and also for unbalanced sample sizes. Compared to the single tests, the maximum test shows acceptable type I error control.

6.

Combining the t test and Wilcoxon's rank-sum test

Markus Neuhäuser 《Journal of applied statistics》2015,42(12):2769-2775

In the two-sample location-shift problem, Student's t test or Wilcoxon's rank-sum test are commonly applied. The latter test can be more powerful for non-normal data. Here, we propose to combine the two tests within a maximum test. We show that the constructed maximum test controls the type I error rate and has good power characteristics for a variety of distributions; its power is close to that of the more powerful of the two tests. Thus, irrespective of the distribution, the maximum test stabilizes the power. To carry out the maximum test is a more powerful strategy than selecting one of the single tests. The proposed test is applied to data of a clinical trial.
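
The idea of a maximum test can be sketched as follows. The code assumes one plausible form of the statistic, namely the larger of |t| (pooled-variance t) and the absolute standardized rank sum, and obtains the p-value from the exact permutation distribution of that maximum. The standardization, the function names and the exhaustive enumeration (feasible only for small samples) are assumptions made for illustration; the paper may define the details differently.

```python
import itertools
import math


def t_statistic(x, y):
    # Pooled-variance two-sample t statistic.
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    ss = sum((v - m1) ** 2 for v in x) + sum((v - m2) ** 2 for v in y)
    se = math.sqrt(ss / (n1 + n2 - 2) * (1 / n1 + 1 / n2))
    if se == 0:
        return math.inf if m1 != m2 else 0.0
    return (m1 - m2) / se


def midranks(values):
    # Ranks 1..n, with tied values sharing their average rank.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def maximum_test_pvalue(x, y):
    # Exact permutation p-value of max(|t|, |standardized rank sum|).
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    n1, n = len(x), len(pooled)
    n2 = n - n1
    w_mean = n1 * (n + 1) / 2
    w_sd = math.sqrt(n1 * n2 * (n + 1) / 12)  # no tie correction (assumption)

    def t_max(first_group):
        chosen = set(first_group)
        gx = [pooled[i] for i in first_group]
        gy = [pooled[i] for i in range(n) if i not in chosen]
        w = sum(ranks[i] for i in first_group)
        return max(abs(t_statistic(gx, gy)), abs((w - w_mean) / w_sd))

    observed = t_max(tuple(range(n1)))
    splits = list(itertools.combinations(range(n), n1))
    extreme = sum(1 for split in splits if t_max(split) >= observed - 1e-12)
    return extreme / len(splits)


print(maximum_test_pvalue([1, 2, 3], [4, 5, 6]))
```

Because both component statistics are evaluated on the same permuted split, the reference distribution accounts for their dependence, which is the reason a maximum test is typically carried out as a permutation test rather than with two separate critical values.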

7.

Testing equality of variances for several normal populations

Esra Gokpinar, Fikri Gokpinar 《Communications in Statistics - Simulation and Computation》2017,46(1):38-52

In this study, we develop a test based on a computational approach for the equality of variances of several normal populations. The proposed method is numerically compared with the existing methods. The numerical results demonstrate that the proposed method performs very well in terms of type I error rate and power. Furthermore, we study the robustness of the tests through a simulation study in which the underlying data are from t, exponential and uniform distributions. Finally, we analyze a real dataset that motivated our study using the proposed test.

8.

Robust tests for the equality of variances for clustered data

《Journal of Statistical Computation and Simulation》2012,82(4):365-377

Tests for the equality of variances are often needed in applications. In genetic studies the assumption of equal variances of continuous traits, measured in identical and fraternal twins, is crucial for heritability analysis. To test the equality of variances of traits, which are non-normally distributed, Levene [H. Levene, Robust tests for equality of variances, in Contributions to Probability and Statistics, I. Olkin, ed. Stanford University Press, Palo Alto, California, 1960, pp. 278–292] suggested a method that was surprisingly robust under non-normality, and the procedure was further improved by Brown and Forsythe [M.B. Brown and A.B. Forsythe, Robust tests for the equality of variances, J. Amer. Statis. Assoc. 69 (1974), pp. 364–367]. These tests assumed independence of observations. However, twin data are clustered – observations within a twin pair may be dependent due to shared genes and environmental factors. Uncritical application of the tests of Brown and Forsythe to clustered data may result in much higher than nominal Type I error probabilities. To deal with clustering we developed an extended version of Levene's test, where the ANOVA step is replaced with a regression analysis followed by a Wald-type test based on a clustered version of the robust Huber–White sandwich estimator of the covariance matrix. We studied the properties of our procedure using simulated non-normal clustered data and obtained Type I error rates close to nominal as well as reasonable powers. We also applied our method to oral glucose tolerance test data obtained from a twin study of the metabolic syndrome and related components and compared the results with those produced by the traditional approaches.
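
For orientation, the classical Brown–Forsythe variant of Levene's test, which assumes independent observations, can be sketched as a one-way ANOVA F statistic computed on absolute deviations from the group medians. The sketch below covers only this unclustered baseline; it is not the regression and sandwich-estimator extension proposed in the paper, and the function name and example data are assumptions.

```python
import statistics


def brown_forsythe_statistic(groups):
    # One-way ANOVA F statistic on |observation - group median|.
    z = [[abs(v - statistics.median(g)) for v in g] for g in groups]
    k = len(z)
    n = sum(len(g) for g in z)
    grand_mean = sum(sum(g) for g in z) / n
    group_means = [sum(g) / len(g) for g in z]
    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(z, group_means))
    within = sum((v - m) ** 2 for g, m in zip(z, group_means) for v in g)
    # Note: raises ZeroDivisionError if all deviations within every group coincide.
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical data: the second group is twice as spread out as the first.
print(brown_forsythe_statistic([[1, 2, 3], [2, 4, 6]]))
```

Under independence and equal variances the statistic is referred to an F distribution with k - 1 and n - k degrees of freedom; with clustered data such as twin pairs that reference distribution is no longer appropriate, which is the problem the abstract addresses.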

9.

Which Measure Should be Used for Testing in a Paired Design: Simple Difference, Percent Change, or Symmetrized Percent Change?

Handan Camdeviren Ankarali, Seyit Ankarali 《Communications in Statistics - Simulation and Computation》2013,42(2):402-415

We aimed to determine the most appropriate change measure among simple difference, percent change, and symmetrized percent change in simple paired designs. For this purpose, we devised a computer simulation program. Since the distributions of percent and symmetrized percent change values are skewed and bimodal, the paired t-test did not give good results in terms of Type I error and power. To be able to use percent change or symmetrized percent change as a change measure, either the distribution of the test statistic should be transformed to a known theoretical distribution by transformation methods, or a new test statistic for these values should be developed.
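
The three change measures can be written down in a few lines. The sketch assumes one common definition of symmetrized percent change, 100 × (post − pre) / (post + pre); the abstract does not state the formula, so this definition, like the function names, is an assumption.

```python
def simple_difference(pre, post):
    return post - pre


def percent_change(pre, post):
    # Change relative to the baseline value.
    return 100.0 * (post - pre) / pre


def symmetrized_percent_change(pre, post):
    # Assumed definition: change relative to the sum of both values.
    return 100.0 * (post - pre) / (post + pre)


# A doubling and a halving are asymmetric in percent change
# but symmetric in symmetrized percent change.
for pre, post in [(50, 100), (100, 50)]:
    print(pre, post,
          simple_difference(pre, post),
          percent_change(pre, post),
          symmetrized_percent_change(pre, post))
```

The loop illustrates why the ratio-based measures behave differently from the simple difference: swapping the two measurements flips the sign of the simple difference and of the symmetrized percent change, but it changes the magnitude of the percent change.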

10.

Testing the equality of several inverse Gaussian means under heterogeneity

Sana Eftekhar 《Communications in Statistics - Simulation and Computation》2013,42(9):2757-2774

We consider the problem of testing the equality of several inverse Gaussian means when the scale parameters and sample sizes are possibly unequal. We propose four parametric bootstrap (PB) tests based on the uniformly minimum variance unbiased estimators of parameters. We also compare our proposed tests with the existing ones via an extensive simulation study in terms of controlling the Type I error rate and power performance. Simulation results show the merits of the PB tests.

11.

The two-sample t-test with a known ratio of variances

Edna Schechtman, Michael Sherman 《Statistical Methodology》2007,4(4):508-514

We consider the two-sample t-test where error variances are unknown but with known relationships between them. This situation arises, for example, when two measuring instruments average different numbers of replicates to report the response. In particular, we compare our procedure with the usual Satterthwaite approximation in the two-sample t-test with unequal variances. Our procedure uses the knowledge of a known ratio of variances, while the Satterthwaite approximation assumes only that the two variances are unequal. Simulations show that our procedure has both better size and better power than the Satterthwaite approximation. Finally, we consider an extension of our results to the General Linear Model.
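
A minimal sketch of how a known variance ratio can be used: if σ₁² = k·σ₂² with k known, dividing the first group's sum of squares by k lets both groups contribute to a single estimate of σ₂², and under normality the resulting statistic has an exact t distribution with n₁ + n₂ − 2 degrees of freedom. This is the standard textbook construction and is offered as an assumption about the general approach, not as the authors' exact procedure; the function name and data are hypothetical.

```python
import math


def known_ratio_t(x, y, ratio):
    # ratio = var(x) / var(y), assumed known in advance.
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    ss1 = sum((v - m1) ** 2 for v in x)
    ss2 = sum((v - m2) ** 2 for v in y)
    df = n1 + n2 - 2
    # Rescale the first sum of squares so both estimate var(y).
    var_y = (ss1 / ratio + ss2) / df
    se = math.sqrt(var_y * (ratio / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Hypothetical data where the first instrument is ten times less variable.
t, df = known_ratio_t([1, 2, 3], [0, 2, 4, 6, 8], ratio=0.1)
print(t, df)
```

Setting `ratio=1.0` reduces the statistic to the ordinary pooled-variance t test, which is one quick way to sanity-check the construction.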

12.

On the power of the t-test and some rank tests when outliers may be present

Ronald L. Iman, W. J. Conover 《Revue canadienne de statistique》1977,5(2):187-193

The power of some rank tests, used for testing the hypothesis of shift, is found when the underlying distributions contain outliers. The outliers are assumed to occur as the result of mixing two normal distributions with common variance. A small sample case shows how the scores for the rank tests are found and the exact power is computed for each of these rank tests. A Monte Carlo study provides an estimate of the power of the usual two sample t-test.

13.

A comparison of several adaptive tests for paired data

《Journal of Statistical Computation and Simulation》2012,82(9):1083-1093

In the last few years, two adaptive tests for paired data have been proposed. One test proposed by Freidlin et al. [On the use of the Shapiro–Wilk test in two-stage adaptive inference for paired data from moderate to very heavy tailed distributions, Biom. J. 45 (2003), pp. 887–900] is a two-stage procedure that uses a selection statistic to determine which of three rank scores to use in the computation of the test statistic. Another statistic, proposed by O'Gorman [Applied Adaptive Statistical Methods: Tests of Significance and Confidence Intervals, Society for Industrial and Applied Mathematics, Philadelphia, 2004], uses a weighted t-test with the weights determined by the data. These two methods, and an earlier rank-based adaptive test proposed by Randles and Hogg [Adaptive Distribution-free Tests, Commun. Stat. 2 (1973), pp. 337–356], are compared with the t-test and with Wilcoxon's signed-rank test. For sample sizes between 15 and 50, the results show that the adaptive test proposed by Freidlin et al. and the adaptive test proposed by O'Gorman have higher power than the other tests over a range of moderate to long-tailed symmetric distributions. The results also show that the test proposed by O'Gorman has greater power than the other tests for short-tailed distributions. For sample sizes greater than 50 and for small sample sizes the adaptive test proposed by O'Gorman has the highest power for most distributions.

14.

Comparison of nonparametric analysis of variance methods: A vote for van der Waerden

Haiko Luepsen 《Communications in Statistics - Simulation and Computation》2013,42(9):2547-2576

For two-way layouts in a between-subjects analysis of variance design, the parametric F-test is compared with seven nonparametric methods: rank transform (RT), inverse normal transform (INT), aligned rank transform (ART), a combination of ART and INT, Puri & Sen's L statistic, Van der Waerden, and Akritas and Brunner's ANOVA-type statistic (ATS). The type I error rates and the power are computed for 16 normal and nonnormal distributions, with and without homogeneity of variances, for balanced and unbalanced designs as well as for several models including the null and the full model. The aim of this study is to identify a method that is applicable without too much testing for all the attributes of the plot. The Van der Waerden test shows the overall best performance though there are some situations in which it is disappointing. The Puri & Sen's and the ATS tests show generally very low power. These two and the other methods cannot keep the type I error rate under control in too many situations. Especially in the case of lognormal distributions, the use of any of the rank-based procedures can be dangerous for cell sizes above 10. As already shown by many other authors, nonnormal distributions do not violate the parametric F-test, but unequal variances do, and heterogeneity of variances leads to an inflated error rate more or less also for the nonparametric methods. Finally, it should be noted that some procedures show rising error rates with increasing cell sizes, the ART, especially for discrete variables, and the RT, Puri & Sen, and the ATS in the cases of heteroscedasticity.
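
To illustrate the kind of scores the Van der Waerden method relies on, the sketch below computes normal scores Φ⁻¹(r / (N + 1)) from ranks and the usual one-way Van der Waerden statistic. It is a simplified one-way version that assumes no ties, whereas the study concerns two-way layouts; the function names and data are hypothetical.

```python
from statistics import NormalDist


def van_der_waerden_scores(n):
    # Normal scores for ranks 1..n: inverse standard normal CDF of r / (n + 1).
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf(r / (n + 1)) for r in range(1, n + 1)]


def van_der_waerden_statistic(groups):
    # One-way statistic: sum over groups of (score sum)^2 / n_j, divided by s^2.
    # Assumes all observations are distinct (no tie handling).
    pooled = [(value, g) for g, group in enumerate(groups) for value in group]
    pooled.sort(key=lambda pair: pair[0])
    n = len(pooled)
    scores = van_der_waerden_scores(n)
    score_sums = [0.0] * len(groups)
    for (_, g), score in zip(pooled, scores):
        score_sums[g] += score
    s2 = sum(score ** 2 for score in scores) / (n - 1)
    return sum(score_sums[g] ** 2 / len(groups[g]) for g in range(len(groups))) / s2


print(van_der_waerden_scores(3))
print(van_der_waerden_statistic([[1, 2, 3], [4, 5, 6]]))
```

In the one-way case the statistic is usually compared with a chi-square distribution with k − 1 degrees of freedom, where k is the number of groups.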

15.

The performance of tests when observations have different variances

《Journal of Statistical Computation and Simulation》2012,82(1-2):83-92

It is common to test if there is an effect due to a treatment. The commonly used tests have the assumption that the observations differ in location, and that their variances are the same over the groups. Different variances can arise if the observations being analyzed are means of different numbers of observations on individuals or slopes of growth curves with missing data. This study is concerned with cases in which the unequal variances are known, or known to a constant of proportionality. It examines the performance of the t-test, the Mann–Whitney–Wilcoxon Rank Sum test, the Median test, and the Van der Waerden test under these conditions. The t-test based on the weighted means is the likelihood ratio test under normality and has the usual optimality properties. The other tests are compared to it. One may align and scale the observations by subtracting the mean and dividing by the standard deviation of each point. This leads to other, analogous test statistics based on these adjusted observations. These statistics are also compared. Finally, the regression scores tests are compared to the other procedures.

16.

On hypothesis tests in misspecified change-point problems for a Poisson process

Christian Farinetto 《Communications in Statistics - Theory and Methods》2017,46(20):10103-10115

Consider an inhomogeneous Poisson process X on [0, T] whose unknown intensity function “switches” from a lower function g_* to an upper function h_* at some unknown point θ_* that has to be identified. We consider two known continuous functions g and h such that g_*(t) ≤ g(t) < h(t) ≤ h_*(t) for 0 ≤ t ≤ T. We describe the behavior of the generalized likelihood ratio and Wald’s tests constructed on the basis of a misspecified model in the asymptotics of large samples. The power functions are studied under local alternatives and compared numerically with the help of simulations. We also show the following robustness result: the Type I error rate is preserved even though a misspecified model is used to construct tests.

17.

Statistical considerations in bioequivalence of two area under the concentration–time curves obtained from serial sampling data

Steven Y. Hua, D. L. Hawkins, Jihao Zhou 《Journal of applied statistics》2013,40(5):1140-1154

In this paper, we study the bioequivalence (BE) inference problem motivated by pharmacokinetic data that were collected using the serial sampling technique. In serial sampling designs, subjects are independently assigned to one of the two drugs; each subject can be sampled only once, and data are collected at K distinct timepoints from multiple subjects. We consider design and hypothesis testing for the parameter of interest: the area under the concentration–time curve (AUC). Decision rules in demonstrating BE were established using an equivalence test for either the ratio or logarithmic difference of two AUCs. The proposed t-test can deal with cases where two AUCs have unequal variances. To control for the type I error rate, the involved degrees-of-freedom were adjusted using Satterthwaite's approximation. A power formula was derived to allow the determination of necessary sample sizes. Simulation results show that, when the two AUCs have unequal variances, the type I error rate is better controlled by the proposed method compared with a method that only handles equal variances. We also propose an unequal subject allocation method that improves the power relative to that of the equal and symmetric allocation. The methods are illustrated using practical examples.

18.

A Test for the Equality of Parameters for Separate Regression Models in the Presence of Heteroskedasticity

Dennis Oberhelman 《Communications in Statistics - Simulation and Computation》2013,42(1):99-121

Testing for the equality of regression coefficients across two regressions is a problem considered by analysts in a variety of fields. If the variances of the errors of the two regressions are not equal, then it is known that the standard large sample F-test used to test the equality of the coefficients is compromised by the fact that its actual size can differ substantially from the stated level of significance in small samples. This article addresses this problem and borrows from the literature on the Behrens-Fisher problem to provide some simple modifications of the large sample test which allow one to better control the probability of committing a Type I error. Empirical evidence is presented which indicates that the suggested modifications provide tests which are superior to well-known alternative tests over a wide range of the parameter space.

19.

The robustness of the two-sample t-test over the Pearson system

《Journal of Statistical Computation and Simulation》2012,82(3-4):295-311

The present paper has as its objective an accurate quantification of the robustness of the two-sample t-test over an extensive practical range of distributions. The method is that of a major Monte Carlo study over the Pearson system of distributions and the details indicate that the results are quite accurate. The study was conducted over the range β₁ = 0.0(0.4)2.0 (negative and positive skewness) and β₂ = 1.4(0.4)7.8 with equal sample sizes and for both the one- and two-tail t-tests. The significance level and power levels (for nominal values of 0.05, 0.50, and 0.95, respectively) were evaluated for each underlying distribution and for each sample size, with each probability evaluated from 100,000 generated values of the test-statistic. The results precisely quantify the degree of robustness inherent in the two-sample t-test and indicate to a user the degree of confidence one can have in this procedure over various regions of the Pearson system. The results indicate that the equal-sample size two-sample t-test is quite robust with respect to departures from normality, perhaps even more so than most people realize.

20.

An adaptive test for the one-way layout

Thomas W. O'Gorman 《Revue canadienne de statistique》1997,25(2):269-279

An adaptive test is proposed for the one-way layout. This test procedure uses the order statistics of the combined data to obtain estimates of percentiles, which are used to select an appropriate set of rank scores for the one-way test statistic. This test is designed to have reasonably high power over a range of distributions. The adaptive procedure proposed for a one-way layout is a generalization of an existing two-sample adaptive test procedure. In this Monte Carlo study, the power and significance level of the F-test, the Kruskal-Wallis test, the normal scores test, and the adaptive test were evaluated for the one-way layout. All tests maintained their significance level for data sets having at least 24 observations. The simulation results show that the adaptive test is more powerful than the other tests for skewed distributions if the total number of observations equals or exceeds 24. For data sets having at least 60 observations the adaptive test is also more powerful than the F-test for some symmetric distributions.