1.

A comparison of several adaptive tests for paired data

《Journal of Statistical Computation and Simulation》2012,82(9):1083-1093

In the last few years, two adaptive tests for paired data have been proposed. One test proposed by Freidlin et al. [On the use of the Shapiro–Wilk test in two-stage adaptive inference for paired data from moderate to very heavy tailed distributions, Biom. J. 45 (2003), pp. 887–900] is a two-stage procedure that uses a selection statistic to determine which of three rank scores to use in the computation of the test statistic. Another test, proposed by O'Gorman [Applied Adaptive Statistical Methods: Tests of Significance and Confidence Intervals, Society for Industrial and Applied Mathematics, Philadelphia, 2004], uses a weighted t-test with the weights determined by the data. These two methods, and an earlier rank-based adaptive test proposed by Randles and Hogg [Adaptive Distribution-free Tests, Commun. Stat. 2 (1973), pp. 337–356], are compared with the t-test and Wilcoxon's signed-rank test. For sample sizes between 15 and 50, the results show that the adaptive test proposed by Freidlin et al. and the adaptive test proposed by O'Gorman have higher power than the other tests over a range of moderate to long-tailed symmetric distributions. The results also show that the test proposed by O'Gorman has greater power than the other tests for short-tailed distributions. For sample sizes greater than 50 and for small sample sizes, the adaptive test proposed by O'Gorman has the highest power for most distributions.

2.

Robust inference from multiple test statistics via permutations: a better alternative to the single test statistic approach for randomized trials

Jitendra Ganju Xinxin Yu Guoguang Ma 《Pharmaceutical statistics》2013,12(5):282-290

Formal inference in randomized clinical trials is based on controlling the type I error rate associated with a single pre‐specified statistic. The deficiency of using just one method of analysis is that it depends on assumptions that may not be met. For robust inference, we propose pre‐specifying multiple test statistics and relying on the minimum p‐value for testing the null hypothesis of no treatment effect. The null hypothesis associated with the various test statistics is that the treatment groups are indistinguishable. The critical value for hypothesis testing comes from permutation distributions. Rejecting the null hypothesis when the smallest p‐value is less than the critical value controls the type I error rate at its designated value. Even if one of the candidate test statistics has low power, the adverse effect on the power of the minimum p‐value statistic is small. The use of the method is illustrated with examples. We conclude that it is better to rely on the minimum p‐value than on a single statistic, particularly when that single statistic is the logrank test, because of the cost and complexity of many survival trials. Copyright © 2013 John Wiley & Sons, Ltd.
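The minimum p-value idea in the abstract above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`min_p_permutation_test`, `diff_means`, `diff_medians`), the choice of two candidate statistics, the use of random relabelings instead of full enumeration, and the default of 499 relabelings are all hypothetical. Instead of a critical value, the sketch returns the permutation p-value of the minimum p-value, which leads to the same decision.

```python
import random
from statistics import mean, median


def min_p_permutation_test(x, y, statistics, n_perm=499, seed=0):
    """Permutation p-value for the minimum p-value over several statistics."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n_x = len(x)

    def stat_values(sample):
        a, b = sample[:n_x], sample[n_x:]
        return [abs(stat(a, b)) for stat in statistics]

    # Row 0 holds the observed data; the other rows are random relabelings.
    rows = [stat_values(pooled)]
    for _ in range(n_perm):
        shuffled = pooled[:]
        rng.shuffle(shuffled)
        rows.append(stat_values(shuffled))

    n_rows = len(rows)
    n_stats = len(statistics)

    def smallest_p(row):
        # Per-statistic permutation p-value of this row, then the minimum.
        return min(
            sum(other[j] >= row[j] for other in rows) / n_rows
            for j in range(n_stats)
        )

    min_p = [smallest_p(row) for row in rows]
    # How often does a relabeling give a minimum p-value as small as observed?
    return sum(m <= min_p[0] for m in min_p) / n_rows


def diff_means(a, b):
    return mean(a) - mean(b)


def diff_medians(a, b):
    return median(a) - median(b)
```

Calling `min_p_permutation_test(treated, control, [diff_means, diff_medians])` with two lists of outcomes returns a single p-value; because the same relabelings are used to calibrate the minimum, no further multiplicity adjustment is applied in this sketch.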

3.

Two separate effects of variance heterogeneity on the validity and power of significance tests of location

Donald W. Zimmerman 《Statistical Methodology》2006,3(4):351-374

Heterogeneity of variances of treatment groups influences the validity and power of significance tests of location in two distinct ways. First, if sample sizes are unequal, the Type I error rate and power are depressed if a larger variance is associated with a larger sample size, and elevated if a larger variance is associated with a smaller sample size. This well-established effect, which occurs in t and F tests, and to a lesser degree in nonparametric rank tests, results from unequal contributions of pooled estimates of error variance in the computation of test statistics. It is observed in samples from normal distributions, as well as non-normal distributions of various shapes. Second, transformation of scores from skewed distributions with unequal variances to ranks produces differences in the means of the ranks assigned to the respective groups, even if the means of the initial groups are equal, and a subsequent inflation of Type I error rates and power. This effect occurs for all sample sizes, equal and unequal. For the t test, the discrepancy diminishes, and for the Wilcoxon–Mann–Whitney test, it becomes larger, as sample size increases. The Welch separate-variance t test overcomes the first effect but not the second. Because of interaction of these separate effects, the validity and power of both parametric and nonparametric tests performed on samples of any size from unknown distributions with possibly unequal variances can be distorted in unpredictable ways.
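The first effect described above can be made concrete with the textbook formulas for the pooled-variance t statistic and the Welch separate-variance t statistic. The sketch below works from summary statistics; the function names and the example numbers are illustrative assumptions, not taken from the article.

```python
from math import sqrt


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Student's t statistic using the pooled estimate of error variance."""
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    return (mean1 - mean2) / sqrt(pooled_var * (1 / n1 + 1 / n2))


def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Welch's t statistic and its Welch-Satterthwaite degrees of freedom."""
    se1, se2 = var1 / n1, var2 / n2
    t = (mean1 - mean2) / sqrt(se1 + se2)
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df
```

With a small, high-variance group (n = 5, variance 16) and a large, low-variance group (n = 20, variance 1), the pooled statistic for a mean difference of 2 is roughly 2.11, while the Welch statistic is roughly 1.11. The pooled estimate is dominated by the larger group, so the standard error is understated, which is the direction of distortion the abstract describes for a larger variance paired with a smaller sample.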

4.

Goodness-of-fit tests for lifetime distributions based on Type II censored data

Hadi Alizadeh Noughabi 《Journal of Statistical Computation and Simulation》2017,87(9):1787-1798

In this article, the general test statistic introduced by Alizadeh Noughabi and Balakrishnan [Goodness of fit using a new estimate of Kullback-Leibler information based on Type II censored data. IEEE Trans Reliab. 2015;64:627–635.] is applied for testing goodness of fit of lifetime distributions based on Type II censored data. The test statistic is constructed based on an estimate of Kullback–Leibler (KL) information. We investigate the properties of the proposed test statistic, such as its nonnegativity, a property it shares with KL information. We apply this test statistic to the following distributions: Exponential, Weibull, Log-normal and Pareto. The critical values and Type I error of the proposed tests are obtained. It is shown that the proposed tests have an excellent Type I error and hence can be used confidently in practice. Then, by Monte Carlo simulations, the power values of the proposed tests are computed against several alternatives and compared with those of the existing tests. Finally, some real-world reliability data are used for illustrative purposes.

5.

Power of the adjusted Q statistic to evaluate heterogeneity in meta-analyses of cluster randomized trials

Shun Fu Lee Allan Donner Neil Klar 《Communications in Statistics - Simulation and Computation》2017,46(9):7062-7073

Because of its simplicity, the Q statistic is frequently used to test the heterogeneity of the estimated intervention effect in meta-analyses of individually randomized trials. However, it is inappropriate to apply it directly to meta-analyses of cluster randomized trials without taking clustering effects into account. We consider the properties of the adjusted Q statistic for testing heterogeneity in meta-analyses of cluster randomized trials with binary outcomes. We also derive an analytic expression for the power of this statistic to detect heterogeneity in meta-analyses, which can be useful when planning a meta-analysis. A simulation study is used to assess the performance of the adjusted Q statistic, in terms of its Type I error rate and power. The simulation results are compared to those obtained from the proposed formula. It is found that the adjusted Q statistic has a Type I error rate close to the nominal level of 5%, whereas the unadjusted Q statistic, commonly used to test for heterogeneity in meta-analyses of individually randomized trials, has an inflated Type I error rate. Data from a meta-analysis of four cluster randomized trials are used to illustrate the procedures.
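For orientation, the unadjusted Q statistic is the inverse-variance weighted sum of squared deviations of the trial estimates from their pooled mean. The sketch below computes it and shows one common way to account for clustering, namely inflating each variance by the design effect 1 + (m − 1)ρ. Whether this matches the exact form of the adjusted Q statistic studied in the article is an assumption; the function names and numbers are hypothetical.

```python
def cochran_q(effects, variances):
    """Inverse-variance weighted Q statistic for heterogeneity."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))


def design_effect(mean_cluster_size, icc):
    """Variance inflation factor 1 + (m - 1) * rho for cluster sampling."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


# Three hypothetical trial estimates, each with variance 0.01 when the
# clustering is ignored.
effects = [0.1, 0.3, 0.5]
naive_variances = [0.01, 0.01, 0.01]
q_unadjusted = cochran_q(effects, naive_variances)

# Inflate the variances for clusters of mean size 21 and ICC 0.05.
deff = design_effect(21, 0.05)
q_adjusted = cochran_q(effects, [v * deff for v in naive_variances])
```

In this toy example the design effect is 2, so Q drops from 8 to 4. Under homogeneity Q is usually referred to a chi-square distribution with k − 1 degrees of freedom, so ignoring the clustering makes the trials look more heterogeneous than they are.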

6.

Negative exponential disparity based family of goodness-of-fit tests for multinomial models

《Journal of Statistical Computation and Simulation》2012,82(1-4):43-61

This paper investigates a new family of goodness-of-fit tests based on the negative exponential disparities. This family includes the popular Pearson's chi-square as a member and is a subclass of the general class of disparity tests (Basu and Sarkar, 1994), which also contains the family of power divergence statistics. Pitman efficiency and finite sample power comparisons between different members of this new family are made. Three asymptotic approximations of the exact null distributions of the negative exponential disparity family of tests are discussed. Some numerical results on the small sample performance of this family of tests are presented for the symmetric null hypothesis. It is shown that the negative exponential disparity family, like the power divergence family, produces a new goodness-of-fit test statistic that can be a very attractive alternative to Pearson's chi-square. Some numerical results suggest that applying this test statistic, as an alternative to Pearson's chi-square, could be preferable to the I^(2/3) statistic of Cressie and Read (1984) when chi-square critical values are used.

7.

Testing for normality in linear regression models

《Journal of Statistical Computation and Simulation》2012,82(10):1101-1113

The importance of the normal distribution for fitting continuous data is well known. However, in many practical situations the data distribution departs from normality. For example, the sample skewness and the sample kurtosis may be far from 0 and 3, respectively, the values that characterize normal distributions. So, it is important to have formal tests of normality against any alternative. D'Agostino et al. [A suggestion for using powerful and informative tests of normality, Am. Statist. 44 (1990), pp. 316–321] review four procedures, Z²(g₁), Z²(g₂), D and K², for testing departure from normality. The first two of these procedures are tests of normality against departure due to skewness and kurtosis, respectively. The other two tests are omnibus tests. An alternative to the normal distribution is a class of skew-normal distributions (see [A. Azzalini, A class of distributions which includes the normal ones, Scand. J. Statist. 12 (1985), pp. 171–178]). In this paper, we obtain a score test (W) and a likelihood ratio test (LR) of goodness of fit of the normal regression model against the skew-normal family of regression models. It turns out that the score test is based on the sample skewness and is of very simple form. The performance of these six procedures, in terms of size and power, is compared using simulations. The level properties of the three statistics LR, W and Z²(g₁) are similar and close to the nominal level for moderate to large sample sizes. Also, their power properties are similar for small departures from normality due to skewness (γ₁≤0.4). Of these, the score test statistic has a very simple form and is computationally much simpler than the other two statistics. The LR statistic, in general, has the highest power, although it is computationally much more complex, as it requires estimates of the parameters under the normal model as well as those under the skew-normal model. So, the score test may be used to test for normality against small departures from normality due to skewness. Otherwise, the likelihood ratio statistic LR should be used, as it detects general departures from normality (due to both skewness and kurtosis) with, in general, the largest power.
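The reference values 0 and 3 mentioned in the abstract are those of the moment-based skewness and kurtosis coefficients. A minimal sketch of the two sample coefficients follows; the function name is hypothetical, and the sketch uses the plain moment estimators on raw data, whereas the article works with regression models and the cited tests apply further transformations.

```python
def sample_skewness_kurtosis(data):
    """Moment-based sample skewness (m3 / m2^1.5) and kurtosis (m4 / m2^2)."""
    n = len(data)
    centre = sum(data) / n
    m2 = sum((v - centre) ** 2 for v in data) / n
    m3 = sum((v - centre) ** 3 for v in data) / n
    m4 = sum((v - centre) ** 4 for v in data) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# A symmetric sample has skewness 0; its kurtosis here is 1.7, well below 3.
skew_sym, kurt_sym = sample_skewness_kurtosis([1, 2, 3, 4, 5])

# One large value in the right tail gives positive skewness.
skew_right, _ = sample_skewness_kurtosis([1, 1, 1, 1, 10])
```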

8.

The effect of nonnormality on near-replicate lack-of-fit tests

Edward J. Bedrick Ronald Christensen 《Revue canadienne de statistique》1999,27(3):471-484

We examine the effect of nonnormality on the distributions of near-replicate lack-of-fit F-tests. We show that when the number of clusters is large, the distributions of the lack-of-fit tests depend on the kurtosis of the error distribution, and that heavy-tailed error distributions can inflate significantly the sizes of the tests. This behaviour is also evident in small samples, where some lack-of-fit tests are clearly more affected than others by nonnormality. Two modifications of the F-tests are suggested to eliminate the effect of the kurtosis on the limiting null distributions, and their behaviour is studied for small samples.

9.

A Multisample Rank Test for Location-Scale Parameters

Hidetoshi Murakami 《Communications in Statistics - Simulation and Computation》2013,42(7):1347-1355

One of the multisample problems is discussed in this article. New multisample rank tests based on a k-sample Baumgartner statistic are proposed for testing the location-scale parameters. The exact critical values of the proposed statistics are calculated. Simulations are used to investigate the power of the proposed statistics for various population distributions.

10.

Comparison of nonparametric analysis of variance methods: A vote for van der Waerden

Haiko Luepsen 《Communications in Statistics - Simulation and Computation》2013,42(9):2547-2576

For two-way layouts in a between-subjects analysis of variance design, the parametric F-test is compared with seven nonparametric methods: rank transform (RT), inverse normal transform (INT), aligned rank transform (ART), a combination of ART and INT, Puri & Sen's L statistic, Van der Waerden, and Akritas and Brunner's ANOVA-type statistics (ATS). The type I error rates and the power are computed for 16 normal and nonnormal distributions, with and without homogeneity of variances, for balanced and unbalanced designs as well as for several models including the null and the full model. The aim of this study is to identify a method that is applicable without too much testing for all the attributes of the plot. The Van der Waerden test shows the best overall performance, though there are some situations in which it is disappointing. The Puri & Sen and the ATS tests generally show very low power. These two and the other methods cannot keep the type I error rate under control in too many situations. Especially in the case of lognormal distributions, the use of any of the rank-based procedures can be dangerous for cell sizes above 10. As already shown by many other authors, nonnormal distributions do not invalidate the parametric F-test, but unequal variances do, and heterogeneity of variances also leads to an inflated error rate, to a greater or lesser degree, for the nonparametric methods. Finally, it should be noted that some procedures show rising error rates with increasing cell sizes: the ART, especially for discrete variables, and the RT, Puri & Sen, and the ATS in cases of heteroscedasticity.

11.

Multi-Sample Likelihood Ratio Tests Based on Bipolar Watson Distributions Defined on the Hypersphere

Adelaide Figueiredo 《Communications in Statistics - Theory and Methods》2013,42(4):815-820

We derive likelihood ratio tests for the equality of the directional parameters of k bipolar Watson distributions defined on the hypersphere with common concentration parameter. We analyze the power of these tests in the case of two distributions, supposing under the alternative hypothesis two directional parameters forming an angle that varies from 18° to 90°. We also compare the likelihood ratio tests with a high-concentration F-test.

12.

An adaptive test for the one-way layout

Thomas W. O'GORMAN 《Revue canadienne de statistique》1997,25(2):269-279

An adaptive test is proposed for the one-way layout. This test procedure uses the order statistics of the combined data to obtain estimates of percentiles, which are used to select an appropriate set of rank scores for the one-way test statistic. This test is designed to have reasonably high power over a range of distributions. The adaptive procedure proposed for a one-way layout is a generalization of an existing two-sample adaptive test procedure. In this Monte Carlo study, the power and significance level of the F-test, the Kruskal-Wallis test, the normal scores test, and the adaptive test were evaluated for the one-way layout. All tests maintained their significance level for data sets having at least 24 observations. The simulation results show that the adaptive test is more powerful than the other tests for skewed distributions if the total number of observations equals or exceeds 24. For data sets having at least 60 observations the adaptive test is also more powerful than the F-test for some symmetric distributions.

13.

Parametric bootstrap and approximate tests for two Poisson variates

《Journal of Statistical Computation and Simulation》2012,82(3):263-271

The parametric bootstrap tests and the asymptotic or approximate tests for detecting a difference between two Poisson means are compared. The test statistics used are the Wald statistics with and without log-transformation, the Cox F statistic and the likelihood ratio statistic. It is found that the type I error rate of an asymptotic/approximate test may deviate too much from the nominal significance level α in some situations. We recommend the parametric bootstrap tests, under which the four test statistics are similarly powerful and their type I error rates are all close to α. We apply the tests to breast cancer data and injurious motor vehicle crash data.
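A parametric bootstrap test of this kind can be sketched for one of the listed statistics, the Wald statistic without log-transformation. The details below are assumptions rather than the authors' procedure: `x1` and `x2` are taken to be total counts over `n1` and `n2` units, the null distribution is simulated from the pooled rate, the test is two-sided, and the helper `poisson_draw` uses Knuth's multiplication method because the Python standard library has no Poisson sampler.

```python
import math
import random


def wald_z(x1, n1, x2, n2):
    """Wald statistic for the difference of two Poisson rates x1/n1 and x2/n2."""
    variance = x1 / n1 ** 2 + x2 / n2 ** 2
    if variance == 0:
        return 0.0  # both counts are zero: no evidence of a difference
    return (x1 / n1 - x2 / n2) / math.sqrt(variance)


def poisson_draw(rng, mu):
    """Draw one Poisson(mu) variate; adequate for moderate mu only."""
    limit = math.exp(-mu)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def bootstrap_p_value(x1, n1, x2, n2, n_boot=999, seed=0):
    """Two-sided parametric bootstrap p-value under a common Poisson rate."""
    rng = random.Random(seed)
    pooled_rate = (x1 + x2) / (n1 + n2)  # rate estimate under the null
    observed = abs(wald_z(x1, n1, x2, n2))
    hits = 0
    for _ in range(n_boot):
        b1 = poisson_draw(rng, n1 * pooled_rate)
        b2 = poisson_draw(rng, n2 * pooled_rate)
        if abs(wald_z(b1, n1, b2, n2)) >= observed:
            hits += 1
    return (hits + 1) / (n_boot + 1)
```

For example, `bootstrap_p_value(40, 10, 5, 10)` compares rates of 4.0 and 0.5 per unit. Swapping `wald_z` for another statistic leaves the resampling loop unchanged, which is what makes it straightforward to compare several statistics under the same bootstrap scheme.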

14.

The Minimaxity of the Mid P-value under Linear and Squared Loss Functions

Ian Fellows 《Communications in Statistics - Theory and Methods》2013,42(2):244-254

The mid-p is defined as the sum of the probabilities of all outcomes more extreme than an observed value, plus half of the probabilities of all outcomes exactly as extreme. On the one hand, it offers greater power than the standard p-value, but on the other, tests based on the mid-p statistic may have greater Type I error than their nominal level. This article investigates the mid p-value's properties under the estimated truth paradigm, which views p-values as estimators of the truth. The mid-p is shown to minimize the maximum risk for one-sided and two-sided tests.
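The definition above translates directly into code. The sketch below uses a one-sided binomial test (upper tail) purely as an illustrative setting; the article is not restricted to the binomial case, and the function names are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def upper_tail_p_values(k, n, p0):
    """Return (standard p-value, mid-p) for H1: p > p0 after observing k."""
    more_extreme = sum(binom_pmf(j, n, p0) for j in range(k + 1, n + 1))
    as_extreme = binom_pmf(k, n, p0)
    standard = more_extreme + as_extreme
    mid_p = more_extreme + 0.5 * as_extreme
    return standard, mid_p


# 8 successes in 10 trials under p0 = 0.5:
# standard p-value = 56/1024, mid-p = 33.5/1024.
standard, mid_p = upper_tail_p_values(8, 10, 0.5)
```

Here the standard p-value (about 0.0547) does not reach the 5% level while the mid-p (about 0.0327) does, which illustrates both the power gain and the possibility of exceeding the nominal Type I error that the abstract mentions.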

15.

Linear hypothesis testing for weighted functional data with applications

Łukasz Smaga Jin-Ting Zhang 《Scandinavian Journal of Statistics》2020,47(2):493-515

In socioeconomic areas, functional observations may be collected with weights, called weighted functional data. In this paper, we deal with a general linear hypothesis testing (GLHT) problem in the framework of functional analysis of variance with weighted functional data. With weights taken into account, we obtain unbiased and consistent estimators of the group mean and covariance functions. For the GLHT problem, we obtain a pointwise F-test statistic and build two global tests, respectively, via integrating the pointwise F-test statistic or taking its supremum over an interval of interest. The asymptotic distributions of test statistics under the null and some local alternatives are derived. Methods for approximating their null distributions are discussed. An application of the proposed methods to density function data is also presented. Intensive simulation studies and two real data examples show that the proposed tests outperform the existing competitors substantially in terms of size control and power.

16.

Cramér-von Mises tests of fit for the Poisson distribution

John J. Spinelli Michael A. Stephens 《Revue canadienne de statistique》1997,25(2):257-268

Goodness-of-fit tests based on the Cramér-von Mises statistics are given for the Poisson distribution. Power comparisons show that these statistics, particularly A², give good overall tests of fit. The statistic A² will be particularly useful for detecting distributions where the variance is close to the mean, but which are not Poisson.

17.

Testing diagonality of high-dimensional covariance matrix under non-normality

Kai Xu 《Journal of Statistical Computation and Simulation》2017,87(16):3208-3224

This article is concerned with testing diagonality of a high-dimensional covariance matrix under non-normality, which is more practical than testing sphericity and identity in the high-dimensional setting. The existing testing procedure for diagonality is not robust against either the data dimension or the data distribution, producing tests with distorted type I error rates much larger than nominal levels. This is mainly due to bias from estimating some functions of the high-dimensional covariance matrix under non-normality. Compared to the sphericity and identity hypotheses, the asymptotic properties under the diagonality hypothesis are more involved, and more care is needed in dealing with bias. We develop a correction that makes the existing test statistic robust against both the data dimension and the data distribution. We show that the proposed test statistic is asymptotically normal without the normality assumption and without specifying an explicit relationship between the dimension p and the sample size n. Simulations show that it has good size and power for a wide range of settings.

18.

On the use of priors in goodness‐of‐fit tests

Alberto Contreras‐Cristán Richard A. Lockhart Michael A. Stephens Shaun Z. Sun 《Revue canadienne de statistique》2019,47(4):560-579

Priors are introduced into goodness‐of‐fit tests, both for unknown parameters in the tested distribution and on the alternative density. Neyman–Pearson theory leads to the test with the highest expected power. To make the test practical, we seek priors that make it likely a priori that the power will be larger than the level of the test but not too close to one. As a result, priors are sample size dependent. We explore this procedure in particular for priors that are defined via a Gaussian process approximation for the logarithm of the alternative density. In the case of testing for the uniform distribution, we show that the optimal test is of the U‐statistic type and establish limiting distributions for the optimal test statistic, both under the null hypothesis and averaged over the alternative hypotheses. The optimal test statistic is shown to be of the Cramér–von Mises type for specific choices of the Gaussian process involved. The methodology when parameters in the tested distribution are unknown is discussed and illustrated in the case of testing for the von Mises distribution. The Canadian Journal of Statistics 47: 560–579; 2019 © 2019 Statistical Society of Canada

19.

Exact unconditional testing procedures for comparing two independent Poisson rates

《Journal of Statistical Computation and Simulation》2012,82(5):947-955

We consider seven exact unconditional testing procedures for comparing adjusted incidence rates between two groups from a Poisson process. Exact tests are always preferable due to the guarantee of test size in small to medium sample settings. Han [Comparing two independent incidence rates using conditional and unconditional exact tests. Pharm Stat. 2008;7(3):195–201] compared the performance of partial maximization p-values based on the Wald test statistic, the likelihood ratio test statistic, the score test statistic, and the conditional p-value. These four testing procedures do not perform consistently, as the results depend on the choice of test statistics for general alternatives. We consider the approach based on estimation and partial maximization, and compare these to the ones studied by Han (2008) for testing superiority. The procedures are compared with regard to the actual type I error rate and power under various conditions. An example from a biomedical research study is provided to illustrate the testing procedures. The approach based on partial maximization using the score test is recommended due to the comparable performance and computational advantage in large sample settings. Additionally, the approach based on estimation and partial maximization performs consistently for all the three test statistics, and is also recommended for use in practice.

20.

Distribution-free comparison of hazard rates of two distributions under progressive type-II censoring

Maryam Sharafi N. Balakrishnan Baha-Eldin Khaledi 《Journal of Statistical Computation and Simulation》2013,83(8):1527-1542

Some distribution-free tests have been discussed in the literature with regard to the comparison of hazard rates of two distributions when the available samples are complete. We generalize here Kochar's [S.C. Kochar, A new distribution-free test for the equality of two failure rates, Biometrika 68 (1981), pp. 423–426] test statistic to the case when one available sample is progressively Type-II censored, and then derive its exact null distribution and examine its power properties by means of a Monte Carlo simulation study.