Similar literature
Found 20 similar articles; search took 750 ms.
1.
The independence assumption in statistical significance testing becomes increasingly crucial and unforgiving as sample size increases. Seemingly inconsequential violations of this assumption can substantially increase the probability of a Type I error if sample sizes are large. In the case of Student's t test, it is found that correlations within samples in a range from 0.01 to 0.05 can lead to rejection of a true null hypothesis with high probability if N is 50, 100, or larger.
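
The effect described in this abstract can be explored with a small Monte Carlo sketch. It is not the authors' procedure: the equicorrelated normal model, the function names, the number of replications, and the use of a normal critical value in place of the exact t quantile are all assumptions made for illustration.

```python
import random
import statistics
from statistics import NormalDist


def correlated_sample(n, rho, rng):
    """Draw n unit-variance normal scores whose pairwise correlation is rho.

    Assumption: the dependence comes from one shared random component.
    """
    shared = rng.gauss(0.0, 1.0)
    return [rho ** 0.5 * shared + (1.0 - rho) ** 0.5 * rng.gauss(0.0, 1.0)
            for _ in range(n)]


def pooled_t(x, y):
    """Student's two-sample t statistic with a pooled variance estimate."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    diff = statistics.fmean(x) - statistics.fmean(y)
    return diff / (pooled_var * (1.0 / nx + 1.0 / ny)) ** 0.5


def type1_rate(n, rho, reps=2000, alpha=0.05, seed=0):
    """Estimate how often a true null hypothesis is rejected."""
    rng = random.Random(seed)
    # Large-sample approximation: normal quantile instead of the t quantile.
    crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    rejections = 0
    for _ in range(reps):
        x = correlated_sample(n, rho, rng)
        y = correlated_sample(n, rho, rng)
        rejections += abs(pooled_t(x, y)) > crit
    return rejections / reps


if __name__ == "__main__":
    for rho in (0.0, 0.01, 0.05):
        print(rho, type1_rate(n=50, rho=rho))
```

Under these assumptions the estimated rejection rate stays near the nominal level for `rho = 0` and rises well above it as `rho` grows, even though both population means are equal.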

2.
A test based on Tiku's MML (modified maximum likelihood) estimators is developed for testing that the population correlation coefficient is zero. The test is compared with various other tests and is shown to have good Type I error robustness and power for numerous symmetric and skew bivariate populations.

3.
The aim of this study is to compare the performance of cointegration tests commonly used in the literature in terms of their empirical power and Type I error probability for various sample sizes. The study finds that some tests are not appropriate for testing cointegration in terms of empirical power and Type I error probability. According to the simulation study, the λmax test is the most appropriate test for all values of ρ and all sample sizes considered.

4.
Some nonparametric methods have been proposed to compare survival medians. Most of them rely on the asymptotic null distribution to estimate the p-value. However, for small to moderate sample sizes, those tests may have inflated Type I error rates, which limits their application. In this article, we propose a new nonparametric test that uses the bootstrap to estimate the sample mean and variance of the median. Through comprehensive simulation, we show that the proposed approach can control Type I error rates well. A real data application is used to illustrate the use of the new test.
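
The idea of estimating the mean and variance of a sample median by resampling can be sketched as follows. This is a minimal illustration under strong assumptions, not the test proposed in the article: it ignores censoring (which real survival data would require handling), and the function names, the number of resamples, and the z-type statistic are hypothetical choices.

```python
import random
import statistics


def bootstrap_median_moments(data, n_boot=1000, seed=0):
    """Return the bootstrap mean and variance of the sample median.

    Assumption: observations are complete (no censoring).
    """
    rng = random.Random(seed)
    medians = [statistics.median(rng.choices(data, k=len(data)))
               for _ in range(n_boot)]
    return statistics.fmean(medians), statistics.variance(medians)


def median_difference_z(x, y, n_boot=1000, seed=0):
    """Hypothetical z-type statistic comparing two medians.

    Raises ZeroDivisionError if both bootstrap variances are zero,
    which can happen for degenerate (constant) samples.
    """
    mean_x, var_x = bootstrap_median_moments(x, n_boot, seed)
    mean_y, var_y = bootstrap_median_moments(y, n_boot, seed + 1)
    return (mean_x - mean_y) / (var_x + var_y) ** 0.5


if __name__ == "__main__":
    rng = random.Random(42)
    group_a = [rng.expovariate(1.0) for _ in range(30)]
    group_b = [rng.expovariate(0.5) for _ in range(30)]
    print(median_difference_z(group_a, group_b))
```

A statistic of this kind would then be referred to a reference distribution; which distribution is appropriate for small samples is exactly the question the article studies, so none is assumed here.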

5.
In this article, we consider the problem of comparing several multivariate normal mean vectors when the covariance matrices are unknown and arbitrary positive definite matrices. We propose a parametric bootstrap (PB) approach and develop an approximation to the distribution of the PB pivotal quantity for comparing two mean vectors. This approximate test is shown to be the same as the invariant test given in [Krishnamoorthy and Yu, Modified Nel and Van der Merwe test for the multivariate Behrens–Fisher problem, Stat. Probab. Lett. 66 (2004), pp. 161–169] for the multivariate Behrens–Fisher problem. Furthermore, we compare the PB test with two existing invariant tests via Monte Carlo simulation. Our simulation studies show that the PB test controls Type I error rates very satisfactorily, whereas other tests are liberal, especially when the number of means to be compared is moderate and/or sample sizes are small. The tests are illustrated using an example.

6.
In this article, the two-way error component regression model is considered. For testing nonhomogeneous linear hypotheses on the regression coefficients, a parametric bootstrap (PB) approach is proposed. Simulation results indicate that the PB test, regardless of the sample sizes, maintains the Type I error rates very well and outperforms the existing generalized variable test, which may far exceed the intended significance level when the sample sizes are small or moderate. Real data examples illustrate that the proposed approach works quite satisfactorily.

7.
Although many methods are available for performing multiple comparisons based on some measure of location, simulations show that most can be unsatisfactory in at least some situations when sample sizes are small, say less than or equal to twenty. That is, the actual Type I error probability can substantially exceed the nominal level, and for some methods the actual Type I error probability can be well below the nominal level, suggesting that power might be relatively poor. In addition, all methods based on means can have relatively low power under arbitrarily small departures from normality. Currently, a method based on 20% trimmed means and a percentile bootstrap method performs relatively well (Wilcox, in press). However, symmetric trimming was used even when sampling from a highly skewed distribution, and a rigid adherence to 20% trimming can result in low efficiency when a distribution is sufficiently heavy-tailed. Robust M-estimators are more flexible, but they can be unsatisfactory in terms of Type I errors when sample sizes are small. This paper describes an alternative approach based on a modified one-step M-estimator that introduces more flexibility than a trimmed mean but provides better control over Type I error probabilities compared with using a one-step M-estimator.

8.
A large‐sample problem of illustrating noninferiority of an experimental treatment over a referent treatment for binary outcomes is considered. The methods of illustrating noninferiority involve constructing the lower two‐sided confidence bound for the difference between binomial proportions corresponding to the experimental and referent treatments and comparing it with the negative value of the noninferiority margin. The three considered methods, Anbar, Falk–Koch, and Reduced Falk–Koch, handle the comparison in an asymmetric way, that is, only the referent proportion out of the two, experimental and referent, is directly involved in the expression for the variance of the difference between two sample proportions. Five continuity corrections (including zero) are considered with respect to each approach. The key properties of the corresponding methods are evaluated via simulations. First, the uncorrected two‐sided confidence intervals can, potentially, have smaller coverage probability than the nominal level even for moderately large sample sizes, for example, 150 per group. Next, the 15 testing methods are discussed in terms of their Type I error rate and power. In the settings with a relatively small referent proportion (about 0.4 or smaller), the Anbar approach with Yates’ continuity correction is recommended for balanced designs and the Falk–Koch method with Yates’ correction is recommended for unbalanced designs. For relatively moderate (about 0.6) and large (about 0.8 or greater) referent proportion, the uncorrected Reduced Falk–Koch method is recommended, although in this case, all methods tend to be over‐conservative. These results are expected to be used in the design stage of a noninferiority study when asymmetric comparisons are envisioned. Copyright © 2013 John Wiley & Sons, Ltd.

9.
A Monte Carlo study was used to examine the Type I error and power values of five multivariate tests for the single-factor repeated measures model. The performance of Hotelling's T² and four nonparametric tests, including a chi-square and an F-test version of a rank-transform procedure, was investigated for different distributions, sample sizes, and numbers of repeated measures. The results indicated that both Hotelling's T² and the F-test version of the rank-transform performed well, producing Type I error rates that were close to the nominal value. The chi-square version of the rank-transform test, on the other hand, produced inflated Type I error rates for every condition studied. The Hotelling and F-test versions of the rank-transform procedure showed similar power for moderately skewed distributions, but for strongly skewed distributions the F-test showed much better power. The performance of the other nonparametric tests depended heavily on sample size. Based on these results, the F-test version of the rank-transform procedure is recommended for the single-factor repeated measures model.

10.
Two overlapping confidence intervals have been used in the past to conduct statistical inferences about two population means and proportions. Several authors have examined the shortcomings of the Overlap procedure and have determined that such a method distorts the significance level of testing the null hypothesis of two population means and reduces the statistical power of the test. Nearly all results for small samples in the Overlap literature have been obtained either by simulation or by formulas that may need refinement for small sample sizes, but accurate large-sample information exists. Nevertheless, there are aspects of Overlap that have not been presented and compared against the standard statistical procedure. This article will present exact formulas for the maximum percentage overlap of two independent confidence intervals below which the null hypothesis of equality of two normal population means or variances must still be rejected for any sample sizes. Further, the impact of Overlap on the power of testing the null hypothesis of equality of two normal variances will be assessed. Finally, the noncentral t-distribution is used to assess the Overlap impact on Type II error probability when testing equality of means for sample sizes larger than 1.
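
The distortion of the significance level mentioned in this abstract can be illustrated in the simplest setting. The sketch below assumes two independent normal means with known standard errors, which is not the article's exact small-sample setting, and it does not reproduce the article's formulas; the function name and parameters are hypothetical. Two intervals of the form estimate ± z·SE fail to overlap only when the difference exceeds z·(SE₁ + SE₂), whereas the standard test rejects when it exceeds z·√(SE₁² + SE₂²).

```python
from statistics import NormalDist


def overlap_rule_alpha(se1, se2, level=0.95):
    """Actual two-sided Type I error rate of the 'no overlap' decision rule.

    Assumptions: independent normal estimators with known standard errors
    se1 and se2, each summarized by a confidence interval at `level`.
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(0.5 + level / 2.0)
    # Non-overlap requires |difference| > z * (se1 + se2); the standard error
    # of the difference is sqrt(se1^2 + se2^2), so the effective critical
    # value on the z scale is inflated by this ratio (always >= 1).
    inflation = (se1 + se2) / (se1 ** 2 + se2 ** 2) ** 0.5
    return 2.0 * (1.0 - std_normal.cdf(z * inflation))


if __name__ == "__main__":
    print(overlap_rule_alpha(1.0, 1.0))   # equal standard errors
    print(overlap_rule_alpha(1.0, 0.1))   # very unequal standard errors
```

With equal standard errors and 95% intervals, the rule's actual significance level under these assumptions is roughly 0.006 rather than 0.05, which is why the Overlap rule is conservative and loses power.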

11.
We evaluated the properties of six statistical methods for testing equality among populations with zero-inflated continuous distributions. These tests are based on likelihood ratio (LR), Wald, central limit theorem (CLT), modified CLT (MCLT), parametric jackknife (PJ), and nonparametric jackknife (NPJ) statistics. We investigated their statistical properties using simulated data from mixed distributions with an unknown portion of nonzero observations that have an underlying gamma, exponential, or log-normal density function and a remaining portion of excess zeros. The six statistical tests are compared in terms of their empirical Type I errors and powers, estimated through 10,000 repeated simulated samples for carefully selected configurations of parameters. The LR, Wald, and PJ tests are the preferred tests, since their empirical Type I errors were close to the preset nominal 0.05 level and each demonstrated good power for rejecting null hypotheses when the sample sizes are at least 125 in each group. The NPJ test had unacceptable empirical Type I errors because it rejected far too often, while the CLT and MCLT tests had low power in some cases. Therefore, these three tests are not recommended for general use, but the LR, Wald, and PJ tests all performed well in large-sample applications.

12.
This paper elaborates on earlier contributions of Bross (1985) and Millard (1987) who point out that when conducting conventional hypothesis tests in order to “prove” environmental hazard or environmental safety, unrealistically large sample sizes are required to achieve acceptable power with customarily-used values of Type I error probability. These authors also note that “proof of safety” typically requires much larger sample sizes than “proof of hazard”. When the sample has yet to be selected and it is feared that the sample size will be insufficient to conduct a reasonable.

13.
In biomedical studies, the problem of testing two-sample survival curves is commonly seen. The most popular approach is the log-rank test. However, the log-rank test may lead to misleading results when two survival curves cross each other. According to Li et al., it is difficult to find a single method that tests two-sample survival curves well in all situations. Here, we propose a strategy procedure that combines some existing approaches for the testing problem. Then, we conduct simulations to examine the power and Type I error rate, and compare the proposed methods with five competing approaches from Li et al. under various crossing situations of two survival curves. From the results, we suggest Strategy 2 for the problem of testing two survival curves, as it has higher power and an appropriate Type I error rate in each situation. Finally, we analyze two real data examples with the proposed methods for illustration.

14.
Heterogeneity of variances of treatment groups influences the validity and power of significance tests of location in two distinct ways. First, if sample sizes are unequal, the Type I error rate and power are depressed if a larger variance is associated with a larger sample size, and elevated if a larger variance is associated with a smaller sample size. This well-established effect, which occurs in t and F tests, and to a lesser degree in nonparametric rank tests, results from unequal contributions of pooled estimates of error variance in the computation of test statistics. It is observed in samples from normal distributions, as well as non-normal distributions of various shapes. Second, transformation of scores from skewed distributions with unequal variances to ranks produces differences in the means of the ranks assigned to the respective groups, even if the means of the initial groups are equal, and a subsequent inflation of Type I error rates and power. This effect occurs for all sample sizes, equal and unequal. For the t test, the discrepancy diminishes, and for the Wilcoxon–Mann–Whitney test, it becomes larger, as sample size increases. The Welch separate-variance t test overcomes the first effect but not the second. Because of interaction of these separate effects, the validity and power of both parametric and nonparametric tests performed on samples of any size from unknown distributions with possibly unequal variances can be distorted in unpredictable ways.
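
The first effect described in this abstract, and the way the Welch test addresses it, can be checked with a short simulation. This is a sketch under stated assumptions rather than the study's design: data are normal with equal means, the group sizes and standard deviations are arbitrary illustrative values, the function names are hypothetical, and a normal critical value stands in for the exact t quantiles (reasonable only because the groups here are fairly large).

```python
import random
import statistics
from statistics import NormalDist


def t_statistics(x, y):
    """Return (pooled-variance t, Welch separate-variance t)."""
    nx, ny = len(x), len(y)
    var_x, var_y = statistics.variance(x), statistics.variance(y)
    diff = statistics.fmean(x) - statistics.fmean(y)
    pooled_var = ((nx - 1) * var_x + (ny - 1) * var_y) / (nx + ny - 2)
    pooled = diff / (pooled_var * (1.0 / nx + 1.0 / ny)) ** 0.5
    welch = diff / (var_x / nx + var_y / ny) ** 0.5
    return pooled, welch


def rejection_rates(n_x, sd_x, n_y, sd_y, reps=2000, alpha=0.05, seed=0):
    """Estimate Type I error rates of both tests when the means are equal."""
    rng = random.Random(seed)
    # Large-sample approximation: normal quantile instead of t quantiles.
    crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    pooled_hits = welch_hits = 0
    for _ in range(reps):
        x = [rng.gauss(0.0, sd_x) for _ in range(n_x)]
        y = [rng.gauss(0.0, sd_y) for _ in range(n_y)]
        pooled, welch = t_statistics(x, y)
        pooled_hits += abs(pooled) > crit
        welch_hits += abs(welch) > crit
    return pooled_hits / reps, welch_hits / reps


if __name__ == "__main__":
    # Larger variance paired with the smaller sample, then the reverse.
    print(rejection_rates(n_x=40, sd_x=3.0, n_y=160, sd_y=1.0))
    print(rejection_rates(n_x=160, sd_x=3.0, n_y=40, sd_y=1.0))
```

In the first configuration the pooled test should reject far more often than the nominal level, and in the second far less often, while the Welch statistic stays close to nominal in both. The rank-transformation effect for skewed distributions is not covered by this sketch.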

15.
Because the usual F test for equal means is not robust to unequal variances, Brown and Forsythe (1974a) suggest replacing F with the statistics F* or W, which are based on the Satterthwaite and Welch adjusted degrees of freedom procedures. This paper reports practical situations where both F* and W give unsatisfactory results. In particular, both F* and W may not provide adequate control over Type I errors. Moreover, for equal variances but unequal sample sizes, W should be avoided in favor of F (or F*), but for equal sample sizes and possibly unequal variances, W was the only satisfactory statistic. New results on power are included as well. The paper also considers the effect of using F* or W only after a significant test for equal variances has been obtained, and new results on the robustness of the F test are described. It is found that even for equal sample sizes as large as 50 per treatment group, there are practical situations where the F test does not provide adequate control over the probability of a Type I error.

16.
Testing for the equality of regression coefficients across two regressions is a problem considered by analysts in a variety of fields. If the variances of the errors of the two regressions are not equal, then it is known that the standard large-sample F-test used to test the equality of the coefficients is compromised by the fact that its actual size can differ substantially from the stated level of significance in small samples. This article addresses this problem and borrows from the literature on the Behrens-Fisher problem to provide some simple modifications of the large-sample test that allow one to better control the probability of committing a Type I error. Empirical evidence is presented which indicates that the suggested modifications provide tests that are superior to well-known alternative tests over a wide range of the parameter space.

17.

For comparing several logistic regression slopes to that of a control for small sample sizes, Dasgupta et al. (2001) proposed an "asymptotic" small-sample test and a "pivoted" version of that test statistic. Their results show both methods perform well in terms of Type I error control and marginal power when the response is related to the explanatory variable via a logistic regression model. This study finds, via Monte Carlo simulations, that when the underlying relationship is probit, complementary log-log, linear, or even non-monotonic, the "asymptotic" and the "pivoted" small-sample methods perform fairly well in terms of Type I error control and marginal power. Unlike their large sample competitors, they are generally robust to departures from the logistic regression model.

18.
Many applications of the Inverse Gaussian distribution, including numerous reliability and life testing results, are presented in the statistical literature. The paper studies the problem of using entropy tests to examine the goodness of fit of an Inverse Gaussian distribution with unknown parameters. Some entropy tests based on different entropy estimates are proposed. Critical values of the test statistics for various sample sizes are obtained by Monte Carlo simulations. The Type I error of the tests is investigated, and the power values of the tests are then compared with those of competing tests against various alternatives. Finally, recommendations for the application of the tests in practice are presented.

19.
The test on proportions as prescribed in the double sampling plan of Dodge and Romig (1929) for inspection by attributes is revisited. A noticeable deficiency of this plan is that it may require more observations than could have been required by an 'equivalent' fixed-sample testing procedure having the same Type I and Type II error probabilities. Here, we propose a curtailed version of this sampling plan which assures the experimenter that the actual number of observations required to arrive at a terminal decision will never exceed that of the comparable fixed-size testing procedure while keeping the error probabilities at desired levels. In fact, we show that the entire power function of the proposed testing procedure matches that of the 'best' (UMP in its size) fixed-size-sample testing procedure. Other properties of this curtailed double sampling testing procedure, such as its Average Sample Number and its operational characteristics, are also discussed and illustrated.

20.
Three methods for testing the equality of nonindependent proportions were compared with the use of Monte Carlo techniques. The three methods included Cochran's test, an ANOVA F test, and Hotelling's T² test. With respect to empirical significance levels, the ANOVA F test is recommended as the preferred method of analysis.

Oftentimes an experimenter is interested in testing the equality of several proportions. When the proportions are independent, Kemp and Butcher (1972) and Butcher and Kemp (1974) compared several methods for analyzing large-sample binomial data for the case of a 3 x 3 factorial design without replication. In addition, Levy and Narula (1977) compared many of the same methods for analyzing binomial data; however, Levy and Narula investigated the relative utility of the methods for small sample sizes.
