Similar Documents
20 similar documents retrieved.
1.
Numerous methods have been proposed for dealing with the serious practical problems associated with the conventional analysis of covariance method, with an emphasis on comparing two groups when there is a single covariate. Recently, Wilcox (2005a: section 11.8.2) outlined a method for handling multiple covariates that allows nonlinearity and heteroscedasticity. The method is readily extended to multiple groups, but nothing is known about its small-sample properties. This paper compares three variations of the method, each based on one of three measures of location: means, medians and 20% trimmed means. The methods based on a 20% trimmed mean or median are found to avoid Type I error probabilities well above the nominal level, but the method based on medians can be too conservative in various situations; using a 20% trimmed mean gives the best results in terms of Type I errors. The methods are based in part on a running interval smoother approximation of the regression surface. Included are comments on required sample sizes that are relevant to the so-called curse of dimensionality.
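The core building block mentioned above, a running interval smoother combined with a 20% trimmed mean, can be sketched as follows. The neighbourhood rule (covariate values within f times a MAD-based spread of the point of interest), the span constant f = 0.8, and the minimum of five neighbours are illustrative choices rather than the paper's exact implementation.

```python
import numpy as np
from scipy.stats import trim_mean

def running_interval_smoother(x, y, x0, f=0.8, trim=0.20):
    """Estimate the typical value of y near x0 with a 20% trimmed mean.

    Points whose covariate lies within f times a MAD-based spread of x0
    count as 'close'; f is an illustrative span constant.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    madn = np.median(np.abs(x - np.median(x))) / 0.6745   # rescaled MAD
    close = np.abs(x - x0) <= f * madn
    if close.sum() < 5:                # too few neighbours to trim sensibly
        return np.nan
    return trim_mean(y[close], trim)   # 20% trimmed mean of nearby responses

# illustrative use: smooth over a grid of covariate values
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = x**2 + rng.normal(size=100)        # nonlinear regression surface
fit = [running_interval_smoother(x, y, x0) for x0 in np.linspace(-2, 2, 9)]
```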

2.
Recently, Lombard derived an extension of the Doksum–Sievers shift function to dependent groups. This article suggests using a particular numerical method for determining the critical value, reports on the ability of the method to control the probability of a Type I error when sample sizes are small, and provides comparisons with methods aimed at comparing deciles. It is found that for continuous distributions, Lombard's method performs well and in particular has high power relative to the other two methods considered. But when tied values can occur, it can have relatively poor power; a method based on the Harrell-Davis estimator is found to give more satisfactory results.

3.
In this paper, Anbar's (1983) approach for estimating a difference between two binomial proportions is discussed with respect to a hypothesis testing problem. Such an approach results in two possible testing strategies. While the results of the tests are expected to agree for a large sample size when the two proportions are equal, the tests are shown to perform quite differently in terms of their probabilities of a Type I error for selected sample sizes. Moreover, the tests can lead to different conclusions, which is illustrated via a simple example; and the probability of such cases can be relatively large. In an attempt to improve the tests while preserving their relative simplicity, a modified test is proposed. The performance of this test and of a conventional test based on the normal approximation is assessed. It is shown that the modified Anbar's test better controls the probability of a Type I error for moderate sample sizes.

4.
The likelihood equations based on a progressively Type II censored sample from a Type I generalized logistic distribution do not provide explicit solutions for the location and scale parameters. We present a simple method of deriving explicit estimators by approximating the likelihood equations appropriately. We examine numerically the bias and variance of these estimators and show that these estimators are as efficient as the maximum likelihood estimators (MLEs). The probability coverages of the pivotal quantities (for location and scale parameters) based on asymptotic normality are shown to be unsatisfactory, especially when the effective sample size is small. Therefore we suggest using unconditional simulated percentage points of these pivotal quantities for the construction of confidence intervals. A wide range of sample sizes and progressive censoring schemes have been considered in this study. Finally, we present a numerical example to illustrate the methods of inference developed here.

5.
Reference‐scaled average bioequivalence (RSABE) approaches for highly variable drugs are based on linearly scaling the bioequivalence limits according to the reference formulation within‐subject variability. RSABE methods have type I error control problems around the value where the limits change from constant to scaled. In all these methods, the probability of type I error has only one absolute maximum at this switching variability value. This allows adjusting the significance level to obtain statistically correct procedures (that is, those in which the probability of type I error remains below the nominal significance level), at the expense of some potential power loss. In this paper, we explore adjustments to the EMA and FDA regulatory RSABE approaches, and to a possible improvement of the original EMA method, designated as HoweEMA. The resulting adjusted methods are completely correct with respect to type I error probability. The power loss is generally small and tends to become irrelevant for moderately large (affordable in real studies) sample sizes.
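The switch from constant to scaled limits can be illustrated with an EMA-style rule: below a within-subject CV of 30% the usual 80-125% limits apply, between 30% and 50% the log-scale limits widen as plus/minus k times the within-subject SD, and above 50% the widening is capped. The constants below (k = 0.760 and the 50% cap) follow common descriptions of the EMA approach but should be treated as illustrative rather than a statement of either regulatory method.

```python
import numpy as np

def ema_style_scaled_limits(cv_wr, k=0.760):
    """Widened bioequivalence limits on the ratio scale, EMA-style (illustrative).

    cv_wr : within-subject coefficient of variation of the reference formulation.
    Below CV = 0.30 the usual 0.80-1.25 limits apply; between 0.30 and 0.50 the
    log-scale limits are +/- k * s_wr; above 0.50 the widening is capped.
    """
    cv = min(max(cv_wr, 0.0), 0.50)              # cap the scaling at CV = 50%
    if cv <= 0.30:
        return 0.80, 1.25                        # constant (unscaled) limits
    s_wr = np.sqrt(np.log(cv**2 + 1.0))          # CV -> log-scale SD
    return float(np.exp(-k * s_wr)), float(np.exp(k * s_wr))

print(ema_style_scaled_limits(0.25))   # (0.80, 1.25)
print(ema_style_scaled_limits(0.40))   # roughly (0.746, 1.340)
print(ema_style_scaled_limits(0.60))   # capped near (0.698, 1.432)
```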

6.
The most common strategy for comparing two independent groups is in terms of some measure of location intended to reflect the typical observation. However, it can be informative and important to compare the lower and upper quantiles as well, but when there are tied values, extant techniques suffer from practical concerns reviewed in the paper. For the special case where the goal is to compare the medians, a slight generalization of the percentile bootstrap method performs well in terms of controlling Type I errors when there are tied values [Wilcox RR. Comparing medians. Comput. Statist. Data Anal. 2006;51:1934–1943]. But our results indicate that when the goal is to compare the quartiles, or quantiles close to zero or one, this approach is highly unsatisfactory when the quantiles are estimated using a single order statistic or a weighted average of two order statistics. The main result in this paper is that when using the Harrell–Davis estimator, which uses all of the order statistics to estimate a quantile, control over the Type I error probability can be achieved in simulations, even when there are tied values, provided the sample sizes are not too small. It is demonstrated that this method can also have substantially higher power than the distribution free method derived by Doksum and Sievers [Plotting with confidence: graphical comparisons of two populations. Biometrika 1976;63:421–434]. Data from two studies are used to illustrate the practical advantages of the method studied here.
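A sketch of the two ingredients named here: the Harrell–Davis estimator, which estimates a quantile as a Beta-weighted average of all order statistics, and a percentile-bootstrap confidence interval for the difference between the q-th quantiles of two independent groups. The function names and bootstrap size are illustrative; the paper's critical-value details may differ.

```python
import numpy as np
from scipy.stats import beta

def harrell_davis(x, q):
    """Harrell-Davis estimate of the q-th quantile (uses all order statistics)."""
    x = np.sort(np.asarray(x, float))
    n = x.size
    a, b = (n + 1) * q, (n + 1) * (1 - q)
    i = np.arange(1, n + 1)
    w = beta.cdf(i / n, a, b) - beta.cdf((i - 1) / n, a, b)   # Beta weights
    return float(np.sum(w * x))

def quantile_diff_pboot(x, y, q=0.25, nboot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for the difference in the q-th quantiles of
    two independent groups, each estimated with the Harrell-Davis estimator."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    diffs = np.empty(nboot)
    for b in range(nboot):
        xb = rng.choice(x, size=x.size, replace=True)
        yb = rng.choice(y, size=y.size, replace=True)
        diffs[b] = harrell_davis(xb, q) - harrell_davis(yb, q)
    lo, hi = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    return harrell_davis(x, q) - harrell_davis(y, q), (lo, hi)
```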

7.
This paper considers two general ways dependent groups might be compared based on quantiles. The first compares the quantiles of the marginal distributions. The second focuses on the lower and upper quantiles of the usual difference scores. Methods for comparing quantiles have been derived that typically assume that sampling is from a continuous distribution. There are exceptions, but generally, when sampling from a discrete distribution where tied values are likely, extant methods can perform poorly, even with a large sample size. One reason is that extant methods for estimating the standard error can perform poorly. Another is that quantile estimators based on a single order statistic, or a weighted average of two order statistics, are not necessarily asymptotically normal. Our main result is that when using the Harrell–Davis estimator, good control over the Type I error probability can be achieved in simulations via a standard percentile bootstrap method, even when there are tied values, provided the sample sizes are not too small. In addition, the two methods considered here can have substantially higher power than alternative procedures. Using real data, we illustrate how quantile comparisons can be used to gain a deeper understanding of how groups differ.

8.
Because the usual F test for equal means is not robust to unequal variances, Brown and Forsythe (1974a) suggest replacing F with the statistics F* or W, which are based on the Satterthwaite and Welch adjusted degrees of freedom procedures. This paper reports practical situations where both F* and W give unsatisfactory results. In particular, both F* and W may not provide adequate control over Type I errors. Moreover, for equal variances but unequal sample sizes, W should be avoided in favor of F (or F*), but for equal sample sizes, and possibly unequal variances, W was the only satisfactory statistic. New results on power are included as well. The paper also considers the effect of using F* or W only after a significant test for equal variances has been obtained, and new results on the robustness of the F test are described. It is found that even for equal sample sizes as large as 50 per treatment group, there are practical situations where the F test does not provide adequate control over the probability of a Type I error.
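For orientation, the heteroscedastic Welch-type statistic W for comparing k independent means can be sketched as below (the Brown–Forsythe F* statistic uses a different weighting and degrees-of-freedom adjustment and is not shown). This is the textbook form of Welch's test, not the simulation code behind the reported results.

```python
import numpy as np
from scipy.stats import f

def welch_anova(*groups):
    """Welch's heteroscedastic W test for equality of k means."""
    k = len(groups)
    n = np.array([len(g) for g in groups], float)
    m = np.array([np.mean(g) for g in groups])
    v = np.array([np.var(g, ddof=1) for g in groups])
    w = n / v                                  # precision weights
    u = w.sum()
    mw = np.sum(w * m) / u                     # weighted grand mean
    a = np.sum(w * (m - mw) ** 2) / (k - 1)
    tmp = np.sum((1 - w / u) ** 2 / (n - 1))
    W = a / (1 + 2 * (k - 2) * tmp / (k**2 - 1))
    df2 = (k**2 - 1) / (3 * tmp)
    return W, f.sf(W, k - 1, df2)              # statistic and p-value

# illustrative use: unequal variances and unequal sample sizes
rng = np.random.default_rng(1)
g1, g2, g3 = rng.normal(0, 1, 20), rng.normal(0, 3, 35), rng.normal(0, 5, 50)
print(welch_anova(g1, g2, g3))
```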

9.
A large‐sample problem of illustrating noninferiority of an experimental treatment over a referent treatment for binary outcomes is considered. The methods of illustrating noninferiority involve constructing the lower two‐sided confidence bound for the difference between binomial proportions corresponding to the experimental and referent treatments and comparing it with the negative value of the noninferiority margin. The three considered methods, Anbar, Falk–Koch, and Reduced Falk–Koch, handle the comparison in an asymmetric way, that is, only the referent proportion out of the two, experimental and referent, is directly involved in the expression for the variance of the difference between two sample proportions. Five continuity corrections (including zero) are considered with respect to each approach. The key properties of the corresponding methods are evaluated via simulations. First, the uncorrected two‐sided confidence intervals can, potentially, have smaller coverage probability than the nominal level even for moderately large sample sizes, for example, 150 per group. Next, the 15 testing methods are discussed in terms of their Type I error rate and power. In the settings with a relatively small referent proportion (about 0.4 or smaller), the Anbar approach with Yates’ continuity correction is recommended for balanced designs and the Falk–Koch method with Yates’ correction is recommended for unbalanced designs. For relatively moderate (about 0.6) and large (about 0.8 or greater) referent proportion, the uncorrected Reduced Falk–Koch method is recommended, although in this case, all methods tend to be over‐conservative. These results are expected to be used in the design stage of a noninferiority study when asymmetric comparisons are envisioned.
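The decision rule these methods share can be sketched as: compute a lower confidence bound for p_E - p_R and conclude noninferiority if it exceeds -delta. The variance term below involves only the referent proportion, mimicking the asymmetric structure described above, and the optional Yates-style continuity correction is one of the corrections considered; the exact Anbar and Falk–Koch variance expressions differ, so treat this purely as an illustration.

```python
import numpy as np
from scipy.stats import norm

def noninferior_lower_bound(x_e, n_e, x_r, n_r, delta, alpha=0.05, yates=False):
    """Declare noninferiority if the lower confidence bound for p_E - p_R
    exceeds -delta. Illustrative asymmetric variance: only the referent
    proportion enters it; not the exact Anbar or Falk-Koch formula."""
    p_e, p_r = x_e / n_e, x_r / n_r
    diff = p_e - p_r
    var = p_r * (1 - p_r) * (1 / n_e + 1 / n_r)   # referent-based variance
    cc = 0.5 * (1 / n_e + 1 / n_r) if yates else 0.0
    lower = diff - cc - norm.ppf(1 - alpha / 2) * np.sqrt(var)
    return lower, lower > -delta

print(noninferior_lower_bound(78, 100, 82, 100, delta=0.10))
```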

10.
Two types of decision errors can be made when using a quality control chart for non-conforming units (p-chart). A Type I error occurs when the process is not out of control but a search for an assignable cause is performed unnecessarily. A Type II error occurs when the process is out of control but a search for an assignable cause is not performed. The probability of a Type I error is under direct control of the decision-maker while the probability of a Type II error depends, in part, on the sample size. A simple sample size formula is presented for determining the required sample size for a p-chart with specified probabilities of Type I and Type II errors.
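The abstract does not reproduce the formula, but one commonly cited normal-approximation version (the sample size needed so that a shift in the fraction nonconforming from p0 to p1 is detected with specified Type I and Type II error probabilities) is sketched below; it should be read as a standard textbook form rather than the paper's exact expression.

```python
import math
from scipy.stats import norm

def pchart_sample_size(p0, p1, alpha=0.05, beta=0.10):
    """Sample size so a p-chart detects a shift from p0 to p1 with the
    specified Type I (alpha) and Type II (beta) error probabilities.
    One common normal-approximation formula (illustrative)."""
    za, zb = norm.ppf(1 - alpha), norm.ppf(1 - beta)
    num = za * math.sqrt(p0 * (1 - p0)) + zb * math.sqrt(p1 * (1 - p1))
    return math.ceil((num / (p1 - p0)) ** 2)

# e.g. detect a shift in the fraction nonconforming from 2% to 6%
print(pchart_sample_size(0.02, 0.06))
```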

11.
Some nonparametric methods have been proposed to compare survival medians. Most of them rely on the asymptotic null distribution to estimate the p-value. However, for small to moderate sample sizes, those tests may have an inflated Type I error rate, which limits their application. In this article, we propose a new nonparametric test that uses the bootstrap to estimate the mean and variance of the sample median. Through comprehensive simulation, we show that the proposed approach can control Type I error rates well. A real data application is used to illustrate the use of the new test.
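A rough sketch of the idea: bootstrap the Kaplan-Meier median within each group to estimate its mean and variance, then form a z-type statistic for the difference. The simple Kaplan-Meier median routine and the z-statistic below are illustrative and will differ in detail from the proposed test.

```python
import numpy as np
from scipy.stats import norm

def km_median(time, event):
    """Median survival time from a Kaplan-Meier curve (simple implementation)."""
    order = np.argsort(time)
    t, d = np.asarray(time)[order], np.asarray(event)[order]
    surv, at_risk = 1.0, len(t)
    for i in range(len(t)):
        if d[i]:                        # event (not censored)
            surv *= 1 - 1 / at_risk
        at_risk -= 1
        if surv <= 0.5:
            return t[i]
    return np.nan                       # median not reached

def boot_median_test(t1, e1, t2, e2, nboot=1000, seed=0):
    """Bootstrap z-type test for equal survival medians (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    def boot(t, e):
        idx = np.arange(len(t))
        meds = []
        for _ in range(nboot):
            s = rng.choice(idx, size=len(t), replace=True)
            meds.append(km_median(t[s], e[s]))
        meds = np.array(meds, float)
        meds = meds[~np.isnan(meds)]    # drop resamples where median not reached
        return meds.mean(), meds.var(ddof=1)
    t1, e1, t2, e2 = map(np.asarray, (t1, e1, t2, e2))
    m1, v1 = boot(t1, e1)
    m2, v2 = boot(t2, e2)
    z = (m1 - m2) / np.sqrt(v1 + v2)
    return z, 2 * norm.sf(abs(z))
```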

12.
We consider the problem of choosing among a class of possible estimators by selecting the estimator with the smallest bootstrap estimate of finite sample variance. This is an alternative to using cross-validation to choose an estimator adaptively. The problem of a confidence interval based on such an adaptive estimator is considered. We illustrate the ideas by applying the method to the problem of choosing the trimming proportion of an adaptive trimmed mean. It is shown that a bootstrap adaptive trimmed mean is asymptotically normal with an asymptotic variance equal to the smallest among trimmed means. The asymptotic coverage probability of a bootstrap confidence interval based on such adaptive estimators is shown to have the nominal level. The intervals based on the asymptotic normality of the estimator share the same asymptotic result, but have poor small-sample properties compared to the bootstrap intervals. A small-sample simulation demonstrates that bootstrap adaptive trimmed means adapt themselves rather well even for samples of size 10.
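The adaptive step can be sketched directly: for each candidate trimming proportion, estimate the finite-sample variance of the corresponding trimmed mean by the bootstrap, and use the proportion with the smallest estimated variance. The candidate grid and bootstrap size below are illustrative choices.

```python
import numpy as np
from scipy.stats import trim_mean

def bootstrap_adaptive_trimmed_mean(x, grid=(0.0, 0.1, 0.2, 0.3), nboot=500, seed=0):
    """Pick the trimming proportion whose trimmed mean has the smallest
    bootstrap variance, then return that trimmed mean."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, float)
    boot = rng.choice(x, size=(nboot, x.size), replace=True)   # resampled rows
    variances = {g: np.var([trim_mean(row, g) for row in boot], ddof=1)
                 for g in grid}
    best = min(variances, key=variances.get)
    return best, trim_mean(x, best)

rng = np.random.default_rng(2)
sample = rng.standard_t(df=3, size=10)      # heavy-tailed sample of size 10
print(bootstrap_adaptive_trimmed_mean(sample))
```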

13.
Microarrays are a powerful new technology that allows for the measurement of the expression of thousands of genes simultaneously. Owing to relatively high costs, sample sizes tend to be quite small. If investigators apply a correction for multiple testing, a very small p-value will be required to declare significance. We use modifications to Chebyshev's inequality to develop a testing procedure that is nonparametric and yields p-values on the interval [0, 1]. We evaluate its properties via simulation and show that it both holds the type I error rate below nominal levels in almost all conditions and can yield p-values denoting significance even with very small sample sizes and stringent corrections for multiple testing.
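The paper's exact modifications are not given in the abstract, but the flavour of a distribution-free p-value can be illustrated with Chebyshev's inequality (two-sided) or its one-sided Cantelli form applied to a studentized statistic. The bound always lies in [0, 1]; treating the studentized statistic as approximately standardized is an assumption of this sketch, not a claim about the authors' procedure.

```python
import numpy as np

def chebyshev_pvalue(x, mu0=0.0, two_sided=True):
    """Distribution-free p-value bound for testing a mean, based on
    Chebyshev's inequality (two-sided) or Cantelli's inequality (one-sided)."""
    x = np.asarray(x, float)
    n = x.size
    t = np.sqrt(n) * (x.mean() - mu0) / x.std(ddof=1)   # studentized statistic
    if two_sided:
        return min(1.0, 1.0 / t**2) if t != 0 else 1.0   # Chebyshev bound
    return min(1.0, 1.0 / (1.0 + max(t, 0.0) ** 2))      # Cantelli bound

# with only four expression values, the bound still yields a p-value in [0, 1]
print(chebyshev_pvalue(np.array([2.1, 2.4, 1.9, 2.6]), mu0=0.0))
```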

14.
Most multivariate statistical techniques rely on the assumption of multivariate normality. The effects of nonnormality on multivariate tests are assumed to be negligible when variance–covariance matrices and sample sizes are equal. Therefore, in practice, investigators usually do not attempt to assess multivariate normality. In this simulation study, the effects of skewed and leptokurtic multivariate data on the Type I error and power of Hotelling's T² were examined by manipulating distribution, sample size, and variance–covariance matrix. The empirical Type I error rate and power of Hotelling's T² were calculated before and after the application of generalized Box–Cox transformation. The findings demonstrated that even when variance–covariance matrices and sample sizes are equal, small to moderate changes in power still can be observed.
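For reference, a sketch of the two-sample Hotelling's T² statistic together with a component-wise Box–Cox transformation; the study uses a generalized multivariate Box–Cox transformation, so the component-wise version here is a simplification.

```python
import numpy as np
from scipy.stats import f, boxcox

def hotelling_t2(x, y):
    """Two-sample Hotelling's T^2 test for equal mean vectors."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n1, n2, p = len(x), len(y), x.shape[1]
    d = x.mean(axis=0) - y.mean(axis=0)
    sp = ((n1 - 1) * np.cov(x, rowvar=False) +
          (n2 - 1) * np.cov(y, rowvar=False)) / (n1 + n2 - 2)   # pooled covariance
    t2 = n1 * n2 / (n1 + n2) * d @ np.linalg.solve(sp, d)
    fstat = (n1 + n2 - p - 1) / ((n1 + n2 - 2) * p) * t2
    return t2, f.sf(fstat, p, n1 + n2 - p - 1)

def boxcox_columns(x):
    """Component-wise Box-Cox transform (requires positive data); a
    simplification of the generalized multivariate transformation."""
    x = np.asarray(x, float)
    return np.column_stack([boxcox(x[:, j])[0] for j in range(x.shape[1])])

rng = np.random.default_rng(3)
a = rng.lognormal(size=(30, 3))          # skewed multivariate data
b = rng.lognormal(size=(30, 3))
print(hotelling_t2(boxcox_columns(a), boxcox_columns(b)))
```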

15.
This paper elaborates on earlier contributions of Bross (1985) and Millard (1987) who point out that when conducting conventional hypothesis tests in order to “prove” environmental hazard or environmental safety, unrealistically large sample sizes are required to achieve acceptable power with customarily-used values of Type I error probability. These authors also note that “proof of safety” typically requires much larger sample sizes than “proof of hazard”. When the sample has yet to be selected and it is feared that the sample size will be insufficient to conduct a reasonable.

16.

For comparing several logistic regression slopes to that of a control for small sample sizes, Dasgupta et al. (2001) proposed an "asymptotic" small-sample test and a "pivoted" version of that test statistic. Their results show both methods perform well in terms of Type I error control and marginal power when the response is related to the explanatory variable via a logistic regression model. This study finds, via Monte Carlo simulations, that when the underlying relationship is probit, complementary log-log, linear, or even non-monotonic, the "asymptotic" and the "pivoted" small-sample methods perform fairly well in terms of Type I error control and marginal power. Unlike their large sample competitors, they are generally robust to departures from the logistic regression model.

17.
In single-arm clinical trials with survival outcomes, the Kaplan–Meier estimator and its confidence interval are widely used to assess survival probability and median survival time. Since the asymptotic normality of the Kaplan–Meier estimator is a common result, the sample size calculation methods have not been studied in depth. An existing sample size calculation method is founded on the asymptotic normality of the Kaplan–Meier estimator using the log transformation. However, the small sample properties of the log transformed estimator are quite poor in small sample sizes (which are typical situations in single-arm trials), and the existing method uses an inappropriate standard normal approximation to calculate sample sizes. These issues can seriously influence the accuracy of results. In this paper, we propose alternative methods to determine sample sizes based on a valid standard normal approximation with several transformations that may give an accurate normal approximation even with small sample sizes. In numerical evaluations via simulations, some of the proposed methods provided more accurate results, and the empirical power of the proposed method with the arcsine square-root transformation tended to be closer to a prescribed power than the other transformations. These results were supported when methods were applied to data from three clinical trials.
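One standard delta-method form of the arcsine square-root confidence interval for a Kaplan-Meier survival probability (given the point estimate and its Greenwood variance) is sketched below; the transformations evaluated by the authors may differ in detail.

```python
import numpy as np
from scipy.stats import norm

def km_arcsine_ci(s_hat, greenwood_var, alpha=0.05):
    """Arcsine square-root confidence interval for a Kaplan-Meier survival
    probability, via the delta method (one standard form; illustrative)."""
    z = norm.ppf(1 - alpha / 2)
    centre = np.arcsin(np.sqrt(s_hat))
    half = z * np.sqrt(greenwood_var) / (2 * np.sqrt(s_hat * (1 - s_hat)))
    lo = np.sin(np.clip(centre - half, 0, np.pi / 2)) ** 2
    hi = np.sin(np.clip(centre + half, 0, np.pi / 2)) ** 2
    return lo, hi

# e.g. estimated 1-year survival of 0.70 with Greenwood variance 0.004
print(km_arcsine_ci(0.70, 0.004))
```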

18.
In this paper we evaluate the performance of three methods for testing the existence of a unit root in a time series, when the models under consideration in the null hypothesis do not display autocorrelation in the error term. In such cases, simple versions of the Dickey-Fuller test are the most appropriate, rather than the well-known augmented Dickey-Fuller or Phillips-Perron tests. Through Monte Carlo simulations we show that, apart from a few cases, when testing for the existence of a unit root we obtain actual type I error and power very close to their nominal levels. Additionally, when the random walk null hypothesis is true, by gradually increasing the sample size, we observe that p-values for the drift in the unrestricted model fluctuate at low levels with small variance and the Durbin-Watson (DW) statistic approaches 2 in both the unrestricted and restricted models. If, however, the null hypothesis of a random walk is false, taking a larger sample, the DW statistic in the restricted model starts to deviate from 2 while in the unrestricted model it continues to approach 2. It is also shown that the probability of not rejecting that the errors are uncorrelated, when they are indeed uncorrelated, is higher when the DW test is applied at the 1% nominal level of significance.
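A sketch of the kind of check described: a simple (unaugmented) Dickey-Fuller test with drift, obtained by forcing zero lagged differences, together with the Durbin-Watson statistic of the test-regression residuals. The statsmodels calls are standard; the exact restricted and unrestricted models studied in the paper may differ from this setup.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller
from statsmodels.stats.stattools import durbin_watson

def simple_df_test(y):
    """Simple (unaugmented) Dickey-Fuller test with drift, plus the
    Durbin-Watson statistic of the test regression residuals."""
    y = np.asarray(y, float)
    dy, lag = np.diff(y), y[:-1]
    ols = sm.OLS(dy, sm.add_constant(lag)).fit()     # dy_t = c + gamma*y_{t-1} + e_t
    # adfuller with maxlag=0 and autolag=None reproduces the simple DF test
    adf_stat, pvalue, *_ = adfuller(y, maxlag=0, regression="c", autolag=None)
    return adf_stat, pvalue, durbin_watson(ols.resid)

rng = np.random.default_rng(4)
random_walk = np.cumsum(rng.normal(size=200))        # unit-root null is true
print(simple_df_test(random_walk))
```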

19.
Naranjo and Hettmansperger (1994) recently derived a bounded influence rank regression method and suggested how hypotheses about the regression coefficients might be tested. This brief note reports some simulation results on how their procedure performs when there is one predictor. Even when the error term is highly skewed, good control over the Type I error probability is obtained. Power can be high relative to least squares regression when the error term has a heavy-tailed distribution and the predictor has a symmetric distribution. However, if the predictor has a skewed distribution, power can be relatively low even when the distribution of the error term is heavy-tailed. Despite this, it is argued that their method provides an important and useful alternative to ordinary least squares as well as to other robust regression methods.

20.
The standard hypothesis testing procedure in meta-analysis (or multi-center clinical trials) in the absence of treatment-by-center interaction relies on approximating the null distribution of the standard test statistic by a standard normal distribution. For relatively small sample sizes, the standard procedure has been shown by various authors to have poor control of the type I error probability, leading to too many liberal decisions. In this article, two test procedures are proposed which rely on the t-distribution as the reference distribution. A simulation study indicates that the proposed procedures attain significance levels closer to the nominal level compared with the standard procedure.
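The contrast drawn here can be sketched with the usual fixed-effect pooled statistic referred either to the standard normal or to a t distribution. The choice of k - 1 degrees of freedom below is only one plausible option and is not necessarily the reference distribution the authors derive.

```python
import numpy as np
from scipy.stats import norm, t

def pooled_effect_test(effects, variances, use_t=True):
    """Fixed-effect pooled estimate across k centers and its two-sided p-value,
    using either the normal or a t reference distribution (df = k - 1 here is
    an illustrative choice)."""
    y, v = np.asarray(effects, float), np.asarray(variances, float)
    w = 1.0 / v                              # inverse-variance weights
    est = np.sum(w * y) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))
    z = est / se
    if use_t:
        return est, 2 * t.sf(abs(z), df=len(y) - 1)
    return est, 2 * norm.sf(abs(z))

effects = [0.30, 0.10, 0.45, 0.20]           # small number of centers
variances = [0.04, 0.06, 0.05, 0.03]
print(pooled_effect_test(effects, variances, use_t=True))
print(pooled_effect_test(effects, variances, use_t=False))
```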
