Similar Articles
20 similar articles found.
1.
Statistical approaches for addressing multiplicity in clinical trials range from the very conservative (the Bonferroni method) to the least conservative (the fixed sequence approach). Recently, several authors proposed methods that combine the merits of the two extreme approaches. Wiens [2003. A fixed sequence Bonferroni procedure for testing multiple endpoints. Pharmaceutical Statist. 2, 211–215], for example, considered an extension of the Bonferroni approach in which the type I error rate (α) is allocated among the endpoints; however, testing proceeds in a pre-determined order, allowing the type I error rate to be saved for later use as long as the null hypotheses are rejected. This leads to higher power when testing later null hypotheses. In this paper, we consider an extension of Wiens' approach that takes into account correlations among endpoints to achieve higher flexibility in testing. We show strong control of the family-wise type I error rate for this extension, provide critical values and significance levels for testing up to three endpoints with equal correlations, and show how to calculate them for other correlation structures. We also present results of a simulation experiment comparing the power of the proposed method with those of Wiens' and others. The results of this experiment show that the magnitude of the gain in power of the proposed method depends on the prospective ordering of testing of the endpoints, the magnitude of the treatment effects of the endpoints and the magnitude of correlation between endpoints. Finally, we consider applications of the proposed method to clinical trials with multiple time points and multiple doses, where correlations among endpoints frequently arise.
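
The carry-over logic of the fixed-sequence Bonferroni procedure described above can be sketched in a few lines. The following Python sketch only illustrates that idea; the allocation weights, p-values, and the helper name `fixed_sequence_bonferroni` are illustrative, and the correlation-adjusted extension proposed in the paper is not implemented here.

```python
# A minimal sketch of a Wiens-style fixed-sequence Bonferroni procedure:
# alpha is split across ordered endpoints; when a hypothesis is rejected,
# its level is carried forward to the next endpoint in the sequence.
def fixed_sequence_bonferroni(p_values, alpha_weights, alpha=0.05):
    """Test ordered hypotheses; carry alpha forward after each rejection."""
    assert abs(sum(alpha_weights) - 1.0) < 1e-9
    carried = 0.0
    decisions = []
    for p, w in zip(p_values, alpha_weights):
        level = alpha * w + carried
        reject = p <= level
        decisions.append(reject)
        # only a rejected hypothesis passes its level on to later tests
        carried = level if reject else 0.0
    return decisions

# illustrative p-values and weights, not taken from the paper
print(fixed_sequence_bonferroni([0.010, 0.030, 0.060], [0.5, 0.3, 0.2]))
```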

2.
A p-value is developed for testing the equivalence of the variances of a bivariate normal distribution. The unknown correlation coefficient is a nuisance parameter in the problem. If the correlation is known, the proposed p-value provides an exact test. For large samples, the p-value can be computed by replacing the unknown correlation by the sample correlation, and the resulting test is quite satisfactory. For small samples, it is proposed to compute the p-value by replacing the unknown correlation by a scalar multiple of the sample correlation. However, a single scalar is not satisfactory, and it is proposed to use different scalars depending on the magnitude of the sample correlation coefficient. In order to implement this approach, tables are obtained providing sub-intervals for the sample correlation coefficient, and the scalars to be used if the sample correlation coefficient belongs to a particular sub-interval. Once such tables are available, the proposed p-value is quite easy to compute since it has an explicit analytic expression. Numerical results on the type I error probability and power are reported on the performance of such a test, and the proposed p-value test is also compared to another test based on a rejection region. The results are illustrated with two examples: an example dealing with the comparability of two measuring devices, and an example dealing with the assessment of bioequivalence.
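
A classical benchmark for this problem is the Pitman-Morgan idea: for bivariate normal (X, Y), Var(X) = Var(Y) exactly when corr(X+Y, X−Y) = 0. The sketch below illustrates that benchmark only; it is not the adjusted p-value proposed in the abstract, and the simulated data are illustrative.

```python
# A minimal sketch of the Pitman-Morgan approach to paired variances:
# test whether the correlation between sums and differences is zero.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(0, 1.0, size=30)
y = 0.6 * x + rng.normal(0, 1.2, size=30)   # correlated pair, unequal variances

r, p = stats.pearsonr(x + y, x - y)
print(f"corr(sum, diff) = {r:.3f}, p-value = {p:.4f}")
```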

3.
Without the exchangeability assumption, permutation tests for comparing two population means do not provide exact control of the probability of making a Type I error. Another drawback of permutation tests is that they cannot be used to test hypotheses about a single population. In this paper, we propose a new type of permutation test for testing the difference between two population means: the split sample permutation t-tests. We show that the split sample permutation t-tests do not require the exchangeability assumption, are asymptotically exact and can be easily extended to testing hypotheses about one population. Extensive simulations were carried out to evaluate the performance of two specific split sample permutation t-tests: the split in the middle permutation t-test and the split in the end permutation t-test. The simulation results show that the split in the middle permutation t-test has comparable performance to the permutation test if the population distributions are symmetric and satisfy the exchangeability assumption. Otherwise, the split in the end permutation t-test has significantly more accurate control of the level of significance than the split in the middle permutation t-test and other existing permutation tests.
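
For reference, the ordinary two-sample permutation test that the split-sample variants modify can be sketched as below. This is only the baseline procedure; the "split in the middle" and "split in the end" rules themselves are not reproduced, and the simulated data are illustrative.

```python
# A minimal sketch of an ordinary two-sample permutation test on the
# difference of sample means.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=25)
y = rng.normal(0.5, 1.0, size=25)

observed = x.mean() - y.mean()
pooled = np.concatenate([x, y])
count = 0
n_perm = 5000
for _ in range(n_perm):
    perm = rng.permutation(pooled)
    diff = perm[:x.size].mean() - perm[x.size:].mean()
    if abs(diff) >= abs(observed):
        count += 1
print("permutation p-value:", (count + 1) / (n_perm + 1))
```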

4.
Consider two independent normal populations. Let R denote the ratio of the variances. The usual procedure for testing H0: R = 1 vs. H1: R = r, where r≠1, is the F-test. Let θ denote the proportion of observations to be allocated to the first population. Here we find the value of θ that maximizes the rate at which the observed significance level of the F-test converges to zero under H1, as measured by the half slope.
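
The F-test itself, with a fraction θ of the total sample allocated to the first population, can be sketched as follows. The allocation θ = 0.5 and the simulated variances are illustrative; the optimal θ from the half-slope criterion is not computed here.

```python
# A minimal sketch of the F-test for H0: R = 1 with allocation fraction theta.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_total, theta = 60, 0.5
n1 = int(round(theta * n_total)); n2 = n_total - n1
x = rng.normal(0, 2.0, size=n1)          # variance 4
y = rng.normal(0, 1.0, size=n2)          # variance 1

f_stat = x.var(ddof=1) / y.var(ddof=1)
p_one = stats.f.sf(f_stat, n1 - 1, n2 - 1)
p_two = 2 * min(p_one, 1 - p_one)        # two-sided p-value
print(f"F = {f_stat:.3f}, two-sided p = {p_two:.4f}")
```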

5.
Preliminary testing procedures for the two means problem traditionally employ the pooled variance t-statistic. In this paper we show that bias of the t-statistic under conditions of heterogeneity of variance may be increased if use of the t-statistic is conditional on an affirmative F-test. For this reason we conclude that use of the t-statistic in preliminary testing procedures is inappropriate.
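
The conventional two-stage procedure under discussion can be sketched as below: a preliminary F-test on the variances decides whether the pooled or the Welch t-test is used. This is only the conventional practice the abstract cautions against, with illustrative data and an assumed preliminary level of 0.05.

```python
# A minimal sketch of the preliminary-F-test-then-t-test procedure.
import numpy as np
from scipy import stats

def preliminary_then_t(x, y, alpha_pre=0.05):
    f_stat = x.var(ddof=1) / y.var(ddof=1)
    p_one = stats.f.sf(f_stat, x.size - 1, y.size - 1)
    p_f = 2 * min(p_one, 1 - p_one)
    equal_var = p_f > alpha_pre          # pooled t only if F-test is non-significant
    return stats.ttest_ind(x, y, equal_var=equal_var)

rng = np.random.default_rng(3)
print(preliminary_then_t(rng.normal(0, 1, 20), rng.normal(0.4, 2, 20)))
```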

6.
A procedure, based on sample spacings, is proposed for testing whether a univariate distribution is symmetric about some unknown value. The proposed test is a modification of a sign test suggested by Antille and Kersting [1977. Tests for symmetry. Z. Wahrscheinlichkeitstheorie verw. Gebiete 39, 235–255], but unlike Antille and Kersting's test, our modified test is asymptotically distribution-free and is usable in practice. A simulation study indicates that the proposed test maintains the nominal level of significance α fairly accurately even for samples of size as small as 20, and a comparison with the classical test based on the sample coefficient of skewness shows that our test has good power for detecting different asymmetric distributions.
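
The classical comparator mentioned above, a symmetry test based on the sample skewness coefficient, is readily available and can be sketched as below; the spacings-based statistic proposed in the abstract is not reproduced here, and the exponential data are an illustrative asymmetric example.

```python
# A minimal sketch of the classical skewness-based test of symmetry.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
sample = rng.exponential(scale=1.0, size=20)   # clearly asymmetric data

stat, p = stats.skewtest(sample)
print(f"skewness z = {stat:.3f}, p = {p:.4f}")
```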

7.
This paper discusses multiple testing procedures in dose-response clinical trials with primary and secondary endpoints. A general gatekeeping framework for constructing multiple tests is proposed, which extends the Dunnett test [Journal of the American Statistical Association 1955; 50: 1096-1121] and Bonferroni-based gatekeeping tests developed by Dmitrienko et al. [Statistics in Medicine 2003; 22: 2387-2400]. The proposed procedure accounts for the hierarchical structure of the testing problem; for example, it restricts testing of secondary endpoints to the doses for which the primary endpoint is significant. The multiple testing approach is illustrated using a dose-response clinical trial in patients with diabetes. Monte Carlo simulations demonstrate that the proposed procedure provides a power advantage over the Bonferroni gatekeeping procedure. The power gain generally increases with increasing correlation among the endpoints, especially when all primary dose-control comparisons are significant.
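
The hierarchical restriction described above can be illustrated with a very simplified sketch: secondary endpoints are examined only at doses whose primary comparison passes a Bonferroni-adjusted threshold. This is not the Dunnett-based procedure of the paper, and the doses and p-values are illustrative.

```python
# A minimal sketch of a gatekeeping restriction across doses.
doses = ["low", "high"]
p_primary = {"low": 0.004, "high": 0.060}     # primary endpoint vs control
p_secondary = {"low": 0.020, "high": 0.010}   # secondary endpoint vs control
alpha = 0.05

level = alpha / len(doses)                     # Bonferroni split across doses
for dose in doses:
    primary_sig = p_primary[dose] <= level
    # the secondary endpoint is examined only when its gatekeeper passes
    secondary_sig = primary_sig and p_secondary[dose] <= level
    print(dose, "primary:", primary_sig, "secondary:", secondary_sig)
```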

8.
A statistical test can be seen as a procedure to produce a decision based on observed data, where some decisions consist of rejecting a hypothesis (yielding a significant result) and some do not, and where one controls the probability of making a wrong rejection at some prespecified significance level. Whereas traditional hypothesis testing involves only two possible decisions (to reject or not reject a null hypothesis), Kaiser's directional two-sided test as well as the more recently introduced testing procedure of Jones and Tukey, each equivalent to running two one-sided tests, involve three possible decisions to infer the value of a unidimensional parameter. The latter procedure assumes that a point null hypothesis is impossible (e.g., that two treatments cannot have exactly the same effect), allowing a gain of statistical power. There are, however, situations where a point hypothesis is indeed plausible, for example, when considering hypotheses derived from Einstein's theories. In this article, we introduce a five-decision rule testing procedure, equivalent to running a traditional two-sided test in addition to two one-sided tests, which combines the advantages of the testing procedures of Kaiser (no assumption on a point hypothesis being impossible) and Jones and Tukey (higher power), allowing for a nonnegligible (typically 20%) reduction of the sample size needed to reach a given statistical power to obtain a significant result, compared to the traditional approach.
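
The ingredients the five-decision rule combines, one two-sided test plus two one-sided tests, can be sketched for a one-sample mean as below. How the five decisions are formed from the three p-values follows the article and is only indicated here; the data and the null value of 0 are illustrative.

```python
# A minimal sketch: a two-sided t-test together with the two one-sided t-tests.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
x = rng.normal(0.3, 1.0, size=40)

p_two = stats.ttest_1samp(x, 0.0).pvalue
p_greater = stats.ttest_1samp(x, 0.0, alternative='greater').pvalue
p_less = stats.ttest_1samp(x, 0.0, alternative='less').pvalue
print(f"two-sided p = {p_two:.4f}, one-sided p (>0) = {p_greater:.4f}, "
      f"one-sided p (<0) = {p_less:.4f}")
```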

9.
A generalization of step-up and step-down multiple test procedures is proposed. This step-up-down procedure is useful when the objective is to reject a specified minimum number, q, out of a family of k hypotheses. If this basic objective is met at the first step, then it proceeds in a step-down manner to see if more than q hypotheses can be rejected. Otherwise it proceeds in a step-up manner to see if some number less than q hypotheses can be rejected. The usual step-down procedure is the special case where q = 1, and the usual step-up procedure is the special case where q = k. Analytical and numerical comparisons between the powers of the step-up-down procedures with different choices of q are made to see how these powers depend on the actual number of false hypotheses. Examples of application include comparing the efficacy of a treatment to a control for multiple endpoints and testing the sensitivity of a clinical trial for comparing the efficacy of a new treatment with a set of standard treatments.
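
The two familiar special cases mentioned above, a step-down procedure (q = 1) and a step-up procedure (q = k), are available in statsmodels as Holm's and Hochberg's methods and can be sketched as below; the general step-up-down procedure for intermediate q is not implemented here, and the p-values are illustrative.

```python
# A minimal sketch of step-down (Holm) and step-up (Hochberg) testing.
from statsmodels.stats.multitest import multipletests

p_values = [0.001, 0.012, 0.030, 0.045, 0.20]
reject_down, _, _, _ = multipletests(p_values, alpha=0.05, method='holm')
reject_up, _, _, _ = multipletests(p_values, alpha=0.05, method='simes-hochberg')
print("step-down (Holm)    :", reject_down)
print("step-up (Hochberg)  :", reject_up)
```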

10.
In the usual two-way layout of ANOVA (interactions are admitted) let nij ≥ 1 be the number of observations for the factor-level combination (i, j). For testing the hypothesis that all main effects of the first factor vanish, numbers n1ij are given such that the power function of the F-test is uniformly maximized (U-optimality), if one considers only designs (nij) for which the row-sums ni are prescribed. Furthermore, in the (larger) set of all designs for which the total number of observations is given, all D-optimum designs are constructed.

11.
An account of the behavior of the independent-samples t-test when applied to homoscedastic bivariate normal data is presented, and a comparison is made with the paired-samples t-test. Since the significance level is not violated when applying the independent-samples t-test to data which consist of positively correlated pairs, and since the estimate of the variance is based on a larger number of 'degrees of freedom', the results suggest that when the sample size is small, one should not worry much about the possible existence of weak positive correlation. One may do better, power-wise, to ignore such correlation and use the independent-samples t-test, as though the samples were independent.
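
The comparison discussed above can be explored with a small simulation: apply both tests to small samples of weakly, positively correlated normal pairs. The sample size, correlation, effect size and number of replications below are illustrative, not from the paper.

```python
# A minimal sketch comparing the independent-samples and paired-samples t-tests
# on weakly positively correlated bivariate normal data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n, rho, effect = 8, 0.2, 1.0
cov = [[1.0, rho], [rho, 1.0]]
rej_ind = rej_pair = 0
n_sim = 2000
for _ in range(n_sim):
    xy = rng.multivariate_normal([0.0, effect], cov, size=n)
    rej_ind += stats.ttest_ind(xy[:, 0], xy[:, 1]).pvalue < 0.05
    rej_pair += stats.ttest_rel(xy[:, 0], xy[:, 1]).pvalue < 0.05
print(f"power, independent t: {rej_ind / n_sim:.3f}")
print(f"power, paired t     : {rej_pair / n_sim:.3f}")
```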

12.
The relative performance of a component of a series system in two different environments is considered. The conditional probability of the failure of the system due to the failure of the specified component, given that the system failed before time t, is regarded as a measure of the relative importance of the component to the system. A U-statistic test is proposed for checking the equality of the relative importance of the component to the system in two different environments against the alternative that the relative importance is smaller in one of the environments. Some simulation results for estimating the power of the test are reported. The proposed test is applied to one real data set, and it is seen that a different aspect of the data is brought out by this comparison than by the comparisons of the absolute importance functions, such as the subsurvival functions, considered in earlier studies.

13.
The basic assumption underlying the concept of ranked set sampling is that actual measurement of units is expensive, whereas ranking is cheap. This may not be true in reality in certain cases where ranking may be moderately expensive. In such situations, based on total cost considerations, k-tuple ranked set sampling is known to be a viable alternative, where one selects k units (instead of one) from each ranked set. In this article, we consider estimation of the distribution function based on k-tuple ranked set samples when the cost of selecting and ranking units is not ignorable. We investigate estimation both in the balanced and unbalanced data case. Properties of the estimation procedure in the presence of ranking error are also investigated. Results of simulation studies as well as an application to a real data set are presented to illustrate some of the theoretical findings.
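
Ordinary (single-unit) balanced ranked set sampling, the scheme that the k-tuple version generalizes, can be sketched as below together with the empirical distribution function computed from the measured units. The set size, number of cycles, and perfect ranking are assumptions for illustration; the k-tuple, cost-adjusted estimator of the article is not reproduced.

```python
# A minimal sketch of balanced ranked set sampling and an empirical CDF value.
import numpy as np

rng = np.random.default_rng(7)
set_size, cycles = 4, 10
rss = []
for _ in range(cycles):
    for rank in range(set_size):
        # draw a set, rank it (here by the true values, i.e. perfect ranking),
        # and measure only the unit holding the target rank
        candidates = rng.normal(0, 1, size=set_size)
        rss.append(np.sort(candidates)[rank])
rss = np.array(rss)

t = 0.0
print("empirical F(0) from the ranked set sample:", np.mean(rss <= t))
```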

14.
The t-statistic used in the existing literature for testing the significance of linear multiple regression coefficients has only limited use in testing the marginal significance of explanatory variables, though it is also used in testing partial significance. This article identifies the t-statistic appropriate for testing partial significance.
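
For context, the usual coefficient t-statistics reported by standard regression software are the starting point for the marginal-versus-partial distinction drawn above; the article's proposed statistic is not reproduced here, and the data below are simulated for illustration.

```python
# A minimal sketch of the usual t-statistics for multiple regression coefficients.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(12)
n = 100
x1 = rng.normal(size=n)
x2 = 0.7 * x1 + rng.normal(size=n)          # correlated regressors
y = 1.0 + 0.5 * x1 + 0.0 * x2 + rng.normal(size=n)

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.OLS(y, X).fit()
print(fit.tvalues)    # t-statistics for intercept, x1, x2
print(fit.pvalues)
```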

15.
Two overlapping confidence intervals have been used in the past to conduct statistical inferences about two population means and proportions. Several authors have examined the shortcomings of the Overlap procedure and have determined that such a method distorts the significance level of testing the null hypothesis of two population means and reduces the statistical power of the test. Nearly all results for small samples in the Overlap literature have been obtained either by simulation or by formulas that may need refinement for small sample sizes, but accurate large-sample information exists. Nevertheless, there are aspects of Overlap that have not been presented and compared against the standard statistical procedure. This article presents exact formulas for the maximum % overlap of two independent confidence intervals below which the null hypothesis of equality of two normal population means or variances must still be rejected, for any sample sizes. Further, the impact of Overlap on the power of testing the null hypothesis of equality of two normal variances is assessed. Finally, the noncentral t-distribution is used to assess the Overlap impact on type II error probability when testing equality of means for sample sizes larger than 1.
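
The Overlap procedure for two means can be sketched as below: construct the two individual 95% confidence intervals, declare "no significant difference" whenever they overlap, and contrast that with the ordinary two-sample t-test. The data, confidence level and equal-variance assumption are illustrative.

```python
# A minimal sketch of the Overlap procedure versus the two-sample t-test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
x = rng.normal(0.0, 1.0, size=15)
y = rng.normal(0.9, 1.0, size=15)

def ci(sample, level=0.95):
    m, se = sample.mean(), stats.sem(sample)
    half = stats.t.ppf(0.5 + level / 2, sample.size - 1) * se
    return m - half, m + half

(lx, ux), (ly, uy) = ci(x), ci(y)
overlap = (lx <= uy) and (ly <= ux)
print("intervals overlap:", overlap)
print("two-sample t-test p:", stats.ttest_ind(x, y).pvalue)
```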

16.
A life distribution is said to have a weak memoryless property if its conditional probability of survival beyond a fixed time point is equal to its (unconditional) survival probability at that point. Goodness-of-fit testing of this notion is proposed in the current investigation, both when the fixed time point is known and when it is unknown but estimable from the data. The limiting behaviour of the proposed test statistic is obtained and the null variance is explicitly given. The empirical power of the test is evaluated for a commonly known alternative using Monte Carlo methods, showing that the test performs well. The case when the fixed time point t0 equals a quantile of the distribution F gives a distribution-free test procedure. The procedure works even if t0 is unknown but is estimable.
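
An informal, empirical check of the weak memoryless identity P(X > s + t0 | X > s) = P(X > t0) can be sketched with the empirical survival function, as below. This only illustrates the property (which exponential data satisfy); it is not the test statistic or its limiting distribution derived in the article, and the values of s and t0 are illustrative.

```python
# A minimal empirical check of the weak memoryless property.
import numpy as np

rng = np.random.default_rng(9)
x = rng.exponential(scale=2.0, size=500)   # exponential data satisfy the identity
t0, s = 1.0, 1.5

surv = lambda t: np.mean(x > t)            # empirical survival function
conditional = surv(s + t0) / surv(s)
print(f"P(X > s + t0 | X > s) = {conditional:.3f}")
print(f"P(X > t0)             = {surv(t0):.3f}")
```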

17.
In a 2-step monotone missing dataset drawn from a multivariate normal population, a T2-type test statistic (similar to Hotelling's T2 test statistic) and the likelihood ratio (LR) statistic are often used to test a hypothesis about the mean vector. With complete data, Hotelling's T2 test and the LR test are equivalent; however, the T2-type test and the LR test are not equivalent for 2-step monotone missing data. It is therefore of interest which statistic is preferable in terms of power. In this paper, we derive the asymptotic power functions of both statistics under a local alternative and obtain an explicit form for the difference between them. Furthermore, under several parameter settings, we compare the LR and T2-type tests numerically using differences in empirical power and in asymptotic power. Summarizing the results obtained, we recommend the LR test for testing a mean vector.
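
The complete-data baseline that both statistics generalize, Hotelling's T2 for a mean vector, can be sketched as below; the monotone-missing T2-type and LR statistics themselves are not reproduced, and the dimension, sample size and null mean are illustrative.

```python
# A minimal sketch of Hotelling's T2 test for a mean vector with complete data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
n, p = 30, 3
x = rng.multivariate_normal([0.2, 0.0, 0.1], np.eye(p), size=n)
mu0 = np.zeros(p)

xbar = x.mean(axis=0)
s = np.cov(x, rowvar=False)
t2 = n * (xbar - mu0) @ np.linalg.solve(s, xbar - mu0)
f_stat = (n - p) / (p * (n - 1)) * t2          # F(p, n - p) under H0
p_value = stats.f.sf(f_stat, p, n - p)
print(f"T2 = {t2:.3f}, F = {f_stat:.3f}, p = {p_value:.4f}")
```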

18.
The power of a statistical test depends on the sample size. Moreover, in a randomized trial where two treatments are compared, the power also depends on the number of assignments to each treatment. We can treat the power as the conditional probability of correctly detecting a treatment effect given a particular treatment allocation status. This paper uses a simple z-test and a t-test to demonstrate and analyze the power function under the biased coin design proposed by Efron in 1971. We show numerically that Efron's biased coin design is uniformly more powerful than perfect simple randomization.
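
Efron's biased coin allocation rule itself can be sketched as below: whenever the two groups are unbalanced, the lagging treatment is assigned with probability p (2/3 in Efron's proposal). The trial size is illustrative, and the power comparison reported in the paper is not reproduced here.

```python
# A minimal sketch of Efron's biased coin design with bias p = 2/3.
import numpy as np

rng = np.random.default_rng(10)
p_bias, n = 2 / 3, 40
assignment = []
for _ in range(n):
    diff = sum(assignment) - (len(assignment) - sum(assignment))  # (# A) - (# B)
    if diff == 0:
        prob_a = 0.5
    elif diff > 0:
        prob_a = 1 - p_bias       # A is ahead, favour B
    else:
        prob_a = p_bias           # B is ahead, favour A
    assignment.append(int(rng.random() < prob_a))
print("treatment A count:", sum(assignment), "of", n)
```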

19.
Complete sets of orthogonal F-squares of order n = s^p, where s is a prime or prime power and p is a positive integer, have been constructed by Hedayat, Raghavarao, and Seiden (1975). Federer (1977) has constructed complete sets of orthogonal F-squares of order n = 4t, where t is a positive integer. We give a general procedure for constructing orthogonal F-squares of order n from an orthogonal array (n, k, s, 2) and an OL(s, t) set, where n is not necessarily a prime or prime power. In particular, we show how to construct sets of orthogonal F-squares of order n = 2s^p, where s is a prime or prime power and p is a positive integer. These sets are shown to be near complete and approach complete sets as s and/or p become large. We have also shown how to construct orthogonal arrays by these methods. In addition, the best upper bound on the number t of orthogonal F(n, λ1), F(n, λ2), …, F(n, λt) squares is given.

20.
Let p independent test statistics be available to test a null hypothesis concerned with the same parameter. The p tests are assumed to be similar. Asymptotic and non-asymptotic optimality properties of combined tests are studied. The asymptotic study centers around two notions. The first is Bahadur efficiency. The second is based on a notion of second-order comparisons. The non-asymptotic study is concerned with admissibility questions. Most of the popular combining methods are considered, along with a method not studied in the past. Among the results are the following: assume each of the p statistics has the same Bahadur slope; then the combined test based on the sum of normal transforms is asymptotically best among all tests studied, by virtue of second-order considerations. Most of the popular combined tests are inadmissible for testing the noncentrality parameter of chi-square, t, and F distributions. For chi-square, a combined test is offered which is admissible, asymptotically optimal (first order), asymptotically optimal (second order) among all tests studied, and for which critical values are obtainable in special cases. Extensions of the basic model are given.
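
The "sum of normal transforms" combination highlighted above, often called Stouffer's or Liptak's method, can be sketched as below: each p-value is mapped to a standard normal quantile and the standardized sum is referred to N(0, 1). The p-values are illustrative, and scipy's built-in version is shown for comparison.

```python
# A minimal sketch of combining independent p-values by the sum of normal transforms.
import numpy as np
from scipy import stats

p_values = np.array([0.08, 0.12, 0.04])          # illustrative one-sided p-values
z = stats.norm.isf(p_values)                      # upper-tail normal transforms
combined_z = z.sum() / np.sqrt(len(z))
combined_p = stats.norm.sf(combined_z)
print(f"combined z = {combined_z:.3f}, combined p = {combined_p:.4f}")

# scipy also provides this combination directly:
print(stats.combine_pvalues(p_values, method='stouffer'))
```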
