Similar Documents
1.
Staudte R.G., Zhang J. Lifetime Data Analysis, 1997, 3(4): 383-398
The p-value evidence for an alternative to a null hypothesis regarding the mean lifetime can be unreliable if based on asymptotic approximations when there is only a small sample of right-censored exponential data. However, a guarded weight of evidence for the alternative can always be obtained without approximation, no matter how small the sample, and has some other advantages over p-values. Weights of evidence are defined as estimators of 0 when the null hypothesis is true and 1 when the alternative is true, and they are judged on the basis of the ensuing risks, where risk is the mean squared error of estimation. The evidence is guarded in that a preassigned bound is placed on the risk under the null hypothesis. Practical suggestions are given for choosing the bound and for interpreting the magnitude of the weight of evidence. Acceptability profiles are obtained by inversion of a family of guarded weights of evidence for two-sided alternatives to point hypotheses, just as confidence intervals are obtained from tests; these profiles are arguably more informative than confidence intervals, and are easily determined for any level and any sample size, however small. They can help in understanding the effects of different amounts of censoring. They are found for several small data sets, including a sample of size 12 for post-operative cancer patients. Both singly Type I and singly Type II censored examples are included. An examination of the risk functions of these guarded weights of evidence suggests that if the censoring time is of the same magnitude as the mean lifetime, or larger, then the risks in using a guarded weight of evidence based on a likelihood ratio are not much larger than they would be if the parameter were known.
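The abstract does not give a construction, but the calibration idea can be sketched for the simplest case. The Python sketch below assumes complete (uncensored) exponential data and a simple alternative: a scaled likelihood ratio is clipped to [0, 1], and the scaling constant is tuned by Monte Carlo so that the mean squared error under the null stays below a preassigned bound. The clipping form and all names are illustrative, not the authors'.

```python
import numpy as np

rng = np.random.default_rng(0)

def lik_ratio(s, n, theta0, theta1):
    """Likelihood ratio L(theta1)/L(theta0) for n i.i.d. exponential
    lifetimes with sum s (theta = mean lifetime)."""
    return (theta0 / theta1) ** n * np.exp(s * (1.0 / theta0 - 1.0 / theta1))

def calibrate(theta0, theta1, n, bound=0.05, reps=20000):
    """Find c so that the null risk E_H0[min(1, c*LR)^2] is about `bound`."""
    s = rng.exponential(theta0, size=(reps, n)).sum(axis=1)
    lrs = lik_ratio(s, n, theta0, theta1)
    lo, hi = 0.0, 1e8
    for _ in range(60):  # bisection; the null risk is increasing in c
        mid = 0.5 * (lo + hi)
        risk = np.mean(np.minimum(1.0, mid * lrs) ** 2)
        lo, hi = (mid, hi) if risk < bound else (lo, mid)
    return 0.5 * (lo + hi)

# n = 12, H0: mean lifetime 1 vs H1: mean lifetime 2, null risk bound 0.05.
c = calibrate(1.0, 2.0, n=12)
x = rng.exponential(2.0, size=12)                   # data drawn from the alternative
w = min(1.0, c * lik_ratio(x.sum(), 12, 1.0, 2.0))  # guarded weight of evidence
print(f"c = {c:.3g}, weight of evidence = {w:.3f}")
```

No asymptotic approximation enters: the calibration is exact up to Monte Carlo error, whatever the sample size.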

2.
The concepts of guarded weights of evidence and acceptability profiles were extended to the distribution-free setting in Dollinger, Kulinskaya & Staudte (1999). That first of two parts emphasized the advantages of these concepts over traditional ones, such as p-values and confidence intervals derived from hypothesis tests, for small samples. Here in Part II, asymptotic expressions are found for guarded weights of evidence for hypotheses regarding the median of a symmetric distribution, along with the related acceptability profiles for the median. It is also seen that for local alternatives the efficacy and Pitman asymptotic relative efficiency of the sign statistic for testing hypotheses carry over to the more general setting of guarded weights of evidence.

3.
Background: Many exposures in epidemiological studies have nonlinear effects, and the problem is to choose an appropriate functional relationship between such exposures and the outcome. One common approach is to investigate several parametric transformations of the covariate of interest and to select a posteriori the function that fits the data best. However, such an approach may result in an inflated Type I error. Methods: Through a simulation study, we generated data from Cox models with different transformations of a single continuous covariate. We investigated the Type I error rate and the power of the likelihood ratio test (LRT) corresponding to three different procedures that considered the same set of parametric dose-response functions. The first, unconditional, approach did not involve any model selection, while the second, conditional, approach was based on a posteriori selection of the parametric function. The proposed third approach was similar to the second except that it used a corrected critical value for the LRT to ensure a correct Type I error. Results: The Type I error rate of the second approach was twice the nominal size. For simple monotone dose-response functions, the corrected test had power similar to the unconditional approach, while for non-monotone dose-response functions it had higher power. A real-life application, concerning the effect of body mass index on the risk of coronary heart disease death, illustrates the advantage of the proposed approach. Conclusion: Our results confirm that selecting the functional form of the dose-response a posteriori inflates the Type I error. The corrected procedure, which can be applied in a wide range of situations, may provide a good trade-off between Type I error and power.
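The abstract does not spell out the correction, but a natural reading is a simulated-null cutoff for the maximal LRT over the candidate functions. The sketch below illustrates that reading with ordinary least squares in place of the Cox model, to keep it self-contained; the candidate transformations and all names are illustrative assumptions.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)
transforms = [lambda x: x, np.log, np.sqrt, np.square]  # candidate dose-response shapes

def max_lrt(x, y):
    """Largest LRT over the candidate transforms vs. the intercept-only null."""
    n = len(y)
    rss0 = np.sum((y - y.mean()) ** 2)
    best = 0.0
    for f in transforms:
        z = f(x)
        beta = np.polyfit(z, y, 1)                      # slope + intercept fit
        rss1 = np.sum((y - np.polyval(beta, z)) ** 2)
        best = max(best, n * np.log(rss0 / rss1))       # Gaussian LRT statistic
    return best

# Simulate the null (covariate unrelated to outcome) to get a corrected cutoff.
n, B = 200, 2000
x = rng.uniform(0.5, 3.0, size=n)                       # positive, so log/sqrt are defined
null_stats = [max_lrt(x, rng.normal(size=n)) for _ in range(B)]
corrected = np.quantile(null_stats, 0.95)
naive = chi2.ppf(0.95, df=1)
print(f"naive chi2(1) cutoff: {naive:.2f}, corrected cutoff: {corrected:.2f}")
```

The corrected cutoff exceeds the naive chi-square one, reflecting the selection over several candidate shapes.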

4.
In this paper, a class of tests is developed for comparing the cause-specific hazard rates of m competing risks simultaneously in K (≥ 2) groups. The data available for a unit are its failure time along with the identifier of the risk claiming the failure. In practice, the failure time data are generally right censored. The tests are based on the differences between the weighted averages of the cause-specific hazard rates corresponding to each risk. No assumption is made regarding the dependence of the competing risks. It is shown that the proposed test statistic is asymptotically chi-squared distributed. The proposed test is shown to be optimal for a specific type of local alternatives. The choice of weight function is also discussed. A simulation study using the multivariate Gumbel distribution compares the optimal weight function with a proposed weight function intended for use in practice. The test is also applied to real data on the termination of an intrauterine device. An erratum to this article is available.

5.
Testing the equality of regression coefficients across two regressions is a problem faced by analysts in a variety of fields. If the error variances of the two regressions are unequal, it is known that the standard large-sample F-test for equality of the coefficients is compromised: its actual size can differ substantially from the stated significance level in small samples. This article addresses the problem and borrows from the literature on the Behrens-Fisher problem to provide simple modifications of the large-sample test that allow one to better control the probability of committing a Type I error. Empirical evidence indicates that the suggested modifications yield tests superior to well-known alternatives over a wide range of the parameter space.

6.
Real-world data often fail to meet the underlying assumption of population normality. The Rank Transformation (RT) procedure has been recommended as an alternative to the parametric factorial analysis of covariance (ANCOVA). The purpose of this study was to compare the Type I error and power properties of the RT ANCOVA with those of the parametric procedure in a completely randomized, balanced 3 × 4 factorial layout with one covariate. The study examined tests of homogeneity of regression coefficients and of interaction under conditional (non)normality. Both procedures displayed erratic Type I error rates for the test of homogeneity of regression coefficients under conditional nonnormality. Even with all parametric assumptions valid, the simulation results demonstrated that the RT ANCOVA failed as a test for either homogeneity of regression coefficients or interaction, owing to severe Type I error inflation; the inflation was most severe when departures from conditional normality were extreme. The RT procedure was also associated with a loss of power. It is recommended that the RT procedure not be used as an alternative to factorial ANCOVA, despite its endorsement by SAS, IMSL, and other respected sources.
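For readers unfamiliar with the procedure under evaluation, here is a minimal sketch of the RT step, assuming statsmodels is available: the response is replaced by its ranks and the usual parametric factorial ANCOVA is run on the ranks, from which the interaction test is read off. The layout and effect sizes are illustrative; the study's conclusion is that this test's level cannot be trusted.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm
from scipy.stats import rankdata

rng = np.random.default_rng(9)

# Balanced 3 x 4 layout with one covariate; no true interaction, so H0 holds.
n_cell = 10
a = np.repeat(np.arange(3), 4 * n_cell)
b = np.tile(np.repeat(np.arange(4), n_cell), 3)
x = rng.normal(size=a.size)
y = 0.5 * x + rng.standard_t(3, size=a.size)   # heavy-tailed (nonnormal) errors

df = pd.DataFrame({"a": a, "b": b, "x": x, "ry": rankdata(y)})

# RT procedure: ranked response into the standard factorial ANCOVA.
fit = smf.ols("ry ~ C(a) * C(b) + x", data=df).fit()
print(anova_lm(fit, typ=2).loc["C(a):C(b)", "PR(>F)"])  # interaction p-value
```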

7.
Some nonparametric methods have been proposed for comparing survival medians. Most of them rely on the asymptotic null distribution to estimate the p-value. However, for small to moderate sample sizes, such tests may have an inflated Type I error rate, which limits their application. In this article, we propose a new nonparametric test that uses the bootstrap to estimate the mean and variance of the sample median. Through comprehensive simulation, we show that the proposed approach controls the Type I error rate well. A real data application illustrates the use of the new test.
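A minimal sketch of the bootstrap ingredient, simplified to uncensored two-sample data: each sample median's mean and variance are estimated by resampling, and the studentized difference is referred to the normal distribution. The handling of censoring in the actual proposal is omitted, and all names are illustrative.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

def boot_median_test(x, y, B=2000):
    """Two-sided test for equal medians using bootstrap mean/variance of each
    sample median (a simplification that ignores censoring)."""
    bx = np.median(rng.choice(x, size=(B, len(x)), replace=True), axis=1)
    by = np.median(rng.choice(y, size=(B, len(y)), replace=True), axis=1)
    z = (bx.mean() - by.mean()) / np.sqrt(bx.var(ddof=1) + by.var(ddof=1))
    return 2 * norm.sf(abs(z))  # two-sided p-value

x = rng.exponential(1.0, size=25)
y = rng.exponential(1.6, size=30)
print(f"p-value: {boot_median_test(x, y):.3f}")
```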

8.
In this paper, an empirical likelihood ratio based goodness-of-fit test for skew normality is proposed. The asymptotic behavior of the test statistic under both the null and the alternative hypothesis is derived. Simulations indicate that the Type I error of the proposed test is well controlled at a given nominal level. A power comparison with other available tests shows that the proposed test is competitive. The test is applied to an IQ score data set and to the Australian Institute of Sport data set to illustrate the testing procedure.

9.
For comparing two cumulative hazard functions, we consider an extension of the Kullback–Leibler information to the cumulative hazard function, which concerns the ratio of cumulative hazard functions. We then use its estimate as a goodness-of-fit statistic for Type II censored data. For an exponential null distribution, the proposed test statistic is shown to outperform other test statistics based on the empirical distribution function under heavy censoring against increasing-hazard alternatives.
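The abstract does not spell the statistic out, so purely as an illustration of the ingredients, the sketch below evaluates a KL-flavored discrepancy, using the nonnegative integrand r − 1 − log r, between the Nelson–Aalen cumulative hazard and the fitted exponential cumulative hazard at the uncensored order statistics of a Type II censored sample. This is an assumed illustrative form, not the paper's test statistic.

```python
import numpy as np

def kl_chf_discrepancy(t_ordered, n):
    """KL-flavored distance between the Nelson-Aalen cumulative hazard and the
    fitted exponential cumulative hazard, given the r smallest of n lifetimes."""
    r = len(t_ordered)
    theta = (t_ordered.sum() + (n - r) * t_ordered[-1]) / r  # exponential MLE
    na = np.cumsum(1.0 / (n - np.arange(r)))                 # Nelson-Aalen at t_(i)
    ratio = na / (t_ordered / theta)                         # ratio of cumulative hazards
    return np.mean(ratio - 1.0 - np.log(ratio))              # >= 0, zero iff ratio == 1

rng = np.random.default_rng(3)
n, r = 40, 25                                   # heavy Type II censoring
sample = np.sort(rng.exponential(2.0, n))[:r]   # r smallest order statistics
print(f"discrepancy: {kl_chf_discrepancy(sample, n):.4f}")
```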

10.
Noting that the usual criteria for choosing an optimal test, Uniform Power and Local Power, lie at opposite ends of a spectrum of dominance criteria, a complete “Power Dominance” family of criteria for classifying and choosing optimal tests on the basis of their power characteristics is identified, in which successive orders of dominance attach increasing weight to power close to the null hypothesis. Indices of the extent to which a preferred test has superior power characteristics over other members of its class, and an index of the proximity of a test to the envelope function of alternative tests, are also provided. The ideas are exemplified using various optimal test statistics for Normal and Laplace population distributions.

11.
A class of test statistics is introduced which is sensitive to the alternative of stochastic ordering in the two-sample censored data problem. The test statistics, which evaluate a cumulative weighted difference in survival distributions, are developed while taking into account imbalances in baseline covariates between the two groups. This procedure can be used to test the null hypothesis of no treatment effect, especially when baseline hazards cross and prognostic covariates need to be adjusted for. The statistics are semiparametric, not rank based, and can be written as integrated weighted differences in estimated survival functions, where the survival estimates are adjusted for covariate imbalances. The asymptotic distribution theory of the tests is developed, yielding test procedures that are shown to be consistent under a fixed alternative. The choice of weight function is discussed and relies on stability and interpretability considerations. An example taken from a clinical trial for acquired immune deficiency syndrome is presented.
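Stripped of the covariate adjustment that is the paper's main contribution, the basic quantity is an integrated weighted difference of survival estimates. A sketch with unit weight and hand-rolled Kaplan-Meier curves, all names illustrative:

```python
import numpy as np

def km(time, event, grid):
    """Kaplan-Meier survival estimate evaluated on a common time grid."""
    order = np.argsort(time)
    t, d = time[order], event[order]
    at_risk = len(t) - np.arange(len(t))                   # risk set sizes
    factors = np.where(d == 1, 1.0 - 1.0 / at_risk, 1.0)   # censorings contribute 1
    surv_at_t = np.cumprod(factors)
    idx = np.searchsorted(t, grid, side="right") - 1       # last observed time <= u
    return np.where(idx >= 0, surv_at_t[np.clip(idx, 0, None)], 1.0)

rng = np.random.default_rng(10)
n = 80
t1, c1 = rng.exponential(2.0, n), rng.exponential(4.0, n)  # group 1: lifetimes, censoring
t2, c2 = rng.exponential(1.4, n), rng.exponential(4.0, n)  # group 2
time1, ev1 = np.minimum(t1, c1), (t1 <= c1).astype(int)
time2, ev2 = np.minimum(t2, c2), (t2 <= c2).astype(int)

grid = np.linspace(0.0, min(time1.max(), time2.max()), 200)
diff = km(time1, ev1, grid) - km(time2, ev2, grid)
stat = np.trapz(diff, grid)   # unit weight; the paper adjusts for covariates
print(f"integrated survival difference: {stat:.3f}")
```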

12.
The ANOVA-F test is the most popular and commonly used procedure for comparing J independent groups. However, it is well known that this method is very sensitive to non-normality, which has led to the derivation of alternative techniques based on robust estimators. In this work, the ANOVA-F test, the trimmed-mean Welch test, the bootstrap-t trimmed-mean Welch test, the Schrader and Hettmansperger method with trimmed means, a percentile bootstrap method with trimmed means, and a newly proposed method were compared in terms of both Type I error probability and power. The proposed method compares well with the ANOVA-F test and the other alternatives under various situations.
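A harness for the kind of Type I error comparison described here is easy to sketch; it is shown for the ANOVA-F test only, under equal means but skewed, heteroscedastic groups of unequal sizes (the lognormal mean exp(sigma^2/2) is subtracted so that H0 holds). The robust competitors would plug into the same hypothetical `type1_error` helper.

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(4)

def type1_error(test, samplers, reps=5000, alpha=0.05):
    """Estimate a test's Type I error rate; all samplers share the same mean."""
    hits = 0
    for _ in range(reps):
        groups = [s() for s in samplers]
        hits += test(*groups).pvalue < alpha
    return hits / reps

# Equal means (all zero), but skewed data with unequal spreads and group sizes.
samplers = [
    lambda: rng.lognormal(0.0, 0.5, 15) - np.exp(0.125),
    lambda: rng.lognormal(0.0, 1.0, 25) - np.exp(0.5),
    lambda: rng.lognormal(0.0, 1.5, 40) - np.exp(1.125),
]
print(f"estimated Type I error of ANOVA-F: {type1_error(f_oneway, samplers):.3f}")
```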

13.
In many situations it is necessary to test the equality of the means of two normal populations when the variances are unknown and unequal. This paper studies the celebrated and controversial Behrens-Fisher problem via an adjusted likelihood-ratio test that uses the maximum likelihood estimates of the parameters under both the null and the alternative models. This procedure allows the significance level to be adjusted in accordance with the degrees of freedom, balancing the risk due to the bias of the maximum likelihood estimates against the risk due to the increase in variance. A large-scale Monte Carlo investigation shows that −2 ln Λ has an empirical chi-square distribution with fractional degrees of freedom rather than a chi-square distribution with one degree of freedom. Monte Carlo power curves are also examined under several different conditions to compare the performance of several conventional procedures with that of the proposed procedure with respect to control over Type I errors and power.
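A sketch of the Monte Carlo step, assuming the profile likelihood is maximized numerically over the common mean: since a chi-square with k degrees of freedom has mean k, averaging simulated values of −2 ln Λ under the null gives a quick estimate of the fractional degrees of freedom. The sample sizes and variance ratio are illustrative.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(5)

def neg2_log_lambda(x, y):
    """-2 ln(Lambda) for H0: equal normal means with unequal variances."""
    def nprof(mu):  # negative profile log-likelihood of a common mean mu
        return 0.5 * (len(x) * np.log(np.mean((x - mu) ** 2))
                      + len(y) * np.log(np.mean((y - mu) ** 2)))
    res = minimize_scalar(nprof, bounds=(min(x.min(), y.min()),
                                         max(x.max(), y.max())), method="bounded")
    mu0 = res.x  # restricted MLE of the common mean
    return (len(x) * np.log(np.mean((x - mu0) ** 2) / np.mean((x - x.mean()) ** 2))
            + len(y) * np.log(np.mean((y - mu0) ** 2) / np.mean((y - y.mean()) ** 2)))

# Average of -2 ln(Lambda) under H0 estimates the fractional degrees of freedom.
stats = [neg2_log_lambda(rng.normal(0, 1, 8), rng.normal(0, 3, 12))
         for _ in range(4000)]
print(f"estimated fractional df: {np.mean(stats):.2f} (nominal chi2 df = 1)")
```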

14.
The normalized maximum likelihood (NML) is a recent penalized likelihood that has properties that justify defining the amount of discrimination information (DI) in the data supporting an alternative hypothesis over a null hypothesis as the logarithm of an NML ratio, namely, the alternative hypothesis NML divided by the null hypothesis NML. The resulting DI, like the Bayes factor but unlike the P‐value, measures the strength of evidence for an alternative hypothesis over a null hypothesis such that the probability of misleading evidence vanishes asymptotically under weak regularity conditions and such that evidence can support a simple null hypothesis. Instead of requiring a prior distribution, the DI satisfies a worst‐case minimax prediction criterion. Replacing a (possibly pseudo‐) likelihood function with its weighted counterpart extends the scope of the DI to models for which the unweighted NML is undefined. The likelihood weights leverage side information, either in data associated with comparisons other than the comparison at hand or in the parameter value of a simple null hypothesis. Two case studies, one involving multiple populations and the other involving multiple biological features, indicate that the DI is robust to the type of side information used when that information is assigned the weight of a single observation. Such robustness suggests that very little adjustment for multiple comparisons is warranted if the sample size is at least moderate. The Canadian Journal of Statistics 39: 610–631; 2011. © 2011 Statistical Society of Canada
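NML is rarely tractable, but a Bernoulli toy case shows the DI mechanics: the alternative's NML divides the maximized likelihood by the Shtarkov normalizing sum, while a simple null's NML is just its likelihood. This is a sketch of the definition only, not of the paper's case studies.

```python
from math import comb, log

def shtarkov_bernoulli(n):
    """Normalizer sum_k C(n,k) (k/n)^k (1-k/n)^(n-k) for the Bernoulli NML."""
    total = 0.0
    for k in range(n + 1):
        p = k / n
        total += comb(n, k) * p ** k * (1 - p) ** (n - k)  # 0**0 == 1 in Python
    return total

def discrimination_information(k, n, p0):
    """log of (alternative NML / null NML); the null is simple, so its NML
    is just its likelihood at p0."""
    phat = k / n
    loglik_hat = k * log(phat) + (n - k) * log(1 - phat) if 0 < k < n else 0.0
    loglik_null = k * log(p0) + (n - k) * log(1 - p0)
    return loglik_hat - log(shtarkov_bernoulli(n)) - loglik_null

# 18 successes in 30 trials against the simple null H0: p = 0.5.
print(f"DI = {discrimination_information(18, 30, 0.5):.3f} nats")
```

Note that the DI can be negative, in which case the data support the simple null, something a P-value cannot express.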

15.
A robust test for the one-way ANOVA model under heteroscedasticity is developed in this paper. The data are assumed to be symmetrically distributed, apart from some outliers, although the assumption of normality may be violated. The test statistic is a weighted sum of squares similar to the Welch (1951) test statistic, but any of a variety of robust measures of location and scale for the populations of interest may be used in place of the usual mean and standard deviation. Under the commonly occurring condition that the robust measures of location and scale are asymptotically normal, we derive approximations to the distribution of the test statistic under the null hypothesis and under alternative hypotheses. An expression for relative efficiency is derived, allowing the efficiency of the test to be compared as a function of the choice of location and scale estimators used in the test statistic. As an illustration of the theory, we apply it to three commonly used robust location–scale estimator pairs: the trimmed mean with the Winsorized standard deviation; the Huber Proposal 2 estimator pair; and the Hampel robust location estimator with the median absolute deviation.
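The statistic is naturally parameterized by the choice of estimator pair. A sketch with the trimmed mean and Winsorized standard deviation plugged in, assuming a Welch-style weighted sum of squares; the approximate null reference distribution derived in the paper is not reproduced here, so the code only evaluates the statistic.

```python
import numpy as np
from scipy.stats import trim_mean
from scipy.stats.mstats import winsorize

def welch_type_stat(groups, loc, scale):
    """Welch-style weighted sum of squares with pluggable location/scale
    estimators (reference distribution omitted)."""
    mu = np.array([loc(g) for g in groups])
    w = np.array([len(g) / scale(g) ** 2 for g in groups])
    center = np.sum(w * mu) / np.sum(w)          # weighted grand location
    return np.sum(w * (mu - center) ** 2)

trimmed = lambda g: trim_mean(g, 0.2)            # 20% trimmed mean
win_sd = lambda g: np.std(np.asarray(winsorize(g, limits=[0.2, 0.2])), ddof=1)

rng = np.random.default_rng(6)
groups = [rng.standard_t(3, size=n) + d for n, d in [(20, 0), (25, 0.5), (30, 1)]]
print(f"statistic: {welch_type_stat(groups, trimmed, win_sd):.3f}")
```

Swapping in the Huber or Hampel pair only requires passing different `loc` and `scale` callables.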

16.
Econometric Reviews, 2007, 26(6): 685-703
We derive a simple result that allows us to test for the presence of state dependence in a dynamic Logit model with time-variant transition probabilities and an arbitrary distribution of the unobserved heterogeneity. Monte Carlo evidence suggests that this test has desirable properties even when some of the model's assumptions are violated. We also consider alternative tests that will have desirable properties only when the transition probabilities do not depend on time, and provide evidence that there is an “acceptable” range in which ignoring time-dependence does not matter too much. We conclude with an application to the Barker Hypothesis.

17.
It is of interest in some applications to determine whether there is a relationship between a hazard rate function (or a cumulative incidence function) and a mark variable which is only observed at uncensored failure times. We develop nonparametric tests for this problem when the mark variable is continuous. Tests are developed for the null hypothesis that the mark-specific hazard rate is independent of the mark versus ordered and two-sided alternatives expressed in terms of mark-specific hazard functions and mark-specific cumulative incidence functions. The test statistics are based on functionals of a bivariate test process equal to a weighted average of differences between a Nelson-Aalen-type estimator of the mark-specific cumulative hazard function and a nonparametric estimator of this function under the null hypothesis. The weight function in the test process can be chosen so that the test statistics are asymptotically distribution-free. Asymptotically correct critical values are obtained through a simple simulation procedure. The testing procedures are shown to perform well in numerical studies, and are illustrated with an AIDS clinical trial example. Specifically, the tests are used to assess if the instantaneous or absolute risk of treatment failure depends on the amount of accumulation of drug resistance mutations in a subject's HIV virus. This assessment helps guide development of anti-HIV therapies that surmount the problem of drug resistance.

18.
Stein's (1945) two-sample approach and Tukey's T-Method of multiple comparisons (see, e.g., Miller, 1966, Ch. 2) are combined to obtain fixed-width simultaneous confidence intervals, and simultaneous test procedures of predetermined Type I and Type II error levels, for all contrasts in a one-way layout. The constants needed to implement the two-stage procedure are obtained under a least favorable configuration of the parameters, which provides the required protection of the null and alternative hypotheses under any configuration. A table is provided for selected designs and error levels, and an example illustrates certain features of the new procedure.
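The single-mean version of the two-stage idea is the easiest to sketch: the first-stage variance estimate determines how many further observations are needed for a fixed-width interval. The simultaneous-contrast version of the paper requires a different constant, derived under the least favorable configuration, which is not reproduced here; all names below are illustrative.

```python
import numpy as np
from math import ceil
from scipy.stats import t

rng = np.random.default_rng(7)

def stein_two_stage(draw, n0=15, half_width=0.5, alpha=0.05):
    """Stein-type two-stage fixed-width interval for a single mean: the
    first-stage variance sets the total sample size for the target width."""
    first = draw(n0)
    s2 = first.var(ddof=1)
    tcrit = t.ppf(1 - alpha / 2, df=n0 - 1)          # df from the first stage only
    n_total = max(n0, ceil(tcrit ** 2 * s2 / half_width ** 2))
    combined = (np.concatenate([first, draw(n_total - n0)])
                if n_total > n0 else first)
    m = combined.mean()
    return m - half_width, m + half_width, n_total

lo, hi, n = stein_two_stage(lambda k: rng.normal(10, 2, k))
print(f"n = {n}, interval = ({lo:.2f}, {hi:.2f})")
```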

19.
Although several authors have noted that the median test has low power in small samples, it continues to be presented in many statistical textbooks, included in a number of popular statistical software packages, and used in a variety of application areas. We present the results of a power simulation study showing that the median test has noticeably lower power than other readily available rank tests, even for the double exponential distribution for which it is asymptotically most powerful. We suggest that the median test be “retired” from routine use and recommend alternative rank tests that have superior power over a relatively large family of symmetric distributions.
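The comparison is easy to reproduce in outline. A sketch using SciPy's Mood's median test and the Wilcoxon-Mann-Whitney test under double exponential (Laplace) location shifts; sample sizes and shifts are illustrative, not the study's design.

```python
import numpy as np
from scipy.stats import median_test, mannwhitneyu

rng = np.random.default_rng(8)

def power(test_pvalue, shift, n=20, reps=3000, alpha=0.05):
    """Monte Carlo power under a location shift of the double exponential."""
    hits = 0
    for _ in range(reps):
        x = rng.laplace(0.0, 1.0, n)
        y = rng.laplace(shift, 1.0, n)
        hits += test_pvalue(x, y) < alpha
    return hits / reps

med_p = lambda x, y: median_test(x, y)[1]                          # Mood's median test
mwu_p = lambda x, y: mannwhitneyu(x, y, alternative="two-sided")[1]  # rank test

for shift in (0.5, 1.0):
    print(f"shift {shift}: median test {power(med_p, shift):.2f}, "
          f"Wilcoxon {power(mwu_p, shift):.2f}")
```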
