1.

Bootstrap estimation of actual significance levels for tests based on estimated nuisance parameters

Qiwei Yao Wenyang Zhang Howell Tong 《Statistics and Computing》2001,11(4):367-371

Often for a non-regular parametric hypothesis, a tractable test statistic involves a nuisance parameter. A common practice is to replace the unknown nuisance parameter by its estimator. The validity of such a replacement can only be justified for an infinite sample in the sense that under appropriate conditions the asymptotic distribution of the statistic under the null hypothesis is unchanged when the nuisance parameter is replaced by its estimator (Crowder M.J. 1990. Biometrika 77: 499–506). We propose a bootstrap method to calibrate the error incurred in the significance level, for finite samples, due to the replacement. Further, we have proved that the bootstrap method provides a more accurate estimator for the unknown actual significance level than the nominal level. Simulations demonstrate the proposed methodology.
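
As a rough illustration of the calibration idea in this abstract, the sketch below estimates the actual level of a nominal 5% test by parametric bootstrap. The setting is an assumption chosen for simplicity, not the paper's non-regular case: a z-test of a zero normal mean in which the unknown standard deviation (the nuisance parameter) is replaced by the sample standard deviation. The function name, data, and bootstrap size are hypothetical.

```python
import math
import random
import statistics

def bootstrap_actual_level(data, crit=1.96, n_boot=4000, seed=0):
    """Estimate the rejection rate of |sqrt(n) * mean / s| > crit under H0: mean = 0."""
    rng = random.Random(seed)
    n = len(data)
    sigma_hat = statistics.stdev(data)  # plug-in estimate of the nuisance parameter
    rejections = 0
    for _ in range(n_boot):
        # Simulate a sample from the fitted null model N(0, sigma_hat^2).
        sample = [rng.gauss(0.0, sigma_hat) for _ in range(n)]
        s_star = statistics.stdev(sample)  # re-estimate the nuisance parameter
        t_star = math.sqrt(n) * statistics.fmean(sample) / s_star
        if abs(t_star) > crit:
            rejections += 1
    return rejections / n_boot

level = bootstrap_actual_level([0.3, -1.1, 0.8, 1.9, -0.4])
print(f"nominal level 0.05, bootstrap estimate of actual level: {level:.3f}")
```

With only five observations the estimate should land well above 0.05, because the plug-in statistic has heavier tails than the standard normal reference distribution.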

2.

Exact calculations for sequential tests based on Bernoulli trials

Beverley D. Causey 《Communications in Statistics - Simulation and Computation》2013,42(2):491-495

We consider methods of computing exactly the probability of “acceptance” and the “average sample size needed” for the sequential probability ratio test (SPRT) and likewise the newer “2-SPRT,” concerning the value of a Bernoulli parameter. The methods permit one to approximate, iteratively, the desired operating characteristics for the test.
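
One way to picture such a calculation is a forward recursion over the number of successes, sketched below. This is not necessarily the algorithm of the paper: the Wald boundaries, the truncation at `n_max`, and all names are assumptions made for illustration. The recursion tracks the probability of each success count while the test is still continuing, and moves mass to "accept" or "reject" as soon as the log-likelihood ratio crosses a boundary.

```python
import math

def sprt_bernoulli_oc(p, p0, p1, alpha, beta, n_max=2000):
    """Acceptance probability and expected sample size of a Bernoulli SPRT at true p."""
    upper = math.log((1 - beta) / alpha)   # reject H0: p = p0 when LLR >= upper
    lower = math.log(beta / (1 - alpha))   # accept H0 when LLR <= lower
    step_success = math.log(p1 / p0)
    step_failure = math.log((1 - p1) / (1 - p0))
    continuing = {0: 1.0}  # success count -> probability of still sampling
    accept = reject = expected_n = 0.0
    for n in range(1, n_max + 1):
        nxt = {}
        for s, prob in continuing.items():
            for s_new, w in ((s + 1, prob * p), (s, prob * (1 - p))):
                if w == 0.0:
                    continue
                llr = s_new * step_success + (n - s_new) * step_failure
                if llr >= upper:
                    reject += w
                    expected_n += n * w
                elif llr <= lower:
                    accept += w
                    expected_n += n * w
                else:
                    nxt[s_new] = nxt.get(s_new, 0.0) + w
        continuing = nxt
        if not continuing:
            break
    leftover = sum(continuing.values())  # mass not yet stopped by n_max
    return accept, reject, expected_n, leftover

acc, rej, avg_n, left = sprt_bernoulli_oc(p=0.3, p0=0.3, p1=0.5, alpha=0.05, beta=0.05)
print(f"P(accept)={acc:.4f}  P(reject)={rej:.4f}  E[N]~{avg_n:.1f}  leftover={left:.2e}")
```

The `leftover` value shows how much probability the truncation ignores; raising `n_max` until it is negligible gives the iterative approximation the abstract mentions.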

3.

Corrected p-values for tests based on estimated nuisance parameters

Martin Crowder 《Statistics and Computing》2001,11(4):359-365

In some situations the asymptotic distribution of a random function T_n(θ) that depends on a nuisance parameter θ is tractable when θ has known value. In that case it can be used as a test statistic, if suitably constructed, for some hypothesis. However, in practice, θ often needs to be replaced by an estimator S_n. In this paper general results are given concerning the asymptotic distribution of T_n(S_n) that include special cases previously dealt with. In particular, some situations are covered where the usual likelihood theory is nonregular and extreme values are employed to construct estimators and test statistics.

4.

A note on the critical values used in stepwise tests for multiplicative components of interaction

James R. Schott 《Communications in Statistics - Theory and Methods》2013,42(5):1561-1570

A result is presented concerning the null distribution of a statistic used to determine the number of multiplicative components in a fixed two-way model. This result suggests critical values which are compared with previously suggested critical values.

5.

Tests for noninferiority trials with binomial endpoints: A guide to modern and quasi‐exact methods for biomedical researchers

Enrico Ripamonti Chris J. Lloyd 《Pharmaceutical statistics》2019,18(3):377-387

Applied statisticians and pharmaceutical researchers are frequently involved in the design and analysis of clinical trials where at least one of the outcomes is binary. Treatments are judged by the probability of a positive binary response. A typical example is the noninferiority trial, where it is tested whether a new experimental treatment is practically not inferior to an active comparator with a prespecified margin δ. Except for the special case of δ = 0, no exact conditional test is available although approximate conditional methods (also called second‐order methods) can be applied. However, in some situations, the approximation can be poor and the logical argument for approximate conditioning is not compelling. The alternative is to consider an unconditional approach. Standard methods like the pooled z‐test are already unconditional although approximate. In this article, we review and illustrate unconditional methods with a heavy emphasis on modern methods that can deliver exact, or near exact, results. For noninferiority trials based on either rate difference or rate ratio, our recommendation is to use the so‐called E‐procedure, based on either the score or likelihood ratio statistic. This test is effectively exact, computationally efficient, and respects monotonicity constraints in practice. We support our assertions with a numerical study, and we illustrate the concepts developed in theory with a clinical example in pulmonary oncology; R code to conduct all these analyses is available from the authors.

6.

Parametric bootstrap and approximate tests for two Poisson variates

《Journal of Statistical Computation and Simulation》2012,82(3):263-271

The parametric bootstrap tests and the asymptotic or approximate tests for detecting a difference between two Poisson means are compared. The test statistics used are the Wald statistics with and without log-transformation, the Cox F statistic and the likelihood ratio statistic. It is found that the type I error rate of an asymptotic/approximate test may deviate too much from the nominal significance level α in some situations. It is recommended that the parametric bootstrap tests be used, under which the four test statistics are similarly powerful and their type I error rates are all close to α. We apply the tests to breast cancer data and injurious motor vehicle crash data.
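
A minimal sketch of a parametric bootstrap test of this kind follows, using the untransformed Wald statistic only. The details are assumptions rather than the paper's exact procedure: `x1` and `x2` are total counts over exposures `n1` and `n2`, the null rate is estimated by pooling, the p-value uses the common "plus one" correction, and the Poisson sampler is Knuth's simple method, which is adequate only for moderate means.

```python
import math
import random

def poisson_draw(rng, lam):
    """Knuth's multiplication method; fine while exp(-lam) does not underflow."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def wald_stat(x1, n1, x2, n2):
    """Difference of estimated rates divided by its estimated standard error."""
    var = x1 / n1**2 + x2 / n2**2
    return 0.0 if var == 0 else (x1 / n1 - x2 / n2) / math.sqrt(var)

def bootstrap_pvalue(x1, n1, x2, n2, n_boot=2000, seed=0):
    rng = random.Random(seed)
    observed = abs(wald_stat(x1, n1, x2, n2))
    lam_hat = (x1 + x2) / (n1 + n2)  # pooled rate estimate under H0
    exceed = 0
    for _ in range(n_boot):
        # Resample both counts from the fitted null model.
        b1 = poisson_draw(rng, n1 * lam_hat)
        b2 = poisson_draw(rng, n2 * lam_hat)
        if abs(wald_stat(b1, n1, b2, n2)) >= observed:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)

print(bootstrap_pvalue(x1=50, n1=10, x2=10, n2=10))
```

Swapping `wald_stat` for another statistic (log-transformed Wald, likelihood ratio) leaves the resampling loop unchanged, which is what makes the comparison in the abstract straightforward to set up.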

7.

Comparisons of several Pareto distributions based on record values

Jing Zhao Haiqing Chen Mixia Wu 《Communications in Statistics - Theory and Methods》2018,47(10):2456-2468

In order to avoid wrong conclusions in any further analysis, it is important to conduct a formal comparison of characteristic quantities of the distributions. Familiar characteristic quantities include the mean, quantity and reliability function, and so on. In this paper, we consider two tests for comparing functions of parameters in the Pareto distribution based on record values: a generalized p-value-based test and a parametric bootstrap-based test. The resulting procedures are easy to compute and are applicable to small samples. A simulation study is conducted to investigate and compare the performance of the proposed tests. We note that the generalized p-value-based test almost uniformly outperforms the parametric bootstrap-based test.

8.

UMPU tests based on sequential order statistics

S. Bedbur 《Journal of statistical planning and inference》2010

Sequential order statistics with conditional proportional hazard rates form a regular exponential family in the model parameters. This finding is used to establish uniformly most powerful unbiased (UMPU) tests for a variety of hypotheses.

9.

EXACT P‐VALUES FOR DISCRETE MODELS OBTAINED BY ESTIMATION AND MAXIMIZATION

Chris J. Lloyd 《Australian & New Zealand Journal of Statistics》2008,50(4):329-345

In constructing exact tests from discrete data, one must deal with the possible dependence of the P‐value on nuisance parameter(s) ψ as well as the discreteness of the sample space. A classical but heavy‐handed approach is to maximize over ψ. We prove what has previously been understood informally, namely that maximization produces the unique and smallest possible P‐value subject to the ordering induced by the underlying test statistic and test validity. On the other hand, allowing for the worst case will be more attractive when the P‐value is less dependent on ψ. We investigate the extent to which estimating ψ under the null reduces this dependence. An approach somewhere between full maximization and estimation is partial maximization, with appropriate penalty, as introduced by Berger & Boos (1994, P values maximized over a confidence set for the nuisance parameter. J. Amer. Statist. Assoc. 89, 1012–1016). It is argued that estimation followed by maximization is an attractive, but computationally more demanding, alternative to partial maximization. We illustrate the ideas on a range of low‐dimensional but important examples for which the alternative methods can be investigated completely numerically.

10.

The automatic percentile method: Accurate confidence limits in parametric models

Thomas J. Diciccio Joseph P. Romano 《Revue canadienne de statistique》1989,17(2):155-169

The problem of constructing confidence limits for a scalar parameter is considered. Under weak conditions, Efron's accelerated bias-corrected bootstrap confidence limits are correct to second order in parametric families. In this article, a new method, called the automatic percentile method, for setting approximate confidence limits is proposed as an attempt to alleviate two problems inherent in Efron's method. The accelerated bias-corrected method is not fully automatic, since it requires the calculation of an analytical adjustment; furthermore, it is typically not exact, though for many situations, particularly scalar-parameter families, exact answers are available. In broader generality, the proposed method is exact when exact answers exist, and it is second-order accurate otherwise. The automatic percentile method is automatic, and for scalar parameter models it can be iterated to achieve higher accuracy, with the number of computations being linear in the number of iterations. However, when nuisance parameters are present, only second-order accuracy seems obtainable.

11.

Testing the rate ratio under inverse sampling based on gradient statistic

Cuizhen Niu Qiang Xia 《Journal of applied statistics》2015,42(7):1402-1420

Inverse sampling is widely applied in studies with dichotomous outcomes, especially when the subjects arrive sequentially or the response of interest is difficult to obtain. In this paper, we investigate the rate ratio test problem under inverse sampling based on gradient statistic with the asymptotic method and parametric bootstrap technique. The gradient statistic has many advantages, for example, it is simple to calculate and competitive with Wald-type, score and likelihood ratio tests in terms of local power. Numerical studies are carried out to evaluate the performance of our gradient test and the existing tests, namely Wald-type, score and likelihood ratio tests. The simulation results suggest that the gradient test based on the parametric bootstrap method has excellent type I error control and large powers even in small sample design. Two real examples, from a heart disease study and a drug comparison study, are applied to illustrate our methods.

12.

Permutation tests for homogeneity based on some characterizations

N. G. Ushakov V. G. Ushakov 《Communications in Statistics - Theory and Methods》2017,46(15):7692-7702

In this work, nonparametric tests are proposed for testing the homogeneity of two or more populations. The tests are based on recently obtained characterizations. The test procedure is based on the permutation bootstrap technique. For the two-sample case the new tests are compared with permutation tests based on the empirical characteristic function and some other tests. The comparison is carried out via a Monte Carlo simulation.
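
The permutation mechanism itself can be sketched generically, as below. The abstract does not specify the characterization-based statistics, so the absolute difference in means used here is only a stand-in, and all function names are hypothetical. Any two-sample statistic for which large values indicate heterogeneity can be passed as `stat`.

```python
import random

def permutation_pvalue(x, y, stat, n_perm=2000, seed=0):
    """Monte Carlo permutation p-value for a two-sample homogeneity statistic."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    observed = stat(x, y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled sample
        if stat(pooled[:len(x)], pooled[len(x):]) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

def abs_mean_diff(a, b):
    """Stand-in statistic (an assumption, not the statistic of the paper)."""
    return abs(sum(a) / len(a) - sum(b) / len(b))

shifted = permutation_pvalue(list(range(10)), list(range(100, 110)), abs_mean_diff)
print(shifted)
```

Because the reference distribution is generated by relabelling the observed data, the test needs no distributional assumption beyond exchangeability under the null hypothesis.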

13.

Exact tests based on the Baumgartner-Weiß-Schindler statistic—A survey

Markus Neuhäuser 《Statistical Papers》2005,46(1):1-29

It is the purpose of this paper to review recently-proposed exact tests based on the Baumgartner-Weiß-Schindler statistic and its modification. Except for the generalized Behrens-Fisher problem, these tests are broadly applicable, and they can be used to compare two groups irrespective of whether or not ties occur. In addition, a nonparametric trend test and a trend test for binomial proportions are possible. These exact tests are preferable to commonly-applied tests, such as the Wilcoxon rank sum test, in terms of both type I error rate and power.

14.

Adaptive bootstrap tests and its competitors in the c-sample scale problem

Herbert Büning Michael Rietz 《Journal of applied statistics》2008,35(8):853-866

This paper deals with a study of different types of tests for the two-sided c-sample scale problem. We consider the classical parametric test of Bartlett [M.S. Bartlett, Properties of sufficiency and statistical tests, Proc. R. Stat. Soc. Ser. A. 160 (1937), pp. 268–282], several nonparametric tests, especially the test of Fligner and Killeen [M.A. Fligner and T.J. Killeen, Distribution-free two-sample tests for scale, J. Amer. Statist. Assoc. 71 (1976), pp. 210–213], the test of Levene [H. Levene, Robust tests for equality of variances, in Contribution to Probability and Statistics, I. Olkin, ed., Stanford University Press, Palo Alto, 1960, pp. 278–292] and a robust version of it introduced by Brown and Forsythe [M.B. Brown and A.B. Forsythe, Robust tests for the equality of variances, J. Amer. Statist. Assoc. 69 (1974), pp. 364–367], as well as two adaptive tests proposed by Büning [H. Büning, Adaptive tests for the c-sample location problem – the case of two-sided alternatives, Comm. Statist. Theory Methods 25 (1996), pp. 1569–1582] and Büning [H. Büning, An adaptive test for the two sample scale problem, Nr. 2003/10, Diskussionsbeiträge des Fachbereich Wirtschaftswissenschaft der Freien Universität Berlin, Volkswirtschaftliche Reihe, 2003], which are based on the principle of Hogg [R.V. Hogg, Adaptive robust procedures. A partial review and some suggestions for future applications and theory, J. Amer. Statist. Assoc. 69 (1974), pp. 909–927]. For all the tests we also use bootstrap sampling strategies. Using Monte Carlo methods, we compare all the tests by investigating level α and power β for distributions with different strengths of tailweight and skewness and for various sample sizes. It turns out that the test of Fligner and Killeen in combination with the bootstrap is the best one among all tests considered.

15.

A non-inferiority test for diagnostic accuracy in the absence of the golden standard test based on the paired partial areas under receiver operating characteristic curves

Shu-Man Shih Hsin-Neng Hsieh 《Journal of applied statistics》2016,43(3):550-562

Non-inferiority tests are often measured for the diagnostic accuracy in medical research. The area under the receiver operating characteristic (ROC) curve is a familiar diagnostic measure for the overall diagnostic accuracy. Nevertheless, since it may not differentiate the diverse shapes of the ROC curves with different diagnostic significance, the partial area under the ROC (PAUROC) curve, another summary measure emerges for such diagnostic processes that require the false-positive rate to be in the clinically interested range. Traditionally, to estimate the PAUROC, the golden standard (GS) test on the true disease status is required. Nevertheless, the GS test may sometimes be infeasible. Besides, in a lot of research fields such as the epidemiology field, the true disease status of the patients may not be known or available. Under the normality assumption on diagnostic test results, based on the expectation-maximization algorithm in combination with the bootstrap method, we propose the heuristic method to construct a non-inferiority test for the difference in the paired PAUROCs without the GS test. Through the simulation study, although the proposed method might provide a liberal test, as a whole, the empirical size of the proposed method sufficiently controls the size at the significance level, and the empirical power of the proposed method in the absence of the GS is as good as that of the non-inferiority in the presence of the GS. The proposed method is illustrated with the published data.

16.

Estimation of a population size through capture-mark-recapture method: a comparison of various point and interval estimators

《Journal of Statistical Computation and Simulation》2012,82(3):335-354

This article deals with the estimation of a fixed population size through capture-mark-recapture method that gives rise to hypergeometric distribution. There are a few well-known and popular point estimators available in the literature, but no good comprehensive comparison is available about their merits. Apart from the available estimators, an empirical Bayes (EB) estimator of the population size is proposed. We compare all the point estimators in terms of relative bias and relative mean squared error. Next, two new interval estimators – (a) an EB highest posterior distribution interval and (b) a frequentist interval estimator based on a parametric bootstrap method, are proposed. The comparison is then carried among the two proposed interval estimators and interval estimators derived from the currently available estimators in terms of coverage probability and average length (AL). Based on comprehensive numerical results, we rank and recommend the point estimators as well as interval estimators for practical use. Finally, a real-life data set for a green treefrog population is used as a demonstration for all the methods discussed.
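
For orientation, two classical point estimators for this setting are sketched below. The abstract does not name the estimators it compares, so the choice of the Lincoln-Petersen and Chapman estimators is an assumption, and the argument names are hypothetical: `marked` animals are released, a second sample of size `caught` is taken, and `recaptured` of those carry marks.

```python
import math

def lincoln_petersen(marked, caught, recaptured):
    """Classical ratio estimator of population size; infinite when nothing is recaptured."""
    if recaptured == 0:
        return math.inf
    return marked * caught / recaptured

def chapman(marked, caught, recaptured):
    """Chapman's modification, which stays finite when recaptured == 0."""
    return (marked + 1) * (caught + 1) / (recaptured + 1) - 1

print(lincoln_petersen(100, 50, 10))   # 100 * 50 / 10
print(round(chapman(100, 50, 10), 2))  # 101 * 51 / 11 - 1
```

The two estimates differ most when few marked animals are recaptured, which is the small-sample regime where a comparison by relative bias and relative mean squared error matters.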

17.

Bootstrap misspecification tests for ARCH based on the empirical process of squared residuals

《Journal of Statistical Computation and Simulation》2012,82(7):469-485

We propose and study by means of simulations and graphical tools a class of goodness-of-fit tests for ARCH models. The tests are based on the empirical distribution function of squared residuals and smooth (parametric) bootstrap. We examine empirical size and power by means of a simulation study. While the tests have overall correct size, their power strongly depends on the type of alternative and is particularly high when the assumption of Gaussian innovations is violated. As an example, the tests are applied to returns on Foreign Exchange rates.

18.

Linear estimation of the location and scale parameters based on selected order statistics

Lai K Chan Cheng W 《Communications in Statistics - Theory and Methods》2013,42(7):2259-2278

This paper gives a review of the best linear estimates of the location and/or scale parameters based on a few order statistics selected from a complete or censored sample. Small sample and large sample cases are considered and compared. Some examples of the practical applications of the estimates are outlined.

19.

Bayesian two-stage design for drug screening trials with switching hypothesis tests based on continuous endpoints

Nan Sun Miin-Jye Wen 《Communications in Statistics - Theory and Methods》2021,50(2):415-431

In this paper, we propose a Bayesian two-stage design with changing hypothesis test by bridging a single-arm study and a double-arm randomized trial in one phase II clinical trial based on continuous endpoints rather than binary endpoints. We have also calibrated with respect to frequentist and Bayesian error rates. The proposed design minimizes the Bayesian expected sample size if the new candidate has low or high efficacy activity subject to the constraint upon error rates in both frequentist and Bayesian perspectives. Tables of designs for various combinations of design parameters are also provided.

20.

Shapiro–Wilk test for skew normal distributions based on data transformations

Elizabeth González-Estrada Waldenia Cosmes 《Journal of Statistical Computation and Simulation》2019,89(17):3258-3272

A probability property that connects the skew normal (SN) distribution with the normal distribution is used for proposing a goodness-of-fit test for the composite null hypothesis that a random sample follows an SN distribution with unknown parameters. The random sample is transformed to approximately normal random variables, and then the Shapiro–Wilk test is used for testing normality. The implementation of this test requires neither parametric bootstrap nor the use of tables for different values of the slant parameter. An additional test for the same problem, based on a property that relates the gamma and SN distributions, is also introduced. The results of a power study conducted by Monte Carlo simulation show some good properties of the proposed tests in comparison to existing tests for the same problem.