Similar Articles
20 similar articles found (search time: 31 ms)
1.
Heterogeneity of variances of treatment groups influences the validity and power of significance tests of location in two distinct ways. First, if sample sizes are unequal, the Type I error rate and power are depressed if a larger variance is associated with a larger sample size, and elevated if a larger variance is associated with a smaller sample size. This well-established effect, which occurs in t and F tests, and to a lesser degree in nonparametric rank tests, results from unequal contributions of pooled estimates of error variance in the computation of test statistics. It is observed in samples from normal distributions, as well as non-normal distributions of various shapes. Second, transformation of scores from skewed distributions with unequal variances to ranks produces differences in the means of the ranks assigned to the respective groups, even if the means of the initial groups are equal, and a subsequent inflation of Type I error rates and power. This effect occurs for all sample sizes, equal and unequal. For the t test, the discrepancy diminishes, and for the Wilcoxon–Mann–Whitney test, it becomes larger, as sample size increases. The Welch separate-variance t test overcomes the first effect but not the second. Because of interaction of these separate effects, the validity and power of both parametric and nonparametric tests performed on samples of any size from unknown distributions with possibly unequal variances can be distorted in unpredictable ways.
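The first effect is easy to reproduce in a short simulation; the sketch below (sample sizes, variances, and replication count are illustrative choices, not taken from the paper) pairs the larger variance with the smaller sample and compares the pooled-variance t test with the Welch separate-variance test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_reps = 20000
n1, n2 = 10, 40          # smaller sample paired with the larger variance
sd1, sd2 = 3.0, 1.0      # equal means (both 0), unequal standard deviations

rej_pooled = rej_welch = 0
for _ in range(n_reps):
    x = rng.normal(0.0, sd1, n1)
    y = rng.normal(0.0, sd2, n2)
    # Pooled-variance (classical) t test
    if stats.ttest_ind(x, y, equal_var=True).pvalue < 0.05:
        rej_pooled += 1
    # Welch separate-variance t test
    if stats.ttest_ind(x, y, equal_var=False).pvalue < 0.05:
        rej_welch += 1

print(rej_pooled / n_reps)  # typically well above the nominal 0.05
print(rej_welch / n_reps)   # typically close to the nominal 0.05
```

With this configuration the pooled test rejects a true null far more often than the nominal 5%, while the Welch test stays near it, matching the first effect described above.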

2.
This paper introduces W-tests for assessing homogeneity in mixtures of discrete probability distributions. A W-test statistic depends on the data solely through parameter estimators and, if a penalized maximum likelihood estimation framework is used, has a tractable asymptotic distribution under the null hypothesis of homogeneity. The large-sample critical values are quantiles of a chi-square distribution multiplied by an estimable constant for which we provide an explicit formula. In particular, the estimation of large-sample critical values does not involve simulation experiments or random field theory. We demonstrate that W-tests are generally competitive with a benchmark test in terms of power to detect heterogeneity. Moreover, in many situations, the large-sample critical values can be used even with small to moderate sample sizes. The main implementation issue (selection of an underlying measure) is thoroughly addressed, and we explain why W-tests are well-suited to problems involving large and online data sets. Application of a W-test is illustrated with an epidemiological data set.

3.
Two new statistics are proposed for testing the identity of a high-dimensional covariance matrix. Applying large-dimensional random matrix theory, we study the asymptotic distributions of the proposed statistics as the dimension p and the sample size n tend to infinity proportionally. The proposed tests accommodate situations in which the data dimension is much larger than the sample size and in which the population distribution is non-Gaussian. Numerical studies demonstrate that the proposed tests have good empirical power over a wide range of dimensions and sample sizes.

4.
This paper proposes an affine-invariant test extending the univariate Wilcoxon signed-rank test to the bivariate location problem. It gives two versions of the null distribution of the test statistic. The first version leads to a conditionally distribution-free test which can be used with any sample size. The second version can be used for larger sample sizes and has a limiting χ² distribution with two degrees of freedom under the null hypothesis. The paper investigates the relationship with a test proposed by Jan & Randles (1994). It shows that the Pitman efficiency of this test relative to the new test is equal to 1 for elliptical distributions, but that the two tests are not necessarily equivalent for non-elliptical distributions. These facts are also demonstrated empirically in a simulation study. The new test has the advantage of not requiring the assumption of elliptical symmetry, which is needed to perform the asymptotic version of the Jan and Randles test.

5.
In this paper, we consider the asymptotic distributions of functionals of the sample covariance matrix and the sample mean vector obtained under the assumption that the matrix of observations has a matrix-variate location mixture of normal distributions. The central limit theorem is derived for the product of the sample covariance matrix and the sample mean vector. Moreover, we consider the product of the inverse sample covariance matrix and the mean vector, for which the central limit theorem is established as well. All results are obtained under the large-dimensional asymptotic regime, where the dimension p and the sample size n approach infinity such that p/n → c ∈ [0, +∞) when the sample covariance matrix does not need to be invertible, and p/n → c ∈ [0, 1) otherwise.

6.
In the Bayesian analysis of a multiple-recapture census, different diffuse prior distributions can lead to markedly different inferences about the population size N. Through consideration of the Fisher information matrix it is shown that the number of captures in each sample typically provides little information about N. This suggests that if there is no prior information about capture probabilities, then knowledge of just the sample sizes, and not the number of recaptures, should leave the distribution of N unchanged. A prior model that has this property is identified and the posterior distribution is examined. In particular, asymptotic estimates of the posterior mean and variance are derived. Differences between Bayesian and classical point and interval estimators are illustrated through examples.
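For contrast with the Bayesian treatment, the classical point estimate for a two-sample capture-recapture census is a one-liner; a sketch with hypothetical counts (the Lincoln-Petersen estimator and its bias-corrected Chapman variant):

```python
# Hypothetical counts: first-sample size, second-sample size, recaptures
n1, n2, m = 120, 150, 30

# Lincoln-Petersen estimate of the population size N
N_lp = n1 * n2 / m

# Chapman's bias-corrected variant, better behaved when m is small
N_chapman = (n1 + 1) * (n2 + 1) / (m + 1) - 1

print(N_lp)  # 600.0
```

The abstract's point is precisely that such classical estimates and a Bayesian posterior for N need not agree, especially under diffuse priors.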

7.
Investigators and epidemiologists often use statistics based on the parameters of a multinomial distribution. Two main approaches have been developed to assess the inferences of these statistics. The first uses asymptotic formulae which are valid for large sample sizes. The second computes the exact distribution, which performs quite well for small samples. Both have limitations for sample sizes N neither large enough to satisfy the assumption of asymptotic normality nor small enough to allow the exact distribution to be generated. We analytically computed the 1/N corrections of the asymptotic distribution for any statistic based on a multinomial law. We applied these results to the kappa statistic in 2×2 and 3×3 tables. We also compared the coverage probability obtained with the asymptotic and the corrected distributions under various hypothetical configurations of sample size and theoretical proportions. With this method, the estimates of the mean and the variance were greatly improved, as were the 2.5 and 97.5 percentiles of the distribution, allowing sample sizes as small as about 20 for data sets that are not too asymmetrical. The order of the difference between the exact and the corrected values was 1/N^2 for the mean and 1/N^3 for the variance.
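As a concrete reference point for the statistic being corrected, the kappa statistic for a 2×2 agreement table can be computed directly (the table counts below are hypothetical):

```python
import numpy as np

# Hypothetical 2x2 agreement table between two raters
table = np.array([[40, 5],
                  [10, 45]], dtype=float)
N = table.sum()
po = np.trace(table) / N          # observed agreement
row = table.sum(axis=1) / N       # marginal proportions, rater 1
col = table.sum(axis=0) / N       # marginal proportions, rater 2
pe = np.sum(row * col)            # chance-expected agreement
kappa = (po - pe) / (1 - pe)
print(round(kappa, 3))            # 0.7
```

The paper's 1/N corrections refine the asymptotic mean, variance, and percentiles of such a statistic at moderate N.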

8.
The independence assumption in statistical significance testing becomes increasingly crucial and unforgiving as sample size increases. Seemingly inconsequential violations of this assumption can substantially increase the probability of a Type I error if sample sizes are large. In the case of Student's t test, correlations within samples in the range 0.01 to 0.05 can lead to rejection of a true null hypothesis with high probability when N is 50, 100, or larger.
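The effect is easy to reproduce by injecting a small shared component into each sample; a sketch (the equicorrelation ρ = 0.05 and N = 100 are illustrative choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_reps = 10000
n, rho = 100, 0.05    # small equicorrelation within the sample

rej = 0
for _ in range(n_reps):
    z = rng.standard_normal()   # component shared by all observations
    # Each pair of observations has correlation rho; each mean is 0
    x = np.sqrt(rho) * z + np.sqrt(1 - rho) * rng.standard_normal(n)
    if stats.ttest_1samp(x, 0.0).pvalue < 0.05:
        rej += 1

print(rej / n_reps)   # far above the nominal 0.05
```

Even with ρ as small as 0.05, the one-sample t test rejects a true null hypothesis far more often than 5%, because the variance of the sample mean no longer shrinks like 1/N.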

9.
For two or more multivariate distributions with common covariance matrix, test statistics for certain special structures of the common covariance matrix are presented when the dimension of the multivariate vectors may exceed the number of such vectors. The test statistics are constructed as functions of location-invariant estimators defined as U-statistics, and the corresponding asymptotic theory is used to derive the limiting distributions of the proposed tests. The properties of the test statistics are established under mild and practical assumptions, and the same are numerically demonstrated using simulation results with small or moderate sample sizes and large dimensions.

10.
A Bayesian analysis is provided for the Wilcoxon signed-rank statistic (T+). The Bayesian analysis is based on a sign-bias parameter φ on the (0, 1) interval. For the case of a uniform prior probability distribution for φ and for small sample sizes (i.e., 6 ≤ n ≤ 25), values for the statistic T+ are computed that enable probabilistic statements about φ. For larger sample sizes, approximations are provided for the asymptotic likelihood function P(T+|φ) as well as for the posterior distribution P(φ|T+). Power analyses are examined both for properly specified Gaussian sampling and for misspecified non-Gaussian models. The new Bayesian metric has high power efficiency, in the range of 0.9–1 relative to a standard t test, when there is Gaussian sampling. But if the sampling is from an unknown and misspecified distribution, the new statistic still has high power; in some cases, the power can be higher than that of the t test (especially for probability mixtures and heavy-tailed distributions). The new Bayesian analysis is thus a useful and robust method for applications where the usual parametric assumptions are questionable. These properties further enable a generic Bayesian analysis for many non-Gaussian distributions that currently lack a formal Bayesian model.
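The statistic T+ itself is straightforward to compute; a minimal sketch (zero differences dropped and average ranks used for ties, as in the classical test):

```python
import numpy as np
from scipy.stats import rankdata

def t_plus(d):
    """Wilcoxon signed-rank statistic T+: sum of the ranks of |d| over positive d."""
    d = np.asarray(d, dtype=float)
    d = d[d != 0]                  # drop zero differences
    ranks = rankdata(np.abs(d))    # average ranks in case of ties
    return ranks[d > 0].sum()

print(t_plus([1.2, -0.4, 2.5, 0.7, -0.1, 1.9]))  # 18.0
```

Under the null hypothesis of symmetry about zero, T+ has mean n(n+1)/4, which is 10.5 for the six nonzero differences above; the Bayesian analysis in the abstract treats how far T+ departs from this via the sign-bias parameter φ.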

11.
In many applications, the parameters of interest are estimated by solving non-smooth estimating functions with U-statistic structure. Because the asymptotic covariance matrix of the estimator generally involves the underlying density function, resampling methods are often used to bypass the difficulty of non-parametric density estimation. Despite its simplicity, the resulting covariance matrix estimator depends on the nature of the resampling, and the method can be time-consuming when the number of replications is large. Furthermore, the inferences are based on a normal approximation that may not be accurate for practical sample sizes. In this paper, we propose a jackknife empirical likelihood-based inferential procedure for non-smooth estimating functions. Standard chi-square distributions are used to calculate the p-value and to construct confidence intervals. Extensive simulation studies and two real examples are provided to illustrate its practical utility.

12.
A study of the distribution of a statistic involves two major steps: (a) working out its asymptotic (large-n) distribution, and (b) making the connection between the asymptotic results and the distribution of the statistic for the sample sizes used in practice. This crucial second step is omitted from many studies. In this article, the second step is applied to Durbin's (1951) well-known rank test of treatment effects in balanced incomplete block designs (BIBs). We found that the asymptotic χ² distributions do not provide adequate approximations in most BIBs. Consequently, we feel that several of Durbin's recommendations should be altered.

13.
In this paper, an exact distribution of the likelihood ratio criterion for testing the equality of p two-parameter exponential distributions is obtained in a computational form for unequal sample sizes. A useful asymptotic expansion of the distribution is also obtained up to the order of n^(-4), with the second term of the order of n^(-3), and so can be used to obtain accurate approximations to the critical values of the test statistic even for comparatively small values of n, where n is the combined sample size. In fact, the first term alone, which is a single beta distribution, provides a powerful approximation for moderately large values of n.

14.
An important question that arises in clinical trials is how many additional observations, if any, are required beyond those originally planned. This has been satisfactorily answered in the case of two-treatment double-blind clinical experiments. However, one may be interested in comparing a new treatment with its competitors, of which there may be more than one. This problem is addressed in this investigation for responses from arbitrary distributions in which the mean and the variance are not functionally related. First, the initial sample size for a specified level of significance and power at a specified alternative is determined. Then it is shown that when the initial sample size is large, the nominal level of significance and the power at the pre-specified alternative are fairly robust under the proposed sample size re-estimation procedure. An application of the results is made to the blood coagulation functionality problem considered by Kropf et al. [Multiple comparisons of treatments with stable multivariate tests in a two-stage adaptive design, including a test for non-inferiority, Biom. J. 42(8) (2000), pp. 951–965].

15.

Engineers who conduct reliability tests need to choose the sample size when designing a test plan. The model parameters and quantiles are the typical quantities of interest. The large-sample procedure relies on the property that the distribution of the t-like quantities is close to the standard normal in large samples. In this paper, we use a new procedure based on both simulation and asymptotic theory to determine the sample size for a test plan. Unlike the complete-data case, the t-like quantities are not in general pivotal quantities when data are time censored. However, we show that the distribution of the t-like quantities depends only on the expected proportion failing, and we obtain the distributions by simulation for both the complete and time-censored cases when the data follow a Weibull distribution. We find that the large-sample procedure usually underestimates the sample size, even when that size is 200 or more. The sample size given by the proposed procedure ensures the requested nominal accuracy and confidence of the estimation when the test plan results in complete or time-censored data. Some useful figures displaying the required sample size for the new procedure are also presented.
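The simulation side of such a procedure can be sketched for the simplest case: for complete Weibull samples, the ratio of the estimated to the true shape parameter is a pivotal quantity, so its finite-sample distribution can be tabulated once by simulation and reused. All numbers below are illustrative assumptions, not the paper's settings:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
shape, scale, n = 2.0, 100.0, 20   # illustrative true parameters and sample size
ratios = []
for _ in range(1000):
    x = scale * rng.weibull(shape, n)               # complete Weibull sample
    c_hat, _, _ = stats.weibull_min.fit(x, floc=0)  # ML fit, location fixed at 0
    ratios.append(c_hat / shape)
ratios = np.array(ratios)

# At n = 20 the simulated distribution of c_hat/c is noticeably biased above 1
# and skewed, which the large-sample normal approximation ignores.
print(np.mean(ratios))
```

Because the ratio's distribution does not depend on the true shape or scale for complete data, the simulated quantiles can calibrate intervals at small n, where the normal approximation, and hence the large-sample sample-size formula, is unreliable.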

16.
For the non-parametric two-sample location problem, adaptive tests based on a selector statistic are compared with a maximum test and a sum test. When the class of all continuous distributions is not restricted, the sum test is not a robust test, i.e. it does not have relatively high power across the different possible distributions. However, according to our simulation results, the adaptive tests as well as the maximum test are robust. For small sample sizes, the maximum test is preferable, whereas for large sample sizes the comparison between the adaptive tests and the maximum test does not show a clear winner. Consequently, one may argue in favour of the maximum test, since it is a useful test for all sample sizes. Furthermore, it needs neither a selector nor a specification of which test is to be performed for which values of the selector. When the family of possible distributions is restricted, the maximin efficiency robust test may be a further robust alternative. However, for the family of t distributions this test is not as powerful as the corresponding maximum test.

17.
The authors derive the limiting distribution of M-estimators in AR(p) models under nonstandard conditions, allowing for discontinuities in score and density functions. Unlike usual regularity assumptions, these conditions are satisfied in the context of L1-estimation and autoregression quantiles. The asymptotic distributions of the resulting estimators, however, are not generally Gaussian. Moreover, their bootstrap approximations are consistent along very specific sequences of bootstrap sample sizes only.

18.
19.
Many multivariate statistical procedures are based on the assumption of normality, and different approaches have been proposed for testing this assumption. The vast majority of these tests, however, are designed exclusively for cases when the sample size n is larger than the dimension p of the variable, and the null distributions of their test statistics are usually derived under the asymptotic regime in which p is fixed and n increases. In this article, a test that utilizes principal components to test for nonnormality is proposed for cases when p/n → c. The power and size of the test are examined through Monte Carlo simulations, and it is argued that the test remains well behaved and consistent against most nonnormal distributions under this type of asymptotics.

20.
We investigate the construction of a BCa-type bootstrap procedure for setting approximate prediction intervals for an efficient estimator θm of a scalar parameter θ, based on a future sample of size m. The results are also extended to nonparametric situations, which can be used to form bootstrap prediction intervals for a large class of statistics. These intervals are transformation-respecting and range-preserving. The asymptotic performance of our procedure is assessed by allowing both the past and future sample sizes to tend to infinity. The resulting intervals are then shown to be second-order correct and second-order accurate. These second-order properties are established in terms of min(m, n), and not the past sample size n alone.
