
1.

An Asymptotic Characterization of Finite Degree U-statistics With Sample Size-Dependent Kernels: Applications to Nonparametric Estimators and Test Statistics

Feng Yao 《Communications in Statistics - Theory and Methods》2013,42(15):3251-3265

We provide a simple result on the H-decomposition of a U-statistic that allows for easy determination of its magnitude when the statistic’s kernel depends on the sample size n. The result provides a direct and convenient method to characterize the asymptotic magnitude of semiparametric and nonparametric estimators or test statistics involving high dimensional sums. We illustrate the use of our result in previously studied estimators/test statistics and in a novel nonparametric R² test for overall significance of a nonparametric regression model.

2.

Corrected asymptotic distribution of statistics based on the multinomial law

D. Neveu A. Kramar P. Dujols 《Statistical Methodology》2007,4(1):64-74

Investigators and epidemiologists often use statistics based on the parameters of a multinomial distribution. Two main approaches have been developed to assess the inferences of these statistics. The first one uses asymptotic formulae which are valid for large sample sizes. The second one computes the exact distribution, which performs quite well for small samples. Both have limitations for sample sizes N that are neither large enough to satisfy the assumption of asymptotic normality nor small enough to allow the exact distribution to be generated. We analytically computed the 1/N corrections of the asymptotic distribution for any statistic based on a multinomial law. We applied these results to the kappa statistic in 2×2 and 3×3 tables. We also compared the coverage probability obtained with the asymptotic and the corrected distributions under various hypothetical configurations of sample size and theoretical proportions. With this method, the estimates of the mean and the variance were greatly improved, as were the 2.5 and 97.5 percentiles of the distribution, allowing us to go down to sample sizes of around 20 for data sets that are not too asymmetrical. The order of the difference between the exact and the corrected values was 1/N² for the mean and 1/N³ for the variance.
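For orientation, here is a minimal Python sketch of the kappa statistic applied to a square table of counts. It assumes that the kappa meant is Cohen's kappa, which the abstract does not state, and it shows only the plain point estimate, not the 1/N corrections developed in the article. The function name `cohen_kappa` and the example table are illustrative.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table of counts (assumed definition)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Observed agreement: share of counts on the diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of matching marginal proportions.
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical 2x2 table with N = 50 observations.
example = [[20, 5], [10, 15]]
kappa = cohen_kappa(example)
```

With N this small, the abstract's point is that the usual normal approximation for such an estimate may be unreliable.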

3.

The Prediction Sum of Squares as a General Measure for Regression Diagnostics

Nguyen T. Quan 《Journal of Business & Economic Statistics》2013,31(4):501-504

Statistics that usually accompany the regression model do not provide insight into the quality of the data or the potential influence of the individual observations on the estimates. In this study, the Q² statistic is used as a criterion for detecting influential observations or outliers. The statistic is derived from the jackknifed residuals, the squared sum of which is generally known as the prediction sum of squares or PRESS. This article compares R² with Q² and suggests that the latter be used as part of the data-quality check. It is shown, for two separate data sets obtained from regional cost of living and U.S. food industry studies, that in the presence of outliers the Q² statistic can be negative, because it is sensitive to the choice of regressors and the inclusion of influential observations. Once the outliers are dropped from the sample, the discrepancy between Q² and R² values is negligible.
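The relationship between R² and Q² can be sketched in Python for a one-regressor model. The definition Q² = 1 − PRESS/SST used here is a common convention and an assumption, since the abstract does not give the formula. The helper names and the data are illustrative, and the leave-one-out refit is done by brute force for clarity.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for one regressor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r2_and_q2(xs, ys):
    """Return (R^2, Q^2), with Q^2 = 1 - PRESS / SST (assumed definition)."""
    n = len(xs)
    mean_y = sum(ys) / n
    sst = sum((y - mean_y) ** 2 for y in ys)
    intercept, slope = fit_line(xs, ys)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    press = 0.0
    for i in range(n):
        # Refit without observation i, then predict the held-out point.
        a_i, b_i = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a_i + b_i * xs[i])) ** 2
    return 1 - sse / sst, 1 - press / sst


# Hypothetical data; the last observation is a deliberate outlier.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 20.0]
r2, q2 = r2_and_q2(xs, ys)
```

Because each jackknifed residual is at least as large in magnitude as the ordinary residual, Q² never exceeds R², and nothing prevents PRESS from exceeding SST, which is how a negative Q² arises.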

4.

A Two Sample Test for Mean Vectors with Unequal Covariance Matrices

Tamae Kawasaki 《Communications in Statistics - Simulation and Computation》2015,44(7):1850-1866

In this paper, we consider testing the equality of two mean vectors with unequal covariance matrices. In the case of equal covariance matrices, we can use Hotelling’s T² statistic, which follows the F distribution under the null hypothesis. Meanwhile, in the case of unequal covariance matrices, the T² type test statistic does not follow the F distribution, and it is also difficult to derive the exact distribution. In this paper, we propose an approximate solution to the problem by adjusting the degrees of freedom of the F distribution. Asymptotic expansions up to the term of order N^{-2} for the first and second moments of the U statistic are given, where N is the total sample size minus two. A new approximate degrees of freedom and its bias correction are obtained. Finally, numerical comparison is presented by a Monte Carlo simulation.

5.

Distributional Studies and the Computer: An Analysis of Durbin's Rank Test

Richard F. Fawcett Kathleen C. Salter 《The American statistician》2013,67(1):81-83

A study of the distribution of a statistic involves two major steps: (a) working out its asymptotic, large n, distribution, and (b) making the connection between the asymptotic results and the distribution of the statistic for the sample sizes used in practice. This crucial second step is not included in many studies. In this article, the second step is applied to Durbin's (1951) well-known rank test of treatment effects in balanced incomplete block designs (BIB's). We found that asymptotic, χ², distributions do not provide adequate approximations in most BIB's. Consequently, we feel that several of Durbin's recommendations should be altered.

6.

Identifying Variables Contributing to Outliers in Phase I

Robert L. Mason Youn-Min Chou John C. Young 《Communications in Statistics - Theory and Methods》2013,42(7):1103-1118

When a process is monitored with a T² control chart in a Phase II setting, the MYT decomposition is a valuable diagnostic tool for interpreting signals in terms of the process variables. The decomposition splits a signaling T² statistic into independent components that can be associated with either individual variables or groups of variables. Since these components are T² statistics with known distributions, they can be used to determine which of the process variable(s) contribute to the signal. However, this procedure cannot be applied directly to Phase I since the distributions of the individual components are unknown. In this article, we develop the MYT decomposition procedure for a Phase I operation, when monitoring a random sample of individual observations and identifying outliers. We use a relationship between the T² statistic in Phase I and the corresponding T² statistic obtained when an observation is omitted from this sample to derive the distributions of these components and demonstrate the Phase I application of the MYT decomposition.

7.

Identifying rational distributed lag models using a joint test applied to the corner table

Alan Pankratz Brooks Elliott 《Communications in Statistics - Theory and Methods》2013,42(8):2279-2292

Often a distributed lag response pattern can be usefully represented in rational polynomial form. When the impulse response function decays, the corner table may be useful for model identification if appropriate statistical tests can be carried out. One or more joint tests are called for since use of the corner table involves studying groups of its elements. We consider an asymptotic χ² statistic that permits joint tests. We report simulation results showing that the distribution of this statistic follows the χ² distribution, for certain sample sizes and degrees of freedom, well enough to be useful in practice. With two data sets we illustrate how this statistic can be a useful aid when using the corner table.

8.

Efficiency of t-Test and Hotelling's T²-Test After Box-Cox Transformation

Jade Freeman Reza Modarres 《Communications in Statistics - Theory and Methods》2013,42(6):1109-1122

Early investigations of the effects of non-normality indicated that skewness has a greater effect on the distribution of the t-statistic than does kurtosis. When the distribution is skewed, the actual p-values can be larger than the values calculated from the t-tables. Transformation of data to normality has shown good results in the case of the univariate t-test. In order to reduce the effect of skewness of the distribution on the normal-based t-test, one can transform the data and perform the t-test on the transformed scale. This method is not only a remedy for satisfying the distributional assumption, but it also turns out that one can achieve greater efficiency of the test. We investigate the efficiency of tests after a Box-Cox transformation. In particular, we consider the one-sample test of location and study the gains in efficiency for the one-sample t-test following a Box-Cox transformation. Under some conditions, we prove that the asymptotic relative efficiency of the transformed t-test and Hotelling's T²-test of multivariate location with respect to the same statistic based on untransformed data is at least one.
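The transform-then-test procedure described here can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' procedure: the Box-Cox parameter is fixed by hand rather than estimated, the hypothesized location is mapped through the same transformation (one simple convention), and the sample values and function names are made up.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform: (x**lam - 1) / lam, or log(x) when lam == 0."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def one_sample_t(values, mu0):
    """One-sample t statistic for H0: mean == mu0."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return (mean - mu0) / math.sqrt(variance / n)


# Hypothetical right-skewed sample; lam = 0 (log transform) is chosen by hand.
sample = [1.2, 0.8, 2.5, 1.1, 6.3, 0.9, 1.7, 3.4]
lam = 0.0
transformed = [box_cox(x, lam) for x in sample]

t_raw = one_sample_t(sample, 1.5)
# Assumption: the null value is carried to the transformed scale as well.
t_transformed = one_sample_t(transformed, box_cox(1.5, lam))
```

Comparing `t_raw` with `t_transformed` on skewed data gives a feel for how the transformation changes the test statistic; the efficiency result in the abstract is an asymptotic statement and is not demonstrated by a single sample.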

9.

A Bayesian analysis for the Wilcoxon signed-rank statistic

Richard A. Chechile 《Communications in Statistics - Theory and Methods》2018,47(21):5241-5254

A Bayesian analysis is provided for the Wilcoxon signed-rank statistic (T⁺). The Bayesian analysis is based on a sign-bias parameter φ on the (0, 1) interval. For the case of a uniform prior probability distribution for φ and for small sample sizes (i.e., 6 ≤ n ≤ 25), values for the statistic T⁺ are computed that enable probabilistic statements about φ. For larger sample sizes, approximations are provided for the asymptotic likelihood function P(T⁺|φ) as well as for the posterior distribution P(φ|T⁺). Power analyses are examined both for properly specified Gaussian sampling and for misspecified non-Gaussian models. The new Bayesian metric has high power efficiency in the range of 0.9–1 relative to a standard t test when there is Gaussian sampling. But if the sampling is from an unknown and misspecified distribution, then the new statistic still has high power; in some cases, the power can be higher than the t test (especially for probability mixtures and heavy-tailed distributions). The new Bayesian analysis is thus a useful and robust method for applications where the usual parametric assumptions are questionable. These properties further enable a way to do a generic Bayesian analysis for many non-Gaussian distributions that currently lack a formal Bayesian model.
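As background for the statistic being analyzed, the following Python sketch computes T⁺, the sum of the ranks of the positive differences. It covers only the classical statistic, not the Bayesian analysis of φ. Dropping zero differences and averaging tied ranks are conventional choices assumed here, and the function name and data are illustrative.

```python
def signed_rank_t_plus(differences):
    """Wilcoxon signed-rank T+: sum of ranks of |d| over positive differences."""
    # Conventional treatment (an assumption here): discard zero differences.
    ordered = sorted((d for d in differences if d != 0), key=abs)
    t_plus = 0.0
    i = 0
    while i < len(ordered):
        # Find the block of tied absolute values starting at position i.
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        # Ranks i+1 .. j share their average rank.
        average_rank = (i + 1 + j) / 2
        t_plus += average_rank * sum(1 for d in ordered[i:j] if d > 0)
        i = j
    return t_plus


# Hypothetical paired differences.
t_plus = signed_rank_t_plus([1.5, -0.5, 2.0, -3.0, 0.7])
```

For n nonzero differences, T⁺ ranges from 0 to n(n + 1)/2, so values near either end indicate a strong sign bias.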

10.

A new test for the mean vector in large dimension and small samples

Junguang Zhao 《Communications in Statistics - Simulation and Computation》2017,46(8):6115-6128

In this article, we consider the problem of testing the mean vector in the multivariate normal distribution, where the dimension p is greater than the sample size N. We propose a new test T_Block and obtain its asymptotic distribution. We also compare the proposed test with two other tests. The simulation results suggest that the performance of the new test is comparable to that of the two existing tests, and under some circumstances it may have higher power. Therefore, the new statistic can be employed in practice as an alternative choice.

11.

Small-sample comparisons for power-divergence goodness-of-fit statistics for symmetric and skewed simple null hypotheses

Miguel A. García-Pérez Vicente Núñez-Antón 《Journal of applied statistics》2001,28(7):855-874

Power-divergence goodness-of-fit statistics have asymptotically a chi-squared distribution. Asymptotic results may not apply in small-sample situations, and the exact significance of a goodness-of-fit statistic may potentially be over- or under-stated by the asymptotic distribution. Several correction terms have been proposed to improve the accuracy of the asymptotic distribution, but their performance has only been studied for the equiprobable case. We extend that research to skewed hypotheses. Results are presented for one-way multinomials involving k = 2 to 6 cells with sample sizes N = 20, 40, 60, 80 and 100 and nominal test sizes α = 0.1, 0.05, 0.01 and 0.001. Six power-divergence goodness-of-fit statistics were investigated, and five correction terms were included in the study. Our results show that skewness itself does not affect the accuracy of the asymptotic approximation, which depends only on the magnitude of the smallest expected frequency (whether this comes from a small sample with the equiprobable hypothesis or a large sample with a skewed hypothesis). Throughout the conditions of the study, the accuracy of the asymptotic distribution seems to be optimal for Pearson's X² statistic (the power-divergence statistic of index λ = 1) when k > 3 and the smallest expected frequency is as low as between 0.1 and 1.5 (depending on the particular k, N and nominal test size), but a computationally inexpensive improvement can be obtained in these cases by using a moment-corrected χ² distribution. If the smallest expected frequency is even smaller, a normal correction yields accurate tests through the log-likelihood-ratio statistic G² (the power-divergence statistic of index λ = 0).
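The family referred to here can be made concrete with a short Python sketch. It assumes the standard Cressie-Read form, 2/(λ(λ+1)) Σ O[(O/E)^λ − 1], where the index is conventionally written λ. Index 1 gives Pearson's X² and the limit at index 0 gives G². The function name and the counts are illustrative, and none of the correction terms studied in the article are implemented.

```python
import math


def power_divergence(observed, expected, lam):
    """Cressie-Read power-divergence statistic of index lam (assumed form).

    Assumes sum(observed) == sum(expected) and strictly positive expected counts.
    """
    if lam == -1:
        raise ValueError("index -1 is a separate limiting case, not handled here")
    if lam == 0:
        # Limit lam -> 0: log-likelihood-ratio statistic G^2 (cells with O = 0 add 0).
        return 2.0 * sum(o * math.log(o / e)
                         for o, e in zip(observed, expected) if o > 0)
    total = sum(o * ((o / e) ** lam - 1.0) for o, e in zip(observed, expected))
    return 2.0 * total / (lam * (lam + 1.0))


# Hypothetical one-way multinomial with k = 3 equiprobable cells, N = 60.
observed = [10, 20, 30]
expected = [20.0, 20.0, 20.0]

pearson_x2 = power_divergence(observed, expected, 1)   # Pearson's X^2
g2 = power_divergence(observed, expected, 0)           # likelihood ratio G^2
```

Because every member shares the same asymptotic chi-squared reference distribution, differences between `pearson_x2` and `g2` on the same data are exactly the small-sample discrepancies the study quantifies.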

12.

Comparing Two Tests for Two Rates

Chunpeng Fan Lin Wang Lynn Wei 《The American statistician》2017,71(3):275-281

This article rigorously proves superiority of the proportion χ² test to the logistic regression Wald test in terms of power when comparing two rates, despite their asymptotic equivalence under the null hypothesis that the two rates are equal.

13.

A likelihood ratio test of a homoscedastic normal mixture against a heteroscedastic normal mixture

Yungtai Lo 《Statistics and Computing》2008,18(3):233-240

It is generally assumed that the likelihood ratio statistic for testing the null hypothesis that data arise from a homoscedastic normal mixture distribution versus the alternative hypothesis that data arise from a heteroscedastic normal mixture distribution has an asymptotic χ² reference distribution with degrees of freedom equal to the difference in the number of parameters being estimated under the alternative and null models under some regularity conditions. Simulations show that the χ² reference distribution will give a reasonable approximation for the likelihood ratio test only when the sample size is 2000 or more and the mixture components are well separated when the restrictions suggested by Hathaway (Ann. Stat. 13:795–800, 1985) are imposed on the component variances to ensure that the likelihood is bounded under the alternative distribution. For small and medium sample sizes, parametric bootstrap tests appear to work well for determining whether data arise from a normal mixture with equal variances or a normal mixture with unequal variances.

14.

Another Cautionary Note about R²: Its Use in Weighted Least-Squares Regression Analysis

John B. Willett Judith D. Singer 《The American statistician》2013,67(3):236-238

A recent article in this journal presented a variety of expressions for the coefficient of determination (R²) and demonstrated that these expressions were generally not equivalent. The article discussed potential pitfalls in interpreting the R² statistic in ordinary least-squares regression analysis. The current article extends this discussion to the case in which regression models are fit by weighted least squares and points out an additional pitfall that awaits the unwary data analyst. We show that unthinking reliance on the R² statistic can lead to an overly optimistic interpretation of the proportion of variance accounted for in the regression. We propose a modification of the estimator and demonstrate its utility by example.

15.

Small-sample comparison of the exact and asymptotic upper tail probabilities of chi-squared goodness-of-fit statistics: the binomial and the mixture binomial

《Journal of Statistical Computation and Simulation》2012,82(3):229-249

The exact and asymptotic upper tail probabilities (α = .10, .05, .01, .001) of the three chi-squared goodness-of-fit statistics Pearson's X², likelihood ratio G², and power-divergence statistic D²(λ) with λ = 2/3 are compared by complete enumeration for the binomial and the mixture binomial. For the two-component mixture binomial, three cases have been distinguished: (1) both success probabilities and the mixing weights are unknown; (2) one of the two success probabilities is known; and (3) the mixing weights are known. The binomial was investigated for the number of cells k between 3 and 6 with sample sizes between 5 and 100, for k = 7 with sample sizes between 5 and 45, and for k = 10 with sample sizes ranging from 5 to 20. For the mixture binomial, only k = 5 cells were considered with sample sizes from 5 to 100 and k = 8 cells with sample sizes between 4 and 20. Rating the relative accuracy of the chi-squared approximation in terms of ±10% and ±20% intervals around α led to the following conclusions for the binomial: (1) Using G² is not recommended. (2) At the significance levels α = .10 and α = .05, X² should be preferred over D²; D² is the best choice at α = .01. (3) Cochran's (1954; Biometrics, 10, 417-451) rule for the minimum expectation when using X² seems to generalize to the binomial for G² and D²; as a compromise, it gives a rather strong lower limit for the expected cell frequencies in some circumstances, but a rather liberal one in others. Similar conclusions could not be drawn for the mixture binomial because, in that case, the accuracy of the chi-squared approximation is not only a function of the chosen test statistic and of the significance level, but also depends heavily on the numerical values of the unknown parameters involved and on the hypothesis to be tested. Thus, the present study may give rise only to warnings against the application of mixture models to small samples.

16.

Accuracy of Power-Divergence Statistics for Testing Independence and Homogeneity in Two-Way Contingency Tables

Miguel A. García-Pérez Vicente Núñez-Antón 《Communications in Statistics - Simulation and Computation》2013,42(3):503-512

The small-sample accuracy of seven members of the family of power-divergence statistics for testing independence or homogeneity in contingency tables was studied via simulation. The likelihood ratio statistic G² and Pearson's X² statistic are among these seven members, whose behavior was studied at nominal test sizes of .01 and .05 with marginal distributions that could be uniform or skewed and with a set of sample sizes that included sparseness conditions as measured through table density (i.e., the ratio of sample size to number of cells). The likelihood ratio statistic G² rejected the null hypothesis too often even with large table density, whereas Pearson's X² was sufficiently accurate and only showed minor misbehavior when table density was less than two observations/cell. None of the other five statistics outperformed Pearson's X². A nonasymptotic variant of X² solved the minor inaccuracies of Pearson's X² and turned out to be the most accurate statistic for testing independence or homogeneity, even with table densities of one observation/cell. These results clearly advise against the use of the likelihood ratio statistic G².

17.

On the distributions of multivariate sample skewness

Naoya Okamoto Takashi Seo 《Journal of statistical planning and inference》2010

In this paper, we consider the multivariate normality test based on the measure of multivariate sample skewness defined by Srivastava (1984). Srivastava derived the asymptotic expectation up to the order N⁻¹ for the multivariate sample skewness and an approximate χ² test statistic, where N is the sample size. Under normality, we derive another expectation and variance for Srivastava's multivariate sample skewness in order to obtain a better test statistic. From this result, an improved approximate χ² test statistic using the multivariate sample skewness is also given for assessing multivariate normality. Finally, numerical results from a Monte Carlo simulation are shown in order to evaluate the accuracy of the obtained expectation, variance and improved approximate χ² test statistic. Furthermore, upper and lower percentiles of the χ² test statistic derived in this paper are compared with those of the χ² test statistic derived by Mardia (1974), which uses the multivariate sample skewness defined by Mardia (1970).

18.

On Two Types of Breakdown Points of Weighted L²-median

Caiya Zhang Yan Luo 《Communications in Statistics - Theory and Methods》2013,42(7):1131-1141

Zuo (2004) investigated the simplified replacement finite sample breakdown point of weighted L^p-depth and L^p-median for some appropriate weight functions. The addition breakdown point of weighted L^p-depth functions is first studied in this article. In addition, for some other weight functions different from those in Zuo (2004), we establish the lower bounds of these two types of breakdown point of weighted L²-median.

19.

Uniform robustness against nonnormality of the t and F tests

Hanfeng Chen Wei-Yin Loh 《Communications in Statistics - Theory and Methods》2013,42(10):3707-3723

The size of the two-sample t test is generally thought to be robust against nonnormal distributions if the sample sizes are large. This belief is based on central limit theory, and asymptotic expansions of the moments of the t statistic suggest that robustness may be improved for moderate sample sizes if the variance, skewness, and kurtosis of the distributions are matched, particularly if the sample sizes are also equal.

It is shown that asymptotic arguments such as these can be misleading and that, in fact, the size of the t test can be as large as unity if the distributions are allowed to be completely arbitrary. Restricting the distributions to be identical or symmetric (but otherwise arbitrary) does not guarantee that the size can be controlled either, but controlling the tail-heaviness of the distributions does. The last result is proved more generally for the k-sample F test.

20.

The Tale of Cochran's Rule: My Contingency Table has so Many Expected Values Smaller than 5, What Am I to Do?

P. M. Kroonenberg Albert Verbeek 《The American statistician》2018,72(2):175-183

In an informal way, some dilemmas in connection with hypothesis testing in contingency tables are discussed. The body of the article concerns the numerical evaluation of Cochran's Rule about the minimum expected value in r × c contingency tables with fixed margins when testing independence with Pearson's X² statistic using the χ² distribution.
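A small Python sketch may help fix ideas about what is being evaluated. It computes the expected values of an r × c table under independence, Pearson's X², and a check of one commonly quoted version of Cochran's rule (no expected value below 1 and at most 20% of cells below 5). That wording of the rule is an assumption, since the abstract does not spell it out, and the function names, thresholds and tables are illustrative.

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_x2(table):
    """Pearson's X^2 statistic for independence in an r x c table."""
    expected = expected_counts(table)
    return sum((o - e) ** 2 / e
               for row_o, row_e in zip(table, expected)
               for o, e in zip(row_o, row_e))


def satisfies_cochran(table, small=5.0, floor=1.0, max_share=0.2):
    """Check one common statement of Cochran's rule (assumed thresholds)."""
    cells = [e for row in expected_counts(table) for e in row]
    share_small = sum(1 for e in cells if e < small) / len(cells)
    return min(cells) >= floor and share_small <= max_share


# Hypothetical tables: one well filled, one sparse.
dense = [[10, 20], [30, 40]]
sparse = [[1, 2], [3, 40]]
x2_dense = pearson_x2(dense)
ok_dense = satisfies_cochran(dense)
ok_sparse = satisfies_cochran(sparse)
```

The rule is a heuristic for when the χ² reference distribution can be trusted, which is why the article evaluates it numerically rather than taking it as given.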