Similar articles
1.
For testing goodness-of-fit in a k-cell multinomial distribution with very small frequencies, the usual chi-square approximation to the upper tail of the likelihood ratio statistic, G², is not satisfactory. A new adjustment to G² is determined on the basis of an analytical investigation of the asymptotic bias and variance of the adjusted G². A Monte Carlo simulation is performed for several one-way tables to assess the adjustment of G² in order to obtain a closer approximation to the nominal level of significance.

2.
For a postulated common odds ratio for several 2 × 2 contingency tables one may, by conditioning on the marginals of the separate tables, determine the exact expectation and variance of the entry in a particular cell of each table, and hence of the total of such cells across all tables. This makes it feasible to determine limiting values via single-degree-of-freedom, continuity-corrected chi-square tests on the common odds ratio: one determines lower and upper limits corresponding to just barely significant chi-square values. The Mantel-Haenszel approach can be viewed as a special application of this, but directed specifically to the case of unity for the odds ratio, for which the expectation and variance formulas are particularly simple. Computation of exact expectations and variances may be feasible only for 2 × 2 tables of limited size, but asymptotic formulas can be applied in other instances. Illustration is given for a particular set of four 2 × 2 tables in which both exact limits and limits by the proposed method could be applied, the two methods giving reasonably good agreement. Both procedures are directed at the distribution of the total over the designated cells, the proposed method treating that distribution as being asymptotically normal. Especially good agreement of proposed with exact limits could be anticipated in more asymptotic situations (overall, not for individual tables), but in practice this may not be demonstrable, as the computation of exact limits is then infeasible.
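The "particularly simple" case of a unit odds ratio can be sketched as follows. This is only an illustration of the textbook Mantel-Haenszel formulas, not the limit-finding procedure proposed in the abstract; the function name `mantel_haenszel_chi2` and the `(a, b, c, d)` table layout are assumptions made for the example.

```python
def mantel_haenszel_chi2(tables):
    """Continuity-corrected Mantel-Haenszel chi-square for a common odds ratio of 1.

    `tables` is an iterable of 2 x 2 tables given as (a, b, c, d), where the
    rows are (a, b) and (c, d). Layout and name are illustrative assumptions.
    """
    sum_a = sum_e = sum_v = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        r1, r2 = a + b, c + d          # row totals
        c1, c2 = a + c, b + d          # column totals
        sum_a += a
        # Conditional (hypergeometric) mean and variance of cell a when OR = 1.
        sum_e += r1 * c1 / n
        sum_v += r1 * r2 * c1 * c2 / (n * n * (n - 1))
    # One degree of freedom; 0.5 is the continuity correction.
    return max(abs(sum_a - sum_e) - 0.5, 0.0) ** 2 / sum_v
```

For a single table `(10, 5, 5, 10)` the conditional mean of the first cell is 7.5 and the statistic is about 2.06, which would be compared with a chi-square distribution on one degree of freedom.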

3.
Power-divergence goodness-of-fit statistics have asymptotically a chi-squared distribution. Asymptotic results may not apply in small-sample situations, and the exact significance of a goodness-of-fit statistic may potentially be over- or under-stated by the asymptotic distribution. Several correction terms have been proposed to improve the accuracy of the asymptotic distribution, but their performance has only been studied for the equiprobable case. We extend that research to skewed hypotheses. Results are presented for one-way multinomials involving k = 2 to 6 cells with sample sizes N = 20, 40, 60, 80 and 100 and nominal test sizes α = 0.1, 0.05, 0.01 and 0.001. Six power-divergence goodness-of-fit statistics were investigated, and five correction terms were included in the study. Our results show that skewness itself does not affect the accuracy of the asymptotic approximation, which depends only on the magnitude of the smallest expected frequency (whether this comes from a small sample with the equiprobable hypothesis or a large sample with a skewed hypothesis). Throughout the conditions of the study, the accuracy of the asymptotic distribution seems to be optimal for Pearson's X² statistic (the power-divergence statistic of index λ = 1) when k > 3 and the smallest expected frequency is as low as between 0.1 and 1.5 (depending on the particular k, N and nominal test size), but a computationally inexpensive improvement can be obtained in these cases by using a moment-corrected χ² distribution. If the smallest expected frequency is even smaller, a normal correction yields accurate tests through the log-likelihood-ratio statistic G² (the power-divergence statistic of index λ = 0).
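To make the family concrete, here is a minimal sketch of the power-divergence statistic in its usual Cressie-Read form, 2/(λ(λ+1)) Σ O_i[(O_i/E_i)^λ − 1]. The function name `power_divergence` and the restriction to λ > −1 are choices made for this example; none of the correction terms studied in the abstract are implemented.

```python
import math

def power_divergence(observed, expected, lam):
    """Power-divergence statistic of index `lam` (illustrative; lam > -1 only)."""
    if lam <= -1:
        raise ValueError("this sketch only covers lam > -1")
    # Cells with a zero observed count contribute 0 for every lam > -1.
    cells = [(o, e) for o, e in zip(observed, expected) if o > 0]
    if abs(lam) < 1e-12:
        # Limit lam -> 0: the log-likelihood-ratio statistic G^2.
        return 2.0 * sum(o * math.log(o / e) for o, e in cells)
    total = sum(o * ((o / e) ** lam - 1.0) for o, e in cells)
    return 2.0 * total / (lam * (lam + 1.0))
```

With observed counts `[10, 20, 30]` and equiprobable expected counts `[20, 20, 20]`, index 1 reproduces Pearson's X² (10.0) and index 0 gives G² (about 10.46), provided the observed and expected counts have the same total.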

4.
The distribution of the chi-square goodness-of-fit statistic is studied in the equiprobable case. Tables of exact critical values are given for α = .1, .05, .01, .005; k = 2(1)4, N = 26(1)50; k = 5, N = 26(1)40; k = 6(1)10, N = 26(1)30, where α is the desired significance level, k is the number of cells and N is the sample size. Methods of fitting the true distribution are compared. If k > 3, it is found that a simple additive adjustment to the asymptotic chi-square fit leads to high accuracy even for N between 10 and 20. For k = 2, the Yates corrected chi-square statistic is very accurately fitted by the usual chi-square distribution.

5.
In this note it is shown that, even for relatively large sample sizes, the asymptotic distribution of the smoothed length as derived in Reschenhofer and Bomse (1991) should not be used for the determination of critical values. Therefore, extended tables of critical values for both the 1% and 5% levels of significance, generated by simulation, are presented.

6.
The risk of a sampling strategy is a function on the parameter space, which is the set of all vectors composed of possible values of the variable of interest. It seems natural to ask for a minimax strategy, one that minimizes the maximal risk. So far, answers have been provided for completely symmetric parameter spaces. Results available for more general spaces refer to sample size 1 or to large sample sizes allowing for asymptotic approximation. In the present paper we consider arbitrary sample sizes, derive a lower bound for the maximal risk under very weak conditions and obtain minimax strategies for a large class of parameter spaces. Our results do not apply to parameter spaces with strong deviations from symmetry. For such spaces a minimax strategy considers only a small number of samples and takes on a non-random, purposive character, which is in accordance with the common practice of completely sampling a stratum of large units.

7.
The asymptotic relative efficiencies of Kendall's and Spearman's coefficients of rank correlation are considered for samples from a bivariate normal distribution, and comments are made on the calculation of their variances. For large samples it is suggested that one should use mean values of the coefficients calculated by splitting the sample into a fairly large number of smaller samples. This reduces the amount of calculation required, and the asymptotic relative efficiency of this procedure is found for both ρ = 0 and ρ ≠ 0.
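The splitting idea can be illustrated with Kendall's coefficient. The sketch below assumes contiguous, equal-sized blocks and the tie-free tau-a form of the coefficient; the abstract does not say how the sample should be split, and the names `kendall_tau` and `mean_tau_over_blocks` are hypothetical.

```python
def kendall_tau(x, y):
    """Kendall's tau-a by direct pair counting (O(n^2); tied pairs count as 0)."""
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)   # +1 concordant, -1 discordant
    return 2.0 * s / (n * (n - 1))

def mean_tau_over_blocks(x, y, n_blocks):
    """Average tau over `n_blocks` contiguous blocks (assumed splitting rule).

    Observations left over after the last full block are ignored.
    """
    size = len(x) // n_blocks
    if size < 2:
        raise ValueError("each block needs at least two observations")
    taus = [
        kendall_tau(x[k * size:(k + 1) * size], y[k * size:(k + 1) * size])
        for k in range(n_blocks)
    ]
    return sum(taus) / n_blocks
```

Direct pair counting on the full sample costs on the order of n² comparisons, whereas b blocks of size n/b cost roughly n²/b in total, which is the saving in calculation the abstract refers to.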

8.
There are numerous situations in categorical data analysis where one wishes to test hypotheses involving a set of linear inequality constraints placed upon the cell probabilities. For example, it may be of interest to test for symmetry in k × k contingency tables against one-sided alternatives. In this case, the null hypothesis imposes a set of linear equalities on the cell probabilities (namely $p_{ij} = p_{ji}$ for all $i > j$), whereas the alternative specifies directional inequalities. Another important application (Robertson, Wright, and Dykstra 1988) is testing for or against stochastic ordering between the marginals of a k × k contingency table when the variables are ordinal and independence holds. Here we extend existing likelihood-ratio results to cover more general situations. To be specific, we consider testing $H_0$ against $H_1 - H_0$ and $H_1$ against $H_2 - H_1$ when $H_0: \sum_{i=1}^{k} p_i x_{ji} = 0$, $j = 1, \ldots, s$, $H_1: \sum_{i=1}^{k} p_i x_{ji} \ge 0$, $j = 1, \ldots, s$, and $H_2$ does not impose any restrictions on $p$. The $x_{ji}$'s are known constants, and $s \le k - 1$. We show that the asymptotic distributions of the likelihood-ratio tests are of chi-bar-square type, and provide expressions for the weighting values.

9.
Importance sampling and Markov chain Monte Carlo methods have long been used in exact inference for contingency tables; however, their performance is not always satisfactory. In this paper, we propose a stochastic approximation Monte Carlo importance sampling (SAMCIS) method for tackling this problem. SAMCIS is a combination of adaptive Markov chain Monte Carlo and importance sampling, which employs the stochastic approximation Monte Carlo algorithm (Liang et al., J. Am. Stat. Assoc., 102(477):305–320, 2007) to draw samples from an enlarged reference set with a known Markov basis. Compared to the existing importance sampling and Markov chain Monte Carlo methods, SAMCIS has a few advantages, such as fast convergence, ergodicity, and the ability to achieve a desired proportion of valid tables. The numerical results indicate that SAMCIS can outperform the existing importance sampling and Markov chain Monte Carlo methods: it can produce much more accurate estimates in much shorter CPU time than the existing methods, especially for tables with high degrees of freedom.

10.
Five tests of homogeneity for a 2 × (k + 1) contingency table are compared using Monte Carlo techniques. For these studies it is assumed that k becomes large in such a way that the contingency table is sparse for 2 × k of the cells, but the sample size in two of the cells remains large. The test statistics studied are: the chi-square approximation to the Pearson test statistic, the chi-square approximation to the likelihood ratio statistic, the normal approximation to Zelterman's (1984) statistic, the normal approximation to Pearson's chi-square, and the normal approximation to the likelihood ratio statistic. For the range of parameters studied, the chi-square approximation to Pearson's statistic performs consistently well with regard to its size and power.

11.
The best-known non-asymptotic method for comparing two independent proportions is Fisher's exact test. The usual critical region (CR) tables for this test contain one or more of the following defects: they distinguish between rows and columns; they distinguish between the alternatives H: p1 < p2 and H: p1 > p2; they assume that the error for the two-tailed test is twice that of the one-tailed test; they do not use the optimal version of the test; they do not give both CRs for one and two tails at the same time. All this results in the unnecessary duplication of the space required for the tables, the construction of tables of low-powered methods, or the need to manipulate two different tables (one for the one-tailed test, the other for the two-tailed test). This paper presents CR tables which have been obtained from the most powerful version of Fisher's exact test and which occupy the minimum space possible. The tables, which are valid for one- or two-tailed tests, have levels of significance of 10%, 5% and 1% and values for N (the total size of both samples) of less than or equal to 40. This article shows how to calculate the P value in a specific problem, using the tables as a means of partial checking and as a preliminary step to determining the exact P value.
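For readers who want to see the P value calculation, the following is a minimal sketch of Fisher's exact test for a single 2 × 2 table using the hypergeometric distribution. The two-tailed value here sums all tables no more probable than the observed one, which is one common convention and not necessarily the "optimal version" the abstract refers to; the function name `fisher_exact_p` is an assumption.

```python
from math import comb

def fisher_exact_p(a, b, c, d):
    """Return (lower-tail, upper-tail, two-tailed) P values for the table
    [[a, b], [c, d]], conditioning on both margins (illustrative sketch)."""
    r1, c1, n = a + b, a + c, a + b + c + d
    lo, hi = max(0, r1 + c1 - n), min(r1, c1)   # support of the first cell
    denom = comb(n, c1)

    def prob(x):
        # Hypergeometric probability that the first cell equals x.
        return comb(r1, x) * comb(n - r1, c1 - x) / denom

    p_obs = prob(a)
    lower = sum(prob(x) for x in range(lo, a + 1))
    upper = sum(prob(x) for x in range(a, hi + 1))
    # Two-tailed: all tables at most as probable as the observed one
    # (small relative tolerance guards against floating-point ties).
    two = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return lower, upper, min(1.0, two)
```

For the table `[[3, 1], [1, 3]]` the upper-tail value is 17/70 and the two-tailed value is 34/70. In this symmetric table the two-tailed value happens to be exactly twice the one-tailed value, but that is not true in general, which is one of the defects the abstract lists.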

12.
Investigators and epidemiologists often use statistics based on the parameters of a multinomial distribution. Two main approaches have been developed to assess the inferences of these statistics. The first one uses asymptotic formulae which are valid for large sample sizes. The second one computes the exact distribution, which performs quite well for small samples. Both approaches have limitations for sample sizes N that are neither large enough to satisfy the assumption of asymptotic normality nor small enough to allow us to generate the exact distribution. We analytically computed the 1/N corrections of the asymptotic distribution for any statistic based on a multinomial law. We applied these results to the kappa statistic in 2 × 2 and 3 × 3 tables. We also compared the coverage probability obtained with the asymptotic and the corrected distributions under various hypothetical configurations of sample size and theoretical proportions. With this method, the estimates of the mean and the variance were greatly improved, as were the 2.5 and 97.5 percentiles of the distribution, allowing us to go down to sample sizes around 20 for data sets that are not too asymmetrical. The order of the difference between the exact and the corrected values was 1/N² for the mean and 1/N³ for the variance.
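As a reminder of the statistic involved, here is a short sketch of the kappa statistic for a square agreement table, assuming the usual Cohen definition (observed agreement corrected for chance agreement from the margins). It computes the point estimate only; the 1/N corrections described in the abstract are not reproduced, and the function name `cohen_kappa` is an assumption.

```python
def cohen_kappa(table):
    """Cohen's kappa for a k x k table of counts given as a list of rows."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(table[i]) for i in range(k)]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_obs = sum(table[i][i] for i in range(k)) / n                  # observed agreement
    p_exp = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)  # chance agreement
    return (p_obs - p_exp) / (1.0 - p_exp)
```

For the 2 × 2 table `[[20, 5], [10, 15]]` the observed agreement is 0.7, the chance agreement is 0.5, and kappa is 0.4.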

13.
Summary. In magazine advertisements for new drugs, it is common to see summary tables that compare the relative frequency of several side-effects for the drug and for a placebo, based on results from placebo-controlled clinical trials. The paper summarizes ways to conduct a global test of equality of the vector of population proportions for the drug and the vector of population proportions for the placebo. For multivariate normal responses, the Hotelling T² test is a well-known method for testing equality of a vector of means for two independent samples. The tests in the paper are analogues of this test for vectors of binary responses. The likelihood ratio tests can be computationally intensive or have poor asymptotic performance. Simple quadratic forms comparing the two vectors provide alternative tests. Much better performance results from using a score-type version with a null-estimated covariance matrix than from the sample covariance matrix that applies with an ordinary Wald test. For either type of statistic, asymptotic inference is often inadequate, so we also present alternative, exact permutation tests. Follow-up inferences are also discussed, and our methods are applied to safety data from a phase II clinical trial.

14.
A 2 × 2 contingency table can often be analysed in an exact fashion by using Fisher's exact test and in an approximate fashion by using the chi-squared test with Yates' continuity correction, and it is traditionally held that the approximation is valid when the minimum expected quantity E satisfies E ≥ 5. Unfortunately, little research has been carried out into this belief, other than that it is necessary to establish a bound E > E*, that the condition E ≥ 5 may not be the most appropriate (Martín Andrés et al., 1992) and that E* is not a constant, but usually increases with the sample size (Martín Andrés & Herranz Tejedor, 1997). In this paper, the authors conduct a theoretical experimental study from which they ascertain that the E* value (which is very variable and frequently quite a lot greater than 5) is strongly related to the magnitude of the skewness of the underlying hypergeometric distribution, and that bounding the skewness is equivalent to bounding E (which is the best control procedure). The study enables the expression for the above-mentioned E* (which in turn depends on the number of tails in the test, the alpha error used, the total sample size, and the minimum marginal imbalance) to be estimated. The authors also show that E* generally increases with the sample size and with the marginal imbalance, although it does reach a maximum. Some general and very conservative validity conditions are E ≥ 35.53 (one-tailed test) and E ≥ 7.45 (two-tailed test) for nominal alpha errors in 1% ≤ α ≤ 10%. The traditional condition E ≥ 5 is only valid when the samples are small and one of the marginals is very balanced; alternatively, the condition E ≥ 5.5 is valid for small samples or a very balanced marginal.
Finally, it is proved that the chi-squared test is always valid in tables where both marginals are balanced, and that the maximum skewness permitted is related to the maximum value of the bound E*, to its value for tables with at least one balanced marginal and to the minimum value that those marginals must have (in non-balanced tables) for the chi-squared test to be valid.
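The two quantities discussed above, the minimum expected quantity E and the Yates-corrected chi-squared statistic, can be computed as in the sketch below. It uses the standard textbook formulas only; the bound E* and the validity conditions derived in the paper are not implemented, and the function name `min_expected_and_yates` is hypothetical.

```python
def min_expected_and_yates(a, b, c, d):
    """Return (E, chi2) for the 2 x 2 table [[a, b], [c, d]].

    E is the smallest expected count under independence; chi2 is the
    chi-squared statistic with Yates' continuity correction (1 df).
    """
    n = a + b + c + d
    r1, r2 = a + b, c + d
    c1, c2 = a + c, b + d
    # The smallest expected count pairs the smaller row total with the
    # smaller column total.
    e_min = min(r1, r2) * min(c1, c2) / n
    # Yates: shrink |ad - bc| by n/2, never below zero.
    corrected = max(abs(a * d - b * c) - n / 2.0, 0.0)
    chi2 = n * corrected ** 2 / (r1 * r2 * c1 * c2)
    return e_min, chi2
```

For the table `[[10, 5], [5, 10]]` this gives E = 7.5 and a corrected chi-squared value of about 2.13. Whether E = 7.5 is large enough depends, according to the abstract, on the number of tails, the alpha error, the sample size and the marginal imbalance.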

15.
A correlated probit model approximation for conditional probabilities (Mendell and Elston 1974) is used to estimate the variance for binary matched pairs data by maximum likelihood. Using asymptotic data, the bias of the estimates is shown to be small for a wide range of intra-class correlations and incidences. This approximation is also compared with other recently published, or implemented, improved approximations. For the small sample examples presented, it shows a substantial advantage over other approximations. The method is extended to allow covariates for each observation, and fitting by iteratively reweighted least squares.

16.
We propose bootstrap prediction intervals for an observation h periods into the future and its conditional mean. We assume that these forecasts are made using a set of factors extracted from a large panel of variables. Because we treat these factors as latent, our forecasts depend both on estimated factors and estimated regression coefficients. Under regularity conditions, asymptotic intervals have been shown to be valid under Gaussianity of the innovations. The bootstrap allows us to relax this assumption and to construct valid prediction intervals under more general conditions. Moreover, even under Gaussianity, the bootstrap leads to more accurate intervals in cases where the cross-sectional dimension is relatively small as it reduces the bias of the ordinary least-squares (OLS) estimator.

17.
When an I × J contingency table has many cells with very small frequencies, the usual chi-square approximations to the upper tails of the likelihood ratio goodness-of-fit statistic, G², and the Pearson chi-square statistic, X², for testing independence are not satisfactory. In this paper we consider the problem of adjusting G² and X². Suitable adjustments are suggested on the basis of an analytical investigation of the asymptotic bias terms for G² and X². A Monte Carlo simulation is performed for several tables to assess the adjustments of G² and X² in order to obtain a closer approximation to the nominal level of significance.

18.
Apart from having intrinsic mathematical interest, order statistics are also useful in the solution of many applied sampling and analysis problems. For a general review of the properties and uses of order statistics, see David (1981). This paper provides tabulations of means and variances of certain order statistics from the gamma distribution, for parameter values not previously available. The work was motivated by a particular quota sampling problem, for which existing tables are not adequate. The solution to this sampling problem actually requires the moments of the highest order statistic within a given set; however, the calculation algorithm used involves a recurrence relation, which causes all the lower order statistics to be calculated first. Therefore we took the opportunity to develop more extensive tables for the gamma order statistic moments in general. Our tables provide values for the order statistic moments which were not available in previous tables, notably those for higher values of m, the gamma distribution shape parameter. However, we have also retained the corresponding statistics for lower values of m, first to allow the accuracy of the computations to be checked against previous tables, and second to provide an integrated presentation of our new results with the previously known values in a consistent format.

19.
The Asymptotic Power of Jonckheere-Type Tests for Ordered Alternatives   Total citations: 1, self-citations: 0, citations by others: 1
For the c-sample location problem with ordered alternatives, the test proposed by Barlow et al. (1972, p. 184) is an appropriate one under the model of normality. For non-normal data, however, there are rank tests which have higher power than the test of Barlow et al., e.g. the Jonckheere test or so-called Jonckheere-type tests recently introduced and studied by Büning & Kössler (1996). In this paper the asymptotic power of the Jonckheere-type tests is computed by using results of Hájek (1968) which may be considered as extensions of the theorem of Chernoff & Savage (1958). Power studies via Monte Carlo simulation show that the asymptotic power values provide a good approximation to the finite ones even for moderate sample sizes.

20.
This paper considers the problem of calculating a confidence interval for the angular difference between the mean directions of two spherical random variables with rotationally symmetric unimodal distributions. For large sample sizes, it is shown that the asymptotic distribution of 1 – cos α, where α is the sample angular difference, is approximately exponential if the true difference is zero, and approximately normal for a ‘large’ true difference; a scaled beta approximation is determined for the general case. For small sample sizes, a bootstrap approach is recommended. The results are applied to two sets of palaeomagnetic data.
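A rough sketch of the small-sample route is given below: the sample angular difference α between two mean directions, followed by a plain percentile bootstrap. The percentile method, the resampling scheme and all function names are assumptions made for illustration; the abstract does not say which bootstrap variant the authors recommend.

```python
import math
import random

def mean_direction(points):
    """Unit vector along the resultant of a list of 3-D vectors."""
    sums = [sum(p[i] for p in points) for i in range(3)]
    norm = math.sqrt(sum(s * s for s in sums))   # assumes a non-zero resultant
    return tuple(s / norm for s in sums)

def angular_difference(sample1, sample2):
    """Angle (radians) between the two sample mean directions."""
    m1, m2 = mean_direction(sample1), mean_direction(sample2)
    dot = sum(u * v for u, v in zip(m1, m2))
    return math.acos(max(-1.0, min(1.0, dot)))   # clamp against rounding error

def bootstrap_interval(sample1, sample2, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the angular difference (assumed variant)."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        # Resample each sample independently, with replacement.
        b1 = [rng.choice(sample1) for _ in sample1]
        b2 = [rng.choice(sample2) for _ in sample2]
        stats.append(angular_difference(b1, b2))
    stats.sort()
    lo = stats[int((1.0 - level) / 2.0 * n_boot)]
    hi = stats[min(n_boot - 1, int((1.0 + level) / 2.0 * n_boot))]
    return lo, hi
```

Because the angular difference cannot be negative, a percentile interval can behave poorly when the true difference is near zero, which is consistent with the abstract treating that case separately through the distribution of 1 – cos α.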
