Similar literature
20 similar documents found.
1.
The 2016 discussion by the American Statistical Association on the use and misuse of p-values was a timely assertion that statistical concepts should be used properly in science. Some researchers, especially economists, who adopt significance testing and p-values to report their results may have been confused by the statement, leading to misinterpretations of it. In this study, we re-examine the accuracy of the p-value and introduce alternative ways of testing hypotheses. We conduct a simulation study to investigate the reliability of the p-value. Apart from investigating the performance of the p-value, we also introduce some existing approaches, Minimum Bayes Factors (MBFs) and belief functions, as replacements for the p-value. Results from the simulation study confirm that the p-value is unreliable in some cases and suggest that the proposed approaches are useful substitutes in statistical inference. Moreover, our results show that the plausibility approach is more accurate than the traditionally used p-values for making decisions about the null hypothesis when the null hypothesis is true. However, the MBFs of Edwards et al. [Bayesian statistical inference for psychological research. Psychol. Rev. 70(3) (1963), pp. 193–242]; Vovk [A logic of probability, with application to the foundations of statistics. J. Royal Statistical Soc. Series B (Methodological) 55 (1993), pp. 317–351] and Sellke et al. [Calibration of p values for testing precise null hypotheses. Am. Stat. 55(1) (2001), pp. 62–71] provide more reliable results than all other methods when the null hypothesis is false.
KEYWORDS: Ban of P-value, Minimum Bayes Factors, belief functions
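As an illustration of how a p-value can be recalibrated into a minimum Bayes factor, the sketch below uses the bound commonly attributed to Sellke et al. (2001), which is -e·p·ln(p) for p < 1/e. The function names `sellke_mbf` and `min_posterior_null`, and the default prior probability of 0.5 for the null, are assumptions made for this example; the abstract does not state how the study implemented its MBFs.

```python
import math


def sellke_mbf(p):
    """Lower bound on the Bayes factor in favour of H0: -e * p * ln(p) for p < 1/e."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if p >= 1.0 / math.e:
        # Outside the calibration range the bound gives no evidence against H0.
        return 1.0
    return -math.e * p * math.log(p)


def min_posterior_null(p, prior_null=0.5):
    """Smallest posterior probability of H0 implied by the bound.

    prior_null is an assumed prior probability of H0 (hypothetical default).
    """
    bayes_factor = sellke_mbf(p)
    prior_odds = prior_null / (1.0 - prior_null)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    for p in (0.05, 0.01, 0.001):
        print(p, round(sellke_mbf(p), 4), round(min_posterior_null(p), 4))
```

Under these assumptions, p = 0.05 corresponds to a bound of roughly 0.41, so the null retains a posterior probability of at least about 0.29 when the prior odds are even.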

2.
Abstract

In this paper, we consider the preliminary test approach to estimating the regression parameter in a multiple regression model under multicollinearity. Preliminary test almost unbiased two-parameter estimators based on the Wald, likelihood ratio, and Lagrangian multiplier tests are given for the case in which the regression parameter is suspected to be restricted to a subspace and the regression error follows a multivariate Student's t distribution. The bias and quadratic risk of the proposed estimators are derived and compared. Furthermore, a Monte Carlo simulation is provided to illustrate some of the theoretical results.

3.
Non-normality and heteroscedasticity are common in applications. For the comparison of two samples in the non-parametric Behrens–Fisher problem, different tests have been proposed, but no single test can be recommended for all situations. Here, we propose combining two tests, the Welch t test based on ranks and the Brunner–Munzel test, within a maximum test. Simulation studies indicate that this maximum test, performed as a permutation test, controls the type I error rate and stabilizes the power. That is, it has good power characteristics for a variety of distributions, and also for unbalanced sample sizes. Compared to the single tests, the maximum test shows acceptable type I error control.

4.
5.
ABSTRACT

The Mack–Wolfe test is the most frequently used nonparametric procedure for the umbrella alternative problem. In this paper, modifications of the Mack–Wolfe test are proposed for both known-peak and unknown-peak umbrellas. The exact mean and variance of the proposed tests under the null hypothesis are also derived. We compare these tests with some of the existing tests in terms of the type I error rate and power. In addition, a real data example is presented.

6.
This paper reviews global and multiple tests for the combination of n hypotheses using the ordered p-values of the n individual tests. In 1987, Röhmel and Streitberg presented a general method to construct global level α tests based on ordered p-values when there exists no prior knowledge regarding the joint distribution of the corresponding test statistics. In the case of independent test statistics, construction of global tests is available by means of recursive formulae presented by Bicher (1989), Kornatz (1994) and Finner and Roters (1994). Multiple test procedures can be developed by applying the closed test principle using these global tests as building blocks. Liu (1996) proposed representing closed tests by means of “critical matrices” which contain the critical values of the global tests. Within the framework of these theoretical concepts, well-known global tests and multiple test procedures are classified and the relationships between the different tests are characterised.
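To make the idea of a global test built from ordered p-values concrete, here is a minimal sketch of two standard examples, the Bonferroni and Simes global tests. The abstract does not name these two, so choosing them is an assumption of this example, as are the function names. The Bonferroni test is valid under arbitrary dependence, whereas the Simes test is usually justified under independence of the test statistics.

```python
def bonferroni_global(pvalues, alpha=0.05):
    """Reject the global null if the smallest p-value is at most alpha / n."""
    n = len(pvalues)
    return min(pvalues) <= alpha / n


def simes_global(pvalues, alpha=0.05):
    """Reject the global null if p_(i) <= i * alpha / n for at least one i.

    p_(1) <= ... <= p_(n) are the ordered p-values.
    """
    n = len(pvalues)
    ordered = sorted(pvalues)
    return any(p <= i * alpha / n for i, p in enumerate(ordered, start=1))


if __name__ == "__main__":
    pvals = [0.04, 0.03, 0.02]
    print("Bonferroni rejects:", bonferroni_global(pvals))
    print("Simes rejects:", simes_global(pvals))
```

Both tests compare the ordered p-values with a vector of critical values, which is the kind of information a "critical matrix" collects for every intersection hypothesis in a closed test.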

7.
8.
The power of the Fisher permutation test extended to 2 × k tables is evaluated unconditionally as a function of the underlying cell probabilities in the table. These results are then applied in assessing the sensitivity of two-generation cancer bioassays in which a fixed number of pups from each litter born in the first generation are selected to continue on test in the second generation. In this case, the two rows of the table correspond to two treatment groups and the k columns correspond to the number of animals responding in a litter. The cell probabilities in this application are based on a suitable beta-binomial superpopulation model.

9.
In this paper, we consider the validity of the Jarque–Bera normality test whose construction is based on the residuals, for the innovations of GARCH (generalized autoregressive conditional heteroscedastic) models. It is shown that the asymptotic behavior of the original form of the JB test adopted in this paper is identical to that of the test statistic based on true errors. The simulation study also confirms the validity of the original form since it outperforms other available normality tests.
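For reference, the textbook form of the Jarque–Bera statistic is JB = n/6 · (S² + (K − 3)²/4), where S and K are the sample skewness and kurtosis. The sketch below applies it to a list of residuals; it assumes the residuals have already been obtained from a fitted model (the GARCH fitting step is not shown), and the function name `jarque_bera` is chosen for this example. The p-value uses the chi-squared distribution with 2 degrees of freedom, whose survival function is exp(−x/2).

```python
import math


def jarque_bera(residuals):
    """Return (JB statistic, asymptotic p-value) for a sequence of residuals."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Central sample moments with divisor n.
    m2 = sum((x - mean) ** 2 for x in residuals) / n
    m3 = sum((x - mean) ** 3 for x in residuals) / n
    m4 = sum((x - mean) ** 4 for x in residuals) / n
    if m2 == 0.0:
        raise ValueError("residuals have zero variance")
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Survival function of a chi-squared variable with 2 degrees of freedom.
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


if __name__ == "__main__":
    stat, p = jarque_bera([-1.0, 1.0, -1.0, 1.0])
    print(stat, p)
```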

10.
The problem of testing for a parameter change has been a core issue in time series analysis. It is well known that the estimates-based CUSUM test often suffers from severe size distortions in general GARCH-type models. The residual-based CUSUM test has been used as an alternative; however, it fails to detect ARMA parameter changes in ARMA–GARCH models. As a remedy, one can employ the score vector-based CUSUM test in ARMA–GARCH models as in Oh and Lee (0000). However, it shows some size distortions for relatively small samples. Hence, we consider its bootstrap counterpart to obtain a more stable test. We focus on verifying the weak consistency of the proposed test. An empirical study is presented to evaluate it.

11.
In this paper, we consider the bootstrap procedure for the augmented Dickey–Fuller (ADF) unit root test by implementing the modified divergence information criterion (MDIC, Mantalos et al. [An improved divergence information criterion for the determination of the order of an AR process, Commun. Statist. Comput. Simul. 39(5) (2010a), pp. 865–879; Forecasting ARMA models: A comparative study of information criteria focusing on MDIC, J. Statist. Comput. Simul. 80(1) (2010b), pp. 61–73]) for the selection of the optimum number of lags in the estimated model. The asymptotic distribution of the resulting bootstrap ADF/MDIC test is established and its finite sample performance is investigated through Monte Carlo simulations. The proposed bootstrap tests are found to have finite sample sizes that are generally much closer to their nominal values than those of tests that rely on other information criteria, such as the Akaike information criterion [H. Akaike, Information theory and an extension of the maximum likelihood principle, in Proceedings of the 2nd International Symposium on Information Theory, B.N. Petrov and F. Csáki, eds., Akadémiai Kiadó, Budapest, 1973, pp. 267–281]. The simulations reveal that the proposed procedure is quite satisfactory even for models with large negative moving average coefficients.

12.
We develop an exact Kolmogorov–Smirnov goodness-of-fit test for the Poisson distribution with an unknown mean. This test is conditional, with the test statistic being the maximum absolute difference between the empirical distribution function and its conditional expectation given the sample total. Exact critical values are obtained using a new algorithm. We explore properties of the test, and we illustrate it with three examples. The new test seems to be the first exact Poisson goodness-of-fit test for which critical values are available without simulation or exhaustive enumeration.

13.
14.
In this paper, we study the Jarque–Bera (JB) normality test for the innovations of ARMA–GARCH models, whose construction is based on the residuals. The validity of the JB test for ARMA–GARCH innovations should be carefully investigated in advance of actual practice, since the residual-based test may behave differently, depending upon the structure of the time series models and the form of the test statistic (cf. Chen and Kuan, 2003, Hwang and Baek, 2009, Lee and Wei, 1999). In order to demonstrate the validity of the JB test, we prove that the asymptotic distribution of the original form of the JB test is identical to that of the test statistic based on true errors under mild conditions. Simulation results are provided for illustration.

15.
We introduce the 2nd-power skewness and kurtosis, which are interesting alternatives to the classical Pearson's skewness and kurtosis, called 3rd-power skewness and 4th-power kurtosis in our terminology. We use the sample 2nd-power skewness and kurtosis to build a powerful test of normality. This test can also be derived as Rao's score test on the asymmetric power distribution, which combines the large range of exponential tail behavior provided by the exponential power distribution family with various levels of asymmetry. We find that our test statistic is asymptotically chi-squared distributed. We also propose a modified test statistic, for which we show numerically that the distribution can be approximated for finite sample sizes with very high precision by a chi-squared distribution. Similarly, we propose a directional test based on the sample 2nd-power kurtosis only, for situations where the true distribution is known to be symmetric. Our tests are very similar in spirit to the famous Jarque–Bera test, and as such are also locally optimal. They offer the same nice interpretation and, in addition, the gold standard power of the regression and correlation tests. An extensive empirical power analysis is performed, which shows that our tests are among the most powerful normality tests. Our test is implemented in an R package called PoweR.

16.
17.
18.
Outliers are commonly observed in psychosocial research, generally resulting in biased estimates when comparing group differences using popular mean-based models such as the analysis of variance model. Rank-based methods such as the popular Mann–Whitney–Wilcoxon (MWW) rank sum test are more effective at addressing such outliers. However, available methods for inference are limited to cross-sectional data and cannot be applied to longitudinal studies with missing data. In this paper, we propose a generalized MWW test for comparing multiple groups with covariates within a longitudinal data setting, by utilizing functional response models. Inference is based on a class of U-statistics-based weighted generalized estimating equations, providing consistent and asymptotically normal estimates not only under complete data but under missing data as well. The proposed approach is illustrated with both real and simulated study data.

19.
A robust test of a parameter in the presence of nuisance parameters was proposed by Wang (1981). The test procedure is a robust extension of the optimal C(α) tests. A numerical method is provided for computing the solution of the orthogonality condition required by the test procedure. An example on testing a normal scale parameter in the presence of outliers is worked out to illustrate the construction of the robust test.

20.
Preliminary tests of significance on the crucial assumptions are often done before drawing inferences of primary interest. In a factorial trial, the data may be pooled across the columns or rows for making inferences concerning the efficacy of the drugs (simple effect) in the absence of interaction. Pooling the data has an advantage of higher power due to larger sample size. On the other hand, in the presence of interaction, such pooling may seriously inflate the type I error rate in testing for the simple effect.

A preliminary test for interaction is therefore in order. If this preliminary test is not significant at some prespecified level of significance, then pool the data for testing the efficacy of the drugs at a specified α level. Otherwise, use of the corresponding cell means for testing the efficacy of the drugs at the specified α is recommended. This paper demonstrates that this adaptive procedure may seriously inflate the overall type I error rate. Such inflation happens even in the absence of interaction.

One interesting result is that the type I error rate of the adaptive procedure depends on the interaction and the square root of the sample size only through their product. One consequence of this result is as follows. No matter how small the non-zero interaction might be, the inflation of the type I error rate of the always-pool procedure will eventually become unacceptable as the sample size increases. Therefore, in a very large study, even though the interaction is suspected to be very small but non-zero, the always-pool procedure may seriously inflate the type I error rate in testing for the simple effects.

It is concluded that the 2 × 2 factorial design is not an efficient design for detecting simple effects, unless the interaction is negligible.
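The adaptive procedure described in this abstract can be explored with a small Monte Carlo sketch. The setup below is an assumption made for illustration, not the paper's own design: a 2 × 2 layout with equal cell sizes, normal responses with known unit variance, two-sided z-tests, and a true simple effect of zero for drug A at the lower level of drug B. The function name `adaptive_type1_rate` and all parameter names are hypothetical.

```python
import random
from statistics import NormalDist


def adaptive_type1_rate(n_per_cell, interaction, alpha=0.05, alpha_pre=0.05,
                        n_sim=100_000, seed=0):
    """Estimate the type I error rate of the pool-if-not-significant procedure."""
    rng = random.Random(seed)
    z_main = NormalDist().inv_cdf(1 - alpha / 2)
    z_pre = NormalDist().inv_cdf(1 - alpha_pre / 2)
    se_cell = (1.0 / n_per_cell) ** 0.5  # sd of a cell mean when sigma = 1
    # True cell means keyed by (level of A, level of B).
    # The simple effect of A at B = 0 is zero, so the tested null is true.
    mu = {(0, 0): 0.0, (1, 0): 0.0, (0, 1): 0.0, (1, 1): interaction}
    rejections = 0
    for _ in range(n_sim):
        m = {cell: rng.gauss(mean, se_cell) for cell, mean in mu.items()}
        effect_b0 = m[(1, 0)] - m[(0, 0)]
        effect_b1 = m[(1, 1)] - m[(0, 1)]
        # Preliminary test for interaction; the contrast has variance 4 / n.
        z_int = (effect_b1 - effect_b0) / (2.0 * se_cell)
        if abs(z_int) <= z_pre:
            # Not significant: pool across B; the pooled effect has variance 1 / n.
            z = 0.5 * (effect_b0 + effect_b1) / se_cell
        else:
            # Significant: use the cell means; the simple effect has variance 2 / n.
            z = effect_b0 / (2.0 ** 0.5 * se_cell)
        rejections += abs(z) > z_main
    return rejections / n_sim


if __name__ == "__main__":
    print("no interaction:", adaptive_type1_rate(50, 0.0))
    print("interaction 0.4, n = 25:", adaptive_type1_rate(25, 0.4))
    print("interaction 0.2, n = 100:", adaptive_type1_rate(100, 0.2))
```

In this setup the standardized statistics depend on the interaction and the cell size only through the interaction multiplied by the square root of n, so the last two calls are expected to give nearly the same estimate, in line with the dependence described in the abstract.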
