Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
The 2016 discussion by the American Statistical Association on the use and misuse of p-values was a timely assertion that statistical concepts should be properly used in science. Some researchers, especially economists, who adopt significance testing and p-values to report their results may have been confused by the statement, leading to misinterpretations of it. In this study, we aim to re-examine the accuracy of the p-value and introduce an alternative way of testing hypotheses. We conduct a simulation study to investigate the reliability of the p-value. Apart from investigating the performance of the p-value, we also introduce some existing approaches, Minimum Bayes Factors (MBFs) and belief functions, as replacements for the p-value. Results from the simulation study confirm that the p-value is unreliable in some cases and that our proposed approaches seem to be useful as substitute tools in statistical inference. Moreover, our results show that the plausibility approach is more accurate for making decisions about the null hypothesis than the traditionally used p-values when the null hypothesis is true. However, the MBFs of Edwards et al. [Bayesian statistical inference for psychological research. Psychol. Rev. 70(3) (1963), pp. 193–242]; Vovk [A logic of probability, with application to the foundations of statistics. J. Royal Statistical Soc. Series B (Methodological) 55 (1993), pp. 317–351] and Sellke et al. [Calibration of p values for testing precise null hypotheses. Am. Stat. 55(1) (2001), pp. 62–71] provide more reliable results compared to all other methods when the null hypothesis is false.
KEYWORDS: Ban of P-value, Minimum Bayes Factors, belief functions
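As a rough illustration of the minimum Bayes factors named in this abstract, the sketch below converts a two-sided p-value into two commonly cited lower bounds on the Bayes factor: exp(−z²/2), usually attributed to Edwards et al., and −e·p·ln(p) for p < 1/e, from Sellke et al. The function names are hypothetical, and the abstract does not say which variants or settings the authors actually used, so this is an assumption-laden sketch rather than their procedure.

```python
import math
from statistics import NormalDist


def mbf_edwards(p_value: float) -> float:
    """Bound exp(-z^2 / 2), with z the two-sided normal quantile of p."""
    z = NormalDist().inv_cdf(1.0 - p_value / 2.0)
    return math.exp(-0.5 * z * z)


def mbf_sellke(p_value: float) -> float:
    """Bound -e * p * ln(p), used for p < 1/e; otherwise the bound is 1."""
    if p_value >= 1.0 / math.e:
        return 1.0
    return -math.e * p_value * math.log(p_value)


if __name__ == "__main__":
    # Print both bounds for a few conventional significance thresholds.
    for p in (0.05, 0.01, 0.001):
        print(p, round(mbf_edwards(p), 4), round(mbf_sellke(p), 4))
```

Smaller values indicate stronger evidence against the null hypothesis. At p = 0.05 the second bound is roughly 0.41, which is one way to see why a "significant" p-value can correspond to fairly weak evidence.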

2.
Abstract

In this paper, we consider the preliminary test approach to the estimation of the regression parameter in a multiple regression model under multicollinearity. Preliminary test almost unbiased two-parameter estimators based on the Wald, likelihood ratio, and Lagrange multiplier tests are given for the case where it is suspected that the regression parameter may be restricted to a subspace and the regression errors follow a multivariate Student's t distribution. The bias and quadratic risk of the proposed estimators are derived and compared. Furthermore, a Monte Carlo simulation is provided to illustrate some of the theoretical results.

3.
We show that the Bradley–Blackwood simultaneous test for equal means and equal variances in paired samples additively decomposes into separate tests of these hypotheses. The test of equal variances in the decomposition is the standard Pitman–Morgan procedure. The test of equal means in the decomposition is based on a t-ratio with (n − 2) degrees of freedom and has the additional restriction that the variances are equal.
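One way to see the additive decomposition described above is through the usual regression form of the Bradley–Blackwood test, in which the differences D = X − Y are regressed on the sums S = X + Y. The sketch below assumes that formulation; the function name is hypothetical, and the t-ratios are one reading of the abstract rather than the authors' own notation.

```python
import math


def bradley_blackwood_parts(x, y):
    """Return (F, t_mean, t_var) for paired samples x and y.

    D = x - y is regressed on S = x + y. F is the simultaneous statistic
    on (2, n - 2) degrees of freedom; t_mean and t_var each have n - 2
    degrees of freedom, and 2 * F equals t_mean**2 + t_var**2.
    """
    n = len(x)
    d = [a - b for a, b in zip(x, y)]
    s = [a + b for a, b in zip(x, y)]
    d_bar = sum(d) / n
    s_bar = sum(s) / n
    sdd = sum((v - d_bar) ** 2 for v in d)
    sss = sum((v - s_bar) ** 2 for v in s)
    sds = sum((u - d_bar) * (v - s_bar) for u, v in zip(d, s))
    ss_slope = sds ** 2 / sss            # regression sum of squares
    sse = sdd - ss_slope                 # residual sum of squares
    mse = sse / (n - 2)
    f_stat = (sum(v * v for v in d) - sse) / 2.0 / mse
    t_var = sds / math.sqrt(sss * mse)   # Pitman-Morgan t-ratio (slope = 0)
    t_mean = d_bar * math.sqrt(n / mse)  # equal-means t-ratio
    return f_stat, t_mean, t_var
```

The identity 2F = t_mean² + t_var² holds exactly for any data set, because the uncorrected sum of squares of D splits into n·D̄², the slope sum of squares, and the residual sum of squares.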

4.
Non-normality and heteroscedasticity are common in applications. For the comparison of two samples in the non-parametric Behrens–Fisher problem, different tests have been proposed, but no single test can be recommended for all situations. Here, we propose combining two tests, the Welch t test based on ranks and the Brunner–Munzel test, within a maximum test. Simulation studies indicate that this maximum test, performed as a permutation test, controls the type I error rate and stabilizes the power. That is, it has good power characteristics for a variety of distributions, and also for unbalanced sample sizes. Compared to the single tests, the maximum test shows acceptable type I error control.

5.
6.
The Hosmer–Lemeshow test is a widely used method for evaluating the goodness of fit of logistic regression models. However, like other chi-square tests, its power is strongly influenced by the sample size. Paul, Pennell, and Lemeshow (2013) [Standardizing the power of the Hosmer–Lemeshow goodness of fit test in large data sets. Statistics in Medicine 32: 67–80] considered using a large number of groups for large data sets to standardize the power. But simulations show that their method performs poorly for some models. In addition, it does not work when the sample size is larger than 25,000. In the present paper, we propose a modified Hosmer–Lemeshow test that is based on estimation and standardization of the distribution parameter of the Hosmer–Lemeshow statistic. We provide a mathematical derivation for obtaining the critical value and power of our test. Simulations show that our method satisfactorily standardizes the power of the Hosmer–Lemeshow test. It is especially recommended for sufficiently large data sets, as the power is rather stable. A bank marketing data set is also analyzed for comparison with existing methods.
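For orientation, the classical Hosmer–Lemeshow statistic that this abstract sets out to modify can be sketched as follows. This covers only the standard grouped chi-square form; the standardization proposed in the paper is not described in enough detail here to reproduce, and the function name and equal-size grouping rule are assumptions.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Classical Hosmer-Lemeshow chi-square statistic.

    y: observed 0/1 outcomes; p: fitted probabilities; groups: number of
    risk groups formed after sorting by p. The result is usually compared
    with a chi-square distribution on groups - 2 degrees of freedom.
    """
    pairs = sorted(zip(p, y), key=lambda pair: pair[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(yi for _, yi in chunk)   # events seen in the group
        expected = sum(pi for pi, _ in chunk)   # events the model predicts
        mean_p = expected / size
        denom = size * mean_p * (1.0 - mean_p)
        if denom > 0.0:
            stat += (observed - expected) ** 2 / denom
    return stat
```

Because each group's contribution scales with its size, even small calibration errors produce large statistics in very large samples, which is the sample-size sensitivity the abstract refers to.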

7.
ABSTRACT

The Mack–Wolfe test is the most frequently used nonparametric procedure for the umbrella alternative problem. In this paper, modifications of the Mack–Wolfe test are proposed for both known-peak and unknown-peak umbrellas. The exact mean and variance of the proposed test statistics under the null hypothesis are also derived. We compare these tests with some of the existing tests in terms of the type I error rate and power. In addition, a real data example is presented.

8.
This paper reviews global and multiple tests for the combination of n hypotheses using the ordered p-values of the n individual tests. In 1987, Röhmel and Streitberg presented a general method to construct global level α tests based on ordered p-values when there exists no prior knowledge regarding the joint distribution of the corresponding test statistics. In the case of independent test statistics, construction of global tests is available by means of recursive formulae presented by Bicher (1989), Kornatz (1994) and Finner and Roters (1994). Multiple test procedures can be developed by applying the closed test principle using these global tests as building blocks. Liu (1996) proposed representing closed tests by means of “critical matrices” which contain the critical values of the global tests. Within the framework of these theoretical concepts, well-known global tests and multiple test procedures are classified and the relationships between the different tests are characterised.
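Two familiar global tests based on ordered p-values are sketched below as examples of the kind of procedure this review classifies. Bonferroni and Simes are illustrative picks, not necessarily the tests the paper emphasizes, and the function names are hypothetical. Bonferroni is valid under any dependence among the test statistics, whereas the Simes test is usually justified under independence.

```python
def bonferroni_global(p_values, alpha=0.05):
    """Reject the global null if the smallest p-value is <= alpha / n."""
    n = len(p_values)
    return min(p_values) <= alpha / n


def simes_global(p_values, alpha=0.05):
    """Reject if p_(i) <= i * alpha / n for some i (ordered p-values)."""
    n = len(p_values)
    ordered = sorted(p_values)
    return any(p <= (i + 1) * alpha / n for i, p in enumerate(ordered))


if __name__ == "__main__":
    p = [0.02, 0.03, 0.04]
    print(bonferroni_global(p), simes_global(p))
```

With p-values 0.02, 0.03 and 0.04 at α = 0.05, Bonferroni does not reject (0.02 exceeds 0.05/3) while Simes does (0.03 is below 2 × 0.05/3). Applying such a global test to every intersection hypothesis is how the closed test principle turns it into a multiple test procedure.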

9.
10.
The power of the Fisher permutation test extended to 2 × k tables is evaluated unconditionally as a function of the underlying cell probabilities in the table. These results are then applied in assessing the sensitivity of two-generation cancer bioassays in which a fixed number of pups from each litter born in the first generation are selected to continue on test in the second generation. In this case, the two rows of the table correspond to two treatment groups and the k columns correspond to the number of animals responding in a litter. The cell probabilities in this application are based on a suitable beta-binomial superpopulation model.

11.
Consider the standard treatment-control model with a time-to-event endpoint. We propose a novel interpretable test statistic from a quantile function point of view. The large-sample consistency of our estimator is proven theoretically for fixed bandwidth values and validated empirically. A Monte Carlo simulation study also shows that, for small sample sizes, using a tuning parameter through a smooth quantile function estimator improves efficiency in terms of the MSE compared with direct application of the classic Kaplan–Meier survival function estimator. The procedure is finally illustrated via an application to epithelial ovarian cancer data.

12.
In this paper, we consider the validity of the Jarque–Bera normality test whose construction is based on the residuals, for the innovations of GARCH (generalized autoregressive conditional heteroscedastic) models. It is shown that the asymptotic behavior of the original form of the JB test adopted in this paper is identical to that of the test statistic based on true errors. The simulation study also confirms the validity of the original form since it outperforms other available normality tests.
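The original form of the Jarque–Bera statistic referred to above is JB = n/6 · (S² + (K − 3)²/4), where S and K are the sample skewness and kurtosis. A minimal sketch follows, assuming the input is a list of standardized residuals from an already fitted model; the fitting step is outside the scope of this example and the function name is hypothetical.

```python
def jarque_bera(residuals):
    """Jarque-Bera statistic n/6 * (S^2 + (K - 3)^2 / 4).

    Under normality the statistic is asymptotically chi-square
    distributed with 2 degrees of freedom.
    """
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n  # central moments
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
```

The abstract's claim is that plugging in residuals instead of the unobservable true errors leaves the limiting distribution of this statistic unchanged for GARCH innovations.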

13.
The problem of testing for a parameter change has been a core issue in time series analysis. It is well known that the estimates-based CUSUM test often suffers from severe size distortions in general GARCH-type models. The residual-based CUSUM test has been used as an alternative; however, it fails to detect ARMA parameter changes in ARMA–GARCH models. As a remedy, one can employ the score vector-based CUSUM test in ARMA–GARCH models as in Oh and Lee (0000). However, this test shows some size distortions for relatively small samples. Hence, we consider its bootstrap counterpart to obtain a more stable test. We focus on verifying the weak consistency of the proposed test. An empirical study is presented to evaluate it.
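To make the residual-based CUSUM test mentioned above concrete, the sketch below computes a CUSUM-of-squares statistic from a list of model residuals. It assumes the commonly used form max_k |Σ_{t≤k} e_t² − (k/n) Σ_t e_t²| / (√n · τ̂), with τ̂² the sample variance of the squared residuals. It does not implement the score vector-based or bootstrap versions studied in the paper, and the function name is hypothetical.

```python
import math


def residual_cusum(residuals):
    """CUSUM-of-squares statistic computed from model residuals."""
    n = len(residuals)
    squares = [e * e for e in residuals]
    total = sum(squares)
    mean_sq = total / n
    tau2 = sum((s - mean_sq) ** 2 for s in squares) / n
    if tau2 == 0.0:
        raise ValueError("squared residuals have zero variance")
    running = 0.0
    largest = 0.0
    for k, s in enumerate(squares, start=1):
        running += s
        # Deviation of the partial sum from its share of the total.
        largest = max(largest, abs(running - k * total / n))
    return largest / math.sqrt(n * tau2)
```

Under the null of no change, statistics of this type are typically compared with the supremum of a Brownian bridge, whose 5% critical value is about 1.358. Because the statistic depends only on squared residuals, it reacts mainly to changes in the variance dynamics, which is consistent with the abstract's remark about undetected ARMA parameter changes.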

14.
In this paper, we consider the bootstrap procedure for the augmented Dickey–Fuller (ADF) unit root test by implementing the modified divergence information criterion (MDIC, Mantalos et al. [An improved divergence information criterion for the determination of the order of an AR process, Commun. Statist. Comput. Simul. 39(5) (2010a), pp. 865–879; Forecasting ARMA models: A comparative study of information criteria focusing on MDIC, J. Statist. Comput. Simul. 80(1) (2010b), pp. 61–73]) for the selection of the optimum number of lags in the estimated model. The asymptotic distribution of the resulting bootstrap ADF/MDIC test is established and its finite sample performance is investigated through Monte Carlo simulations. The proposed bootstrap tests are found to have finite sample sizes that are generally much closer to their nominal values than those of tests that rely on other information criteria, like the Akaike information criterion [H. Akaike, Information theory and an extension of the maximum likelihood principle, in Proceedings of the 2nd International Symposium on Information Theory, B.N. Petrov and F. Csáki, eds., Akadémiai Kiadó, Budapest, 1973, pp. 267–281]. The simulations reveal that the proposed procedure is quite satisfactory even for models with large negative moving average coefficients.

15.
In recent research, Elliott et al. (1996) [Elliott, G. 1996. Efficient tests for an autoregressive unit root. Econometrica, 64: 813–836] have shown the use of local-to-unity detrending via generalized least squares (GLS) to substantially increase the power of the Dickey–Fuller (1979) unit root test. In this paper the relationship between the extent of detrending undertaken, determined by the detrending parameter c̄, and the power of the resulting GLS-based Dickey–Fuller (DF-GLS) test is examined. Using Monte Carlo simulation it is shown that the values of c̄ suggested by Elliott et al. (1996) on the basis of a limiting power function seldom maximize the power of the DF-GLS test for the finite samples encountered in applied research. This result is found to hold for the DF-GLS test including either an intercept or an intercept and a trend term. An empirical examination of the order of integration of the UK household savings ratio illustrates these findings, with the unit root hypothesis rejected using values of c̄ other than that proposed by Elliott et al. (1996).

16.
We develop an exact Kolmogorov–Smirnov goodness-of-fit test for the Poisson distribution with an unknown mean. This test is conditional, with the test statistic being the maximum absolute difference between the empirical distribution function and its conditional expectation given the sample total. Exact critical values are obtained using a new algorithm. We explore properties of the test, and we illustrate it with three examples. The new test seems to be the first exact Poisson goodness-of-fit test for which critical values are available without simulation or exhaustive enumeration.

17.
In a special paired sample case, Hotelling’s T² test based on the differences of the paired random vectors is the likelihood ratio test for testing the hypothesis that the paired random vectors have the same mean; with respect to a special group of affine linear transformations it is the uniformly most powerful invariant test for the general alternative of a difference in mean. We present an elementary straightforward proof of this result. The likelihood ratio test for testing the hypothesis that the covariance structure is of the assumed special form is derived and discussed. Applications to real data are given.

18.
19.
In this paper, we study the Jarque–Bera (JB) normality test for the innovations of ARMA–GARCH models, whose construction is based on the residuals. The validity of the JB test for ARMA–GARCH innovations should be carefully investigated before it is used in practice, since the residual-based test may behave differently depending upon the structure of the time series model and the form of the test statistic (cf. Chen and Kuan, 2003; Hwang and Baek, 2009; Lee and Wei, 1999). In order to demonstrate the validity of the JB test, we prove that the asymptotic distribution of the original form of the JB test is identical to that of the test statistic based on true errors under mild conditions. Simulation results are provided for illustration.

20.
We introduce the 2nd-power skewness and kurtosis, which are interesting alternatives to the classical Pearson's skewness and kurtosis, called 3rd-power skewness and 4th-power kurtosis in our terminology. We use the sample 2nd-power skewness and kurtosis to build a powerful test of normality. This test can also be derived as Rao's score test on the asymmetric power distribution, which combines the large range of exponential tail behavior provided by the exponential power distribution family with various levels of asymmetry. We find that our test statistic is asymptotically chi-squared distributed. We also propose a modified test statistic, for which we show numerically that the distribution can be approximated for finite sample sizes with very high precision by a chi-square. Similarly, we propose a directional test based on sample 2nd-power kurtosis only, for situations where the true distribution is known to be symmetric. Our tests are very similar in spirit to the famous Jarque–Bera test, and as such are also locally optimal. They offer the same nice interpretation and, in addition, the gold-standard power of the regression and correlation tests. An extensive empirical power analysis is performed, which shows that our tests are among the most powerful normality tests. Our test is implemented in an R package called PoweR.
