Similar Literature (20 articles found)
1.
ABSTRACT

We discuss problems the null hypothesis significance testing (NHST) paradigm poses for replication and more broadly in the biomedical and social sciences as well as how these problems remain unresolved by proposals involving modified p-value thresholds, confidence intervals, and Bayes factors. We then discuss our own proposal, which is to abandon statistical significance. We recommend dropping the NHST paradigm—and the p-value thresholds intrinsic to it—as the default statistical paradigm for research, publication, and discovery in the biomedical and social sciences. Specifically, we propose that the p-value be demoted from its threshold screening role and instead, treated continuously, be considered along with currently subordinate factors (e.g., related prior evidence, plausibility of mechanism, study design and data quality, real-world costs and benefits, novelty of finding, and other factors that vary by research domain) as just one among many pieces of evidence. We have no desire to “ban” p-values or other purely statistical measures. Rather, we believe that such measures should not be thresholded and that, thresholded or not, they should not take priority over the currently subordinate factors. We also argue that it seldom makes sense to calibrate evidence as a function of p-values or other purely statistical measures. We offer recommendations for how our proposal can be implemented in the scientific publication process as well as in statistical decision making more broadly.

2.
ABSTRACT

Entropy-type integral functionals of densities are widely used in mathematical statistics, information theory, and computer science. Examples include measures of closeness between distributions (e.g., density power divergence) and uncertainty characteristics for a random variable (e.g., Rényi entropy). In this paper, we study U-statistic estimators for a class of such functionals. The estimators are based on ε-close vector observations in the corresponding independent and identically distributed samples. We prove asymptotic properties of the estimators (consistency and asymptotic normality) under mild integrability and smoothness conditions for the densities. The results can be applied in diverse problems in mathematical statistics and computer science (e.g., distribution identification problems, approximate matching for random databases, two-sample problems).
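As an illustration of the ε-close-pair idea, the sketch below estimates the quadratic functional ∫f² (and hence the Rényi entropy of order 2) by counting observation pairs within distance ε. It is a generic textbook-style estimator, not the specific U-statistics analyzed in the paper; the function name and the choice ε = 0.1 are purely illustrative.

```python
import numpy as np
from scipy.special import gamma as gamma_fn

def quadratic_functional_estimate(x, eps):
    """U-statistic-style estimate of int f^2 dx based on eps-close pairs.

    x   : (n, d) array of i.i.d. observations from density f
    eps : closeness radius

    Counts pairs within distance eps and divides by the number of pairs
    times the volume of the eps-ball (illustrative sketch only).
    """
    x = np.atleast_2d(x)
    n, d = x.shape
    # pairwise Euclidean distances (O(n^2) memory; fine for a small demo)
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    close_pairs = (dist < eps)[np.triu_indices(n, k=1)].sum()
    ball_volume = np.pi ** (d / 2) / gamma_fn(d / 2 + 1) * eps ** d
    return close_pairs / (n * (n - 1) / 2 * ball_volume)

rng = np.random.default_rng(0)
sample = rng.standard_normal((2000, 1))
q_hat = quadratic_functional_estimate(sample, eps=0.1)
print("estimated int f^2 dx  :", q_hat)
print("true value for N(0,1) :", 1 / (2 * np.sqrt(np.pi)))   # ~0.2821
print("Renyi entropy, order 2:", -np.log(q_hat))
```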

3.
ABSTRACT

When data analysts operate within different statistical frameworks (e.g., frequentist versus Bayesian, emphasis on estimation versus emphasis on testing), how does this impact the qualitative conclusions that are drawn for real data? To study this question empirically we selected from the literature two simple scenarios—involving a comparison of two proportions and a Pearson correlation—and asked four teams of statisticians to provide a concise analysis and a qualitative interpretation of the outcome. The results showed considerable overall agreement; nevertheless, this agreement did not appear to diminish the intensity of the subsequent debate over which statistical framework is more appropriate to address the questions at hand.

4.
Combining p-values from statistical tests across different studies is the most commonly used approach in meta-analysis for evolutionary biology. The principal p-value combination methods are the z-transform tests (e.g., the unweighted z-test and the weighted z-test) and the gamma-transform tests (e.g., the CZ method [Z. Chen, W. Yang, Q. Liu, J.Y. Yang, J. Li, and M.Q. Yang, A new statistical approach to combining p-values using gamma distribution and its application to genomewide association study, Bioinformatics 15 (2014), p. S3]). However, among these existing p-value combination methods, none is uniformly most powerful in all situations [Chen et al. 2014]. In this paper, we propose a meta-analysis method based on the gamma distribution, MAGD, which pools the p-values from independent studies. The proposed test allows flexible accommodation of different levels of heterogeneity of effect sizes across individual studies, and it simultaneously retains the characteristics of both the z-transform and the gamma-transform tests. We also propose an easy-to-implement resampling approach for estimating the empirical p-values of MAGD for finite sample sizes. Simulation studies and two data applications show that MAGD is essentially as powerful as the z-transform tests (the gamma-transform tests) when effect sizes are homogeneous (heterogeneous) across studies.
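For orientation, the sketch below implements the two classical building blocks the abstract refers to: a Stouffer-type weighted z-test and a gamma-transform combination. It is not an implementation of MAGD or of the CZ method; the function names, the shape parameter, and the example p-values are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def weighted_z_test(pvals, weights=None):
    """Stouffer-type weighted z-test for combining one-sided p-values."""
    p = np.asarray(pvals, dtype=float)
    w = np.ones_like(p) if weights is None else np.asarray(weights, dtype=float)
    z = stats.norm.isf(p)                      # transform p-values to z-scores
    z_comb = np.sum(w * z) / np.sqrt(np.sum(w ** 2))
    return stats.norm.sf(z_comb)               # combined p-value

def gamma_transform_test(pvals, shape=1.0):
    """Gamma-transform combination: sum of inverse-gamma-transformed p-values.
    shape = 1 is equivalent to Fisher's method (Gamma(1, 1) is exponential)."""
    p = np.asarray(pvals, dtype=float)
    g = stats.gamma.isf(p, a=shape)            # map each p-value to a Gamma(shape, 1) quantile
    # under H0 the sum of k independent Gamma(shape, 1) variables is Gamma(k*shape, 1)
    return stats.gamma.sf(np.sum(g), a=shape * len(p))

pvals = [0.012, 0.20, 0.34, 0.048]
print("weighted z-test     :", weighted_z_test(pvals))
print("gamma-transform test:", gamma_transform_test(pvals, shape=1.0))
```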

5.
6.
A novel method is proposed for choosing the tuning parameter associated with a family of robust estimators. It consists of minimising estimated mean squared error, an approach that requires pilot estimation of model parameters. The method is explored for the family of minimum distance estimators proposed by Basu et al. [Basu, A., Harris, I.R., Hjort, N.L. and Jones, M.C., 1998, Robust and efficient estimation by minimising a density power divergence. Biometrika, 85, 549–559]. Our preference in that context is for a version of the method using the L2 distance estimator [Scott, D.W., 2001, Parametric statistical modeling by minimum integrated squared error. Technometrics, 43, 274–285] as pilot estimator.
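A rough sketch of the idea, assuming a normal location-scale model: fit the minimum density power divergence estimator of Basu et al. for each candidate tuning parameter α and pick the α that minimizes an estimated MSE computed from a pilot fit. Here the estimated MSE comes from a simple parametric bootstrap around a median/MAD pilot, which is only a stand-in for the paper's analytic MSE estimate and its preferred L2-distance pilot; all function names and settings are illustrative.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def dpd_objective(theta, x, alpha):
    """Density power divergence criterion (Basu et al., 1998) for a N(mu, sigma^2) model;
    alpha -> 0 approaches maximum likelihood, alpha = 1 is the L2 distance criterion."""
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)
    f = norm.pdf(x, mu, sigma)
    integral_term = (1 + alpha) ** (-0.5) * (2 * np.pi * sigma ** 2) ** (-alpha / 2)
    return integral_term - (1 + 1 / alpha) * np.mean(f ** alpha)

def mdpde(x, alpha, start):
    """Minimum density power divergence estimate of (mu, sigma)."""
    res = minimize(dpd_objective, start, args=(x, alpha), method="Nelder-Mead")
    return res.x[0], np.exp(res.x[1])

def choose_alpha(x, alphas, n_boot=50, seed=0):
    """Pick alpha by minimising a parametric-bootstrap estimate of MSE(mu_hat),
    treating a robust pilot fit (median / MAD here) as the truth."""
    rng = np.random.default_rng(seed)
    mu0 = np.median(x)
    sigma0 = 1.4826 * np.median(np.abs(x - mu0))
    start = np.array([mu0, np.log(sigma0)])
    mse = {}
    for a in alphas:
        boot = [mdpde(rng.normal(mu0, sigma0, len(x)), a, start)[0] for _ in range(n_boot)]
        mse[a] = np.mean((np.array(boot) - mu0) ** 2)
    return min(mse, key=mse.get), mse

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 95), rng.normal(8, 1, 5)])  # 5% gross outliers
alpha_star, mse = choose_alpha(x, alphas=[0.1, 0.25, 0.5, 1.0])
print("selected alpha:", alpha_star)
print("MDPDE at selected alpha:", np.round(mdpde(x, alpha_star, np.array([0.0, 0.0])), 3))
```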

7.
8.
ABSTRACT

We consider Pitman closeness to evaluate the performance of univariate and multivariate forecasting methods. Optimal weights for the combination of forecasts are calculated with respect to this criterion. These weights depend on the assumed distribution of the individual forecast errors. In the normal case they coincide with the optimal weights under the MSE criterion (univariate case) and under the MMSE criterion (multivariate case). Further, we present a simple example showing how the different combination techniques perform and how much the optimal multivariate combination can outperform the alternatives. Multivariate forecasts arise in practice, e.g., in econometrics, where forecasting institutes often estimate several economic variables.
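As a concrete anchor, here is a minimal sketch of the familiar MSE-optimal combination weights for unbiased forecasts, w = Σ⁻¹1 / (1'Σ⁻¹1), which the abstract notes coincide with the Pitman-closeness-optimal weights under normal errors. The covariance matrix below is a made-up example; nothing here reproduces the paper's multivariate MMSE derivation.

```python
import numpy as np

def optimal_combination_weights(error_cov):
    """MSE-optimal weights for combining unbiased forecasts:
    w = Sigma^{-1} 1 / (1' Sigma^{-1} 1), where Sigma is the forecast-error covariance.
    Under normal errors these coincide with the Pitman-closeness-optimal weights."""
    sigma_inv = np.linalg.inv(np.asarray(error_cov, dtype=float))
    ones = np.ones(sigma_inv.shape[0])
    w = sigma_inv @ ones
    return w / (ones @ w)

# hypothetical error covariance of three competing forecasts of the same variable
Sigma = np.array([[1.0, 0.3, 0.1],
                  [0.3, 2.0, 0.4],
                  [0.1, 0.4, 0.5]])
w = optimal_combination_weights(Sigma)
print("weights:", w, " sum =", w.sum())
print("combined error variance:", w @ Sigma @ w)   # never larger than the best single forecast
```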

9.
A segmented line regression model has been used to describe changes in cancer incidence and mortality trends [Kim, H.-J., Fay, M.P., Feuer, E.J. and Midthune, D.N., 2000, Permutation tests for joinpoint regression with applications to cancer rates. Statistics in Medicine, 19, 335–351; Kim, H.-J., Fay, M.P., Yu, B., Barrett, M.J. and Feuer, E.J., 2004, Comparability of segmented line regression models. Biometrics, 60, 1005–1014]. The least squares fit can be obtained either by the grid search method proposed by Lerman [Lerman, P.M., 1980, Fitting segmented regression models by grid search. Applied Statistics, 29, 77–84], which is implemented in Joinpoint 3.0 available at http://srab.cancer.gov/joinpoint/index.html, or by the continuous fitting algorithm proposed by Hudson [Hudson, D.J., 1966, Fitting segmented curves whose join points have to be estimated. Journal of the American Statistical Association, 61, 1097–1129], which will be implemented in the next version of the Joinpoint software. Following the least squares fitting of the model, inference on the parameters can be pursued using the asymptotic results of Hinkley [Hinkley, D.V., 1971, Inference in two-phase regression. Journal of the American Statistical Association, 66, 736–743] and Feder [Feder, P.I., 1975a, On asymptotic distribution theory in segmented regression problems – identified case. The Annals of Statistics, 3, 49–83; Feder, P.I., 1975b, The log likelihood ratio in segmented regression. The Annals of Statistics, 3, 84–97]. Via simulations, this paper empirically examines the small-sample behavior of these asymptotic results, studies how the two fitting methods, the grid search and Hudson's algorithm, affect these inferential procedures, and assesses the robustness of the asymptotic inferential procedures.
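To make the grid-search idea concrete, here is a minimal one-joinpoint least-squares fit in the spirit of Lerman's method. The Joinpoint software handles multiple joinpoints, permutation tests, and Hudson's continuous fitting, none of which are reproduced here; the grid limits and simulated data are arbitrary.

```python
import numpy as np

def grid_search_joinpoint(x, y, grid=None):
    """Least-squares fit of a one-joinpoint segmented line
    y = b0 + b1*x + b2*(x - tau)_+ by grid search over the joinpoint tau."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    if grid is None:
        grid = np.linspace(np.quantile(x, 0.1), np.quantile(x, 0.9), 200)
    best = None
    for tau in grid:
        X = np.column_stack([np.ones_like(x), x, np.clip(x - tau, 0, None)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        sse = np.sum((y - X @ beta) ** 2)
        if best is None or sse < best[0]:
            best = (sse, tau, beta)
    return best

rng = np.random.default_rng(3)
x = np.linspace(0, 10, 120)
y = 1.0 + 0.5 * x + 1.5 * np.clip(x - 6.0, 0, None) + rng.normal(0, 0.3, x.size)
sse, tau_hat, beta_hat = grid_search_joinpoint(x, y)
print("estimated joinpoint:", round(tau_hat, 2), " coefficients:", np.round(beta_hat, 2))
```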

10.
This article provides a novel test for predictability within a nonlinear smooth transition predictive regression (STPR) model where inference is complicated not only by the presence of persistent, local-to-unit-root predictors and endogeneity but also by the presence of unidentified parameters under the null of no predictability. To circumvent the unidentified-parameters problem, the t statistic for the predictor in the STPR model is optimized over the Cartesian product of the spaces for the transition and threshold parameters; and to address the difficulties due to persistent and endogenous predictors, the instrumental variable (IVX) method originally developed in the linear cointegration testing framework is adopted within the STPR model. The limit distribution of this statistic (i.e., the sup-tIVX test) is shown to be nuisance-parameter-free and robust to local-to-unit-root and endogenous regressors. Simulations show that sup-tIVX has good size and power properties. An application to stock return predictability reveals the presence of asymmetric regime dependence and variability in the strength and size of predictability across asset-related (e.g., dividend/price ratio) vs. other (e.g., default yield spread) predictors.

11.
The paper gives a review of a number of data models for aggregate statistical data which have appeared in the computer science literature in the last ten years. After a brief introduction to the data model in general, the fundamental concepts of statistical data are introduced. These are called statistical objects because they are complex data structures (vectors, matrices, relations, time series, etc.) which may have different possible representations (e.g. tables, relations, vectors, pie-charts, bar-charts, graphs, and so on). For this reason a statistical object is defined by two different types of attribute: a summary attribute, with its own summary type and its own instances (called summary data), and a set of category attributes, which describe the summary attribute. Some conceptual models of statistical data (CSM, SDM4S), some semantic models of statistical data (SCM, SAM*, OSAM*), and some graphical models of statistical data (SUBJECT, GRASS, STORM) are also discussed.
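A rough sketch of the common core of these models, expressed as a small data structure: one summary attribute (with its summary type and instances, the summary data) described by a set of category attributes. The class and field names are illustrative and do not come from any particular model (CSM, SDM4S, SCM, SAM*, OSAM*, etc.).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class CategoryAttribute:
    """One classification dimension describing the summary data (e.g., Year, Region)."""
    name: str
    values: List[str]

@dataclass
class StatisticalObject:
    """A statistical object: a summary attribute (with summary type and instances)
    plus the category attributes that describe it.  Illustrative structure only."""
    summary_name: str                      # e.g., "Unemployment rate"
    summary_type: str                      # e.g., "percentage"
    categories: List[CategoryAttribute]
    summary_data: Dict[Tuple[str, ...], float] = field(default_factory=dict)

    def cell(self, *category_values: str) -> float:
        """Look up one instance of the summary data by its category values."""
        return self.summary_data[category_values]

obj = StatisticalObject(
    summary_name="Unemployment rate", summary_type="percentage",
    categories=[CategoryAttribute("Year", ["2022", "2023"]),
                CategoryAttribute("Region", ["North", "South"])],
    summary_data={("2022", "North"): 4.1, ("2022", "South"): 5.3,
                  ("2023", "North"): 3.9, ("2023", "South"): 5.0})
print(obj.cell("2023", "South"))
```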

12.
ABSTRACT

P values linked to null hypothesis significance testing (NHST) are the most widely (mis)used method of statistical inference. Empirical data suggest that across the biomedical literature (1990–2015), when abstracts use P values, 96% of them report P values of 0.05 or less. The same percentage (96%) applies to full-text articles. Among 100 articles in PubMed, 55 report P values, while only 4 present confidence intervals for all the reported effect sizes, none use Bayesian methods, and none use false-discovery rates. Over 25 years (1990–2015), use of P values in abstracts has doubled for all of PubMed and tripled for meta-analyses, while for some types of designs, such as randomized trials, the majority of abstracts report P values. There is major selective reporting for P values. Abstracts tend to highlight the most favorable P values, and inferences use even further spin to reach exaggerated, unreliable conclusions. The availability of large-scale data on P values from many papers has allowed the development and application of methods that try to detect and model selection biases, for example, p-hacking, that cause patterns of excess significance. Inferences need to be cautious as they depend on the assumptions made by these models and can be affected by the presence of other biases (e.g., confounding in observational studies). While much of the unreliability of past and present research is driven by small, underpowered studies, NHST with P values may also be particularly problematic in the era of overpowered big data. NHST and P values are optimal only in a minority of current research. Using a more stringent threshold, as in the recently proposed shift from P < 0.05 to P < 0.005, is a temporizing measure to contain the flood and death-by-significance. NHST and P values may be replaced in many fields by other, more fit-for-purpose inferential methods. However, curtailing selection biases requires additional measures, beyond changes in inferential methods, in particular reproducible research practices.

13.
ABSTRACT

In this article, we assess the 31 articles published in Basic and Applied Social Psychology (BASP) in 2016, which is one full year after the BASP editors banned the use of inferential statistics. We discuss how the authors collected their data, how they reported and summarized their data, and how they used their data to reach conclusions. We found multiple instances of authors overstating conclusions beyond what the data would support if statistical significance had been considered. Readers would be largely unable to recognize this because the necessary information to do so was not readily available.

14.
Stochastic Models, 2013, 29(3): 469–496
We consider a single-commodity, discrete-time, multiperiod (s, S)-policy inventory model with backlog. The cost function may contain holding, shortage, and fixed ordering costs. Holding and shortage costs may be nonlinear. We show that the resulting inventory process is quasi-regenerative, i.e., admits a cycle decomposition, and we indicate how to estimate the performance by Monte Carlo simulation. By using a conditioning method, the push-out technique, and the change-of-measure method, estimates of the whole response surface (i.e., the steady-state performance as a function of the parameters s and S) and its derivatives can be found. Estimates of the optimal (s, S) policy can then be obtained by numerical optimization.
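For orientation, a crude Monte Carlo estimate of the long-run average cost per period of an (s, S) policy with backlogging is sketched below. It uses plain simulation with Poisson demand and made-up cost parameters; it does not implement the quasi-regenerative cycle decomposition, push-out, or change-of-measure estimators developed in the paper.

```python
import numpy as np

def simulate_sS_cost(s, S, n_periods=50_000, holding=1.0, shortage=5.0,
                     fixed_order=20.0, demand_mean=10.0, seed=0):
    """Estimate the long-run average cost per period of an (s, S) policy with
    backlogging and i.i.d. Poisson demand, by straightforward simulation."""
    rng = np.random.default_rng(seed)
    level = S                    # net inventory (negative values represent backlog)
    total_cost = 0.0
    for _ in range(n_periods):
        if level <= s:           # order up to S at the start of the period
            total_cost += fixed_order
            level = S
        level -= rng.poisson(demand_mean)
        total_cost += holding * max(level, 0) + shortage * max(-level, 0)
    return total_cost / n_periods

# compare a few policies on the (s, S) response surface
for s, S in [(5, 40), (10, 60), (20, 80)]:
    print(f"s={s:3d}, S={S:3d}: average cost per period = {simulate_sS_cost(s, S):.2f}")
```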

15.
The large nonparametric model in this note is a statistical model with the family ℱ of all continuous and strictly increasing distribution functions. In the abundant literature on the subject, there are many proposals for nonparametric quantile estimators that are applicable in this model. Typically the kth order statistic X_{k:n} is taken as the simplest estimator of the qth quantile, with k = [nq], or k = [(n + 1)q], or k = [nq] + 1, etc. Often a linear combination of two consecutive order statistics is considered. In more sophisticated constructions, different L-statistics (e.g., Harrell–Davis, Kaigh–Lachenbruch, Bernstein, kernel estimators) are proposed. Asymptotically the estimators do not differ substantially, but if the sample size n is fixed, which is the case of concern here, the differences may be serious. A unified treatment of quantile estimators in the large nonparametric statistical model is developed.
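The sketch below contrasts the simplest order-statistic estimator (with k = [nq] + 1) and the Harrell–Davis L-statistic for a small fixed sample, where the abstract notes the differences can be serious. The Beta-weight formula is the standard textbook form of the Harrell–Davis estimator; the sample and quantile level are arbitrary.

```python
import numpy as np
from scipy.stats import beta

def order_statistic_quantile(x, q):
    """Simple estimator: the k-th order statistic with k = floor(n*q) + 1."""
    xs = np.sort(np.asarray(x, float))
    k = int(np.floor(len(xs) * q)) + 1
    return xs[min(k, len(xs)) - 1]

def harrell_davis_quantile(x, q):
    """Harrell-Davis estimator: an L-statistic whose weights are increments of the
    Beta((n+1)q, (n+1)(1-q)) distribution function, spread over all order statistics."""
    xs = np.sort(np.asarray(x, float))
    n = len(xs)
    a, b = (n + 1) * q, (n + 1) * (1 - q)
    cdf = beta.cdf(np.arange(n + 1) / n, a, b)
    weights = np.diff(cdf)             # w_i = I_{i/n}(a, b) - I_{(i-1)/n}(a, b)
    return np.dot(weights, xs)

rng = np.random.default_rng(7)
x = rng.exponential(size=25)           # small fixed n, where the estimators differ
print("order statistic :", order_statistic_quantile(x, 0.9))
print("Harrell-Davis   :", harrell_davis_quantile(x, 0.9))
print("true 0.9 quantile of Exp(1):", -np.log(0.1))
```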

16.
Empirical Bayes estimation is considered for an i.i.d. sequence of binomial parameters θi arising from an unknown prior distribution G(·). This problem typically arises in industrial sampling, where samples from lots are routinely used to estimate the fraction defective of each lot. Two related issues are explored. The first concerns the fact that only the first few moments of G are typically estimable from the data. This suggests considering the interval of estimates (e.g., posterior means) corresponding to the different possible G with the specified moments. Such intervals can be obtained by application of well-known moment theory. The second development concerns the need to acknowledge the uncertainty in the estimation of the first few moments of G. Our proposal is to determine a credible set for the moments, and then find the range of estimates (e.g., posterior means) corresponding to the different possible G with moments in the credible set.
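As a simplified point of reference, the sketch below performs parametric empirical Bayes with a Beta prior matched to the estimated first two moments of G and reports the resulting posterior means. The abstract's proposal instead considers the whole range of posterior means over priors consistent with those (or credible) moments, which is not implemented here; the lot sizes and simulated data are invented.

```python
import numpy as np

def beta_binomial_eb_posterior_means(x, m):
    """Parametric empirical Bayes for lot fractions defective: match the first two
    moments of the unknown prior G with a Beta(a, b) distribution and return the
    resulting posterior means (a stand-in for the abstract's moment-interval approach)."""
    x = np.asarray(x, float)
    p = x / m                                    # observed lot fractions defective
    mu = p.mean()
    # moment equation: Var(p) = Var(theta)*(m-1)/m + mu*(1-mu)/m
    v = max((p.var(ddof=1) - mu * (1 - mu) / m) * m / (m - 1), 1e-8)
    common = mu * (1 - mu) / v - 1               # a + b from the Beta moment equations
    a, b = mu * common, (1 - mu) * common
    return (a + x) / (a + b + m), (a, b)

rng = np.random.default_rng(11)
theta = rng.beta(2, 38, size=30)                 # true (unknown) lot fractions defective
x = rng.binomial(50, theta)                      # 50 items inspected from each lot
post_means, (a, b) = beta_binomial_eb_posterior_means(x, m=50)
print("fitted Beta prior parameters:", round(a, 2), round(b, 2))
print("raw vs shrunken estimate, lot 0:", x[0] / 50, round(post_means[0], 4))
```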

17.
18.

Sufficient dimension reduction (SDR) provides a framework for reducing the predictor space dimension in statistical regression problems. We consider SDR in the context of dimension reduction for deterministic functions of several variables such as those arising in computer experiments. In this context, SDR can reveal low-dimensional ridge structure in functions. Two algorithms for SDR—sliced inverse regression (SIR) and sliced average variance estimation (SAVE)—approximate matrices of integrals using a sliced mapping of the response. We interpret this sliced approach as a Riemann sum approximation of the particular integrals arising in each algorithm. We employ the well-known tools from numerical analysis—namely, multivariate numerical integration and orthogonal polynomials—to produce new algorithms that improve upon the Riemann sum-based numerical integration in SIR and SAVE. We call the new algorithms Lanczos–Stieltjes inverse regression (LSIR) and Lanczos–Stieltjes average variance estimation (LSAVE) due to their connection with Stieltjes’ method—and Lanczos’ related discretization—for generating a sequence of polynomials that are orthogonal with respect to a given measure. We show that this approach approximates the desired integrals, and we study the behavior of LSIR and LSAVE with two numerical examples. The quadrature-based LSIR and LSAVE eliminate the first-order algebraic convergence rate bottleneck resulting from the Riemann sum approximation, thus enabling high-order numerical approximations of the integrals when appropriate. Moreover, LSIR and LSAVE perform as well as the best-case SIR and SAVE implementations (e.g., adaptive partitioning of the response space) when low-order numerical integration methods (e.g., simple Monte Carlo) are used.
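For context, here is plain sliced inverse regression, whose within-slice averaging is the Riemann-sum approximation the abstract refers to; it is not the quadrature-based LSIR/LSAVE algorithm, and the test function, slice count, and variable names are illustrative.

```python
import numpy as np

def sliced_inverse_regression(x, y, n_slices=10, n_dirs=1):
    """Basic SIR: standardize X, slice the range of y, average the standardized
    predictors within each slice (a Riemann-sum-like approximation of E[X | y]),
    and take leading eigenvectors of the weighted covariance of the slice means."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    n, d = x.shape
    mu, cov = x.mean(0), np.cov(x, rowvar=False)
    # whitening transform
    evals, evecs = np.linalg.eigh(cov)
    cov_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    z = (x - mu) @ cov_inv_sqrt
    # slice the response into groups of roughly equal size
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)
    M = np.zeros((d, d))
    for idx in slices:
        zbar = z[idx].mean(0)
        M += (len(idx) / n) * np.outer(zbar, zbar)
    w, v = np.linalg.eigh(M)
    dirs = cov_inv_sqrt @ v[:, ::-1][:, :n_dirs]   # back-transform leading directions
    return dirs / np.linalg.norm(dirs, axis=0)

rng = np.random.default_rng(5)
X = rng.standard_normal((2000, 5))
ridge_dir = np.array([1.0, 2.0, 0.0, 0.0, 0.0]) / np.sqrt(5.0)
f = (X @ ridge_dir) ** 3                           # deterministic ridge function
print(np.round(sliced_inverse_regression(X, f, n_slices=20), 3).ravel())
```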


19.
ABSTRACT

A number of tests have been proposed for assessing the location-scale assumption that is often invoked by practitioners. Existing approaches include Kolmogorov–Smirnov and Cramér–von Mises statistics that each involve measures of divergence between unknown joint distribution functions and products of marginal distributions. In practice, the unknown distribution functions embedded in these statistics are typically approximated using nonsmooth empirical distribution functions (EDFs). In a recent article, Li, Li, and Racine establish the benefits of smoothing the EDF for inference, though their theoretical results are limited to the case where the covariates are observed and the distributions unobserved, while in the current setting some covariates and their distributions are unobserved (i.e., the test relies on population error terms from a location-scale model), which necessarily requires a separate theoretical approach. We demonstrate how replacing the nonsmooth distributions of unobservables with their kernel-smoothed sample counterparts can lead to substantial power improvements, and we extend existing approaches to the smooth multivariate and mixed continuous and discrete data setting in the presence of unobservables. Theoretical underpinnings are provided, Monte Carlo simulations are undertaken to assess finite-sample performance, and illustrative applications are provided.
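A minimal sketch of the smoothing step itself: the nonsmooth EDF versus a kernel-smoothed EDF built from integrated Gaussian kernels. The bandwidth rule below is a simple ad hoc plug-in, not the choice analyzed by Li, Li, and Racine or used in this article, and the test statistics themselves are not implemented.

```python
import numpy as np
from scipy.stats import norm

def edf(x_grid, sample):
    """Nonsmooth empirical distribution function evaluated on a grid."""
    s = np.sort(np.asarray(sample, float))
    return np.searchsorted(s, x_grid, side="right") / len(s)

def smoothed_edf(x_grid, sample, bandwidth=None):
    """Kernel-smoothed EDF: each step of the EDF is replaced by an integrated
    Gaussian kernel.  The default bandwidth is an ad hoc n^(-1/3) plug-in."""
    s = np.asarray(sample, float)
    h = bandwidth if bandwidth is not None else 1.06 * s.std(ddof=1) * len(s) ** (-1 / 3)
    return norm.cdf((x_grid[:, None] - s[None, :]) / h).mean(axis=1)

rng = np.random.default_rng(4)
sample = rng.standard_normal(50)
grid = np.linspace(-3, 3, 7)
print("EDF         :", np.round(edf(grid, sample), 3))
print("smoothed EDF:", np.round(smoothed_edf(grid, sample), 3))
```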

20.
This article examines some improperly stated but often used textbook probability problems. Moving from a probabilistic to a statistical setting provides insight into group testing (i.e., observing only whether one or more of a group responds and not the response of each individual). Exact methods are used to construct tables showing (i) that group testing n times to estimate p can be more efficient than n individual tests even for small n and large p, (ii) optimal grouping strategies for various (n, p) combinations, and (iii) the efficiencies and biases achieved.
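To illustrate the group-testing estimator behind such tables, the sketch below uses the standard MLE, p_hat = 1 - (proportion of negative groups)^(1/k), and compares its simulated MSE with that of the usual proportion estimate from the same number of individual tests. The particular (n, p, k) values are arbitrary and the article's exact-method tables are not reproduced.

```python
import numpy as np

def group_testing_mle(n_groups, k, n_negative):
    """MLE of the individual positive rate p when only group results are observed:
    a group of size k tests negative with probability (1 - p)^k, so
    p_hat = 1 - (n_negative / n_groups)^(1/k)."""
    return 1.0 - (n_negative / n_groups) ** (1.0 / k)

def simulated_relative_efficiency(p, k, n_groups, n_rep=20_000, seed=0):
    """Compare, by simulation, the MSE of the group-testing MLE (n_groups group tests)
    with the MSE of the usual proportion estimate from n_groups individual tests."""
    rng = np.random.default_rng(seed)
    neg = rng.binomial(n_groups, (1 - p) ** k, size=n_rep)
    group_est = 1.0 - (neg / n_groups) ** (1.0 / k)
    indiv_est = rng.binomial(n_groups, p, size=n_rep) / n_groups
    mse_group = np.mean((group_est - p) ** 2)
    mse_indiv = np.mean((indiv_est - p) ** 2)
    return mse_indiv / mse_group      # > 1 means group testing is more efficient

print("RE at p=0.05, k=5, n=20:", round(simulated_relative_efficiency(0.05, 5, 20), 2))
print("RE at p=0.30, k=5, n=20:", round(simulated_relative_efficiency(0.30, 5, 20), 2))
```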
