Similar Articles
20 similar articles found (search time: 31 ms)
1.
ABSTRACT

We discuss problems the null hypothesis significance testing (NHST) paradigm poses for replication and more broadly in the biomedical and social sciences as well as how these problems remain unresolved by proposals involving modified p-value thresholds, confidence intervals, and Bayes factors. We then discuss our own proposal, which is to abandon statistical significance. We recommend dropping the NHST paradigm—and the p-value thresholds intrinsic to it—as the default statistical paradigm for research, publication, and discovery in the biomedical and social sciences. Specifically, we propose that the p-value be demoted from its threshold screening role and instead, treated continuously, be considered along with currently subordinate factors (e.g., related prior evidence, plausibility of mechanism, study design and data quality, real-world costs and benefits, novelty of finding, and other factors that vary by research domain) as just one among many pieces of evidence. We have no desire to “ban” p-values or other purely statistical measures. Rather, we believe that such measures should not be thresholded and that, thresholded or not, they should not take priority over the currently subordinate factors. We also argue that it seldom makes sense to calibrate evidence as a function of p-values or other purely statistical measures. We offer recommendations for how our proposal can be implemented in the scientific publication process as well as in statistical decision making more broadly.

2.
3.
ABSTRACT

Efforts to address a reproducibility crisis have generated several valid proposals for improving the quality of scientific research. We argue there is also a need to address the separate but related issues of relevance and responsiveness. To address relevance, researchers must produce what decision makers actually need to inform investments and public policy—that is, the probability that a claim is true or the probability distribution of an effect size given the data. The term responsiveness refers to the irregularity and delay with which issues about the quality of research are brought to light. Instead of relying on the good fortune that some motivated researchers will periodically conduct efforts to reveal potential shortcomings of published research, we could establish a continuous quality-control process for scientific research itself. Quality metrics could be designed through the application of this statistical process control for the research enterprise. We argue that one quality-control metric—the probability that a research hypothesis is true—is required to address at least relevance and may also be part of the solution for improving responsiveness and reproducibility. This article proposes a “straw man” solution which could be the basis of implementing these improvements. As part of this solution, we propose one way to “bootstrap” priors. The processes required for improving reproducibility and relevance can also be part of a comprehensive statistical quality control for science itself, providing continuously monitored metrics on the scientific performance of a field of research.
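The quality-control metric singled out above, the probability that a research hypothesis is true, can be updated from a prior probability via Bayes' rule. The following is a minimal sketch with illustrative numbers; the function name and the example values are hypothetical, not the authors' procedure for "bootstrapping" priors:

```python
def posterior_prob_true(prior_prob, bayes_factor):
    """Posterior probability that a hypothesis is true, given a prior
    probability and a Bayes factor (how much more likely the data are
    under H1 than under H0). Illustrative sketch only."""
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1.0 + posterior_odds)

# Hypothetical field where 1 in 10 tested hypotheses is true, and a
# study whose data are 6x more likely under H1 than H0:
p = posterior_prob_true(0.10, 6.0)  # 0.4
```

Even a strong-sounding Bayes factor of 6 leaves the claim more likely false than true when the base rate of true hypotheses in the field is low, which is why such a metric speaks to relevance in a way a bare p-value does not.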

4.
This article reviews Albert Einstein's first published paper, submitted for publication in 1900. At that time, Einstein was 21 and a recent college graduate. His paper uses modeling and least squares to analyze data in support of a scientific proposition. Einstein is shown to be well trained, for his day, in using statistics as a tool in his scientific research. This paper also shows his ability to make trivial arithmetic mistakes and some clumsiness in data recording. A major aim of this article is to help provide a better appreciation of Einstein as an active user of statistical arguments in this and others of his important publications.

5.
Secretary's Note: At the meeting of the incoming Board and Council of December 29, 1952, the Committee on Committees submitted for consideration an outline regarding the tasks that should be undertaken by the Commission on Statistical Standards and Organization. The Board authorized President Cochran to appoint an ad hoc committee to explore the functions as outlined by the Committee on Committees. The ad hoc committee, under the chairmanship of Rensis Likert, Institute for Social Research, University of Michigan, has prepared an excellent report on this subject which should be the concern of all members of the Association.

The whole problem of standards in the statistical profession is a great and difficult one. The Board has authorized the publication of the ad hoc committee's report in The American Statistician in the hope that the members of the Association will submit their views regarding both the Report and the problem of standards. The entire ASA membership is invited to forward their thoughts and comments to the Office of the Secretary of the Association for possible publication in The American Statistician.

6.
7.
8.
Abstract

Collection development policies in small academic libraries may lack a formal policy statement about print periodical holdings retention. However, there is a need for a distinct policy about print periodicals holdings and their retention. Periodicals collections at academic libraries have been greatly affected by publishers’ decisions to discontinue print journal formats and move to online-only electronic versions. The move from one format to another produces challenges to the retention of an effective print periodicals collection. Given these continuous changes in publication format, it is necessary for academic libraries to rethink their print periodicals holdings retention. This article will present a literature review on and case study of periodicals collection management and explore strategies for developing holdings policies and guidelines for retention. It will argue that collection development policies ought to include a separate policy for the print periodicals collection and that, contrary to their reputation for being time-consuming and inflexible, periodical retention policies can add flexibility and guide decision making, helping to preserve core titles and acquire new titles that support academic programs and the work of the college community.

9.
A two-stage group acceptance sampling plan based on a truncated life test is proposed, which can be used regardless of the underlying lifetime distribution when multi-item testers are employed. The decision upon lot acceptance can be made in the first or second stage according to the number of failures from each group. The design parameters of the proposed plan such as number of groups required and the acceptance number for each of two stages are determined independently of an underlying lifetime distribution so as to satisfy the consumer's risk at the specified unreliability. Single-stage group sampling plans are also considered as special cases of the proposed plan and compared with the proposed plan in terms of the average sample number and the operating characteristics. Some important distributions are considered to explain the procedure developed here.
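The two-stage decision logic can be sketched as follows. The acceptance numbers and failure counts here are hypothetical placeholders; in the paper the design parameters are derived to satisfy the consumer's risk at a specified unreliability:

```python
def two_stage_group_decision(failures_stage1, failures_stage2,
                             c1_accept, c1_reject, c2_accept):
    """Decision rule of a generic two-stage group acceptance sampling
    plan (a sketch; parameter names are illustrative, not the paper's).
    failures_stage1/2 are failure counts per group from truncated
    life tests on multi-item testers.
    Stage 1: accept if total failures <= c1_accept,
             reject if total failures > c1_reject,
             otherwise test the second-stage groups.
    Stage 2: accept if cumulative failures <= c2_accept, else reject."""
    total1 = sum(failures_stage1)
    if total1 <= c1_accept:
        return "accept"
    if total1 > c1_reject:
        return "reject"
    total2 = total1 + sum(failures_stage2)
    return "accept" if total2 <= c2_accept else "reject"

# One failure across the first-stage groups forces a second stage;
# the cumulative count of 2 then just meets the stage-2 limit:
decision = two_stage_group_decision([0, 1], [1, 0],
                                    c1_accept=0, c1_reject=2, c2_accept=2)
```

The single-stage plan mentioned in the abstract is the special case where the stage-1 accept and reject numbers meet (`c1_reject == c1_accept`), so no lot ever reaches a second stage.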

10.
11.
ABSTRACT

Such is the grip of formal methods of statistical inference—that is, frequentist methods for generalizing from sample to population in enumerative studies—in the drawing of scientific inferences that the two are routinely deemed equivalent in the social, management, and biomedical sciences. This, despite the fact that legitimate employment of said methods is difficult to implement on practical grounds alone. But supposing the adoption of these procedures were simple does not get us far; crucially, methods of formal statistical inference are ill-suited to the analysis of much scientific data. Even findings from the claimed gold standard for examination by the latter, randomized controlled trials, can be problematic.

Scientific inference is a far broader concept than statistical inference. Its authority derives from the accumulation, over an extensive period of time, of both theoretical and empirical knowledge that has won the (provisional) acceptance of the scholarly community. A major focus of scientific inference can be viewed as the pursuit of significant sameness, meaning replicable and empirically generalizable results among phenomena. Regrettably, the obsession of users of statistical inference with reporting significant differences in data sets actively thwarts cumulative knowledge development.

The manifold problems surrounding the implementation and usefulness of formal methods of statistical inference in advancing science do not speak well of much teaching in methods/statistics classes. Serious reflection on statistics' role in producing viable knowledge is needed. Commendably, the American Statistical Association is committed to addressing this challenge, as further witnessed in this special online, open access issue of The American Statistician.

12.
Abstract

A central objective of empirical research on treatment response is to inform treatment choice. Unfortunately, researchers commonly use concepts of statistical inference whose foundations are distant from the problem of treatment choice. It has been particularly common to use hypothesis tests to compare treatments. Wald’s development of statistical decision theory provides a coherent frequentist framework for use of sample data on treatment response to make treatment decisions. A body of recent research applies statistical decision theory to characterize uniformly satisfactory treatment choices, in the sense of minimizing maximum loss relative to optimal decisions (the minimax-regret criterion). This article describes the basic ideas and findings, which provide an appealing practical alternative to use of hypothesis tests. For simplicity, the article focuses on medical treatment with evidence from classical randomized clinical trials. The ideas apply generally, encompassing use of observational data and treatment choice in nonmedical contexts.
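The regret idea can be illustrated in its simplest form: choose the fraction of a population assigned to treatment B so that the worst-case loss relative to an oracle who knows the true mean outcomes is smallest. This sketch omits the trial data entirely and searches a grid over hypothetical candidate states of nature; it illustrates the criterion, not Wald's full framework:

```python
def max_regret(delta, states):
    """Maximum regret of assigning a fraction `delta` of the population
    to treatment B, over candidate states (mu_A, mu_B) of true mean
    outcomes. Regret in a state = best achievable mean outcome minus
    the mean outcome delivered by the rule."""
    return max(
        max(mu_a, mu_b) - (delta * mu_b + (1 - delta) * mu_a)
        for mu_a, mu_b in states
    )

def minimax_regret_allocation(states, grid_size=101):
    """Allocation fraction minimizing maximum regret (grid search)."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return min(grid, key=lambda d: max_regret(d, states))

# Two symmetric candidate states: either B beats A by 1, or A beats B
# by 1. A hypothesis test would pick one treatment outright; minimax
# regret instead splits the population evenly:
states = [(0.0, 1.0), (1.0, 0.0)]
best = minimax_regret_allocation(states)  # 0.5
```

Unlike an accept/reject test, the criterion yields a graded decision whose worst-case shortfall is controlled in every candidate state, which is why it maps more directly onto treatment choice.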

13.
14.
The study of television audience viewing behavior is very important: the results give broadcasters and advertisers useful information for increasing the effectiveness of television programming and advertising. Based on hazard rate analysis for a survival model, this research develops a new statistical model to fit the diffusion pattern of TV programs, a measure of a program's overall popularity that is used as a criterion for selling television time. The model helps decision makers at the networks better understand the acceptance of a show and the underlying behavioral patterns of its viewers. It fits empirical data from Hong Kong very well and outperforms existing models. The basic model is then extended to a proportional hazards model to study covariate effects on the likelihood of an individual watching the program at an earlier stage. Advertisers can benefit from these results in targeting their desired customers.
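A minimal illustration of the hazard-rate framing: estimate the discrete-time hazard of first viewing from (hypothetical) first-viewing weeks, then convert it into a cumulative diffusion curve. This is a sketch of the general survival-analysis approach, not the paper's fitted model:

```python
def discrete_hazard(first_view_weeks, horizon):
    """Empirical discrete-time hazard of first viewing: among viewers
    who have not yet watched before week t, the fraction who first
    watch in week t. Viewers who never watch are coded as None and
    remain in the risk set. Illustrative sketch only."""
    hazards = []
    at_risk = len(first_view_weeks)
    for t in range(1, horizon + 1):
        events = sum(1 for w in first_view_weeks if w == t)
        hazards.append(events / at_risk if at_risk else 0.0)
        at_risk -= events
    return hazards

def diffusion_curve(hazards):
    """Cumulative share of the audience reached by each week."""
    survival, curve = 1.0, []
    for h in hazards:
        survival *= (1.0 - h)
        curve.append(1.0 - survival)
    return curve

# Five hypothetical viewers; four first watch in weeks 1-3, one never does:
h = discrete_hazard([1, 1, 2, 3, None], horizon=3)
reach = diffusion_curve(h)  # 4 of 5 reached by week 3
```

The proportional hazards extension mentioned in the abstract would multiply each week's baseline hazard by a factor depending on viewer covariates, shifting individuals toward earlier or later first viewing.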

15.
ABSTRACT

Scientific research of all kinds should be guided by statistical thinking: in the design and conduct of the study, in the disciplined exploration and enlightened display of the data, and to avoid statistical pitfalls in the interpretation of the results. However, formal, probability-based statistical inference should play no role in most scientific research, which is inherently exploratory, requiring flexible methods of analysis that inherently risk overfitting. The nature of exploratory work is that data are used to help guide model choice, and under these circumstances, uncertainty cannot be precisely quantified, because of the inevitable model selection bias that results. To be valid, statistical inference should be restricted to situations where the study design and analysis plan are specified prior to data collection. Exploratory data analysis provides the flexibility needed for most other situations, including statistical methods that are regularized, robust, or nonparametric. Of course, no individual statistical analysis should be considered sufficient to establish scientific validity: research requires many sets of data along many lines of evidence, with a watchfulness for systematic error. Replicating and predicting findings in new data and new settings is a stronger way of validating claims than blessing results from an isolated study with statistical inferences.

16.
Revision of economic statistics is an important part of official statistical work: revisions affect the decisions of data users, so scientific and standardized revision procedures enhance the authority and credibility of official statistics. This article summarizes the International Monetary Fund's four types of statistical data revisions; discusses, from the data user's perspective, the timeliness, accuracy, consistency, accessibility, and binding nature of revision work; and draws on the revision practices and historical experience of Western developed countries, particularly the experience of revising U.S. gross domestic product, to provide lessons for standardizing the revision of China's economic statistics.

17.
18.
The European Federation of Statisticians in the Pharmaceutical Industry (EFSPI) believes access to clinical trial data should be implemented in a way that supports good research, avoids misuse of such data, lies within the scope of the original informed consent and fully protects patient confidentiality. In principle, EFSPI supports responsible data sharing. EFSPI acknowledges it is in the interest of patients that their data are handled in a strictly confidential manner to avoid misuse under all possible circumstances. It is also in the interest of the altruistic nature of patients participating in trials that such data will be used for further development of science as much as possible applying good statistical principles. This paper summarises EFSPI's position on access to clinical trial data. The position was developed during the European Medicines Agency (EMA) advisory process and before the draft EMA policy on publication and access to clinical trial data was released for consultation; however, the EFSPI's position remains unchanged following the release of the draft policy. Finally, EFSPI supports a need for further guidance to be provided on important technical aspects relating to re-analyses and additional analyses of clinical trial data, for example, multiplicity, meta-analysis, subgroup analyses and publication bias. Copyright © 2013 John Wiley & Sons, Ltd.

19.
A methodology is developed for estimating consumer acceptance limits on a sensory attribute of a manufactured product. In concept these limits are analogous to engineering tolerances. The method is based on a generalization of Stevens' Power Law. This generalized law is expressed as a nonlinear statistical model. Instead of restricting the analysis to this particular case, a strategy is discussed for evaluating nonlinear models in general, since scientific models are frequently of nonlinear form. The strategy focuses on understanding the geometrical contrasts between linear and nonlinear model estimation and assessing the bias in estimation and the departures from a Gaussian sampling distribution. Computer simulation is employed to examine the behavior of nonlinear least squares estimation. In addition to the usual Gaussian assumption, a bootstrap sample-reuse procedure and a general triangular distribution are introduced for evaluating the effects of a non-Gaussian or asymmetrical error structure. Recommendations are given for further model analysis based on the simulation results. In the case of a model for which estimation bias is not a serious issue, estimating functions of the model are considered. Application of these functions to the generalization of Stevens' Power Law leads to a means for defining and estimating consumer acceptance limits. The statistical form of the law and the model evaluation strategy are applied to consumer research data. Estimation of consumer acceptance limits is illustrated and discussed.
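A rough sketch of the estimation ideas: Stevens' power law R = k·S^n can be fit by least squares after log-linearization (a simplification of the nonlinear estimation the article studies), and a bootstrap sample-reuse procedure can probe the exponent's sampling variability. The data and implementation details below are illustrative, not the article's:

```python
import math
import random

def fit_power_law(stimulus, response):
    """Fit Stevens' power law R = k * S^n by ordinary least squares on
    logs (log R = log k + n log S). A simplification: the article works
    with the nonlinear form directly."""
    xs = [math.log(s) for s in stimulus]
    ys = [math.log(r) for r in response]
    m = len(xs)
    mx, my = sum(xs) / m, sum(ys) / m
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    k = math.exp(my - slope * mx)
    return k, slope  # (scale k, exponent n)

def bootstrap_exponent(stimulus, response, reps=200, seed=0):
    """Bootstrap sample-reuse estimates of the exponent, for examining
    its sampling distribution without the Gaussian assumption."""
    rng = random.Random(seed)
    pairs = list(zip(stimulus, response))
    estimates = []
    for _ in range(reps):
        sample = [rng.choice(pairs) for _ in pairs]
        s, r = zip(*sample)
        if len(set(s)) > 1:  # skip degenerate resamples with no spread
            estimates.append(fit_power_law(s, r)[1])
    return estimates

# Hypothetical sensory data generated exactly by R = 2 * S^0.5:
k, n = fit_power_law([1, 4, 9, 16], [2, 4, 6, 8])  # k = 2, n = 0.5
```

An acceptance limit would then be obtained by inverting the fitted law: the stimulus level S at which the predicted response k·S^n crosses a chosen response threshold.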

20.
Monitoring interim accumulating data in a clinical trial for evidence of therapeutic benefit or toxicity is a frequent policy, usually carried out by an independent scientific committee. While statistical methodology has been developed to assess the significance of these interim analyses, such methods should not be viewed as absolute rules but only serve as useful guides. The decision process to terminate a trial early is very complex and many factors must be taken into account. The complexity of this decision process is illustrated by reviewing the experience of several recent clinical trials.
