Time-sharing computer configurations have introduced a new dimension in applying statistical and mathematical models to sequential decision problems. When the outcome of one step in the process influences subsequent decisions, then an interactive time-sharing system is of great help. Since the forecasting function involves such a sequential process, it can be handled particularly well with an appropriate time-shared computer system. This paper describes such a system, which allows the user to do preliminary analysis of his data to identify the forecasting technique or class of techniques most appropriate for his situation and to apply those in developing a forecast. This interactive forecasting system has met with excellent success both in teaching the fundamentals of forecasting for business decision making and in actually applying those techniques in management situations.

In this paper, we propose a new Bayesian inference approach for classification based on the traditional hinge loss used for classical support vector machines, which we call the Bayesian Additive Machine (BAM). Unlike existing approaches, the new model has a semiparametric discriminant function where some feature effects are nonlinear and others are linear. This separation of features is achieved automatically during model fitting without user pre-specification. Following the literature on sparse regression of high-dimensional models, we can also identify the irrelevant features. By introducing spike-and-slab priors using two sets of indicator variables, these multiple goals are achieved simultaneously and automatically, without any parameter tuning such as cross-validation. An efficient partially collapsed Markov chain Monte Carlo algorithm is developed for posterior exploration based on a data augmentation scheme for the hinge loss. Our simulations and three real data examples demonstrate that the new approach is a strong competitor to some approaches that were proposed recently for dealing with challenging classification examples with high dimensionality.
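As a reminder of the hinge loss that the abstract above builds on, here is a minimal sketch of the classical support vector machine loss only. It is not the authors' Bayesian Additive Machine; the function names `hinge_loss` and `mean_hinge_loss` are illustrative assumptions, and labels are assumed to be coded as -1 or +1.

```python
def hinge_loss(label, score):
    """Classical hinge loss for one observation.

    label: class label, assumed to be -1 or +1.
    score: real-valued output of a discriminant function f(x).
    The loss is zero when the point is classified correctly with margin >= 1.
    """
    return max(0.0, 1.0 - label * score)


def mean_hinge_loss(labels, scores):
    """Average hinge loss over a sample (illustrative helper)."""
    return sum(hinge_loss(y, s) for y, s in zip(labels, scores)) / len(labels)
```

A correctly classified point with a large margin contributes no loss, while a point on the wrong side of the boundary is penalized linearly in its score.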

The Consistent System (CS) is an interactive computer system for researchers in the behavioral and policy sciences and in fields with similar requirements for data management and statistical analysis. The researcher is not expected to be a programmer. The system offers a wide range of facilities and permits the user to combine them in novel ways. In particular, tools for statistical analysis may be used in combination with a powerful relational subsystem for data base management. This paper gives an overview of the objectives, capabilities, status, and availability of the system.

Marron, J. S.; Udina, F. 《Statistics and Computing》1999,9(2):101-110
A tool for user choice of the local bandwidth function for kernel density and nonparametric regression estimates is developed using KDE, a graphical object-oriented package for interactive kernel density estimation written in LISP-STAT. The bandwidth function is a parameterized spline, whose knots are manipulated by the user in one window, while the resulting estimate appears in another window. A real data illustration of this method raises concerns, because an extremely large family of estimates is available. Suggestions are made to overcome this problem so that this tool can be used effectively for presenting final results of a data analysis.

Statistical database management systems keep raw, elementary and/or aggregated data and include query languages with facilities to calculate various statistics from this data. In this article we examine statistical database query languages with respect to the criteria identified and taxonomy developed in Ozsoyoglu and Ozsoyoglu (1985b). The criteria include statistical metadata and objects, aggregation features and interface to statistical packages. The taxonomy of statistical database query languages classifies them with respect to the data model used, the type of user interface and method of implementation. Temporal databases are rich sources of data for statistical analysis. Aggregation features of temporal query languages, as well as the issues in calculating aggregates from temporal data, are also examined.

This continuing education course for professionals involved in all areas of clinical trials integrates concepts related to the role of randomization in the scientific process. The course includes two interactive lecture and discussion sections and a workshop practicum. The first interactive lecture introduces basic clinical trial issues and statistical principles such as bias, blinding, randomization, control groups, and the importance of formulating clear and discriminating clinical and statistical hypotheses. It then focuses on the most commonly used clinical study designs and the corresponding patient randomization schemes. The second interactive lecture focuses on the implementation of randomization of patients and drug supply through allocation and component ID schedules. The workshop practicum, conducted in small groups, enables students to apply the lecture concepts to real clinical studies. Flexibility was built into the workshop practicum materials to allow the course content to be customized to specific audiences, and the interactive lecture sessions can be stretched to cover more advanced topics according to class interest and time availability.

Les Hawkins 《Serials Review》2015,41(2):106-107
The implementation of the semantic web based on linked data principles represents a movement toward transforming the World Wide Web into a database of linked resources, where data may be widely reused and shared. Web services can be enhanced by drawing on semantically aware data made available by a variety of providers. The Bibliographic Framework (BIBFRAME) Initiative seeks to expose rich library metadata to the semantic web, configured so that semantic data from other sources can be incorporated in meeting library user needs.

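Measuring users' subjective perception of statistical data quality through user satisfaction surveys provides an important information channel for assessing and controlling statistical data quality. Based on an analysis of the social background to, and the necessity of, user satisfaction surveys conducted by government statistical agencies, and drawing on the practical experience of the European Statistical System's user satisfaction surveys, which started early and are among the more representative examples of international practice in this field, this paper discusses the key practical points for Chinese government statistical agencies in implementing such surveys, including institutional safeguards, goal setting and content design, and survey organization and implementation.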

Bayesian model learning based on a parallel MCMC strategy
We introduce a novel Markov chain Monte Carlo algorithm for estimation of posterior probabilities over discrete model spaces. Our learning approach is applicable to families of models for which the marginal likelihood can be analytically calculated, either exactly or approximately, given any fixed structure. It is argued that for certain model neighborhood structures, the ordinary reversible Metropolis-Hastings algorithm does not yield an appropriate solution to the estimation problem. Therefore, we develop an alternative, non-reversible algorithm which can avoid the scaling effect of the neighborhood. To efficiently explore a model space, a finite number of interacting parallel stochastic processes is utilized. Our interaction scheme enables exploration of several local neighborhoods of a model space simultaneously, while it prevents the absorption of any particular process to a relatively inferior state. We illustrate the advantages of our method by an application to a classification model. In particular, we use an extensive bacterial database and compare our results with results obtained by different methods for the same data.

Summary. Multilevel or mixed effects models are commonly applied to hierarchical data. The level 2 residuals, which are otherwise known as random effects, are often of both substantive and diagnostic interest. Substantively, they are frequently used for institutional comparisons or rankings. Diagnostically, they are used to assess the model assumptions at the group level. Inference on the level 2 residuals, however, typically does not account for 'data snooping', i.e. for the harmful effects of carrying out a multitude of hypothesis tests at the same time. We provide a very general framework that encompasses both of the following inference problems: inference on the 'absolute' level 2 residuals to determine which are significantly different from 0, and inference on any prespecified number of pairwise comparisons. Thus, the user has the choice of testing the comparisons of interest. As our methods are flexible with respect to the estimation method that is invoked, the user may choose the desired estimation method accordingly. We demonstrate the methods with the London education authority data, the wafer data and the National Educational Longitudinal Study data.

We propose a hierarchical Bayesian model for analyzing gene expression data to identify pathways differentiating between two biological states (e.g., cancer vs. non-cancer and mutant vs. normal). Finding significant pathways can improve our understanding of biological processes. When the biological process of interest is related to a specific disease, eliciting a better understanding of the underlying pathways can lead to designing a more effective treatment. We apply our method to data obtained by interrogating the mutational status of p53 in 50 cancer cell lines (33 mutated and 17 normal). We identify several significant pathways with strong biological connections. We show that our approach provides a natural framework for incorporating prior biological information, and it has the best overall performance in terms of correctly identifying significant pathways compared to several alternative methods.

Statisticians fall far short of their potential as guides to enlightened decision making in business. Two important explanations are: (1) Decision makers are often more easily convinced by concrete examples, however fragmentary and misleading, than by competent statistical analysis. (2) The effective use of statistics in the process of decision making requires hard thinking by decision makers, thinking that cannot be delegated entirely to the statistical specialist. Modern developments in interactive statistical computing may help to reduce the force of these limitations on exploitation of statistics; used properly, computing can encourage, almost force, the student or business user of statistics to think statistically.

Patient flow modeling is a growing field of interest in health services research. Several techniques have been applied to model movement of patients within and between health-care facilities. However, individual patient experience during the delivery of care has always been overlooked. In this work, a random effects model is introduced to patient flow modeling and applied to data from a London hospital neonatal unit. In particular, a random effects multinomial logit model is used to capture individual patient trajectories in the process of care with patient frailties modeled as random effects. Intuitively, both operational and clinical patient flow are modeled, the former being physical and the latter latent. Two variants of the model are proposed, one based on mere patient pathways and the other based on patient characteristics. Our technique could identify interesting pathways such as those that result in high probability of death (survival), pathways incurring the least (highest) cost of care or pathways with the least (highest) length of stay. Patient-specific discharge probabilities from the health care system could also be predicted. These are of interest to health-care managers in planning the scarce resources needed to run health-care institutions.

This article reports the findings of a study of licensed database usage among libraries in the NC LIVE consortium. Researchers developed North Carolina-based library peer groups in order to build context for libraries’ usage data reports and to identify benchmarks and trends across those libraries that are top performers within each group. Additionally, researchers examined the use of selected databases across multiple library types to determine whether certain library characteristics or activities are related to database use. Researchers found that a number of library characteristics and activities predict database use, but the results vary depending upon the type of library and the database studied.

Respondent-driven sampling (RDS) is a link-tracing network sampling strategy for collecting data from hard-to-reach populations, such as injection drug users or individuals at high risk of being infected with HIV. The mechanism is to find initial participants (seeds), and give each of them a fixed number of coupons allowing them to recruit people they know from the population of interest, with a mutual financial incentive. The new participants are again given coupons and the process repeats. Currently, the standard RDS estimator used in practice is known as the Volz–Heckathorn (VH) estimator. It relies on strong assumptions about the underlying social network and the RDS process. Via simulation, we study the relative performance of the plain mean and VH estimators when assumptions of the latter are not satisfied, under different network types (including homophily and rich-get-richer networks), participant referral patterns, and varying numbers of coupons. The analysis demonstrates that the plain mean outperforms the VH estimator in many but not all of the simulated settings, including homophily networks. Also, we highlight the implications of multiple recruitment and varying referral patterns on the depth of the RDS process. We develop interactive visualizations of the findings and the RDS process to further build insight into the various factors contributing to the performance of current RDS estimation techniques.
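To make the two estimators compared in the abstract above concrete, here is a minimal sketch. It assumes the usual form of the Volz–Heckathorn estimator as an inverse-degree-weighted mean, where each respondent reports a network degree; the function names and the toy data are hypothetical and are not taken from the paper.

```python
def plain_mean(values):
    """Unweighted sample mean of the observed outcomes."""
    return sum(values) / len(values)


def vh_estimate(values, degrees):
    """Inverse-degree-weighted mean (assumed form of the VH estimator).

    values:  observed outcome for each respondent (e.g. 0/1 indicator).
    degrees: self-reported network degree for each respondent (must be > 0).
    Respondents with many contacts are more likely to be recruited, so each
    observation is down-weighted by 1 / degree.
    """
    if len(values) != len(degrees):
        raise ValueError("values and degrees must have the same length")
    if any(d <= 0 for d in degrees):
        raise ValueError("degrees must be positive")
    weights = [1.0 / d for d in degrees]
    weighted_sum = sum(w * y for w, y in zip(weights, values))
    return weighted_sum / sum(weights)


# Toy data (hypothetical): the two respondents with the outcome are well connected.
outcomes = [1, 0, 0, 1]
reported_degrees = [10, 2, 2, 5]
print(plain_mean(outcomes))
print(vh_estimate(outcomes, reported_degrees))
```

When every respondent reports the same degree, the weights cancel and the two estimators coincide; they diverge only when the outcome is associated with degree.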

In a wireless sensor network, data collection is relatively cheap whereas data transmission is relatively expensive. Thus, preserving battery life is critical. If the process of interest is sufficiently predictable, suppression of transmission can be adopted to improve the efficiency of sensor networks because the loss of information is not great. The prime interest lies in finding an inference-efficient way to support suppressed data collection applications. In this paper, we present a suppression scheme for a multiple-node setting with spatio-temporal processes, especially when process knowledge is insufficient. We also explore the impact of suppression schemes on the inference of the regional processes under various suppression levels. Finally, we formalize the hierarchical Bayesian model for these schemes.

王华  郭红丽 《统计研究》2011,28(12):29-35
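A statistical user satisfaction survey was carried out to measure users' overall perception of the quality of the various statistical data items produced by government statistical agencies, as well as their perception of quality in each of the main dissemination channels. Analysis of the survey data shows that overall perceived quality differs markedly across categories of statistical data items, and that users' perceived quality is positively correlated, to some degree, with how often they use the data; the quality of each category of statistical item also varies across dissemination channels. These findings make it possible to identify the key areas on which statistical data quality management should focus.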

Summary. Automatic identification of faces from a database given a digital view is becoming increasingly important. The question arises whether or not there can be a face identification system similar to the fingerprinting system, where a certain number of matches are regarded as sufficient to identify the person in the database. We first give a very general review of the topic of facial measurements and indicate some deep statistical problems. We then analyze a database of photographs. Certain characteristics of the population are provided, such as the modes of variation and correlation structures using shape analysis. The data involve angles as well as distances. The principal component analysis for angular data is discussed, its conversion into landmark data is established and the two approaches are compared. A new approach of anchor shape analysis for specialized distances is discussed.

Troubleshooting when a link resolver goes wrong can be difficult as it usually relies on the user to report the problem. Using data on interlibrary loan requests that have been cancelled because the materials are available online is one way that libraries can examine where link resolvers may be failing. For the 2012/2013 school year, the Samford University Library looked at this cancelled interlibrary loan request data to determine where their new link resolver and knowledgebase needed further customization to improve the user experience. This process not only identified a number of problems all along the link resolution chain, but it also put in place an ongoing process for identifying and troubleshooting link resolution issues in the future.

Software which allows interactive exploration of graphical displays is widely available. In addition there now exist sophisticated authoring tools which allow more general textual and graphical material to be presented in computer-based form. The role of an authoring tool in providing a graphical interface to a strategy for solving simple statistical problems in the context of teaching is discussed. This interface allows a variety of resources to be integrated. Specific examples, including the use of dynamic graphical displays in exploring data and in communicating the meaning of a model, are proposed. These ideas are illustrated by a problem involving the identification of the sex of a herring gull.
