Similar Documents
 20 similar documents found (search time: 109 ms)
1.
Nonparametric predictive inference (NPI) is a statistical approach based on few assumptions about probability distributions, with inferences based on data. NPI assumes exchangeability of random quantities, both related to observed data and future observations, and uncertainty is quantified using lower and upper probabilities. In this paper, units from several groups are placed simultaneously in a lifetime experiment and times-to-failure are observed. The experiment may be ended before all units have failed. Depending on the available data and a few assumptions, we present lower and upper probabilities for selecting the best group, the subset of best groups and the subset including the best group. We also compare our approach to selecting the best group with some classical precedence selection methods. Throughout, examples are provided to demonstrate our method.

2.
In reliability and lifetime testing, comparison of two groups of data is a common problem. It is often attractive, or even necessary, to make a quick and efficient decision in order to save time and costs. This paper presents a nonparametric predictive inference (NPI) approach to compare two groups, say X and Y, when one or both are progressively censored. NPI can easily be applied to different types of progressive censoring schemes. NPI is a statistical approach based on few assumptions, with inferences strongly based on data and with uncertainty quantified via lower and upper probabilities. These inferences consider the event that the lifetime of a future unit from Y is greater than the lifetime of a future unit from X.
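For complete (uncensored) data with no ties, the NPI lower and upper probabilities for the event that a future observation from Y exceeds a future observation from X can be obtained by counting pairs of intervals between order statistics. The sketch below is our own stdlib-only illustration of that counting argument (the function name `npi_compare` is ours); it does not handle the progressive censoring schemes discussed in the abstract.

```python
import itertools, math

def npi_compare(x, y):
    """NPI lower/upper probability that a future observation from Y exceeds
    a future observation from X, for complete data with no ties.
    Under the A(n) assumption, the next observation falls in each interval
    between consecutive order statistics (with -inf/+inf endpoints) with
    probability 1/(n+1)."""
    xs, ys = sorted(x), sorted(y)
    x_ints = list(zip([-math.inf] + xs, xs + [math.inf]))
    y_ints = list(zip([-math.inf] + ys, ys + [math.inf]))
    total = len(x_ints) * len(y_ints)
    # Lower: Y > X is guaranteed only when Y's interval lies entirely above X's.
    lower = sum(1 for (a, b), (c, d) in itertools.product(x_ints, y_ints)
                if c >= b) / total
    # Upper: Y > X remains possible whenever Y's interval reaches above X's lower end.
    upper = sum(1 for (a, b), (c, d) in itertools.product(x_ints, y_ints)
                if d > a) / total
    return lower, upper
```

With one observation per group, x = [1] and y = [2], the imprecision is large: the lower probability is 1/4 and the upper probability is 1, reflecting how little the data constrain the next observations.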

3.
Abstract

Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine, machine learning and credit scoring. The receiver operating characteristic (ROC) curve and surface are useful tools to assess the ability of diagnostic tests to discriminate between ordered classes or groups. To define these diagnostic tests, selecting the optimal thresholds that maximize the accuracy of these tests is required. One procedure that is commonly used to find the optimal thresholds is maximizing what is known as Youden's index. This article presents nonparametric predictive inference (NPI) for selecting the optimal thresholds of a diagnostic test. NPI is a frequentist statistical method that is explicitly aimed at using few modeling assumptions, enabled through the use of lower and upper probabilities to quantify uncertainty. Based on multiple future observations, the NPI approach is presented for selecting the optimal thresholds for two-group and three-group scenarios. In addition, a pairwise approach is also presented for the three-group scenario. The article ends with an example to illustrate the proposed methods and a simulation study of the predictive performance of the proposed methods along with some classical methods such as Youden's index. The NPI-based methods show some interesting results that overcome some of the issues concerning the predictive performance of Youden's index.
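As a point of reference for the classical method mentioned above: Youden's index for a threshold t is J(t) = sensitivity(t) + specificity(t) - 1, maximized over candidate thresholds. The sketch below is a generic empirical two-group illustration (function and variable names are ours), not the NPI method of the paper.

```python
def youden_threshold(healthy, diseased):
    """Pick the threshold maximizing the empirical Youden's index,
    classifying a result greater than t as diseased."""
    candidates = sorted(set(healthy) | set(diseased))
    best_t, best_j = None, -1.0
    for t in candidates:
        sens = sum(v > t for v in diseased) / len(diseased)   # true positive rate
        spec = sum(v <= t for v in healthy) / len(healthy)    # true negative rate
        j = sens + spec - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j
```

For perfectly separated groups the index reaches its maximum value of 1 at the boundary between them.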

4.
Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine, machine learning, and credit scoring. The receiver operating characteristic (ROC) surface is a useful tool to assess the ability of a diagnostic test to discriminate among three ordered classes or groups. In this article, nonparametric predictive inference (NPI) for three-group ROC analysis for ordinal outcomes is presented. NPI is a frequentist statistical method that is explicitly aimed at using few modeling assumptions, enabled through the use of lower and upper probabilities to quantify uncertainty. This article also includes results on the volumes under the ROC surfaces and consideration of the choice of decision thresholds for the diagnosis. Two examples are provided to illustrate our method.

5.
Nonparametric predictive inference (NPI) is a powerful frequentist statistical framework based only on an exchangeability assumption for future and past observations, made possible by the use of lower and upper probabilities. In this article, NPI is presented for ordinal data, which are categorical data with an ordering of the categories. The method uses a latent variable representation of the observations and categories on the real line. Lower and upper probabilities for events involving the next observation are presented, and briefly compared to NPI for non-ordered categorical data. As an application, the comparison of multiple groups of ordinal data is presented.

6.
In several statistical problems, nonparametric confidence intervals for population quantiles can be constructed and their coverage probabilities can be computed exactly, but cannot in general be rendered equal to a pre-determined level. The same difficulty arises for coverage probabilities of nonparametric prediction intervals for future observations. One solution to this difficulty is to interpolate between intervals whose coverage probabilities are closest to the pre-determined level from above and below. In this paper, confidence intervals for population quantiles are constructed based on interpolated upper and lower records. Subsequently, prediction intervals are obtained for future upper records based on interpolated upper records. Additionally, we derive upper bounds for the coverage error of these confidence and prediction intervals. Finally, our results are applied to some real data sets, and a simulation study compares the proposed intervals with similar classical intervals from the literature.
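The coverage probability of an order-statistic interval for a quantile follows from the binomial distribution, which is why exact coverage jumps discretely and interpolation is needed. The sketch below is our own illustration for ordinary order statistics (not the records setting of the paper): the interval (X_(i), X_(j)) covers the p-th quantile exactly when between i and j-1 observations fall below it.

```python
from math import comb

def quantile_ci_coverage(n, i, j, p):
    """Exact coverage of (X_(i), X_(j)) as a confidence interval for the
    p-th population quantile of a continuous distribution, sample size n:
    P(coverage) = sum_{k=i}^{j-1} C(n, k) p^k (1-p)^(n-k)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(i, j))
```

For n = 5 and the median, the widest interval (X_(1), X_(5)) attains coverage 30/32 = 0.9375; no choice of i, j hits 0.95 exactly, which motivates interpolating between neighbouring intervals.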

7.
Summary.  Sparse clustered data arise in finely stratified genetic and epidemiologic studies and pose at least two challenges to inference. First, it is difficult to model and interpret the full joint probability of dependent discrete data, which limits the utility of full likelihood methods. Second, standard methods for clustered data, such as pairwise likelihood and the generalized estimating function approach, are unsuitable when the data are sparse owing to the presence of many nuisance parameters. We present a composite conditional likelihood for use with sparse clustered data that provides valid inferences about covariate effects on both the marginal response probabilities and the intracluster pairwise association. Our primary focus is on sparse clustered binary data, in which case the method proposed utilizes doubly discordant quadruplets drawn from each stratum to conduct inference about the intracluster pairwise odds ratios.

8.
9.
Measuring the quality of determined protein structures is a very important problem in bioinformatics. Kernel density estimation is a well-known nonparametric method which is often used for exploratory data analysis. Recent advances, which have extended previous linear methods to multi-dimensional circular data, give a sound basis for the analysis of conformational angles of protein backbones, which lie on the torus. By using an energy test, which is based on interpoint distances, we initially investigate the dependence of the angles on the amino acid type. Then, by computing tail probabilities which are based on amino-acid conditional density estimates, a method is proposed which permits inference on a test set of data. This can be used, for example, to validate protein structures, choose between possible protein predictions and highlight unusual residue angles.
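A common kernel for circular (angular) data is the von Mises kernel; the density estimate averages von Mises densities centred at the observed angles. The sketch below is our own stdlib-only illustration (with the modified Bessel function I0 computed by its power series), not the multi-dimensional torus method of the paper.

```python
import math

def bessel_i0(x, terms=30):
    # Power series for the modified Bessel function I0(x) = sum (x/2)^(2k) / (k!)^2.
    return sum((x / 2) ** (2 * k) / math.factorial(k) ** 2 for k in range(terms))

def vonmises_kde(angles, kappa):
    """Return a kernel density estimate on [0, 2*pi) for angular data,
    using a von Mises kernel with concentration kappa."""
    n = len(angles)
    norm = 2 * math.pi * bessel_i0(kappa)  # normalizer of each kernel
    def density(theta):
        return sum(math.exp(kappa * math.cos(theta - a)) for a in angles) / (n * norm)
    return density
```

Because every kernel is a proper density on the circle, the estimate integrates to one over a full period, which a quick numerical check confirms.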

10.
Surveillance data provide a vital source of information for assessing the spread of a health problem or disease of interest and for planning for future health-care needs. However, the use of surveillance data requires proper adjustments of the reported caseload due to underreporting caused by reporting delays within a limited observation period. Although methods are available to address this classic statistical problem, they are largely focused on inference for the reporting delay distribution, with inference about caseload of disease incidence based on estimates for the delay distribution. This approach limits the complexity of models for disease incidence to provide reliable estimates and projections of incidence. Also, many of the available methods lack robustness since they require parametric distribution assumptions. We propose a new approach to overcome such limitations by allowing for separate models for the incidence and the reporting delay in a distribution-free fashion, but with joint inference for both modeling components, based on functional response models. In addition, we discuss inference about projections of future disease incidence to help identify significant shifts in temporal trends modeled based on the observed data. This latter issue on detecting ‘change points’ is not sufficiently addressed in the literature, despite the fact that such warning signs of potential outbreak are critically important for prevention purposes. We illustrate the approach with both simulated and real data, with the latter involving data for suicide attempts from the Veteran Healthcare Administration.
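The classic adjustment that such work builds on inflates each reported count by the probability that a case occurring at time t is reported within the observation window ending at T. A minimal sketch (our own generic illustration assuming a known delay distribution, not the paper's functional response models):

```python
def adjust_for_reporting_delay(reported, delay_cdf, T):
    """Naive delay adjustment for surveillance counts.
    reported[t]: cases occurring at time t that were reported by time T.
    delay_cdf(d): probability that a case is reported within d time units.
    Returns the adjusted incidence estimates reported[t] / F(T - t)."""
    return [reported[t] / delay_cdf(T - t) for t in range(len(reported))]
```

Counts near the end of the window are inflated the most, since only a small fraction of those cases has had time to be reported; the instability of these inflated recent counts is one motivation for model-based alternatives.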

11.
This paper considers the problem of making statistical inferences about a parameter when a narrow interval centred at a given value of the parameter is considered special, which is interpreted as meaning that there is a substantial degree of prior belief that the true value of the parameter lies in this interval. A clear justification of the practical importance of this problem is provided. The main difficulty with the standard Bayesian solution to this problem is discussed and, as a result, a pseudo-Bayesian solution is put forward based on determining lower limits for the posterior probability of the parameter lying in the special interval by means of a sensitivity analysis. Since it is not assumed that prior beliefs necessarily need to be expressed in terms of prior probabilities, nor that post-data probabilities must be Bayesian posterior probabilities, hybrid methods of inference are also proposed that are based on specific ways of measuring and interpreting the classical concept of significance. The various methods that are outlined are compared and contrasted at both a foundational level, and from a practical viewpoint by applying them to real data from meta-analyses that appeared in a well-known medical article.

12.
The use of lower probabilities is considered for inferences in basic jury scenarios to study aspects of the size of juries and their composition if society consists of subpopulations. The use of lower probability seems natural in law, as it leads to robust inference in the sense of providing a defendant with the benefit of the doubt. The method presented in this paper focuses on how representative a jury is for the whole population, using a novel concept of a second 'imaginary' jury together with exchangeability assumptions. It has the advantage that there is an explicit absence of any assumption with regard to guilt of a defendant. Although the concept of a jury in law is central in the presentation, the novel approach and the conclusions of this paper hold for representative decision making processes in many fields, and it also provides a new perspective on stratified sampling.

13.
ABSTRACT

Scientific research of all kinds should be guided by statistical thinking: in the design and conduct of the study, in the disciplined exploration and enlightened display of the data, and to avoid statistical pitfalls in the interpretation of the results. However, formal, probability-based statistical inference should play no role in most scientific research, which is inherently exploratory, requiring flexible methods of analysis that inherently risk overfitting. The nature of exploratory work is that data are used to help guide model choice, and under these circumstances, uncertainty cannot be precisely quantified, because of the inevitable model selection bias that results. To be valid, statistical inference should be restricted to situations where the study design and analysis plan are specified prior to data collection. Exploratory data analysis provides the flexibility needed for most other situations, including statistical methods that are regularized, robust, or nonparametric. Of course, no individual statistical analysis should be considered sufficient to establish scientific validity: research requires many sets of data along many lines of evidence, with a watchfulness for systematic error. Replicating and predicting findings in new data and new settings is a stronger way of validating claims than blessing results from an isolated study with statistical inferences.

14.
Abstract

This paper focuses on inference based on the confidence distributions of the nonparametric regression function and its derivatives, in which dependent inferences are combined by obtaining information about their dependency structure. We first give a motivating example in a production operation system to illustrate the practical relevance of the problems studied in this paper. A goodness-of-fit test for the polynomial regression model is proposed on the basis of combined confidence distribution inference, which reduces to Fisher's combination statistic in some cases. Based on these test results, a combined estimator for the p-th order derivative of the nonparametric regression function is provided, together with its large-sample properties. The performances of the proposed test and estimation method are illustrated by three specific examples. Finally, the motivating example is analyzed in detail. The simulated and real-data examples illustrate the good performance and practicability of the proposed methods based on confidence distributions.
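Fisher's combination statistic mentioned above aggregates k independent p-values as -2 * sum(log p_i), which is chi-squared with 2k degrees of freedom under the null. Since the degrees of freedom are even, the chi-square tail probability has a closed form, so a stdlib-only sketch (our own generic illustration, not the paper's dependent-inference combination) is:

```python
import math

def fisher_combined_pvalue(pvalues):
    """Combine independent p-values (each in (0, 1]) via Fisher's method.
    The statistic -2*sum(log p_i) is chi-squared with 2k df under the null;
    for even df, P(chi2_{2k} > x) = exp(-x/2) * sum_{j<k} (x/2)^j / j!."""
    k = len(pvalues)
    stat = -2 * sum(math.log(p) for p in pvalues)
    half = stat / 2
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(k))
```

Combining a single p-value returns it unchanged, a useful sanity check; combining two p-values of 0.5 gives (1 + 2 ln 2) / 4, about 0.597.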

15.
Recently, Zhang [Simultaneous confidence intervals for several inverse Gaussian populations. Stat Probab Lett. 2014;92:125–131] proposed simultaneous pairwise confidence intervals (SPCIs) based on the fiducial generalized pivotal quantity concept to make inferences about the inverse Gaussian means under heteroscedasticity. In this paper, we propose three new methods for constructing SPCIs to make inferences on the means of several inverse Gaussian distributions when scale parameters and sample sizes are unequal. One of the methods results in a set of classic SPCIs (in the sense that it is not simulation-based inference) and the two others are based on a parametric bootstrap approach. The advantages of our proposed methods over Zhang’s (2014) method are: (i) the simulation results show that the coverage probability of the proposed parametric bootstrap approaches is fairly close to the nominal confidence coefficient while the coverage probability of Zhang’s method is smaller than the nominal confidence coefficient when the number of groups and the variance of groups are large and (ii) the proposed set of classic SPCIs is conservative in contrast to Zhang’s method.

16.
We develop both nonparametric and parametric methods for obtaining prediction bands for the empirical distribution function (EDF) of a future sample. These methods yield simultaneous prediction intervals for all order statistics of the future sample, and they also correspond to tests for the two-sample problem. The nonparametric prediction bands correspond to the two-sample Kolmogorov-Smirnov test and related nonparametric tests, but the parametric prediction bands correspond to entirely new parametric two-sample tests. The parametric prediction bands tend to outperform the nonparametric bands when the parametric assumptions hold, but they may have true coverage probabilities well below their nominal levels when the parametric assumptions fail. A new computational algorithm is used to obtain critical values in the nonparametric case.
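The two-sample Kolmogorov-Smirnov statistic underlying the nonparametric bands is the maximum vertical distance between the two EDFs. A minimal sketch (our own, without the paper's critical-value algorithm):

```python
def ks_two_sample(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: sup_t |F_x(t) - F_y(t)|.
    The supremum is attained at one of the pooled data points."""
    xs, ys = sorted(x), sorted(y)

    def edf(sorted_data, t):
        # Empirical distribution function: proportion of observations <= t.
        return sum(v <= t for v in sorted_data) / len(sorted_data)

    pooled = xs + ys
    return max(abs(edf(xs, t) - edf(ys, t)) for t in pooled)
```

Fully separated samples give the maximal statistic of 1, identical samples give 0; a prediction band for a future EDF can be read off by inverting the corresponding test.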

17.
This paper describes a nonparametric approach to make inferences for aggregate loss models in the insurance framework. We assume that an insurance company provides a historical sample of claims given by claim occurrence times and claim sizes. Furthermore, information may be incomplete as claims may be censored and/or truncated. In this context, the main goal of this work consists of fitting a probability model for the total amount that will be paid on all claims during a fixed future time period. In order to solve this prediction problem, we propose a new methodology based on nonparametric estimators for the density functions with censored and truncated data, the use of Monte Carlo simulation methods and bootstrap resampling. The developed methodology is useful to compare alternative pricing strategies in different insurance decision problems. The proposed procedure is illustrated with a real dataset provided by the insurance department of an international commercial company.
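The core prediction step can be sketched as compound-distribution Monte Carlo: draw a claim count for the future period and bootstrap claim sizes from the historical sample. This generic sketch (names are ours) ignores the censoring/truncation corrections and density estimation that the paper develops.

```python
import math
import random

def simulate_aggregate_loss(claim_sizes, rate, n_sim=2000, rng=None):
    """Monte Carlo draws of the aggregate loss S = X_1 + ... + X_N for one
    future period: N ~ Poisson(rate), claim sizes bootstrapped from data."""
    rng = rng or random.Random(0)

    def poisson(lam):
        # Knuth's multiplication method for Poisson variates (fine for moderate lam).
        threshold, k, p = math.exp(-lam), 0, 1.0
        while True:
            p *= rng.random()
            if p <= threshold:
                return k
            k += 1

    totals = []
    for _ in range(n_sim):
        n_claims = poisson(rate)
        totals.append(sum(rng.choice(claim_sizes) for _ in range(n_claims)))
    return totals
```

The empirical distribution of the simulated totals then serves as the fitted model for the future-period payout, from which premiums or quantile-based risk measures can be read off.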

18.
Empirical Bayes estimates of the local false discovery rate can reflect uncertainty about the estimated prior by supplementing their Bayesian posterior probabilities with confidence levels as posterior probabilities. This use of coherent fiducial inference with hierarchical models generates set estimators that propagate uncertainty to varying degrees. Some of the set estimates approach estimates from plug-in empirical Bayes methods for high numbers of comparisons and can come close to the usual confidence sets given a sufficiently low number of comparisons.

19.
Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine and health care. Good methods for determining diagnostic accuracy provide useful guidance on selection of patient treatment, and the ability to compare different diagnostic tests has a direct impact on quality of care. In this paper Nonparametric Predictive Inference (NPI) methods for accuracy of diagnostic tests with continuous test results are presented and discussed. For such tests, Receiver Operating Characteristic (ROC) curves have become popular tools for describing the performance of diagnostic tests. We present the NPI approach to ROC curves, and some important summaries of these curves. As NPI does not aim at inference for an entire population but instead explicitly considers a future observation, this provides an attractive alternative to standard methods. We show how NPI can be used to compare two continuous diagnostic tests.
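The most common summary of a ROC curve is the area under it, which for the empirical curve equals the Mann-Whitney estimate of P(diseased result > non-diseased result). A minimal sketch (our generic empirical illustration, not the NPI estimator of the paper):

```python
def empirical_auc(nondiseased, diseased):
    """Empirical area under the ROC curve: the proportion of pairs in which
    the diseased result exceeds the non-diseased one, counting ties as 1/2."""
    pairs = [(x, y) for x in nondiseased for y in diseased]
    wins = sum(1.0 if y > x else 0.5 if y == x else 0.0 for x, y in pairs)
    return wins / len(pairs)
```

An AUC of 1 means perfect separation, 0.5 means the test is no better than chance; comparing two diagnostic tests amounts to comparing these pairwise-win proportions.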

20.
This article evaluates the usefulness of a nonparametric approach to Bayesian inference by presenting two applications. Our first application considers an educational choice problem. We focus on obtaining a predictive distribution for earnings corresponding to various levels of schooling. This predictive distribution incorporates the parameter uncertainty, so that it is relevant for decision making under uncertainty in the expected utility framework of microeconomics. The second application is to quantile regression. Our point here is to examine the potential of the nonparametric framework to provide inferences without relying on asymptotic approximations. Unlike in the first application, the standard asymptotic normal approximation turns out not to be a good guide.
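One concrete device in this nonparametric Bayesian spirit is Rubin's Bayesian bootstrap, which draws Dirichlet(1, ..., 1) weights over the observed data points to produce posterior draws of a functional such as the mean. The stdlib-only sketch below is our own illustration of that device, not necessarily the construction used by the authors.

```python
import random

def bayesian_bootstrap_means(data, n_draws=2000, rng=None):
    """Posterior draws of the mean under the Bayesian bootstrap.
    Dirichlet(1, ..., 1) weights are generated as the gaps between
    sorted Uniform(0, 1) variates on [0, 1]."""
    rng = rng or random.Random(0)
    n = len(data)
    draws = []
    for _ in range(n_draws):
        cuts = sorted(rng.random() for _ in range(n - 1))
        weights = [b - a for a, b in zip([0.0] + cuts, cuts + [1.0])]
        draws.append(sum(w * x for w, x in zip(weights, data)))
    return draws
```

The resulting draws form a predictive-style posterior for the mean that involves no asymptotic normal approximation, which is the kind of finite-sample inference the quantile-regression application examines.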


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号