Similar Documents
A total of 20 similar documents were found (search time: 31 ms).
1.
Variability arises due to differences in the value of a quantity among different members of a population. Uncertainty arises due to lack of knowledge regarding the true value of a quantity for a given member of a population. We describe and evaluate two methods for quantifying both variability and uncertainty. These methods, bootstrap simulation and a likelihood-based method, are applied to three datasets. The datasets include a synthetic sample of 19 values from a lognormal distribution, a sample of nine values obtained from measurements of the PCB concentration in leafy produce, and a sample of five values for the partitioning of chromium in the flue gas desulfurization system of coal-fired power plants. For each of these datasets, we employ the two methods to characterize uncertainty in the arithmetic mean and standard deviation, cumulative distribution functions based upon fitted parametric distributions, the 95th percentile of variability, and the 63rd percentile of uncertainty for the 81st percentile of variability. The latter is intended to show that it is possible to describe any point within the uncertain frequency distribution by specifying an uncertainty percentile and a variability percentile. Using the bootstrap method, we compare results based upon use of the method of matching moments and the method of maximum likelihood for fitting distributions to data. Our results indicate that with only 5–19 data points as in the datasets we have evaluated, there is substantial uncertainty based upon random sampling error. Both the bootstrap and likelihood-based approaches yield comparable uncertainty estimates in most cases.
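As a rough illustration of the bootstrap approach described in this abstract, the sketch below fits a lognormal distribution to a small sample by maximum likelihood and bootstraps the fit to characterize uncertainty in the mean and in the 95th percentile of variability. The sample and all settings are placeholders rather than the paper's data; a method-of-matching-moments fit could be substituted for the MLE step.

```python
# Hedged sketch: bootstrap characterization of variability and uncertainty.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.lognormal(mean=1.0, sigma=0.5, size=19)   # stand-in for a 19-point data set

n_boot = 2000
means, p95s = [], []
for _ in range(n_boot):
    resample = rng.choice(sample, size=sample.size, replace=True)
    # Fit a lognormal by maximum likelihood (location fixed at zero);
    # a matching-moments fit could be used here instead.
    shape, loc, scale = stats.lognorm.fit(resample, floc=0)
    fitted = stats.lognorm(shape, loc=loc, scale=scale)
    means.append(fitted.mean())
    p95s.append(fitted.ppf(0.95))          # 95th percentile of variability

# 90% uncertainty intervals for the two statistics of interest
print("mean:", np.percentile(means, [5, 95]))
print("95th percentile of variability:", np.percentile(p95s, [5, 95]))
```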

2.
Variability is the heterogeneity of values within a population. Uncertainty refers to lack of knowledge regarding the true value of a quantity. Mixture distributions have the potential to improve the goodness of fit to data sets not adequately described by a single parametric distribution. Uncertainty due to random sampling error in statistics of interest can be estimated based upon bootstrap simulation. In order to evaluate the robustness of using mixture distributions as a basis for estimating both variability and uncertainty, 108 synthetic data sets generated from selected population mixture log-normal distributions were investigated, and properties of variability and uncertainty estimates were evaluated with respect to variation in sample size, mixing weight, and separation between components of mixtures. Furthermore, mixture distributions were compared with single-component distributions. Findings include: (1) mixing weight influences the stability of variability and uncertainty estimates; (2) bootstrap simulation results tend to be more stable for larger sample sizes; (3) when two components are well separated, the stability of bootstrap simulation is improved; however, a larger degree of uncertainty arises regarding the percentiles coinciding with the separated region; (4) when two components are not well separated, a single distribution may often be a better choice because it has fewer parameters and better numerical stability; and (5) dependencies exist in sampling distributions of parameters of mixtures and are influenced by the amount of separation between the components. An emission factor case study based upon NOx emissions from coal-fired tangential boilers is used to illustrate the application of the approach.
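A minimal sketch of the mixture idea, assuming that fitting a Gaussian mixture to log-transformed data stands in for the paper's lognormal mixture fitting, with bootstrap resampling wrapped around it. The data, component parameters, and replication counts are illustrative only.

```python
# Hedged sketch: two-component lognormal mixture fit plus bootstrap of a percentile.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Synthetic mixture: 40% from one lognormal component, 60% from another (illustrative).
data = np.concatenate([rng.lognormal(0.0, 0.4, 40), rng.lognormal(1.5, 0.3, 60)])

def fit_p95(values, n_draws=20000):
    # Gaussian mixture on log(values) is equivalent to a lognormal mixture.
    gm = GaussianMixture(n_components=2, n_init=3, random_state=0)
    gm.fit(np.log(values).reshape(-1, 1))
    draws = np.exp(gm.sample(n_draws)[0].ravel())   # sample the fitted mixture
    return np.percentile(draws, 95)

p95_boot = [fit_p95(rng.choice(data, data.size, replace=True)) for _ in range(200)]
print("95% uncertainty interval on the 95th percentile:",
      np.percentile(p95_boot, [2.5, 97.5]))
```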

3.
Annual concentrations of toxic air contaminants are of primary concern from the perspective of chronic human exposure assessment and risk analysis. Despite recent advances in air quality monitoring technology, resource and technical constraints often impose limitations on the availability of a sufficient number of ambient concentration measurements for performing environmental risk analysis. Therefore, sample size limitations, representativeness of data, and uncertainties in the estimated annual mean concentration must be examined before performing quantitative risk analysis. In this paper, we discuss several factors that need to be considered in designing field-sampling programs for toxic air contaminants and in verifying compliance with environmental regulations. Specifically, we examine the behavior of SO2, TSP, and CO data as surrogates for toxic air contaminants and as examples of point source, area source, and line source-dominated pollutants, respectively, from the standpoint of sampling design. We demonstrate the use of the bootstrap resampling method and normal theory in estimating the annual mean concentration and its 95% confidence bounds from limited sampling data, and illustrate the application of operating characteristic (OC) curves to determine optimum sample size and other sampling strategies. We also outline a statistical procedure, based on a one-sided t-test, that utilizes the sampled concentration data for evaluating whether a sampling site is in compliance with relevant ambient guideline concentrations for toxic air contaminants.
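The bootstrap confidence bounds on an annual mean and the one-sided compliance test described here can be sketched as follows; the concentration values and guideline level are invented for illustration.

```python
# Hedged sketch: bootstrap CI on the annual mean and a one-sided compliance t-test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
conc = np.array([12.1, 8.4, 15.3, 9.9, 11.0, 22.7, 7.5, 13.8, 10.2, 18.6])  # hypothetical
guideline = 20.0                                                            # hypothetical limit

boot_means = [rng.choice(conc, conc.size, replace=True).mean() for _ in range(5000)]
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"annual mean = {conc.mean():.1f}, 95% bootstrap CI = ({lo:.1f}, {hi:.1f})")

# One-sided t-test: H0 mean >= guideline vs. H1 mean < guideline (site in compliance)
t_stat, p_val = stats.ttest_1samp(conc, guideline, alternative="less")
print(f"one-sided p-value = {p_val:.3f}")
```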

4.
Food-borne infection is caused by intake of foods or beverages contaminated with microbial pathogens. Dose-response modeling is used to estimate exposure levels of pathogens associated with specific risks of infection or illness. When a single dose-response model is used and confidence limits on infectious doses are calculated, only data uncertainty is captured. We propose a method to estimate the lower confidence limit on an infectious dose by including model uncertainty and separating it from data uncertainty. The infectious dose is estimated by a weighted average of effective dose estimates from a set of dose-response models via a Kullback information criterion. The confidence interval for the infectious dose is constructed by the delta method, where data uncertainty is addressed by a bootstrap method. To evaluate the actual coverage probabilities of the lower confidence limit, a Monte Carlo simulation study is conducted under sublinear, linear, and superlinear dose-response shapes that can be commonly found in real data sets. Our model-averaging method achieves coverage close to nominal in almost all cases, thus providing a useful and efficient tool for accurate calculation of lower confidence limits on infectious doses.

5.
Bayesian Monte Carlo (BMC) decision analysis adopts a sampling procedure to estimate likelihoods and distributions of outcomes, and then uses that information to calculate the expected performance of alternative strategies, the value of information, and the value of including uncertainty. These decision analysis outputs are therefore subject to sample error. The standard error of each estimate and its bias, if any, can be estimated by the bootstrap procedure. The bootstrap operates by resampling (with replacement) from the original BMC sample, and redoing the decision analysis. Repeating this procedure yields a distribution of decision analysis outputs. The bootstrap approach to estimating the effect of sample error upon BMC analysis is illustrated with a simple value-of-information calculation along with an analysis of a proposed control structure for Lake Erie. The examples show that the outputs of BMC decision analysis can have high levels of sample error and bias.
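A small sketch of the bootstrap applied to a Monte Carlo value-of-information estimate, in the spirit of this abstract; the two-strategy payoff model and all numbers are assumptions, not the Lake Erie analysis.

```python
# Hedged sketch: standard error and bias of an EVPI estimate via bootstrap
# resampling of the original Monte Carlo sample.
import numpy as np

rng = np.random.default_rng(3)
theta = rng.normal(0.0, 1.0, size=1000)                 # BMC sample of the uncertain parameter
payoffs = np.column_stack([2.0 * theta, 0.5 + theta])   # payoffs of two alternative strategies

def evpi(p):
    # expected value with perfect information minus value of the best fixed strategy
    return p.max(axis=1).mean() - p.mean(axis=0).max()

point = evpi(payoffs)
boot = np.array([evpi(payoffs[rng.integers(0, len(payoffs), len(payoffs))])
                 for _ in range(2000)])
print(f"EVPI = {point:.3f}, bootstrap SE = {boot.std(ddof=1):.3f}, "
      f"bias estimate = {boot.mean() - point:.3f}")
```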

6.
The benchmark dose (BMD) is an exposure level that would induce a small risk increase (BMR level) above the background. The BMD approach to deriving a reference dose for risk assessment of noncancer effects is advantageous in that the estimate of BMD is not restricted to experimental doses and utilizes most available dose-response information. To quantify statistical uncertainty of a BMD estimate, we often calculate and report its lower confidence limit (i.e., BMDL), and may even consider it as a more conservative alternative to BMD itself. Computation of BMDL may involve normal confidence limits to BMD in conjunction with the delta method. Therefore, factors such as small sample size and nonlinearity in model parameters can affect the performance of the delta method BMDL, and alternative methods are useful. In this article, we propose a bootstrap method to estimate BMDL utilizing a scheme that consists of a resampling of residuals after model fitting and a one-step formula for parameter estimation. We illustrate the method with clustered binary data from developmental toxicity experiments. Our analysis shows that with moderately elevated dose-response data, the distribution of the BMD estimator tends to be left-skewed and bootstrap BMDLs are smaller than the delta method BMDLs on average, hence quantifying risk more conservatively. Statistically, the bootstrap BMDL quantifies the uncertainty of the true BMD more honestly than the delta method BMDL as its coverage probability is closer to the nominal level than that of the delta method BMDL. We find that BMD and BMDL estimates are generally insensitive to model choices provided that the models fit the data comparably well near the region of BMD. Our analysis also suggests that, in the presence of a significant and moderately strong dose-response relationship, the developmental toxicity experiments under the standard protocol support dose-response assessment at 5% BMR for BMD and 95% confidence level for BMDL.

7.
Slob, W., Pieters, M. N. Risk Analysis, 1998, 18(6): 787-798
The use of uncertainty factors in the standard method for deriving acceptable intake or exposure limits for humans, such as the Reference Dose (RfD), may be viewed as a conservative method of taking various uncertainties into account. As an obvious alternative, the use of uncertainty distributions instead of uncertainty factors is gaining attention. This paper presents a comprehensive discussion of a general framework that quantifies both the uncertainties in the no-adverse-effect level in the animal (using a benchmark-like approach) and the uncertainties in the various extrapolation steps involved (using uncertainty distributions). This approach results in an uncertainty distribution for the no-adverse-effect level in the sensitive human subpopulation, reflecting the overall scientific uncertainty associated with that level. A lower percentile of this distribution may be regarded as an acceptable exposure limit (e.g., RfD) that takes account of the various uncertainties in a nonconservative fashion. The same methodology may also be used as a tool to derive a distribution for possible human health effects at a given exposure level. We argue that in a probabilistic approach the uncertainty in the estimated no-adverse-effect level in the animal should be explicitly taken into account. Not only is this source of uncertainty too large to be ignored, it also has repercussions for the quantification of the other uncertainty distributions.

8.
Schulz, Terry W., Griffin, Susan. Risk Analysis, 1999, 19(4): 577-584
The U.S. Environmental Protection Agency (EPA) recommends the use of the one-sided 95% upper confidence limit of the arithmetic mean based on either a normal or lognormal distribution for the contaminant (or exposure point) concentration term in the Superfund risk assessment process. When the data are not normal or lognormal, this recommended approach may overestimate the exposure point concentration (EPC) and may lead to unnecessary cleanup at a hazardous waste site. The EPA concentration term only seems to perform like alternative EPC methods when the data are well fit by a lognormal distribution. Several alternative methods for calculating the EPC are investigated and compared using soil data collected from three hazardous waste sites in Montana, Utah, and Colorado. For data sets that are well fit by a lognormal distribution, values from the Chebyshev inequality or the EPA concentration term may be appropriate EPCs. For data sets where the soil concentration data are well fit by gamma distributions, Wong's method may be used for calculating EPCs. The studentized bootstrap-t and Hall's bootstrap-t transformation are recommended for EPC calculation when all distribution fits are poor. If a data set is well fit by a distribution, the parametric bootstrap may provide a suitable EPC.
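Two of the alternative EPC estimators mentioned above, the Chebyshev-inequality 95% UCL and a studentized bootstrap-t 95% UCL, can be sketched as follows; the soil concentrations are simulated stand-ins, not data from the three sites.

```python
# Hedged sketch: Chebyshev and studentized bootstrap-t upper confidence limits on the mean.
import numpy as np

rng = np.random.default_rng(4)
x = rng.lognormal(3.0, 1.2, size=25)          # stand-in for skewed soil concentration data
n, xbar, s = x.size, x.mean(), x.std(ddof=1)

# Chebyshev one-sided 95% UCL on the mean: xbar + sqrt(1/alpha - 1) * s / sqrt(n)
ucl_cheb = xbar + np.sqrt(1 / 0.05 - 1) * s / np.sqrt(n)

# Studentized bootstrap-t 95% UCL: xbar - t*_(0.05) * s / sqrt(n),
# where t* are bootstrap replicates of the studentized mean.
t_star = []
for _ in range(5000):
    xb = rng.choice(x, n, replace=True)
    t_star.append((xb.mean() - xbar) / (xb.std(ddof=1) / np.sqrt(n)))
ucl_boot_t = xbar - np.percentile(t_star, 5) * s / np.sqrt(n)

print(f"Chebyshev 95% UCL = {ucl_cheb:.1f}, bootstrap-t 95% UCL = {ucl_boot_t:.1f}")
```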

9.
In this article, the performance objectives (POs) for the Bacillus cereus group (BC) in celery, cheese, and spelt added as ingredients in a ready-to-eat mixed spelt salad, packaged under modified atmosphere, were calculated using a Bayesian approach. In order to derive the POs, BC detection and enumeration were performed in nine lots of naturally contaminated ingredients and final product. Moreover, the impact of specific production steps on the BC contamination was quantified. Finally, a sampling plan to verify the ingredient lots' compliance with each PO value at a 95% confidence level (CL) was defined. To calculate the POs, detection results as well as results above the limit of detection but below the limit of quantification (i.e., censored data) were analyzed. The most probable distribution of the censored data was determined and two-dimensional (2D) Monte Carlo simulations were performed. The PO values were calculated to meet a food safety objective of 4 log10 cfu of BC per gram of spelt salad at the time of consumption. When BC grows during storage by between 0.90 and 1.90 log10 cfu/g, the POs for BC in celery, cheese, and spelt ranged between 1.21 log10 cfu/g for celery and 2.45 log10 cfu/g for spelt. This article represents the first attempt to apply the concepts of PO and 2D Monte Carlo simulation to the flow chart of a complex food matrix, including raw and cooked ingredients.

10.
Counterfactual distributions are important ingredients for policy analysis and decomposition analysis in empirical economics. In this article, we develop modeling and inference tools for counterfactual distributions based on regression methods. The counterfactual scenarios that we consider consist of ceteris paribus changes in either the distribution of covariates related to the outcome of interest or the conditional distribution of the outcome given covariates. For either of these scenarios, we derive joint functional central limit theorems and bootstrap validity results for regression-based estimators of the status quo and counterfactual outcome distributions. These results allow us to construct simultaneous confidence sets for function-valued effects of the counterfactual changes, including the effects on the entire distribution and quantile functions of the outcome as well as on related functionals. These confidence sets can be used to test functional hypotheses such as no-effect, positive effect, or stochastic dominance. Our theory applies to general counterfactual changes and covers the main regression methods including classical, quantile, duration, and distribution regressions. We illustrate the results with an empirical application to wage decompositions using data for the United States. As a part of developing the main results, we introduce distribution regression as a comprehensive and flexible tool for modeling and estimating the entire conditional distribution. We show that distribution regression encompasses the Cox duration regression and represents a useful alternative to quantile regression. We establish functional central limit theorems and bootstrap validity results for the empirical distribution regression process and various related functionals.

11.
The delta method and continuous mapping theorem are among the most extensively used tools in asymptotic derivations in econometrics. Extensions of these methods are provided for sequences of functions that are commonly encountered in applications and where the usual methods sometimes fail. Important examples of failure arise in the use of simulation-based estimation methods such as indirect inference. The paper explores the application of these methods to the indirect inference estimator (IIE) in first order autoregressive estimation. The IIE uses a binding function that is sample size dependent. Its limit theory relies on a sequence-based delta method in the stationary case and a sequence-based implicit continuous mapping theorem in unit root and local to unity cases. The new limit theory shows that the IIE achieves much more than (partial) bias correction. It changes the limit theory of the maximum likelihood estimator (MLE) when the autoregressive coefficient is in the locality of unity, reducing the bias and the variance of the MLE without affecting the limit theory of the MLE in the stationary case. Thus, in spite of the fact that the IIE is a continuously differentiable function of the MLE, the limit distribution of the IIE is not simply a scale multiple of the MLE, but depends implicitly on the full binding function mapping. The unit root case therefore represents an important example of the failure of the delta method and shows the need for an implicit mapping extension of the continuous mapping theorem.

12.
Using probability plots and Maximum Likelihood Estimation (MLE), we fit lognormal distributions to data compiled by Ershow et al. for daily intake of total water and tap water by three groups of women (controls, pregnant, and lactating; all aged 15–49 years) in the United States. We also develop bivariate lognormal distributions for the joint distribution of water ingestion and body weight for these three groups. Overall, we recommend the marginal distributions for water intake as fit by MLE for use in human health risk assessments.
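A hedged sketch of the two steps described: an MLE lognormal fit to intake data and a bivariate lognormal for intake and body weight. The sample, the body-weight parameters, and the correlation are assumptions, not the Ershow et al. values.

```python
# Hedged sketch: lognormal MLE fit and a bivariate lognormal for (intake, body weight).
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
intake = rng.lognormal(np.log(1300), 0.45, size=50)   # mL/day, hypothetical sample

shape, loc, scale = stats.lognorm.fit(intake, floc=0)
mu_i, sigma_i = np.log(scale), shape                  # parameters on the log scale
print(f"fitted log-scale parameters: mu = {mu_i:.2f}, sigma = {sigma_i:.2f}")

# Bivariate lognormal: draw correlated normals on the log scale, then exponentiate.
mu_w, sigma_w, rho = np.log(65.0), 0.20, 0.3          # body weight (kg) and correlation, assumed
cov = [[sigma_i**2, rho * sigma_i * sigma_w],
       [rho * sigma_i * sigma_w, sigma_w**2]]
logs = rng.multivariate_normal([mu_i, mu_w], cov, size=10000)
intake_bw = np.exp(logs)                              # columns: intake, body weight
print("simulated intake per kg body weight, 95th percentile:",
      np.percentile(intake_bw[:, 0] / intake_bw[:, 1], 95))
```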

13.
We propose a tractable, data-driven demand estimation procedure based on the use of maximum entropy (ME) distributions, and apply it to a stochastic capacity control problem motivated from airline revenue management. Specifically, we study the two fare class “Littlewood” problem in a setting where the firm has access to only potentially censored sales observations; this is also known as the repeated newsvendor problem. We propose a heuristic that iteratively fits an ME distribution to all observed sales data, and in each iteration selects a protection level based on the estimated distribution. When the underlying demand distribution is discrete, we show that the sequence of protection levels converges to the optimal one almost surely, and that the ME demand forecast converges to the true demand distribution for all values below the optimal protection level. That is, the proposed heuristic avoids the “spiral down” effect, making it attractive for joint forecasting and revenue optimization problems in the presence of censored observations.
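For the protection-level step, Littlewood's rule can be sketched as below; a plain empirical demand estimate stands in for the paper's maximum-entropy fit to censored sales, and the fares and demand observations are made up for illustration.

```python
# Hedged sketch: Littlewood's rule for the two-fare-class protection level.
import numpy as np

high_fare, low_fare = 300.0, 120.0
# Hypothetical high-fare demand observations (uncensored here for simplicity).
demand_obs = np.array([18, 25, 22, 30, 27, 21, 24, 29, 26, 23])

levels = np.arange(1, demand_obs.max() + 2)
# Marginal value of protecting the y-th seat: high_fare * P(high-fare demand >= y)
marginal_value = np.array([high_fare * (demand_obs >= y).mean() for y in levels])

# Protect another seat as long as its marginal value is at least the low fare.
candidates = levels[marginal_value >= low_fare]
protection_level = int(candidates.max()) if candidates.size else 0
print("protection level for the high fare class:", protection_level)
```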

14.
Exposure guidelines for potentially toxic substances are often based on a reference dose (RfD) that is determined by dividing a no-observed-adverse-effect-level (NOAEL), lowest-observed-adverse-effect-level (LOAEL), or benchmark dose (BD) corresponding to a low level of risk, by a product of uncertainty factors. The uncertainty factors for animal to human extrapolation, variable sensitivities among humans, extrapolation from measured subchronic effects to unknown results for chronic exposures, and extrapolation from a LOAEL to a NOAEL can be thought of as random variables that vary from chemical to chemical. Selected databases are examined that provide distributions across chemicals of inter- and intraspecies effects, ratios of LOAELs to NOAELs, and differences in acute and chronic effects, to illustrate the determination of percentiles for uncertainty factors. The distributions of uncertainty factors tend to be approximately lognormally distributed. The logarithm of the product of independent uncertainty factors is approximately distributed as the sum of normally distributed variables, making it possible to estimate percentiles for the product. Hence, the size of the products of uncertainty factors can be selected to provide adequate safety for a large percentage (e.g., approximately 95%) of RfDs. For the databases used to describe the distributions of uncertainty factors, values of 10 appear to be reasonable and conservative. For the databases examined, the following simple "Rule of 3s" is suggested that exceeds the estimated 95th percentile of the product of uncertainty factors: if only a single uncertainty factor is required, use 33; for any two uncertainty factors, use 3 × 33 ≈ 100; for any three uncertainty factors, use a combined factor of 3 × 100 = 300; and if all four uncertainty factors are needed, use a total factor of 3 × 300 = 900. If near the 99th percentile is desired, use another factor of 3. An additional factor may be needed for inadequate data, or a modifying factor for other uncertainties (e.g., different routes of exposure) not covered above.
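The product-of-lognormal-factors argument can be made concrete with a short Monte Carlo sketch; the geometric standard deviations assigned to each uncertainty factor are assumptions chosen only to illustrate the calculation, not values from the cited databases.

```python
# Hedged sketch: percentiles of the product of lognormally distributed uncertainty factors.
import numpy as np

rng = np.random.default_rng(6)
n = 100000
# Each factor modeled as lognormal with geometric mean 1 and an assumed GSD (illustrative).
gsd = {"interspecies": 3.0, "intraspecies": 3.0,
       "subchronic_to_chronic": 2.5, "loael_to_noael": 2.0}
factors = {name: rng.lognormal(mean=0.0, sigma=np.log(g), size=n) for name, g in gsd.items()}

product = np.ones(n)
for f in factors.values():
    product *= f

for q in (50, 95, 99):
    print(f"{q}th percentile of the product of four factors: {np.percentile(product, q):.0f}")
```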

15.
Count data are pervasive in many areas of risk analysis; deaths, adverse health outcomes, infrastructure system failures, and traffic accidents are all recorded as count events, for example. Risk analysts often wish to estimate the probability distribution for the number of discrete events as part of doing a risk assessment. Traditional count data regression models of the type often used in risk assessment for this problem suffer from limitations due to the assumed variance structure. A more flexible model based on the Conway-Maxwell Poisson (COM-Poisson) distribution was recently proposed, a model that has the potential to overcome the limitations of the traditional model. However, the statistical performance of this new model has not yet been fully characterized. This article assesses the performance of a maximum likelihood estimation method for fitting the COM-Poisson generalized linear model (GLM). The objectives of this article are to (1) characterize the parameter estimation accuracy of the MLE implementation of the COM-Poisson GLM, and (2) estimate the prediction accuracy of the COM-Poisson GLM using simulated data sets. The results of the study indicate that the COM-Poisson GLM is flexible enough to model under-, equi-, and overdispersed data sets with different sample mean values. The results also show that the COM-Poisson GLM yields accurate parameter estimates. The COM-Poisson GLM provides a promising and flexible approach for performing count data regression.
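As a stripped-down stand-in for the COM-Poisson GLM, the sketch below fits an intercept-only Conway-Maxwell-Poisson model by maximum likelihood; the data are simulated, and the truncation point used for the normalizing constant is a practical assumption.

```python
# Hedged sketch: intercept-only COM-Poisson maximum likelihood fit.
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

rng = np.random.default_rng(10)
y = rng.poisson(3.0, size=200)             # stand-in count data (equidispersed here)

def com_poisson_negloglik(params, y, j_max=100):
    log_lam, log_nu = params
    lam, nu = np.exp(log_lam), np.exp(log_nu)
    j = np.arange(j_max + 1)
    # log of the normalizing constant Z(lambda, nu) = sum_j lambda^j / (j!)^nu,
    # truncated at j_max (assumed adequate for small means).
    log_terms = j * np.log(lam) - nu * gammaln(j + 1)
    log_z = np.logaddexp.reduce(log_terms)
    loglik = np.sum(y * np.log(lam) - nu * gammaln(y + 1) - log_z)
    return -loglik

fit = minimize(com_poisson_negloglik, x0=[np.log(3.0), 0.0], args=(y,), method="Nelder-Mead")
lam_hat, nu_hat = np.exp(fit.x)
print(f"lambda = {lam_hat:.2f}, nu = {nu_hat:.2f} "
      "(nu < 1: overdispersed relative to Poisson; nu > 1: underdispersed)")
```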

16.
A. E. Ades, G. Lu. Risk Analysis, 2003, 23(6): 1165-1172
Monte Carlo simulation has become the accepted method for propagating parameter uncertainty through risk models. It is widely appreciated, however, that correlations between input variables must be taken into account if models are to deliver correct assessments of uncertainty in risk. Various two-stage methods have been proposed that first estimate a correlation structure and then generate Monte Carlo simulations, which incorporate this structure while leaving marginal distributions of parameters unchanged. Here we propose a one-stage alternative, in which the correlation structure is estimated from the data directly by Bayesian Markov Chain Monte Carlo methods. Samples from the posterior distribution of the outputs then correctly reflect the correlation between parameters, given the data and the model. Besides its computational simplicity, this approach utilizes the available evidence from a wide variety of structures, including incomplete data and correlated and uncorrelated repeat observations. The major advantage of a Bayesian approach is that, rather than assuming the correlation structure is fixed and known, it captures the joint uncertainty induced by the data in all parameters, including variances and covariances, and correctly propagates this through the decision or risk model. These features are illustrated with examples on emissions of dioxin congeners from solid waste incinerators.

17.
Estimates of the lifetime-absorbed daily dose (LADD) of acrylamide resulting from use of representative personal-care products containing polyacrylamides have been developed. All of the parameters that determine the amount of acrylamide absorbed by an individual vary from one individual to another. Moreover, for some parameters there is uncertainty as to which is the correct or representative value from a range of values. Consequently, the parameters used in the estimation of the LADD of acrylamide from usage of a particular product type (e.g., deodorant, makeup, etc.) were represented by distributions evaluated using Monte Carlo analyses (1-4). From these data, distributions of values for key parameters, such as the amount of acrylamide in polyacrylamide, absorption fraction, etc., were defined and used to provide a distribution of LADDs for each personal-care product. The estimated total acrylamide LADDs (across all products) at the median, mean, and 95th percentile of the distribution of individual LADD values were 4.7 × 10^-8, 2.3 × 10^-7, and 7.3 × 10^-7 mg/kg/day for females and 3.6 × 10^-8, 1.7 × 10^-7, and 5.4 × 10^-7 mg/kg/day for males. The ratios of the LADDs to the risk-specific dose corresponding to a target risk level of 1 × 10^-5, the acceptable risk level for this investigation, derived using approaches typically used by the FDA, the USEPA, and proposed for use by the European Union (EU), were also calculated. All ratios were well below 1, indicating that the extra lifetime cancer risks from the use of polyacrylamide-containing personal-care products, in the manner assumed in this assessment, are well below acceptable levels. Even if it were assumed that an individual used all of the products together, the estimated LADD would still provide a dose that was well below the acceptable risk levels.
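A hedged sketch of the Monte Carlo LADD calculation for a single hypothetical product; every distribution and parameter value below is an illustrative placeholder rather than an input from the assessment.

```python
# Hedged sketch: Monte Carlo estimate of a lifetime-absorbed daily dose (LADD).
import numpy as np

rng = np.random.default_rng(7)
n = 100000
amount_applied = rng.lognormal(np.log(1.0), 0.5, n)        # g product/day, assumed
acrylamide_frac = rng.uniform(1e-6, 5e-6, n)               # g acrylamide / g product, assumed
absorption_frac = rng.triangular(0.001, 0.005, 0.02, n)    # dermal absorption fraction, assumed
body_weight = rng.lognormal(np.log(65.0), 0.2, n)          # kg, assumed
exposure_years, lifetime_years = 40.0, 70.0                # assumed exposure duration / lifetime

ladd = (amount_applied * acrylamide_frac * absorption_frac / body_weight
        * (exposure_years / lifetime_years) * 1000.0)       # convert g to mg -> mg/kg/day
print("median, mean, 95th percentile LADD (mg/kg/day):",
      np.percentile(ladd, 50), ladd.mean(), np.percentile(ladd, 95))
```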

18.
I use an analogy with the history of physical measurements, population and energy projections, and analyze the trends in several data sets to quantify the overconfidence of experts in the reliability of their uncertainty estimates. Data sets include (i) time trends in the sequential measurements of the same physical quantity; (ii) national population projections; and (iii) projections for the U.S. energy sector. Probabilities of large deviations of the true values are parametrized by an exponential distribution with the slope determined by the data. Statistics of past errors can be used in probabilistic risk assessment to hedge against unsuspected uncertainties and to include the possibility of human error in the framework of uncertainty analysis. By means of a sample Monte Carlo simulation of cancer risk caused by ingestion of benzene in soil, I demonstrate how the upper 95th percentiles of risk change when unsuspected uncertainties are included. I recommend inflating the estimated uncertainties by default safety factors determined from the relevant historical data sets.

19.
The aging domestic oil production infrastructure represents a high risk to the environment because of the type of fluids being handled (oil and brine) and the potential for accidental release of these fluids into sensitive ecosystems. Currently, there is no quantitative risk model directly applicable to onshore oil exploration and production (E&P) facilities. We report on a probabilistic reliability model created for onshore E&P facilities. Reliability theory, failure modes and effects analysis (FMEA), and event trees were used to develop the model estimates of the failure probability of typical oil production equipment. Monte Carlo simulation was used to translate uncertainty in input parameter values to uncertainty in the model output. The predicted failure rates were calibrated to available failure rate information by adjusting probability density function parameters used as random variates in the Monte Carlo simulations. The mean and standard deviation of the normal variate distributions from which the Weibull distribution characteristic life was chosen were used as adjustable parameters in the model calibration. The model was applied to oil production leases in the Tallgrass Prairie Preserve, Oklahoma. We present the estimated failure probability due to the combination of the most significant failure modes associated with each type of equipment (pumps, tanks, and pipes). The results show that the estimated probability of failure for tanks is about the same as that for pipes, but that pumps have a much lower failure probability. The model can provide the equipment reliability information needed for proactive risk management at the lease level by supplying quantitative information on which to base allocation of maintenance resources to high-risk equipment, minimizing both lost production and ecosystem damage.
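One element of such a model, propagating uncertainty in a Weibull characteristic life through to an annual failure probability, can be sketched as follows; the shape parameter and the normal hyper-distribution values are assumptions, not the calibrated lease-level inputs.

```python
# Hedged sketch: Monte Carlo propagation of uncertainty in a Weibull characteristic life.
import numpy as np

rng = np.random.default_rng(8)
n = 50000
beta = 2.0                                   # assumed Weibull shape for one equipment type
# Characteristic life (years) drawn from an assumed normal "hyper-distribution"
eta = rng.normal(loc=15.0, scale=3.0, size=n)
eta = np.clip(eta, 1e-3, None)               # guard against nonphysical draws

t = 1.0                                      # one-year mission time
p_fail = 1.0 - np.exp(-(t / eta) ** beta)    # Weibull CDF at t for each draw
print(f"annual failure probability: mean = {p_fail.mean():.4f}, "
      f"90% interval = {np.percentile(p_fail, [5, 95])}")
```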

20.
The appearance of measurement error in exposure and risk factor data potentially affects any inferences regarding variability and uncertainty because the distribution representing the observed data set deviates from the distribution that represents an error-free data set. A methodology for improving the characterization of variability and uncertainty with known measurement errors in data is demonstrated in this article based on an observed data set, known measurement error, and a measurement-error model. A practical method for constructing an error-free data set is presented, and a numerical method based upon bootstrap pairs, incorporating two-dimensional Monte Carlo simulation, is introduced to address uncertainty arising from measurement error in selected statistics. When measurement error is a large source of uncertainty, substantial differences between the distribution representing variability of the observed data set and the distribution representing variability of the error-free data set will occur. Furthermore, the shape and range of the probability bands for uncertainty differ between the observed and error-free data sets. Failure to separately characterize contributions from random sampling error and measurement error will lead to bias in the variability and uncertainty estimates. However, a key finding is that total uncertainty in the mean can be properly quantified even if measurement and random sampling errors cannot be separated. An empirical case study is used to illustrate the application of the methodology.
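A loose sketch of the two ideas above: constructing an approximate error-free data set under an additive normal measurement-error model with known error standard deviation, and then bootstrapping the mean with simulated measurement error re-added. This is not the article's exact procedure, and the data and error SD are invented for illustration.

```python
# Hedged sketch: approximate error-free data set plus a two-layer bootstrap of the mean.
import numpy as np

rng = np.random.default_rng(9)
observed = rng.lognormal(2.0, 0.6, size=30) + rng.normal(0, 1.0, size=30)  # "true" + error
sd_error = 1.0                                         # known measurement-error SD (assumed)

# Shrink observed deviations so the corrected variance excludes the measurement-error variance.
s_obs = observed.std(ddof=1)
shrink = np.sqrt(max(s_obs**2 - sd_error**2, 0.0)) / s_obs
error_free = observed.mean() + shrink * (observed - observed.mean())

# Outer layer: random sampling error (bootstrap); inner layer: simulated measurement error.
boot_means = []
for _ in range(2000):
    resample = rng.choice(error_free, error_free.size, replace=True)
    perturbed = resample + rng.normal(0, sd_error, resample.size)
    boot_means.append(perturbed.mean())
print("95% uncertainty interval for the mean:", np.percentile(boot_means, [2.5, 97.5]))
```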

