Similar Documents
 20 similar documents found (search time: 31 ms)
1.
Steven M. Quiring, Risk Analysis, 2011, 31(12): 1897-1906
This article compares statistical methods for modeling power outage durations during hurricanes and examines the predictive accuracy of these methods. Being able to make accurate predictions of power outage durations is valuable because the information can be used by utility companies to plan their restoration efforts more efficiently. This information can also help inform customers and public agencies of the expected outage times, enabling better collective response planning, and coordination of restoration efforts for other critical infrastructures that depend on electricity. In the long run, outage duration estimates for future storm scenarios may help utilities and public agencies better allocate risk management resources to balance the disruption from hurricanes with the cost of hardening power systems. We compare the out-of-sample predictive accuracy of five distinct statistical models for estimating power outage duration times caused by Hurricane Ivan in 2004. The methods compared include both regression models (accelerated failure time (AFT) and Cox proportional hazard (Cox PH) models) and data mining techniques (regression trees, Bayesian additive regression trees (BART), and multivariate adaptive regression splines (MARS)). We then validate our models against two other hurricanes. Our results indicate that BART yields the best prediction accuracy and that it is possible to predict outage durations with reasonable accuracy.
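As an illustration of the survival-regression side of this comparison, the sketch below fits a Cox PH model and a Weibull AFT model to synthetic outage durations and scores them out of sample with a concordance index. It is a minimal sketch, assuming the `lifelines` package and hypothetical covariates (wind_speed, tree_density); it is not the authors' implementation.

```python
# Minimal sketch: Cox PH vs. Weibull AFT on synthetic outage durations.
# Assumes the `lifelines` package; covariates and data are hypothetical.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter, WeibullAFTFitter
from lifelines.utils import concordance_index

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "wind_speed": rng.uniform(20, 60, n),
    "tree_density": rng.uniform(0, 1, n),
})
# Synthetic outage durations (hours) rising with both covariates.
df["duration"] = rng.exponential(2 + 0.5 * df.wind_speed + 10 * df.tree_density)
df["restored"] = 1  # all outages eventually repaired (no censoring here)

train, test = df.iloc[:400], df.iloc[400:]
covariates = ["wind_speed", "tree_density"]
for model in (CoxPHFitter(), WeibullAFTFitter()):
    model.fit(train, duration_col="duration", event_col="restored")
    cidx = concordance_index(test["duration"],
                             model.predict_median(test[covariates]),
                             test["restored"])
    print(type(model).__name__, "out-of-sample concordance:", round(cidx, 3))
```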

2.
Choice models and neural networks are two approaches used in modeling selection decisions. Defining model performance as the out-of-sample prediction power of a model, we test two hypotheses: (i) choice models and neural network models are equal in performance, and (ii) hybrid models consisting of a combination of choice and neural network models perform better than each stand-alone model. We perform statistical tests for two classes of linear and nonlinear hybrid models and compute the empirical integrated rank (EIR) indices to compare the overall performances of the models. We test the above hypotheses by using data for various brand and store choices for three consumer products. Extensive jackknifing and out-of-sample tests for four different model specifications are applied to increase the external validity of the results. Our results show that using neural networks has a higher probability of resulting in a better performance. Our findings also indicate that hybrid models outperform stand-alone models, in that using hybrid models guarantees overall results equal to or better than those of the two stand-alone models. The improvement is particularly significant in cases where neither of the two stand-alone models is very accurate in prediction, indicating that the proposed hybrid models may capture aspects of predictive accuracy that neither stand-alone model captures on its own. Our results are particularly important in brand management and customer relationship management, indicating that multiple technologies and mixtures of technologies may yield more accurate and reliable outcomes than individual ones.
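One way to build the linear hybrid described above is stacking: fit the choice model (a logit) and a small neural network separately, then let a second-stage logit combine their predicted probabilities. The sketch below is a minimal illustration on synthetic brand-choice data; the column meanings are hypothetical and this is not the paper's exact specification.

```python
# Minimal sketch: a linear hybrid (stack) of a logit choice model and an MLP.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 4))          # price, promotion, loyalty, display
y = (X @ [1.5, -1.0, 2.0, 0.5] + 0.5 * X[:, 0] * X[:, 2]
     + rng.logistic(size=1000)) > 0     # binary brand choice

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
logit = LogisticRegression().fit(X_tr, y_tr)
net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000,
                    random_state=0).fit(X_tr, y_tr)

# Linear hybrid: stack the two models' probabilities in a second-stage logit.
stack_tr = np.column_stack([logit.predict_proba(X_tr)[:, 1],
                            net.predict_proba(X_tr)[:, 1]])
hybrid = LogisticRegression().fit(stack_tr, y_tr)
stack_te = np.column_stack([logit.predict_proba(X_te)[:, 1],
                            net.predict_proba(X_te)[:, 1]])
for name, pred in [("logit", logit.predict(X_te)),
                   ("neural net", net.predict(X_te)),
                   ("hybrid", hybrid.predict(stack_te))]:
    print(name, "out-of-sample accuracy:", round(accuracy_score(y_te, pred), 3))
```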

3.
Electric power is a critical infrastructure service after hurricanes, and rapid restoration of electric power is important in order to minimize losses in the impacted areas. However, rapid restoration of electric power after a hurricane depends on obtaining the necessary resources, primarily repair crews and materials, before the hurricane makes landfall and then appropriately deploying these resources as soon as possible after the hurricane. This, in turn, depends on having sound estimates of both the overall severity of the storm and the relative risk of power outages in different areas. Past studies have developed statistical, regression-based approaches for estimating the number of power outages in advance of an approaching hurricane. However, these approaches have either not been applicable for future events or have had lower predictive accuracy than desired. This article shows that a different type of regression model, a generalized additive model (GAM), can outperform the types of models used previously. This is done by developing and validating a GAM based on power outage data during past hurricanes in the Gulf Coast region and comparing the results from this model to the previously used generalized linear models.
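The GLM-versus-GAM contrast can be sketched compactly: fit a Poisson GLM and a Poisson GAM to the same overdispersed-looking nonlinear count data and compare AIC. This is a minimal sketch assuming the `pygam` package for the GAM; the covariate names (gust speed, a second environmental variable) are hypothetical.

```python
# Minimal sketch: Poisson GLM vs. Poisson GAM on synthetic outage counts.
# Assumes `statsmodels` and `pygam`; data and covariates are hypothetical.
import numpy as np
import statsmodels.api as sm
from pygam import PoissonGAM, s

rng = np.random.default_rng(2)
n = 800
X = np.column_stack([rng.uniform(10, 70, n),   # gust speed
                     rng.uniform(0, 1, n)])    # e.g., soil moisture
mu = np.exp(0.0002 * X[:, 0] ** 2 + X[:, 1])   # nonlinear in gust speed
y = rng.poisson(mu)

glm = sm.GLM(y, sm.add_constant(X), family=sm.families.Poisson()).fit()
gam = PoissonGAM(s(0) + s(1)).fit(X, y)        # smooth term per covariate
print("GLM AIC:", round(glm.aic, 1))
print("GAM AIC:", round(gam.statistics_["AIC"], 1))
```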

4.
This article presents a regression-tree-based meta-analysis of rodent pulmonary toxicity studies of uncoated, nonfunctionalized carbon nanotube (CNT) exposure. The resulting analysis provides quantitative estimates of the contribution of CNT attributes (impurities, physical dimensions, and aggregation) to pulmonary toxicity indicators in bronchoalveolar lavage fluid: neutrophil and macrophage count, and lactate dehydrogenase and total protein concentrations. The method employs classification and regression tree (CART) models, techniques that are relatively insensitive to data defects that impair other types of regression analysis: high dimensionality, nonlinearity, correlated variables, and significant quantities of missing values. Three types of analysis are presented: the regression tree (RT), the random forest (RF), and a random-forest-based dose-response model. The RT shows the best single model supported by all the data and typically contains a small number of variables. The RF shows how much variance reduction is associated with every variable in the data set. The dose-response model is used to isolate the effects of CNT attributes from the CNT dose, showing the shift in the dose-response caused by the attribute across the measured range of CNT doses. It was found that the CNT attributes that contributed the most to pulmonary toxicity were metallic impurities (cobalt significantly increased observed toxicity, while other impurities had mixed effects), CNT length (negatively correlated with most toxicity indicators), CNT diameter (significantly positively associated with toxicity), and aggregate size (negatively correlated with cell damage indicators and positively correlated with immune response indicators). Increasing CNT N2-BET-specific surface area decreased toxicity indicators.
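The random-forest side of this analysis can be sketched on synthetic data: feature importances approximate the per-variable variance reduction, and a partial-dependence query isolates one attribute's effect from dose. All variable names (dose, cnt_length, cobalt) and the response below are hypothetical stand-ins, not the study's data.

```python
# Minimal sketch: random-forest dose-response with a partial dependence query.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import partial_dependence

rng = np.random.default_rng(3)
n = 600
X = np.column_stack([
    rng.uniform(0, 10, n),   # dose (mg/kg)
    rng.uniform(1, 50, n),   # cnt_length (um)
    rng.integers(0, 2, n),   # cobalt impurity present (0/1)
])
# Synthetic toxicity indicator: rises with dose and cobalt, falls with length.
y = 2 * X[:, 0] + 5 * X[:, 2] - 0.05 * X[:, 1] + rng.normal(0, 1, n)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
print("variance reduction per variable:", rf.feature_importances_.round(2))
# Partial dependence on the cobalt flag, averaging over dose and length.
pd_result = partial_dependence(rf, X, features=[2])
print("mean response without vs. with cobalt:", pd_result["average"][0].round(2))
```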

5.
This study presents a tree-based logistic regression approach to assessing work zone casualty risk, which is defined as the probability of a vehicle occupant being killed or injured in a work zone crash. First, a decision tree approach is employed to determine the tree structure and interacting factors. Based on the Michigan M-94/I-94/I-94BL/I-94BR highway work zone crash data, an optimal tree comprising four leaf nodes is determined and the interacting factors are found to be airbag, occupant identity (i.e., driver, passenger), and gender. The data are then split into four groups according to the tree structure. Finally, the logistic regression analysis is conducted separately for each group. The results show that the proposed approach outperforms the pure decision tree model because the former has the capability of examining the marginal effects of risk factors. Compared with the pure logistic regression method, the proposed approach avoids the variable interaction effects and thus significantly improves the prediction accuracy.
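The tree-then-logit idea is easy to sketch: a shallow tree defines the groups, and a separate logistic regression is fit inside each leaf. The covariates below (airbag, is_driver, is_male, speed) and the data-generating process are hypothetical, not the Michigan data.

```python
# Minimal sketch: a four-leaf tree defines groups; a logit is fit per leaf.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n = 2000
X = np.column_stack([rng.integers(0, 2, n),      # airbag deployed
                     rng.integers(0, 2, n),      # occupant is driver
                     rng.integers(0, 2, n),      # occupant is male
                     rng.uniform(20, 80, n)])    # vehicle speed
logit_p = -2 + 0.04 * X[:, 3] - 1.5 * X[:, 0] + 0.5 * X[:, 1]
y = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)  # casualty flag

# Step 1: a four-leaf tree captures the interacting factors.
tree = DecisionTreeClassifier(max_leaf_nodes=4, random_state=0).fit(X, y)
leaves = tree.apply(X)

# Step 2: a logistic regression per leaf gives marginal effects within groups.
for leaf in np.unique(leaves):
    mask = leaves == leaf
    if len(np.unique(y[mask])) < 2:
        continue  # skip leaves containing a single outcome class
    lr = LogisticRegression().fit(X[mask], y[mask])
    print(f"leaf {leaf}: n={mask.sum()}, coefs={lr.coef_.round(2)}")
```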

6.
Threshold models have a wide variety of applications in economics. Direct applications include models of separating and multiple equilibria. Other applications include empirical sample splitting when the sample split is based on a continuously distributed variable such as firm size. In addition, threshold models may be used as a parsimonious strategy for nonparametric function estimation. For example, the threshold autoregressive model (TAR) is popular in the nonlinear time series literature. Threshold models also emerge as special cases of more complex statistical frameworks, such as mixture models, switching models, Markov switching models, and smooth transition threshold models. It may be important to understand the statistical properties of threshold models as a preliminary step in the development of statistical tools to handle these more complicated structures. Despite the large number of potential applications, the statistical theory of threshold estimation is undeveloped. It is known that threshold estimates are super-consistent, but a distribution theory useful for testing and inference has yet to be provided. This paper develops a statistical theory for threshold estimation in the regression context. We allow for either cross-section or time series observations. Least squares estimation of the regression parameters is considered. An asymptotic distribution theory for the regression estimates (the threshold and the regression slopes) is developed. It is found that the distribution of the threshold estimate is nonstandard. A method to construct asymptotic confidence intervals is developed by inverting the likelihood ratio statistic. It is shown that this yields asymptotically conservative confidence regions. Monte Carlo simulations are presented to assess the accuracy of the asymptotic approximations. The empirical relevance of the theory is illustrated through an application to the multiple equilibria growth model of Durlauf and Johnson (1995).
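The two steps of the paper's procedure, least-squares estimation of the threshold and likelihood-ratio inversion for a confidence set, can be sketched directly. The sketch uses the asymptotic 95% critical value of approximately 7.35 from this theory; the data-generating process is synthetic.

```python
# Minimal sketch: grid-search threshold estimation and LR-inverted CI.
import numpy as np

rng = np.random.default_rng(5)
n = 400
x = rng.uniform(0, 1, n)             # threshold variable (e.g., firm size)
z = rng.normal(size=n)               # regressor
y = np.where(x <= 0.5, 1.0 * z, 2.5 * z) + rng.normal(0, 1, n)

def sse_at(gamma):
    """Sum of squared errors from separate OLS fits on each side of gamma."""
    sse = 0.0
    for m in (x <= gamma, x > gamma):
        b = np.polyfit(z[m], y[m], 1)            # slope and intercept
        sse += np.sum((y[m] - np.polyval(b, z[m])) ** 2)
    return sse

grid = np.quantile(x, np.linspace(0.1, 0.9, 200))
sse = np.array([sse_at(g) for g in grid])
gamma_hat = grid[sse.argmin()]
# LR_n(gamma) = n * (S(gamma) - S(gamma_hat)) / S(gamma_hat); 95% critical
# value of the pivotal limit distribution is about 7.35.
lr = n * (sse - sse.min()) / sse.min()
ci = grid[lr <= 7.35]
print("threshold estimate:", round(gamma_hat, 3),
      " 95% CI:", (round(ci.min(), 3), round(ci.max(), 3)))
```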

7.
In this article, we develop statistical models to predict the number and geographic distribution of fires caused by earthquake ground motion and tsunami inundation in Japan. Using new, uniquely large, and consistent data sets from the 2011 Tōhoku earthquake and tsunami, we fitted three types of models: generalized linear models (GLMs), generalized additive models (GAMs), and boosted regression trees (BRTs). This is the first time the latter two have been used in this application. A simple conceptual framework guided identification of candidate covariates. Models were then compared based on their out-of-sample predictive power, goodness of fit to the data, ease of implementation, and relative importance of the framework concepts. For the ground motion data set, we recommend a Poisson GAM; for the tsunami data set, a negative binomial (NB) GLM or NB GAM. The best models generate out-of-sample predictions of the total number of ignitions in the region that are within one or two of the actual total. Prefecture-level prediction errors average approximately three. All models demonstrate predictive power far superior to that of four models from the literature that were also tested. A nonlinear relationship is apparent between ignitions and ground motion, so for GLMs, which assume a linear response-covariate relationship, instrumental intensity was the preferred ground motion covariate because it captures part of that nonlinearity. Measures of commercial exposure were preferred over measures of residential exposure for both ground motion and tsunami ignition models. This may vary in other regions, but nevertheless highlights the value of testing alternative measures for each concept. Models with the best predictive power included two or three covariates.
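A pocket version of the count-model comparison: fit Poisson and negative binomial GLMs to overdispersed ignition counts and compare AIC. The sketch below uses statsmodels on synthetic data; the covariates (instrumental intensity, commercial exposure) are hypothetical stand-ins for the paper's variables.

```python
# Minimal sketch: Poisson vs. negative binomial GLM for ignition counts.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
m = 300
X = np.column_stack([rng.uniform(4, 9, m),        # instrumental intensity
                     rng.lognormal(0, 1, m)])     # commercial exposure
mu = np.exp(-4 + 0.6 * X[:, 0] + 0.2 * np.log1p(X[:, 1]))
y = rng.negative_binomial(2, 2 / (2 + mu))        # overdispersed ignitions

Xc = sm.add_constant(X)
poisson = sm.GLM(y, Xc, family=sm.families.Poisson()).fit()
negbin = sm.GLM(y, Xc, family=sm.families.NegativeBinomial(alpha=0.5)).fit()
print("Poisson AIC:", round(poisson.aic, 1), " NB AIC:", round(negbin.aic, 1))
```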

8.
Mitchell J. Small, Risk Analysis, 2011, 31(10): 1561-1575
A methodology is presented for assessing the information value of an additional dosage experiment in existing bioassay studies. The analysis demonstrates the potential reduction in the uncertainty of toxicity metrics derived from expanded studies, providing insights for future studies. Bayesian methods are used to fit alternative dose-response models using Markov chain Monte Carlo (MCMC) simulation for parameter estimation and Bayesian model averaging (BMA) is used to compare and combine the alternative models. BMA predictions for benchmark dose (BMD) are developed, with uncertainty in these predictions used to derive the lower bound BMDL. The MCMC and BMA results provide a basis for a subsequent Monte Carlo analysis that backcasts the dosage where an additional test group would have been most beneficial in reducing the uncertainty in the BMD prediction, along with the magnitude of the expected uncertainty reduction. Uncertainty reductions are measured in terms of reduced interval widths of predicted BMD values and increases in BMDL values that occur as a result of this reduced uncertainty. The methodology is illustrated using two existing data sets for TCDD carcinogenicity, fitted with two alternative dose-response models (logistic and quantal-linear). The example shows that an additional dose at a relatively high value would have been most effective for reducing the uncertainty in BMA BMD estimates, with predicted reductions in the widths of uncertainty intervals of approximately 30%, and expected increases in BMDL values of 5-10%. The results demonstrate that dose selection for studies that subsequently inform dose-response models can benefit from consideration of how these models will be fit, combined, and interpreted.
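The model-averaging step can be compressed into a sketch: fit the two dose-response forms (logistic and quantal-linear) by maximum likelihood, weight them with a BIC approximation to posterior model probabilities, and report an averaged BMD. This is a rough stand-in for the paper's MCMC-based BMA, and the bioassay numbers below are invented.

```python
# Minimal sketch: BIC-weighted averaging of two dose-response models + BMD.
import numpy as np
from scipy.optimize import minimize

dose = np.array([0.0, 1.0, 3.0, 10.0])       # hypothetical design
n_animals = np.array([50, 50, 50, 50])
n_tumors = np.array([2, 5, 12, 30])          # hypothetical responses

def nll(p_fn, theta):
    p = np.clip(p_fn(dose, theta), 1e-9, 1 - 1e-9)
    return -np.sum(n_tumors * np.log(p) + (n_animals - n_tumors) * np.log(1 - p))

logistic = lambda d, t: 1 / (1 + np.exp(-(t[0] + t[1] * d)))
quantal = lambda d, t: t[0] + (1 - t[0]) * (1 - np.exp(-t[1] * d))

fits, bics = {}, {}
for name, fn, x0 in [("logistic", logistic, [-3, 0.3]),
                     ("quantal-linear", quantal, [0.05, 0.1])]:
    res = minimize(lambda t: nll(fn, t), x0, method="Nelder-Mead")
    fits[name] = (fn, res.x)
    bics[name] = 2 * res.fun + 2 * np.log(len(dose))  # 2 params, 4 groups

w = np.exp(-0.5 * (np.array(list(bics.values())) - min(bics.values())))
w /= w.sum()  # posterior model weights (BIC approximation)

def bmd(fn, t, bmr=0.1):
    """Dose giving 10% extra risk over background, read off a grid."""
    grid = np.linspace(0, 10, 2001)
    p0 = fn(np.array([0.0]), t)[0]
    extra = (fn(grid, t) - p0) / (1 - p0)
    return grid[min(np.searchsorted(extra, bmr), len(grid) - 1)]

bmds = np.array([bmd(fn, t) for fn, t in fits.values()])
print("model weights:", dict(zip(bics, w.round(2))))
print("BMA BMD estimate:", round(float(w @ bmds), 2))
```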

9.
In this article, we discuss an outage-forecasting model that we have developed. This model uses very few input variables to estimate hurricane-induced outages prior to landfall with great predictive accuracy. We also show the results for a series of simpler models that use only publicly available data and can still estimate outages with reasonable accuracy. The intended users of these models are emergency response planners within power utilities and related government agencies. We developed our models based on the method of random forest, using data from a power distribution system serving two states in the Gulf Coast region of the United States. We also show that estimates of system reliability based on wind speed alone are not sufficient for adequately capturing the reliability of system components. We demonstrate that a multivariate approach can produce more accurate power outage predictions.
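A minimal random-forest sketch in the spirit of this model, with a handful of inputs and synthetic outage counts; the predictors (max gust, rain total, customers served) are hypothetical stand-ins for the paper's variables.

```python
# Minimal sketch: random-forest outage model with few inputs, scored by CV.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
n = 1000
X = np.column_stack([rng.uniform(20, 70, n),       # max gust (m/s)
                     rng.uniform(0, 300, n),       # rain total (mm)
                     rng.uniform(1e3, 1e5, n)])    # customers served
# Synthetic outages: multivariate, strongly nonlinear in wind speed.
y = 0.001 * X[:, 2] * (X[:, 0] / 70) ** 3 + rng.normal(0, 5, n)

rf = RandomForestRegressor(n_estimators=300, random_state=0)
scores = cross_val_score(rf, X, y, cv=5, scoring="neg_mean_absolute_error")
print("cross-validated MAE:", round(-scores.mean(), 1))
```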

10.
Projecting losses associated with hurricanes is a complex and difficult undertaking that is fraught with uncertainties. Hurricane Charley, which struck southwest Florida on August 13, 2004, illustrates the uncertainty of forecasting damages from these storms. Due to shifts in the track and the rapid intensification of the storm, real-time estimates grew from $2-3 billion in losses late on the 12th to a peak of $50 billion for a brief time as the storm appeared to be headed for the Tampa Bay area. The storm struck the resort areas of Charlotte Harbor and moved across the densely populated central part of the state, with early poststorm estimates in the $28-31 billion range, and final estimates converging at $15 billion as the actual intensity at landfall became apparent. The Florida Commission on Hurricane Loss Projection Methodology (FCHLPM) has a great appreciation for the role of computer models in projecting losses from hurricanes. The FCHLPM contracts with a professional team to perform onsite (confidential) audits of computer models developed by several different companies in the United States that seek to have their models approved for use in insurance rate filings in Florida. The team's members represent the fields of actuarial science, computer science, meteorology, statistics, and wind and structural engineering. An important part of the auditing process requires uncertainty and sensitivity analyses to be performed with the applicant's proprietary model. To influence future such analyses, an uncertainty and sensitivity analysis has been completed for loss projections arising from use of a sophisticated computer model based on the Holland wind field. Sensitivity analyses presented in this article utilize standardized regression coefficients to quantify the contribution of the computer input variables to the magnitude of the wind speed.
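The standardized regression coefficient (SRC) measure used in this sensitivity analysis takes only a few lines: z-score the inputs and the output, and the resulting OLS coefficients rank each input's contribution. The inputs below are hypothetical Holland-style wind-field parameters, not the audited model's actual variables.

```python
# Minimal sketch: standardized regression coefficients as sensitivity indices.
import numpy as np

rng = np.random.default_rng(8)
n = 5000
central_pressure = rng.normal(950, 15, n)     # hypothetical inputs
radius_max_winds = rng.normal(40, 10, n)
holland_b = rng.normal(1.3, 0.2, n)
wind_speed = (0.9 * (1013 - central_pressure) + 0.3 * radius_max_winds
              + 8.0 * holland_b + rng.normal(0, 3, n))

X = np.column_stack([central_pressure, radius_max_winds, holland_b])
Xz = (X - X.mean(0)) / X.std(0)               # z-scored inputs
yz = (wind_speed - wind_speed.mean()) / wind_speed.std()
src, *_ = np.linalg.lstsq(Xz, yz, rcond=None)  # OLS on standardized data
for name, c in zip(["central_pressure", "radius_max_winds", "holland_b"], src):
    print(f"SRC({name}) = {c:+.3f}")
```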

11.
Incident data about disruptions to the electric power grid provide useful information that can be used as inputs into risk management policies in the energy sector for disruptions from a variety of origins, including terrorist attacks. This article uses data from the Disturbance Analysis Working Group (DAWG) database, which is maintained by the North American Electric Reliability Council (NERC), to look at incidents over time in the United States and Canada for the period 1990-2004. Negative binomial regression, logistic regression, and weighted least squares regression are used to gain a better understanding of how these disturbances varied over time and by season during this period, and to analyze how characteristics such as number of customers lost and outage duration are related to different characteristics of the outages. The results of the models can be used as inputs to construct various scenarios to estimate potential outcomes of electric power outages, encompassing the risks, consequences, and costs of such outages.
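A compact sketch of a negative binomial trend-and-season regression of the kind described, using the statsmodels formula interface on synthetic incident counts; the summer-peak structure is an assumption for illustration, not a finding from the DAWG data.

```python
# Minimal sketch: NB regression of incident counts on a trend and season.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(9)
years = np.arange(1990, 2005)
df = pd.DataFrame([(y, s) for y in years for s in range(4)],
                  columns=["year", "season"])
mu = np.exp(0.04 * (df.year - 1990) + 0.3 * (df.season == 2))  # summer peak
df["incidents"] = rng.poisson(mu * rng.gamma(2, 0.5, len(df)))  # overdispersed

model = smf.negativebinomial("incidents ~ I(year - 1990) + C(season)",
                             data=df).fit()
print(model.params.round(3))
```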

12.
Count data are pervasive in many areas of risk analysis; deaths, adverse health outcomes, infrastructure system failures, and traffic accidents are all recorded as count events, for example. Risk analysts often wish to estimate the probability distribution for the number of discrete events as part of doing a risk assessment. Traditional count data regression models of the type often used in risk assessment for this problem suffer from limitations due to the assumed variance structure. A more flexible model based on the Conway-Maxwell Poisson (COM-Poisson) distribution was recently proposed, a model that has the potential to overcome the limitations of the traditional model. However, the statistical performance of this new model has not yet been fully characterized. This article assesses the performance of a maximum likelihood estimation (MLE) method for fitting the COM-Poisson generalized linear model (GLM). The objectives of this article are to (1) characterize the parameter estimation accuracy of the MLE implementation of the COM-Poisson GLM, and (2) estimate the prediction accuracy of the COM-Poisson GLM using simulated data sets. The results of the study indicate that the COM-Poisson GLM is flexible enough to model under-, equi-, and overdispersed data sets with different sample mean values. The results also show that the COM-Poisson GLM yields accurate parameter estimates. The COM-Poisson GLM provides a promising and flexible approach for performing count data regression.
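The COM-Poisson likelihood is simple enough to sketch directly: the pmf is proportional to lambda^y / (y!)^nu, the normalizing constant is evaluated by a truncated series, and an intercept-only model is fit by maximum likelihood. This is an illustrative toy, not the article's GLM implementation.

```python
# Minimal sketch: intercept-only COM-Poisson fit by maximum likelihood.
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

def com_poisson_logZ(log_lam, nu, max_y=200):
    """log of Z(lambda, nu) = sum_j lambda^j / (j!)^nu via log-sum-exp."""
    j = np.arange(max_y + 1)
    terms = j * log_lam - nu * gammaln(j + 1)
    m = terms.max()
    return m + np.log(np.exp(terms - m).sum())

def nll(theta, y):
    log_lam, log_nu = theta
    nu = np.exp(log_nu)
    return -(np.sum(y * log_lam - nu * gammaln(y + 1))
             - len(y) * com_poisson_logZ(log_lam, nu))

rng = np.random.default_rng(10)
y = rng.poisson(3.0, size=500)                  # equidispersed test data
res = minimize(nll, x0=[1.0, 0.0], args=(y,), method="Nelder-Mead")
log_lam, log_nu = res.x
print("lambda:", round(np.exp(log_lam), 2), " nu:", round(np.exp(log_nu), 2))
# nu near 1 recovers the Poisson; nu < 1 implies over-, nu > 1 underdispersion.
```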

13.
Ensemble species distribution models combine the strengths of several species-environment matching models, while minimizing the weaknesses of any one model. Ensemble models may be particularly useful in risk analysis of recently arrived, harmful invasive species because species may not yet have spread to all suitable habitats, leaving species-environment relationships difficult to determine. We tested five individual models (logistic regression, boosted regression trees, random forest, multivariate adaptive regression splines (MARS), and the maximum entropy model, or Maxent) and ensemble modeling for selected nonnative plant species in Yellowstone and Grand Teton National Parks, Wyoming; Sequoia and Kings Canyon National Parks, California; and areas of interior Alaska. The models are based on field data provided by the park staffs, combined with topographic, climatic, and vegetation predictors derived from satellite data. For the four invasive plant species tested, ensemble models were the only models that ranked in the top three models for both field validation and test data. Ensemble models may be more robust than individual species-environment matching models for risk analysis.
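The ensemble idea reduces to fitting several presence/absence models and averaging their predicted probabilities. The sketch below uses three of the five model types (MARS and Maxent lack standard scikit-learn implementations) on synthetic predictors standing in for topographic and climate layers.

```python
# Minimal sketch: unweighted probability ensemble of three SDM-style models.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(11)
X = rng.normal(size=(1500, 5))         # elevation, temp, precip, ndvi, slope
y = (X[:, 0] - X[:, 1] ** 2 + 0.5 * X[:, 2] + rng.normal(0, 1, 1500)) > 0

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
models = [LogisticRegression(max_iter=1000),
          GradientBoostingClassifier(random_state=0),   # BRT stand-in
          RandomForestClassifier(random_state=0)]
probs = []
for m in models:
    p = m.fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
    probs.append(p)
    print(type(m).__name__, "AUC:", round(roc_auc_score(y_te, p), 3))
# Ensemble: simple mean of the member probabilities.
print("ensemble AUC:", round(roc_auc_score(y_te, np.mean(probs, axis=0)), 3))
```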

14.
Hurricanes frequently cause damage to electric power systems in the United States, leading to widespread and prolonged loss of electric service. Restoring service quickly requires the use of repair crews and materials that must be requested, at considerable cost, prior to the storm. U.S. utilities have struggled to strike a good balance between over- and underpreparation largely because of a lack of methods for rigorously estimating the impacts of an approaching hurricane on their systems. Previous work developed methods for estimating the risk of power outages and customer loss of power, with an outage defined as nontransitory activation of a protective device. In this article, we move beyond these previous approaches to directly estimate damage to the electric power system. Our approach is based on damage data from past storms together with regression and data mining techniques to estimate the number of utility poles that will need to be replaced. Because restoration times and resource needs are more closely tied to the number of poles and transformers that need to be replaced than to the number of outages, this pole-based assessment provides a much stronger basis for prestorm planning by utilities. Our results show that damage to poles during hurricanes can be assessed accurately, provided that adequate past damage data are available. However, the availability of data can be, and currently often is, the limiting factor in developing these types of models in practice. Opportunities for further enhancing the damage data recorded during hurricanes are also discussed.
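A count-oriented damage model of this flavor can be sketched with Poisson-loss gradient boosting, used here as a stand-in for the paper's regression and data mining techniques; the covariates (wind speed, pole age, pole inventory) and data are hypothetical.

```python
# Minimal sketch: Poisson-loss gradient boosting for pole-failure counts.
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(12)
n = 2000
X = np.column_stack([rng.uniform(20, 70, n),    # max wind speed
                     rng.uniform(0, 60, n),     # pole age (years)
                     rng.uniform(10, 500, n)])  # poles in grid cell
mu = X[:, 2] * 0.002 * (X[:, 0] / 70) ** 4 * (1 + X[:, 1] / 60)
y = rng.poisson(mu)                             # poles needing replacement

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
gbm = HistGradientBoostingRegressor(loss="poisson").fit(X_tr, y_tr)
print("MAE (poles per cell):",
      round(mean_absolute_error(y_te, gbm.predict(X_te)), 2))
```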

15.
This paper examines the abilities of learning models to describe subject behavior in experiments. A new experiment involving multistage asymmetric-information games is conducted, and the experimental data are compared with the predictions of Nash equilibrium and two types of learning model: a reinforcement-based model similar to that used by Roth and Erev (1995), and belief-based models similar to the 'cautious fictitious play' of Fudenberg and Levine (1995, 1998). These models make qualitatively similar predictions: cycling around the Nash equilibrium is much more apparent than movement toward it. While subject behavior is not adequately described by Nash equilibrium, it is consistent with the qualitative predictions of the learning models. We examine several criteria for quantitatively comparing the predictions of alternative models. According to almost all of these criteria, both types of learning model outperform Nash equilibrium. According to some criteria, the reinforcement-based model performs better than any version of the belief-based model; according to others, there exist versions of the belief-based model that outperform the reinforcement-based model. The abilities of these models are further tested with respect to the results of other published experiments. The relative performance of the two learning models depends on the experiment, and varies according to which criterion of success is used. Again, both models perform better than equilibrium in most cases.
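The reinforcement-based model can be sketched with the classic Roth-Erev update, shown here in matching pennies rather than the paper's multistage asymmetric-information games; a game with a mixed equilibrium makes the cycling behavior easy to see. Parameter choices are hypothetical.

```python
# Minimal sketch: Roth-Erev reinforcement learning in matching pennies.
import numpy as np

rng = np.random.default_rng(13)
payoff_row = np.array([[1, -1], [-1, 1]])   # zero-sum; column gets the negative
q = [np.ones(2), np.ones(2)]                # initial action propensities

for t in range(5001):
    p = [qi / qi.sum() for qi in q]         # choice probabilities
    a = [rng.choice(2, p=pi) for pi in p]   # sampled actions
    u_row = payoff_row[a[0], a[1]]
    # Reinforce the chosen actions with shifted (nonnegative) payoffs.
    q[0][a[0]] += u_row + 1
    q[1][a[1]] += -u_row + 1
    if t % 1000 == 0:
        print(f"t={t}: P(heads) row={p[0][0]:.2f}, col={p[1][0]:.2f}")
```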

16.
Eren Demir, Decision Sciences, 2014, 45(5): 849-880
The number of emergency (or unplanned) readmissions in the United Kingdom National Health Service (NHS) has been rising for many years. This trend, which is possibly related to poor patient care, places financial pressures on hospitals and on national healthcare budgets. As a result, clinicians and key decision makers (e.g., managers and commissioners) are interested in predicting patients at high risk of readmission. Logistic regression is the most popular method of predicting patient-specific probabilities. However, these studies have produced conflicting results with poor prediction accuracies. We compared the predictive accuracy of logistic regression with that of regression trees for predicting emergency readmissions within 45 days after being discharged from hospital. We also examined the predictive ability of two other types of data-driven models: generalized additive models (GAMs) and multivariate adaptive regression splines (MARS). We used data on 963 patients readmitted to hospitals with chronic obstructive pulmonary disease and asthma. We used repeated split-sample validation: the data were divided into derivation and validation samples. Predictive models were estimated using the derivation sample and the predictive accuracy of the resultant model was assessed using a number of performance measures, such as the area under the receiver operating characteristic (ROC) curve, in the validation sample. This process was repeated 1,000 times: the initial data set was divided into derivation and validation samples 1,000 times, and the predictive accuracy of each method was assessed each time. The mean ROC curve area for the regression tree models in the 1,000 derivation samples was .928, while the mean ROC curve area of a logistic regression model was .924. Our study shows that the logistic regression model and regression trees had performance comparable to that of the more flexible, data-driven models such as GAMs and MARS. Given that the models have produced excellent predictive accuracies, this could be a valuable decision support tool for clinicians and key decision makers (healthcare managers, policy makers, etc.) for informed decision making in the management of diseases, which ultimately contributes to improved measures for hospital performance management.
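The repeated split-sample loop is straightforward to sketch: resplit, refit, and record the validation ROC area each time. The data below are synthetic, with 963 rows echoing the study's sample size; 200 repetitions are used here for speed where the study used 1,000.

```python
# Minimal sketch: repeated split-sample validation, logit vs. tree, mean AUC.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(14)
X = rng.normal(size=(963, 6))   # age, prior admissions, comorbidities, ...
y = (rng.random(963) < 1 / (1 + np.exp(-(X[:, 0] + 0.8 * X[:, 1] - 1)))).astype(int)

aucs = {"logit": [], "tree": []}
for rep in range(200):          # repeated derivation/validation splits
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=rep)
    for name, m in [("logit", LogisticRegression(max_iter=1000)),
                    ("tree", DecisionTreeClassifier(max_depth=4))]:
        m.fit(X_tr, y_tr)
        aucs[name].append(roc_auc_score(y_te, m.predict_proba(X_te)[:, 1]))
for name, vals in aucs.items():
    print(name, "mean validation AUC:", round(float(np.mean(vals)), 3))
```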

17.
In this paper, we present a comparative analysis of the forecasting accuracy of univariate and multivariate linear models that incorporate fundamental accounting variables (i.e., inventory, accounts receivable, and so on) with the forecast accuracy of neural network models. Unique to this study is the focus of our comparison on the multivariate models to examine whether the neural network models incorporating the fundamental accounting variables can generate more accurate forecasts of future earnings than the models assuming a linear combination of these same variables. We investigate four types of models: univariate-linear, multivariate-linear, univariate-neural network, and multivariate-neural network, using a sample of 283 firms spanning 41 industries. This study shows that the application of the neural network approach incorporating fundamental accounting variables results in forecasts that are more accurate than those of linear forecasting models. The results also reveal limitations in the forecasting capacity of investors in the security market when compared to neural network models.
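The four model specifications (univariate/multivariate crossed with linear/neural network) can be sketched on synthetic firm data; the fundamental variables and the interaction in the data-generating process below are hypothetical.

```python
# Minimal sketch: univariate vs. multivariate, linear vs. neural network.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(15)
n = 1200
earn_lag = rng.normal(1.0, 0.3, n)
fundamentals = rng.normal(size=(n, 3))       # inventory, receivables, capex
earn = (0.7 * earn_lag + 0.2 * fundamentals[:, 0] * fundamentals[:, 1]
        + rng.normal(0, 0.1, n))             # next-period earnings

specs = [("univariate", earn_lag.reshape(-1, 1)),
         ("multivariate", np.column_stack([earn_lag, fundamentals]))]
for name, X in specs:
    X_tr, X_te, y_tr, y_te = train_test_split(X, earn, random_state=0)
    for label, m in [("linear", LinearRegression()),
                     ("neural net", MLPRegressor(hidden_layer_sizes=(16,),
                                                 max_iter=5000,
                                                 random_state=0))]:
        m.fit(X_tr, y_tr)
        print(f"{name} {label} MAE:",
              round(mean_absolute_error(y_te, m.predict(X_te)), 4))
```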

18.
In statistical applications, logistic regression is a popular method for analyzing binary data accompanied by explanatory variables. But when one of the two outcomes is rare, the estimation of model parameters has been shown to be severely biased and hence estimating the probability of rare events occurring based on a logistic regression model would be inaccurate. In this article, we focus on estimating the probability of rare events occurring based on logistic regression models. Instead of selecting a best model, we propose a local model averaging procedure based on a data perturbation technique applied to different information criteria to obtain different probability estimates of rare events occurring. Then an approximately unbiased estimator of Kullback-Leibler loss is used to choose the best one among them. We design complete simulations to show the effectiveness of our approach. For illustration, a necrotizing enterocolitis (NEC) data set is analyzed.
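A loose sketch of the perturb-fit-average flavor of this procedure: jitter the data, fit nested candidate logits, keep each replicate's information-criterion winner, and average the resulting rare-event probability estimates. This illustrates the general idea only; it is not the paper's exact algorithm and omits the Kullback-Leibler selection step.

```python
# Minimal sketch: perturbation-based averaging of rare-event probabilities.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(16)
n = 2000
X = rng.normal(size=(n, 3))
p = 1 / (1 + np.exp(-(-4 + 1.2 * X[:, 0] + 0.6 * X[:, 1])))  # ~2% event rate
y = (rng.random(n) < p).astype(int)
x_new = sm.add_constant(np.array([[1.0, 1.0, 0.0]]), has_constant="add")

preds = []
for b in range(50):                        # data perturbation replicates
    Xp = X + rng.normal(0, 0.05, X.shape)  # small jitter on covariates
    cands = {}
    for k in (1, 2, 3):                    # nested candidate models
        Xk = sm.add_constant(Xp[:, :k])
        fit = sm.Logit(y, Xk).fit(disp=0)
        cands[fit.aic] = fit.predict(x_new[:, :k + 1])[0]
    preds.append(cands[min(cands)])        # keep the AIC-best model's estimate
print("averaged rare-event probability:", round(float(np.mean(preds)), 4))
```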

19.
Nonparametric estimation of a structural cointegrating regression model is studied. As in the standard linear cointegrating regression model, the regressor and the dependent variable are jointly dependent and contemporaneously correlated. In nonparametric estimation problems, joint dependence is known to be a major complication that affects identification, induces bias in conventional kernel estimates, and frequently leads to ill-posed inverse problems. In functional cointegrating regressions where the regressor is an integrated or near-integrated time series, it is shown here that inverse and ill-posed inverse problems do not arise. Instead, simple nonparametric kernel estimation of a structural nonparametric cointegrating regression is consistent and the limit distribution theory is mixed normal, giving straightforward asymptotics that are useable in practical work. It is further shown that use of augmented regression, as is common in linear cointegration modeling to address endogeneity, does not lead to bias reduction in nonparametric regression, but there is an asymptotic gain in variance reduction. The results provide a convenient basis for inference in structural nonparametric regression with nonstationary time series when there is a single integrated or near-integrated regressor. The methods may be applied to a range of empirical models where functional estimation of cointegrating relations is required.
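The "simple kernel estimation works" message can be sketched directly: a random-walk regressor, an error contemporaneously correlated with the regressor's innovations, and a plain Nadaraya-Watson local average with no endogeneity correction. The function and noise structure below are invented for illustration.

```python
# Minimal sketch: Nadaraya-Watson estimation of y_t = f(x_t) + u_t with an
# integrated regressor and endogenous errors.
import numpy as np

rng = np.random.default_rng(17)
n = 2000
eps = rng.normal(size=n)
x = np.cumsum(eps)                        # integrated (unit-root) regressor
u = 0.6 * eps + rng.normal(0, 0.8, n)     # error correlated with innovations
y = np.sin(x / 5) + u                     # true f(x) = sin(x / 5)

def nw(x0, h=1.0):
    """Gaussian-kernel local average at x0 with bandwidth h."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    return (w * y).sum() / w.sum()

grid = np.linspace(np.percentile(x, 10), np.percentile(x, 90), 5)
for x0 in grid:
    print(f"f_hat({x0:6.1f}) = {nw(x0):+.2f}   truth = {np.sin(x0 / 5):+.2f}")
```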

20.
We study inference in structural models with a jump in the conditional density, where location and size of the jump are described by regression curves. Two prominent examples are auction models, where the bid density jumps from zero to a positive value at the lowest cost, and equilibrium job-search models, where the wage density jumps from one positive level to another at the reservation wage. General inference in such models remained a long-standing, unresolved problem, primarily due to nonregularities and computational difficulties caused by discontinuous likelihood functions. This paper develops likelihood-based estimation and inference methods for these models, focusing on optimal (Bayes) and maximum likelihood procedures. We derive convergence rates and distribution theory, and develop Bayes and Wald inference. We show that Bayes estimators and confidence intervals are attractive both theoretically and computationally, and that Bayes confidence intervals, based on posterior quantiles, provide a valid large sample inference method.
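For the auction-type case with a constant boundary, the Bayes machinery can be sketched on a grid: exponential bids above an unknown lowest cost theta, a flat prior, and posterior quantiles as the confidence interval. The rate-1 exponential density and the sample size are assumptions made for illustration.

```python
# Minimal sketch: grid-based posterior for a one-sided density-jump boundary.
import numpy as np

rng = np.random.default_rng(18)
theta_true = 2.0
bids = theta_true + rng.exponential(1.0, size=200)  # density jumps at theta

grid = np.linspace(0, bids.min(), 2000)   # theta cannot exceed min(bids)
# Rate-1 exponential log-likelihood: sum_i -(b_i - theta) for theta <= min(b).
loglik = np.sum(-(bids[None, :] - grid[:, None]), axis=1)
post = np.exp(loglik - loglik.max())
post /= post.sum()                        # flat prior on the grid
cdf = post.cumsum()
lo, med, hi = (grid[np.searchsorted(cdf, q)] for q in (0.025, 0.5, 0.975))
print(f"posterior median {med:.3f}, 95% interval ({lo:.3f}, {hi:.3f})")
```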
