Similar Documents
20 similar documents retrieved.
1.
Models with many parameters (hundreds or thousands) often behave as if they depend on only a few of them, with the rest having comparatively little influence. One challenge of sensitivity analysis with such models is screening the parameters to identify the influential ones, and then characterizing their influences.

Large models often require significant computing resources to evaluate their output, and so a good screening mechanism should be efficient: it should minimize the number of times a model must be exercised. This paper describes an efficient procedure to perform sensitivity analysis on deterministic models with specified ranges or probability distributions for each parameter.

It is based on repeated exercising of the model, which can be treated as a black box. Statistical checks can ensure that the screening identified parameters that account for the bulk of the model variation. Subsequent sensitivity analysis can use the screening information to reduce the investment required to characterize the influence of influential and other parameters.

The procedure exploits simplifications in the dependence of a model output on model inputs. It works best where a small number of parameters are much more influential than all the rest. The method is much more sensitive to the number of influential parameters than to the total number of parameters. It is most effective when linear or quadratic effects dominate higher order effects and complex interactions.

The paper presents a set of Mathematica functions that can be used to create a variety of experimental designs useful for sensitivity analysis, including simple random, Latin hypercube, and fractional factorial sampling. Each sampling method can use discretization, folding, grouping, and replication to create composite designs. These techniques have been combined in a composite approach called Iterated Fractional Factorial Design (IFFD).
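As a rough illustration of one of these building blocks, here is a minimal Python sketch of plain Latin hypercube sampling (the paper's functions are in Mathematica; this translation and the names in it are illustrative, not the paper's implementation):

```python
import numpy as np

def latin_hypercube(n_samples, n_params, seed=None):
    """Plain Latin hypercube sample on [0, 1)^n_params: each parameter's
    range is cut into n_samples equal strata, and exactly one point lands
    in each stratum, in an independent random order per parameter."""
    rng = np.random.default_rng(seed)
    # One independent permutation of the strata per parameter (column).
    strata = np.column_stack([rng.permutation(n_samples) for _ in range(n_params)])
    jitter = rng.random((n_samples, n_params))   # position within each stratum
    return (strata + jitter) / n_samples

# Example: an 8-run design over 3 parameters, rescaled to parameter ranges.
lo, hi = np.array([0.0, 10.0, -1.0]), np.array([1.0, 20.0, 1.0])
design = lo + latin_hypercube(8, 3, seed=1) * (hi - lo)
```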

The procedure is applied to a model of nuclear fuel waste disposal, and to simplified example models to demonstrate the concepts involved.

2.
Two parameter screening techniques, a sequential bifurcation technique and a factorial sampling method, have been applied to a building thermal model used to predict the thermal comfort performance of a building at its design stage. Combined application of both screening methods revealed a set of 12 important model parameters out of a total of 81, explaining 94% of the variability in the model output. These important parameters were identified by the factorial sampling method on the basis of 246 model evaluations, whereas sequential bifurcation needed only 52. However, the factorial sampling scheme identified not only the important parameters but also the directions of the parameter main effects and the severity of interaction effects. This additional information showed that isolated application of the sequential bifurcation method would have been unreliable, as satisfaction of its inherent assumptions could not be guaranteed. Reliable and economical application of sequential bifurcation was possible only with proper knowledge of the signs of the parameter main effects, adequate clustering of important parameters, and transformation of the model output, all obtained from the results of the factorial sampling scheme.
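To make the bisection idea concrete, here is a hedged Python sketch of sequential bifurcation under its standard assumptions (an approximately additive model with main effects of known, non-negative direction); the function names, threshold rule, and toy example are illustrative, not the paper's implementation:

```python
import numpy as np

def sequential_bifurcation(model, n_factors, threshold):
    """Recursively bisect factor groups whose aggregate effect exceeds
    `threshold`.  Assumes an (approximately) additive model whose main
    effects are non-negative in a known direction -- the key assumption
    the abstract notes must be verified."""
    def y(j):
        # Output with factors 0..j-1 at their high level, the rest low.
        x = np.zeros(n_factors)
        x[:j] = 1.0
        return model(x)

    important = []
    def bisect(lo, hi, y_lo, y_hi):
        if y_hi - y_lo <= threshold:
            return                     # whole group's effect is negligible
        if hi - lo == 1:
            important.append(lo)       # single influential factor isolated
            return
        mid = (lo + hi) // 2
        y_mid = y(mid)
        bisect(lo, mid, y_lo, y_mid)
        bisect(mid, hi, y_mid, y_hi)

    bisect(0, n_factors, y(0), y(n_factors))
    return important

# Toy check: 81 factors, 3 of them influential.
f = lambda x: 5 * x[7] + 3 * x[40] + 4 * x[66] + 0.01 * x.sum()
print(sequential_bifurcation(f, 81, threshold=1.0))   # -> [7, 40, 66]
```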

3.

Supersaturated designs (SSDs) constitute a large class of fractional factorial designs that can be used for screening out the important factors from a large set of potentially active ones. A major advantage of these designs is that they reduce the experimental cost dramatically, but their crucial disadvantage is the confounding involved in the statistical analysis. Identification of active effects in SSDs has been the subject of much recent study. In this article we present a two-stage procedure for analyzing two-level SSDs assuming a main-effects-only model, without any interaction terms. The method combines sure independence screening (SIS) with different penalty functions, such as the smoothly clipped absolute deviation (SCAD), the Lasso, and the MC penalty, achieving both the down-selection and the estimation of the significant effects simultaneously. Insights on using the proposed methodology are provided through various simulation scenarios, and several comparisons with existing approaches, such as stepwise selection combined with SCAD and the Dantzig selector (DS), are presented as well. Results of the numerical study and a real data analysis show that the proposed procedure is an advantageous tool owing to its very good performance in identifying active factors.
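A minimal Python sketch of the two-stage idea, with LassoCV standing in for the SCAD and MC penalties discussed in the paper (the retention size, tolerance, and function name are assumptions for illustration):

```python
import numpy as np
from sklearn.linear_model import LassoCV

def sis_then_lasso(X, y, keep=None):
    """Stage 1: sure independence screening by absolute marginal
    correlation with the response.  Stage 2: a penalized fit on the
    survivors; LassoCV is used here as a stand-in for SCAD/MC."""
    n, p = X.shape
    keep = keep or max(2, n // 2)            # illustrative retention size
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    survivors = np.argsort(corr)[::-1][:keep]
    fit = LassoCV(cv=3).fit(X[:, survivors], y)
    return np.sort(survivors[np.abs(fit.coef_) > 1e-8])
```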

4.
In this paper we consider measures for detecting influential observations with respect to one or several parameters of interest at the design stage. We also consider Cook's measure for detecting influential observations at the inference stage, and we study the interrelationship between the two kinds of measures.
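For reference, Cook's measure at the inference stage can be computed from a single fit via the hat matrix; a small numpy sketch of the standard formula (the function name is illustrative):

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance for every observation in the linear model
    y = Xb + e, from leverages and residuals of a single fit."""
    n, p = X.shape
    H = X @ np.linalg.solve(X.T @ X, X.T)    # hat matrix
    h = np.diag(H)                           # leverages h_ii
    e = y - H @ y                            # raw residuals
    s2 = e @ e / (n - p)                     # residual variance estimate
    return e**2 * h / (p * s2 * (1 - h)**2)
```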

5.
A partially balanced nested row-column design, referred to as PBNRC, is defined as an arrangement of v treatments in b p × q blocks for which, with the convention that p ≤ q, the information matrix for the estimation of treatment parameters is equal to that of the column component design, which is itself a partially balanced incomplete block design. In this paper, previously known optimal incomplete block designs, and row-column and nested row-column designs, are utilized to develop some methods of constructing optimal PBNRC designs. In particular, it is shown that an optimal group divisible PBNRC design for v = mn ≥ kn treatments in p × q blocks can be constructed whenever a balanced incomplete block design for m treatments in blocks of size k each and a group divisible PBNRC design for kn treatments in p × q blocks exist. A simple sufficient condition is given under which a group divisible PBNRC design is Ψf-better for all f > 0 than the corresponding balanced nested row-column designs having binary blocks. It is also shown that the construction techniques developed particularly for group divisible designs can be generalized to obtain PBNRC designs based on rectangular association schemes.

6.
In this paper, we consider the statistical planning of experiments when the parameters in the assumed linear model are divided into disjoint sets; the parameters in one set are more influential than those in the other set and require more precise estimation. We characterize the plans (or designs) and discuss some symmetric designs that are easier to find.

7.
Real-world phenomena are frequently modelled by Bayesian hierarchical models. The building blocks in such models are the distribution of each variable conditional on parent and/or neighbour variables in the graph. The specifications of the centre and spread of these conditional distributions may be well motivated, whereas the tail specifications are often left to convenience. However, the posterior distribution of a parameter may depend strongly on such arbitrary tail specifications, and this is not easily detected in complex models. In this article, we propose a graphical diagnostic, the Local critique plot, which detects such influential statistical modelling choices at the node level. It identifies the properties of the information coming from the parents and neighbours (the local prior) and from the children and co-parents (the lifted likelihood) that are influential on the posterior distribution, and examines local conflict between these distinct information sources. The Local critique plot can be derived for all parameters in a chain graph model.

8.
In this study, we introduce a method for building a Bayesian nomogram and propose one for type 2 diabetes (T2D) using data on 13,474 subjects from the 2013–2015 Korean National Health and Nutrition Examination Survey (KNHANES). We identify risk factors related to T2D, propose a visual nomogram for T2D based on a naïve Bayesian classifier model, and predict incidence rates. Additionally, we compute confidence intervals for the influence of the risk factors (attributes) and verify the proposed Bayesian nomogram using a receiver operating characteristic curve. Finally, we compare logistic regression with the Bayesian nomogram for T2D. The analysis of the T2D data showed that the most influential attribute in the Bayesian nomogram was age group, and the highest risk factor for T2D incidence was cardiovascular disease. Dyslipidemia and hypertension also had significant impacts on T2D incidence, while the effects of sex, smoking status, and employment status were relatively small compared to those of other variables. Using the proposed Bayesian nomogram, the incidence rate of T2D in an individual can easily be predicted, and treatment plans can be established based on this information.
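The attribute scores on a naive Bayes nomogram are per-value log-likelihood ratios; a small, hypothetical Python sketch of how such scores could be computed from a categorical data frame (the column names and Laplace smoothing are assumptions, not the paper's exact procedure):

```python
import numpy as np
import pandas as pd

def nomogram_scores(df, target, attributes, smoothing=1.0):
    """Per-attribute-value evidence scores for a naive Bayes nomogram:
    log P(value | case) - log P(value | control), Laplace-smoothed.
    An individual's posterior log-odds is the prior log-odds plus the
    sum of the scores of that individual's attribute values."""
    pos, neg = df[df[target] == 1], df[df[target] == 0]
    scores = {}
    for a in attributes:
        levels = df[a].unique()
        k = len(levels)
        scores[a] = {
            v: np.log(((pos[a] == v).sum() + smoothing) / (len(pos) + k * smoothing))
             - np.log(((neg[a] == v).sum() + smoothing) / (len(neg) + k * smoothing))
            for v in levels
        }
    return scores
```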

9.
Detection of outliers or influential observations is an important task in statistical modeling, especially for correlated time series data. In this paper we propose a new procedure to detect patches of influential observations in the generalized autoregressive conditional heteroskedasticity (GARCH) model. First, we compare the performance of the innovative, additive, and data perturbation schemes in local influence analysis. We find that the innovative perturbation scheme gives better results than the other two, although it may suffer from masking effects. We then use the stepwise local influence method under the innovative perturbation scheme to detect patches of influential observations and uncover the masking effects. Simulation studies show that the new technique can successfully detect a patch of influential observations or outliers under the innovative perturbation scheme. Analyses based on the simulation studies and two real data sets show that the stepwise local influence method under the innovative perturbation scheme is efficient at detecting multiple influential observations and dealing with masking effects in the GARCH model.

10.
The identification of influential observations has drawn a great deal of attention in regression diagnostics. Most identification techniques are based on single-case deletion, and among them DFFITS has become very popular with statisticians. However, this technique, along with all other single-case diagnostics, may be ineffective in the presence of multiple influential observations. In this paper we develop a generalized version of DFFITS based on group deletion and then propose a new technique for identifying multiple influential observations with it. The advantage of the proposed method in identifying multiple influential cases is then investigated through several well-known data sets.
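For orientation, the single-case DFFITS that the group-deletion version generalizes can be computed from one fit, without refitting, via the standard leave-one-out identities (a numpy sketch; the function name is illustrative):

```python
import numpy as np

def dffits(X, y):
    """Single-case DFFITS via the standard leave-one-out identities;
    the usual flagging cutoff is |DFFITS_i| > 2*sqrt(p/n)."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    h = np.einsum('ij,jk,ik->i', X, XtX_inv, X)    # leverages h_ii
    e = y - X @ (XtX_inv @ X.T @ y)                # raw residuals
    s2 = e @ e / (n - p)
    # Leave-one-out variance estimates, then externally studentized residuals.
    s2_i = ((n - p) * s2 - e**2 / (1 - h)) / (n - p - 1)
    t = e / np.sqrt(s2_i * (1 - h))
    return t * np.sqrt(h / (1 - h))
```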

11.
In this paper, we obtain balanced resolution V plans for 2^m factorial experiments (4 ≤ m ≤ 8) which have an additional feature. Instead of assuming that the three-factor and higher-order effects are all zero, we assume that at most one effect among them is non-negligible; however, we do not know which particular effect it is. The problem is to search for the non-negligible effect and to estimate it, along with estimating the main effects and two-factor interactions, as in an ordinary resolution V design. For every value of N (the number of treatments) within a certain practical range, we present a design with which the search and estimation can be carried out. (Of course, as in all statistical problems, the probability of a correct search depends on the size of the "error" or "noise" present in the observations. However, the designs obtained are such that, at least in the noiseless case, this probability equals 1.) It is found that many of these designs are identical to the optimal balanced resolution V designs obtained earlier in the work of Srivastava and Chopra.

12.
One of the main advantages of factorial experiments is the information that they can offer on interactions. When there are many factors to be studied, some or all of this information is often sacrificed to keep the size of an experiment economically feasible. Two strategies for group screening are presented for a large number of factors, over two stages of experimentation, with particular emphasis on the detection of interactions. One approach estimates only main effects at the first stage (classical group screening), whereas the other new method (interaction group screening) estimates both main effects and key two-factor interactions at the first stage. Three criteria are used to guide the choice of screening technique, and also the size of the groups of factors for study in the first-stage experiment. The criteria seek to minimize the expected total number of observations in the experiment, the probability that the size of the experiment exceeds a prespecified target and the proportion of active individual factorial effects which are not detected. To implement these criteria, results are derived on the relationship between the grouped and individual factorial effects, and the probability distributions of the numbers of grouped factors whose main effects or interactions are declared active at the first stage. Examples are used to illustrate the methodology, and some issues and open questions for the practical implementation of the results are discussed.
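A toy Python sketch of the first stage of classical group screening (grouped factors share a level and only grouped main effects are estimated); the grouping, run plan, and threshold rule are illustrative assumptions, not the paper's designs or criteria:

```python
import numpy as np
from itertools import product

def expand(run, groups, n_factors):
    """Map group levels to individual factor levels (each factor
    inherits its group's level)."""
    x = np.empty(n_factors)
    for level, grp in zip(run, groups):
        x[grp] = level
    return x

def first_stage(model, n_factors, group_size, threshold):
    """Classical group screening, stage one: run a full two-level
    factorial in the grouped factors and pass on every group whose
    estimated main effect is large.  Interactions are ignored here,
    unlike the interaction group screening strategy in the paper."""
    groups = [list(range(i, min(i + group_size, n_factors)))
              for i in range(0, n_factors, group_size)]
    runs = np.array(list(product([-1.0, 1.0], repeat=len(groups))))
    ys = np.array([model(expand(r, groups, n_factors)) for r in runs])
    effects = runs.T @ ys / len(runs)       # grouped main-effect estimates
    return [g for g, b in zip(groups, effects) if abs(b) > threshold]
```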

13.
In this paper we focus on the problem of supersaturated (fewer runs than factors) screening experiments. We consider two major types of designs which have been proposed in this situation: random balance and two-stage group screening. We discuss the relative merits and demerits of each strategy. In addition, we compare the performance of these strategies by means of a case study in which 100 factors are screened in 20, 42, 62, and 84 runs.

14.
Detection of multiple unusual observations such as outliers, high leverage points and influential observations (IOs) in regression is still a challenging task for statisticians, owing to the well-known masking and swamping effects. In this paper we introduce a robust influence distance that can identify multiple IOs, and propose a sixfold plotting technique, based on the well-known group deletion approach, to classify regular observations, outliers, high leverage points and IOs simultaneously in linear regression. Experiments on several well-known data sets and simulation studies demonstrate that the proposed algorithm performs successfully in the presence of multiple unusual observations and can avoid masking and/or swamping effects.

15.
Bechhofer and Tamhane (1981) proposed a new class of incomplete block designs called BTIB designs for comparing p ≥ 2 test treatments with a control treatment in blocks of equal size k < p + 1. All BTIB designs for given (p,k) can be constructed by forming unions of replications of a set of elementary BTIB designs called generator designs for that (p,k). In general, there are many generator designs for given (p,k) but only a small subset (called the minimal complete set) of these suffices to obtain all admissible BTIB designs (except possibly any equivalent ones). Determination of the minimal complete set of generator designs for given (p,k) was stated as an open problem in Bechhofer and Tamhane (1981). In this paper we solve this problem for k = 3. More specifically, we give the minimal complete sets of generator designs for k = 3, p = 3(1)10; the relevant proofs are given only for the cases p = 3(1)6. Some additional combinatorial results concerning BTIB designs are also given.

16.
This paper introduces a screening procedure called step-wise group screening for isolating defective factors from a population consisting of defective (or important) and non-defective (or unimportant) factors.

17.
This paper solves some D-optimal design problems for certain generalized linear models where the mean depends on two parameters and two explanatory variables. In all of the cases considered, the support points of the optimal designs are found to be independent of the unknown parameters. While in some cases the optimal design measures are supported on two points with equal weights, in others the support consists of three points with weights depending on the unknown parameters, so the designs are in general only locally optimal. Empirical results on the efficiency of the locally optimal designs are also given. Some of the designs found can also be used for planning D-optimal experiments for the normal linear model where the mean must be positive.
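To ground the notion of local D-optimality, here is a short Python sketch that evaluates the D-criterion for a candidate two-point design in the simpler one-variable logistic model (a stand-in for the paper's two-variable models; the ±1.5434 support quoted in the example is the textbook logit result, not taken from this paper):

```python
import numpy as np

def d_criterion(points, weights, beta):
    """log-determinant of the Fisher information for the one-variable
    logistic model logit(p) = b0 + b1*x; bigger is better, and the value
    depends on the assumed beta -- hence 'locally' optimal designs."""
    M = np.zeros((2, 2))
    for x, w in zip(points, weights):
        p = 1.0 / (1.0 + np.exp(-(beta[0] + beta[1] * x)))
        f = np.array([1.0, x])
        M += w * p * (1.0 - p) * np.outer(f, f)
    return np.linalg.slogdet(M)[1]

# Textbook logit result: equal weight on the two points where the linear
# predictor equals +/-1.5434 is locally D-optimal.
beta = np.array([0.0, 1.0])
print(d_criterion([-1.5434, 1.5434], [0.5, 0.5], beta))
```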

18.
Search design involves searching for, and estimating, a few nonzero effects in a large set of effects, along with estimating the elements of a set of unknown parameters. In the presence of noise, the probability of discriminating the true nonzero effect from an alternative one depends on the design and on an unknown parameter, say ρ. We develop a new criterion for design comparison which is independent of ρ, and for a family of density weight functions we show that it discriminates and ranks the designs precisely. This criterion is invariant to the variable noise which may be present between designs due to noise factors. This allows us to extend the design comparison to classes of equivalent designs.

19.
A new class of row–column designs is proposed. These designs are saturated in terms of eliminating two-way heterogeneity with an additive model. The (m,s)-criterion is used to select optimal designs. It turns out that all (m,s)-optimal designs are binary. Square (m,s)-optimal designs are constructed and they are treatment-connected. Thus, all treatment contrasts are estimable regardless of the row and column effects.

20.
Supersaturated designs are factorial designs in which the number of potential effects is greater than the run size. They are commonly used in screening experiments, with the aim of identifying the dominant active factors at low cost. However, an important and poorly developed research area is the analysis of such designs with non-normal responses. In this article, we develop a variable selection strategy through a modification of the PageRank algorithm, which is commonly used in the Google search engine for ranking web pages. The proposed method incorporates an appropriate information-theoretic measure into this algorithm and, as a result, can be used efficiently for factor screening. A noteworthy advantage of this procedure is that it allows supersaturated designs to be used for analyzing discrete data, so a generalized linear model is assumed. A thorough simulation study, in which the Type I and Type II error rates are computed for a wide range of underlying models and designs, shows the presented approach to be quite advantageous and effective.
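The ranking engine underneath is ordinary PageRank; a compact power-iteration sketch in Python (the paper's information-theoretic edge weights are not reproduced here, so A is just a generic nonnegative adjacency matrix):

```python
import numpy as np

def pagerank(A, damping=0.85, tol=1e-10):
    """Standard PageRank by power iteration.  In the screening procedure
    the nodes would be factors and the edge weights would come from the
    paper's information measure; here A is a generic weight matrix."""
    n = A.shape[0]
    col = A.sum(axis=0).astype(float)
    col[col == 0] = 1.0                      # avoid division by zero
    P = A / col                              # (near) column-stochastic
    r = np.full(n, 1.0 / n)
    while True:
        r_new = damping * (P @ r) + (1.0 - damping) / n
        if np.abs(r_new - r).sum() < tol:
            return r_new                     # steady-state rank vector
        r = r_new
```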
