Similar Articles
20 similar articles found (search time: 546 ms)
1.
Model selection criteria are frequently developed by constructing estimators of discrepancy measures that assess the disparity between the 'true' model and a fitted approximating model. The Akaike information criterion (AIC) and its variants result from utilizing Kullback's directed divergence as the targeted discrepancy. The directed divergence is an asymmetric measure of separation between two statistical models, meaning that an alternative directed divergence can be obtained by reversing the roles of the two models in the definition of the measure. The sum of the two directed divergences is Kullback's symmetric divergence. In the framework of linear models, a comparison of the two directed divergences reveals an important distinction between the measures. When used to evaluate fitted approximating models that are improperly specified, the directed divergence which serves as the basis for AIC is more sensitive towards detecting overfitted models, whereas its counterpart is more sensitive towards detecting underfitted models. Since the symmetric divergence combines the information in both measures, it functions as a gauge of model disparity which is arguably more balanced than either of its individual components. With this motivation, the paper proposes a new class of criteria for linear model selection based on targeting the symmetric divergence. The criteria can be regarded as analogues of AIC and two of its variants: 'corrected' AIC or AICc and 'modified' AIC or MAIC. The paper examines the selection tendencies of the new criteria in a simulation study and the results indicate that they perform favourably when compared to their AIC analogues.
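The asymmetry of the directed divergence, and the way the symmetric divergence combines the two directions, can be sketched numerically. A minimal illustration for two univariate normal models (a toy stand-in for the paper's linear-model setting, using the standard closed-form normal KL divergence; the function name is ours):

```python
import math

def kl_normal(mu0, s0, mu1, s1):
    """Directed Kullback-Leibler divergence KL(N(mu0, s0^2) || N(mu1, s1^2))."""
    return math.log(s1 / s0) + (s0 ** 2 + (mu0 - mu1) ** 2) / (2 * s1 ** 2) - 0.5

# The directed divergence is asymmetric: reversing the roles of the two
# models in the definition gives a different value.
d01 = kl_normal(0.0, 1.0, 1.0, 2.0)
d10 = kl_normal(1.0, 2.0, 0.0, 1.0)

# Kullback's symmetric divergence is the sum of the two directed divergences.
j = d01 + d10
```

With these toy parameters the two directions disagree (roughly 0.44 vs 1.31), which is exactly the asymmetry the abstract exploits: each direction is sensitive to a different kind of misspecification.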

2.
On Parametric Bootstrapping and Bayesian Prediction
Abstract.  We investigate bootstrapping and Bayesian methods for prediction. The observations and the variable being predicted are distributed according to different distributions. Many important problems can be formulated in this setting. This type of prediction problem appears when we deal with a Poisson process. Regression problems can also be formulated in this setting. First, we show that bootstrap predictive distributions are equivalent to Bayesian predictive distributions in the second-order expansion when some conditions are satisfied. Next, the performance of predictive distributions is compared with that of a plug-in distribution with an estimator. The accuracy of prediction is evaluated by using the Kullback–Leibler divergence. Finally, we give some examples.
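For the Poisson case mentioned above, the bootstrap predictive distribution can be sketched as follows: refit the rate to each parametric bootstrap resample and average the resulting plug-in pmfs. This is a rough sketch of the idea, not the paper's construction; the function names and the inversion sampler are ours.

```python
import math
import random

def poisson_pmf(y, lam):
    """Poisson probability mass function."""
    return math.exp(-lam) * lam ** y / math.factorial(y)

def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate by inversion of the cdf."""
    u, p, k = rng.random(), math.exp(-lam), 0
    c = p
    while u > c:
        k += 1
        p *= lam / k
        c += p
    return k

def bootstrap_predictive(data, y, n_boot=500, seed=0):
    """Parametric-bootstrap predictive pmf at y: refit the rate to each
    parametric bootstrap resample and average the resulting Poisson pmfs.
    The plug-in alternative would simply return poisson_pmf(y, lam_hat)."""
    rng = random.Random(seed)
    n = len(data)
    lam_hat = sum(data) / n
    total = 0.0
    for _ in range(n_boot):
        boot = [poisson_draw(rng, lam_hat) for _ in range(n)]
        total += poisson_pmf(y, sum(boot) / n)
    return total / n_boot
```

Averaging over bootstrap refits spreads the predictive mass slightly relative to the plug-in pmf, which is what makes it behave like a Bayesian predictive distribution to second order under the paper's conditions.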

3.
The authors introduce the formal notion of an approximately specified nonlinear regression model and investigate sequential design methodologies when the fitted model is possibly of an incorrect parametric form. They present small-sample simulation studies which indicate that their new designs can be very successful, relative to some common competitors, in reducing mean squared error due to model misspecification and to heteroscedastic variation. Their simulations also suggest that standard normal-theory inference procedures remain approximately valid under the sequential sampling schemes. The methods are illustrated both by simulation and in an example using data from an experiment described in the chemical engineering literature.

4.
We consider the problem of the sequential choice of design points in an approximately linear model. It is assumed that the fitted linear model is only approximately correct, in that the true response function contains a nonrandom, unknown term orthogonal to the fitted response. We also assume that the parameters are estimated by M-estimation. The goal is to choose the next design point in such a way as to minimize the resulting integrated squared bias of the estimated response, to order n^(-1). Explicit applications to analysis of variance and regression are given. In a simulation study the sequential designs compare favourably with some fixed-sample-size designs which are optimal for the true response to which the sequential designs must adapt.

5.
The purpose of this paper is to develop a Bayesian analysis for the zero-inflated hyper-Poisson model. Markov chain Monte Carlo methods are used to develop a Bayesian procedure for the model, and the Bayes estimators are compared by simulation with the maximum-likelihood estimators. Regression modeling and model selection are also discussed, and case deletion influence diagnostics are developed for the joint posterior distribution based on the functional Bregman divergence, which includes the ψ-divergence and several other divergence measures, such as the Itakura–Saito, Kullback–Leibler, and χ2 divergences. The performance of our approach is illustrated on artificial data and on real data from an apple cultivation experiment.
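The zero-inflated structure itself is easy to write down. As a sketch (assuming the standard hyper-Poisson parameterisation, where the pmf is proportional to λ^y over the Pochhammer symbol (γ)_y and is normalised by the confluent hypergeometric series 1F1(1; γ; λ); γ = 1 recovers the ordinary Poisson):

```python
def hyper_poisson_pmf(y, lam, gamma, terms=100):
    """Hyper-Poisson pmf: proportional to lam**y / pochhammer(gamma, y),
    normalised by the series 1F1(1; gamma; lam) truncated at `terms`."""
    def poch(g, k):
        out = 1.0
        for i in range(k):
            out *= g + i
        return out
    norm = sum(lam ** k / poch(gamma, k) for k in range(terms))
    return lam ** y / (poch(gamma, y) * norm)

def zi_hyper_poisson_pmf(y, pi0, lam, gamma):
    """Zero-inflated version: an extra point mass pi0 at zero, with the
    remaining probability 1 - pi0 spread according to the hyper-Poisson."""
    return (pi0 if y == 0 else 0.0) + (1.0 - pi0) * hyper_poisson_pmf(y, lam, gamma)
```

The mixture form makes the excess-zero mechanism explicit, which is what the case-deletion diagnostics above probe: an influential observation can pull either the inflation probability or the count parameters.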

6.
The objective of this paper is to investigate exact slopes of test statistics {Tn} when the random vectors X1, ..., Xn are distributed according to an unknown member of an exponential family {Pθ: θ ∈ Ω}, where Ω is a parameter set. We will be concerned with the hypothesis testing problem of H0: θ ∈ Ω0 vs H1: θ ∉ Ω0, where Ω0 is a subset of Ω. It will be shown that for an important class of problems and test statistics the exact slope of {Tn} at η in Ω − Ω0 is determined by the shortest Kullback–Leibler distance from {θ: Tn(λ(θ)) = Tn(λ(η))} to Ω0, where λ(θ) = Eθ(X).

7.
In many experiments, not all explanatory variables can be controlled. When the units arise sequentially, different approaches may be used. The authors study a natural sequential procedure for "marginally restricted" D-optimal designs. They assume that one set of explanatory variables (x1) is observed sequentially, and that the experimenter responds by choosing an appropriate value of the explanatory variable x2. In order to solve the sequential problem a priori, the authors consider the problem of constructing optimal designs with a prior marginal distribution for x1. This eliminates the influence of units already observed on the next unit to be designed. They give explicit designs for various cases in which the mean response follows a linear regression model; they also consider a case study with a nonlinear logistic response. They find that the optimal strategy often consists of randomizing the assignment of the values of x2.

8.
To measure the distance between a robust function evaluated under the true regression model and under a fitted model, we propose generalized Kullback–Leibler information. Using this generalization we have developed three robust model selection criteria, AICR*, AICCR* and AICCR, that allow the selection of candidate models that not only fit the majority of the data but also take into account non-normally distributed errors. The AICR* and AICCR criteria can unify most existing Akaike information criteria; three examples of such unification are given. Simulation studies are presented to illustrate the relative performance of each criterion.

9.
The problem of designing an experiment to estimate the point at which a quadratic regression is a maximum, or minimum, is studied. The efficiency of a design depends on the value of the unknown parameters, and sequential design is therefore more efficient than non-sequential design. We use a Bayesian criterion which is a weighted trace of the inverse of the information matrix, with the weights depending on a prior distribution. If the design is constructed sequentially, the weights can be updated. Both sequential and non-sequential Bayesian designs are compared to non-Bayesian sequential designs. The comparison is both theoretical and by simulation.
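The estimand here has a simple closed form: for a mean response b0 + b1*x + b2*x^2, the stationary point sits at x* = -b1/(2*b2). A small sketch under that standard parameterisation (the three-point interpolation is our stand-in for least squares when exactly three design points are used; names are ours):

```python
def fit_quadratic(pts):
    """Interpolate b0 + b1*x + b2*x**2 through three points (Lagrange form)
    and return the linear and quadratic coefficients (b1, b2)."""
    (x0, y0), (x1, y1), (x2, y2) = pts
    c0 = y0 / ((x0 - x1) * (x0 - x2))
    c1 = y1 / ((x1 - x0) * (x1 - x2))
    c2 = y2 / ((x2 - x0) * (x2 - x1))
    b2 = c0 + c1 + c2
    b1 = -(c0 * (x1 + x2) + c1 * (x0 + x2) + c2 * (x0 + x1))
    return b1, b2

def turning_point(b1, b2):
    """Stationary point of the fitted quadratic: where b1 + 2*b2*x = 0."""
    return -b1 / (2.0 * b2)
```

Because x* is a ratio of unknown coefficients, the variance of its plug-in estimate depends on the true parameter values, which is precisely why a sequential, prior-weighted design criterion pays off here.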

10.
In this paper, we consider a Bayesian mixture model that allows us to integrate out the weights of the mixture in order to obtain a procedure in which the number of clusters is an unknown quantity. To determine clusters and estimate parameters of interest, we develop an MCMC algorithm called the sequential data-driven allocation sampler. In this algorithm, a single observation has a non-null probability of creating a new cluster, and a set of observations may create a new cluster through split-merge movements. The split-merge movements are developed using a sequential allocation procedure based on allocation probabilities that are calculated according to the Kullback–Leibler divergence between the posterior distribution using the observations previously allocated and the posterior distribution including a 'new' observation. We verify the performance of the proposed algorithm on simulated data and then illustrate its use on three publicly available real data sets.

11.
Summary.  A general theorem on the asymptotically optimal sequential selection of experiments is presented and applied to a Bayesian classification problem when the parameter space is a finite partially ordered set. The main results include establishing conditions under which the posterior probability of the true state converges to 1 almost surely and determining optimal rates of convergence. Properties of a class of experiment selection rules are explored.

12.
The purpose of this paper is to develop a Bayesian approach for log-Birnbaum–Saunders Student-t regression models under right-censored survival data. Markov chain Monte Carlo (MCMC) methods are used to develop a Bayesian procedure for the considered model. In order to attenuate the influence of the outlying observations on the parameter estimates, we present in this paper Birnbaum–Saunders models in which a Student-t distribution is assumed to explain the cumulative damage. Also, some discussions on the model selection to compare the fitted models are given and case deletion influence diagnostics are developed for the joint posterior distribution based on the Kullback–Leibler divergence. The developed procedures are illustrated with a real data set.

13.
Abstract.  A new semiparametric method for density deconvolution is proposed, based on a model in which only the ratio of the unconvoluted to convoluted densities is specified parametrically. Deconvolution results from reweighting the terms in a standard kernel density estimator, where the weights are defined by the parametric density ratio. We propose that in practice, the density ratio be modelled on the log-scale as a cubic spline with a fixed number of knots. Parameter estimation is based on maximization of a type of semiparametric likelihood. The resulting asymptotic properties for our deconvolution estimator mirror the convergence rates in standard density estimation without measurement error when attention is restricted to our semiparametric class of densities. Furthermore, numerical studies indicate that for practical sample sizes our weighted kernel estimator can provide better results than the classical non-parametric kernel estimator for a range of densities outside the specified semiparametric class.
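The reweighting idea can be sketched directly: a standard kernel density estimator whose terms are reweighted by a parametric density ratio. This is a minimal sketch, not the paper's estimator; `log_ratio` is a hypothetical stand-in for the fitted log-scale spline, and the weights are simply normalised over the sample.

```python
import math

def gauss_kernel(u):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def weighted_kde(x, data, h, log_ratio=lambda t: 0.0):
    """Kernel density estimate at x with bandwidth h, where each term is
    reweighted by exp(log_ratio(t)) normalised over the sample.
    log_ratio == 0 recovers the ordinary kernel density estimator."""
    w = [math.exp(log_ratio(t)) for t in data]
    s = sum(w)
    return sum(wi / s * gauss_kernel((x - t) / h) for wi, t in zip(w, data)) / h
```

A non-constant `log_ratio` (e.g. `lambda t: 0.4 * t`) tilts mass across the sample, which is the mechanism the paper uses to undo the convolution while keeping the kernel machinery of ordinary density estimation.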

14.
We construct approximate optimal designs for minimising absolute covariances between least-squares estimators of the parameters (or linear functions of the parameters) of a linear model, thereby rendering relevant parameter estimators approximately uncorrelated with each other. In particular, we consider first the case of the covariance between two linear combinations. We also consider the case of two such covariances. For this we first set up a compound optimisation problem which we transform to one of maximising two functions of the design weights simultaneously. The approaches are formulated for a general regression model and are explored through some examples including one practical problem arising in chemistry.

15.
Comparison of Four New General Classes of Search Designs
A factor screening experiment identifies a few important factors from a large list of factors that potentially influence the response. If a list consists of m factors each at three levels, a design is a subset of all possible 3^m runs. This paper considers the problem of finding designs with small numbers of runs, using the search linear model introduced in Srivastava (1975). The paper presents four new general classes of these 'search designs', each with 2^(m-1) runs, which permit, at most, two important factors out of m factors to be searched for and identified. The paper compares the designs for 4 ≤ m ≤ 10, using arithmetic and geometric means of the determinants, traces and maximum characteristic roots of particular matrices. Two of the designs are found to be superior in all six criteria studied. The four designs are identical for m = 3 and this design is an optimal design in the class of all search designs under the six criteria. The four designs are also identical for m = 4 under some row and column permutations.
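The six comparison criteria are built from three standard matrix summaries: determinant, trace, and maximum characteristic root. For a symmetric 2x2 information-type matrix these have closed forms (a generic numerical sketch, not the paper's specific matrices):

```python
import math

def criteria_2x2(M):
    """Determinant, trace and maximum eigenvalue of a symmetric 2x2 matrix,
    the three matrix summaries used to compare designs."""
    a, b, c = M[0][0], M[0][1], M[1][1]
    det = a * c - b * b
    tr = a + c
    # For a symmetric matrix the discriminant (a-c)^2 + 4b^2 is nonnegative.
    lam_max = (tr + math.sqrt(tr * tr - 4.0 * det)) / 2.0
    return det, tr, lam_max
```

Taking arithmetic and geometric means of each summary across the relevant matrices, as the paper does, then yields six scalar criteria per design.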

16.
Classical regression analysis is usually performed in two steps. In the first step, an appropriate model is identified to describe the data-generating process, and in the second step, statistical inference is performed in the identified model. An intuitively appealing approach to designing experiments for these different purposes is a sequential strategy, which uses part of the sample for model identification and adapts the design according to the outcome of the identification steps. In this article, we investigate the finite-sample properties of two sequential design strategies which were recently proposed in the literature. A detailed comparison of sequential designs for model discrimination in several regression models is given by means of a simulation study. Some non-sequential designs are also included in the study.

17.
We discuss the problem of selecting among alternative parametric models within the Bayesian framework. For model selection problems involving non-nested models, the common objective choice of a prior on the model space is the uniform distribution; the same applies to situations where the models are nested. It is our contention that assigning equal prior probability to each model is oversimplified. Consequently, we introduce a novel approach to objectively determine model prior probabilities, conditional on the choice of priors for the parameters of the models. The idea is based on the notion of the worth of having each model within the selection process. At the heart of the procedure is the measure of this worth using the Kullback–Leibler divergence between densities from different models.

18.
Supersaturated designs (SSDs) are defined as fractional factorial designs whose experimental run size is smaller than the number of main effects to be estimated. While most of the literature on SSDs has focused only on main-effects designs, the construction and analysis of such designs involving interactions have not been developed to a great extent. In this paper, we propose a backward elimination design-driven optimization (BEDDO) method whose main goal is to eliminate the factors identified as fully aliased or highly partially aliased with each other in the design. Under the proposed BEDDO method, we implement and combine correlation-based statistical measures taken from classical test theory and the design of experiments field, and we also present an optimality criterion which is a modified form of Cronbach's alpha coefficient. In this way, we provide a new class of computer-aided unbalanced SSDs involving interactions that derive directly from BEDDO optimization.

19.
20.
This paper presents some considerations about numerical procedures for generating D-optimal designs in a finite design space. The influence of the starting procedure and of the finite set of candidate points on design efficiency is considered. Some modifications of existing procedures for generating D-optimal designs are described. It is shown that for a large number of factors the sequential procedures are more appropriate than the non-sequential ones.
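A sequential procedure of the kind discussed can be sketched for the simplest case, a straight-line model with regressor vector f(x) = (1, x) on a finite candidate set: at each step, add the candidate point with the largest predicted variance d(x) = f(x)' M^{-1} f(x). This is a Wynn-type step under our own naming, a sketch rather than the paper's specific procedure.

```python
def info_matrix(xs):
    """Information matrix sum of f(x) f(x)' for the line fit f(x) = (1, x)."""
    n, sx, sxx = len(xs), sum(xs), sum(x * x for x in xs)
    return [[n, sx], [sx, sxx]]

def pred_variance(x, M):
    """d(x) = f(x)' M^{-1} f(x) via the closed-form 2x2 inverse."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return (M[1][1] - 2.0 * x * M[0][1] + x * x * M[0][0]) / det

def wynn_step(xs, candidates):
    """One sequential step: add the candidate maximising the predicted
    variance under the current design xs."""
    M = info_matrix(xs)
    return max(candidates, key=lambda x: pred_variance(x, M))
```

On a symmetric interval this step keeps pushing mass to the endpoints, recovering the known D-optimal design for a straight line; on a finite candidate grid, both the grid spacing and the starting design influence how quickly that happens, which is the efficiency question the paper examines.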
