Similar Literature
20 similar documents retrieved.
1.
Although prediction in mixed effects models usually concerns the random effects, in this paper we deal with the problem of prediction of a future, or yet unobserved, response random variable belonging to a given cluster. In particular, the aim is to define computationally tractable prediction intervals, with conditional and unconditional coverage probability close to the target nominal value. This solution involves the conditional density of the future response random variable given the observed data, or a suitable high-order approximation based on the Laplace method. We prove that, unless the amount of data is very limited, the estimative, or naive, predictive procedure gives a relatively simple, feasible solution for response prediction. An application to generalized linear mixed models is presented.
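To make the estimative procedure concrete, below is a minimal sketch in a balanced Gaussian random-intercept model, the simplest mixed effects setting (the paper covers general GLMMs). The simulated data, the moment estimators and the `estimative_interval` helper are all hypothetical illustration, not the authors' code; the point is that the plug-in interval conditions on estimated parameters as if they were true, which is why its coverage can drift from the nominal level when data are limited.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated balanced one-way random-intercept data (hypothetical setup):
# y_ij = mu + b_i + e_ij,  b_i ~ N(0, tau2),  e_ij ~ N(0, sig2).
m, n = 30, 8                        # clusters, observations per cluster
mu, tau2, sig2 = 2.0, 1.0, 0.5
b = rng.normal(0.0, np.sqrt(tau2), size=m)
y = mu + b[:, None] + rng.normal(0.0, np.sqrt(sig2), size=(m, n))

# ANOVA-type (method of moments) estimates of the variance components.
ybar_i = y.mean(axis=1)
mu_hat = y.mean()
sig2_hat = ((y - ybar_i[:, None]) ** 2).sum() / (m * (n - 1))
tau2_hat = max(ybar_i.var(ddof=1) - sig2_hat / n, 0.0)

def estimative_interval(i, level=0.95):
    """Naive plug-in interval for a future response in observed cluster i.

    Given the plug-in estimates, the future response is treated as normal
    with mean mu + E[b_i | y_i] and variance Var[b_i | y_i] + sig2;
    parameter uncertainty is ignored, which is what degrades coverage
    in small samples.
    """
    shrink = tau2_hat / (tau2_hat + sig2_hat / n)   # BLUP shrinkage factor
    mean = mu_hat + shrink * (ybar_i[i] - mu_hat)
    var = shrink * sig2_hat / n + sig2_hat          # Var[b_i|y_i] + sig2
    z = stats.norm.ppf(0.5 + level / 2)
    return mean - z * np.sqrt(var), mean + z * np.sqrt(var)

print(estimative_interval(0))
```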

2.
We consider exact and approximate Bayesian computation in the presence of latent variables or missing data. Specifically, we explore the application of a posterior predictive distribution formula derived in Sweeting and Kharroubi (2003), which is a particular form of Laplace approximation, both as an importance function and as a proposal distribution. We show that this formula provides a stable importance function for use within poor man's data augmentation schemes and that it can also be used as a proposal distribution within a Metropolis-Hastings algorithm for models that are not analytically tractable. We illustrate both uses in the case of a censored regression model and a normal hierarchical model, with both normal and Student t distributed random effects. Although the predictive distribution formula is motivated by regular asymptotic theory, it is not necessary that the likelihood has a closed form or that it possesses a local maximum.
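The Sweeting and Kharroubi (2003) formula itself is not reproduced here, but the generic idea it refines, using a Laplace-type normal approximation centred at the posterior mode as an importance function, can be sketched in a few lines. The toy model (normal likelihood, Cauchy prior, no latent variables) and all names are hypothetical:

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(1)
y = rng.normal(1.5, 1.0, size=20)   # hypothetical observed data

# Unnormalized log posterior: N(theta, 1) likelihood, Cauchy(0, 1) prior.
def log_post(theta):
    return stats.norm.logpdf(y, theta, 1.0).sum() + stats.cauchy.logpdf(theta)

# Laplace-style importance function: normal centred at the posterior mode,
# with variance from the inverse negative Hessian (finite differences).
mode = optimize.minimize_scalar(lambda t: -log_post(t)).x
h = 1e-5
hess = (log_post(mode + h) - 2 * log_post(mode) + log_post(mode - h)) / h**2
scale = np.sqrt(-1.0 / hess)

# Self-normalized importance sampling for the posterior mean.
draws = rng.normal(mode, scale, size=5000)
logw = np.array([log_post(t) for t in draws]) \
       - stats.norm.logpdf(draws, mode, scale)
w = np.exp(logw - logw.max())
w /= w.sum()
print("posterior mean estimate:", np.sum(w * draws))
print("effective sample size:  ", 1.0 / np.sum(w**2))
```

A heavier-tailed proposal (for example, a Student t with the same centre and scale) is often preferred in practice to keep the weights stable; the normal version is kept here for brevity.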

3.
The plug-in solution is usually not entirely adequate for computing prediction intervals, as their coverage probability may differ substantially from the nominal value. Prediction intervals with improved coverage probability can be defined by adjusting the plug-in ones, using rather complicated asymptotic procedures or suitable simulation techniques. Other approaches are based on the concept of predictive likelihood for a future random variable. The contribution of this paper is the definition of a relatively simple predictive distribution function giving improved prediction intervals. This distribution function is specified as a first-order unbiased modification of the plug-in predictive distribution function based on the constrained maximum likelihood estimator. Applications of the results to the Gaussian and the generalized extreme-value distributions are presented.

4.
We propose a kernel estimator of integrated squared density derivatives from a sample that has been contaminated by random noise. We derive asymptotic expressions for the bias and the variance of the estimator and show that the squared bias term dominates the variance term. This coincides with results that are available for non-contaminated observations. We then discuss the selection of the bandwidth parameter when estimating integrated squared density derivatives based on contaminated data. We propose a data-driven bandwidth selection procedure of the plug-in type and investigate its finite sample performance via a simulation study.

5.
The choice of the bandwidth is a crucial issue for kernel density estimation. Among all the data-dependent methods for choosing the bandwidth, the direct plug-in method has shown particularly good performance in practice. This procedure is based on estimating an asymptotic approximation of the optimal bandwidth, using two “pilot” kernel estimation stages. Although two pilot stages seem to be enough for most densities, the problem of how to choose an appropriate number of stages has long remained open. Here we propose an automatic (i.e., data-based) method for choosing the number of stages to be employed in the plug-in bandwidth selector. Asymptotic properties of the method are presented, and an extensive simulation study is carried out to compare its small-sample performance with that of the most recommended bandwidth selectors in the literature.
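For context, here is a sketch of the classical two-stage direct plug-in selector for a Gaussian kernel, the procedure whose number of pilot stages the paper chooses automatically. The stage formulas and the normal-reference start at the deepest stage are the standard ones; the code itself is illustrative, not the authors' implementation.

```python
import numpy as np

SQRT2PI = np.sqrt(2 * np.pi)
phi = lambda u: np.exp(-0.5 * u**2) / SQRT2PI   # standard normal density

def psi_hat(x, g, r):
    """Kernel estimate of psi_r = E[f^(r)(X)], Gaussian kernel, bandwidth g."""
    he = {4: lambda u: u**4 - 6 * u**2 + 3,                   # Hermite He_4
          6: lambda u: u**6 - 15 * u**4 + 45 * u**2 - 15}[r]  # Hermite He_6
    u = (x[:, None] - x[None, :]) / g
    return (he(u) * phi(u)).sum() / (len(x) ** 2 * g ** (r + 1))

def dpi_bandwidth(x):
    """Two-stage direct plug-in bandwidth for Gaussian-kernel KDE."""
    n = len(x)
    s = min(np.std(x, ddof=1),
            (np.percentile(x, 75) - np.percentile(x, 25)) / 1.349)
    psi8 = 105 / (32 * np.sqrt(np.pi) * s**9)        # normal-reference start
    g6 = (30 / (SQRT2PI * psi8 * n)) ** (1 / 9)      # pilot stage 1
    psi6 = psi_hat(x, g6, 6)
    g4 = (-6 / (SQRT2PI * psi6 * n)) ** (1 / 7)      # pilot stage 2
    psi4 = psi_hat(x, g4, 4)
    return (1 / (2 * np.sqrt(np.pi) * psi4 * n)) ** (1 / 5)

x = np.random.default_rng(2).normal(size=500)
print(dpi_bandwidth(x))   # roughly 1.06 * 500**(-1/5) for normal data
```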

6.
Massive correlated data with many inputs are often generated from computer experiments to study complex systems. The Gaussian process (GP) model is a widely used tool for the analysis of computer experiments. Although GPs provide a simple and effective approximation to computer experiments, two critical issues remain unresolved. One is the computational issue in GP estimation and prediction, where intensive manipulations of a large correlation matrix are required. For a large sample size and with a large number of variables, this task is often unstable or infeasible. The other issue is how to improve the naive plug-in predictive distribution, which is known to underestimate the uncertainty. In this article, we introduce a unified framework that can tackle both issues simultaneously. It consists of a sequential split-and-conquer procedure, an information combining technique using confidence distributions (CD), and a frequentist predictive distribution based on the combined CD. It is shown that the proposed method maintains the same asymptotic efficiency as the conventional likelihood inference under mild conditions, but dramatically reduces the computation in both estimation and prediction. The predictive distribution contains comprehensive information for inference and provides a better quantification of predictive uncertainty as compared with the plug-in approach. Simulations are conducted to compare the estimation and prediction accuracy with some existing methods, and the computational advantage of the proposed method is also illustrated. The proposed method is demonstrated by a real data example based on tens of thousands of computer experiments generated from a computational fluid dynamic simulator.

7.
The celebrated Black–Scholes model made the assumption of constant volatility, but empirical studies on implied volatility and asset dynamics motivated the use of stochastic volatilities. Christoffersen in 2009 showed that multi-factor stochastic volatility models capture the asset dynamics more realistically. Fouque in 2012 used it to price European options. In 2013, Chiarella and Ziveyi considered Christoffersen's ideas and introduced an asset dynamics where the two volatilities of the Heston type act separately and independently on the asset price, and, using a Fourier transform for the asset price process and a double Laplace transform for the two volatility processes, solved a pricing problem for American options. This paper considers the Chiarella and Ziveyi model and parameterizes it so that the volatilities revert to the long-run mean with reversion rates that mimic fast (for example, daily) and slow (for example, seasonal) random effects. Applying the asymptotic expansion method presented by Fouque in 2012, we make an extensive and detailed derivation of the approximation prices for European options. We also present numerical studies on the behavior and accuracy of our first- and second-order asymptotic expansion formulas.
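To convey the flavour of such first-order approximations, the sketch below adds a fast-scale correction of the form used in Fouque's framework, P ≈ P_BS + (T − t)(V2·S²·∂²P/∂S² + V3·S³·∂³P/∂S³), to the Black–Scholes price of a European call. The group parameters V2 and V3 would in practice be calibrated to the implied volatility surface; the values below are purely illustrative, and this is a one-factor sketch, not the paper's two-factor second-order formula.

```python
import numpy as np
from scipy.stats import norm

def bs_call(S, K, r, sigma, T):
    """Black-Scholes call price plus the S-derivatives used below."""
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    price = S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)
    gamma = norm.pdf(d1) / (S * sigma * np.sqrt(T))          # d2P/dS2
    speed = -gamma / S * (d1 / (sigma * np.sqrt(T)) + 1.0)   # d3P/dS3
    return price, gamma, speed

def corrected_call(S, K, r, sigma_bar, T, V2, V3):
    """First-order fast-mean-reversion correction (illustrative V2, V3)."""
    p0, gamma, speed = bs_call(S, K, r, sigma_bar, T)
    return p0 + T * (V2 * S**2 * gamma + V3 * S**3 * speed)

p0 = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)[0]
p1 = corrected_call(100.0, 100.0, 0.05, 0.2, 1.0, V2=-0.002, V3=0.001)
print(p0, p1)
```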

8.
Frailty models with a non-parametric baseline hazard are widely used for the analysis of survival data. However, their maximum likelihood estimators can be substantially biased in finite samples, because the number of nuisance parameters associated with the baseline hazard increases with the sample size. The penalized partial likelihood based on a first-order Laplace approximation still has non-negligible bias. However, the second-order Laplace approximation to a modified marginal likelihood for a bias reduction is infeasible because of the presence of too many complicated terms. In this article, we find adequate modifications of these likelihood-based methods by using the hierarchical likelihood.

9.
In this study, testing the equality of mean vectors in one-way multivariate analysis of variance (MANOVA) is considered when each dataset has a monotone pattern of missing observations. The likelihood ratio test (LRT) statistic in one-way MANOVA with monotone missing data is given. Furthermore, the modified test (MT) statistic based on the likelihood ratio (LR) and the modified LRT (MLRT) statistic with monotone missing data are proposed using a decomposition of the LR and an asymptotic expansion for each decomposed LR. The accuracy of the chi-square approximation is investigated using a Monte Carlo simulation. Finally, an example is given to illustrate the methods.

10.
On Parametric Bootstrapping and Bayesian Prediction
We investigate bootstrapping and Bayesian methods for prediction. The observations and the variable being predicted are distributed according to different distributions. Many important problems can be formulated in this setting; this type of prediction problem appears, for example, when we deal with a Poisson process, and regression problems can also be formulated in this way. First, we show that bootstrap predictive distributions are equivalent to Bayesian predictive distributions in the second-order expansion when certain conditions are satisfied. Next, the performance of predictive distributions is compared with that of a plug-in distribution based on an estimator. The accuracy of prediction is evaluated using the Kullback–Leibler divergence. Finally, we give some examples.
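As a small concrete instance of the comparison, the sketch below contrasts the plug-in predictive distribution with a parametric bootstrap predictive distribution for i.i.d. Poisson data, scoring both against the true predictive law by Kullback–Leibler divergence. The sample size, number of replicates and grid are all hypothetical choices:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
lam_true, n, B = 3.0, 15, 2000
y = rng.poisson(lam_true, size=n)
lam_hat = y.mean()                       # Poisson MLE

# Plug-in predictive: Poisson(lam_hat).
# Bootstrap predictive: average of Poisson(lam_b*) over parametric
# bootstrap replicates of the MLE, which adds back some of the spread
# a Bayesian predictive distribution carries.
lam_boot = rng.poisson(lam_hat, size=(B, n)).mean(axis=1)
grid = np.arange(0, 30)
plug_in = stats.poisson.pmf(grid, lam_hat)
boot = stats.poisson.pmf(grid[None, :], lam_boot[:, None]).mean(axis=0)

# Kullback-Leibler divergence from the true predictive Poisson(lam_true)
# (the tail mass beyond the grid is negligible here).
true_pmf = stats.poisson.pmf(grid, lam_true)
kl = lambda q: np.sum(true_pmf * np.log(true_pmf / q))
print("KL plug-in:  ", kl(plug_in))
print("KL bootstrap:", kl(boot))
```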

11.
In this paper, we investigate the asymptotic properties of a non-parametric conditional mode estimator given a functional explanatory variable, when functional stationary ergodic data and missing-at-random responses are observed. First, we establish asymptotic properties for a conditional density estimator, from which we derive almost sure convergence (with rate) and asymptotic normality of a conditional mode estimator. This new estimator takes missing data into account, and a simulation study is performed to illustrate how doing so yields better predictive performance than standard estimators.

12.
We consider classification of the realization of a multivariate spatial–temporal Gaussian random field into one of two populations with different regression mean models and factorized covariance matrices. The unknown means and the common covariance matrix of the feature vector are estimated from training samples with observations correlated in space and time, assuming the spatial–temporal correlations to be known. We present the first-order asymptotic expansion of the expected error rate associated with a linear plug-in discriminant function. Our results are applied to ecological data collected from the Lithuanian Economic Zone in the Baltic Sea.
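Setting aside the spatial–temporal correlation structure that is the paper's focus, the generic linear plug-in discriminant that the error-rate expansion refers to looks as follows; the two populations, dimensions and sample sizes are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(4)

# Training samples from two Gaussian populations sharing a covariance
# matrix (the paper additionally has regression means and known
# spatial-temporal correlations, both omitted here).
mu1, mu2 = np.array([0.0, 0.0]), np.array([1.5, 1.0])
cov = np.array([[1.0, 0.4], [0.4, 1.0]])
x1 = rng.multivariate_normal(mu1, cov, size=100)
x2 = rng.multivariate_normal(mu2, cov, size=100)

# Plug-in estimates: sample means and pooled covariance matrix.
m1, m2 = x1.mean(axis=0), x2.mean(axis=0)
S = ((x1 - m1).T @ (x1 - m1) + (x2 - m2).T @ (x2 - m2)) / (len(x1) + len(x2) - 2)
w = np.linalg.solve(S, m1 - m2)          # discriminant direction

def classify(x):
    """Assign to population 1 when w'(x - (m1 + m2)/2) > 0."""
    return 1 if w @ (x - 0.5 * (m1 + m2)) > 0 else 2

test = rng.multivariate_normal(mu1, cov, size=1000)
print("error rate on population 1:",
      np.mean([classify(x) != 1 for x in test]))
```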

13.
Interval-grouped data arise, in general, when the event of interest cannot be directly observed and it is only known to have occurred within an interval. In this framework, a nonparametric kernel density estimator is proposed and studied. The approach is based on the classical Parzen–Rosenblatt estimator and on a generalisation of the binned kernel density estimator. The asymptotic bias and variance of the proposed estimator are derived under usual assumptions, and the effect of using non-equally spaced grouped data is analysed. Additionally, a plug-in bandwidth selector is proposed. A comprehensive simulation study shows the behaviour of both the estimator and the plug-in bandwidth selector under different scenarios of data grouping. An application to real data confirms the simulation results, revealing the good performance of the estimator whenever data are not heavily grouped.
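A minimal sketch of a binned-type kernel estimator for interval-grouped data: each interval's count is placed at its midpoint and smoothed with a Gaussian kernel. This ignores the paper's refinements (the asymptotic analysis, boundary effects, and the proposed plug-in bandwidth selector), and the grouped counts below are hypothetical:

```python
import numpy as np

def grouped_kde(edges, counts, h, grid):
    """Kernel density estimate from interval-grouped counts: mass is
    assigned to interval midpoints and convolved with a Gaussian kernel."""
    mids = 0.5 * (edges[:-1] + edges[1:])
    n = counts.sum()
    u = (grid[:, None] - mids[None, :]) / h
    k = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    return (k * counts[None, :]).sum(axis=1) / (n * h)

# Hypothetical counts on non-equally spaced intervals.
edges = np.array([0.0, 0.5, 1.0, 2.0, 3.0, 5.0])
counts = np.array([12, 25, 40, 15, 8])
grid = np.linspace(0.0, 5.0, 200)
f_hat = grouped_kde(edges, counts, h=0.4, grid=grid)
# Riemann check of the total mass (slightly below 1 because a little
# kernel mass leaks outside [0, 5]; boundary handling is omitted).
print(f_hat.sum() * (grid[1] - grid[0]))
```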

14.
Looking at predictive accuracy is a traditional method for comparing models. A natural way to approximate out-of-sample predictive accuracy is leave-one-out cross-validation (LOOCV): we alternately hold out each case from a full dataset, train a Bayesian model using Markov chain Monte Carlo without the held-out case, and finally evaluate the posterior predictive distribution of each held-out case at its actual observation. However, actual LOOCV is time-consuming. This paper introduces two methods, namely iIS and iWAIC, for approximating LOOCV using only Markov chain samples simulated from a posterior based on the full dataset. iIS and iWAIC aim at improving the approximations given by importance sampling (IS) and WAIC in Bayesian models with possibly correlated latent variables. In iIS and iWAIC, we first integrate the predictive density over the distribution of the latent variables associated with the held-out case, without reference to its observation, and then apply the IS and WAIC approximations to the integrated predictive density. We compare iIS and iWAIC with other approximation methods in three kinds of models: finite mixture models, models with correlated spatial effects, and a random effects logistic regression model. Our empirical results show that iIS and iWAIC give substantially better approximations than non-integrated IS and WAIC and other methods.
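For orientation, here is the plain non-integrated importance-sampling LOO approximation that iIS improves on, in a conjugate normal model where exact LOO values are available for checking. This toy model has no latent variables, so the integration step that distinguishes iIS does not arise; all settings are hypothetical:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
y = rng.normal(0.8, 1.0, size=40)
n, S = len(y), 4000

# Full-data posterior for mu under a N(0, 10^2) prior with known unit
# variance; conjugacy lets us draw posterior samples without MCMC.
post_var = 1.0 / (n + 1.0 / 100.0)
post_mean = post_var * y.sum()
mu_s = rng.normal(post_mean, np.sqrt(post_var), size=S)

# IS-LOO: p(y_i | y_-i) ~= 1 / mean_s[ 1 / p(y_i | mu_s) ], with mu_s
# drawn from the full-data posterior.
lik = stats.norm.pdf(y[None, :], mu_s[:, None], 1.0)   # S x n
loo_is = -np.log(np.mean(1.0 / lik, axis=0))
print("IS-LOO elpd:", loo_is.sum())

# Exact LOO, available in closed form for this conjugate model.
exact = np.empty(n)
for i in range(n):
    v = 1.0 / (n - 1 + 1.0 / 100.0)
    m = v * (y.sum() - y[i])
    exact[i] = stats.norm.logpdf(y[i], m, np.sqrt(v + 1.0))
print("exact elpd: ", exact.sum())
```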

15.
Estimation, Prediction and Inference for the LASSO Random Effects Model
The least absolute shrinkage and selection operator (LASSO) can be formulated as a random effects model with an associated variance parameter that can be estimated along with the other components of variance. In this paper, estimation of the variance parameters is performed by means of an approximation to the marginal likelihood of the observed outcomes. The approximation is based on an alternative but equivalent formulation of the LASSO random effects model. Predictions can be made using point summaries of the predictive distribution of the random effects given the data, with the parameters set to their estimated values. The standard LASSO method uses the mode of this distribution as the predictor, but it is not the only choice, and a number of other possibilities are defined and empirically assessed in this article. The predictive mode is competitive with the predictive mean (the best predictor), but no single predictor performs best across all situations. Inference for the LASSO random effects is performed using predictive probability statements, which are more appropriate under the random effects formulation than hypothesis tests.
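The mode-versus-mean distinction is easy to see in a one-coefficient sketch: with a Gaussian likelihood for the least-squares estimate and a Laplace (double-exponential) prior, the posterior mode is the familiar soft-thresholded LASSO fit, while the posterior mean shrinks without thresholding. The numbers and the quadrature grid below are hypothetical:

```python
import numpy as np

# Least-squares estimate z with standard error s; Laplace prior scale b.
z, s, b = 1.2, 0.5, 0.4

# Posterior density on a grid (normalized numerically).
theta = np.linspace(-3.0, 5.0, 4001)
dt = theta[1] - theta[0]
log_post = -0.5 * ((z - theta) / s) ** 2 - np.abs(theta) / b
post = np.exp(log_post - log_post.max())
post /= post.sum() * dt

# Posterior mode: soft-thresholding with lambda = s^2 / b, i.e. the
# standard LASSO predictor under the random effects formulation.
lam = s**2 / b
mode = np.sign(z) * max(abs(z) - lam, 0.0)

# Posterior mean, the "best predictor", via quadrature.
mean = np.sum(theta * post) * dt

print(f"predictive mode (LASSO): {mode:.3f}")
print(f"predictive mean        : {mean:.3f}")
```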

16.
A typical model for geostatistical data when the observations are counts is the spatial generalised linear mixed model. We present a criterion for optimal sampling design under this framework that aims to minimise the error in the prediction of the underlying spatial random effects. The proposed criterion is derived by performing an asymptotic expansion of the conditional prediction variance. We argue that the mean of the spatial process needs to be taken into account in the construction of the predictive design, which we demonstrate through a simulation study where we compare the proposed criterion against the widely used space-filling design. Furthermore, our results are applied to the Norway precipitation data and the rhizoctonia disease data.
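As a toy version of a predictive design criterion, the sketch below greedily adds design points on [0, 1] so as to minimize the average kriging variance of a unit-variance Gaussian process. The paper's criterion is an asymptotic expansion of the conditional prediction variance in a spatial GLMM and involves the process mean, both of which this sketch omits; the kernel, length-scale and grid sizes are hypothetical:

```python
import numpy as np

def kernel(a, b, ell=0.3):
    """Squared-exponential correlation between 1-d point sets a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def pred_var(design, targets, nugget=1e-6):
    """Simple-kriging prediction variance at the target points."""
    K = kernel(design, design) + nugget * np.eye(len(design))
    k = kernel(targets, design)
    return 1.0 - np.einsum('ij,ij->i', k, np.linalg.solve(K, k.T).T)

candidates = np.linspace(0.0, 1.0, 101)
targets = np.linspace(0.0, 1.0, 400)
design = []
for _ in range(6):
    # Add the candidate that most reduces the mean prediction variance.
    scores = [pred_var(np.array(design + [c]), targets).mean()
              for c in candidates]
    design.append(candidates[int(np.argmin(scores))])
print("selected design points:", np.round(sorted(design), 2))
```

For a stationary process with a constant mean, the greedy result is close to a space-filling layout; the paper's point is that once the mean of the spatial process matters, the optimal predictive design departs from space-filling.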

17.
Synthetic likelihood is an attractive approach to likelihood-free inference when an approximately Gaussian summary statistic for the data, informative for inference about the parameters, is available. The synthetic likelihood method derives an approximate likelihood function from a plug-in normal density estimate for the summary statistic, with plug-in mean and covariance matrix obtained by Monte Carlo simulation from the model. In this article, we develop alternatives to Markov chain Monte Carlo implementations of Bayesian synthetic likelihoods with reduced computational overheads. Our approach uses stochastic gradient variational inference methods for posterior approximation in the synthetic likelihood context, employing unbiased estimates of the log likelihood. We compare the new method with a related likelihood-free variational inference technique in the literature, while at the same time improving the implementation of that approach in a number of ways. These new algorithms are feasible to implement in situations which are challenging for conventional approximate Bayesian computation methods, in terms of the dimensionality of the parameter and summary statistic.
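A minimal synthetic-likelihood sketch with the random-walk Metropolis–Hastings implementation that the article's variational algorithms aim to replace: at each proposed parameter value, summaries are simulated from the model, a normal density with plug-in mean and covariance is fitted to them, and the observed summary is scored under it. The toy model (normal data, with mean and variance as summaries) and tuning constants are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(7)

def simulate_summaries(theta, m=200):
    """Simulate m datasets of size 50 from N(theta, 1) and return the
    summary statistics (sample mean, sample variance) of each."""
    x = rng.normal(theta, 1.0, size=(m, 50))
    return np.column_stack([x.mean(axis=1), x.var(axis=1, ddof=1)])

def synthetic_loglik(theta, s_obs, m=200):
    """Gaussian synthetic log-likelihood with plug-in mean/covariance."""
    s = simulate_summaries(theta, m)
    mu, cov = s.mean(axis=0), np.cov(s.T)
    diff = s_obs - mu
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (logdet + diff @ np.linalg.solve(cov, diff))

y = rng.normal(1.0, 1.0, size=50)          # observed data
s_obs = np.array([y.mean(), y.var(ddof=1)])

# Random-walk Metropolis-Hastings under a flat prior.
theta, ll = 0.0, synthetic_loglik(0.0, s_obs)
chain = []
for _ in range(2000):
    prop = theta + 0.2 * rng.normal()
    ll_prop = synthetic_loglik(prop, s_obs)
    if np.log(rng.uniform()) < ll_prop - ll:
        theta, ll = prop, ll_prop
    chain.append(theta)
print("posterior mean estimate:", np.mean(chain[500:]))
```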

18.
Joint models for longitudinal and time-to-event data have been applied in many fields of statistics and in clinical studies. However, the main difficulty these models face is computational: the requirement for numerical integration becomes severe as the dimension of the random effects increases. In this paper, a modified two-stage approach is proposed to estimate the parameters in joint models. In the first stage, linear mixed-effects models and best linear unbiased predictors are applied to estimate the parameters in the longitudinal submodel. In the second stage, an approximation of the full joint log-likelihood is proposed using the estimated values of these parameters from the longitudinal submodel, and the survival parameters are estimated by maximizing this approximation. Simulation studies show that the approach performs well, especially when the dimension of the random effects increases. Finally, we apply this approach to AIDS data.

19.
Quasi-likelihood nonlinear models with random effects (QLNMWRE) include generalized linear models with random effects and quasi-likelihood nonlinear models as special cases. In this paper, some regularity conditions analogous to those given by Breslow and Clayton (1993) are proposed. On the basis of the proposed regularity conditions and the Laplace approximation, the existence, strong consistency and asymptotic normality of the approximate maximum quasi-likelihood estimator of the fixed effects are proved in QLNMWRE.
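To show where the Laplace approximation enters in this class of models, here is a sketch of Laplace-approximated marginal maximum likelihood for a random-intercept Poisson model, a very simple member of the GLMM family covered by the QLNMWRE framework; it is not the paper's estimator, and all settings are hypothetical:

```python
import numpy as np
from scipy import optimize

rng = np.random.default_rng(8)

# y_ij ~ Poisson(exp(beta + b_i)),  b_i ~ N(0, tau2).
m, n = 50, 6
b = rng.normal(0.0, np.sqrt(0.3), size=m)
y = rng.poisson(np.exp(0.5 + b[:, None]), size=(m, n))

def neg_laplace_loglik(params):
    beta, log_tau2 = params
    tau2 = np.exp(log_tau2)
    total = 0.0
    for i in range(m):
        # Negative joint log density of (y_i, b_i) in b_i (up to constants).
        f = lambda bi: -(np.sum(y[i] * (beta + bi) - np.exp(beta + bi))
                         - 0.5 * bi**2 / tau2)
        bi_hat = optimize.minimize_scalar(f).x         # mode in b_i
        hess = n * np.exp(beta + bi_hat) + 1.0 / tau2  # curvature at the mode
        # Laplace: log integral ~ value at mode - 0.5*log(tau2) - 0.5*log(hess)
        # (the 2*pi factors from the prior and the integral cancel).
        total += -f(bi_hat) - 0.5 * np.log(tau2) - 0.5 * np.log(hess)
    return -total

res = optimize.minimize(neg_laplace_loglik, x0=[0.0, 0.0], method='Nelder-Mead')
print("beta_hat:", res.x[0], "tau2_hat:", np.exp(res.x[1]))
```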

20.
The geographical relative risk function is a useful tool for investigating the spatial distribution of disease based on case and control data. The most common way of estimating this function is as the ratio of bivariate kernel density estimates constructed from the locations of cases and controls, respectively. An alternative is to use a local-linear (LL) estimator of the log-relative risk function. In both cases, the choice of bandwidth is critical. In this article, we examine the relative performance of the two estimation techniques using a variety of data-driven bandwidth selection methods, including likelihood cross-validation (CV), least-squares CV, rule-of-thumb reference methods, and a new approximate plug-in (PI) bandwidth for the LL estimator. Our analysis includes a comparison of asymptotic results, a simulation study, and application of the estimators to two real data sets. Our findings suggest that the density ratio method implemented with the least-squares CV bandwidth selector is generally best, with the LL estimator with PI bandwidth being competitive in applications with strong large-scale trends but much worse in situations with elliptical clusters.
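A minimal density-ratio sketch of the log relative risk surface using `scipy.stats.gaussian_kde`, whose default Scott rule stands in for the cross-validated bandwidth selectors the article compares; the case and control locations are simulated:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(9)

# Hypothetical geography: cases cluster near (0.7, 0.7), controls are
# uniform over the unit square.
cases = 0.7 + 0.08 * rng.normal(size=(150, 2))
controls = rng.uniform(0.0, 1.0, size=(400, 2))

# Density-ratio estimate: log r(x) = log f_cases(x) - log f_controls(x).
f_case = gaussian_kde(cases.T)       # gaussian_kde expects shape (d, n)
f_ctrl = gaussian_kde(controls.T)

points = np.array([[0.2, 0.2], [0.5, 0.5], [0.7, 0.7]]).T
log_rr = np.log(f_case(points)) - np.log(f_ctrl(points))
print(np.round(log_rr, 2))   # elevated log relative risk near the cluster
```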
