Similar Documents
20 similar documents found
1.
Likelihood computation in spatial statistics requires accurate and efficient calculation of the normalizing constant (i.e. the partition function) of the Gibbs distribution of the model. Two available methods for calculating the normalizing constant by Markov chain Monte Carlo are compared in simulation experiments for an Ising model, a Gaussian Markov field model and a pairwise-interaction point field model.
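As a toy illustration of the computation at issue, the sketch below (an assumption-laden example, not the unspecified methods compared in the paper) enumerates the exact log normalizing constant of a small Ising model and checks it against a simple Monte Carlo estimate based on uniform sampling; lattice size, inverse temperature and draw count are arbitrary choices.

```python
import itertools
import numpy as np

def ising_energy(s, J=1.0):
    """Energy of a configuration s in {-1, +1}^(n x n), free boundaries."""
    return -J * (np.sum(s[:-1, :] * s[1:, :]) + np.sum(s[:, :-1] * s[:, 1:]))

def exact_log_z(n=3, beta=0.4):
    """Exact log partition function by enumerating all 2**(n*n) states."""
    logs = np.array([-beta * ising_energy(np.array(bits).reshape(n, n))
                     for bits in itertools.product([-1, 1], repeat=n * n)])
    m = logs.max()
    return m + np.log(np.exp(logs - m).sum())

def mc_log_z(n=3, beta=0.4, draws=100_000, seed=0):
    """Monte Carlo estimate via Z = 2**(n*n) * E_uniform[exp(-beta * H)]."""
    rng = np.random.default_rng(seed)
    samples = rng.choice([-1, 1], size=(draws, n, n))
    h = np.array([ising_energy(s) for s in samples])
    return n * n * np.log(2) + np.log(np.mean(np.exp(-beta * h)))

print(exact_log_z(), mc_log_z())   # the two values should agree closely
```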

2.
New recursive algorithms for fast computation of the normalizing constant of the autologistic model on the lattice make sample-based maximum likelihood estimation (MLE) of the autologistic parameters feasible. We demonstrate this by sampling from 12 simulated 420×420 binary lattices with square lattice plots of size 4×4, …, 7×7 and sample sizes between 20 and 600. Sample-based results are compared with ‘benchmark’ MCMC estimates derived from all binary observations on a lattice. Sample-based estimates are, on average, systematically biased by 3%–7%, a bias that can be reduced by more than half with a set of calibrating equations. MLE estimates of sampling variances are large and usually conservative. The variance of the parameter of spatial association is about 2–10 times higher than the variance of the parameter of abundance. Sampling distributions of the estimates were mostly non-normal. We conclude that sample-based MLE of the autologistic parameters with an appropriate sample size and post-estimation calibration will furnish fully acceptable estimates. Equations for predicting the expected sampling variance are given.
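A minimal sketch of the kind of recursion such algorithms rely on (the abstract does not state the exact algorithm; the parameterization with an abundance parameter alpha and an association parameter theta is assumed): a transfer-matrix style forward pass over lattice columns evaluates the autologistic normalizing constant in O(cols · 4^rows) time instead of O(2^(rows·cols)).

```python
import itertools
import numpy as np

def autologistic_log_z(rows, cols, alpha, theta):
    """log of Z = sum over binary lattices y of
    exp(alpha * sum(y) + theta * sum of nearest-neighbour products),
    computed by a forward recursion over columns (feasible for small rows)."""
    states = list(itertools.product([0, 1], repeat=rows))
    def within(v):      # singleton terms plus vertical pairs inside a column
        return alpha * sum(v) + theta * sum(v[i] * v[i + 1] for i in range(rows - 1))
    def between(u, v):  # horizontal pairs linking two adjacent columns
        return theta * sum(a * b for a, b in zip(u, v))
    log_phi = np.array([within(v) for v in states])
    log_t = np.array([[between(u, v) for v in states] for u in states])
    log_f = log_phi.copy()                          # first column
    for _ in range(cols - 1):                       # absorb one column at a time
        m = log_f.max()                             # log-sum-exp for stability
        log_f = log_phi + m + np.log(np.exp(log_f - m) @ np.exp(log_t))
    m = log_f.max()
    return m + np.log(np.exp(log_f - m).sum())

print(autologistic_log_z(4, 6, alpha=-0.3, theta=0.5))
```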

3.
In geostatistics, prediction of unknown quantities at given locations is commonly made by the kriging technique. In addition to kriging, spatial autoregressive models can also be used to model regular lattice spatial data. In this article, the spatial autoregressive model and the kriging technique are introduced, and we extend the prediction method proposed by Basu and Reinsel for the SAR(2,1) model. Then, using a simulation study and real data, we compare the prediction accuracy of the spatial autoregressive models with that of kriging. The results of the simulation study show that predictions made by the autoregressive models are a good competitor to the kriging method.
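For concreteness, a minimal simple-kriging predictor under an assumed exponential covariance with known mean; the abstract does not specify the covariance model, and mu, sigma2 and phi are illustrative parameter names.

```python
import numpy as np

def simple_kriging(y, coords, coord0, mu, sigma2, phi):
    """Simple-kriging prediction at coord0 with known mean mu and
    exponential covariance C(h) = sigma2 * exp(-h / phi):
        y_hat = mu + c0' C^{-1} (y - mu)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    big_c = sigma2 * np.exp(-d / phi)                       # data covariance
    c0 = sigma2 * np.exp(-np.linalg.norm(coords - coord0, axis=1) / phi)
    return mu + c0 @ np.linalg.solve(big_c, y - mu)

# toy usage on a three-point configuration
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
y = np.array([1.2, 0.7, 0.9])
print(simple_kriging(y, coords, np.array([0.5, 0.5]), mu=1.0, sigma2=1.0, phi=2.0))
```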

4.
Gaussian Markov random-field (GMRF) models are frequently used in a wide variety of applications. In most cases parts of the GMRF are observed through mutually independent data; hence the full conditional of the GMRF, a hidden GMRF (HGMRF), is of interest. We are concerned with the case where the likelihood is non-Gaussian, leading to non-Gaussian HGMRF models. Several researchers have constructed block-sampling Markov chain Monte Carlo schemes based on approximations of the HGMRF by a GMRF, using a second-order expansion of the log-density at or near the mode. This is possible as the GMRF approximation can be sampled exactly with a known normalizing constant, and the Markov property of the GMRF approximation yields computational efficiency. The main contribution of the paper is to go beyond the GMRF approximation and to construct a class of non-Gaussian approximations which adapt automatically to the particular HGMRF under study. The accuracy can be tuned by intuitive parameters to nearly any precision. These non-Gaussian approximations share the same computational complexity as those based on GMRFs and can be sampled exactly with computable normalizing constants. We apply our approximations in spatial disease mapping and model-based geostatistical models with different likelihoods, obtain procedures for block updating and construct Metropolized independence samplers.
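The second-order construction that such block samplers start from can be sketched as follows, assuming a Poisson likelihood and a first-order random-walk (RW1) prior with precision parameter kappa; iterating the expansion at the current point yields the GMRF approximation whose mean and precision the samplers use.

```python
import numpy as np

def gmrf_approx_poisson(y, kappa, n_iter=25):
    """Gaussian (GMRF) approximation to the posterior of a latent field x
    with RW1 prior precision kappa*R and Poisson(exp(x_i)) data: expand the
    log-likelihood to second order at the current point, giving a Gaussian
    with precision Q = kappa*R + diag(exp(x)) and mean solving Q m = b."""
    n = len(y)
    r_mat = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # RW1 structure
    r_mat[0, 0] = r_mat[-1, -1] = 1.0
    x = np.log(np.maximum(y, 0.5))                  # crude starting point
    for _ in range(n_iter):
        c = np.exp(x)                               # curvature of Poisson log-lik
        b = y - c + c * x                           # linear term of the expansion
        q_mat = kappa * r_mat + np.diag(c)
        x = np.linalg.solve(q_mat, b)               # mean of the approximation
    return x, q_mat                                 # approximate mode, precision
```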

5.
While conjugate Bayesian inference in decomposable Gaussian graphical models is largely solved, the non-decomposable case still poses difficulties concerned with the specification of suitable priors and the evaluation of normalizing constants. In this paper we derive the DY-conjugate prior (Diaconis & Ylvisaker, 1979) for non-decomposable models and show that it can be regarded as a generalization to an arbitrary graph G of the hyper inverse Wishart distribution (Dawid & Lauritzen, 1993). In particular, if G is an incomplete prime graph it constitutes a non-trivial generalization of the inverse Wishart distribution. Inference based on marginal likelihood requires the evaluation of a normalizing constant and we propose an importance sampling algorithm for its computation. Examples of structural learning involving non-decomposable models are given. In order to deal efficiently with the set of all positive definite matrices with non-decomposable zero-pattern we introduce the operation of triangular completion of an incomplete triangular matrix. Such a device turns out to be extremely useful both in the proof of theoretical results and in the implementation of the Monte Carlo procedure.

6.
A density estimation method in a Bayesian nonparametric framework is presented for the case where the recorded data do not come directly from the distribution of interest but from a length-biased version of it. From a Bayesian perspective, efforts to computationally evaluate posterior quantities conditional on length-biased data have been hindered by the inability to circumvent the problem of a normalizing constant. In this article, we present a novel Bayesian nonparametric approach to the length-biased sampling problem that circumvents the issue of the normalizing constant. Numerical illustrations as well as a real-data example are presented, and the estimator is compared against its frequentist counterpart, the kernel density estimator for indirect data of Jones.
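The frequentist comparator mentioned at the end admits a compact sketch, assuming the Jones estimator weights each observation by 1/X_i to undo the length bias; the Gaussian kernel and the bandwidth are arbitrary choices here.

```python
import numpy as np

def length_biased_kde(x_grid, data, h):
    """Kernel density estimate of f when the data are length biased,
    i.e. observed from g(x) proportional to x * f(x):
        f_hat(x) = sum_i K_h(x - X_i) / X_i  /  (h * sum_i 1 / X_i)."""
    w = 1.0 / data                                   # weights undo the bias
    u = (x_grid[:, None] - data[None, :]) / h
    k = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)   # Gaussian kernel
    return (k @ w) / (h * w.sum())

# toy check: if f is Exp(1), its length-biased version is Gamma(2, 1)
rng = np.random.default_rng(1)
biased = rng.gamma(shape=2.0, scale=1.0, size=5000)
grid = np.linspace(0.05, 6.0, 200)
f_hat = length_biased_kde(grid, biased, h=0.25)      # should track exp(-x)
```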

7.
Multivariate logit models are convenient for describing multivariate correlated binary choices because they provide closed-form likelihood functions. However, the computation time required for calculating choice probabilities increases exponentially with the number of choices, which makes maximum-likelihood estimation infeasible when many choices are considered. To solve this, we propose three novel estimation methods: (i) stratified importance sampling, (ii) composite conditional likelihood (CCL), and (iii) the generalized method of moments, which yield consistent estimates and still have small-sample bias similar to that of maximum likelihood. Our simulation study shows that computation times for CCL are much smaller and that its efficiency loss is small.

8.
A Bayesian approach to modelling binary data on a regular lattice is introduced. The method uses a hierarchical model where the observed data are the signs of a hidden conditional autoregressive Gaussian process. This approach essentially extends the familiar probit model to dependent data. Markov chain Monte Carlo simulations on real and simulated data are used to estimate the posterior distribution of the spatial dependency parameters, and the method is shown to work well. The method can be straightforwardly extended to regression models.

9.
Markov random fields (MRFs) express spatial dependence through conditional distributions, although their stochastic behavior is defined by their joint distribution. These joint distributions are typically difficult to obtain in closed form, the problem being a normalizing constant that is a function of unknown parameters. The Gaussian MRF (or conditional autoregressive model) is one case where the normalizing constant is available in closed form; however, when sample sizes are moderate to large (thousands to tens of thousands) and beyond, its computation can be problematic. Because the conditional autoregressive (CAR) model is often used for spatial-data modeling, we develop likelihood-inference methodology for this model in situations where the sample size is too large for its normalizing constant to be computed directly. In particular, we use simulation methodology to obtain maximum likelihood estimators of the mean, variance, and spatial-dependence parameters (including their asymptotic variances and covariances) of CAR models.
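A minimal sketch of the direct computation that becomes infeasible for large n; the exact CAR parameterization is an assumption, here Y ~ N(mu·1, sigma2 · (I − phi·W)^(−1)) with a symmetric proximity matrix W.

```python
import numpy as np

def car_loglik(y, W, mu, phi, sigma2):
    """Log-likelihood of a Gaussian CAR model Y ~ N(mu, sigma2*(I - phi*W)^(-1)).

    The normalizing constant involves log det(I - phi*W); for symmetric W it
    follows from the eigenvalues lam_i of W as sum_i log(1 - phi*lam_i),
    which requires 1 - phi*lam_i > 0 for a valid model."""
    n = len(y)
    lam = np.linalg.eigvalsh(W)                  # spectrum of W, computed once
    r = y - mu
    Q = (np.eye(n) - phi * W) / sigma2           # precision matrix of the CAR
    quad = r @ Q @ r
    logdet = np.sum(np.log(1.0 - phi * lam)) - n * np.log(sigma2)
    return 0.5 * (logdet - quad - n * np.log(2.0 * np.pi))
```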

10.
Fitting cross-classified multilevel models with a binary response is challenging. In this setting a promising method is Bayesian inference through Integrated Nested Laplace Approximations (INLA), which performs well in several latent variable models. We devise a systematic simulation study to assess the performance of INLA with cross-classified binary data under different scenarios defined by the magnitude of the variances of the random effects, the number of observations, the number of clusters, and the degree of cross-classification. In the simulations INLA is systematically compared with the popular method of maximum likelihood via Laplace approximation. Through an application to the classical salamander mating data, we compare INLA with the best-performing methods. Given its computational speed and generally good performance, INLA turns out to be a valuable method for fitting logistic cross-classified models.

11.
Asymptotic methods are commonly used in statistical inference for unknown parameters in binary data models. These methods rely on large-sample theory, an assumption that may conflict with small sample sizes and hence lead to poor results in optimal design theory. In this paper, we apply second-order expansions of the maximum likelihood estimator and derive a matrix formula for the mean square error (MSE) to obtain more precise optimal designs based on the MSE. Numerical results indicate the new optimal designs are more efficient than the optimal designs based on the information matrix.

12.
The Poisson-binomial distribution is useful in many applied problems in engineering, actuarial science and data mining. It models the distribution of the sum of independent but non-identically distributed random indicators whose success probabilities vary. In this paper, we extend the Poisson-binomial distribution to a generalized Poisson-binomial (GPB) distribution. The GPB distribution corresponds to the case where the random indicators are replaced by two-point random variables, which can take two arbitrary values instead of 0 and 1. The GPB distribution has found applications in many areas such as voting theory, actuarial science, warranty prediction and probability theory. As the GPB distribution has not been studied in detail so far, we first introduce the distribution and then derive its theoretical properties. We develop an efficient algorithm for the computation of its distribution function using the fast Fourier transform. We test the accuracy of the developed algorithm by comparing it with an enumeration-based exact method and with results from the binomial distribution. We also study the computational time of the algorithm under various parameter settings. Finally, we discuss the factors affecting the computational efficiency of the proposed algorithm and illustrate the use of the software package.
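For the ordinary Poisson-binomial building block, the DFT-of-the-characteristic-function idea can be sketched as below; the paper's generalized two-point case would additionally need a common lattice for the support and is not reproduced here. The sequential-convolution version serves as an exact cross-check.

```python
import numpy as np

def poisson_binomial_pmf_fft(p):
    """pmf of S = sum_j Bernoulli(p_j) via the DFT of the characteristic
    function phi(t) = prod_j (1 - p_j + p_j * e^{it}): sample phi at the
    n+1 Fourier frequencies and invert the transform to get P(S = k)."""
    p = np.asarray(p, dtype=float)
    n = p.size
    t = 2j * np.pi * np.arange(n + 1) / (n + 1)
    chi = np.prod(1 - p + p * np.exp(t)[:, None], axis=1)
    return np.clip(np.fft.fft(chi).real / (n + 1), 0.0, 1.0)

def poisson_binomial_pmf_conv(p):
    """Exact reference answer by sequential convolution of [1-p_j, p_j]."""
    pmf = np.array([1.0])
    for pj in p:
        pmf = np.convolve(pmf, [1.0 - pj, pj])
    return pmf

rng = np.random.default_rng(0)
probs = rng.uniform(size=50)
assert np.allclose(poisson_binomial_pmf_fft(probs), poisson_binomial_pmf_conv(probs))
```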

13.
Bayesian variable selection is often difficult to carry out because of the challenges of specifying prior distributions for the regression parameters of all possible models, specifying a prior distribution on the model space, and carrying out the computations. We address these three issues for the logistic regression model. For the first, we propose an informative prior distribution for variable selection; several theoretical and computational properties of the prior are derived and illustrated with examples. For the second, we propose a method for specifying an informative prior on the model space, and for the third we propose novel methods for computing the marginal distribution of the data. The new computational algorithms require only Gibbs samples from the full model to facilitate the computation of the prior and posterior model probabilities for all possible models. Several properties of the algorithms are also derived. The prior specification for the first challenge focuses on the observables, in that the elicitation is based on a prior prediction y_0 for the response vector and a quantity a_0 quantifying the uncertainty in y_0. Then y_0 and a_0 are used to specify a prior for the regression coefficients semi-automatically. Examples using real data are given to demonstrate the methodology.

14.
Discrete Markov random fields form a natural class of models to represent images and spatial datasets. The use of such models is, however, hampered by a computationally intractable normalising constant. This makes parameter estimation and a fully Bayesian treatment of discrete Markov random fields difficult. We apply approximation theory for pseudo-Boolean functions to binary Markov random fields and construct approximations and upper and lower bounds for the associated computationally intractable normalising constant. As a by-product of this process we also obtain a partially ordered Markov model approximation of the binary Markov random field. We present numerical examples both with the pairwise-interaction Ising model and with higher-order interaction models, showing the quality of our approximations and bounds. We also present simulation examples and one real-data example demonstrating how the approximations and bounds can be applied for parameter estimation and to handle a fully Bayesian model computationally.

15.
In the estimation of a population mean or total from a random sample, certain methods based on linear models are known to be automatically design consistent, regardless of how well the underlying model describes the population. A sufficient condition is identified for this type of robustness to model failure; the condition, which we call 'internal bias calibration', relates to the combination of a model and the method used to fit it. Included among the internally bias-calibrated models, in addition to the aforementioned linear models, are certain canonical link generalized linear models and nonparametric regressions constructed from them by a particular style of local likelihood fitting. Other models can often be made robust by using a suboptimal fitting method. Thus the class of model-based, but design consistent, analyses is enlarged to include more realistic models for certain types of survey variable such as binary indicators and counts. Particular applications discussed are the estimation of the size of a population subdomain, as arises in tax auditing for example, and the estimation of a bootstrap tail probability.
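A small sketch in the spirit of the estimators discussed (the function name and the linear working model are illustrative assumptions): a model-assisted difference estimator of a population total whose design-weighted residual correction keeps it design consistent even when the working model fails.

```python
import numpy as np

def difference_estimator(y_s, x_s, w_s, x_pop):
    """Model-assisted total: predict y for every population unit from a
    working linear fit on the sample, then add the design-weighted sum of
    sample residuals; the correction term protects against model failure."""
    xs = np.column_stack([np.ones(len(x_s)), x_s])
    beta = np.linalg.lstsq(xs, y_s, rcond=None)[0]
    xp = np.column_stack([np.ones(len(x_pop)), x_pop])
    synthetic = (xp @ beta).sum()                  # model-based part, all units
    correction = np.sum(w_s * (y_s - xs @ beta))   # design-weighted residuals
    return synthetic + correction
```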

16.
Several methods for generating variates with univariate and multivariate Wallenius' and Fisher's noncentral hypergeometric distributions are developed. Methods for the univariate distributions include: simulation of urn experiments, inversion by binary search, inversion by chop-down search from the mode, the ratio-of-uniforms rejection method, and rejection by sampling in the τ domain. Methods for the multivariate distributions include: simulation of urn experiments, the conditional method, Gibbs sampling, and Metropolis-Hastings sampling. These methods are useful for Monte Carlo simulation of models of biased sampling and models of evolution, and for calculating moments and quantiles of the distributions.
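The urn-simulation methods for the univariate case are simple enough to sketch directly; omega is the odds weight of a 'red' ball relative to a 'white' one, and the rejection step in the Fisher variant is just the crude conditional construction, not an efficient sampler.

```python
import numpy as np

def wallenius_draw(m1, m2, n, omega, rng):
    """Wallenius': n balls leave the urn one at a time; at each step a red
    ball is taken with probability proportional to omega * (reds left)."""
    x = 0                                    # red balls drawn so far
    for d in range(n):
        red_w = omega * (m1 - x)
        white_w = m2 - (d - x)
        if rng.random() < red_w / (red_w + white_w):
            x += 1
    return x

def fisher_draw(m1, m2, n, omega, rng):
    """Fisher's: every ball is taken independently with a biased probability,
    conditioned (here by simple rejection) on exactly n balls being taken."""
    p_red = omega / (1.0 + omega)            # gives odds ratio omega vs. 1/2
    while True:
        red = rng.binomial(m1, p_red)
        white = rng.binomial(m2, 0.5)
        if red + white == n:
            return red

rng = np.random.default_rng(7)
print(wallenius_draw(10, 15, 12, 2.0, rng), fisher_draw(10, 15, 12, 2.0, rng))
```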

17.
We extend the log-mean linear parameterization for binary data to discrete variables with an arbitrary number of levels and show that in this case too it can be used to parameterize bi-directed graph models. Furthermore, we show that the log-mean linear parameterization allows one to simultaneously represent marginal independencies among variables and marginal independencies that only appear when certain levels are collapsed into a single one. We illustrate the application of this property by means of an example based on genetic association studies involving single-nucleotide polymorphisms. More generally, this feature provides a natural way to reduce the parameter count, while preserving the independence structure, by means of substantive constraints that give additional insight into the association structure of the variables.

18.
19.
Clustered binary data are common in medical research and can be fitted with a logistic regression model with random effects, which belongs to the wider class of generalized linear mixed models. Likelihood-based estimation of the model parameters often has to handle intractable integrals, which has led to several estimation methods for overcoming this difficulty. The penalized quasi-likelihood (PQL) method is popular and computationally efficient in most cases. The expectation–maximization (EM) algorithm yields maximum-likelihood estimates but requires computing a possibly intractable integral in the E-step. Variants of the EM algorithm for evaluating the E-step are introduced. The Monte Carlo EM (MCEM) method approximates the expectation in the E-step using Monte Carlo samples, while the modified EM (MEM) method approximates it using Laplace's method. All these methods involve several approximation steps, so the corresponding parameter estimates contain inevitable errors (large or small) induced by the approximations. Understanding and quantifying this discrepancy theoretically is difficult because of the complexity of the approximations in each method, even when the focus is on clustered binary data. As a competing computational alternative, we also consider a non-parametric maximum-likelihood (NPML) method. We review and compare the PQL, MCEM, MEM and NPML methods for clustered binary data via a simulation study, which will be useful for researchers choosing an estimation method for their analysis.
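A minimal MCEM sketch for a random-intercept logistic model (all names are illustrative; the E-step uses the random-effect prior as an importance-sampling proposal, which is workable only for mild random-effect variances and modest cluster sizes).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def mcem_logit(y, X, g, n_iter=15, M=400, seed=0):
    """MCEM for y_ij ~ Bernoulli(expit(X_ij @ beta + u_i)), u_i ~ N(0, s2).
    E-step: importance samples of u_i from the N(0, s2) prior, weighted by
    the within-cluster Bernoulli likelihood.  M-step: s2 in closed form,
    beta by numerically maximizing the weighted complete-data log-likelihood."""
    rng = np.random.default_rng(seed)
    beta, s2 = np.zeros(X.shape[1]), 1.0
    clusters = [np.flatnonzero(g == k) for k in np.unique(g)]
    for _ in range(n_iter):
        u = rng.normal(0.0, np.sqrt(s2), size=(len(clusters), M))
        w = np.empty_like(u)
        for i, idx in enumerate(clusters):           # E-step weights per cluster
            p = expit((X[idx] @ beta)[:, None] + u[i][None, :])
            w[i] = np.prod(np.where(y[idx, None] == 1, p, 1 - p), axis=0)
        w /= w.sum(axis=1, keepdims=True)
        s2 = np.sum(w * u ** 2) / len(clusters)      # closed-form M-step for s2
        def neg_q(b):                                # weighted log-lik in beta
            total = 0.0
            for i, idx in enumerate(clusters):
                p = expit((X[idx] @ b)[:, None] + u[i][None, :])
                p = np.clip(p, 1e-12, 1 - 1e-12)
                ll = np.sum(np.where(y[idx, None] == 1, np.log(p), np.log1p(-p)), axis=0)
                total -= w[i] @ ll
            return total
        beta = minimize(neg_q, beta, method="BFGS").x
    return beta, s2
```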

20.
A spatial lattice model for binary data is constructed from two spatial scales linked through conditional probabilities. A coarse grid of lattice locations is specified, and all remaining locations (which we call the background) capture fine-scale spatial dependence. Binary data on the coarse grid are modelled with an autologistic distribution, conditional on the binary process on the background. The background behaviour is captured through a hidden Gaussian process after a logit transformation on its Bernoulli success probabilities. The likelihood is then the product of the (conditional) autologistic probability distribution and the hidden Gaussian–Bernoulli process. The parameters of the new model come from both spatial scales. A series of simulations illustrates the spatial-dependence properties of the model, and likelihood-based methods are used to estimate its parameters. Presence–absence data on corn borers in the roots of corn plants are used to illustrate how the model is fitted.
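The conditional specification driving such models is easy to simulate; below is a minimal Gibbs sweep for a plain autologistic lattice (the paper's two-scale construction with a hidden Gaussian background is not reproduced here).

```python
import numpy as np
from scipy.special import expit

def autologistic_gibbs(rows, cols, alpha, theta, sweeps=500, seed=0):
    """Gibbs sampling for the autologistic model: each site's full
    conditional is Bernoulli with logit(p) = alpha + theta * (sum of the
    four nearest-neighbour values)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=(rows, cols))
    nbrs = ((-1, 0), (1, 0), (0, -1), (0, 1))
    for _ in range(sweeps):
        for i in range(rows):
            for j in range(cols):
                s = sum(y[i + di, j + dj] for di, dj in nbrs
                        if 0 <= i + di < rows and 0 <= j + dj < cols)
                y[i, j] = rng.random() < expit(alpha + theta * s)
    return y

field = autologistic_gibbs(30, 30, alpha=-0.5, theta=0.6)
```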
