Similar articles
 20 similar articles found
1.
Model choice is one of the most crucial aspects of any statistical data analysis. It is well known that most models are only approximations to the true data-generating process, but among such approximations our goal is to select the ‘best’ one. Researchers typically consider a finite number of plausible models in statistical applications, and the related statistical inference depends on the chosen model. Hence, model comparison is required to identify the ‘best’ model among several candidate models. This article considers the problem of model selection for spatial data. The issue of model selection for spatial models has been addressed in the literature using traditional information criteria-based methods, even though such criteria were developed under the assumption of independent observations. We evaluate the performance of some of the popular model selection criteria via Monte Carlo simulation experiments using small to moderate samples. In particular, we compare the performance of some of the most popular information criteria, such as the Akaike information criterion (AIC), the Bayesian information criterion, and the corrected AIC, in selecting the true model. The ability of these criteria to select the correct model is evaluated under several scenarios. This comparison is made using various spatial covariance models ranging from stationary isotropic to nonstationary models.
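The three criteria named in the abstract above can be sketched in a few lines. This is a minimal illustration using the usual textbook forms of AIC, BIC and the corrected AIC, not the authors' simulation code; the function names, the candidate model labels and the log-likelihood values are all hypothetical.

```python
import math


def information_criteria(log_lik, k, n):
    """Return AIC, BIC and AICc for a fitted model.

    log_lik: maximised log-likelihood, k: number of estimated
    parameters, n: number of observations.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = -2.0 * log_lik + 2.0 * k
    bic = -2.0 * log_lik + k * math.log(n)
    # Small-sample correction added on top of AIC.
    aicc = aic + (2.0 * k * (k + 1)) / (n - k - 1)
    return {"AIC": aic, "BIC": bic, "AICc": aicc}


def select_model(candidates, n, criterion="AIC"):
    """Pick the candidate with the smallest value of the chosen criterion.

    candidates maps a model label to a (log_lik, k) pair.
    """
    scores = {
        name: information_criteria(log_lik, k, n)[criterion]
        for name, (log_lik, k) in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fits of two spatial covariance models to n = 50 sites.
candidates = {"exponential": (-100.0, 3), "matern": (-99.5, 4)}
best, scores = select_model(candidates, n=50, criterion="AIC")
print(best, scores)
```

Note that all three formulas treat the n observations as independent when counting information, which is exactly the assumption the abstract questions for spatial data.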

2.
Modeling spatial interactions that arise in spatially referenced data is commonly done by incorporating the spatial dependence into the covariance structure either explicitly or implicitly via an autoregressive model. In the case of lattice (regional summary) data, two common autoregressive models used are the conditional autoregressive model (CAR) and the simultaneously autoregressive model (SAR). Both of these models produce spatial dependence in the covariance structure as a function of a neighbor matrix W and often a fixed unknown spatial correlation parameter. This paper examines in detail the correlation structures implied by these models as applied to an irregular lattice in an attempt to demonstrate their many counterintuitive or impractical results. A data example is used for illustration where US statewide average SAT verbal scores are modeled and examined for spatial structure using different spatial models.
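To make "covariance as a function of a neighbor matrix W" concrete, the sketch below computes the correlation matrices implied by a CAR and a SAR model. It assumes the common homogeneous-variance forms, CAR covariance sigma2 * (I - rho*W)^(-1) and SAR covariance sigma2 * [(I - rho*W)^T (I - rho*W)]^(-1), with a symmetric binary W. The five-region lattice, the value rho = 0.3 and all function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def car_covariance(W, rho, sigma2=1.0):
    """Covariance implied by a CAR model: sigma2 * (I - rho*W)^(-1)."""
    n = W.shape[0]
    return sigma2 * np.linalg.inv(np.eye(n) - rho * W)


def sar_covariance(W, rho, sigma2=1.0):
    """Covariance implied by a SAR model y = rho*W*y + e."""
    n = W.shape[0]
    A = np.eye(n) - rho * W
    return sigma2 * np.linalg.inv(A.T @ A)


def to_correlation(S):
    """Rescale a covariance matrix to a correlation matrix."""
    d = np.sqrt(np.diag(S))
    return S / np.outer(d, d)


# Hypothetical irregular lattice: region 0 borders 1, 2 and 3; region 3 borders 4.
W = np.zeros((5, 5))
for i, j in [(0, 1), (0, 2), (0, 3), (3, 4)]:
    W[i, j] = W[j, i] = 1.0

rho = 0.3  # must keep I - rho*W positive definite
car_corr = to_correlation(car_covariance(W, rho))
sar_corr = to_correlation(sar_covariance(W, rho))

# Regions 0-1 and 3-4 are both first-order neighbours, yet their implied
# correlations need not be equal on an irregular lattice.
print("CAR:", car_corr[0, 1], car_corr[3, 4])
print("SAR:", sar_corr[0, 1], sar_corr[3, 4])
```

Printing the implied correlation for several neighbouring pairs is a quick way to see how the number of neighbours a region has affects the result.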

3.
In this study, Bayesian hierarchical models are evaluated using simulation scenarios to compare single-stage and multi-stage Bayesian estimation. Simulated datasets of lung cancer disease counts for men aged 65 and older across 44 wards in the London Health Authority were analysed using a range of spatially structured random effect components. The goals of this study are to determine which of these single-stage models performs best given a certain simulating model; how estimation methods (single- vs. multi-stage) compare in yielding posterior estimates of fixed effects in the presence of spatially structured random effects; and finally which of two spatial prior models, the Leroux or the ICAR model, performs best in a multi-stage context under different assumptions concerning spatial correlation. Among the fitted single-stage models without covariates, we found that when there is a low amount of variability in the distribution of disease counts, the BYM model is relatively robust to misspecification in terms of DIC, while the Leroux model is the least robust to misspecification. When these models were fit to data generated from models with covariates, we found that when there was one set of covariates, either spatially correlated or non-spatially correlated, changing the values of the fixed coefficients affected the ability of either the Leroux or ICAR model to fit the data well in terms of DIC. When there were multiple sets of spatially correlated covariates in the simulating model, however, we could not distinguish the goodness of fit to the data between these single-stage models. We found that the multi-stage modelling process via the Leroux and ICAR models generally reduced the variance of the posterior estimated fixed effects for data generated from models with covariates and a UH term compared to analogous single-stage models. Finally, we found the multi-stage Leroux model compares favourably to the multi-stage ICAR model in terms of DIC. We conclude that the multi-stage Leroux model should be seriously considered in applications of Bayesian disease mapping when an investigator desires to fit a model with both fixed effects and spatially structured random effects to Poisson count data.
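The two spatial priors compared in the abstract above differ only in their precision matrices, which the sketch below builds. It assumes the usual parameterisation of the Leroux prior, tau * [(1 - lam) * I + lam * (D - W)], where D holds the neighbour counts, and treats the ICAR prior as the lam = 1 case. The function names, the symbol `lam` and the three-ward example are illustrative assumptions and not the study's code.

```python
import numpy as np


def leroux_precision(W, lam, tau=1.0):
    """Precision matrix tau * [(1 - lam) * I + lam * (D - W)].

    W is a symmetric 0/1 neighbour matrix, lam in [0, 1] is the spatial
    dependence parameter and tau is a precision scale.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    W = np.asarray(W, dtype=float)
    D = np.diag(W.sum(axis=1))  # number of neighbours of each area
    return tau * ((1.0 - lam) * np.eye(W.shape[0]) + lam * (D - W))


def icar_precision(W, tau=1.0):
    """Intrinsic CAR precision tau * (D - W), the lam = 1 case."""
    return leroux_precision(W, lam=1.0, tau=tau)


# Hypothetical map of three wards in a row: 0 - 1 - 2.
W = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

print(leroux_precision(W, lam=0.5))
print(icar_precision(W))  # rows sum to zero, so this matrix is singular
```

With lam = 0 the Leroux precision reduces to tau * I (independent random effects), and with lam = 1 it becomes the rank-deficient ICAR precision; intermediate values interpolate between the two.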

4.
We introduce a flexible spatial point process model for spatial point patterns exhibiting linear structures, without incorporating a latent line process. The model is given by an underlying sequential point process model. Under this model, the points can be of one of three types: a ‘background point’, an ‘independent cluster point’ or a ‘dependent cluster point’. The background and independent cluster points are thought to exhibit ‘complete spatial randomness’, whereas the dependent cluster points are likely to occur close to previous cluster points. We demonstrate the flexibility of the model for producing point patterns with linear structures and propose to use the model as the likelihood in a Bayesian setting when analysing a spatial point pattern exhibiting linear structures. We illustrate this methodology by analysing two spatial point pattern datasets (locations of Bronze Age graves in Denmark and locations of mountain tops in Spain).

5.
Seasonal autoregressive (SAR) models have been modified and extended to model high-frequency time series that exhibit double seasonal patterns. Some researchers have introduced Bayesian inference for double seasonal autoregressive (DSAR) models; however, none has tackled the problem of Bayesian identification of DSAR models. To fill this gap, we present a Bayesian methodology to identify the order of DSAR models. Assuming the model errors are normally distributed and using three priors on the model parameters (the natural conjugate, g, and Jeffreys’ priors), we derive the joint posterior mass function of the model order in closed form. The posterior mass function can then be investigated, and the best order of the DSAR model is chosen as the value with the highest posterior probability for the time series being analyzed. We evaluate the proposed Bayesian methodology using a simulation study, and we then apply it to a real-world dataset of hourly internet traffic volume.

6.
Modelling count data with overdispersion and spatial effects
In this paper we consider regression models for count data allowing for overdispersion in a Bayesian framework. We account for unobserved heterogeneity in the data in two ways. On the one hand, we consider more flexible models than a common Poisson model allowing for overdispersion in different ways. In particular, the negative binomial and the generalized Poisson (GP) distribution are addressed where overdispersion is modelled by an additional model parameter. Further, zero-inflated models in which overdispersion is assumed to be caused by an excessive number of zeros are discussed. On the other hand, extra spatial variability in the data is taken into account by adding correlated spatial random effects to the models. This approach allows for an underlying spatial dependency structure which is modelled using a conditional autoregressive prior based on Pettitt et al. (Stat Comput 12(4):353–367, 2002). In an application the presented models are used to analyse the number of invasive meningococcal disease cases in Germany in the year 2004. Models are compared according to the deviance information criterion (DIC) suggested by Spiegelhalter et al. (J R Stat Soc B 64(4):583–640, 2002) and using proper scoring rules; see, for example, Gneiting and Raftery (Technical Report no. 463, University of Washington, 2004). We observe a rather high degree of overdispersion in the data which is captured best by the GP model when spatial effects are neglected. While the addition of spatial effects to the models allowing for overdispersion gives no or only little improvement, spatial Poisson models with spatially correlated or uncorrelated random effects are to be preferred over all other models according to the considered criteria.

7.
In this paper, we describe an analysis for data collected on a three-dimensional spatial lattice with treatments applied at the horizontal lattice points. Spatial correlation is accounted for using a conditional autoregressive model. Observations are defined as neighbours only if they are at the same depth. This allows the corresponding variance components to vary by depth. We use the Markov chain Monte Carlo method with block updating, together with Krylov subspace methods, for efficient estimation of the model. The method is applicable to both regular and irregular horizontal lattices and hence to data collected at any set of horizontal sites for a set of depths or heights, for example, water column or soil profile data. The model for the three-dimensional data is applied to agricultural trial data for five separate days taken roughly six months apart in order to determine possible relationships over time. The purpose of the trial is to determine a form of cropping that leads to less moist soils in the root zone and beyond. We estimate moisture for each date, depth and treatment accounting for spatial correlation and determine relationships of these and other parameters over time.

8.
Experimental designs can be constructed to be efficient in the presence of spatial correlation. Available construction methods include those based on autoregressive and linear variance models. This paper investigates spatial designs across a range of assumed autoregressive structures. Results show that when the spatial component is low relative to the independent error term, efficient spatial designs can be constructed without having to specify parameters for the spatial structure.

9.
Time-varying coefficient models with autoregressive and moving-average–generalized autoregressive conditional heteroscedasticity structure are proposed for examining the time-varying effects of risk factors in longitudinal studies. Compared with existing models in the literature, the proposed models give explicit patterns for the time-varying coefficients. Maximum likelihood and marginal likelihood (based on a Laplace approximation) are used to estimate the parameters in the proposed models. Simulation studies are conducted to evaluate the performance of these two estimation methods, which is measured in terms of the Kullback–Leibler divergence and the root mean square error. The marginal likelihood approach leads to the more accurate parameter estimates, although it is more computationally intensive. The proposed models are applied to the Framingham Heart Study to investigate the time-varying effects of covariates on coronary heart disease incidence. The Bayesian information criterion is used for specifying the time series structures of the coefficients of the risk factors.

10.
In this article, we apply the Bayesian approach to linear mixed effects models with autoregressive(p) random errors under mixture priors, with inference carried out by the Markov chain Monte Carlo (MCMC) method. The mixture structure of a point mass and a continuous distribution can help to select the variables in fixed and random effects models from the posterior sample generated using the MCMC method. Bayesian prediction of future observations is also one of the major concerns. To get the best model, we consider the commonly used highest posterior probability model and the median posterior probability model. Across the simulation study, both criteria tend to be needed to choose the best model. In terms of predictive accuracy, a real example confirms that the proposed method provides accurate results.

11.
Among the diverse frameworks that have been proposed for regression analysis of angular data, the projected multivariate linear model provides a particularly appealing and tractable methodology. In this model, the observed directional responses are assumed to correspond to the angles formed by latent bivariate normal random vectors that are assumed to depend upon covariates through a linear model. This implies an angular normal distribution for the observed angles, and incorporates a regression structure through a familiar and convenient relationship. In this paper we extend this methodology to accommodate clustered data (e.g., longitudinal or repeated measures data) by formulating a marginal version of the model and basing estimation on an EM‐like algorithm in which correlation among within‐cluster responses is taken into account by incorporating a working correlation matrix into the M step. A sandwich estimator is used for the parameter estimates’ covariance matrix. The methodology is motivated and illustrated using an example involving clustered measurements of microfibril angle on loblolly pine (Pinus taeda L.). Simulation studies are presented that evaluate the finite sample properties of the proposed fitting method. In addition, the relationship between within‐cluster correlation on the latent Euclidean vectors and the corresponding correlation structure for the observed angles is explored.

12.
In survey sampling, the estimation or prediction of characteristics of both the population and its subpopulations (domains) is one of the key issues. When estimating or predicting domain characteristics, one problem is finding additional sources of information that can increase the accuracy of estimators or predictors. One such source may be spatial and temporal autocorrelation. For the purposes of mean squared error (MSE) estimation, the standard assumption is that random variables are independent for population elements from different domains. Under that assumption, spatial correlation may be assumed only inside domains. In the paper, we assume a special case of the linear mixed model with two random components that obey the assumptions of the first-order spatial autoregressive model SAR(1) (but inside groups of domains instead of domains) and the first-order temporal autoregressive model AR(1). Based on the model, the empirical best linear unbiased predictor is proposed together with an estimator of its MSE that takes the spatial correlation between domains into account.

13.
Assessing the selective influence of amino acid properties is important in understanding evolution at the molecular level. A collection of methods and models has been developed in recent years to determine if amino acid sites in a given DNA sequence alignment display substitutions that are altering or conserving a prespecified set of amino acid properties. Residues showing an elevated number of substitutions that favorably alter a physicochemical property are considered targets of positive natural selection. Such approaches usually perform independent analyses for each amino acid property under consideration, without taking into account the fact that some of the properties may be highly correlated. We propose a Bayesian hierarchical regression model with latent factor structure that allows us to determine which sites display substitutions that conserve or radically change a set of amino acid properties, while accounting for the correlation structure that may be present across such properties. We illustrate our approach by analyzing simulated data sets and an alignment of lysin sperm DNA.

14.
Within the context of California's public report of coronary artery bypass graft (CABG) surgery outcomes, we first thoroughly review popular statistical methods for profiling healthcare providers. Extensive simulation studies are then conducted to compare profiling schemes based on hierarchical logistic regression (LR) modeling under various conditions. Both Bayesian and frequentist methods are evaluated in classifying hospitals into ‘better’, ‘normal’ or ‘worse’ service providers. The simulation results suggest that no single method dominates the others on all counts. Traditional schemes based on LR tend to identify too many false outliers, while those based on hierarchical modeling are relatively conservative. The issue of over-shrinkage in hierarchical modeling is also investigated using the 2005–2006 California CABG data set. The article provides theoretical and empirical evidence for choosing the right methodology for provider profiling.

15.
Periodic autoregressive (PAR) models with symmetric innovations are widely used in time series analysis, whereas inference for their asymmetric counterparts remains a challenge because of a number of problems related to the existing computational methods. In this paper, we use a relationship between periodic autoregressive and vector autoregressive (VAR) models to study maximum likelihood and Bayesian approaches to inference for a PAR model with normal and skew-normal innovations, where different kinds of estimation methods for the unknown parameters are examined. Several technical difficulties that are usually complicated to handle are reported. Results are compared with the existing classical solutions, and the practical implementations of the proposed algorithms are illustrated via comprehensive simulation studies. The methods developed in the study are applied to, and illustrated with, a real time series. The Bayes factor is also used to compare the multivariate normal model with the multivariate skew-normal model.

16.
With the ready availability of spatial databases and geographical information system software, statisticians are increasingly encountering multivariate modelling settings featuring associations of more than one type: spatial associations between data locations and associations between the variables within the locations. Although flexible modelling of multivariate point-referenced data has recently been addressed by using a linear model of co-regionalization, existing methods for multivariate areal data typically suffer from unnecessary restrictions on the covariance structure or undesirable dependence on the conditioning order of the variables. We propose a class of Bayesian hierarchical models for multivariate areal data that avoids these restrictions, permitting flexible and order-free modelling of correlations both between variables and across areal units. Our framework encompasses a rich class of multivariate conditionally autoregressive models that are computationally feasible via modern Markov chain Monte Carlo methods. We illustrate the strengths of our approach over existing models by using simulation studies and also offer a real data application involving annual lung, larynx and oesophageal cancer death-rates in Minnesota counties between 1990 and 2000.

17.
We introduce a Bayesian approach to test linear autoregressive moving-average (ARMA) models against threshold autoregressive moving-average (TARMA) models. First, the marginal posterior densities of all parameters of a TARMA model, including the threshold and delay, are obtained using a Gibbs sampler with the Metropolis–Hastings algorithm. Second, the reversible-jump Markov chain Monte Carlo (RJMCMC) method is adopted to calculate the posterior probabilities of the ARMA and TARMA models; posterior evidence in favor of the TARMA model indicates threshold nonlinearity. Finally, a procedure for building TARMA models is developed, based on the RJMCMC scheme and the Akaike information criterion (AIC) or Bayesian information criterion (BIC). Simulation experiments and a real data example show that our method works well for distinguishing an ARMA from a TARMA model and for building TARMA models.

18.
Bayesian analyses frequently employ two-stage hierarchical models involving two-variance parameters: one controlling measurement error and the other controlling the degree of smoothing implied by the model's higher level. These analyses can be hampered by poorly identified variances which may lead to difficulty in computing and in choosing reference priors for these parameters. In this paper, we introduce the class of two-variance hierarchical linear models and characterize the aspects of these models that lead to well-identified or poorly identified variances. These ideas are illustrated with a spatial analysis of a periodontal data set and examined in some generality for specific two-variance models including the conditionally autoregressive (CAR) and one-way random effect models. We also connect this theory with other constrained regression methods and suggest a diagnostic that can be used to search for missing spatially varying fixed effects in the CAR model.

19.
A general framework is presented for Bayesian inference of multivariate time series exhibiting long-range dependence. The series are modelled using a vector autoregressive fractionally integrated moving-average (VARFIMA) process, which can capture both short-term correlation structure and long-range dependence characteristics of the individual series, as well as interdependence and feedback relationships between the series. To facilitate a sampling-based Bayesian approach, the exact joint posterior density is derived for the parameters, in a form that is computationally simpler than direct evaluation of the likelihood, and a modified Gibbs sampling algorithm is used to generate samples from the complete conditional distribution associated with each parameter. The paper also shows how an approximate form of the joint posterior density may be used for long time series. The procedure is illustrated using sea surface temperatures measured at three locations along the central California coast. These series are believed to be interdependent due to similarities in local atmospheric conditions at the different locations, and previous studies have found that they exhibit ‘long memory’ when studied individually. The approach adopted here permits investigation of the effects on model estimation of the interdependence and feedback relationships between the series.

20.
We describe a model-based approach to analyse space–time surveillance data on meningococcal disease. Such data typically comprise a number of time series of disease counts, each representing a specific geographical area. We propose a hierarchical formulation, where latent parameters capture temporal, seasonal and spatial trends in disease incidence. We then add, for each area, a hidden Markov model to describe potential additional (autoregressive) effects of the number of cases at the previous time point. Different specifications for the functional form of this autoregressive term are compared which involve the number of cases in the same or in neighbouring areas. The two states of the Markov chain can be interpreted as representing an 'endemic' and a 'hyperendemic' state. The methodology is applied to a data set of monthly counts of the incidence of meningococcal disease in the 94 départements of France from 1985 to 1997. Inference is carried out by using Markov chain Monte Carlo simulation techniques in a fully Bayesian framework. We emphasize that a central feature of our model is the possibility of calculating, for each region and each time point, the posterior probability of being in a hyperendemic state, adjusted for global spatial and temporal trends, which we believe is of particular public health interest.
