Similar Articles (20 results)
1.
A spatiotemporal model for Mexico City ozone levels
Summary.  We consider hourly readings of concentrations of ozone over Mexico City and propose a model for spatial as well as temporal interpolation and prediction. The model is based on a time-varying regression of the observed readings on air temperature. Such a regression requires interpolated values of temperature at locations and times where readings are not available. These are obtained from a time-varying spatiotemporal model that is coupled to the model for the ozone readings. Two location-dependent harmonic components are added to account for the main periodicities that ozone presents during a given day and that are not explained through the covariate. The model incorporates spatial covariance structure for the observations and the parameters that define the harmonic components. Using the dynamic linear model framework, we show how to compute smoothed means and predictive values for ozone. We illustrate the methodology on data from September 1997.
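As an illustration of the dynamic-linear-model machinery this abstract refers to, the following is a minimal sketch in Python: a univariate time-varying regression of a toy ozone series on temperature plus one daily harmonic pair, filtered with standard Kalman recursions. It is not the authors' coupled spatiotemporal model; the data, variances and the single 24-hour harmonic are illustrative assumptions.

```python
# Minimal sketch: a dynamic linear model with a time-varying intercept,
# a temperature coefficient and one daily harmonic pair, filtered with
# standard Kalman recursions (toy data, illustrative variances).
import numpy as np

def kalman_filter_dlm(y, F, V, W, m0, C0):
    """Kalman filter for y_t = F_t' theta_t + v_t, theta_t = theta_{t-1} + w_t."""
    T, p = F.shape
    m, C = m0.copy(), C0.copy()
    means = np.empty((T, p))
    for t in range(T):
        a, R = m, C + W                      # random-walk evolution step
        f = F[t] @ a                         # one-step-ahead forecast
        Q = F[t] @ R @ F[t] + V              # forecast variance
        A = R @ F[t] / Q                     # Kalman gain
        m = a + A * (y[t] - f)               # filtered mean of theta_t
        C = R - np.outer(A, A) * Q           # filtered covariance
        means[t] = m
    return means

# Toy hourly data: a temperature covariate plus a 24-hour cycle.
rng = np.random.default_rng(0)
T = 240
hours = np.arange(T)
temp = 20 + 5 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 1, T)
ozone = 0.5 + 0.08 * temp + 0.3 * np.cos(2 * np.pi * hours / 24) + rng.normal(0, 0.1, T)

# Regressors: intercept, temperature, and one daily harmonic pair.
F = np.column_stack([np.ones(T), temp,
                     np.cos(2 * np.pi * hours / 24),
                     np.sin(2 * np.pi * hours / 24)])
p = F.shape[1]
means = kalman_filter_dlm(ozone, F, V=0.01, W=1e-5 * np.eye(p),
                          m0=np.zeros(p), C0=np.eye(p))
print("filtered coefficients at final hour:", means[-1].round(3))
```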

2.
The problem of inference in Bayesian Normal mixture models is known to be difficult. In particular, direct Bayesian inference (via quadrature) suffers from a combinatorial explosion in having to consider every possible partition of n observations into k mixture components, resulting in a computation time which is O(k^n). This paper explores the use of discretised parameters and shows that for equal-variance mixture models, direct computation time can be reduced to O(D^k nk), where relevant continuous parameters are each divided into D regions. As a consequence, direct inference is now possible on genuine data sets for small k, where the quality of approximation is determined by the level of discretisation. For large problems, where the computational complexity is still too great in O(D^k nk) time, discretisation can provide a convergence diagnostic for a Markov chain Monte Carlo analysis.
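A minimal sketch of the discretised direct computation, assuming a two-component equal-variance normal mixture with known mixing weight and variance and a flat prior over a grid of D values per component mean; the loop over the D^k grid, with an O(nk) likelihood evaluation at each point, is what gives the O(D^k nk) cost quoted above.

```python
# Grid-based (discretised) posterior for a two-component, equal-variance
# normal mixture with known weight and variance; cost is O(D^k nk).
import numpy as np
from itertools import product
from scipy.stats import norm

rng = np.random.default_rng(1)
n, k, D = 200, 2, 60
y = np.concatenate([rng.normal(-1, 1, n // 2), rng.normal(2, 1, n // 2)])

w, sigma = np.array([0.5, 0.5]), 1.0           # assumed known for the sketch
grid = np.linspace(-4, 5, D)                   # D discretised values per mean

log_post = np.full((D,) * k, -np.inf)
for idx in product(range(D), repeat=k):        # D^k grid points
    mu = grid[list(idx)]
    # mixture likelihood: O(nk) work per grid point
    dens = (w * norm.pdf(y[:, None], mu[None, :], sigma)).sum(axis=1)
    log_post[idx] = np.sum(np.log(dens))       # flat prior over the grid

post = np.exp(log_post - log_post.max())
post /= post.sum()
marg_mu1 = post.sum(axis=1)                    # discretised marginal of mu_1
print("posterior mode for (mu_1, mu_2):",
      grid[np.unravel_index(post.argmax(), post.shape)])
print("posterior mean of mu_1: %.2f" % (grid * marg_mu1).sum())
```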

3.
Mixture models are flexible tools in density estimation and classification problems. Bayesian estimation of such models typically relies on sampling from the posterior distribution using Markov chain Monte Carlo. Label switching arises because the posterior is invariant to permutations of the component parameters. Methods for dealing with label switching have been studied fairly extensively in the literature, with the most popular approaches being those based on loss functions. However, many of these algorithms turn out to be too slow in practice, and can be infeasible as the size and/or dimension of the data grow. We propose a new, computationally efficient algorithm based on a loss function interpretation, and show that it can scale up well in large data set scenarios. We then review earlier solutions that also scale well to large data sets, and compare their performance on simulated and real data sets. We conclude with a discussion and recommendations covering all the methods studied.
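As a rough illustration of loss-based relabelling (a simple squared-error loss against a running reference, not the specific algorithm proposed in the paper), the sketch below permutes each MCMC draw of the component means to undo label switching; the exhaustive search over permutations is only practical for small k.

```python
# Loss-based relabelling of MCMC draws from a k-component mixture:
# each draw is permuted to minimise squared distance to a running reference.
import numpy as np
from itertools import permutations

def relabel(draws):
    """draws: array of shape (n_draws, k) of component means from MCMC."""
    n_draws, k = draws.shape
    perms = list(permutations(range(k)))       # k! permutations (small k only)
    reference = draws[0].copy()
    out = np.empty_like(draws)
    for s in range(n_draws):
        # pick the permutation with the smallest squared-error loss
        losses = [np.sum((draws[s, list(p)] - reference) ** 2) for p in perms]
        best = perms[int(np.argmin(losses))]
        out[s] = draws[s, list(best)]
        reference = out[: s + 1].mean(axis=0)  # update the running reference
    return out

# Toy "label-switched" output: means around (-1, 2) with labels flipped at random.
rng = np.random.default_rng(2)
raw = np.stack([rng.normal([-1.0, 2.0], 0.1) for _ in range(500)])
flips = rng.random(500) < 0.5
raw[flips] = raw[flips, ::-1]
print("means before:", raw.mean(axis=0).round(2),
      " means after:", relabel(raw).mean(axis=0).round(2))
```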

4.
Summary. We describe a model-based approach to analyse space–time surveillance data on meningococcal disease. Such data typically comprise a number of time series of disease counts, each representing a specific geographical area. We propose a hierarchical formulation, where latent parameters capture temporal, seasonal and spatial trends in disease incidence. We then add—for each area—a hidden Markov model to describe potential additional (autoregressive) effects of the number of cases at the previous time point. Different specifications for the functional form of this autoregressive term, involving the number of cases in the same or in neighbouring areas, are compared. The two states of the Markov chain can be interpreted as representing an 'endemic' and a 'hyperendemic' state. The methodology is applied to a data set of monthly counts of the incidence of meningococcal disease in the 94 départements of France from 1985 to 1997. Inference is carried out by using Markov chain Monte Carlo simulation techniques in a fully Bayesian framework. We emphasize that a central feature of our model is the possibility of calculating—for each region and each time point—the posterior probability of being in a hyperendemic state, adjusted for global spatial and temporal trends, which we believe is of particular public health interest.
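The following sketch shows only the hidden-Markov core of such a model: forward-backward recursions for a single area with two Poisson states ('endemic' and 'hyperendemic') and fixed, illustrative parameters, returning the posterior state probabilities at each time point. The spatial, seasonal and fully Bayesian aspects of the paper are omitted.

```python
# Forward-backward recursions for a two-state hidden Markov model with
# Poisson emissions; returns P(state_t | all counts) for fixed parameters.
import numpy as np
from scipy.stats import poisson

def state_probabilities(counts, trans, rates, init):
    T, S = len(counts), len(rates)
    emis = poisson.pmf(np.asarray(counts)[:, None], np.asarray(rates)[None, :])  # (T, S)
    alpha = np.zeros((T, S))
    beta = np.zeros((T, S))
    alpha[0] = init * emis[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):                       # forward pass (normalised)
        alpha[t] = emis[t] * (alpha[t - 1] @ trans)
        alpha[t] /= alpha[t].sum()
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):              # backward pass (normalised)
        beta[t] = trans @ (emis[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)

counts = [2, 3, 1, 2, 9, 12, 10, 3, 2, 1]       # toy monthly counts for one area
trans = np.array([[0.9, 0.1], [0.2, 0.8]])      # endemic <-> hyperendemic
probs = state_probabilities(counts, trans, rates=[2.0, 10.0], init=np.array([0.9, 0.1]))
print("P(hyperendemic):", probs[:, 1].round(2))
```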

5.
When MCMC methods for Bayesian spatiotemporal modeling are applied to large geostatistical problems, challenges arise as a consequence of memory requirements, computing costs, and convergence monitoring. This article describes the parallelization of a reparametrized and marginalized posterior sampling (RAMPS) algorithm, which is carefully designed to generate posterior samples efficiently. The algorithm is implemented using the Parallel Linear Algebra Package (PLAPACK). The scalability of the algorithm is investigated via simulation experiments that are implemented using a cluster with 25 processors. The usefulness of the method is illustrated with an application to sulfur dioxide concentration data from the Air Quality System database of the U.S. Environmental Protection Agency.

6.
We describe a novel deterministic approximate inference technique for conditionally Gaussian state space models, i.e. state space models where the latent state consists of both multinomial and Gaussian distributed variables. The method can be interpreted as a smoothing pass and iteration scheme symmetric to an assumed density filter. It improves upon previously proposed smoothing passes by not making more approximations than implied by the projection onto the chosen parametric form, the assumed density. Experimental results show that the novel scheme outperforms these alternative deterministic smoothing passes. Comparisons with sampling methods suggest that the performance does not degrade with longer sequences.

7.
Summary. We propose modelling short-term pollutant exposure effects on health by using dynamic generalized linear models. The time series of count data are modelled by a Poisson distribution having mean driven by a latent Markov process; estimation is performed by the extended Kalman filter and smoother. This modelling strategy allows us to take into account possible overdispersion and time-varying effects of the covariates. These ideas are illustrated by reanalysing data on the relationship between daily non-accidental deaths and air pollution in the city of Birmingham, Alabama.
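A minimal sketch of the filtering idea for count data, assuming the simplest possible dynamic Poisson model (log link, a random-walk level and no covariates) and using an extended Kalman update; the values are illustrative and this is not the authors' full specification.

```python
# Extended Kalman filter for y_t ~ Poisson(exp(theta_t)) with a
# random-walk latent level theta_t (no covariates).
import numpy as np

def ekf_poisson(counts, W=0.01, m0=0.0, C0=1.0):
    """Return filtered mean counts exp(m_t)."""
    m, C = m0, C0
    level = []
    for y in counts:
        a, R = m, C + W                 # random-walk prediction step
        lam = np.exp(a)                 # linearisation point: E[y_t] = exp(a)
        H = lam                         # d E[y]/d theta evaluated at a
        S = H * R * H + lam             # forecast variance (Poisson variance = lam)
        K = R * H / S                   # Kalman gain
        m = a + K * (y - lam)           # update with the observed count
        C = (1.0 - K * H) * R
        level.append(np.exp(m))         # filtered mean count
    return np.array(level)

counts = [12, 15, 11, 18, 25, 30, 28, 22, 17, 14]   # toy daily death counts
print(ekf_poisson(counts).round(1))
```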

8.
Let a group G act on the sample space. This paper gives another proof of a theorem of Stein relating a group invariant family of posterior Bayesian probability regions to classical confidence regions when an appropriate prior is used. The example of the central multivariate normal distribution is discussed.

9.
We consider a set of data from 80 stations in the Venezuelan state of Guárico consisting of accumulated monthly rainfall in a time span of 16 years. The problem of modelling rainfall accumulated over fixed periods of time and recorded at meteorological stations at different sites is studied by using a model based on the assumption that the data follow a truncated and transformed multivariate normal distribution. The spatial correlation is modelled by using an exponentially decreasing correlation function and an interpolating surface for the means. Missing data and dry periods are handled within a Markov chain Monte Carlo framework using latent variables. We estimate the amount of rainfall as well as the probability of a dry period by using the predictive density of the data. We consider a model based on a full second-degree polynomial over the spatial co-ordinates as well as the first two Fourier harmonics to describe the variability during the year. Predictive inferences on the data show very realistic results, capturing the typical rainfall variability in time and space for that region. Important extensions of the model are also discussed.
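A minimal sketch of the data model only: simulated rainfall at a set of stations as a truncated, power-transformed Gaussian field with exponentially decaying spatial correlation. The coordinates, correlation range, mean and power parameters are illustrative assumptions, and the MCMC treatment of missing data and dry periods is not shown.

```python
# Simulate rainfall as a truncated, power-transformed Gaussian field with
# exponentially decreasing spatial correlation (illustrative parameter values).
import numpy as np

rng = np.random.default_rng(3)
coords = rng.uniform(0, 100, size=(80, 2))            # 80 station locations (km)
dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
corr = np.exp(-dists / 30.0)                          # exponential correlation, range 30 km

mu, sigma, beta = 0.3, 1.0, 1.5                       # mean, scale, power transform
Z = mu + sigma * rng.multivariate_normal(np.zeros(80), corr)
rain = np.maximum(Z, 0.0) ** beta                     # zero rainfall (dry month) when Z <= 0

print("simulated dry stations:", int((rain == 0).sum()), "of 80")
# Under this model the marginal probability of a dry month at a station is
# P(Z <= 0) = Phi(-mu / sigma), roughly 0.38 with these values.
```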

10.
Summary.  In functional data analysis, curves or surfaces are observed, up to measurement error, at a finite set of locations, for, say, a sample of n individuals. Often, the curves are homogeneous, except perhaps for individual-specific regions that provide heterogeneous behaviour (e.g. 'damaged' areas of irregular shape on an otherwise smooth surface). Motivated by applications with functional data of this nature, we propose a Bayesian mixture model, with the aim of dimension reduction, by representing the sample of n curves through a smaller set of canonical curves. We propose a novel prior on the space of probability measures for a random curve which extends the popular Dirichlet priors by allowing local clustering: non-homogeneous portions of a curve can be allocated to different clusters and the n individual curves can be represented as recombinations (hybrids) of a few canonical curves. More precisely, the proposed prior envisions a conceptual hidden factor with k levels that acts locally on each curve. We discuss several models incorporating this prior and illustrate its performance with simulated and real data sets. We examine theoretical properties of the proposed finite hybrid Dirichlet mixtures, specifically, their behaviour as the number of mixture components goes to ∞ and their connection with Dirichlet process mixtures.

11.
ABSTRACT

We propose a generalization of the one-dimensional Jeffreys' rule in order to obtain non-informative prior distributions for non-regular models, taking into account the comments made by Jeffreys in his 1946 article. These non-informative priors are parameterization invariant, and the resulting Bayesian intervals behave well under frequentist evaluation. In some important cases, we can generate non-informative distributions for multi-parameter models with non-regular parameters. In non-regular models, the Bayesian method offers a satisfactory solution to the inference problem and also avoids the difficulties that the maximum likelihood estimator encounters in these models. Finally, we obtain non-informative distributions for job-search and deterministic frontier production homogeneous models.

12.
Markov chain Monte Carlo (MCMC) implementations of Bayesian inference for latent spatial Gaussian models are very computationally intensive, and restrictions on storage and computation time are limiting their application to large problems. Here we propose various parallel MCMC algorithms for such models. The algorithms' performance is discussed with respect to a simulation study, which demonstrates the increase in speed with which the algorithms explore the posterior distribution as a function of the number of processors. We also discuss how feasible problem size is increased by use of these algorithms.

13.
One of the fundamental issues in analyzing microarray data is to determine which genes are expressed and which ones are not for a given group of subjects. In datasets where many genes are expressed and many are not expressed (i.e., underexpressed), a bimodal distribution for the gene expression levels often results, where one mode of the distribution represents the expressed genes and the other mode represents the underexpressed genes. To model this bimodality, we propose a new class of mixture models that utilize a random threshold value for accommodating bimodality in the gene expression distribution. Theoretical properties of the proposed model are carefully examined. We use this new model to examine the problem of differential gene expression between two groups of subjects, develop prior distributions, and derive a new criterion for determining which genes are differentially expressed between the two groups. Prior elicitation is carried out using empirical Bayes methodology in order to estimate the threshold value as well as elicit the hyperparameters for the two component mixture model. The new gene selection criterion is demonstrated via several simulations to have excellent false positive rate and false negative rate properties. A gastric cancer dataset is used to motivate and illustrate the proposed methodology.
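For orientation only, here is a plain two-component normal mixture fitted by EM, with the posterior probability of the higher-mean component used to call a gene 'expressed'. This is a generic stand-in for modelling a bimodal expression distribution, not the authors' random-threshold mixture or their empirical-Bayes prior elicitation.

```python
# Two-component normal mixture fitted by EM; r is the posterior probability
# that an observation belongs to the higher-mean ("expressed") component.
import numpy as np
from scipy.stats import norm

def em_two_normals(x, iters=200):
    w = 0.5
    mu = np.array([x.min(), x.max()])
    sd = np.array([x.std(), x.std()])
    for _ in range(iters):
        # E-step: responsibility of the second (higher-mean) component
        d0, d1 = norm.pdf(x, mu[0], sd[0]), norm.pdf(x, mu[1], sd[1])
        r = w * d1 / ((1 - w) * d0 + w * d1)
        # M-step: update mixing weight, means and standard deviations
        w = r.mean()
        mu = np.array([np.average(x, weights=1 - r), np.average(x, weights=r)])
        sd = np.array([np.sqrt(np.average((x - mu[0]) ** 2, weights=1 - r)),
                       np.sqrt(np.average((x - mu[1]) ** 2, weights=r))])
    return w, mu, sd, r

rng = np.random.default_rng(4)
expr = np.concatenate([rng.normal(4, 0.5, 700), rng.normal(7, 0.8, 300)])  # toy log-expression
w, mu, sd, r = em_two_normals(expr)
print("estimated component means:", mu.round(2),
      "| genes called expressed:", int((r > 0.5).sum()))
```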

14.
Bayesian emulation of complex multi-output and dynamic computer models
Computer models are widely used in scientific research to study and predict the behaviour of complex systems. The run times of computer-intensive simulators are often such that it is impractical to make the thousands of model runs that are conventionally required for sensitivity analysis, uncertainty analysis or calibration. In response to this problem, highly efficient techniques have recently been developed based on a statistical meta-model (the emulator) that is built to approximate the computer model. The approach, however, is less straightforward for dynamic simulators, designed to represent time-evolving systems. Generalisations of the established methodology to allow for dynamic emulation are here proposed and contrasted. Advantages and difficulties are discussed and illustrated with an application to the Sheffield Dynamic Global Vegetation Model, developed within the UK Centre for Terrestrial Carbon Dynamics.
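A minimal sketch of a static emulator only (the dynamic generalisations discussed above are not shown): a Gaussian process fitted to a handful of runs of a toy 'simulator' and then used to predict, with uncertainty, at untried inputs. The kernel choice and the toy function are assumptions made for illustration.

```python
# Gaussian-process emulator of an inexpensive stand-in "simulator":
# fit to a small design of runs, then predict with uncertainty elsewhere.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def simulator(x):                     # stand-in for an expensive computer model
    return np.sin(3 * x) + 0.5 * x

X_design = np.linspace(0, 3, 8).reshape(-1, 1)        # a small design of runs
y_design = simulator(X_design).ravel()

gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(length_scale=1.0),
                              normalize_y=True)
gp.fit(X_design, y_design)

X_new = np.array([[0.4], [1.7], [2.9]])
mean, sd = gp.predict(X_new, return_std=True)
for x, m, s in zip(X_new.ravel(), mean, sd):
    print("x=%.1f  emulator=%.3f ± %.3f  simulator=%.3f" % (x, m, 2 * s, simulator(x)))
```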

15.
A mixture measurement error model built upon skew normal and normal distributions is developed to evaluate the impact of measurement errors on parameter inference in logistic regression. Data generated from survey questionnaires are usually error-contaminated. We consider two types of errors: person-specific bias and random errors. Person-specific bias is modelled using a skew normal distribution, and the distribution of random errors is described by a normal distribution. Intensive simulations are conducted to evaluate the contribution of each component in the mixture to the outcomes of interest. The proposed method is then applied to a questionnaire data set generated from a neural tube defect study. Simulation results and the real-data application indicate that ignoring measurement errors or misspecifying the measurement error components can both produce misleading results, especially when the measurement errors are actually skew-distributed. The inferred parameters can be attenuated or inflated depending on how the measurement error components are specified. We expect these findings to underscore the importance of adjusting for measurement errors and thus benefit future data collection efforts.
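A minimal sketch of the error mechanism and its consequence, under assumed parameter values: a covariate contaminated by skew-normal person-specific bias plus normal random error, and the attenuation this induces in a logistic-regression slope fitted by maximum likelihood. It is a simulation illustration, not the proposed correction method.

```python
# Simulate skew-normal bias plus normal noise on a covariate and compare
# logistic-regression slopes with and without the measurement error.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import skewnorm

def logit_mle(x, y):
    def nll(b):
        eta = b[0] + b[1] * x
        return np.sum(np.logaddexp(0.0, eta) - y * eta)   # negative log-likelihood
    return minimize(nll, np.zeros(2), method="BFGS").x

rng = np.random.default_rng(5)
n = 5000
x_true = rng.normal(0, 1, n)
p = 1 / (1 + np.exp(-(-0.5 + 1.0 * x_true)))              # true slope = 1.0
y = rng.binomial(1, p)

bias = skewnorm.rvs(a=4, scale=0.8, size=n, random_state=rng)  # person-specific bias
x_obs = x_true + bias + rng.normal(0, 0.5, n)                  # plus random error

print("slope with the true covariate:        %.2f" % logit_mle(x_true, y)[1])
print("slope with the error-prone covariate: %.2f" % logit_mle(x_obs, y)[1])
```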

16.
We develop a hierarchical Bayesian approach for inference in random coefficient dynamic panel data models. Our approach allows for the initial values of each unit's process to be correlated with the unit-specific coefficients. We impose a stationarity assumption for each unit's process by assuming that the unit-specific autoregressive coefficient is drawn from a logit-normal distribution. Our method is shown to have favorable properties compared to the mean group estimator in a Monte Carlo study. We apply our approach to analyze energy and protein intakes among individuals from the Philippines.
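A minimal sketch of the stationarity device described above, with illustrative hyperparameters: unit-specific AR(1) coefficients drawn from a logit-normal distribution so that each lies in (0, 1), and a short dynamic panel simulated from them.

```python
# Draw unit-specific AR(1) coefficients from a logit-normal distribution
# (so each lies in (0, 1)) and simulate a random-coefficient dynamic panel.
import numpy as np

rng = np.random.default_rng(6)
N, T = 50, 20                                   # units and time periods
eta = rng.normal(loc=0.5, scale=0.7, size=N)    # latent normal draws
phi = 1 / (1 + np.exp(-eta))                    # logit-normal AR coefficients in (0, 1)

alpha = rng.normal(0, 1, N)                     # unit-specific intercepts
y = np.zeros((N, T))
y[:, 0] = alpha / (1 - phi)                     # start near each unit's stationary mean
for t in range(1, T):
    y[:, t] = alpha + phi * y[:, t - 1] + rng.normal(0, 0.5, N)

print("range of unit AR coefficients: %.2f to %.2f" % (phi.min(), phi.max()))
```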

17.
Statistical models are sometimes incorporated into computer software for making predictions about future observations. When the computer model consists of a single statistical model this corresponds to estimation of a function of the model parameters. This paper is concerned with the case that the computer model implements multiple, individually-estimated statistical sub-models. This case frequently arises, for example, in models for medical decision making that derive parameter information from multiple clinical studies. We develop a method for calculating the posterior mean of a function of the parameter vectors of multiple statistical models that is easy to implement in computer software, has high asymptotic accuracy, and has a computational cost linear in the total number of model parameters. The formula is then used to derive a general result about posterior estimation across multiple models. The utility of the results is illustrated by application to clinical software that estimates the risk of fatal coronary disease in people with diabetes.

18.
Mixtures of linear mixed-effects models have received considerable attention in longitudinal studies, including medical research, social science and economics. The inferential question of interest is often the identification of critical factors that affect the responses. We consider a Bayesian approach to selecting the important fixed and random effects in a finite mixture of linear mixed-effects models. To accomplish our goal, latent variables are introduced to facilitate the identification of influential fixed and random components and to classify the membership of observations in the longitudinal data. A spike-and-slab prior for the regression coefficients is adopted to sidestep the potential complications of highly collinear covariates and to handle large-p, small-n issues in variable selection. We employ Markov chain Monte Carlo (MCMC) sampling techniques for posterior inference and explore the performance of the proposed method in simulation studies, followed by an analysis of real psychiatric data on depressive disorder.
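To make the prior concrete, the sketch below draws regression coefficients from a spike-and-slab prior: a Bernoulli inclusion indicator selects between a point mass at zero and a diffuse normal slab. The inclusion probability and slab scale are illustrative values, not the paper's.

```python
# Monte Carlo draws from a spike-and-slab prior on p regression coefficients.
import numpy as np

rng = np.random.default_rng(7)
p, draws = 10, 5000
pi_incl = 0.2                                    # prior inclusion probability
slab_sd = 2.0                                    # standard deviation of the slab

gamma = rng.binomial(1, pi_incl, size=(draws, p))        # inclusion indicators
beta = gamma * rng.normal(0, slab_sd, size=(draws, p))   # spike at 0 when gamma = 0

print("prior P(beta_j != 0):", gamma.mean(axis=0).round(2))
print("prior sd of beta_j  :", beta.std(axis=0).round(2))  # about sqrt(pi_incl) * slab_sd
```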

19.
This contribution investigates the problem of estimating the size of a population, also known as the missing-cases problem. Suppose a registration system aims to identify all cases having a certain characteristic, such as a specific disease (cancer, heart disease, ...), a disease-related condition (HIV, heroin use, ...) or a specific behavior (driving a car without a license). Every case in such a registration system has a certain notification history, in that it might have been identified several times (at least once), which can be understood as a particular capture-recapture situation. Cases that have never been listed on any occasion are left out, and it is this frequency that one wants to estimate. In this paper, modelling concentrates on the counting distribution, i.e. the distribution of the variable that counts how often a given case has been identified by the registration system. Besides very simple models such as the binomial or Poisson distribution, finite (nonparametric) mixtures of these are considered, providing rather flexible modelling tools. Estimation is done by maximum likelihood by means of the EM algorithm. A case study on heroin users in Bangkok in 2001 completes the contribution.
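A minimal sketch of the simplest case mentioned above: a zero-truncated Poisson model for the counting distribution, fitted by maximum likelihood, with the population size estimated as n / (1 - P(count = 0)). The finite-mixture extension and the Bangkok data are not reproduced here.

```python
# Fit a zero-truncated Poisson to the observed notification counts and
# estimate the total population size, including the never-identified cases.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(8)
N_true, lam_true = 1000, 0.8
all_counts = rng.poisson(lam_true, N_true)
observed = all_counts[all_counts > 0]            # cases never identified are missing
n = len(observed)

def neg_loglik(lam):
    # zero-truncated Poisson log-likelihood (x! terms dropped)
    return -np.sum(observed * np.log(lam) - lam - np.log(1 - np.exp(-lam)))

lam_hat = minimize_scalar(neg_loglik, bounds=(1e-6, 10), method="bounded").x
N_hat = n / (1 - np.exp(-lam_hat))               # Horvitz-Thompson-type estimate
print("observed cases: %d, estimated total: %.0f (true %d)" % (n, N_hat, N_true))
```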

20.
Sow farm management requires appropriate methods to forecast the evolution of the sow population structure. We describe two models for this purpose. The first is a semi-Markov process model, used for long-term predictions and strategic management. The second is a state-space model for continuous proportions, used for short-term predictions and operational management.
