1.

Dynamic changepoint detection in count time series: a particle filter approach

Paulo H. D. da Silva Cibele Q. da-Silva 《Journal of Statistical Computation and Simulation》2017,87(1):42-68

We study Bayesian dynamic models for detecting changepoints in count time series that present structural breaks. As the inferential approach, we develop a parameter learning version of the algorithm proposed by Chopin [Chopin N. Dynamic detection of changepoints in long time series. Annals of the Institute of Statistical Mathematics 2007;59:349–366.], called the Chopin filter with parameter learning, which allows us to estimate the static parameters in the model. In this extension, the static parameters are addressed by using the kernel smoothing approximations proposed by Liu and West [Liu J, West M. Combined parameters and state estimation in simulation-based filtering. In: Doucet A, de Freitas N, Gordon N, editors. Sequential Monte Carlo methods in practice. New York: Springer-Verlag; 2001]. The proposed methodology is then applied to both simulated and real data sets and the time series models include distributions that allow for overdispersion and/or zero inflation. Since our procedure is general, robust and naturally adaptive because the particle filter approach does not require restrictive specifications to ensure its validity and effectiveness, we believe it is a valuable alternative for dealing with the problem of detecting changepoints in count time series. The proposed methodology is also suitable for count time series with no changepoints and for independent count data.

2.

Exact and efficient Bayesian inference for multiple changepoint problems

Paul Fearnhead 《Statistics and Computing》2006,16(2):203-213

We demonstrate how to perform direct simulation from the posterior distribution of a class of multiple changepoint models where the number of changepoints is unknown. The class of models assumes independence between the posterior distribution of the parameters associated with segments of data between successive changepoints. This approach is based on the use of recursions, and is related to work on product partition models. The computational complexity of the approach is quadratic in the number of observations, but an approximate version, which introduces negligible error, and whose computational cost is roughly linear in the number of observations, is also possible. Our approach can be useful, for example within an MCMC algorithm, even when the independence assumptions do not hold. We demonstrate our approach on coal-mining disaster data and on well-log data. Our method can cope with a range of models, and exact simulation from the posterior distribution is possible in a matter of minutes.

3.

Efficient Bayesian analysis of multiple changepoint models with dependence across segments

Paul Fearnhead Zhen Liu 《Statistics and Computing》2011,21(2):217-229

We consider Bayesian analysis of a class of multiple changepoint models. While there are a variety of efficient ways to analyse these models if the parameters associated with each segment are independent, there are few general approaches for models where the parameters are dependent. Under the assumption that the dependence is Markov, we propose an efficient online algorithm for sampling from an approximation to the posterior distribution of the number and position of the changepoints. In a simulation study, we show that the approximation introduced is negligible. We illustrate the power of our approach through fitting piecewise polynomial models to data, under a model which allows for either continuity or discontinuity of the underlying curve at each changepoint. This method is competitive with, or outperforms, other methods for inferring curves from noisy data; and uniquely it allows for inference of the locations of discontinuities in the underlying curve.

4.

A changepoint statistic with uniform type I error probabilities

Peter Rogerson Peter Kedron 《Communications in Statistics - Theory and Methods》2013,42(16):4663-4672

ABSTRACT

Likelihood ratio tests for a change in mean in a sequence of independent, normal random variables are based on the maximum two-sample t-statistic, where the maximum is taken over all possible changepoints. The maximum t-statistic has the undesirable characteristic that Type I errors are not uniformly distributed across possible changepoints. False positives occur more frequently near the ends of the sequence and occur less frequently near the middle of the sequence. In this paper we describe an alternative statistic that is based upon a minimum p-value, where the minimum is taken over all possible changepoints. The p-value at any particular changepoint is based upon both the two-sample t-statistic at that changepoint and the probability that the maximum two-sample t-statistic is achieved at that changepoint. The new statistic has a more uniform distribution of Type I errors across potential changepoints and it compares favorably with respect to statistical power, false discovery rates, and the mean square error of changepoint estimates.
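The baseline statistic described in this abstract, the maximum two-sample t-statistic over all candidate changepoints, can be illustrated with a minimal sketch. This covers only the classical maximum-t scan, not the authors' minimum p-value statistic. The function names, the `min_seg` parameter and the toy data are assumptions made for illustration.

```python
import math


def two_sample_t(left, right):
    """Pooled-variance two-sample t-statistic for a difference in means."""
    n1, n2 = len(left), len(right)
    m1, m2 = sum(left) / n1, sum(right) / n2
    ss = sum((x - m1) ** 2 for x in left) + sum((x - m2) ** 2 for x in right)
    pooled_var = ss / (n1 + n2 - 2)  # assumes the data are not perfectly constant
    return (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def max_t_changepoint(series, min_seg=2):
    """Scan every split series[:k] / series[k:] and keep the largest |t|.

    min_seg (hypothetical parameter) is the minimum segment length on each side.
    """
    best_k, best_t = None, -1.0
    for k in range(min_seg, len(series) - min_seg + 1):
        t = abs(two_sample_t(series[:k], series[k:]))
        if t > best_t:
            best_k, best_t = k, t
    return best_k, best_t


# Toy series with a mean shift after the fifth observation.
data = [0.1, -0.2, 0.0, 0.3, -0.1, 2.1, 1.9, 2.2, 1.8, 2.0]
k, t = max_t_changepoint(data)
print(f"estimated changepoint index: {k}, max |t| = {t:.2f}")
```

Because splits near the ends compare a very short segment against a long one, this scan is the setting in which the abstract reports a non-uniform distribution of Type I errors.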

5.

Robust mixture modelling using the t distribution (Total citations: 2; self-citations: 0; citations by others: 2)

D. Peel G. J. McLachlan 《Statistics and Computing》2000,10(4):339-348

Normal mixture models are being increasingly used to model the distributions of a wide variety of random phenomena and to cluster sets of continuous multivariate data. However, for a set of data containing a group or groups of observations with longer than normal tails or atypical observations, the use of normal components may unduly affect the fit of the mixture model. In this paper, we consider a more robust approach by modelling the data by a mixture of t distributions. The use of the ECM algorithm to fit this t mixture model is described and examples of its use are given in the context of clustering multivariate data in the presence of atypical observations in the form of background noise.

6.

Likelihood-based approaches for multivariate linear models under inequality constraints for incomplete data

Shurong Zheng Jianhua Guo Ning-Zhong Shi Guo-Liang Tian 《Journal of statistical planning and inference》2012

In this paper, we consider a multivariate linear model with complete/incomplete data, where the regression coefficients are subject to a set of linear inequality restrictions. We first develop an expectation/conditional maximization (ECM) algorithm for calculating restricted maximum likelihood estimates of parameters of interest. We then establish the corresponding convergence properties for the proposed ECM algorithm. Applications to growth curve models and linear mixed models are presented. Confidence interval construction via the double-bootstrap method is provided. Some simulation studies are performed and a real example is used to illustrate the proposed methods.

7.

Modelling growth and decline in lung function in Duchenne's muscular dystrophy with an augmented linear mixed effects model

Marc A. Scott Robert G. Norman Kenneth I. Berger 《Journal of the Royal Statistical Society. Series C, Applied statistics》2004,53(3):507-521

Summary. Longitudinal modelling of lung function in Duchenne's muscular dystrophy is complicated by a mixture of both growth and decline in lung function within each subject, an unknown point of separation between these phases and significant heterogeneity between individual trajectories. Linear mixed effects models can be used, assuming a single changepoint for all cases; however, this assumption may be incorrect. The paper describes an extension of linear mixed effects modelling in which random changepoints are integrated into the model as parameters and estimated by using a stochastic EM algorithm. We find that use of this 'mixture modelling' approach improves the fit significantly.

8.

Bayesian P-splines and advanced computing in R for a changepoint analysis on spatio-temporal point processes

L. Altieri D. Cocchi F. Greco J.B. Illian E.M. Scott 《Journal of Statistical Computation and Simulation》2016,86(13):2531-2545

ABSTRACT

This work presents advanced computational aspects of a new method for changepoint detection on spatio-temporal point process data. We summarize the methodology, based on building a Bayesian hierarchical model for the data and declaring prior conjectures on the number and positions of the changepoints, and show how to take decisions regarding the acceptance of potential changepoints. The focus of this work is on choosing an approach that detects the correct changepoint and delivers smooth reliable estimates in a feasible computational time; we propose Bayesian P-splines as a suitable tool for managing spatial variation, from both a computational and a model fitting performance perspective. The main computational challenges are outlined and a solution involving parallel computing in R is proposed and tested on a simulation study. An application is also presented on a data set of seismic events in Italy over the last 20 years.

9.

Bayesian binary segmentation procedure for detecting streakiness in sports

Tae Young Yang 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2004,167(4):627-637

Summary. When an individual player or team enjoys periods of good form, and when these occur, is a widely observed phenomenon typically called 'streakiness'. It is interesting to assess which team is a streaky team, or who is a streaky player in sports. Such competitors might have a large number of successes during some periods and few or no successes during other periods. Thus, their success rate is not constant over time. We provide a Bayesian binary segmentation procedure for locating changepoints and the associated success rates simultaneously for these competitors. The procedure is based on a series of nested hypothesis tests each using the Bayes factor or the Bayesian information criterion. At each stage, we only need to compare a model with one changepoint with a model based on a constant success rate. Thus, the method circumvents the computational complexity that we would normally face in problems with an unknown number of changepoints. We apply the procedure to data corresponding to sports teams and players from basketball, golf and baseball.
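The nested comparison described in this abstract (one changepoint versus a constant success rate, applied recursively) can be sketched for a sequence of 0/1 outcomes. This is a rough illustration under stated assumptions, not the paper's procedure: it uses a plain BIC with maximum-likelihood rates, and the parameter counts (one for a constant rate, three for two rates plus a location) are an assumption. All function names are hypothetical.

```python
import math


def bernoulli_loglik(successes, trials):
    """Maximised Bernoulli log-likelihood for a segment with a constant rate."""
    if successes == 0 or successes == trials:
        return 0.0  # the fitted rate is 0 or 1, so the likelihood is 1
    p = successes / trials
    return successes * math.log(p) + (trials - successes) * math.log(1 - p)


def best_split(outcomes):
    """Return (k, loglik) for the split outcomes[:k] / outcomes[k:] with the highest likelihood."""
    n, total = len(outcomes), sum(outcomes)
    best_k, best_ll, left = None, -math.inf, 0
    for k in range(1, n):
        left += outcomes[k - 1]
        ll = bernoulli_loglik(left, k) + bernoulli_loglik(total - left, n - k)
        if ll > best_ll:
            best_k, best_ll = k, ll
    return best_k, best_ll


def binary_segmentation(outcomes, offset=0):
    """Recursively split while the one-changepoint model has the lower BIC."""
    n = len(outcomes)
    if n < 2:
        return []
    bic_constant = -2 * bernoulli_loglik(sum(outcomes), n) + math.log(n)
    k, ll = best_split(outcomes)
    bic_split = -2 * ll + 3 * math.log(n)  # assumed count: two rates plus one location
    if bic_split >= bic_constant:
        return []
    return (binary_segmentation(outcomes[:k], offset)
            + [offset + k]
            + binary_segmentation(outcomes[k:], offset + k))


# Toy record: 20 failures followed by 20 successes.
print(binary_segmentation([0] * 20 + [1] * 20))
```

Each stage only compares two models on one segment, which is how the approach sidesteps a search over all possible numbers of changepoints.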

10.

Modeling with Mixtures of Linear Regressions

Kert Viele Barbara Tong 《Statistics and Computing》2002,12(4):315-330

Consider data (x_1, y_1), ..., (x_n, y_n), where each x_i may be vector valued, and the distribution of y_i given x_i is a mixture of linear regressions. This provides a generalization of mixture models which do not include covariates in the mixture formulation. This mixture of linear regressions formulation has appeared in the computer science literature under the name Hierarchical Mixtures of Experts model. This model has been considered from both frequentist and Bayesian viewpoints. We focus on the Bayesian formulation. Previously, estimation of the mixture of linear regression model has been done through straightforward Gibbs sampling with latent variables. This paper contributes to this field in three major areas. First, we provide a theoretical underpinning to the Bayesian implementation by demonstrating consistency of the posterior distribution. This demonstration is done by extending results in Barron, Schervish and Wasserman (Annals of Statistics 27: 536–561, 1999) on bracketing entropy to the regression setting. Second, we demonstrate through examples that straightforward Gibbs sampling may fail to effectively explore the posterior distribution and provide alternative algorithms that are more accurate. Third, we demonstrate the usefulness of the mixture of linear regressions framework in Bayesian robust regression. The methods described in the paper are applied to two examples.

11.

Change-points in stochastic ordering

Chengjie Xiong George A. Milliken 《Communications in Statistics - Theory and Methods》2013,42(2):381-400

This article studies the problem of testing and locating changepoints in stochastic ordering. We propose a sequential process to detect the changepoints from two multinomial distributions. We also obtain the maximum likelihood estimators of two multinomial probability vectors under the assumption that the cumulative distributions have a changepoint. Asymptotically unbiased Akaike's information criterion is used to estimate the changepoints of two discrete probability distributions. Finally, we demonstrate our procedure by studying a data set pertaining to average daily insulin dose from the Boston Collaborative Drug Surveillance Program and locate the changepoints in stochastic ordering.

12.

A computationally efficient nonparametric approach for changepoint detection

Kaylea Haynes Paul Fearnhead Idris A. Eckley 《Statistics and Computing》2017,27(5):1293-1305

In this paper we build on an approach proposed by Zou et al. (2014) for nonparametric changepoint detection. This approach defines the best segmentation for a data set as the one which minimises a penalised cost function, with the cost function defined in terms of minus a non-parametric log-likelihood for data within each segment. Minimising this cost function is possible using dynamic programming, but their algorithm had a computational cost that is cubic in the length of the data set. To speed up computation, Zou et al. (2014) resorted to a screening procedure which means that the estimated segmentation is no longer guaranteed to be the global minimum of the cost function. We show that the screening procedure adversely affects the accuracy of the changepoint detection method, and show how a faster dynamic programming algorithm, pruned exact linear time (PELT) (Killick et al. 2012), can be used to find the optimal segmentation with a computational cost that can be close to linear in the amount of data. PELT requires a penalty to avoid under/over-fitting the model which can have a detrimental effect on the quality of the detected changepoints. To overcome this issue we use a relatively new method, changepoints over a range of penalties (Haynes et al. 2016), which finds all of the optimal segmentations for multiple penalty values over a continuous range. We apply our method to detect changes in heart-rate during physical activity.
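The idea of choosing the segmentation that minimises a penalised cost by dynamic programming can be sketched as follows. Two substitutions are made for brevity and should be read as assumptions: the segment cost is a residual sum of squares around the segment mean (a Gaussian change-in-mean cost, not the nonparametric log-likelihood cost of the abstract), and the recursion is plain quadratic-time optimal partitioning without PELT's pruning step. The function names are hypothetical.

```python
def segment_cost(prefix, prefix_sq, i, j):
    """Residual sum of squares of series[i:j] around its own mean, from prefix sums."""
    n = j - i
    s = prefix[j] - prefix[i]
    return (prefix_sq[j] - prefix_sq[i]) - s * s / n


def optimal_partitioning(series, penalty):
    """Return the changepoint indices minimising total segment cost + penalty per changepoint."""
    n = len(series)
    prefix = [0.0] * (n + 1)
    prefix_sq = [0.0] * (n + 1)
    for t, x in enumerate(series):
        prefix[t + 1] = prefix[t] + x
        prefix_sq[t + 1] = prefix_sq[t] + x * x

    best = [0.0] * (n + 1)  # best[j]: minimal penalised cost of series[:j]
    best[0] = -penalty      # so that the first segment carries no penalty
    last = [0] * (n + 1)    # last[j]: start index of the final segment in that optimum
    for j in range(1, n + 1):
        best[j], last[j] = min(
            (best[i] + segment_cost(prefix, prefix_sq, i, j) + penalty, i)
            for i in range(j)
        )

    # Walk back through the stored segment starts to recover the changepoints.
    changepoints, j = [], n
    while j > 0:
        if last[j] > 0:
            changepoints.append(last[j])
        j = last[j]
    return sorted(changepoints)


data = [0.1, -0.1, 0.0, 0.2, -0.2, 5.1, 4.9, 5.0, 5.2, 4.8]
print(optimal_partitioning(data, penalty=2.0))
print(optimal_partitioning(data, penalty=1000.0))
```

The second call illustrates the sensitivity to the penalty that the abstract mentions: a large enough penalty removes every changepoint, which is the motivation for examining a range of penalty values.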

13.

A linear mixed model for analyzing longitudinal skew-normal responses with random dropout

M. Ganjali T. Baghfalaki M. Khazaei 《Journal of the Korean Statistical Society》2013,42(2):149-160

In this paper, a linear mixed effects model is used to fit skewed longitudinal data in the presence of dropout. Two distributional assumptions are considered to produce background for heavy tailed models. One is the linear mixed model with skew-normal random effects and normal errors and the other one is the linear mixed model with skew-normal errors and normal random effects. An ECM algorithm is developed to obtain the parameter estimates. Also an empirical Bayes approach is used for estimating random effects. A simulation study is implemented to investigate the performance of the presented algorithm. Results of an application are also reported where standard errors of estimates are calculated using the Bootstrap approach.

14.

Particle filters for mixture models with an unknown number of components (Total citations: 2; self-citations: 1; citations by others: 1)

Paul Fearnhead 《Statistics and Computing》2004,14(1):11-21

We consider the analysis of data under mixture models where the number of components in the mixture is unknown. We concentrate on mixture Dirichlet process models, and in particular we consider such models under conjugate priors. This conjugacy enables us to integrate out many of the parameters in the model, and to discretize the posterior distribution. Particle filters are particularly well suited to such discrete problems, and we propose the use of the particle filter of Fearnhead and Clifford for this problem. The performance of this particle filter, when analyzing both simulated and real data from a Gaussian mixture model, is uniformly better than the particle filter algorithm of Chen and Liu. In many situations it outperforms a Gibbs Sampler. We also show how models without the required amount of conjugacy can be efficiently analyzed by the same particle filter algorithm.

15.

An ECM Estimation Approach for Analyzing Multivariate Skew-Normal Data with Dropout

T. Baghfalaki 《Communications in Statistics - Simulation and Computation》2013,42(10):1970-1988

In this article, an ECM algorithm is developed to obtain the maximum likelihood estimates of parameters where a multivariate skew-normal distribution is used for analyzing longitudinal skewed normal regression data with dropout. A simulation study is performed to investigate the performance of the presented algorithm. Also, the methodology is illustrated through two applications and the results of the proposed methodology are compared with ECM under a multivariate normal assumption using AIC and BIC criteria. Standard errors of parameter estimates are obtained from the asymptotic observed information matrix.

16.

A class of finite mixture of quantile regressions with its applications

Yuzhu Tian Maozai Tian 《Journal of applied statistics》2016,43(7):1240-1252

Mixtures of linear regression models provide a popular treatment for modeling nonlinear regression relationships. The traditional estimation of mixture of regression models is based on a Gaussian error assumption. It is well known that such an assumption is sensitive to outliers and extreme values. To overcome this issue, a new class of finite mixture of quantile regressions (FMQR) is proposed in this article. Compared with the existing Gaussian mixture regression models, the proposed FMQR model can provide a complete specification of the conditional distribution of the response variable for each component. From the likelihood point of view, the FMQR model is equivalent to the finite mixture of regression models based on errors following the asymmetric Laplace distribution (ALD), which can be regarded as an extension to the traditional mixture of regression models with normal error terms. An EM algorithm is proposed to obtain the parameter estimates of the FMQR model by combining a hierarchical representation of the ALD. Finally, the iterated weighted least square estimation for each mixture component of the FMQR model is derived. Simulation studies are conducted to illustrate the finite sample performance of the estimation procedure. Analysis of an aphid data set is used to illustrate our methodologies.

17.

Simultaneous confidence band for the difference of segmented linear models

Greg Yothers Allan R. Sampson 《Journal of statistical planning and inference》2011,141(2):1059-1068

Consider comparing between two treatments a response variable, whose expectation depends on the value of a continuous covariate in some nonlinear fashion. We fit separate segmented linear models to each treatment to approximate the nonlinear relationship. For this setting, we provide a simultaneous confidence band for the difference between treatments of the expected value functions. The treatments are said to differ significantly on intervals of the covariate where the simultaneous confidence band does not contain zero. We consider segmented linear models where the locations of the changepoints are both known and unknown. The band is obtained from asymptotic results.

18.

On-line changepoint detection and parameter estimation with application to genomic data

François Caron Arnaud Doucet Raphael Gottardo 《Statistics and Computing》2012,22(2):579-595


19.

Bayesian Analysis of a Queueing System with a Long-Tailed Arrival Process

Pepa Ramírez Rosa E. Lillo Michael P. Wiper 《Communications in Statistics - Simulation and Computation》2013,42(4):697-712

Internet traffic data is characterized by some unusual statistical properties, in particular, the presence of heavy-tailed variables. A typical model for heavy-tailed distributions is the Pareto distribution although this is not adequate in many cases. In this article, we consider a mixture of two-parameter Pareto distributions as a model for heavy-tailed data and use a Bayesian approach based on the birth-death Markov chain Monte Carlo algorithm to fit this model. We estimate some measures of interest related to the queueing system k-Par/M/1 where k-Par denotes a mixture of k Pareto distributions. Heavy-tailed variables are difficult to model in such queueing systems because of the lack of a simple expression for the Laplace Transform (LT). We use a procedure based on recent LT approximating results for the Pareto/M/1 system. We illustrate our approach with both simulated and real data.

20.

A SPARSE CONDITIONAL GAUSSIAN GRAPHICAL MODEL FOR ANALYSIS OF GENETICAL GENOMICS DATA

J Yin H Li 《The annals of applied statistics》2011,5(4):2630-2650
