Similar literature
20 similar documents found
1.
While the literature on multivariate models for continuous data flourishes, there is a lack of models for multivariate counts. We aim to contribute to this framework by extending the well-known class of univariate hidden Markov models to the multidimensional case, by introducing multivariate Poisson hidden Markov models. Each state of the extended model is associated with a different multivariate discrete distribution. We consider different distributions with Poisson marginals, starting from the multivariate Poisson distribution and then extending to copula-based distributions to allow flexible dependence structures. An EM-type algorithm is developed for maximum likelihood estimation. A real data application is presented to illustrate the usefulness of the proposed models. In particular, we apply the models to the occurrence of strong earthquakes (surface wave magnitude ≥5) in three seismogenic subregions in the broad region of the North Aegean Sea for the time period from 1 January 1981 to 31 December 2008. Earthquakes occurring in one subregion may trigger events in adjacent ones and hence the observed time series of events are cross‐correlated. It is evident from the results that the three subregions interact with each other at times differing by up to a few months. This migration of seismic activity is captured by the model as a transition to a state of higher seismicity.
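As a rough illustration of the likelihood that an EM-type algorithm for such a model would maximise, the sketch below runs a scaled forward recursion for a hidden Markov model with vector-valued counts. It assumes the simplest possible state-dependent distribution, namely independent Poisson components given the state; this is a stand-in, not the multivariate Poisson or copula-based distributions of the abstract. The function and argument names (`poisson_hmm_loglik`, `init`, `trans`, `rates`) are hypothetical.

```python
import math


def poisson_logpmf(k, lam):
    """Log of the Poisson(lam) probability mass at count k."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def poisson_hmm_loglik(obs, init, trans, rates):
    """Log-likelihood of a sequence of count vectors under a Poisson HMM.

    obs   : list of count vectors, one per time step
    init  : initial state probabilities
    trans : row-stochastic transition matrix, trans[r][s] = P(s | r)
    rates : rates[s][d] = Poisson mean of component d in state s
    ASSUMPTION: components are independent given the state.
    """
    m = len(init)
    loglik = 0.0
    alpha = None
    for t, y in enumerate(obs):
        # State-dependent probability of the whole count vector.
        emit = [
            math.exp(sum(poisson_logpmf(k, lam) for k, lam in zip(y, rates[s])))
            for s in range(m)
        ]
        if t == 0:
            alpha = [init[s] * emit[s] for s in range(m)]
        else:
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(m)) * emit[s]
                for s in range(m)
            ]
        # Rescale at each step to avoid numerical underflow.
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik
```

A call such as `poisson_hmm_loglik([(1, 2), (0, 4)], [0.5, 0.5], [[0.9, 0.1], [0.2, 0.8]], [[1.0, 2.0], [3.0, 5.0]])` evaluates a two-state, two-component model; replacing the independent-Poisson emission with a dependent one would only change the `emit` line.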

2.
Considerable work has been devoted to developing model selection criteria for normal theory regression models. Less attention has been paid to discrete data. We develop two loglinear model selection criteria for Poisson counts. These criteria are based on an estimated bias adjustment of the Akaike information criterion. We observe in a simulation study that the corrected statistics provide good model choices and relatively accurate estimates of the mean structure.
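For orientation, the uncorrected starting point can be written out. The block below gives the textbook log-likelihood and AIC for a Poisson loglinear model; it does not reproduce the two bias-adjusted criteria of the abstract, and the symbols ($y_i$, $x_i$, $\hat\beta$, $\hat\mu_i$, $k$) are notation assumed here.

```latex
\ell(\hat\beta) = \sum_{i=1}^{n} \left( y_i \log \hat\mu_i - \hat\mu_i - \log y_i! \right),
\qquad \log \hat\mu_i = x_i^{\top} \hat\beta,

\mathrm{AIC} = -2\,\ell(\hat\beta) + 2k .
```

Here $k$ is the number of estimated regression coefficients, and the model with the smallest AIC is preferred.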

3.
We derive two types of Akaike information criterion (AIC)‐like model‐selection formulae for the semiparametric pseudo‐maximum likelihood procedure. We first adapt the arguments leading to the original AIC formula, related to empirical estimation of a certain Kullback–Leibler information distance. This gives a significantly different formula compared with the AIC, which we name the copula information criterion. However, we show that such a model‐selection procedure cannot exist for copula models with densities that grow very fast near the edge of the unit cube. This problem affects most popular copula models. We then derive what we call the cross‐validation copula information criterion, which exists under weak conditions and is a first‐order approximation to exact cross validation. This formula is very similar to the standard AIC formula but has slightly different motivation. A brief illustration with real data is given.

4.
Mengya Liu, Qi Li. Statistics, 2019, 53(1): 1-25
This article studies an observation-driven model for time series of counts, which allows for overdispersion and negative serial dependence in the observations. The observations are assumed to follow a negative binomial distribution conditional on past information, in the form of a threshold model that generates a two-regime structure on the basis of the magnitude of the lagged observations. We use the weak dependence approach to establish stationarity and ergodicity, and inference for the regression parameters is obtained by quasi-likelihood. Moreover, asymptotic properties of both the quasi-maximum likelihood estimators and the threshold estimator are established. Simulation studies are presented along with two applications: the trading volume of a stock and the number of major earthquakes.

5.
In this paper, we extend the focused information criterion (FIC) to copula models. Copulas are often used for applications where the joint tail behavior of the variables is of particular interest, and selecting a copula that captures this well is then essential. Traditional model selection methods such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) aim at finding the overall best‐fitting model, which is not necessarily the one best suited for the application at hand. The FIC, on the other hand, evaluates and ranks candidate models based on the precision of their point estimates of a context‐given focus parameter. This could be any quantity of particular interest, for example, the mean, a correlation, conditional probabilities, or measures of tail dependence. We derive FIC formulae for the maximum likelihood estimator, the two‐stage maximum likelihood estimator, and the so‐called pseudo‐maximum‐likelihood (PML) estimator combined with parametric margins. Furthermore, we confirm the validity of the AIC formula for the PML estimator combined with parametric margins. To study the numerical behavior of FIC, we have carried out a simulation study, and we have also analyzed a multivariate data set pertaining to abalones. The results from the study show that the FIC successfully ranks candidate models in terms of their performance, defined as how well they estimate the focus parameter. In terms of estimation precision, FIC clearly outperforms AIC, especially when the focus parameter relates to only a specific part of the model, such as the conditional upper‐tail probability.

6.
This paper proposes a unified framework for defining and fitting stochastic, discrete‐time, discrete‐stage population dynamics models. The biological system is described by a state‐space model, where the true but unknown state of the population is modelled by a state process, and this is linked to survey data by an observation process. All sources of uncertainty in the inputs, including uncertainty about model specification, are readily incorporated. The paper shows how the state process can be represented as a generalization of the standard Leslie or Lefkovitch matrix. By dividing the state process into subprocesses, complex models can be constructed from manageable building blocks. The paper illustrates the approach with a model of the British grey seal metapopulation, using sequential importance sampling with kernel smoothing to fit the model.

7.
The Fay–Herriot model is a standard model for direct survey estimators in which the true quantity of interest, the superpopulation mean, is latent and its estimation is improved through the use of auxiliary covariates. In the context of small area estimation, these estimates can be further improved by borrowing strength across spatial regions or by considering multiple outcomes simultaneously. We provide two formulations for small area estimation with Fay–Herriot models that include both multivariate outcomes and latent spatial dependence. In the first, the outcome‐by‐space dependence structure is separable. The second accounts for the cross dependence through a generalized multivariate conditional autoregressive (GMCAR) structure. The GMCAR model is shown, in a state‐level example, to produce smaller mean squared prediction errors, relative to equivalent census variables, than the separable model and the state‐of‐the‐art multivariate model with unstructured dependence between outcomes and no spatial dependence. In addition, both the GMCAR and the separable models give smaller mean squared prediction error than the state‐of‐the‐art model when conducting small area estimation on county‐level data from the American Community Survey.

8.
Markov networks are popular models for discrete multivariate systems where the dependence structure of the variables is specified by an undirected graph. To allow for more expressive dependence structures, several generalizations of Markov networks have been proposed. Here, we consider the class of contextual Markov networks which takes into account possible context‐specific independences among pairs of variables. Structure learning of contextual Markov networks is very challenging due to the extremely large number of possible structures. One of the main challenges has been to design a score, by which a structure can be assessed in terms of model fit related to complexity, without assuming chordality. Here, we introduce the marginal pseudo‐likelihood as an analytically tractable criterion for general contextual Markov networks. Our criterion is shown to yield a consistent structure estimator. Experiments demonstrate the favourable properties of our method in terms of predictive accuracy of the inferred models.

9.
Generalized linear models with random effects and/or serial dependence are commonly used to analyze longitudinal data. However, the computation and interpretation of marginal covariate effects can be difficult. This led Heagerty (1999, 2002) to propose models for longitudinal binary data in which a logistic regression is first used to explain the average marginal response. The model is then completed by introducing a conditional regression that allows for the longitudinal, within‐subject, dependence, either via random effects or regressing on previous responses. In this paper, the authors extend the work of Heagerty to handle multivariate longitudinal binary response data using a triple of regression models that directly model the marginal mean response while taking into account dependence across time and across responses. Markov Chain Monte Carlo methods are used for inference. Data from the Iowa Youth and Families Project are used to illustrate the methods.

10.
Accelerometry is a low‐cost and noninvasive method that has been used to discriminate sleep from wake; however, its utility to detect sleep stages is unclear. We detail the development and comparison of methods which utilise raw, triaxial accelerometry data to classify varying stages of sleep, ranging from sleep/wake detection to discriminating rapid eye movement sleep, stage one sleep, stage two sleep, deep sleep and wake. First‐ and second‐order hidden Markov models (HMMs) with time‐homogeneous and time‐varying transition probability matrices were explored, with continuous acceleration observations in a Gaussian‐observation HMM and K‐means classified acceleration in a discrete‐observation HMM. In addition, generalised linear mixed models (GLMMs) with binary and multinomial responses and logit link functions were considered, as was whether incorporating adjoining acceleration information into the models improved prediction. Model predictions were compared to the reference standard in sleep detection (polysomnography) and outcome accuracies were calculated. Consistently, HMMs yielded greater sleep stage detection than GLMMs, but there was little difference between first‐ and second‐order HMMs. Varying degrees of difference were observed when comparing Gaussian‐observation HMMs to discrete‐observation HMMs, and time‐varying HMMs yielded greater discrimination than time‐homogeneous HMMs, as did models which considered adjoining acceleration information. These results suggest that wrist‐worn accelerometry data may be able to detect sleep stages but that further investigation is required to optimise classification accuracy.

11.
We propose a new class of state space models for longitudinal discrete response data where the observation equation is specified in an additive form involving both deterministic and random linear predictors. These models allow us to explicitly address the effects of trend, seasonal or other time-varying covariates while preserving the power of state space models in modeling serial dependence in the data. We develop a Markov chain Monte Carlo algorithm to carry out statistical inference for models with binary and binomial responses, in which we invoke de Jong and Shephard’s (Biometrika 82(2):339–350, 1995) simulation smoother to establish an efficient sampling procedure for the state variables. To quantify and control the sensitivity of posteriors on the priors of variance parameters, we add a signal-to-noise ratio type parameter in the specification of these priors. Finally, we illustrate the applicability of the proposed state space mixed models for longitudinal binomial response data in both simulation studies and data examples.

12.
In order to make predictions of future values of a time series, one needs to specify a forecasting model. A popular choice is an autoregressive time‐series model, for which the order of the model is chosen by an information criterion. We propose an extension of the focused information criterion (FIC) for model‐order selection, with emphasis on a high predictive accuracy (i.e. the mean squared forecast error is low). We obtain theoretical results and illustrate by means of a simulation study and some real data examples that the FIC is a valid alternative to the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) for selection of a prediction model. We also illustrate the possibility of using the FIC for purposes other than forecasting, and explore its use in an extended model.

13.
Time series modelling of childhood diseases: a dynamical systems approach (total citations: 3; self-citations: 0; citations by others: 3)
A key issue in the dynamical modelling of epidemics is the synthesis of complex mathematical models and data by means of time series analysis. We report such an approach, focusing on the particularly well-documented case of measles. We propose the use of a discrete time epidemic model comprising the infected and susceptible class as state variables. The model uses a discrete time version of the susceptible–exposed–infected–recovered type epidemic models, which can be fitted to observed disease incidence time series. We describe a method for reconstructing the dynamics of the susceptible class, which is an unobserved state variable of the dynamical system. The model provides a remarkable fit to the data on case reports of measles in England and Wales from 1944 to 1964. Moreover, its systematic part explains the well-documented predominant biennial cyclic pattern. We study the dynamic behaviour of the time series model and show that episodes of annual cyclicity, which have not previously been explained quantitatively, arise as a response to a quicker replenishment of the susceptible class during the baby boom, around 1947.

14.
We describe a new discrete probability distribution with several useful properties for the analysis and modelling of survival processes and dispersion. First, the model can be used to describe survival processes with monotonically decreasing, constant, or increasing hazard functions, simply by tuning one parameter. Also, the model can describe counts that are overdispersed (contagious) or underdispersed, since the variance can exceed, equal, or be less than the mean. All of these properties are demonstrated both theoretically and with ecological examples, using ad-hoc parameter estimation techniques. Finally, the equations are tractable compared with, say, the negative binomial, and easily incorporated into larger models.

15.
In the regression analysis of time series of event counts, it is of interest to account for serial dependence that is likely to be present among such data as well as a nonlinear interaction between the expected event counts and predictors as a function of some underlying variables. We thus develop a Poisson autoregressive varying-coefficient model, which introduces autocorrelation through a latent process and allows regression coefficients to nonparametrically vary as a function of the underlying variables. The nonparametric functions for varying regression coefficients are estimated with data-driven basis selection, thereby avoiding overfitting and adapting to curvature variation. An efficient posterior sampling scheme is devised to analyse the proposed model. The proposed methodology is illustrated using simulated data and daily homicide data in Cali, Colombia.

16.
A generalized negative binomial distribution is derived from the Markov Bernoulli sequence of successes and failures. We study the properties and applications of this distribution. The properties are illustrated by two examples of discrete time queueing systems. The distribution is then fitted to two data sets, the eruption record of Mt. Sangay, and a record of computer disk failure accesses. In the first case there is a strong serial dependence in the data and the generalized negative binomial provides a good fit, while in the second case, although there is a significant serial dependence, it is insufficient to justify the additional parameter of the distribution. We conclude by demonstrating the usefulness of the distribution in the field of statistical quality control.

17.
A new family of mixture models for the model‐based clustering of longitudinal data is introduced. The covariance structures of eight members of this new family of models are given and the associated maximum likelihood estimates for the parameters are derived via expectation–maximization (EM) algorithms. The Bayesian information criterion is used for model selection and a convergence criterion based on the Aitken acceleration is used to determine the convergence of these EM algorithms. This new family of models is applied to yeast sporulation time course data, where the models give good clustering performance. Further constraints are then imposed on the decomposition to allow a deeper investigation of the correlation structure of the yeast data. These constraints greatly extend this new family of models, with the addition of many parsimonious models. The Canadian Journal of Statistics 38:153–168; 2010 © 2010 Statistical Society of Canada

18.
The modelling and analysis of count-data time series are areas of emerging interest with various applications in practice. We consider the particular case of the binomial AR(1) model, which is well suited for describing binomial counts with a first-order autoregressive serial dependence structure. We derive explicit expressions for the joint (central) moments and cumulants up to order 4. Then, we apply these results for expressing moments and asymptotic distribution of the squared difference estimator as an alternative to the sample autocovariance. We also analyse the asymptotic distribution of the conditional least-squares estimators of the parameters of the binomial AR(1) model. The finite-sample performance of these estimators is investigated in a simulation study, and we apply them to real data about computerized workstations.
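To make the model concrete, the following sketch simulates a binomial AR(1) path using the usual binomial-thinning definition, X_t = α∘X_{t-1} + β∘(n − X_{t-1}), with β = π(1 − ρ) and α = β + ρ, so that the stationary marginal is Binomial(n, π) and the lag-one autocorrelation is ρ. The abstract does not spell out this parameterisation, so it is stated here as an assumption, and the function names are hypothetical.

```python
import random


def binomial_draw(rng, trials, prob):
    """Draw from Binomial(trials, prob) as a sum of Bernoulli trials."""
    return sum(1 for _ in range(trials) if rng.random() < prob)


def simulate_binomial_ar1(n, pi, rho, length, seed=0):
    """Simulate `length` values of a binomial AR(1) process on {0, ..., n}.

    ASSUMED parameterisation: beta = pi * (1 - rho), alpha = beta + rho.
    """
    beta = pi * (1.0 - rho)
    alpha = beta + rho
    if not (0.0 <= alpha <= 1.0 and 0.0 <= beta <= 1.0):
        raise ValueError("pi and rho must give thinning probabilities in [0, 1]")
    rng = random.Random(seed)
    x = binomial_draw(rng, n, pi)  # start in the stationary distribution
    path = [x]
    for _ in range(length - 1):
        survivors = binomial_draw(rng, x, alpha)     # units that stay "on"
        newcomers = binomial_draw(rng, n - x, beta)  # units that switch "on"
        x = survivors + newcomers
        path.append(x)
    return path
```

Under this parameterisation the conditional mean is E[X_t | X_{t-1} = x] = ρx + nπ(1 − ρ), which is the linear regression that conditional least-squares estimation exploits.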

19.
This article applies different approaches to distinguish state dependence from unobserved heterogeneity and serial correlation and, hence, test for state dependence in consumer brand choices. First, we apply a simple method proposed by Chamberlain, which involves lagged exogenous variables only. Second, we also estimate a lagged-dependent-variable specification proposed by Wooldridge. Third, we use the estimation approach suggested by Wooldridge to estimate a model with both lagged dependent and exogenous variables to distinguish between the two different sources of choice dynamics, state dependence and lagged effects of the exogenous variables. Our analysis reveals that the best approach is to use models with both lagged dependent and exogenous variables. Our findings include strong evidence for state dependence in five out of the six product categories studied in this article.

20.
Two types of state-switching models for U.S. real output have been proposed: models that switch randomly between states and models that switch states deterministically, as in the threshold autoregressive model of Potter. These models have been justified primarily on how well they fit the sample data, yielding statistically significant estimates of the model coefficients. Here we propose a new approach to the evaluation of an estimated nonlinear time series model that provides a complement to existing methods based on in-sample fit or on out-of-sample forecasting. In this new approach, a battery of distinct nonlinearity tests is applied to the sample data, resulting in a set of p-values for rejecting the null hypothesis of a linear generating mechanism. This set of p-values is taken to be a “stylized fact” characterizing the nonlinear serial dependence in the generating mechanism of the time series. The effectiveness of an estimated nonlinear model for this time series is then evaluated in terms of the congruence between this stylized fact and a set of nonlinearity test results obtained from data simulated using the estimated model. In particular, we derive a portmanteau statistic based on this set of nonlinearity test p-values that allows us to test the proposition that a given model adequately captures the nonlinear serial dependence in the sample data. We apply the method to several estimated state-switching models of U.S. real output.
