期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Non‐stationary Cross‐Covariance Models for Multivariate Processes on a Globe

MIKYOUNG JUN 《Scandinavian Journal of Statistics》2011,38(4):726-747

Abstract. In geophysical and environmental problems, it is common to have multiple variables of interest measured at the same location and time. These multiple variables typically have dependence over space (and/or time). As a consequence, there is a growing interest in developing models for multivariate spatial processes, in particular, the cross‐covariance models. On the other hand, many data sets these days cover a large portion of the Earth such as satellite data, which require valid covariance models on a globe. We present a class of parametric covariance models for multivariate processes on a globe. The covariance models are flexible in capturing non‐stationarity in the data yet computationally feasible and require moderate numbers of parameters. We apply our covariance model to surface temperature and precipitation data from an NCAR climate model output. We compare our model to the multivariate version of the Matérn cross‐covariance function and models based on coregionalization and demonstrate the superior performance of our model in terms of AIC (and/or maximum loglikelihood values) and predictive skill. We also present some challenges in modelling the cross‐covariance structure of the temperature and precipitation data. Based on the fitted results using full data, we give the estimated cross‐correlation structure between the two variables. 相似文献

2.

Finite mixtures of multivariate Poisson distributions with application

Dimitris Karlis Loukia Meligkotsidou 《Journal of statistical planning and inference》2007

In the present paper we examine finite mixtures of multivariate Poisson distributions as an alternative class of models for multivariate count data. The proposed models allow for both overdispersion in the marginal distributions and negative correlation, while they are computationally tractable using standard ideas from finite mixture modelling. An EM type algorithm for maximum likelihood (ML) estimation of the parameters is developed. The identifiability of this class of mixtures is proved. Properties of ML estimators are derived. A real data application concerning model based clustering for multivariate count data related to different types of crime is presented to illustrate the practical potential of the proposed class of models. 相似文献

3.

Regression for doubly inflated multivariate Poisson distributions

Ishapathik Das Sumen Sen Pooja Sengupta 《Journal of Statistical Computation and Simulation》2019,89(13):2549-2561

Dependent multivariate count data occur in several research studies. These data can be modelled by a multivariate Poisson or Negative binomial distribution constructed using copulas. However, when some of the counts are inflated, that is, the number of observations in some cells are much larger than other cells, then the copula-based multivariate Poisson (or Negative binomial) distribution may not fit well and it is not an appropriate statistical model for the data. There is a need to modify or adjust the multivariate distribution to account for the inflated frequencies. In this article, we consider the situation where the frequencies of two cells are higher compared to the other cells and develop a doubly inflated multivariate Poisson distribution function using multivariate Gaussian copula. We also discuss procedures for regression on covariates for the doubly inflated multivariate count data. For illustrating the proposed methodologies, we present real data containing bivariate count observations with inflations in two cells. Several models and linear predictors with log link functions are considered, and we discuss maximum likelihood estimation to estimate unknown parameters of the models. 相似文献

4.

Multivariate Poisson hidden Markov models with a case study of modelling seismicity

下载免费PDF全文

K. Orfanogiannaki D. Karlis 《Australian & New Zealand Journal of Statistics》2018,60(3):301-322

While the literature on multivariate models for continuous data flourishes, there is a lack of models for multivariate counts. We aim to contribute to this framework by extending the well known class of univariate hidden Markov models to the multidimensional case, by introducing multivariate Poisson hidden Markov models. Each state of the extended model is associated with a different multivariate discrete distribution. We consider different distributions with Poisson marginals, starting from the multivariate Poisson distribution and then extending to copula based distributions to allow flexible dependence structures. An EM type algorithm is developed for maximum likelihood estimation. A real data application is presented to illustrate the usefulness of the proposed models. In particular, we apply the models to the occurrence of strong earthquakes (surface wave magnitude ≥5), in three seismogenic subregions in the broad region of the North Aegean Sea for the time period from 1 January 1981 to 31 December 2008. Earthquakes occurring in one subregion may trigger events in adjacent ones and hence the observed time series of events are cross‐correlated. It is evident from the results that the three subregions interact with each other at times differing by up to a few months. This migration of seismic activity is captured by the model as a transition to a state of higher seismicity. 相似文献

5.

A bivariate Sarmanov regression model for count data with generalised Poisson marginals

Vera Hofer Johannes Leitner 《Journal of applied statistics》2012,39(12):2599-2617

We present a bivariate regression model for count data that allows for positive as well as negative correlation of the response variables. The covariance structure is based on the Sarmanov distribution and consists of a product of generalised Poisson marginals and a factor that depends on particular functions of the response variables. The closed form of the probability function is derived by means of the moment-generating function. The model is applied to a large real dataset on health care demand. Its performance is compared with alternative models presented in the literature. We find that our model is significantly better than or at least equivalent to the benchmark models. It gives insights into influences on the variance of the response variables. 相似文献

6.

Bivariate Negative Binomial Generalized Linear Models for Environmental Count Data

Masakazu Iwasaki Hiroe Tsubaki 《Journal of applied statistics》2006,33(9):909-923

We propose a new bivariate negative binomial model with constant correlation structure, which was derived from a contagious bivariate distribution of two independent Poisson mass functions, by mixing the proposed bivariate gamma type density with constantly correlated covariance structure (Iwasaki & Tsubaki, 2005), which satisfies the integrability condition of McCullagh & Nelder (1989, p. 334). The proposed bivariate gamma type density comes from a natural exponential family. Joe (1997) points out the necessity of a multivariate gamma distribution to derive a multivariate distribution with negative binomial margins, and the luck of a convenient form of multivariate gamma distribution to get a model with greater flexibility in a dependent structure with indices of dispersion. In this paper we first derive a new bivariate negative binomial distribution as well as the first two cumulants, and, secondly, formulate bivariate generalized linear models with a constantly correlated negative binomial covariance structure in addition to the moment estimator of the components of the matrix. We finally fit the bivariate negative binomial models to two correlated environmental data sets. 相似文献

7.

Efficient Estimation in Marginal Partially Linear Models for Longitudinal/Clustered Data Using Splines

JIANHUA Z. HUANG LIANGYUE ZHANG LAN ZHOU 《Scandinavian Journal of Statistics》2007,34(3):451-477

Abstract. We consider marginal semiparametric partially linear models for longitudinal/clustered data and propose an estimation procedure based on a spline approximation of the non-parametric part of the model and an extension of the parametric marginal generalized estimating equations (GEE). Our estimates of both parametric part and non-parametric part of the model have properties parallel to those of parametric GEE, that is, the estimates are efficient if the covariance structure is correctly specified and they are still consistent and asymptotically normal even if the covariance structure is misspecified. By showing that our estimate achieves the semiparametric information bound, we actually establish the efficiency of estimating the parametric part of the model in a stronger sense than what is typically considered for GEE. The semiparametric efficiency of our estimate is obtained by assuming only conditional moment restrictions instead of the strict multivariate Gaussian error assumption. 相似文献

8.

A Multivariate Generalized Poisson Regression Model

Felix Famoye 《统计学通讯:理论与方法》2013,42(3):497-511

A multivariate generalized Poisson regression model based on the multivariate generalized Poisson distribution is defined and studied. The regression model can be used to describe a count data with any type of dispersion. The model allows for both positive and negative correlation between any pair of the response variables. The parameters of the regression model are estimated by using the maximum likelihood method. Some test statistics are discussed, and two numerical data sets are used to illustrate the applications of the multivariate count data regression model. 相似文献

9.

Bayesian multivariate Poisson mixtures with an unknown number of components

Loukia Meligkotsidou 《Statistics and Computing》2007,17(2):93-107

In this paper we present Bayesian analysis of finite mixtures of multivariate Poisson distributions with an unknown number of components. The multivariate Poisson distribution can be regarded as the discrete counterpart of the multivariate normal distribution, which is suitable for modelling multivariate count data. Mixtures of multivariate Poisson distributions allow for overdispersion and for negative correlations between variables. To perform Bayesian analysis of these models we adopt a reversible jump Markov chain Monte Carlo (MCMC) algorithm with birth and death moves for updating the number of components. We present results obtained from applying our modelling approach to simulated and real data. Furthermore, we apply our approach to a problem in multivariate disease mapping, namely joint modelling of diseases with correlated counts. 相似文献

10.

An EM algorithm for multivariate Poisson distribution and related models 总被引：2，自引：0，他引：2

Dimitris Karlis 《Journal of applied statistics》2003,30(1):63-77

Multivariate extensions of the Poisson distribution are plausible models for multivariate discrete data. The lack of estimation and inferential procedures reduces the applicability of such models. In this paper, an EM algorithm for Maximum Likelihood estimation of the parameters of the Multivariate Poisson distribution is described. The algorithm is based on the multivariate reduction technique that generates the Multivariate Poisson distribution. Illustrative examples are also provided. Extension to other models, generated via multivariate reduction, is discussed. 相似文献

11.

Modelling interval data with Normal and Skew-Normal distributions

Paula Brito A. Pedro Duarte Silva 《Journal of applied statistics》2012,39(1):3-20

A parametric modelling for interval data is proposed, assuming a multivariate Normal or Skew-Normal distribution for the midpoints and log-ranges of the interval variables. The intrinsic nature of the interval variables leads to special structures of the variance–covariance matrix, which is represented by five different possible configurations. Maximum likelihood estimation for both models under all considered configurations is studied. The proposed modelling is then considered in the context of analysis of variance and multivariate analysis of variance testing. To access the behaviour of the proposed methodology, a simulation study is performed. The results show that, for medium or large sample sizes, tests have good power and their true significance level approaches nominal levels when the constraints assumed for the model are respected; however, for small samples, sizes close to nominal levels cannot be guaranteed. Applications to Chinese meteorological data in three different regions and to credit card usage variables for different card designations, illustrate the proposed methodology. 相似文献

12.

Robust estimation of mean and covariance for longitudinal data with dropouts

Guoyou Qin 《Journal of applied statistics》2015,42(6):1240-1254

In this paper, we study estimation of linear models in the framework of longitudinal data with dropouts. Under the assumptions that random errors follow an elliptical distribution and all the subjects share the same within-subject covariance matrix which does not depend on covariates, we develop a robust method for simultaneous estimation of mean and covariance. The proposed method is robust against outliers, and does not require to model the covariance and missing data process. Theoretical properties of the proposed estimator are established and simulation studies show its good performance. In the end, the proposed method is applied to a real data analysis for illustration. 相似文献

13.

Robust multivariate mixture regression models with incomplete data

Hwa Kyung Lim Naveen N. Narisetty 《Journal of Statistical Computation and Simulation》2017,87(2):328-347

Multivariate mixture regression models can be used to investigate the relationships between two or more response variables and a set of predictor variables by taking into consideration unobserved population heterogeneity. It is common to take multivariate normal distributions as mixing components, but this mixing model is sensitive to heavy-tailed errors and outliers. Although normal mixture models can approximate any distribution in principle, the number of components needed to account for heavy-tailed distributions can be very large. Mixture regression models based on the multivariate t distributions can be considered as a robust alternative approach. Missing data are inevitable in many situations and parameter estimates could be biased if the missing values are not handled properly. In this paper, we propose a multivariate t mixture regression model with missing information to model heterogeneity in regression function in the presence of outliers and missing values. Along with the robust parameter estimation, our proposed method can be used for (i) visualization of the partial correlation between response variables across latent classes and heterogeneous regressions, and (ii) outlier detection and robust clustering even under the presence of missing values. We also propose a multivariate t mixture regression model using MM-estimation with missing information that is robust to high-leverage outliers. The proposed methodologies are illustrated through simulation studies and real data analysis. 相似文献

14.

Separability tests for high-dimensional,low-sample size multivariate repeated measures data

Sean L. Simpson Lloyd J. Edwards Martin A. Styner Keith E. Muller 《Journal of applied statistics》2014,41(11):2450-2461

Longitudinal imaging studies have moved to the forefront of medical research due to their ability to characterize spatio-temporal features of biological structures across the lifespan. Valid inference in longitudinal imaging requires enough flexibility of the covariance model to allow reasonable fidelity to the true pattern. On the other hand, the existence of computable estimates demands a parsimonious parameterization of the covariance structure. Separable (Kronecker product) covariance models provide one such parameterization in which the spatial and temporal covariances are modeled separately. However, evaluating the validity of this parameterization in high dimensions remains a challenge. Here we provide a scientifically informed approach to assessing the adequacy of separable (Kronecker product) covariance models when the number of observations is large relative to the number of independent sampling units (sample size). We address both the general case, in which unstructured matrices are considered for each covariance model, and the structured case, which assumes a particular structure for each model. For the structured case, we focus on the situation where the within-subject correlation is believed to decrease exponentially in time and space as is common in longitudinal imaging studies. However, the provided framework equally applies to all covariance patterns used within the more general multivariate repeated measures context. Our approach provides useful guidance for high dimension, low-sample size data that preclude using standard likelihood-based tests. Longitudinal medical imaging data of caudate morphology in schizophrenia illustrate the approaches appeal. 相似文献

15.

Bayesian inference for Poisson and multinomial log-linear models

Jonathan J. Forster 《Statistical Methodology》2010,7(3):210-224

Categorical data frequently arise in applications in the Social Sciences. In such applications, the class of log-linear models, based on either a Poisson or (product) multinomial response distribution, is a flexible model class for inference and prediction. In this paper we consider the Bayesian analysis of both Poisson and multinomial log-linear models. It is often convenient to model multinomial or product multinomial data as observations of independent Poisson variables. For multinomial data, Lindley (1964) [20] showed that this approach leads to valid Bayesian posterior inferences when the prior density for the Poisson cell means factorises in a particular way. We develop this result to provide a general framework for the analysis of multinomial or product multinomial data using a Poisson log-linear model. Valid finite population inferences are also available, which can be particularly important in modelling social data. We then focus particular attention on multivariate normal prior distributions for the log-linear model parameters. Here, an improper prior distribution for certain Poisson model parameters is required for valid multinomial analysis, and we derive conditions under which the resulting posterior distribution is proper. We also consider the construction of prior distributions across models, and for model parameters, when uncertainty exists about the appropriate form of the model. We present classes of Poisson and multinomial models, invariant under certain natural groups of permutations of the cells. We demonstrate that, if prior belief concerning the model parameters is also invariant, as is the case in a ‘reference’ analysis, then the choice of prior distribution is considerably restricted. The analysis of multivariate categorical data in the form of a contingency table is considered in detail. We illustrate the methods with two examples. 相似文献

16.

Adaptive robust estimation in joint mean–covariance regression model for bivariate longitudinal data

Jing Lv Tingting Li Yuanyuan Hao Xiaolin Pan 《Statistics》2018,52(1):64-83

The estimation of the covariance matrix is important in the analysis of bivariate longitudinal data. A good estimator for the covariance matrix can improve the efficiency of the estimators of the mean regression coefficients. Furthermore, the covariance estimation itself is also of interest, but it is a challenging job to model the covariance matrix of bivariate longitudinal data due to the complex structure and positive definite constraint. In addition, most of existing approaches are based on the maximum likelihood, which is very sensitive to outliers or heavy-tail error distributions. In this article, an adaptive robust estimation method is proposed for bivariate longitudinal data. Unlike the existing likelihood-based methods, the proposed method can adapt to different error distributions. Specifically, at first, we utilize the modified Cholesky block decomposition to parameterize the covariance matrices. Secondly, we apply the bounded Huber's score function to develop a set of robust generalized estimating equations to estimate the parameters both in the mean and the covariance models simultaneously. A data-driven approach is presented to select the parameter c in the Huber's score function, which can ensure that the proposed method is robust and efficient. A simulation study and a real data analysis are conducted to illustrate the robustness and efficiency of the proposed approach. 相似文献

17.

Financial data modeling by Poisson mixture regression

S. Faria F. Gonçalves 《Journal of applied statistics》2013,40(10):2150-2162

In many financial applications, Poisson mixture regression models are commonly used to analyze heterogeneous count data. When fitting these models, the observed counts are supposed to come from two or more subpopulations and parameter estimation is typically performed by means of maximum likelihood via the Expectation–Maximization algorithm. In this study, we discuss briefly the procedure for fitting Poisson mixture regression models by means of maximum likelihood, the model selection and goodness-of-fit tests. These models are applied to a real data set for credit-scoring purposes. We aim to reveal the impact of demographic and financial variables in creating different groups of clients and to predict the group to which each client belongs, as well as his expected number of defaulted payments. The model's conclusions are very interesting, revealing that the population consists of three groups, contrasting with the traditional good versus bad categorization approach of the credit-scoring systems. 相似文献

18.

Modeling a mixture of ordinal and continuous repeated measures

《Journal of Statistical Computation and Simulation》2012,82(10):873-886

We study the correlation structure for a mixture of ordinal and continuous repeated measures using a Bayesian approach. We assume a multivariate probit model for the ordinal variables and a normal linear regression for the continuous variables, where latent normal variables underlying the ordinal data are correlated with continuous variables in the model. Due to the probit model assumption, we are required to sample a covariance matrix with some of the diagonal elements equal to one. The key computational idea is to use parameter-extended data augmentation, which involves applying the Metropolis-Hastings algorithm to get a sample from the posterior distribution of the covariance matrix incorporating the relevant restrictions. The methodology is illustrated through a simulated example and through an application to data from the UCLA Brain Injury Research Center. 相似文献

19.

A calibrated imputation method for secondary data analysis of survey data

Damio N. Da Silva Li‐Chun Zhang 《Scandinavian Journal of Statistics》2021,48(1):25-41

In practical survey sampling, missing data are unavoidable due to nonresponse, rejected observations by editing, disclosure control, or outlier suppression. We propose a calibrated imputation approach so that valid point and variance estimates of the population (or domain) totals can be computed by the secondary users using simple complete‐sample formulae. This is especially helpful for variance estimation, which generally require additional information and tools that are unavailable to the secondary users. Our approach is natural for continuous variables, where the estimation may be either based on reweighting or imputation, including possibly their outlier‐robust extensions. We also propose a multivariate procedure to accommodate the estimation of the covariance matrix between estimated population totals, which facilitates variance estimation of the ratios or differences among the estimated totals. We illustrate the proposed approach using simulation data in supplementary materials that are available online. 相似文献

20.

Covariance tapering for multivariate Gaussian random fields estimation 总被引：2，自引：0，他引：2

M. Bevilacqua A. Fassò C. Gaetan E. Porcu D. Velandia 《Statistical Methods and Applications》2016,25(1):21-37

In recent literature there has been a growing interest in the construction of covariance models for multivariate Gaussian random fields. However, effective estimation methods for these models are somehow unexplored. The maximum likelihood method has attractive features, but when we deal with large data sets this solution becomes impractical, so computationally efficient solutions have to be devised. In this paper we explore the use of the covariance tapering method for the estimation of multivariate covariance models. In particular, through a simulation study, we compare the use of simple separable tapers with more flexible multivariate tapers recently proposed in the literature and we discuss the asymptotic properties of the method under increasing domain asymptotics. 相似文献