1.

A latent variable regression model for asymmetric bivariate ordered categorical data (total citations: 1; self-citations: 1; citations by others: 0)

Farid Zayeri Anoshirvan Kazemnejad 《Journal of applied statistics》2006,33(7):743-753

In many areas of medical research, especially in studies that involve paired organs, a bivariate ordered categorical response should be analyzed. Using a bivariate continuous distribution as the latent variable is an interesting strategy for analyzing these data sets. In this context, the bivariate standard normal distribution, which leads to the bivariate cumulative probit regression model, is the most common choice. In this paper, we introduce another latent variable regression model for modeling bivariate ordered categorical responses. This model may be an appropriate alternative to the bivariate cumulative probit regression model when postulating a symmetric form for the marginal or joint distribution of the response data does not appear to be a valid assumption. We also develop the necessary numerical procedure to obtain the maximum likelihood estimates of the model parameters. To illustrate the proposed model, we analyze data from an epidemiologic study to identify some of the most important risk indicators of periodontal disease among students aged 15-19 years in Tehran, Iran.
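To make the latent-variable idea concrete, here is a minimal sketch of one margin of the cumulative probit model mentioned in the abstract (the common symmetric choice, not the asymmetric model the paper proposes). It assumes a latent variable equal to a linear predictor plus standard normal noise, with the ordinal category determined by which pair of cutpoints the latent value falls between. The function names and the example values are hypothetical.

```python
import math


def std_normal_cdf(x):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ordinal_probit_probs(linear_predictor, cutpoints):
    """Category probabilities for a single ordinal margin.

    Assumes Z = linear_predictor + e with e ~ N(0, 1), and Y = j when
    c_{j-1} < Z <= c_j, where the cutpoints are sorted in increasing order.
    """
    bounds = [-math.inf] + list(cutpoints) + [math.inf]
    cdf = [std_normal_cdf(b - linear_predictor) for b in bounds]
    # P(Y = j) is the difference of the CDF at consecutive cutpoints.
    return [cdf[j + 1] - cdf[j] for j in range(len(cdf) - 1)]


# Illustrative values only: four ordered categories, three cutpoints.
probs = ordinal_probit_probs(0.3, [-1.0, 0.0, 1.0])
print(probs)
```

Replacing `std_normal_cdf` with the CDF of a skewed distribution is one way to picture what an asymmetric alternative would change in this single-margin setting.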

2.

A mixture of experts latent position cluster model for social network data

Isobel Claire Gormley Thomas Brendan Murphy 《Statistical Methodology》2010,7(3):385-405

Social network data represent the interactions between a group of social actors. Interactions between colleagues and friendship networks are typical examples of such data. The latent space model for social network data locates each actor in a network in a latent (social) space and models the probability of an interaction between two actors as a function of their locations. The latent position cluster model extends the latent space model to deal with network data in which clusters of actors exist: actor locations are drawn from a finite mixture model, each component of which represents a cluster of actors. A mixture of experts model builds on the structure of a mixture model by taking account of both observations and associated covariates when modeling a heterogeneous population. Herein, a mixture of experts extension of the latent position cluster model is developed. The mixture of experts framework allows covariates to enter the latent position cluster model in a number of ways, yielding different model interpretations. Estimates of the model parameters are derived in a Bayesian framework using a Markov Chain Monte Carlo algorithm. The algorithm is generally computationally expensive, so surrogate proposal distributions which shadow the target distributions are derived, reducing the computational burden. The methodology is demonstrated through an illustrative example detailing relationships between a group of lawyers in the USA.
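The abstract says only that the interaction probability is "a function of" the actors' latent locations. A minimal sketch of one common specification follows, assuming (this is an assumption, not something stated above) that the log-odds of a tie equal an intercept minus the Euclidean distance between the two positions. The function name and the coordinates are hypothetical.

```python
import math


def tie_probability(z_i, z_j, intercept):
    """Probability of an interaction between two actors.

    Assumed form: logit P(tie) = intercept - ||z_i - z_j||, so actors that
    are close in the latent space are more likely to interact.
    """
    distance = math.dist(z_i, z_j)
    return 1.0 / (1.0 + math.exp(-(intercept - distance)))


# Illustrative positions in a two-dimensional latent space.
near = tie_probability((0.0, 0.0), (0.5, 0.0), intercept=1.0)
far = tie_probability((0.0, 0.0), (3.0, 4.0), intercept=1.0)
print(near, far)
```

In the latent position cluster model, the positions themselves would be drawn from a finite mixture, so actors in the same cluster tend to lie close together and to have higher tie probabilities.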

3.

Product partition latent variable model for multiple change-point detection in multivariate data

Gift Nyamundanda Avril Hegarty Kevin Hayes 《Journal of applied statistics》2015,42(11):2321-2334

The product partition model (PPM) is a well-established efficient statistical method for detecting multiple change points in time-evolving univariate data. In this article, we refine the PPM for the purpose of detecting multiple change points in correlated multivariate time-evolving data. Our model detects distributional changes in both the mean and covariance structures of multivariate Gaussian data by exploiting a smaller dimensional representation of correlated multiple time series. The utility of the proposed method is demonstrated through experiments on simulated and real datasets.

4.

Bayesian latent variable models for clustered mixed outcomes

D. B. Dunson 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2000,62(2):355-366

A general framework is proposed for modelling clustered mixed outcomes. A mixture of generalized linear models is used to describe the joint distribution of a set of underlying variables, and an arbitrary function relates the underlying variables to the observed outcomes. The model accommodates multilevel data structures, general covariate effects and distinct link functions and error distributions for each underlying variable. Within the framework proposed, novel models are developed for clustered multiple binary, unordered categorical and joint discrete and continuous outcomes. A Markov chain Monte Carlo sampling algorithm is described for estimating the posterior distributions of the parameters and latent variables. Because of the flexibility of the modelling framework and estimation procedure, extensions to ordered categorical outcomes and more complex data structures are straightforward. The methods are illustrated by using data from a reproductive toxicity study.

5.

Bayesian latent variable model for mixed continuous and ordinal responses with possibility of missing responses

E. Bahrami Samani M. Ganjali 《Journal of applied statistics》2011,38(6):1103-1116

A general framework is proposed for joint modelling of mixed correlated ordinal and continuous responses with missing values for responses, where the missing mechanism for both kinds of responses is also considered. Considering the posterior distribution of unknowns given all available information, a Markov Chain Monte Carlo sampling algorithm via WinBUGS is used for estimating the posterior distribution of the parameters. For sensitivity analysis to investigate the perturbation from missing at random to not missing at random, it is shown how one can use some elements of the covariance structure. These elements associate responses and their missing mechanisms. The influence of small perturbations of these elements on posterior displacement and posterior estimates is also studied. The model is illustrated using data from a foreign language achievement study.

6.

A latent variable model for analyzing mixed longitudinal (k,l)-inflated count and ordinal responses

F. Razie M. Ganjali 《Journal of applied statistics》2016,43(12):2203-2224

A random effects model for analyzing mixed longitudinal count and ordinal data is presented where the count response is inflated at two points (k and l) and a (k,l)-Inflated Power series distribution is used as its distribution. A full likelihood-based approach is used to obtain maximum likelihood estimates of the parameters of the model. For data with non-ignorable missing values, models with a probit model for the missing mechanism are used. The dependence between longitudinal sequences of responses and inflation parameters is investigated using a random effects approach. Also, to investigate the correlation between the mixed ordinal and count responses of each individual at each time, a shared random effect is used. In order to assess the performance of the model, a simulation study is performed for the case in which the count response has a (k,l)-Inflated Binomial distribution. Performance comparisons of the count-ordinal random effects model, the Zero-Inflated ordinal random effects model and the (k,l)-Inflated ordinal random effects model are also given. The model is applied to a real social data set from the first two waves of the National Longitudinal Study of Adolescent to Adult Health (Add Health study). In this data set, the joint responses are the number of days in a month that each individual smoked as the count response and the general health condition of each individual as the ordinal response. For the count response there is an excess of the values 0 and 30.
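As a rough illustration of inflation at two points, the sketch below writes a (k,l)-inflated binomial probability mass function as a mixture of two point masses and an ordinary binomial. This parametrization is the usual way inflated distributions are built and is an assumption here, not a formula taken from the paper. The names `pi_k` and `pi_l` and the numeric values are hypothetical, with k = 0 and l = 30 chosen to mirror the smoking-days example.

```python
import math


def binomial_pmf(y, n, p):
    """Ordinary binomial probability of y successes in n trials."""
    return math.comb(n, y) * p**y * (1.0 - p) ** (n - y)


def kl_inflated_binomial_pmf(y, n, p, k, l, pi_k, pi_l):
    """Assumed mixture form of a (k,l)-inflated binomial pmf.

    With probability pi_k the count is exactly k, with probability pi_l it
    is exactly l, and otherwise it follows Binomial(n, p).
    """
    prob = (1.0 - pi_k - pi_l) * binomial_pmf(y, n, p)
    if y == k:
        prob += pi_k
    if y == l:
        prob += pi_l
    return prob


# Illustrative values: days smoked in a 30-day month, inflated at 0 and 30.
pmf = [kl_inflated_binomial_pmf(y, 30, 0.2, 0, 30, 0.4, 0.1) for y in range(31)]
print(pmf[0], pmf[30])
```

The extra mass at 0 and 30 is what a plain binomial or a zero-inflated model alone could not reproduce.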

7.

Latent mixture modeling for clustered data

Sugasawa Shonosuke Kobayashi Genya Kawakubo Yuki 《Statistics and Computing》2019,29(3):537-548

This article proposes a mixture modeling approach to estimating cluster-wise conditional distributions in clustered (grouped) data. We adapt the mixture-of-experts model...

8.

A mixture model for wind shear data

G. K. Kanji 《Journal of applied statistics》1985,12(1):49-58

Wind shear is known to be an important factor affecting the safety of aircraft during the take-off and landing periods. Records of its measurement obtained from a civil airline are presented and discussed. A mixture model with a variable proportionality constant is used to describe these data. The method of minimum chi-squared is used to estimate the mixture proportionality constant, and the suitability of the model is considered.

9.

Sensitivity analysis for the identifiability with application to latent random effect model for the mixed data

E. Bahrami Samani 《Journal of applied statistics》2014,41(12):2761-2776

In this paper, we study the identifiability of a latent random effect model for mixed correlated continuous and ordinal longitudinal responses. We derive conditions for the identifiability of the covariance parameters of the responses. We also propose a sensitivity analysis to investigate the perturbation from the non-identifiability of the covariance parameters, and show how one can use some elements of the covariance structure for this purpose. These elements associate conditions for identifiability of the covariance parameters of the responses. The influence of small perturbations of these elements on the maximal normal curvature is also studied. The model is illustrated using medical data.

10.

Diagonal latent block model for binary data

Charlotte Laclau Mohamed Nadif 《Statistics and Computing》2017,27(5):1145-1163

This paper addresses the problem of co-clustering binary data in the latent block model framework with diagonal constraints for resulting data partitions. We consider the Bernoulli generative mixture model and present three new methods differing in the assumptions made about the degree of homogeneity of diagonal blocks. The proposed models are parsimonious and allow the structure of a data matrix to be taken into account when reorganizing it into homogeneous diagonal blocks. We derive algorithms for each of the presented models based on the classification expectation-maximization algorithm, which maximizes the complete data likelihood. We show that our contribution can outperform other state-of-the-art (co)-clustering methods on synthetic sparse and non-sparse data. We also demonstrate the efficiency of our approach in the context of document clustering, by using real-world benchmark data sets.

11.

Finite normal mixture copulas for multivariate discrete data modeling

Aristidis K. Nikoloulopoulos Dimitris Karlis 《Journal of statistical planning and inference》2009,139(11):203

A new family of copulas is introduced that provides flexible dependence structure while being tractable and simple to use for multivariate discrete data modeling. The construction exploits finite mixtures of uncorrelated normal distributions. Accordingly, the cumulative distribution function is simply the product of univariate normal distributions. At the same time, however, the mixing operation introduces association. The properties of the new family of copulas are examined and a concrete application is used to show its applicability.
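The claim that mixing uncorrelated normals still introduces association can be checked numerically. The sketch below evaluates the joint CDF of a two-component mixture in which each component has independent unit-variance normal coordinates, and compares it with the product of the mixture's marginal CDFs. It illustrates the underlying mixture distribution only, not the copula construction itself, and the weights, means, and function names are hypothetical.

```python
import math


def normal_cdf(x, mean=0.0, sd=1.0):
    """Univariate normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def mixture_joint_cdf(x, weights, means):
    """Joint CDF of a finite mixture of uncorrelated unit-variance normals.

    Within each component the joint CDF is a product of univariate normal
    CDFs; the mixture is the weighted sum of those products.
    """
    total = 0.0
    for w, mu in zip(weights, means):
        product = 1.0
        for xi, mi in zip(x, mu):
            product *= normal_cdf(xi, mi)
        total += w * product
    return total


def mixture_marginal_cdf(xi, weights, means, dim):
    """Marginal CDF of coordinate `dim` under the same mixture."""
    return sum(w * normal_cdf(xi, mu[dim]) for w, mu in zip(weights, means))


weights = [0.5, 0.5]
means = [(-1.0, -1.0), (1.0, 1.0)]
joint = mixture_joint_cdf((0.0, 0.0), weights, means)
independent = (mixture_marginal_cdf(0.0, weights, means, 0)
               * mixture_marginal_cdf(0.0, weights, means, 1))
print(joint, independent)  # joint exceeds the product of the margins
```

Because the joint value exceeds the product of the margins at this point, the two coordinates are positively associated even though every component is uncorrelated.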

12.

Asymptotic test of mixture model and its applications to QTL interval mapping

Dong-Yun Kim Yuehua Cui Ou Zhao 《Journal of statistical planning and inference》2013

Quantitative trait loci (QTL) mapping has been a standard means of identifying genetic regions harboring potential genes underlying complex traits. The likelihood ratio test (LRT) has been commonly applied to assess the significance of a genetic locus in a mixture model context. Given the time constraint in commonly used permutation tests to assess the significance of the LRT in QTL mapping, we study the behavior of the LRT statistic in a mixture model when the proportions of the distributions are unknown. We found that the asymptotic null distribution is a stationary Gaussian process after suitable transformation. The result can be applied to the one-parameter exponential family mixture model. Under certain conditions, such as in a backcross mapping model, the tail probability of the supremum of the process is calculated and the threshold values can be determined by solving the distribution function. Simulation studies were performed to evaluate the asymptotic results.

13.

A new regression model for bimodal data and applications in agriculture

Julio Cezar Souza Vasconcelos Gauss Moutinho Cordeiro Edwin Moises Marcos Ortega Édila Maria de Rezende 《Journal of applied statistics》2021,48(2):349

We define the odd log-logistic exponential Gaussian regression with two systematic components, which extends the heteroscedastic Gaussian regression and is suitable for the bimodal data that are quite common in agriculture. We estimate the parameters by the method of maximum likelihood. Some simulations indicate that the maximum-likelihood estimators are accurate. The model assumptions are checked through case deletion and quantile residuals. The usefulness of the new regression model is illustrated by means of three real data sets in different areas of agriculture, where the data present bimodality.

14.

General location multivariate latent variable models for mixed correlated bounded continuous, ordinal, and nominal responses with non-ignorable missing data

Elham Tabrizi Ehsan Bahrami Samani Mojtaba Ganjali 《Journal of applied statistics》2021,48(5):765

Using a multivariate latent variable approach, this article proposes some new general models to analyze correlated bounded continuous and categorical (nominal and/or ordinal) responses with and without non-ignorable missing values. First, we discuss regression methods for jointly analyzing continuous, nominal, and ordinal responses, motivated by data from studies of toxicity development. Second, using the beta and Dirichlet distributions, we extend the models so that some bounded continuous responses are substituted for continuous responses. The joint distribution of the bounded continuous, nominal and ordinal variables is decomposed into a marginal multinomial distribution for the nominal variable and a conditional multivariate joint distribution for the bounded continuous and ordinal variables given the nominal variable. We estimate the regression parameters under the new general location models using the maximum-likelihood method. Sensitivity analysis is also performed to study the influence of small perturbations of the parameters of the missing mechanisms of the model on the maximal normal curvature. The proposed models are applied to two data sets: BMI, Steatosis and Osteoporosis data and Tehran household expenditure budgets.

15.

A bivariate mixture of negative binomial distributions and its applications

Deepak Singh 《Communications in Statistics - Theory and Methods》2020,49(17):4162-4177

We construct a new bivariate mixture of negative binomial distributions which represents over-dispersed data more efficiently. This is an extension of a univariate mixture of beta and negative binomial distributions. Characteristics of this joint distribution are studied including conditional distributions. Some properties of the correlation coefficient are explored. We demonstrate the applicability of our proposed model by fitting to three real data sets with correlated count data. A comparison is made with some previously used models to show the effectiveness of the new model.

16.

Alternative modeling techniques for the quantal response data in mixture experiments

Kadri Ulas Akay Müjgan Tez 《Journal of applied statistics》2011,38(11):2597-2616

Mixture experiments are commonly encountered in many fields including chemical, pharmaceutical and consumer product industries. Due to their wide applications, mixture experiments, a special study of response surface methodology, have been given greater attention in both model building and determination of designs compared with other experimental studies. In this paper, some new approaches are suggested for model building and selection in the analysis of data from mixture experiments by using a special generalized linear model, the logistic regression model, proposed by Chen et al. [7]. Generally, the special mixture models, which do not have a constant term, are highly affected by collinearity in modeling the mixture experiments. For this reason, in order to alleviate the undesired effects of collinearity in the analysis of mixture experiments with logistic regression, a new mixture model is defined with an alternative ratio variable. The deviance analysis table is given for standard mixture polynomial models defined by transformations and special mixture models used as linear predictors. The effects of components on the response in the restricted experimental region are given by using an alternative representation of Cox's direction approach. In addition, the odds ratio and the confidence intervals of the odds ratio are identified according to the chosen reference and control groups. To compare the suggested models, some model selection criteria, the graphical odds ratio and the confidence intervals of the odds ratio are used. The advantage of the suggested approaches is illustrated on a tumor incidence data set.

17.

Moment identity for discrete random variable and its applications

Sudheesh Kumar Kattumannil Luisa Tibiletti 《Statistics》2013,47(6):767-775

In this paper, we obtain a moment identity applicable to a general class of discrete probability distributions. We then derive the corresponding identities for modified power series, Ord and Katz families. It is noted that the proposed identity has potential applications in different fields.

18.

A multivariate finite mixture latent trajectory model with application to dementia studies

Dongbing Lai Huiping Xu Daniel Koller Tatiana Foroud 《Journal of applied statistics》2016,43(14):2503-2523

Dementia patients exhibit considerable heterogeneity in individual trajectories of cognitive decline, with some patients showing rapid decline following diagnosis while others exhibit slower decline or remain stable for several years. Dementia studies often collect longitudinal measures of multiple neuropsychological tests aimed at measuring patients’ decline across a number of cognitive domains. We propose a multivariate finite mixture latent trajectory model to identify distinct longitudinal patterns of cognitive decline simultaneously in multiple cognitive domains, each of which is measured by multiple neuropsychological tests. The EM algorithm is used for parameter estimation, and posterior probabilities are used to predict latent class membership. We present results of a simulation study demonstrating adequate performance of our proposed approach and apply our model to the Uniform Data Set from the National Alzheimer's Coordinating Center to identify cognitive decline patterns among dementia patients.

19.

A distance based regression model for prediction with mixed data

C.M. Cuadras C. Arenas 《Communications in Statistics - Theory and Methods》2013,42(6):2261-2279

A multiple regression method based on distance analysis and metric scaling is proposed and studied. This method allows us to predict a continuous response variable from several explanatory variables, is compatible with the general linear model, and is found to be useful when the predictor variables are both continuous and categorical. Real data examples are given to illustrate the results obtained.

20.

Relabelling algorithms for mixture models with applications for large data sets

《Journal of Statistical Computation and Simulation》2012,82(2):394-413

Mixture models are flexible tools in density estimation and classification problems. Bayesian estimation of such models typically relies on sampling from the posterior distribution using Markov chain Monte Carlo. Label switching arises because the posterior is invariant to permutations of the component parameters. Methods for dealing with label switching have been studied fairly extensively in the literature, with the most popular approaches being those based on loss functions. However, many of these algorithms turn out to be too slow in practice, and can be infeasible as the size and/or dimension of the data grow. We propose a new, computationally efficient algorithm based on a loss function interpretation, and show that it can scale up well in large data set scenarios. Then, we review earlier solutions which can scale up well for large data sets, and compare their performances on simulated and real data sets. We conclude with some discussion and recommendations on all the methods studied.
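The permutation invariance behind label switching, and the basic idea of relabelling by minimizing a loss, can be sketched in a few lines. The example below is not the algorithm proposed in the paper. It assumes a univariate normal mixture with a common known standard deviation, uses squared distance to a reference vector of component means as the loss, and searches all permutations by brute force. All function names and values are hypothetical.

```python
import itertools
import math


def mixture_loglik(data, weights, means, sd=1.0):
    """Log-likelihood of a univariate normal mixture with common sd."""
    total = 0.0
    for x in data:
        density = sum(
            w * math.exp(-0.5 * ((x - m) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
            for w, m in zip(weights, means)
        )
        total += math.log(density)
    return total


def relabel_to_reference(draw_means, reference_means):
    """Permute one posterior draw to best match a reference labelling.

    Brute-force search over all K! permutations, minimizing the squared
    distance between the permuted draw and the reference means.
    """
    best = min(
        itertools.permutations(range(len(draw_means))),
        key=lambda perm: sum(
            (draw_means[p] - r) ** 2 for p, r in zip(perm, reference_means)
        ),
    )
    return [draw_means[p] for p in best]


data = [-2.2, -1.8, 1.9, 2.3]
# Swapping the component labels leaves the likelihood unchanged.
print(mixture_loglik(data, [0.5, 0.5], [-2.0, 2.0]))
print(mixture_loglik(data, [0.5, 0.5], [2.0, -2.0]))
# A draw whose labels have switched is mapped back to the reference order.
print(relabel_to_reference([2.1, -1.9], [-2.0, 2.0]))
```

The factorial cost of the permutation search is one reason loss-based relabelling can become slow as the number of components grows, which is the practical problem the abstract describes.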