期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Bayesian Hierarchical Kernelized Probabilistic Matrix Factorization

Hongxia Yang Jun Wang 《统计学通讯:模拟与计算》2016,45(7):2528-2540

We present a Bayesian analysis framework for matrix-variate normal data with dependency structures induced by rows and columns. This framework of matrix normal models includes prior specifications, posterior computation using Markov chain Monte Carlo methods, evaluation of prediction uncertainty, model structure search, and extensions to multidimensional arrays. Compared with Bayesian probabilistic matrix factorization, which integrates a Gaussian prior for single row of the data matrix, our proposed model, namely Bayesian hierarchical kernelized probabilistic matrix factorization, imposes Gaussian Process priors over multiple rows of the matrix. Hence, the learned model explicitly captures the underlying correlation among the rows and the columns. In addition, our method requires no specific assumptions like independence of latent factors for rows and columns, which obtains more flexibility for modeling real data compared to existing works. Finally, the proposed framework can be adapted to a wide range of applications, including multivariate analysis, times series, and spatial modeling. Experiments highlight the superiority of the proposed model in handling model uncertainty and model optimization. 相似文献

2.

潜在成长模型在体育运动领域的应用

苏荣海徐茂洲《统计与信息论坛》2017,(2):91-95

介绍了潜在成长模型的发展、类型和内容,通过分析、归纳、整理等,比较了潜在成长模型与传统t检验、方差分析、回归分析的差别,探讨了潜在成长模型研究设计的要求,通过台湾棒球职业球员薪酬成长模型说明。利用文献调查法,分析了LGM潜在成长模型在体育运动领域中应用的概况,举例说明体育运动领域应用LGM的情况,证实了未来潜在成长模型会在体育科学领域逐渐受到重视并得到快速发展。相似文献

3.

Modeling population and subject-specific growth in a latent trait measured by multiple instruments over time using a hierarchical Bayesian framework

Caitlin Ward Jacob Oleson J. Bruce Tomblin Elizabeth Walker 《Journal of applied statistics》2022,49(2):449

Psychometric growth curve modeling techniques are used to describe a person’s latent ability and how that ability changes over time based on a specific measurement instrument. However, the same instrument cannot always be used over a period of time to measure that latent ability. This is often the case when measuring traits longitudinally in children. Reasons may be that over time some measurement tools that were difficult for young children become too easy as they age resulting in floor effects or ceiling effects or both. We propose a Bayesian hierarchical model for such a scenario. Within the Bayesian model we combine information from multiple instruments used at different age ranges and having different scoring schemes to examine growth in latent ability over time. The model includes between-subject variance and within-subject variance and does not require linking item specific difficulty between the measurement tools. The model’s utility is demonstrated on a study of language ability in children from ages one to ten who are hard of hearing where measurement tool specific growth and subject-specific growth are shown in addition to a group level latent growth curve comparing the hard of hearing children to children with normal hearing.KEYWORDS: Bayesian hierarchical models, psychometric modeling, language ability, growth curve modeling, longitudinal analysis 相似文献

4.

Restricted Indian buffet processes

Finale Doshi-Velez Sinead A. Williamson 《Statistics and Computing》2017,27(5):1205-1223

Latent feature models are a powerful tool for modeling data with globally-shared features. Nonparametric distributions over exchangeable sets of features, such as the Indian Buffet Process, offer modeling flexibility by letting the number of latent features be unbounded. However, current models impose implicit distributions over the number of latent features per data point, and these implicit distributions may not match our knowledge about the data. In this work, we demonstrate how the restricted Indian buffet process circumvents this restriction, allowing arbitrary distributions over the number of features in an observation. We discuss several alternative constructions of the model and apply the insights to develop Markov Chain Monte Carlo and variational methods for simulation and posterior inference. 相似文献

5.

Non‐parametric Bayesian Hazard Regression for Chronic Disease Risk Assessment

下载免费PDF全文

Olli Saarela Elja Arjas 《Scandinavian Journal of Statistics》2015,42(2):609-626

Assessing the absolute risk for a future disease event in presently healthy individuals has an important role in the primary prevention of cardiovascular diseases (CVD) and other chronic conditions. In this paper, we study the use of non‐parametric Bayesian hazard regression techniques and posterior predictive inferences in the risk assessment task. We generalize our previously published Bayesian multivariate monotonic regression procedure to a survival analysis setting, combined with a computationally efficient estimation procedure utilizing case–base sampling. To achieve parsimony in the model fit, we allow for multidimensional relationships within specified subsets of risk factors, determined either on a priori basis or as a part of the estimation procedure. We apply the proposed methods for 10‐year CVD risk assessment in a Finnish population. © 2014 Board of the Foundation of the Scandinavian Journal of Statistics 相似文献

6.

Joint modeling of longitudinal data with a dependent terminal event

Sui He Ting Du 《统计学通讯:理论与方法》2013,42(3):813-835

ABSTRACT

Longitudinal data often arise in longitudinal follow-up studies, and there may exist a dependent terminal event such as death that stops the follow-up. In this article, we propose a new joint modeling for the analysis of longitudinal data with informative observation times via a dependent terminal event and two latent variables. Estimating equations are developed for parameter estimation, and asymptotic properties of the resulting estimators are established. In addition, a generalization of the joint model with time-varying coefficients for the longitudinal response variable is considered, and goodness-of-fit methods for assessing the adequacy of the model are also provided. The proposed method works well in our simulation studies, and is applied to a data set from a bladder cancer study. 相似文献

7.

Max-Linear Competing Factor Models

Qiurong Cui Zhengjun Zhang 《商业与经济统计学杂志》2018,36(1):62-74

Models incorporating “latent” variables have been commonplace in financial, social, and behavioral sciences. Factor model, the most popular latent model, explains the continuous observed variables in a smaller set of latent variables (factors) in a matter of linear relationship. However, complex data often simultaneously display asymmetric dependence, asymptotic dependence, and positive (negative) dependence between random variables, which linearity and Gaussian distributions and many other extant distributions are not capable of modeling. This article proposes a nonlinear factor model that can model the above-mentioned variable dependence features but still possesses a simple form of factor structure. The random variables, marginally distributed as unit Fréchet distributions, are decomposed into max linear functions of underlying Fréchet idiosyncratic risks, transformed from Gaussian copula, and independent shared external Fréchet risks. By allowing the random variables to share underlying (latent) pervasive risks with random impact parameters, various dependence structures are created. This innovates a new promising technique to generate families of distributions with simple interpretations. We dive in the multivariate extreme value properties of the proposed model and investigate maximum composite likelihood methods for the impact parameters of the latent risks. The estimates are shown to be consistent. The estimation schemes are illustrated on several sets of simulated data, where comparisons of performance are addressed. We employ a bootstrap method to obtain standard errors in real data analysis. Real application to financial data reveals inherent dependencies that previous work has not disclosed and demonstrates the model’s interpretability to real data. Supplementary materials for this article are available online. 相似文献

8.

A functional data approach to model score difference process in professional basketball games

Tao Chen 《Journal of applied statistics》2018,45(1):112-127

In this paper, we investigate the progress of score difference (between home and away teams) in professional basketball games employing functional data analysis (FDA). The observed score difference is viewed as the realization of the latent intensity process, which is assumed to be continuous. There are two major advantages of modeling the latent score difference intensity process using FDA: (1) it allows for arbitrary dependent structure among score change increments. This removes potential model mis-specifications and accommodates momentum which is often observed in sports games. (2) further statistical inferences using FDA estimates will not suffer from inconsistency due to the issue of having a continuous model yet discretely sampled data. Based on the FDA estimates, we define and numerically characterize momentum in basketball games and demonstrate its importance in predicting game outcomes. 相似文献

9.

Mixture of latent trait analyzers for model-based clustering of categorical data

Isabella Gollini Thomas Brendan Murphy 《Statistics and Computing》2014,24(4):569-588

Model-based clustering methods for continuous data are well established and commonly used in a wide range of applications. However, model-based clustering methods for categorical data are less standard. Latent class analysis is a commonly used method for model-based clustering of binary data and/or categorical data, but due to an assumed local independence structure there may not be a correspondence between the estimated latent classes and groups in the population of interest. The mixture of latent trait analyzers model extends latent class analysis by assuming a model for the categorical response variables that depends on both a categorical latent class and a continuous latent trait variable; the discrete latent class accommodates group structure and the continuous latent trait accommodates dependence within these groups. Fitting the mixture of latent trait analyzers model is potentially difficult because the likelihood function involves an integral that cannot be evaluated analytically. We develop a variational approach for fitting the mixture of latent trait models and this provides an efficient model fitting strategy. The mixture of latent trait analyzers model is demonstrated on the analysis of data from the National Long Term Care Survey (NLTCS) and voting in the U.S. Congress. The model is shown to yield intuitive clustering results and it gives a much better fit than either latent class analysis or latent trait analysis alone. 相似文献

10.

Dynamic latent trait models with mixed hidden Markov structure for mixed longitudinal outcomes

Yue Zhang Kiros Berhane 《Journal of applied statistics》2016,43(4):704-720

We propose a general Bayesian joint modeling approach to model mixed longitudinal outcomes from the exponential family for taking into account any differential misclassification that may exist among categorical outcomes. Under this framework, outcomes observed without measurement error are related to latent trait variables through generalized linear mixed effect models. The misclassified outcomes are related to the latent class variables, which represent unobserved real states, using mixed hidden Markov models (MHMMs). In addition to enabling the estimation of parameters in prevalence, transition and misclassification probabilities, MHMMs capture cluster level heterogeneity. A transition modeling structure allows the latent trait and latent class variables to depend on observed predictors at the same time period and also on latent trait and latent class variables at previous time periods for each individual. Simulation studies are conducted to make comparisons with traditional models in order to illustrate the gains from the proposed approach. The new approach is applied to data from the Southern California Children Health Study to jointly model questionnaire-based asthma state and multiple lung function measurements in order to gain better insight about the underlying biological mechanism that governs the inter-relationship between asthma state and lung function development. 相似文献

11.

Modeling Longitudinal Obesity Data with Intermittent Missingness Using a New Latent Variable Model

Li Qin Lisa Weissfeld Marsha D. Marcus Michele D. Levine Feng Dai 《统计学通讯:模拟与计算》2016,45(6):2018-2031

We propose a latent variable model for informative missingness in longitudinal studies which is an extension of latent dropout class model. In our model, the value of the latent variable is affected by the missingness pattern and it is also used as a covariate in modeling the longitudinal response. So the latent variable links the longitudinal response and the missingness process. In our model, the latent variable is continuous instead of categorical and we assume that it is from a normal distribution. The EM algorithm is used to obtain the estimates of the parameter we are interested in and Gauss–Hermite quadrature is used to approximate the integration of the latent variable. The standard errors of the parameter estimates can be obtained from the bootstrap method or from the inverse of the Fisher information matrix of the final marginal likelihood. Comparisons are made to the mixed model and complete-case analysis in terms of a clinical trial dataset, which is Weight Gain Prevention among Women (WGPW) study. We use the generalized Pearson residuals to assess the fit of the proposed latent variable model. 相似文献

12.

A multifactor transformed diffusion model with applications to VIX and VIX futures

《Econometric Reviews》2012,31(1):27-53

Abstract

Transformed diffusions (TDs) have become increasingly popular in financial modeling for their model flexibility and tractability. While existing TD models are predominately one-factor models, empirical evidence often prefers models with multiple factors. We propose a novel distribution-driven nonlinear multifactor TD model with latent components. Our model is a transformation of a underlying multivariate Ornstein–Uhlenbeck (MVOU) process, where the transformation function is endogenously specified by a flexible parametric stationary distribution of the observed variable. Computationally efficient exact likelihood inference can be implemented for our model using a modified Kalman filter algorithm and the transformed affine structure also allows us to price derivatives in semi-closed form. We compare the proposed multifactor model with existing TD models for modeling VIX and pricing VIX futures. Our results show that the proposed model outperforms all existing TD models both in the sample and out of the sample consistently across all categories and scenarios of our comparison. 相似文献

13.

Sparse cluster analysis of large-scale discrete variables with application to single nucleotide polymorphism data

Baolin Wu 《Journal of applied statistics》2013,40(2):358-367

Currently, extreme large-scale genetic data present significant challenges for cluster analysis. Most of the existing clustering methods are typically built on the Euclidean distance and geared toward analyzing continuous response. They work well for clustering, e.g. microarray gene expression data, but often perform poorly for clustering, e.g. large-scale single nucleotide polymorphism (SNP) data. In this paper, we study the penalized latent class model for clustering extremely large-scale discrete data. The penalized latent class model takes into account the discrete nature of the response using appropriate generalized linear models and adopts the lasso penalized likelihood approach for simultaneous model estimation and selection of important covariates. We develop very efficient numerical algorithms for model estimation based on the iterative coordinate descent approach and further develop the expectation–maximization algorithm to incorporate and model missing values. We use simulation studies and applications to the international HapMap SNP data to illustrate the competitive performance of the penalized latent class model. 相似文献

14.

Sequential imputation for models with latent variables assuming latent ignorability

Lauren J. Beesley Jeremy M. G. Taylor Roderick J. A. Little 《Australian & New Zealand Journal of Statistics》2019,61(2):213-233

Models that involve an outcome variable, covariates, and latent variables are frequently the target for estimation and inference. The presence of missing covariate or outcome data presents a challenge, particularly when missingness depends on the latent variables. This missingness mechanism is called latent ignorable or latent missing at random and is a generalisation of missing at random. Several authors have previously proposed approaches for handling latent ignorable missingness, but these methods rely on prior specification of the joint distribution for the complete data. In practice, specifying the joint distribution can be difficult and/or restrictive. We develop a novel sequential imputation procedure for imputing covariate and outcome data for models with latent variables under latent ignorable missingness. The proposed method does not require a joint model; rather, we use results under a joint model to inform imputation with less restrictive modelling assumptions. We discuss identifiability and convergence‐related issues, and simulation results are presented in several modelling settings. The method is motivated and illustrated by a study of head and neck cancer recurrence. Imputing missing data for models with latent variables under latent‐dependent missingness without specifying a full joint model. 相似文献

15.

Comparing Two-Stage Segmentation Methods for Choice Data with a One-Stage Latent Class Choice Analysis

Marjolein Crabbe Bradley Jones Martina Vandebroek 《统计学通讯:模拟与计算》2013,42(5):1188-1212

Market segmentation is a key concept in marketing research. Identification of consumer segments helps in setting up and improving a marketing strategy. Hence, the need is to improve existing methods and to develop new segmentation methods. We introduce two new consumer indicators that can be used as segmentation basis in two-stage methods, the forces and the dfbetas. Both bases express a subject’s effect on the aggregate estimates of the parameters in a conditional logit model. Further, individual-level estimates, obtained by either estimating a conditional logit model for each individual separately with maximum likelihood or by hierarchical Bayes (HB) estimation of a mixed logit choice model, and the respondents’ raw choices are also used as segmentation basis. In the second stage of the methods the bases are classified into segments with cluster analysis or latent class models. All methods are applied to choice data because of the increasing popularity of choice experiments to analyze choice behavior. To verify whether two-stage segmentation methods can compete with a one-stage approach, a latent class choice model is estimated as well. A simulation study reveals the superiority of the two-stage method that clusters the HB estimates and the one-stage latent class choice model. Additionally, very good results are obtained for two-stage latent class cluster analysis of the choices as well as for the two-stage methods clustering the forces, the dfbetas and the choices. 相似文献

16.

Bayesian composite quantile regression for linear mixed-effects models

Yuzhu Tian Heng Lian Maozai Tian 《统计学通讯:理论与方法》2017,46(15):7717-7731

Longitudinal data are commonly modeled with the normal mixed-effects models. Most modeling methods are based on traditional mean regression, which results in non robust estimation when suffering extreme values or outliers. Median regression is also not a best choice to estimation especially for non normal errors. Compared to conventional modeling methods, composite quantile regression can provide robust estimation results even for non normal errors. In this paper, based on a so-called pseudo composite asymmetric Laplace distribution (PCALD), we develop a Bayesian treatment to composite quantile regression for mixed-effects models. Furthermore, with the location-scale mixture representation of the PCALD, we establish a Bayesian hierarchical model and achieve the posterior inference of all unknown parameters and latent variables using Markov Chain Monte Carlo (MCMC) method. Finally, this newly developed procedure is illustrated by some Monte Carlo simulations and a case analysis of HIV/AIDS clinical data set. 相似文献

17.

Identification of Multivariate Responders/Non-Responders Using Bayesian Growth Curve Latent Class Models

Leiby BE Sammel MD Ten Have TR Lynch KG 《Journal of the Royal Statistical Society. Series C, Applied statistics》2009,58(4):505-524

In this paper, we propose a multivariate growth curve mixture model that groups subjects based on multiple symptoms measured repeatedly over time. Our model synthesizes features of two models. First, we follow Roy and Lin (2000) in relating the multiple symptoms at each time point to a single latent variable. Second, we use the growth mixture model of Muthén and Shedden (1999) to group subjects based on distinctive longitudinal profiles of this latent variable. The mean growth curve for the latent variable in each class defines that class's features. For example, a class of "responders" would have a decline in the latent symptom summary variable over time. A Bayesian approach to estimation is employed where the methods of Elliott et al (2005) are extended to simultaneously estimate the posterior distributions of the parameters from the latent variable and growth curve mixture portions of the model. We apply our model to data from a randomized clinical trial evaluating the efficacy of Bacillus Calmette-Guerin (BCG) in treating symptoms of Interstitial Cystitis. In contrast to conventional approaches using a single subjective Global Response Assessment, we use the multivariate symptom data to identify a class of subjects where treatment demonstrates effectiveness. Simulations are used to confirm identifiability results and evaluate the performance of our algorithm. The definitive version of this paper is available at onlinelibrary.wiley.com. 相似文献

18.

Joint modeling tumor burden and time to event data in oncology trials

Ye Shen Aparna Anderson Ritwik Sinha Yang Li 《Pharmaceutical statistics》2014,13(5):286-293

The tumor burden (TB) process is postulated to be the primary mechanism through which most anticancer treatments provide benefit. In phase II oncology trials, the biologic effects of a therapeutic agent are often analyzed using conventional endpoints for best response, such as objective response rate and progression‐free survival, both of which causes loss of information. On the other hand, graphical methods including spider plot and waterfall plot lack any statistical inference when there is more than one treatment arm. Therefore, longitudinal analysis of TB data is well recognized as a better approach for treatment evaluation. However, longitudinal TB process suffers from informative missingness because of progression or death. We propose to analyze the treatment effect on tumor growth kinetics using a joint modeling framework accounting for the informative missing mechanism. Our approach is illustrated by multisetting simulation studies and an application to a nonsmall‐cell lung cancer data set. The proposed analyses can be performed in early‐phase clinical trials to better characterize treatment effect and thereby inform decision‐making. Copyright © 2014 John Wiley & Sons, Ltd. 相似文献

19.

Simultaneous variable and factor selection via sparse group lasso in factor analysis

Yuanchu Dang 《Journal of Statistical Computation and Simulation》2019,89(14):2744-2764

This paper considers variable and factor selection in factor analysis. We treat the factor loadings for each observable variable as a group, and introduce a weighted sparse group lasso penalty to the complete log-likelihood. The proposal simultaneously selects observable variables and latent factors of a factor analysis model in a data-driven fashion; it produces a more flexible and sparse factor loading structure than existing methods. For parameter estimation, we derive an expectation-maximization algorithm that optimizes the penalized log-likelihood. The tuning parameters of the procedure are selected by a likelihood cross-validation criterion that yields satisfactory results in various simulation settings. Simulation results reveal that the proposed method can better identify the possibly sparse structure of the true factor loading matrix with higher estimation accuracy than existing methods. A real data example is also presented to demonstrate its performance in practice. 相似文献

20.

Sequential Bayesian inference in hidden Markov stochastic kinetic models with application to detection and response to seasonal epidemics 总被引：1，自引：0，他引：1

Junjing Lin Michael Ludkovski 《Statistics and Computing》2014,24(6):1047-1062

We study sequential Bayesian inference in stochastic kinetic models with latent factors. Assuming continuous observation of all the reactions, our focus is on joint inference of the unknown reaction rates and the dynamic latent states, modeled as a hidden Markov factor. Using insights from nonlinear filtering of continuous-time jump Markov processes we develop a novel sequential Monte Carlo algorithm for this purpose. Our approach applies the ideas of particle learning to minimize particle degeneracy and exploit the analytical jump Markov structure. A motivating application of our methods is modeling of seasonal infectious disease outbreaks represented through a compartmental epidemic model. We demonstrate inference in such models with several numerical illustrations and also discuss predictive analysis of epidemic countermeasures using sequential Bayes estimates. 相似文献