1.

Building and Fitting Non‐Gaussian Latent Variable Models via the Moment‐Generating Function

TORE SELLAND KLEPPE HANS J. SKAUG 《Scandinavian Journal of Statistics》2008,35(4):664-676

Abstract. For certain classes of hierarchical models, it is easy to derive an expression for the joint moment‐generating function (MGF) of data, whereas the joint probability density has an intractable form which typically involves an integral. The most important example is the class of linear models with non‐Gaussian latent variables. Parameters in the model can be estimated by approximate maximum likelihood, using a saddlepoint‐type approximation to invert the MGF. We focus on modelling heavy‐tailed latent variables, and suggest a family of mixture distributions that behaves well under the saddlepoint approximation (SPA). It is shown that the well‐known normalization issue renders the ordinary SPA useless in the present context. As a solution we extend the non‐Gaussian leading term SPA to a multivariate setting, and introduce a general rule for choosing the leading term density. The approach is applied to mixed‐effects regression, time‐series models and stochastic networks and it is shown that the modified SPA is very accurate.
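
As a rough illustration of the normalization issue mentioned in this abstract, the sketch below applies the ordinary univariate saddlepoint approximation to a Gamma(alpha, 1) density, whose cumulant-generating function is K(t) = -alpha * log(1 - t). This toy case and all function names are assumptions chosen here for illustration; it is not the multivariate, non-Gaussian leading term method the paper develops.

```python
import math

def gamma_pdf(x, alpha):
    """Exact Gamma(alpha, 1) density, used only as a reference."""
    return x ** (alpha - 1) * math.exp(-x) / math.gamma(alpha)

def spa_gamma_pdf(x, alpha):
    """Ordinary (unnormalized) saddlepoint approximation to the Gamma(alpha, 1) density.

    Uses the CGF K(t) = -alpha * log(1 - t), valid for t < 1.
    """
    t_hat = 1.0 - alpha / x              # saddlepoint: solves K'(t) = alpha / (1 - t) = x
    k = -alpha * math.log(1.0 - t_hat)   # K(t_hat)
    k2 = alpha / (1.0 - t_hat) ** 2      # K''(t_hat)
    return math.exp(k - t_hat * x) / math.sqrt(2.0 * math.pi * k2)

def spa_ratio(x, alpha):
    """Ratio of the saddlepoint approximation to the exact density."""
    return spa_gamma_pdf(x, alpha) / gamma_pdf(x, alpha)
```

In this toy case the ratio does not depend on x (for alpha = 2 it is about 1.04), so the unnormalized approximation does not integrate to one and the error is removed entirely by renormalizing. In models where the normalizing constant is not available, that correction is not free, which is one way to read the issue the abstract raises.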

2.

Estimation of generalized linear latent variable models

Philippe Huber Elvezio Ronchetti Maria-Pia Victoria-Feser 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2004,66(4):893-908

Summary. Generalized linear latent variable models (GLLVMs), as defined by Bartholomew and Knott, enable modelling of relationships between manifest and latent variables. They extend structural equation modelling techniques, which are powerful tools in the social sciences. However, because of the complexity of the log-likelihood function of a GLLVM, an approximation such as numerical integration must be used for inference. This can limit drastically the number of variables in the model and can lead to biased estimators. We propose a new estimator for the parameters of a GLLVM, based on a Laplace approximation to the likelihood function and which can be computed even for models with a large number of variables. The new estimator can be viewed as an M-estimator, leading to readily available asymptotic properties and correct inference. A simulation study shows its excellent finite sample properties, in particular when compared with a well-established approach such as LISREL. A real data example on the measurement of wealth for the computation of multidimensional inequality is analysed to highlight the importance of the methodology.
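
To make the idea of a Laplace approximation to a likelihood concrete, here is a minimal sketch for a single Poisson count with one Gaussian latent variable. The model, the parameter names (`beta`, `sigma`) and the function names are assumptions made for this illustration; the estimator in the paper handles multivariate latent variables and is not reproduced here.

```python
import math

def log_joint(u, y, beta, sigma):
    """log p(y | u) + log p(u) for a Poisson count with log-mean beta + u, u ~ N(0, sigma^2)."""
    eta = beta + u
    log_lik = y * eta - math.exp(eta) - math.lgamma(y + 1)
    log_prior = -0.5 * (u / sigma) ** 2 - 0.5 * math.log(2.0 * math.pi * sigma ** 2)
    return log_lik + log_prior

def laplace_marginal(y, beta, sigma, n_iter=50):
    """Laplace approximation to p(y), the integral of exp(log_joint) over u."""
    u = 0.0
    for _ in range(n_iter):
        grad = y - math.exp(beta + u) - u / sigma ** 2
        hess = -math.exp(beta + u) - 1.0 / sigma ** 2   # always negative: log_joint is concave in u
        u -= grad / hess                                # Newton step towards the mode
    hess = -math.exp(beta + u) - 1.0 / sigma ** 2
    # Gaussian integral around the mode: exp(h(u_hat)) * sqrt(2 * pi / -h''(u_hat))
    return math.exp(log_joint(u, y, beta, sigma)) * math.sqrt(2.0 * math.pi / -hess)
```

The approximation replaces the integral by a mode search plus one curvature evaluation, which is why its cost grows slowly with the number of latent variables compared with grid-based numerical integration.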

3.

Bayesian density regression for discrete outcomes

Georgios Papageorgiou 《Australian & New Zealand Journal of Statistics》2019,61(3):336-359

We develop Bayesian models for density regression with emphasis on discrete outcomes. The problem of density regression is approached by considering methods for multivariate density estimation of mixed scale variables, and obtaining conditional densities from the multivariate ones. The approach to multivariate mixed scale outcome density estimation that we describe represents discrete variables, either responses or covariates, as discretised versions of continuous latent variables. We present and compare several models for obtaining these thresholds in the challenging context of count data analysis where the response may be over‐ and/or under‐dispersed in some of the regions of the covariate space. We utilise a nonparametric mixture of multivariate Gaussians to model the directly observed and the latent continuous variables. The paper presents a Markov chain Monte Carlo algorithm for posterior sampling, sufficient conditions for weak consistency, and illustrations on density, mean and quantile regression utilising simulated and real datasets.

4.

Smooth Semi‐nonparametric Analysis for Mixture Cure Models and Its Application to Breast Cancer

Haifen Li Jiajia Zhang Yincai Tang 《Australian & New Zealand Journal of Statistics》2014,56(3):217-235

Mixture cure models are widely used when a proportion of patients are cured. The proportional hazards mixture cure model and the accelerated failure time mixture cure model are the most popular models in practice. Usually the expectation–maximisation (EM) algorithm is applied to both models for parameter estimation. Bootstrap methods are used for variance estimation. In this paper we propose a smooth semi‐nonparametric (SNP) approach in which maximum likelihood is applied directly to mixture cure models for parameter estimation. The variance can be estimated by the inverse of the second derivative of the SNP likelihood. A comprehensive simulation study indicates good performance of the proposed method. We investigate stage effects in breast cancer by applying the proposed method to breast cancer data from the South Carolina Cancer Registry.

5.

Isotone additive latent variable models

Sylvain Sardy Maria-Pia Victoria-Feser 《Statistics and Computing》2012,22(2):647-659

For manifest variables with additive noise and for a given number of latent variables with an assumed distribution, we propose to nonparametrically estimate the association between latent and manifest variables. Our estimation is a two-step procedure: first it employs standard factor analysis to estimate the latent variables as theoretical quantiles of the assumed distribution; second, it employs the additive models’ backfitting procedure to estimate the monotone nonlinear associations between latent and manifest variables. The estimated fit may suggest a different latent distribution or point to nonlinear associations. We show on simulated data how, based on mean squared errors, the nonparametric estimation improves on factor analysis. We then employ the new estimator on real data to illustrate its use for exploratory data analysis.

6.

Estimation with right‐censored observations under a semi‐Markov model

Lihui Zhao X. Joan Hu 《Revue canadienne de statistique》2013,41(2):237-256

The semi‐Markov process often provides a better framework than the classical Markov process for the analysis of events with multiple states. The purpose of this paper is twofold. First, we show that in the presence of right censoring, when the right end‐point of the support of the censoring time is strictly less than the right end‐point of the support of the semi‐Markov kernel, the transition probability of the semi‐Markov process is nonidentifiable, and the estimators proposed in the literature are inconsistent in general. We derive the set of all attainable values for the transition probability based on the censored data, and we propose a nonparametric inference procedure for the transition probability using this set. Second, the conventional approach to constructing confidence bands is not applicable for the semi‐Markov kernel and the sojourn time distribution. We propose new perturbation resampling methods to construct these confidence bands. Different weights and transformations are explored in the construction. We use simulation to examine our proposals and illustrate them with hospitalization data from a recent cancer survivor study. The Canadian Journal of Statistics 41: 237–256; 2013 © 2013 Statistical Society of Canada

7.

Nonlinear mixed‐effects models with misspecified random‐effects distribution

Reza Drikvandi 《Pharmaceutical statistics》2020,19(3):187-201

Nonlinear mixed‐effects models are being widely used for the analysis of longitudinal data, especially from pharmaceutical research. They use random effects which are latent and unobservable variables so the random‐effects distribution is subject to misspecification in practice. In this paper, we first study the consequences of misspecifying the random‐effects distribution in nonlinear mixed‐effects models. Our study is focused on Gauss‐Hermite quadrature, which is now the routine method for calculation of the marginal likelihood in mixed models. We then present a formal diagnostic test to check the appropriateness of the assumed random‐effects distribution in nonlinear mixed‐effects models, which is very useful for real data analysis. Our findings show that the estimates of fixed‐effects parameters in nonlinear mixed‐effects models are generally robust to deviations from normality of the random‐effects distribution, but the estimates of variance components are very sensitive to the distributional assumption of random effects. Furthermore, a misspecified random‐effects distribution will either overestimate or underestimate the predictions of random effects. We illustrate the results using a real data application from an intensive pharmacokinetic study.
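
The Gauss‐Hermite step referred to in this abstract can be sketched as follows, assuming a single normally distributed random effect. The three-point rule is hard-coded to keep the example dependency-free; the binary-response likelihood and all names are hypothetical stand-ins, not the pharmacokinetic model used in the paper.

```python
import math

# Three-point Gauss-Hermite rule for integrals of exp(-x^2) * f(x).
GH_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH_WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)

def marginal_likelihood(cond_lik, sigma):
    """Approximate E[cond_lik(b)] for b ~ N(0, sigma^2) by Gauss-Hermite quadrature."""
    total = 0.0
    for x, w in zip(GH_NODES, GH_WEIGHTS):
        b = math.sqrt(2.0) * sigma * x   # change of variables from N(0, sigma^2) to weight exp(-x^2)
        total += w * cond_lik(b)
    return total / math.sqrt(math.pi)

def bernoulli_lik(b, ys, beta):
    """Conditional likelihood of binary responses ys given random intercept b (logit link)."""
    p = 1.0 / (1.0 + math.exp(-(beta + b)))
    return math.prod(p if y == 1 else 1.0 - p for y in ys)
```

A call such as `marginal_likelihood(lambda b: bernoulli_lik(b, [1, 0, 1], 0.2), 0.8)` gives one subject's marginal likelihood contribution. Because the rule is built around a normal density, it inherits the normality assumption whose misspecification the paper studies; in practice more nodes would be taken from a library routine such as `numpy.polynomial.hermite.hermgauss`.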

8.

A fast algorithm for univariate log‐concave density estimation

Yu Liu Yong Wang 《Australian & New Zealand Journal of Statistics》2018,60(2):258-275

A new fast algorithm for computing the nonparametric maximum likelihood estimate of a univariate log‐concave density is proposed and studied. It is an extension of the constrained Newton method for nonparametric mixture estimation. In each iteration, the newly extended algorithm includes, if necessary, new knots that are located via a special directional derivative function. The algorithm renews the changes of slope at all knots via a quadratically convergent method and removes the knots at which the changes of slope become zero. Theoretically, the characterisation of the nonparametric maximum likelihood estimate is studied and the algorithm is guaranteed to converge to the unique maximum likelihood estimate. Numerical studies show that it outperforms other algorithms that are available in the literature. Applications to some real‐world financial data are also given.

9.

Generalised quasi‐likelihood inference in a semi‐parametric binary dynamic mixed logit model

Nan Zheng Brajendra C. Sutradhar 《Australian & New Zealand Journal of Statistics》2018,60(3):343-373

There exists a recent study where dynamic mixed‐effects regression models for count data have been extended to a semi‐parametric context. However, when one deals with other discrete data such as binary responses, the results based on count data models are not directly applicable. In this paper, we therefore begin with existing binary dynamic mixed models and generalise them to the semi‐parametric context. For inference, we use a new semi‐parametric conditional quasi‐likelihood (SCQL) approach for the estimation of the non‐parametric function involved in the semi‐parametric model, and a semi‐parametric generalised quasi‐likelihood (SGQL) approach for the estimation of the main regression, dynamic dependence and random effects variance parameters. A semi‐parametric maximum likelihood (SML) approach is also used as a comparison to the SGQL approach. The properties of the estimators are examined both asymptotically and empirically. More specifically, the consistency of the estimators is established and finite sample performances of the estimators are examined through an intensive simulation study.

10.

Joint regression modeling for missing categorical covariates in generalized linear models

Luis Carlos Pérez-Ruiz Gabriel Escarela 《Journal of applied statistics》2018,45(15):2741-2759

Missing covariates data is a common issue in generalized linear models (GLMs). A model-based procedure arising from properly specifying joint models for both the partially observed covariates and the corresponding missing indicator variables represents a sound and flexible methodology, which lends itself to maximum likelihood estimation as the likelihood function is available in computable form. In this paper, a novel model-based methodology is proposed for the regression analysis of GLMs when the partially observed covariates are categorical. Pair-copula constructions are used as graphical tools in order to facilitate the specification of the high-dimensional probability distributions of the underlying missingness components. The model parameters are estimated by maximizing the weighted log-likelihood function by using an EM algorithm. In order to compare the performance of the proposed methodology with other well-established approaches, which include complete-cases and multiple imputation, several simulation experiments of Binomial, Poisson and Normal regressions are carried out under both missing at random and non-missing at random mechanisms scenarios. The methods are illustrated by modeling data from a stage III melanoma clinical trial. The results show that the methodology is rather robust and flexible, representing a competitive alternative to traditional techniques.

11.

Nonparametric Bayes Stochastically Ordered Latent Class Models

Yang H O'Brien S Dunson DB 《Journal of the American Statistical Association》2011,106(495):807-817

Latent class models (LCMs) are used increasingly for addressing a broad variety of problems, including sparse modeling of multivariate and longitudinal data, model-based clustering, and flexible inferences on predictor effects. Typical frequentist LCMs require estimation of a single finite number of classes, which does not increase with the sample size, and have a well-known sensitivity to parametric assumptions on the distributions within a class. Bayesian nonparametric methods have been developed to allow an infinite number of classes in the general population, with the number represented in a sample increasing with sample size. In this article, we propose a new nonparametric Bayes model that allows predictors to flexibly impact the allocation to latent classes, while limiting sensitivity to parametric assumptions by allowing class-specific distributions to be unknown subject to a stochastic ordering constraint. An efficient MCMC algorithm is developed for posterior computation. The methods are validated using simulation studies and applied to the problem of ranking medical procedures in terms of the distribution of patient morbidity.

12.

Estimation of regression parameters in a semiparametric transformation model

《Journal of statistical planning and inference》1996,52(3):331-351

A semiparametric approach to model skewed/heteroscedastic regression data is discussed. We work with a semiparametric transform-both-sides regression model, which contains a parametric regression function and a nonparametric transformation. This model is adequate when the relationship between the median response and the explanatory variable has been specified by a theoretical result or a previous empirical study. The transform-both-sides model with a parametric transformation has been studied extensively and applied successfully to a number of data sets. Allowing a nonparametric transformation function increases the flexibility of the model. In this article, we estimate the nonparametric transformation function by the conditional kernel density approach developed by Wang and Ruppert (1995), and then use a pseudo-maximum likelihood estimator to estimate the regression parameters. This estimate of the regression parameters has not been studied previously. In this article, the asymptotic distribution of this pseudo-MLE is derived. We also show that when σ, the standard deviation of the error, goes to zero (small σ asymptotics), this estimator is adaptive. Adaptive means that the regression parameters are estimated as precisely as when the transformation is known exactly. A similar result holds in the parametric approaches of Carroll and Ruppert (1984) and Ruppert and Aldershof (1989). Simulated and real examples are provided to illustrate the performance of the proposed estimator for finite sample size.

13.

Most Likely Transformations

《Scandinavian Journal of Statistics》2018,45(1):110-134

We propose and study properties of maximum likelihood estimators in the class of conditional transformation models. Based on a suitable explicit parameterization of the unconditional or conditional transformation function, we establish a cascade of increasingly complex transformation models that can be estimated, compared and analysed in the maximum likelihood framework. Models for the unconditional or conditional distribution function of any univariate response variable can be set up and estimated in the same theoretical and computational framework simply by choosing an appropriate transformation function and parameterization thereof. The ability to evaluate the distribution function directly allows us to estimate models based on the exact likelihood, especially in the presence of random censoring or truncation. For discrete and continuous responses, we establish the asymptotic normality of the proposed estimators. A reference software implementation of maximum likelihood‐based estimation for conditional transformation models that allows the same flexibility as the theory developed here was employed to illustrate the wide range of possible applications.

14.

Latent Variable Models for Mixed Discrete and Continuous Outcomes

Mary Dupuis Sammel Louise M. Ryan & Julie M. Legler 《Journal of the Royal Statistical Society. Series B, Statistical methodology》1997,59(3):667-678

We propose a latent variable model for mixed discrete and continuous outcomes. The model accommodates any mixture of outcomes from an exponential family and allows for arbitrary covariate effects, as well as direct modelling of covariates on the latent variable. An EM algorithm is proposed for parameter estimation and estimates of the latent variables are produced as a by-product of the analysis. A generalized likelihood ratio test can be used to test the significance of covariates affecting the latent outcomes. This method is applied to birth defects data, where the outcomes of interest are continuous measures of size and binary indicators of minor physical anomalies. Infants who were exposed in utero to anticonvulsant medications are compared with controls.

15.

The Role of Posterior Densities in Latent Variable Models for Ordinal Data

Silvia Bianconcini Silvia Cagnone 《Communications in Statistics - Theory and Methods》2014,43(4):681-692

In latent variable models, problems related to the integration of the likelihood function arise since analytical solutions do not exist. Laplace and Adaptive Gauss-Hermite (AGH) approximations have been discussed as good approximating methods. Their performance relies on the assumption of normality of the posterior density of the latent variables, but, in small samples, this is not necessarily assured. Here, we analyze how the shape of the posterior densities varies as a function of the model parameters, and we investigate its influence on the performance of AGH and of the Laplace approximation.

16.

A Semi‐parametric Transformation Frailty Model for Semi‐competing Risks Survival Data

Fei Jiang Sebastien Haneuse 《Scandinavian Journal of Statistics》2017,44(1):112-129

In the analysis of semi‐competing risks data interest lies in estimation and inference with respect to a so‐called non‐terminal event, the observation of which is subject to a terminal event. Multi‐state models are commonly used to analyse such data, with covariate effects on the transition/intensity functions typically specified via the Cox model and dependence between the non‐terminal and terminal events specified, in part, by a unit‐specific shared frailty term. To ensure identifiability, the frailties are typically assumed to arise from a parametric distribution, specifically a Gamma distribution with mean 1.0 and variance, say, σ². When the frailty distribution is misspecified, however, the resulting estimator is not guaranteed to be consistent, with the extent of asymptotic bias depending on the discrepancy between the assumed and true frailty distributions. In this paper, we propose a novel class of transformation models for semi‐competing risks analysis that permit the non‐parametric specification of the frailty distribution. To ensure identifiability, the class restricts to parametric specifications of the transformation and the error distribution; the latter are flexible, however, and cover a broad range of possible specifications. We also derive the semi‐parametric efficient score under the complete data setting and propose a non‐parametric score imputation method to handle right censoring; consistency and asymptotic normality of the resulting estimators is derived and small‐sample operating characteristics evaluated via simulation. Although the proposed semi‐parametric transformation model and non‐parametric score imputation method are motivated by the analysis of semi‐competing risks data, they are broadly applicable to any analysis of multivariate time‐to‐event outcomes in which a unit‐specific shared frailty is used to account for correlation. Finally, the proposed model and estimation procedures are applied to a study of hospital readmission among patients diagnosed with pancreatic cancer.

17.

A semiparametric stochastic mixed effects model for bivariate cyclic longitudinal data

Kexin Ji Joel A. Dubin 《Revue canadienne de statistique》2020,48(3):471-498

We propose a flexible semiparametric stochastic mixed effects model for bivariate cyclic longitudinal data. The model can handle either single cycle or, more generally, multiple consecutive cycle data. The approach models the mean of responses by parametric fixed effects and a smooth nonparametric function for the underlying time effects, and the relationship across the bivariate responses by a bivariate Gaussian random field and a joint distribution of random effects. The proposed model not only can model complicated individual profiles, but also allows for more flexible within-subject and between-response correlations. The fixed effects regression coefficients and the nonparametric time functions are estimated using maximum penalized likelihood, where the resulting estimator for the nonparametric time function is a cubic smoothing spline. The smoothing parameters and variance components are estimated simultaneously using restricted maximum likelihood. Simulation results show that the parameter estimates are close to the true values. The fit of the proposed model on a real bivariate longitudinal dataset of pre-menopausal women also performs well, both for a single cycle analysis and for a multiple consecutive cycle analysis. The Canadian Journal of Statistics 48: 471–498; 2020 © 2020 Statistical Society of Canada

18.

Penalized likelihood estimation for a semiparametric logit model with random effects

Yan Sun (孙燕) 《Statistical Research (统计研究)》2013,30(4):92-98

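In the much-debated study of the relationship between income inequality and health, this paper proposes a semiparametric logit model with random effects to reduce possible model misspecification and omitted-variable bias; the nonparametric specification can also be used for exploratory data analysis. The paper then proposes estimation methods for the nonparametric and parametric parts of the model. The difficulty is that the random effects leave the integral in the likelihood function without a closed form, and the nonparametric component makes estimation harder still. Based on penalized-spline nonparametric estimation and a fourth-order Laplace approximation, the paper constructs a penalized log-likelihood function, which is maximized by a Newton-Raphson method. The paper also establishes a criterion for selecting the smoothing parameter, which is important in penalized splines. Estimation results for the income inequality and health application show that the data support the weak income inequality hypothesis, and the nonparametric estimates indicate a U-shaped form; a comparison with the estimation results from the application indicates that the proposed estimation method is fairly accurate.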

19.

Gamma failure‐time mixture models: yet another way to establish efficacy

Kallappa M. Koti 《Pharmaceutical statistics》2003,2(2):133-144

Using a Yamaguchi‐type generalized gamma failure‐time mixture model, we analyse the data from a study of autologous and allogeneic bone marrow transplantation in the treatment of high‐risk refractory acute lymphoblastic leukaemia, focusing on the time to recurrence of disease. We develop maximum likelihood techniques for the joint estimation of the surviving fractions and the survivor functions. This includes an approximation to the derivative of the survivor function with respect to the shape parameter. We obtain the maximum likelihood estimates of the model parameters. We also compute the variance‐covariance matrix of the parameter estimators. The extended family of generalized gamma failure‐time mixture models is flexible enough to include many commonly used failure‐time distributions as special cases. Yet these models are not used in practice because of computational difficulties. We claim that we have overcome this problem. The proposed approximation to the derivative of the survivor function with respect to the shape parameter can be used in any statistical package. We also address the issue of lack of identifiability. We point out that there can be a substantial advantage to using the gamma failure‐time mixture models over nonparametric methods. Copyright © 2003 John Wiley & Sons, Ltd.

20.

The beta log-normal distribution

F. Castellares L. C. Montenegro G. M. Cordeiro 《Journal of Statistical Computation and Simulation》2013,83(2):203-228

For the first time, we introduce the beta log-normal (LN) distribution for which the LN distribution is a special case. Various properties of the new distribution are discussed. Expansions for the cumulative distribution and density functions that do not involve complicated functions are derived. We obtain expressions for its moments and for the moments of order statistics. The estimation of parameters is approached by the method of maximum likelihood, and the expected information matrix is derived. The new model is quite flexible in analysing positive data as an important alternative to the gamma, Weibull, generalized exponential, beta exponential, and Birnbaum–Saunders distributions. The flexibility of the new distribution is illustrated in an application to a real data set.