1.

Bayesian clustering for continuous-time hidden Markov models

Yu Luo David A. Stephens David L. Buckeridge 《Revue canadienne de statistique》2023,51(1):134-156

We develop clustering procedures for longitudinal trajectories based on a continuous-time hidden Markov model (CTHMM) and a generalized linear observation model. Specifically, in this article we carry out finite and infinite mixture model-based clustering for a CTHMM and achieve inference using Markov chain Monte Carlo (MCMC). For a finite mixture model with a prior on the number of components, we implement reversible-jump MCMC to facilitate the trans-dimensional move between models with different numbers of clusters. For a Dirichlet process mixture model, we utilize restricted Gibbs sampling split–merge proposals to improve the performance of the MCMC algorithm. We apply our proposed algorithms to simulated data as well as a real-data example, and the results demonstrate the desired performance of the new sampler.

2.

Bayesian model averaging for estimating the number of classes: applications to the total number of species in metagenomics

Sébastien Li-Thiao-Té Jean-Jacques Daudin Stéphane Robin 《Journal of applied statistics》2012,39(7):1489-1504

3.

Profile likelihood approaches for semiparametric copula and frailty models for clustered survival data

Il Do Ha Jong-Min Kim Takeshi Emura 《Journal of applied statistics》2019,46(14):2553-2571

In clustered survival data, the dependence among individual survival times within a cluster has usually been described using copula models and frailty models. In this paper we propose a profile likelihood approach for semiparametric copula models with different cluster sizes. We also propose a likelihood ratio method based on profile likelihood for testing the absence of the association parameter (i.e. a test of independence) under the copula models, which leads to a boundary problem in the parameter space. For this purpose, we show via a simulation study that the proposed likelihood ratio method using an asymptotic chi-square mixture distribution performs well as the sample size increases. We compare the behaviors of the two models using the profile likelihood approach under a semiparametric setting. The proposed method is demonstrated using two well-known data sets.
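As a rough illustration of the boundary problem mentioned in this abstract: when a single association parameter is tested at the edge of its parameter space, the likelihood ratio statistic is commonly referred to a 50:50 mixture of a point mass at zero and a chi-square distribution with one degree of freedom. The sketch below assumes that particular mixture (the abstract does not state the weights), and the function name `boundary_lrt_pvalue` is hypothetical.

```python
import math


def boundary_lrt_pvalue(lrt_stat: float) -> float:
    """P-value under an ASSUMED 0.5*chi2(0) + 0.5*chi2(1) null distribution."""
    if lrt_stat <= 0.0:
        # The point mass at zero covers every non-positive statistic.
        return 1.0
    # For one degree of freedom, P(chi2_1 > x) = erfc(sqrt(x / 2)).
    chi2_1_tail = math.erfc(math.sqrt(lrt_stat / 2.0))
    # Only the chi2(1) half of the mixture contributes to the upper tail.
    return 0.5 * chi2_1_tail


# Illustrative value only: a statistic near 2.71 sits close to the 5% level.
p_value = boundary_lrt_pvalue(2.7055)
```

Under this assumed mixture the p-value is half of the usual one-degree-of-freedom p-value, so ignoring the boundary would make the test conservative.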

4.

Clustering Dependencies Via Mixtures of Copulas

Veni Arakelian Dimitris Karlis 《Communications in Statistics - Simulation and Computation》2013,42(7):1644-1661

5.

Modeling statistical dependence of Markov chains via copula models

Fentaw Abegaz U.V. Naik-Nimbalkar 《Journal of statistical planning and inference》2008

Conditional probability distributions have been commonly used in modeling Markov chains. In this paper we consider an alternative approach based on copulas to investigate Markov-type dependence structures. Based on the realization of a single Markov chain, we estimate the parameters using one- and two-stage estimation procedures. We derive asymptotic properties of the marginal and copula parameter estimators and compare the performance of the estimation procedures using Monte Carlo simulations. Under low and moderate dependence, the two-stage estimation performs comparably to maximum likelihood estimation. In addition, we propose a parametric pseudo-likelihood ratio test for copula model selection under the two-stage procedure. We apply the proposed methods to an environmental data set.

6.

Robust designs for misspecified logistic models

Adeniyi J. Adewale Douglas P. Wiens 《Journal of statistical planning and inference》2009

We develop criteria that generate robust designs and use such criteria for the construction of designs that insure against possible misspecifications in logistic regression models. The design criteria we propose differ from the classical ones in that we do not focus on sampling error alone. Instead we use design criteria that also account for error due to bias engendered by the model misspecification. Our robust designs optimize the average of a function of the sampling error and bias error over a specified misspecification neighbourhood. Examples of robust designs for logistic models are presented, including a case study implementing the methodologies using beetle mortality data.

7.

Quantile regression for robust inference on varying coefficient partially nonlinear models

Jing Yang Fang Lu Hu Yang 《Journal of the Korean Statistical Society》2018,47(2):172-184

In this paper, we propose a robust statistical inference approach for the varying coefficient partially nonlinear models based on quantile regression. A three-stage estimation procedure is developed to estimate the parameter and coefficient functions involved in the model. Under some mild regularity conditions, the asymptotic properties of the resulting estimators are established. Some simulation studies are conducted to evaluate the finite-sample performance as well as the robustness of our proposed quantile regression method versus the well-known profile least squares estimation procedure. Moreover, the Boston housing price data are used to further illustrate the application of the new method.

8.

The Identifiability of Dependent Competing Risks Models Induced by Bivariate Frailty Models

Antai Wang Krishnendu Chandra Ruihua Xu Junfeng Sun 《Scandinavian Journal of Statistics》2015,42(2):427-437

In this paper, we propose to use a special class of bivariate frailty models to study dependent censored data. The proposed models are closely linked to Archimedean copula models. We give sufficient conditions for the identifiability of this type of competing risks model. The proposed conditions are derived based on a property shared by Archimedean copula models and satisfied by several well‐known bivariate frailty models. Compared with the models studied by Heckman and Honoré and Abbring and van den Berg, our models are more restrictive but can be identified with a discrete (even finite) covariate. Under our identifiability conditions, the expectation–maximization (EM) algorithm provides us with consistent estimates of the unknown parameters. Simulation studies have shown that our estimation procedure works quite well. We fit a dependent censored leukaemia data set using the Clayton copula model and end our paper with some discussions. © 2014 Board of the Foundation of the Scandinavian Journal of Statistics
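For readers unfamiliar with the Clayton copula named in this abstract, a minimal sketch of its standard bivariate form follows. It shows only the textbook copula function and its Kendall's tau, not the authors' frailty model or their EM estimation; the function names are hypothetical.

```python
def clayton_copula(u: float, v: float, theta: float) -> float:
    """Clayton copula C(u, v) = (u^-theta + v^-theta - 1)^(-1/theta), theta > 0."""
    if theta <= 0.0:
        raise ValueError("theta must be positive")
    if not (0.0 < u <= 1.0 and 0.0 < v <= 1.0):
        raise ValueError("u and v must lie in (0, 1]")
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def clayton_kendall_tau(theta: float) -> float:
    """Kendall's tau implied by a Clayton copula: theta / (theta + 2)."""
    return theta / (theta + 2.0)


# Setting v = 1 recovers the margin: C(u, 1) = u.
joint = clayton_copula(0.5, 1.0, 2.0)
tau = clayton_kendall_tau(2.0)
```

Larger values of `theta` give stronger lower-tail dependence, and the copula approaches independence as `theta` tends to zero.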

9.

A Bayesian approach to mixture cure models with spatial frailties for population‐based cancer relative survival data

Binbing Yu Ram C. Tiwari 《Revue canadienne de statistique》2012,40(1):40-54

As the treatments of cancer progress, a certain number of cancers are curable if diagnosed early. In population‐based cancer survival studies, cure is said to occur when the mortality rate of the cancer patients returns to the same level as that expected for the general cancer‐free population. The estimates of the cure fraction are of interest to both cancer patients and health policy makers. Mixture cure models have been widely used because the model is easy to interpret by separating the patients into two distinct groups. Usually parametric models are assumed for the latent distribution for the uncured patients. The estimation of the cure fraction from the mixture cure model may be sensitive to misspecification of the latent distribution. We propose a Bayesian approach to the mixture cure model for population‐based cancer survival data, which can be extended to county‐level cancer survival data. Instead of modeling the latent distribution by a fixed parametric distribution, we use a finite mixture of the union of the lognormal, loglogistic, and Weibull distributions. The parameters are estimated using the Markov chain Monte Carlo method. A simulation study shows that the Bayesian method using a finite mixture latent distribution provides robust inference of parameter estimates. The proposed Bayesian method is applied to relative survival data for colon cancer patients from the Surveillance, Epidemiology, and End Results (SEER) Program to estimate the cure fractions. The Canadian Journal of Statistics 40: 40–54; 2012 © 2012 Statistical Society of Canada
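The two-group interpretation described in this abstract corresponds to the generic mixture cure form S(t) = π + (1 − π)·S_u(t), where π is the cure fraction and S_u is the survival function of the uncured patients. The sketch below assumes a single Weibull distribution for S_u purely for illustration; the paper itself uses a finite mixture of lognormal, loglogistic, and Weibull components, and all names and parameter values here are hypothetical.

```python
import math


def weibull_survival(t: float, shape: float, scale: float) -> float:
    """Weibull survival function exp(-(t / scale)^shape) for t >= 0."""
    return math.exp(-((t / scale) ** shape))


def mixture_cure_survival(t: float, cure_fraction: float,
                          shape: float, scale: float) -> float:
    """Population survival: cured patients never fail, uncured follow a Weibull."""
    if not 0.0 <= cure_fraction <= 1.0:
        raise ValueError("cure_fraction must lie in [0, 1]")
    return cure_fraction + (1.0 - cure_fraction) * weibull_survival(t, shape, scale)


# Illustrative parameters (assumed, not taken from the paper).
survival_at_5_years = mixture_cure_survival(5.0, cure_fraction=0.3,
                                            shape=1.5, scale=5.0)
```

As t grows, the survival curve levels off at the cure fraction, which is why a plateau in the observed survival curve is what identifies π in practice.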

10.

Estimating class-specific parametric models using finite mixtures: an application to a hedonic model of wine prices

Steven B. Caudill 《Journal of applied statistics》2016,43(7):1253-1261

Hedonic price models are commonly used in the study of markets for various goods, most notably those for wine, art, and jewelry. These models were developed to estimate implicit prices of product attributes within a given product class, where in the case of some goods, such as wine, substantial product differentiation exists. To address this issue, recent research on wine prices employs local polynomial regression clustering (LPRC) for estimating regression models under class uncertainty. This study demonstrates that a superior empirical approach – estimation of a mixture model – is applicable to a hedonic model of wine prices, provided only that the dependent variable in the model is rescaled. The present study also catalogues several of the advantages over LPRC modeling of estimating mixture models.

11.

Bayesian Semiparametric Modelling in Quantile Regression

ATHANASIOS KOTTAS MILOVAN KRNJAJIĆ 《Scandinavian Journal of Statistics》2009,36(2):297-319

We propose a Bayesian semiparametric methodology for quantile regression modelling. In particular, working with parametric quantile regression functions, we develop Dirichlet process mixture models for the error distribution in an additive quantile regression formulation. The proposed non‐parametric prior probability models allow the shape of the error density to adapt to the data and thus provide more reliable predictive inference than models based on parametric error distributions. We consider extensions to quantile regression for data sets that include censored observations. Moreover, we employ dependent Dirichlet processes to develop quantile regression models that allow the error distribution to change non‐parametrically with the covariates. Posterior inference is implemented using Markov chain Monte Carlo methods. We assess and compare the performance of our models using both simulated and real data sets.

12.

Discussion on the paper by Peter Müller, Fernando A. Quintana, and Garritt L. Page

Seongil Jo Jaeyong Lee 《Statistical Methods and Applications》2018,27(2):227-230

The article by Müller, Quintana, and Page reviews a variety of Bayesian nonparametric models and demonstrates them in a few applications. They emphasize applications to spatial data, on which our discussion also focuses. In particular, we consider two types of mixture models based on species sampling models (SSM) for spatial clustering and apply them to the Chilean mathematics testing score data analyzed by the authors. We conclude that only the mixture model of SSM with spatial locations as part of the observations renders spatially non-overlapping clusters.

13.

Computational aspects of the EM algorithm for spatial econometric models with missing data

Thomas Suesse Andrew Zammit-Mangion 《Journal of Statistical Computation and Simulation》2017,87(9):1767-1786

Maximum likelihood (ML) estimation with spatial econometric models is a long-standing problem that finds application in several areas of economic importance. The problem is particularly challenging in the presence of missing data, since there is an implied dependence between all units, irrespective of whether they are observed or not. Out of the several approaches adopted for ML estimation in this context, that of LeSage and Pace [Models for spatially dependent missing data. J Real Estate Financ Econ. 2004;29(2):233–254] stands out as one of the most commonly used with spatial econometric models due to its ability to scale with the number of units. Here, we review their algorithm, and consider several similar alternatives that are also suitable for large datasets. We compare the methods through an extensive empirical study and conclude that, while the approximate approaches are suitable for large sampling ratios, for small sampling ratios the only reliable algorithms are those that yield exact ML or restricted ML estimates.

14.

Inference for step-stress accelerated life tests

Moshe Shaked Nozer D. Singpurwalla 《Journal of statistical planning and inference》1983,7(4):295-306

In this paper we consider the more realistic aspect of accelerated life testing wherein the stress on an unfailed item is allowed to increase at a preassigned test time. Such tests are known as step-stress tests. Our approach is nonparametric in that we do not make any assumptions about the underlying distribution of life lengths. We introduce a model for step-stress testing which is based on the ideas of shock models and of wear processes. This model unifies and generalizes two previously proposed models for step-stress testing. We propose an estimator for the life distribution under use-condition stress and show that this estimator is strongly consistent.

15.

Heterogeneity in Consumer Price Stickiness

《Journal of Business & Economic Statistics》2013,31(3):247-264

We examine heterogeneity in price stickiness using a large, original, set of individual price data collected at the retail level for the computation of the French consumer price index. For that purpose, we estimate at a very high level of disaggregation, a piecewise-constant hazard model, as well as competing-risks duration models that distinguish between price increases, price decreases, and product replacements. The main findings are the following: (a) at the product–outlet-type level, the baseline hazard function of a price spell is nondecreasing; (b) cross-product and cross-outlet-type heterogeneity is pervasive, both in the shape and the level of the hazard function as well as in the impact of covariates; (c) there is strong evidence of state dependence, especially for price increases; (d) there is an asymmetry because determinants of price increases differ from those of price decreases.
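To make the piecewise-constant hazard idea in this abstract concrete, the sketch below computes the survival function of a price spell when the hazard is constant within each interval, so that S(t) is the exponential of minus the accumulated hazard. The cut points, hazard values, and function name are hypothetical and are not estimates from the paper.

```python
import math
from typing import Sequence


def piecewise_survival(t: float, cut_points: Sequence[float],
                       hazards: Sequence[float]) -> float:
    """S(t) = exp(-cumulative hazard), hazard constant between cut points."""
    if len(hazards) != len(cut_points) + 1:
        raise ValueError("need exactly one hazard per interval")
    # Intervals are [0, c1), [c1, c2), ..., [ck, infinity).
    edges = [0.0, *cut_points, math.inf]
    cumulative = 0.0
    for j, hazard in enumerate(hazards):
        start, end = edges[j], edges[j + 1]
        if t <= start:
            break
        # Add hazard times the time actually spent in this interval.
        cumulative += hazard * (min(t, end) - start)
    return math.exp(-cumulative)


# Assumed example: hazard 0.1 per month up to month 2, then 0.3 per month.
prob_spell_exceeds_3 = piecewise_survival(3.0, cut_points=[2.0], hazards=[0.1, 0.3])
```

A nondecreasing baseline hazard, as reported in finding (a), would correspond to a `hazards` sequence that never decreases from one interval to the next.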

16.

Evaluation of missing data mechanisms in two and three dimensional incomplete tables

Sayan Ghosh Palaniappan Vellaisamy 《Journal of the Korean Statistical Society》2019,48(2):297-313

The analysis of incomplete contingency tables is a practical and interesting problem. In this paper, we provide characterizations for the various missing mechanisms of a variable in terms of response and non-response odds for two and three dimensional incomplete tables. Log-linear parametrization and some distinctive properties of the missing data models for the above tables are discussed. All possible cases in which data on one, two or all variables may be missing are considered. We study the missingness of each variable in a model, which is more insightful for analyzing cross-classified data than the missingness of the outcome vector. For sensitivity analysis of the incomplete tables, we propose easily verifiable procedures to evaluate the missing at random (MAR), missing completely at random (MCAR) and not missing at random (NMAR) assumptions of the missing data models. These methods depend only on joint and marginal odds computed from fully and partially observed counts in the tables, respectively. Finally, some real-life datasets are analyzed to illustrate our results, which are confirmed based on simulation studies.

17.

Efficient Bayesian analysis of multiple changepoint models with dependence across segments

Paul Fearnhead Zhen Liu 《Statistics and Computing》2011,21(2):217-229

We consider Bayesian analysis of a class of multiple changepoint models. While there are a variety of efficient ways to analyse these models if the parameters associated with each segment are independent, there are few general approaches for models where the parameters are dependent. Under the assumption that the dependence is Markov, we propose an efficient online algorithm for sampling from an approximation to the posterior distribution of the number and position of the changepoints. In a simulation study, we show that the approximation introduced is negligible. We illustrate the power of our approach through fitting piecewise polynomial models to data, under a model which allows for either continuity or discontinuity of the underlying curve at each changepoint. This method is competitive with, or outperforms, other methods for inferring curves from noisy data; and uniquely it allows for inference of the locations of discontinuities in the underlying curve.

18.

A dependent Dirichlet process model for survival data with competing risks

Yushu Shi Purushottam Laud Joan Neuner 《Lifetime data analysis》2021,27(1):156-176

In this paper, we first propose a dependent Dirichlet process (DDP) model using a mixture of Weibull models with each mixture component resembling a Cox model for survival data. We then build a Dirichlet process mixture model for competing risks data without regression covariates. Next we extend this model to a DDP model for competing risks regression data by using a multiplicative covariate effect on subdistribution hazards in the mixture components. Though built on proportional hazards (or subdistribution hazards) models, the proposed nonparametric Bayesian regression models do not require the assumption of constant hazard (or subdistribution hazard) ratio. An external time-dependent covariate is also considered in the survival model. After describing the model, we discuss how both cause-specific and subdistribution hazard ratios can be estimated from the same nonparametric Bayesian model for competing risks regression. For use with the regression models proposed, we introduce an omnibus prior that is suitable when little external information is available about covariate effects. Finally we compare the models’ performance with existing methods through simulations. We also illustrate the proposed competing risks regression model with data from a breast cancer study. An R package “DPWeibull” implementing all of the proposed methods is available at CRAN.

19.

Statistical properties of parametric estimators for Markov chain vectors based on copula models

Wende Yi Stephen Shaoyi Liao 《Journal of statistical planning and inference》2010

To estimate and measure risks, two key classes of dependence relationship must be identified: temporal dependence and contemporaneous dependence. In this paper, we propose a parametric estimation model that uses a three-stage pseudo maximum likelihood estimation (3SPMLE), and we investigate the consistency and asymptotic normality of parametric estimators. The proposed model combines the concept of a copula and the methods of parametric estimators of two-stage pseudo maximum likelihood estimation (2SPMLE). The selection of a copula model that best captures the dependence structure is a critical problem. To solve this problem, we propose a model selection method that is based on the parametric pseudo-likelihood ratio under the 3SPMLE for stationary Markov vector-type models.

20.

Ordered ranked set samples and applications to inference

N. Balakrishnan T. Li 《Journal of statistical planning and inference》2008

Ranked set sampling (RSS) was first proposed by McIntyre [1952. A method for unbiased selective sampling, using ranked sets. Australian J. Agricultural Res. 3, 385–390] as an effective way to estimate the unknown population mean. Chuiv and Sinha [1998. On some aspects of ranked set sampling in parametric estimation. In: Balakrishnan, N., Rao, C.R. (Eds.), Handbook of Statistics, vol. 17. Elsevier, Amsterdam, pp. 337–377] and Chen et al. [2004. Ranked Set Sampling: Theory and Application. Lecture Notes in Statistics, vol. 176. Springer, New York] have provided excellent surveys of RSS and various inferential results based on RSS. In this paper, we use the idea of order statistics from independent and non-identically distributed (INID) random variables to propose ordered ranked set sampling (ORSS) and then develop optimal linear inference based on ORSS. We determine the best linear unbiased estimators based on ORSS (BLUE-ORSS) and show that they are more efficient than BLUE-RSS for the two-parameter exponential, normal and logistic distributions. Although this is not the case for the one-parameter exponential distribution, the relative efficiency of the BLUE-ORSS (to BLUE-RSS) is very close to 1. Furthermore, we compare both BLUE-ORSS and BLUE-RSS with the BLUE based on order statistics from a simple random sample (BLUE-OS). We show that BLUE-ORSS is uniformly better than BLUE-OS, while BLUE-RSS is not as efficient as BLUE-OS for small sample sizes (n < 5).