Similar Documents (20 results found)
1.
Maximum Likelihood Estimations and EM Algorithms with Length-biased Data
Length-biased sampling has been well recognized in economics, industrial reliability, etiology, epidemiology, genetics, and cancer screening studies. Length-biased right-censored data have a unique structure different from traditional survival data, and the nonparametric and semiparametric estimation and inference methods for traditional survival data are not directly applicable to them. We propose new expectation-maximization algorithms for estimation based on full likelihoods involving infinite-dimensional parameters under three settings for length-biased data: estimating the nonparametric distribution function, estimating the nonparametric hazard function under an increasing-failure-rate constraint, and jointly estimating the baseline hazard function and the covariate coefficients under the Cox proportional hazards model. Extensive simulation studies show that the maximum likelihood estimators perform well with moderate sample sizes and are more efficient than estimating-equation approaches. The proposed estimates are also more robust to various right-censoring mechanisms. We prove the strong consistency of the estimators and establish the asymptotic normality of the semiparametric maximum likelihood estimators under the Cox model using modern empirical process theory. We apply the proposed methods to a prevalent cohort medical study. Supplemental materials are available online.
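As a rough numerical illustration of the sampling bias addressed above (a minimal sketch, not the authors' EM algorithm): under pure length-biased sampling the observed density is y·f(y)/E[Y], so a naive sample mean overestimates E[Y], while the harmonic mean of the biased sample recovers it. All numbers below are simulated for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# True lifetimes Y ~ Uniform(1, 3), so E[Y] = 2.  Under length-biased
# sampling the observed density is y * f(y) / E[Y]; we draw from it by
# inverting its CDF, F(y) = (y**2 - 1) / 8 on [1, 3].
u = rng.uniform(size=200_000)
biased = np.sqrt(1.0 + 8.0 * u)

naive_mean = biased.mean()                    # biased upward (26/12 ~ 2.17)
corrected_mean = 1.0 / np.mean(1.0 / biased)  # harmonic-mean correction
```

The same inverse-weighting idea is what the full-likelihood methods above formalize and improve upon.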

2.
Bayesian methods are often used to reduce the sample sizes and/or increase the power of clinical trials. The right choice of the prior distribution is a critical step in Bayesian modeling. If the prior is not completely specified, historical data may be used to estimate it. In empirical Bayesian analysis, the resulting prior can be used to produce the posterior distribution. In this paper, we describe a Bayesian Poisson model with a conjugate Gamma prior. The parameters of the Gamma distribution are estimated in the empirical Bayesian framework under two estimation schemes. The straightforward numerical search for the maximum likelihood (ML) solution using the marginal negative binomial distribution is occasionally infeasible, so we propose a simplification of the maximization procedure. The Markov chain Monte Carlo method is used to create a set of Poisson parameters from the historical count data; these Poisson parameters uniquely define the Gamma likelihood function, and easily computable approximation formulae may then be used to find the ML estimates of the Gamma parameters. For the sample size calculations, the ML solution is replaced by its upper confidence limit to reflect the incomplete exchangeability of historical trials with current studies; the exchangeability is measured by the confidence interval for the historical event rate. With this prior, the formula for the sample size calculation is completely defined. Published in 2009 by John Wiley & Sons, Ltd.
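A toy sketch of the Gamma–Poisson conjugacy underlying this approach, using method-of-moments estimates of the Gamma prior in place of the paper's ML search (the counts and settings are invented):

```python
import numpy as np

# Historical event counts, each count ~ Poisson(lambda_i) with
# lambda_i ~ Gamma(a, rate=b); overdispersion identifies (a, b).
counts = np.array([1, 8, 3, 12, 5, 2, 9, 4, 15, 6], dtype=float)

m, v = counts.mean(), counts.var(ddof=1)
b_hat = m / max(v - m, 1e-9)   # rate parameter of the Gamma prior
a_hat = m * b_hat              # shape parameter

# Conjugate update for a new trial: total count s over n exposure units
s, n = 12, 3
a_post, b_post = a_hat + s, b_hat + n
posterior_mean = a_post / b_post   # shrinks the new rate toward the prior
```

The posterior mean lands between the new trial's raw rate (s/n = 4) and the historical prior mean, which is the shrinkage the empirical Bayes prior provides.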

3.
Both approximate Bayesian computation (ABC) and composite likelihood methods are useful for Bayesian and frequentist inference, respectively, when the likelihood function is intractable. We propose to use composite likelihood score functions as summary statistics in ABC in order to obtain accurate approximations to the posterior distribution. This is motivated by the use of the score function of the full likelihood, and extended to general unbiased estimating functions in complex models. Moreover, we show that if the composite score is suitably standardised, the resulting ABC procedure is invariant to reparameterisations and automatically adjusts the curvature of the composite likelihood, and of the corresponding posterior distribution. The method is illustrated through examples with simulated data, and an application to modelling of spatial extreme rainfall data is discussed.
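The rejection-ABC mechanism with a score-type summary can be sketched on a tractable toy model, a normal mean where the full score is available (the tolerance, prior, and all data below are arbitrary illustrative choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

# Observed data from N(theta0, 1); the score of the likelihood evaluated
# at a pilot value serves as the ABC summary statistic.
theta0 = 2.0
obs = rng.normal(theta0, 1.0, size=100)
pilot = np.median(obs)

def score(x, theta):        # d/dtheta of the N(theta, 1) log-likelihood
    return np.sum(x - theta)

s_obs = score(obs, pilot)

# ABC rejection: draw theta from a flat prior, simulate data, keep draws
# whose simulated score is close to the observed score.
draws = rng.uniform(-5, 5, size=50_000)
kept = []
for th in draws:
    sim = rng.normal(th, 1.0, size=obs.size)
    if abs(score(sim, pilot) - s_obs) < 2.0:
        kept.append(th)
kept = np.array(kept)
posterior_mean = kept.mean()
```

The accepted draws concentrate near the sample mean of `obs`, approximating the posterior without ever evaluating the likelihood directly.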

4.
Kontkanen, P., Myllymäki, P., Silander, T., Tirri, H., Grünwald, P. Statistics and Computing (2000) 10(1): 39–54
In this paper we are interested in discrete prediction problems in a decision-theoretic setting, where the task is to compute the predictive distribution for a finite set of possible alternatives. This question is first addressed in a general Bayesian framework, where we consider a set of probability distributions defined by some parametric model class. Given a prior distribution on the model parameters and a set of sample data, one possible approach for determining a predictive distribution is to fix the parameters to the instantiation with the maximum a posteriori probability. A more accurate predictive distribution can be obtained by computing the evidence (marginal likelihood), i.e., the integral over all the individual parameter instantiations. As an alternative to these two approaches, we demonstrate how to use Rissanen's new definition of stochastic complexity for determining predictive distributions, and show how the evidence predictive distribution with Jeffreys' prior approaches the new stochastic complexity predictive distribution in the limit with increasing amount of sample data. To compare the alternative approaches in practice, each of the predictive distributions discussed is instantiated in the Bayesian network model family case. In particular, to determine Jeffreys' prior for this model family, we show how to compute the (expected) Fisher information matrix for a fixed but arbitrary Bayesian network structure. In the empirical part of the paper the predictive distributions are compared using the simple tree-structured Naive Bayes model, chosen for computational reasons. Experimentation with several public-domain classification datasets suggests that the evidence approach produces the most accurate predictions in the log-score sense. The evidence-based methods are also quite robust in the sense that they predict surprisingly well even when only a small fraction of the full training set is used.
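The gap between the MAP plug-in predictive and the evidence (marginal-likelihood) predictive is easy to see in the simplest conjugate case, a Bernoulli model with a uniform Beta(1, 1) prior (a toy illustration, not the Bayesian-network setting of the paper):

```python
# Bernoulli model, Beta(1, 1) prior: compare the MAP plug-in predictive
# with the evidence predictive for the probability that the next
# observation is a success, after seeing 7 successes in 10 trials.
heads, n = 7, 10

# MAP plug-in: fix p at the posterior mode of Beta(heads+1, n-heads+1),
# which for a uniform prior is simply heads / n.
p_map = heads / n
plugin_pred = p_map                  # P(next = 1 | p_map) = 0.7

# Evidence-based: integrate over p, giving Laplace's rule of succession.
evidence_pred = (heads + 1) / (n + 2)   # 8/12, pulled toward 1/2
```

The evidence predictive is more conservative than the plug-in, and the difference shrinks as the sample grows, mirroring the asymptotic agreement discussed above.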

5.
Testing the equality of two survival distributions can be difficult in a prevalent cohort study when nonrandom sampling of subjects is involved. Due to the biased sampling scheme, the independent censoring assumption is often violated. Although the biased inference caused by length-biased sampling has been widely recognized in the statistical, epidemiological and economic literature, there is no satisfactory solution for efficient two-sample testing. We propose an asymptotically most efficient nonparametric test that properly adjusts for length-biased sampling. The test statistic is derived from a full likelihood function and can be generalized from the two-sample test to a k-sample test. The asymptotic properties of the test statistic under the null hypothesis are derived using its asymptotic independent and identically distributed representation. We conduct extensive Monte Carlo simulations to evaluate the performance of the proposed test statistics and compare them with the conditional test and the standard logrank test under different biased sampling schemes and right-censoring mechanisms. For length-biased data, empirical studies demonstrate that the proposed test is substantially more powerful than the existing methods. For general left-truncated data, the proposed test is robust, maintains accurate control of the type I error rate, and is also more powerful than the existing methods when the truncation and right-censoring patterns are the same between the groups. We illustrate the methods using two real data examples.

6.
Bivariate exponential models have often been used for the analysis of competing risks data involving two correlated risk components. Competing risks data consist only of the time to failure and the cause of failure. In situations where there is positive probability of simultaneous failure, possibly the most widely used model is the Marshall–Olkin (J. Amer. Statist. Assoc. 62 (1967) 30) bivariate lifetime model. This distribution is not absolutely continuous, as it involves a singular component; however, the likelihood function based on competing risks data is then identifiable, and any inference, Bayesian or frequentist, can be carried out in a straightforward manner. For the analysis of absolutely continuous bivariate exponential models, standard approaches often run into difficulty due to the lack of a fully identifiable likelihood (Basu and Ghosh; Commun. Statist. Theory Methods 9 (1980) 1515). To overcome the nonidentifiability, the usual frequentist approach is based on an integrated likelihood. Such an approach is implicit in Wada et al. (Calcutta Statist. Assoc. Bull. 46 (1996) 197), who proved some related asymptotic results. We offer in this paper an alternative Bayesian approach. Since systematic prior elicitation is often difficult, the present study focuses on Bayesian analysis with noninformative priors. It turns out that with an appropriate reparameterization, standard noninformative priors such as Jeffreys' prior and its variants can be applied directly even though the likelihood is not fully identifiable. Two noninformative priors are developed, consisting of Laplace's prior for the nonidentifiable parameters and Laplace's and Jeffreys' priors for the identifiable parameters. The resulting Bayesian procedures possess some frequentist optimality properties as well. Finally, these Bayesian methods are illustrated with analyses of a data set from a lung cancer clinical trial conducted by the Eastern Cooperative Oncology Group.
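The singular component of the Marshall–Olkin model comes from a shared shock; a short simulation (with illustrative parameter values) shows the positive probability of simultaneous failure, which equals λ₃/(λ₁+λ₂+λ₃):

```python
import numpy as np

rng = np.random.default_rng(2)

# Marshall-Olkin bivariate exponential via independent shocks:
# X = min(Z1, Z3), Y = min(Z2, Z3) with Z_k ~ Exp(lambda_k).
# The shared shock Z3 is what makes P(X == Y) > 0.
lam1, lam2, lam3 = 1.0, 1.5, 0.5
n = 100_000
z1 = rng.exponential(1 / lam1, n)
z2 = rng.exponential(1 / lam2, n)
z3 = rng.exponential(1 / lam3, n)
x, y = np.minimum(z1, z3), np.minimum(z2, z3)

prob_tie = np.mean(x == y)   # theory: lam3 / (lam1 + lam2 + lam3)
```

With these rates the theoretical tie probability is 0.5/3 ≈ 0.167; an absolutely continuous bivariate exponential would give `prob_tie` equal to zero, which is why the two families need different likelihood treatments.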

7.
Multivariate model validation is a complex decision-making problem involving the comparison of multiple correlated quantities, based upon the available information and prior knowledge. This paper presents a Bayesian risk-based decision method for validation assessment of multivariate predictive models under uncertainty. A generalized likelihood ratio is derived as a quantitative validation metric based on Bayes' theorem and a Gaussian distribution assumption for the errors between validation data and model prediction. The multivariate model is then assessed by comparing the likelihood ratio with a Bayesian decision threshold, a function of the decision costs and the prior probability of each hypothesis. The probability density function of the likelihood ratio is constructed using the statistics of multiple response quantities and Monte Carlo simulation. The proposed methodology is implemented in the validation of a transient heat conduction model, using a multivariate data set from experiments. The Bayesian methodology provides a quantitative approach to facilitate rational decisions in multivariate model assessment under uncertainty.
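A minimal sketch of such a likelihood-ratio check (a simplified stand-in for the paper's metric: mean-zero versus observed-mean Gaussian errors, with invented data and an assumed cost/prior threshold):

```python
import numpy as np

rng = np.random.default_rng(3)

# Validation errors e = data - prediction, assumed multivariate normal.
# H0 (model valid): mean error is zero.  H1: mean error is ebar, the
# observed average error.  Covariance is estimated from the errors.
errors = rng.normal(0.0, 0.2, size=(30, 3))   # 30 runs, 3 responses

sigma = np.cov(errors, rowvar=False)
ebar = errors.mean(axis=0)
n = errors.shape[0]

# log [p(errors | mean 0) / p(errors | mean ebar)] with fixed covariance;
# algebra reduces it to a quadratic form in ebar (always <= 0).
inv = np.linalg.inv(sigma)
log_lr = -0.5 * n * (ebar @ inv @ ebar)

# Bayesian decision: accept the model if the log ratio clears a threshold
# built from (assumed) decision costs and hypothesis priors.
threshold = np.log(0.1)          # illustrative value, not from the paper
model_accepted = bool(log_lr > threshold)
```

The closer `log_lr` is to zero, the less the observed mean error distinguishes the model from the data; the threshold encodes how costly a wrong acceptance is.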

8.
Empirical Bayes is a versatile approach to "learn from a lot" in two ways: first, from a large number of variables and, second, from a potentially large amount of prior information, for example, stored in public repositories. We review applications of a variety of empirical Bayes methods to several well-known model-based prediction methods, including penalized regression, linear discriminant analysis, and Bayesian models with sparse or dense priors. We discuss "formal" empirical Bayes methods that maximize the marginal likelihood but also more informal approaches based on other data summaries. We contrast empirical Bayes to cross-validation and full Bayes and discuss hybrid approaches. To study the relation between the quality of an empirical Bayes estimator and p, the number of variables, we consider a simple empirical Bayes estimator in a linear model setting. We argue that empirical Bayes is particularly useful when the prior contains multiple parameters, which model a priori information on variables termed "co-data". In particular, we present two novel examples that allow for co-data: first, a Bayesian spike-and-slab setting that facilitates inclusion of multiple co-data sources and types and, second, a hybrid empirical Bayes-full Bayes ridge regression approach for estimation of the posterior predictive interval.
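The "learning from many variables" idea can be sketched with the classical normal-means problem, where the marginal-likelihood estimate of the prior variance has a closed form (a toy example chosen for tractability, not one of the review's methods):

```python
import numpy as np

rng = np.random.default_rng(4)

# Many means observed with known unit noise: x_i ~ N(theta_i, 1), with
# theta_i ~ N(0, tau2).  Marginally x_i ~ N(0, 1 + tau2), so maximizing
# the marginal likelihood over tau2 gives a closed-form estimate.
p = 5_000
tau2_true = 2.0
theta = rng.normal(0.0, np.sqrt(tau2_true), p)
x = theta + rng.normal(0.0, 1.0, p)

tau2_hat = max(np.mean(x**2) - 1.0, 0.0)   # marginal MLE of tau2
shrink = tau2_hat / (1.0 + tau2_hat)
theta_eb = shrink * x                       # posterior means under the EB prior
```

Because `tau2_hat` is estimated from all p variables at once, the shrinkage estimator beats the raw observations in mean squared error, which is the "learn from a lot" effect the review describes.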

9.
The Bayesian analysis based on the partial likelihood for Cox's proportional hazards model is frequently used because of its simplicity. The Bayesian partial likelihood approach is often justified by showing that it approximates the full Bayesian posterior of the regression coefficients with a diffuse prior on the baseline hazard function. This, however, may not be appropriate when ties exist among uncensored observations. In that case, the full Bayesian and Bayesian partial likelihood posteriors can be much different. In this paper, we propose a new Bayesian partial likelihood approach for many tied observations and justify its use.

10.
This paper considers multiple change-point estimation for the exponential distribution with truncated and censored data by Gibbs sampling. After the missing data of interest are filled in by sampling methods such as rejection sampling, the complete-data likelihood function is obtained. The full conditional distributions of all parameters are discussed, and the means of the Gibbs samples are taken as the Bayesian estimates of the parameters. The implementation steps of the Gibbs sampler are described in detail. Finally, a random simulation study is conducted, and the results show that the Bayesian estimates are fairly accurate.
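A stripped-down version of such a Gibbs sampler, for a single change point with complete (uncensored, untruncated) data and conjugate Gamma priors, illustrates the alternation between the rate parameters and the change point's full conditional (all settings invented):

```python
import numpy as np

rng = np.random.default_rng(5)

# Exponential observations whose rate changes at index k0 = 60.
n, k0 = 100, 60
data = np.concatenate([rng.exponential(1 / 2.0, k0),       # rate 2 before
                       rng.exponential(1 / 0.5, n - k0)])  # rate 0.5 after

a, b = 0.01, 0.01                 # weak Gamma(a, rate=b) priors on both rates
k = n // 2                        # initial change point
k_samples, iters, burn = [], 2000, 500
cs = np.cumsum(data)
ks = np.arange(1, n)              # admissible change points

for it in range(iters):
    # Gamma full conditionals for the two rates given the split at k
    s1, s2 = data[:k].sum(), data[k:].sum()
    th1 = rng.gamma(a + k, 1 / (b + s1))
    th2 = rng.gamma(a + n - k, 1 / (b + s2))
    # discrete full conditional of k: log-likelihood at every split
    loglik = (ks * np.log(th1) - th1 * cs[ks - 1]
              + (n - ks) * np.log(th2) - th2 * (cs[-1] - cs[ks - 1]))
    w = np.exp(loglik - loglik.max())
    k = rng.choice(ks, p=w / w.sum())
    if it >= burn:
        k_samples.append(k)

k_hat = np.mean(k_samples)        # posterior mean change point
```

The paper's setting adds multiple change points and a data-augmentation step for the truncated and censored observations, but the alternation pattern is the same.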

11.
In semiparametric inference we distinguish between the parameter of interest, which may be a location parameter, and a nuisance parameter that determines the remaining shape of the sampling distribution. As was pointed out by Diaconis and Freedman, the main problem in semiparametric Bayesian inference is to obtain a consistent posterior distribution for the parameter of interest. The present paper considers a semiparametric Bayesian method based on a pivotal likelihood function. It is shown that when the parameter of interest is the median, this method produces a consistent posterior distribution and is easily implemented. Numerical comparisons with classical methods and with Bayesian methods based on a Dirichlet prior are provided. It is also shown that in the case of symmetric intervals, the classical confidence coefficients have a Bayesian interpretation as the limiting posterior probability of the interval based on the Dirichlet prior with a parameter that converges to zero.

12.
Gene copy number (GCN) changes are common characteristics of many genetic diseases. Comparative genomic hybridization (CGH) is a new technology widely used today to screen for GCN changes in mutant cells with high resolution genome-wide. Statistical methods for analyzing such CGH data have been evolving. Existing methods are either frequentist or fully Bayesian. The former often has a computational advantage, while the latter can incorporate prior information into the model but can be misleading when sound prior information is unavailable. In an attempt to combine the advantages of both approaches, we develop a Bayesian-frequentist hybrid approach, in which a subset of the model parameters is inferred by the Bayesian method and the remaining parameters by the frequentist method. This hybrid approach offers advantages over either method used alone, especially when sound prior information is available on part of the parameters and the sample size is relatively small. Spatial dependence and the false discovery rate are also discussed, and the parameter estimation is efficient. As an illustration, we use the proposed hybrid approach to analyze a real CGH data set.

13.
14.
李小胜, 王申令. 《统计研究》(Statistical Research), 2016, 33(11): 85-92
This paper first constructs the sample likelihood function of a multivariate linear regression model under linear constraints and uses the Lagrange method to establish its validity. It then discusses, from the likelihood perspective, the effect of the linear constraints on the model parameters, and improves the parameter estimates of classical theory with Bayesian and empirical Bayesian methods. For the Bayesian improvement, the matrix normal-Wishart distribution is taken as the joint conjugate prior for the model parameters and the precision matrix; combining it with the constructed likelihood function yields the posterior distribution, from which the Bayesian estimates are computed. For the empirical Bayesian improvement, the sample is divided into groups, the effect of subsample-based estimates on the full-sample estimate is discussed in terms of variance, and the empirical Bayes estimates are computed. Finally, simulations with random matrices generated in Matlab show that both improved estimators are more accurate than the classical ones, with smaller relative fitting errors and higher reliability; with large data sets, the computation is also faster.
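The Lagrange-multiplier solution for least squares under linear equality constraints, which underlies the construction above, can be sketched directly (simulated data; this is the standard restricted-OLS formula, not the paper's Bayesian refinement):

```python
import numpy as np

rng = np.random.default_rng(6)

# Linear model y = X beta + noise, with an equality constraint R beta = r.
n, p = 200, 4
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, 2.0, -1.0, 0.5])
y = X @ beta_true + rng.normal(0, 0.1, n)

R = np.array([[1.0, 1.0, 0.0, 0.0]])   # constraint: beta_1 + beta_2 = 3
r = np.array([3.0])

# Lagrange-multiplier correction to the unconstrained OLS solution:
# b_con = b_ols - (X'X)^{-1} R' [R (X'X)^{-1} R']^{-1} (R b_ols - r)
XtX_inv = np.linalg.inv(X.T @ X)
b_ols = XtX_inv @ X.T @ y
adj = XtX_inv @ R.T @ np.linalg.solve(R @ XtX_inv @ R.T, R @ b_ols - r)
b_con = b_ols - adj
```

The constrained estimate satisfies `R @ b_con == r` exactly; the paper's contribution is to place a matrix normal-Wishart prior on top of this constrained likelihood.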

15.
In this paper, we discuss a progressively censored inverted exponentiated Rayleigh distribution. Estimation of the unknown parameters is considered under progressive censoring using maximum likelihood and Bayesian approaches. Bayes estimators of the unknown parameters are derived with respect to different symmetric and asymmetric loss functions using gamma prior distributions, and an importance sampling procedure is used to compute these estimates. Further, highest posterior density intervals for the unknown parameters are constructed and, for comparison purposes, bootstrap intervals are also obtained. Prediction of future observations is studied in one- and two-sample situations from both classical and Bayesian viewpoints. We further establish optimum censoring schemes using the Bayesian approach. Finally, we conduct a simulation study to compare the performance of the proposed methods and analyse two real data sets for illustration purposes.

16.
Bayesian inference for multivariate gamma distributions
The paper considers the multivariate gamma distribution, for which the method of moments has been considered the only feasible method of estimation due to the complexity of the likelihood function. With a non-conjugate prior, practical Bayesian analysis can be conducted using Gibbs sampling with data augmentation. The new methods are illustrated using artificial data for a trivariate gamma distribution as well as an application to technical inefficiency estimation.

17.
Multiple-membership logit models with random effects are models for clustered binary data where each statistical unit can belong to more than one group. The likelihood function of these models is analytically intractable. We propose two different approaches for parameter estimation: indirect inference and data cloning (DC). The former is a non-likelihood-based method which uses an auxiliary model to select reasonable estimates. We propose an auxiliary model with the same dimension of parameter space as the target model, which is particularly convenient for reaching good estimates quickly. The latter method computes maximum likelihood estimates through the posterior distribution of an adequate Bayesian model fitted to cloned data. We implement a DC algorithm specifically for multiple-membership models. A Monte Carlo experiment compares the two methods on simulated data. For further comparison, we also report Bayesian posterior means and Integrated Nested Laplace Approximation hybrid DC estimates. Simulations show a negligible loss of efficiency for the indirect inference estimator, compensated by a substantial computational gain. The approaches are then illustrated with two real examples on matched paired data.
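The data-cloning principle, that the posterior given K copies of the data collapses onto the MLE as K grows, can be checked in a conjugate toy model where the cloned posterior is available in closed form (invented data; the paper applies the idea to intractable multiple-membership models):

```python
import numpy as np

# Normal mean with known unit variance and an N(0, 100) prior, so the
# posterior given K clones of the data has a closed-form mean.
x = np.array([1.2, 0.7, 1.9, 1.4, 0.8])
n, prior_var = x.size, 100.0

def cloned_posterior_mean(K):
    # posterior mean of mu given K exact copies of the sample
    prec = 1.0 / prior_var + K * n
    return (K * n * x.mean()) / prec

mle = x.mean()                          # the target of data cloning
approx = cloned_posterior_mean(K=100)   # cloned posterior mean, K = 100
```

As K increases, the prior's contribution to the posterior precision becomes negligible and the cloned posterior mean converges to the MLE; the paper runs MCMC on the cloned model because no closed form exists there.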

18.
The estimation problem for the parameters of the epsilon-skew-normal (ESN) distribution is considered within the Bayesian approach. This family of distributions contains the normal distribution and can be used for analyzing asymmetric and near-normal data. Bayesian estimates under informative and noninformative Jeffreys prior distributions are obtained, and the performance of the ESN family and of these estimates is assessed via a simulation study. A real data set is also used to illustrate the ideas.

19.
In life testing, predicting failure times beyond the largest observed failure time is an important issue. Although the Rayleigh distribution is a suitable model for analyzing the lifetime of components that age rapidly over time, because its failure rate function is an increasing linear function of time, inference for the two-parameter Rayleigh distribution based on upper record values has not been addressed from the Bayesian perspective. This paper provides Bayesian analysis methods by proposing a noninformative prior distribution for analyzing survival data with a two-parameter Rayleigh distribution based on record values. In addition, we provide a pivotal quantity and an algorithm based on it to predict the behavior of future survival records. We show that the proposed method is superior to the frequentist counterpart in terms of mean-squared error and bias through Monte Carlo simulations. For illustrative purposes, survival data on lung cancer patients are analyzed, and it is shown that the proposed model can be a good alternative when prior information is not available.

20.
The maximum likelihood and Bayesian approaches to parameter estimation and prediction of future record values are considered for the two-parameter Burr Type XII distribution based on record values and the numbers of trials following the record values (inter-record times). First, the Bayes estimates are obtained based on a joint bivariate prior for the shape parameters; in this case, they are developed using Lindley's approximation and the Markov Chain Monte Carlo (MCMC) method, owing to the lack of explicit forms under the squared error and linear-exponential loss functions. The MCMC method is also used to construct highest posterior density credible intervals. Second, the Bayes estimates are obtained with respect to a discrete prior for the first shape parameter and a conjugate prior for the other shape parameter. The Bayes and maximum likelihood estimates are compared in terms of estimated risk by Monte Carlo simulation. We further consider non-Bayesian and Bayesian prediction of future lower records arising from the Burr Type XII distribution based on record data; the derived predictors are compared using Monte Carlo simulations. A real data set is analysed for illustration purposes.
