Similar literature
 20 similar documents found
1.
Empirical Bayes is a versatile approach to “learn from a lot” in two ways: first, from a large number of variables and, second, from a potentially large amount of prior information, for example, stored in public repositories. We review applications of a variety of empirical Bayes methods to several well‐known model‐based prediction methods, including penalized regression, linear discriminant analysis, and Bayesian models with sparse or dense priors. We discuss “formal” empirical Bayes methods that maximize the marginal likelihood but also more informal approaches based on other data summaries. We contrast empirical Bayes to cross‐validation and full Bayes and discuss hybrid approaches. To study the relation between the quality of an empirical Bayes estimator and p, the number of variables, we consider a simple empirical Bayes estimator in a linear model setting. We argue that empirical Bayes is particularly useful when the prior contains multiple parameters, which model a priori information on variables termed “co‐data”. In particular, we present two novel examples that allow for co‐data: first, a Bayesian spike‐and‐slab setting that facilitates inclusion of multiple co‐data sources and types and, second, a hybrid empirical Bayes–full Bayes ridge regression approach for estimation of the posterior predictive interval.
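
To illustrate what "maximizing the marginal likelihood" can look like, here is a minimal sketch for a toy normal-means model, which is an assumption made for illustration and not the linear-model estimator studied in the paper. It assumes y_i ~ N(theta_i, noise_var) with prior theta_i ~ N(0, tau^2); the function names `eb_prior_variance` and `eb_shrink` are hypothetical.

```python
def eb_prior_variance(y, noise_var=1.0):
    """Marginal-likelihood estimate of tau^2 in the toy normal-means model."""
    # Marginally y_i ~ N(0, noise_var + tau^2), so the maximum likelihood
    # estimate of the total variance is mean(y_i^2). Subtract the known noise
    # variance and truncate at zero so that tau^2 stays non-negative.
    mean_square = sum(v * v for v in y) / len(y)
    return max(0.0, mean_square - noise_var)


def eb_shrink(y, noise_var=1.0):
    """Posterior means of theta_i with the estimated tau^2 plugged in."""
    tau2 = eb_prior_variance(y, noise_var)
    weight = tau2 / (tau2 + noise_var)  # shrinkage factor in [0, 1)
    return [weight * v for v in y]
```

With more observations (larger p in the abstract's notation), the estimate of tau^2 rests on more data, which is the kind of dependence on p that the abstract refers to.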

2.
Single cohort stage‐frequency data are considered when assessing the stage reached by individuals through destructive sampling. For this type of data, when all hazard rates are assumed constant and equal, Laplace transform methods have been applied in the past to estimate the parameters in each stage‐duration distribution and the overall hazard rates. If hazard rates are not all equal, estimating stage‐duration parameters using Laplace transform methods becomes complex. In this paper, two new models are proposed to estimate stage‐dependent maturation parameters using Laplace transform methods where non‐trivial hazard rates apply. The first model encompasses hazard rates that are constant within each stage but vary between stages. The second model encompasses time‐dependent hazard rates within stages. Moreover, this paper introduces a method for estimating the hazard rate in each stage for the stage‐wise constant hazard rates model. This work presents methods that could be used in specific types of laboratory studies, but the main motivation is to explore the relationships between stage maturation parameters that, in future work, could be exploited in applying Bayesian approaches. The application of the methodology in each model is evaluated using simulated data in order to illustrate the structure of these models.

3.
The most common forecasting methods in business are based on exponential smoothing, and the most common time series in business are inherently non‐negative. Therefore it is of interest to consider the properties of the potential stochastic models underlying exponential smoothing when applied to non‐negative data. We explore exponential smoothing state space models for non‐negative data under various assumptions about the innovations, or error, process. We first demonstrate that prediction distributions from some commonly used state space models may have an infinite variance beyond a certain forecasting horizon. For multiplicative error models that do not have this flaw, we show that sample paths will converge almost surely to zero even when the error distribution is non‐Gaussian. We propose a new model with similar properties to exponential smoothing, but which does not have these problems, and we develop some distributional properties for our new model. We then explore the implications of our results for inference, and compare the short‐term forecasting performance of the various models using data on the weekly sales of over 300 items of costume jewelry. The main findings of the research are that the Gaussian approximation is adequate for estimation and one‐step‐ahead forecasting. However, as the forecasting horizon increases, the approximate prediction intervals become increasingly problematic. When the model is to be used for simulation purposes, a suitably specified scheme must be employed.

4.
A longitudinal mixture model for classifying patients into responders and non‐responders is established using both likelihood‐based and Bayesian approaches. The model takes into consideration responders in the control group. Therefore, it is especially useful in situations where the placebo response is strong, or in equivalence trials where the drug in development is compared with a standard treatment. Under our model, a treatment shows evidence of being effective if it increases the proportion of responders or increases the response rate among responders in the treated group compared with the control group. Therefore, the model has flexibility to accommodate different situations. The proposed method is illustrated using simulation and a depression clinical trial dataset for the likelihood‐based approach, and the same depression clinical trial dataset for the Bayesian approach. The likelihood‐based and Bayesian approaches generated consistent results for the depression trial data. In both the placebo group and the treated group, patients are classified into two components with distinct response rates. The proportion of responders is shown to be significantly higher in the treated group compared with the control group, suggesting the treatment paroxetine is effective. Copyright © 2014 John Wiley & Sons, Ltd.

5.
A Comparison of Frailty and Other Models for Bivariate Survival Data   Cited by: 1 (self-citations: 0, citations by others: 1)
Multivariate survival data arise when each study subject may experience multiple events or when study subjects are clustered into groups. Statistical analyses of such data need to account for the intra-cluster dependence through appropriate modeling. Frailty models are the most popular for such failure time data. However, there are other approaches which model the dependence structure directly. In this article, we compare the frailty models for bivariate data with the models based on bivariate exponential and Weibull distributions. Bayesian methods provide a convenient paradigm for comparing the two sets of models we consider. Our techniques are illustrated using two examples. One simulated example demonstrates model choice methods developed in this paper, and the other example, based on a practical data set of onset of blindness among patients with diabetic retinopathy, considers Bayesian inference using different models.

6.
For clinical trials with time‐to‐event endpoints, predicting the accrual of the events of interest with precision is critical in determining the timing of interim and final analyses. For example, overall survival (OS) is often chosen as the primary efficacy endpoint in oncology studies, with planned interim and final analyses at a pre‐specified number of deaths. Often, correlated surrogate information, such as time‐to‐progression (TTP) and progression‐free survival, is also collected as secondary efficacy endpoints. It would be appealing to borrow strength from the surrogate information to improve the precision of the analysis time prediction. Currently available methods in the literature for predicting analysis timings do not consider utilizing the surrogate information. In this article, using OS and TTP as an example, a general parametric model for OS and TTP is proposed, with the assumption that disease progression could change the course of the overall survival. Progression‐free survival, related both to OS and TTP, will be handled separately, as it can be derived from OS and TTP. The authors seek to develop a prediction procedure using a Bayesian method and provide detailed implementation strategies under certain assumptions. Simulations are performed to evaluate the performance of the proposed method. An application to a real study is also provided. Copyright © 2015 John Wiley & Sons, Ltd.

7.
In biomedical studies, it is of substantial interest to develop risk prediction scores using high-dimensional data such as gene expression data for clinical endpoints that are subject to censoring. In the presence of well-established clinical risk factors, investigators often prefer a procedure that also adjusts for these clinical variables. While accelerated failure time (AFT) models are a useful tool for the analysis of censored outcome data, they assume that covariate effects on the logarithm of time-to-event are linear, which is often unrealistic in practice. We propose to build risk prediction scores through regularized rank estimation in partly linear AFT models, where high-dimensional data such as gene expression data are modeled linearly and important clinical variables are modeled nonlinearly using penalized regression splines. We show through simulation studies that our model has better operating characteristics compared to several existing models. In particular, we show that there is a non-negligible effect on prediction as well as feature selection when nonlinear clinical effects are misspecified as linear. This work is motivated by a recent prostate cancer study, where investigators collected gene expression data along with established prognostic clinical variables and the primary endpoint is time to prostate cancer recurrence. We analyzed the prostate cancer data and evaluated prediction performance of several models based on the extended c statistic for censored data, showing that 1) the relationship between the clinical variable, prostate specific antigen (PSA), and the prostate cancer recurrence is likely nonlinear, i.e., the time to recurrence decreases as PSA increases and it starts to level off when PSA becomes greater than 11; 2) correct specification of this nonlinear effect improves performance in prediction and feature selection; and 3) addition of gene expression data does not seem to further improve the performance of the resultant risk prediction scores.

8.
Bayesian sequential and adaptive randomization designs are gaining popularity in clinical trials thanks to their potential to reduce the number of required participants and save resources. We propose a Bayesian sequential design with adaptive randomization rates so as to more efficiently attribute newly recruited patients to different treatment arms. In this paper, we consider 2‐arm clinical trials. Patients are allocated to the 2 arms with a randomization rate to achieve minimum variance for the test statistic. Algorithms are presented to calculate the optimal randomization rate, critical values, and power for the proposed design. Sensitivity analysis is implemented to check the influence on design by changing the prior distributions. Simulation studies are applied to compare the proposed method and traditional methods in terms of power and actual sample sizes. Simulations show that, when total sample size is fixed, the proposed design can obtain greater power and/or require a smaller actual sample size than the traditional Bayesian sequential design. Finally, we apply the proposed method to a real data set and compare the results with the Bayesian sequential design without adaptive randomization in terms of sample sizes. The proposed method can further reduce required sample size.
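
As a worked illustration of a variance-minimizing randomization rate, consider the simplest case, which is an assumption here because the abstract does not specify the test statistic: a difference in sample means between two arms with known standard deviations $\sigma_1$ and $\sigma_2$, total sample size $N$, and a fraction $r$ allocated to arm 1. The symbols are introduced for illustration only.

```latex
\operatorname{Var}\left(\bar{Y}_1 - \bar{Y}_2\right)
  = \frac{\sigma_1^2}{rN} + \frac{\sigma_2^2}{(1-r)N},
\qquad
\frac{d}{dr}\operatorname{Var} = 0
\;\Longrightarrow\;
\frac{\sigma_1^2}{r^2} = \frac{\sigma_2^2}{(1-r)^2}
\;\Longrightarrow\;
r^{*} = \frac{\sigma_1}{\sigma_1 + \sigma_2}.
```

Under this simplification the more variable arm receives more patients, and equal variances give back the usual 1:1 allocation. The design in the paper may use a different statistic and therefore a different optimal rate.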

9.
We present a scalable Bayesian modelling approach for identifying brain regions that respond to a certain stimulus and use them to classify subjects. More specifically, we deal with multi‐subject electroencephalography (EEG) data with a binary response distinguishing between alcoholic and control groups. The covariates are matrix‐variate with measurements taken from each subject at different locations across multiple time points. EEG data have a complex structure with both spatial and temporal attributes. We use a divide‐and‐conquer strategy and build separate local models, that is, one model at each time point. We employ Bayesian variable selection approaches using a structured continuous spike‐and‐slab prior to identify the locations that respond to a certain stimulus. We incorporate the spatio‐temporal structure through a Kronecker product of the spatial and temporal correlation matrices. We develop a highly scalable estimation algorithm, using likelihood approximation, to deal with the large number of parameters in the model. Variable selection is done via clustering of the locations based on their duration of activation. We use scoring rules to evaluate the prediction performance. Simulation studies demonstrate the efficiency of our scalable algorithm in terms of estimation and fast computation. We present results using our scalable approach on a case study of multi‐subject EEG data.

10.
Multivariate extreme events are typically modelled using multivariate extreme value distributions. Unfortunately, there exists no finite parametrization for the class of multivariate extreme value distributions. One common approach is to model extreme events using some flexible parametric subclass. This approach has been limited to only two or three dimensions, primarily because suitably flexible high-dimensional parametric models have prohibitively complex density functions. We present an approach that allows a number of popular flexible models to be used in arbitrarily high dimensions. The approach easily handles missing and censored data, and can be employed when modelling componentwise maxima and multivariate threshold exceedances. The approach is based on a representation using conditionally independent marginal components, conditioning on positive stable random variables. We use Bayesian inference, where the conditioning variables are treated as auxiliary variables within Markov chain Monte Carlo simulations. We demonstrate these methods with an application to sea-levels, using data collected at 10 sites on the east coast of England.

11.
Multiple-membership logit models with random effects are models for clustered binary data, where each statistical unit can belong to more than one group. The likelihood function of these models is analytically intractable. We propose two different approaches for parameter estimation: indirect inference and data cloning (DC). The former is a non-likelihood-based method which uses an auxiliary model to select reasonable estimates. We propose an auxiliary model with the same dimension of parameter space as the target model, which is particularly convenient to reach good estimates very fast. The latter method computes maximum likelihood estimates through the posterior distribution of an adequate Bayesian model, fitted to cloned data. We implement a DC algorithm specifically for multiple-membership models. A Monte Carlo experiment compares the two methods on simulated data. For further comparison, we also report Bayesian posterior mean and Integrated Nested Laplace Approximation hybrid DC estimates. Simulations show a negligible loss of efficiency for the indirect inference estimator, compensated by a relevant computational gain. The approaches are then illustrated with two real examples on matched paired data.

12.
Various statistical models have been proposed for two‐dimensional dose finding in drug‐combination trials. However, it is often a dilemma to decide which model to use when conducting a particular drug‐combination trial. We make a comprehensive comparison of four dose‐finding methods, and for fairness, we apply the same dose‐finding algorithm under the four model structures. Through extensive simulation studies, we compare the operating characteristics of these methods in various practical scenarios. The results show that different models may lead to different design properties and that no single model performs uniformly better in all scenarios. As a result, we propose using Bayesian model averaging to overcome the arbitrariness of the model specification and enhance the robustness of the design. We assign a discrete probability mass to each model as the prior model probability and then estimate the toxicity probabilities of combined doses in the Bayesian model averaging framework. During the trial, we adaptively allocate each new cohort of patients to the most appropriate dose combination by comparing the posterior estimates of the toxicity probabilities with the prespecified toxicity target. The simulation results demonstrate that the Bayesian model averaging approach is robust under various scenarios. Copyright © 2015 John Wiley & Sons, Ltd.

13.
Time‐to‐event data have been extensively studied in many areas. Although multiple time scales are often observed, commonly used methods are based on a single time scale. Analysing time‐to‐event data on two time scales can offer a more extensive insight into the phenomenon. We introduce a non‐parametric Bayesian intensity model to analyse two‐dimensional point processes on Lexis diagrams. After a simple discretization of the two‐dimensional process, we model the intensity by one‐dimensional piecewise constant hazard functions parametrized by the change points and corresponding hazard levels. Our prior distribution incorporates a built‐in smoothing feature in two dimensions. We implement posterior simulation using the reversible jump Metropolis–Hastings algorithm and demonstrate the applicability of the method using both simulated and empirical survival data. Our approach outperforms commonly applied models by borrowing strength in two dimensions.

14.
This paper reviews Bayesian methods that have been developed in recent years to estimate and evaluate dynamic stochastic general equilibrium (DSGE) models. We consider the estimation of linearized DSGE models, the evaluation of models based on Bayesian model checking, posterior odds comparisons, and comparisons to vector autoregressions, as well as the non-linear estimation based on a second-order accurate model solution. These methods are applied to data generated from correctly specified and misspecified linearized DSGE models and a DSGE model that was solved with a second-order perturbation method.

15.
Frequentist and Bayesian methods differ in many aspects but share some basic optimal properties. In real-life prediction problems, situations exist in which a model based on one of the above paradigms is preferable depending on some subjective criteria. Nonparametric classification and regression techniques, such as decision trees and neural networks, have both frequentist (classification and regression trees (CARTs) and artificial neural networks) as well as Bayesian counterparts (Bayesian CART and Bayesian neural networks) to learning from data. In this paper, we present two hybrid models combining the Bayesian and frequentist versions of CART and neural networks, which we call the Bayesian neural tree (BNT) models. BNT models can simultaneously perform feature selection and prediction, are highly flexible, and generalise well in settings with limited training observations. We study the statistical consistency of the proposed approaches and derive the optimal value of a vital model parameter. The excellent performance of the newly proposed BNT models is shown using simulation studies. We also provide some illustrative examples using a wide variety of standard regression datasets from a publicly available machine learning repository to show the superiority of the proposed models in comparison to popularly used Bayesian CART and Bayesian neural network models.

16.
We consider data with a continuous outcome that is missing at random and a fully observed set of covariates. We compare by simulation a variety of doubly-robust (DR) estimators for estimating the mean of the outcome. An estimator is DR if it is consistent when either the regression model for the mean function or the propensity to respond is correctly specified. Performance of different methods is compared in terms of root mean squared error of the estimates and width and coverage of confidence intervals or posterior credibility intervals in repeated samples. Overall, the DR methods tended to yield better inference than the incorrect model when either the propensity or mean model is correctly specified, but were less successful for small sample sizes, where the asymptotic DR property is less consequential. Two methods tended to outperform the other DR methods: penalized spline of propensity prediction [Little RJA, An H. Robust likelihood-based analysis of multivariate data with missing values. Statist Sinica. 2004;14:949–968] and the robust method proposed in [Cao W, Tsiatis AA, Davidian M. Improving efficiency and robustness of the doubly robust estimator for a population mean with incomplete data. Biometrika. 2009;96:723–734].
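
For concreteness, one standard estimator with the DR property described above is the augmented inverse-probability-weighted mean. It is shown here as a textbook illustration and is not necessarily one of the estimators compared in the paper. The notation is an assumption: $R_i$ indicates that $Y_i$ is observed, $\hat{m}(X_i)$ is the fitted regression mean, and $\hat{\pi}(X_i)$ is the estimated propensity to respond.

```latex
\hat{\mu}_{\mathrm{DR}}
  = \frac{1}{n} \sum_{i=1}^{n}
    \left[ \hat{m}(X_i)
      + \frac{R_i \,\{ Y_i - \hat{m}(X_i) \}}{\hat{\pi}(X_i)} \right]
```

If $\hat{m}$ is correct, the residual term has mean zero whatever $\hat{\pi}$ is; if $\hat{\pi}$ is correct, the weighted residuals correct the bias of a misspecified $\hat{m}$. Either way the estimator remains consistent for the outcome mean.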

17.
We investigate simulation methodology for Bayesian inference in Lévy‐driven stochastic volatility (SV) models. Typically, Bayesian inference from such models is performed using Markov chain Monte Carlo (MCMC); this is often a challenging task. Sequential Monte Carlo (SMC) samplers are methods that can improve over MCMC; however, there are many user‐set parameters to specify. We develop a fully automated SMC algorithm, which substantially improves over the standard MCMC methods in the literature. To illustrate our methodology, we look at a model comprised of a Heston model with an independent, additive, variance gamma process in the returns equation. The driving gamma process can capture the stylized behaviour of many financial time series and a discretized version, fit in a Bayesian manner, has been found to be very useful for modelling equity data. We demonstrate that it is possible to draw exact inference, in the sense of no time‐discretization error, from the Bayesian SV model.

18.
This paper presents a comprehensive review and comparison of five computational methods for Bayesian model selection, based on MCMC simulations from posterior model parameter distributions. We apply these methods to a well-known and important class of models in financial time series analysis, namely GARCH and GARCH-t models for conditional return distributions (assuming normal and t-distributions). We compare their performance with the more common maximum likelihood-based model selection for simulated and real market data. All five MCMC methods proved reliable in the simulation study, although differing in their computational demands. Results on simulated data also show that for large degrees of freedom (where the t-distribution becomes more similar to a normal one), Bayesian model selection results in better decisions in favor of the true model than maximum likelihood. Results on market data show the instability of the harmonic mean estimator and reliability of the advanced model selection methods.

19.
In this paper, a new hybrid model of vector autoregressive moving average (VARMA) models and Bayesian networks is proposed to improve the forecasting performance of multivariate time series. In the proposed model, the VARMA model, which is a popular linear model in time series forecasting, is specified to capture the linear characteristics. Then the errors of the VARMA model are clustered into some trends by the K-means algorithm, with the Krzanowski–Lai cluster validity index determining the number of trends, and a Bayesian network is built to learn the relationship between the data and the trend of its corresponding VARMA error. Finally, the estimated values of the VARMA model are compensated by the probabilities of their corresponding VARMA errors belonging to each trend, which are obtained from the Bayesian network. Compared with VARMA models, the experimental results with a simulation study and two multivariate real-world data sets indicate that the proposed model can effectively improve the prediction performance.

20.
Simon's two‐stage design is the most commonly applied among multi‐stage designs in phase IIA clinical trials. It combines the sample sizes at the two stages in order to minimize either the expected or the maximum sample size. When the uncertainty about pre‐trial beliefs on the expected or desired response rate is high, a Bayesian alternative should be considered since it allows one to deal with the entire distribution of the parameter of interest in a more natural way. In this setting, a crucial issue is how to construct a distribution from the available summaries to use as a clinical prior in a Bayesian design. In this work, we explore the Bayesian counterparts of Simon's two‐stage design based on the predictive version of the single threshold design. This design requires specifying two prior distributions: the analysis prior, which is used to compute the posterior probabilities, and the design prior, which is employed to obtain the prior predictive distribution. While the usual approach is to build beta priors for carrying out a conjugate analysis, we derived both the analysis and the design distributions through linear combinations of B‐splines. The motivating example is the planning of the phase IIA two‐stage trial on anti‐HER2 DNA vaccine in breast cancer, where initial beliefs formed from elicited experts' opinions and historical data showed a high level of uncertainty. In a sample size determination problem, the impact of different priors is evaluated.
