Similar Articles
1.
Drug developers are required to demonstrate substantial evidence of effectiveness through the conduct of adequate and well-controlled (A&WC) studies to obtain marketing approval of their medicine. In practice, A&WC is interpreted as the conduct of randomized controlled trials (RCTs). However, these trials are sometimes unfeasible because of their size, duration, and cost. One way to reduce the sample size is to leverage information on the control arm through a prior. One consideration when forming a data-driven prior is the consistency of the external and the current data. It is essential to make this process less susceptible to selecting only the information that improves the chances of an effectiveness claim. For this purpose, propensity score methods are employed for two reasons: (1) they give the probability that a patient is enrolled in the trial, and (2) they minimize selection bias by pairing together subjects, both treatment and control within the trial and control subjects in the external data, that are similar in terms of their pretreatment characteristics. Two matching schemes based on propensity scores, estimated through generalized boosted methods, are applied to a real example with the objective of using external data to perform Bayesian augmented control in a trial with disproportionate allocation. The simulation results show that the data augmentation process prevents prior-data conflict and improves the precision of the estimator of the average treatment effect.
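As a concrete illustration of this kind of workflow, here is a minimal sketch (not the paper's code): gradient boosting stands in for the generalized boosted methods mentioned above to estimate the probability of trial membership, and external controls are selected by 1:1 nearest-neighbor matching on that score. The data frames and covariate names are hypothetical.

```python
# Minimal sketch of propensity-score-based selection of external controls.
# `trial` and `external` are hypothetical pandas DataFrames sharing the same
# pretreatment covariates; assumes len(external) >= len(trial).
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

covariates = ["age", "baseline_score", "sex"]  # assumed covariate names

def match_external_controls(trial: pd.DataFrame, external: pd.DataFrame):
    # Label 1 = enrolled in the current trial, 0 = external data.
    X = pd.concat([trial[covariates], external[covariates]])
    y = np.r_[np.ones(len(trial)), np.zeros(len(external))]

    # Boosted estimate of the propensity to be in the trial.
    ps_model = GradientBoostingClassifier().fit(X, y)
    ps_trial = ps_model.predict_proba(trial[covariates])[:, 1]
    ps_ext = ps_model.predict_proba(external[covariates])[:, 1]

    # 1:1 nearest-neighbor matching on the propensity score, without replacement.
    matched, available = [], list(range(len(ps_ext)))
    for p in ps_trial:
        j = min(available, key=lambda k: abs(ps_ext[k] - p))
        matched.append(j)
        available.remove(j)
    # External controls used to augment the control arm.
    return external.iloc[matched]
```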

2.
Prior information is often incorporated informally when planning a clinical trial. Here, we present an approach for incorporating prior information, such as data from historical clinical trials, into the nuisance-parameter-based sample size re-estimation in a design with an internal pilot study. We focus on trials with continuous endpoints in which the outcome variance is the nuisance parameter. For planning and analyzing the trial, frequentist methods are considered. Moreover, the external information on the variance is summarized by the Bayesian meta-analytic-predictive approach. To incorporate external information into the sample size re-estimation, we propose updating the meta-analytic-predictive prior with the results of the internal pilot study and re-estimating the sample size using an estimator from the posterior. By means of a simulation study, we compare the operating characteristics, such as power and sample size distribution, of the proposed procedure with those of the traditional sample size re-estimation approach that uses the pooled variance estimator. The simulation study shows that, if no prior-data conflict is present, incorporating external information into the sample size re-estimation improves the operating characteristics compared to the traditional approach. In the case of a prior-data conflict, that is, when the variance of the ongoing clinical trial is unequal to the prior location, the performance of the traditional sample size re-estimation procedure is in general superior, even when the prior information is robustified. When considering whether to include prior information in sample size re-estimation, the potential gains should be balanced against the risks.
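To make the mechanics concrete, here is a minimal sketch, not the paper's procedure, in which a single inverse-gamma prior stands in for the meta-analytic-predictive prior on the variance: the prior is conjugately updated with the internal pilot data, and the posterior mean variance is plugged into the usual two-sample formula. All parameter values are illustrative assumptions.

```python
# Sketch: prior-informed variance estimation inside an internal pilot study.
# sigma^2 ~ InvGamma(a0, b0) is a simplified stand-in for the MAP prior;
# a0, b0, and delta are illustrative assumptions, not values from the paper.
import numpy as np
from scipy import stats

def reestimate_n(pilot, a0=4.0, b0=12.0, delta=0.5, alpha=0.05, power=0.8):
    pilot = np.asarray(pilot)
    n1 = len(pilot)
    sse = ((pilot - pilot.mean()) ** 2).sum()
    # Conjugate update with the pilot sum of squares (one df lost to the mean).
    a_post, b_post = a0 + (n1 - 1) / 2, b0 + sse / 2
    var_hat = b_post / (a_post - 1)          # posterior mean of sigma^2
    z = stats.norm.ppf
    n = 2 * var_hat * (z(1 - alpha / 2) + z(power)) ** 2 / delta ** 2
    return int(np.ceil(n))                   # re-estimated per-group sample size
```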

3.
Leverage values are used in regression diagnostics as measures of unusual observations in the X-space. Detection of high leverage observations or points is crucial because they are responsible for masking outliers. In linear regression, high leverage points (HLP) are those that stand far apart from the center (mean) of the data, and hence the most extreme points in the covariate space get the highest leverage. But Hosmer and Lemeshow [Applied Logistic Regression, Wiley, New York, 1989] pointed out that in logistic regression the leverage measure contains a component which can make the leverage values of genuine HLP misleadingly small, which creates problems in the correct identification of such cases. Attempts have been made to identify HLP based on median distances from the mean, but since these are designed for the identification of a single high leverage point, they may not be very effective in the presence of multiple HLP because of masking (false-negative) and swamping (false-positive) effects. In this paper we propose a new method for the identification of multiple HLP in logistic regression in which suspect cases are identified by a robust group deletion technique and then confirmed using diagnostic techniques. The usefulness of the proposed method is investigated through several well-known examples and a Monte Carlo simulation.
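The component in question can be made explicit: in logistic regression the hat matrix is H = V^(1/2) X (X'VX)^(-1) X' V^(1/2) with V = diag(pi_i(1 - pi_i)), so a point whose fitted probability is pushed toward 0 or 1 gets v_i close to 0 and hence a deceptively small leverage. A minimal sketch of this computation, assuming statsmodels for the fit (an illustration, not the paper's code):

```python
# Leverage in logistic regression: the factor v_i = pi_i * (1 - pi_i) in the
# hat matrix is tiny for extreme fitted probabilities, which is why genuine
# high leverage points can receive misleadingly small leverage values.
import numpy as np
import statsmodels.api as sm

def logistic_leverages(X, y):
    Xc = sm.add_constant(X)
    fit = sm.GLM(y, Xc, family=sm.families.Binomial()).fit()
    pi = fit.fittedvalues                    # fitted probabilities
    v = pi * (1 - pi)                        # the shrinking factor
    W = np.sqrt(v)[:, None] * Xc             # V^{1/2} X
    H = W @ np.linalg.inv(Xc.T @ (v[:, None] * Xc)) @ W.T
    return np.diag(H)                        # leverage of each observation
```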

4.
The posterior predictive p value (ppp) was invented as a Bayesian counterpart to classical p values. The methodology can be applied to discrepancy measures involving both data and parameters and can hence be targeted to check various modeling assumptions. The interpretation can, however, be difficult, since the distribution of the ppp value under the modeling assumptions varies substantially between cases. A calibration procedure has been suggested, treating the ppp value as a test statistic in a prior predictive test. In this paper, we suggest that a prior predictive test may instead be based on the expected posterior discrepancy, which is somewhat simpler, both conceptually and computationally. Since both of these methods require the simulation of a large posterior parameter sample for each member of an equally large prior predictive data sample, we furthermore suggest looking for ways to match the given discrepancy with a computation-saving conflict measure. This approach is also based on simulations but only requires sampling from two different distributions representing two contrasting information sources about a model parameter. The conflict measure methodology is also more flexible in that it handles non-informative priors without difficulty. We compare the different approaches theoretically in some simple models and in a more complex applied example.
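For readers unfamiliar with the basic quantity, a minimal sketch of a ppp computation for a toy normal-mean model follows; the model, the discrepancy, and all settings are illustrative choices of ours, not those of the paper.

```python
# Posterior predictive p value for a normal-mean model with known sigma and
# discrepancy T(y, theta) = max |y_i - theta|. The ppp is the fraction of
# posterior draws for which replicated data look at least as discrepant.
import numpy as np

def ppp(y, sigma=1.0, n_draws=2000, rng=np.random.default_rng(0)):
    y = np.asarray(y)
    n = len(y)
    # Posterior for the mean under a flat prior: N(ybar, sigma^2 / n).
    theta = rng.normal(y.mean(), sigma / np.sqrt(n), size=n_draws)
    exceed = 0
    for th in theta:
        y_rep = rng.normal(th, sigma, size=n)      # replicate data given theta
        t_rep = np.max(np.abs(y_rep - th))         # T(y_rep, theta)
        t_obs = np.max(np.abs(y - th))             # T(y, theta)
        exceed += t_rep >= t_obs
    return exceed / n_draws
```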

5.
Leverage values are used in regression diagnostics as measures of influential observations in the X-space. Detection of high leverage values is crucial because they can lead to misleading conclusions about the fit of a regression model, cause multicollinearity problems, and mask and/or swamp outliers. Much work has been done on the identification of single high leverage points, and it is generally believed that this problem has been largely resolved. But there is no general agreement among statisticians about the detection of multiple high leverage points. When a group of high leverage points is present in a data set, the commonly used diagnostic methods fail to identify them correctly, mainly because of masking and/or swamping effects. The robust alternative methods, on the other hand, can identify the high leverage points correctly but tend to flag too many low leverage points as high leverage points, which is also undesirable. An attempt has been made to strike a compromise between these two approaches. We propose an adaptive method in which the suspected high leverage points are identified by robust methods and the low leverage points (if any) are then put back into the estimation data set after diagnostic checking. The usefulness of our newly proposed method for the detection of multiple high leverage points is studied on some well-known data sets and through Monte Carlo simulations.
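A minimal sketch of the two-stage idea, not the paper's algorithm: flag suspects with a robust cutoff on the hat diagonals, then recompute each suspect's leverage against the clean subset (group deletion) so masking cannot hide genuine HLP. The cutoff constant is an illustrative choice.

```python
# Two-stage HLP screening: robust flagging followed by group-deleted
# confirmation, so a cluster of HLP cannot mask its own members.
import numpy as np

def hat_diagonals(X):
    Q, _ = np.linalg.qr(X)                  # reduced QR of the design matrix
    return (Q ** 2).sum(axis=1)             # h_ii = row sums of squares of Q

def adaptive_hlp(X, c=3.0):
    h = hat_diagonals(X)
    med = np.median(h)
    cutoff = med + c * np.median(np.abs(h - med))   # median + c * MAD rule
    suspects = np.where(h > cutoff)[0]

    # Group deletion: leverages recomputed relative to the clean subset.
    clean = np.setdiff1d(np.arange(len(X)), suspects)
    Xc = X[clean]
    A = np.linalg.inv(Xc.T @ Xc)
    h_clean = np.einsum('ij,jk,ik->i', Xc, A, Xc)
    med2 = np.median(h_clean)
    cutoff2 = med2 + c * np.median(np.abs(h_clean - med2))
    return [i for i in suspects if X[i] @ A @ X[i] > cutoff2]
```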

6.
Partial specification of a prior distribution can be appealing to an analyst, but there is no conventional way to update a partial prior. In this paper, we show how a framework for Bayesian updating with data can be based on the Dirichlet(a) process. Within this framework, partial information predictors generalize standard minimax predictors and have interesting multiple-point shrinkage properties. Approximations to partial-information estimators for squared error loss are defined straightforwardly, and an estimate of the mean shrinks the sample mean. The proposed updating of the partial prior is a consequence of four natural requirements when the Dirichlet parameter a is continuous. Namely, the updated partial posterior should be calculable from knowledge of only the data and the partial prior, it should be faithful to the full posterior distribution, it should assign positive probability to every observed event {X_i}, and it should not assign probability to unobserved events not included in the partial prior specification.
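For intuition about the shrinkage mentioned above, recall the standard Dirichlet process fact (our addition, not a formula taken from the paper): if F ~ DP(a, F_0) with total mass a and base-measure mean mu_0, the posterior mean of the distribution's mean functional is

\[
E\!\left[\int x \, dF \,\middle|\, X_1,\dots,X_n\right] = \frac{a}{a+n}\,\mu_0 + \frac{n}{a+n}\,\bar{X},
\]

so the estimate of the mean shrinks the sample mean toward the prior centre, with the Dirichlet mass a acting as a prior sample size.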

7.
In the area of diagnostics, it is common practice to leverage external data to augment a traditional study of diagnostic accuracy consisting of prospectively enrolled subjects, potentially reducing the time and/or cost needed for the performance evaluation of an investigational diagnostic device. However, the statistical methods currently used for such leveraging may not clearly separate study design from outcome data analysis, and they may not adequately address possible bias due to differences in clinically relevant characteristics between the subjects constituting the traditional study and those constituting the external data. This paper is intended to draw attention in the field of diagnostics to the recently developed propensity score-integrated composite likelihood approach, which originally focused on therapeutic medical products. This approach applies the outcome-free principle to separate study design from outcome data analysis and can mitigate bias due to imbalance in covariates, thereby increasing the interpretability of study results. While this approach was conceived as a statistical tool for the design and analysis of clinical studies of therapeutic medical products, here we show how it can also be applied to the evaluation of the sensitivity and specificity of an investigational diagnostic device leveraging external data. We consider two common scenarios for the design of a traditional diagnostic device study consisting of prospectively enrolled subjects, which is to be augmented by external data. The reader is taken through the process of implementing this approach step by step, following the outcome-free principle that preserves study integrity.
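Schematically, and in our notation rather than the paper's, a propensity score-integrated composite likelihood downweights each external subject by a discount factor that is fixed at the design stage, before any outcome data are examined:

\[
L_c(\theta) \;=\; \prod_{i \in \text{current}} f(y_i;\theta) \;\times\; \prod_{j \in \text{external}} f(y_j;\theta)^{\lambda_j}, \qquad 0 \le \lambda_j \le 1,
\]

where the \(\lambda_j\) are derived from propensity scores so that external subjects resembling the prospectively enrolled ones contribute more. Keeping the \(\lambda_j\) outcome-free is what separates design from analysis.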

8.
The problem motivating the paper is the determination of sample size in clinical trials under normal likelihoods and at the substantive testing stage of a financial audit, where normality is not an appropriate assumption. A combination of analytical and simulation-based techniques within the Bayesian framework is proposed. The framework accommodates two different prior distributions: one is the general-purpose fitting prior used in the Bayesian analysis, and the other is the expert's subjective sampling prior, which is believed to generate the parameter values that in turn generate the data. We obtain many theoretical results; one key result is that typical non-informative prior distributions lead to very small sample sizes. In contrast, a very informative prior distribution may lead either to a very small or to a very large sample size, depending on the location of the centre of the prior distribution and the hypothesized value of the parameter. The methods developed are quite general and can be applied to other sample size determination problems. Some numerical illustrations, which bring out many other aspects of the optimum sample size, are given.
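The two-prior recipe is easy to prototype. Below is a minimal sketch under assumptions of our own choosing (normal data with known sigma and a one-sided posterior-probability success criterion): the sampling prior generates the truth, the fitting prior is used in the conjugate analysis, and the smallest n achieving a target assurance is then searched for.

```python
# Two-prior Bayesian sample size sketch: sampling prior generates theta,
# fitting prior is used for the analysis. All numbers are illustrative.
import numpy as np
from scipy import stats

def assurance(n, sigma=1.0,
              m_samp=0.3, s_samp=0.1,   # sampling prior N(0.3, 0.1^2)
              m_fit=0.0, s_fit=10.0,    # vague fitting prior
              n_sim=4000, rng=np.random.default_rng(1)):
    hits = 0
    for _ in range(n_sim):
        theta = rng.normal(m_samp, s_samp)             # truth from sampling prior
        ybar = rng.normal(theta, sigma / np.sqrt(n))   # sufficient statistic
        # Posterior under the fitting prior (normal-normal conjugacy).
        prec = 1 / s_fit**2 + n / sigma**2
        mu_post = (m_fit / s_fit**2 + n * ybar / sigma**2) / prec
        sd_post = np.sqrt(1 / prec)
        hits += stats.norm.sf(0, mu_post, sd_post) > 0.975  # P(theta > 0 | y)
    return hits / n_sim

# Optimum sample size: e.g. the smallest n in a grid with assurance(n) >= 0.8.
```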

9.
A finite mixture model using the Student's t distribution has been recognized as a robust extension of normal mixtures. Recently, a mixture of skew normal distributions has been found to be effective in the treatment of heterogeneous data involving asymmetric behaviors across subclasses. In this article, we propose a robust mixture framework based on the skew t distribution to efficiently deal with heavy-tailedness, extra skewness, and multimodality in a wide range of settings. Mixture models based on the normal, Student's t, and skew normal distributions can be viewed as special cases of the skew t mixture model. We present analytically simple EM-type algorithms for iteratively computing maximum likelihood estimates. The proposed methodology is illustrated by analyzing a real data example.

10.
Mixtures of multivariate t distributions provide a robust parametric alternative to normal mixtures for fitting data. In the presence of a noise component, potential outliers, or data with longer-than-normal tails, one way to broaden the model is to consider t distributions. In this framework, the degrees of freedom act as a robustness parameter, tuning the heaviness of the tails and downweighting the effect of outliers on parameter estimation. The aim of this paper is to extend to mixtures of multivariate elliptical distributions some theoretical results about likelihood maximization on constrained parameter spaces. Further, a constrained monotone algorithm implementing maximum likelihood mixture decomposition of multivariate t distributions is proposed, to achieve improved convergence capabilities and robustness. Monte Carlo numerical simulations and a real data study illustrate the improved performance of the algorithm compared to earlier proposals.

11.
We present a Bayesian analysis framework for matrix-variate normal data with dependency structures induced by rows and columns. This framework of matrix normal models includes prior specifications, posterior computation using Markov chain Monte Carlo methods, evaluation of prediction uncertainty, model structure search, and extensions to multidimensional arrays. Compared with Bayesian probabilistic matrix factorization, which places a Gaussian prior on each single row of the data matrix, our proposed model, Bayesian hierarchical kernelized probabilistic matrix factorization, imposes Gaussian process priors over multiple rows of the matrix. Hence, the learned model explicitly captures the underlying correlation among the rows and the columns. In addition, our method requires no specific assumptions such as independence of the latent factors for rows and columns, which affords more flexibility for modeling real data than existing work. Finally, the proposed framework can be adapted to a wide range of applications, including multivariate analysis, time series, and spatial modeling. Experiments highlight the superiority of the proposed model in handling model uncertainty and model optimization.
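As background for the row-and-column dependency structure, here is a minimal sketch, using the standard construction rather than anything from the paper, of simulating from a matrix normal MN(M, U, V) with row covariance U and column covariance V:

```python
# Simulating a matrix normal MN(M, U, V): pre- and post-multiplying an iid
# Gaussian matrix by the Cholesky factors of U and V induces the row and
# column dependence, since vec(X) ~ N(vec(M), V kron U).
import numpy as np

def rmatnorm(M, U, V, rng=np.random.default_rng(2)):
    Lu = np.linalg.cholesky(U)      # row covariance factor
    Lv = np.linalg.cholesky(V)      # column covariance factor
    Z = rng.standard_normal(M.shape)
    return M + Lu @ Z @ Lv.T
```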

12.
In rare diseases, typically only a small number of patients are available for a randomized clinical trial. Nevertheless, it is not uncommon for more than one study to be performed to evaluate a (new) treatment. Scarcity of available evidence makes it particularly valuable to pool the data in a meta-analysis. When the primary outcome is binary, the small sample sizes increase the chance of observing zero events. The frequentist random-effects model is known to induce bias and to result in improper interval estimation of the overall treatment effect in a meta-analysis with zero events. Bayesian hierarchical modeling could be a promising alternative, but Bayesian models are known to be sensitive to the choice of prior distribution for the between-study variance (heterogeneity) in sparse settings. In a rare disease setting, only limited data will be available on which to base the prior; therefore, robustness of estimation is desirable. We performed an extensive and diverse simulation study, aiming to provide practitioners with advice on the choice of a sufficiently robust prior distribution shape for the heterogeneity parameter. Our results show that priors that place some concentrated mass on small τ values but do not otherwise restrict the density, for example the Uniform(−10, 10) heterogeneity prior on the log(τ²) scale, show robust 95% coverage combined with less overestimation of the overall treatment effect across varying degrees of heterogeneity. We illustrate the results with meta-analyses of a few small trials.
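In model form, and in our schematic notation rather than the paper's, the recommended specification corresponds to a binomial-normal hierarchical model with the heterogeneity prior placed on the log-variance scale, which avoids any zero-cell corrections:

\[
y_{ik} \sim \text{Bin}(n_{ik}, p_{ik}), \quad
\text{logit}(p_{i1}) = \alpha_i, \quad
\text{logit}(p_{i2}) = \alpha_i + \theta_i, \quad
\theta_i \sim N(\theta, \tau^2), \quad
\log(\tau^2) \sim \text{Uniform}(-10, 10),
\]

for arms k = 1 (control) and k = 2 (treatment) of study i. On the τ scale this prior piles mass near zero without truncating larger heterogeneity values, which is the "concentrated but unrestricted" behavior the simulations favor.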

13.
Borrowing data from an external control has been an appealing strategy for evidence synthesis when conducting randomized controlled trials (RCTs). Often named hybrid control trials, such designs leverage existing control data from clinical trials or potentially real-world data (RWD), enable trial designs that allocate more patients to the novel intervention arm, and improve the efficiency or lower the cost of the primary RCT. Several methods have been established and developed for borrowing external control data, among which propensity score methods and the Bayesian dynamic borrowing framework play essential roles. Noting the unique strengths of propensity score methods and Bayesian hierarchical models, we utilize both in a complementary manner to analyze hybrid control studies. In this article, we review methods including covariate adjustment and propensity score matching and weighting in combination with dynamic borrowing, and we compare the performance of these methods through comprehensive simulations. Different degrees of covariate imbalance and confounding are examined. Our findings suggest that conventional covariate adjustment in combination with the Bayesian commensurate prior model provides the highest power with good type I error control under the investigated settings, and it performs well especially across scenarios with different degrees of confounding. To estimate efficacy signals in the exploratory setting, the covariate adjustment method in combination with the Bayesian commensurate prior is recommended.
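For orientation, the commensurate prior idea can be written schematically (our notation, not the paper's):

\[
\theta_c \mid \theta_h, \tau \sim N\!\left(\theta_h, \tau^{-1}\right),
\]

where \(\theta_h\) is the control parameter supported by the external data, \(\theta_c\) is its counterpart in the current trial, and the commensurability parameter \(\tau\) receives its own prior: large \(\tau\) borrows the external control almost fully, while small \(\tau\) discounts it when the two sources conflict. This is what makes the borrowing "dynamic" rather than fixed in advance.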

14.
This paper presents a robust mixture modeling framework using multivariate skew t distributions, an extension of the multivariate Student's t family with additional shape parameters to regulate skewness. The proposed model results in a very complicated likelihood. Two variants of Monte Carlo EM algorithms are developed to carry out maximum likelihood estimation of the mixture parameters. In addition, we offer a general information-based method for obtaining the asymptotic covariance matrix of the maximum likelihood estimates. Practical issues, including the selection of starting values and the stopping criterion, are also discussed. The proposed methodology is applied to a subset of the Australian Institute of Sport data for illustration.

15.
In this paper, we study the estimation of linear models in the framework of longitudinal data with dropouts. Under the assumptions that the random errors follow an elliptical distribution and that all subjects share the same within-subject covariance matrix, which does not depend on covariates, we develop a robust method for the simultaneous estimation of the mean and the covariance. The proposed method is robust against outliers and does not require modeling the covariance structure or the missing data process. Theoretical properties of the proposed estimator are established, and simulation studies show its good performance. Finally, the proposed method is applied to a real data analysis for illustration.

16.
When recruitment into a clinical trial is limited due to the rarity of the disease of interest, or when recruitment to the control arm is limited for ethical reasons (eg, pediatric studies or an important unmet medical need), exploiting historical controls to augment the prospectively collected database can be an attractive option. Statistical methods for combining historical data with randomized data, while accounting for the incompatibility between the two, have recently been proposed and remain an active field of research. The current literature lacks both a rigorous comparison between methods and guidelines about their use in practice. In this paper, we compare the existing methods based on a confirmatory phase III study design exercise done for a new antibacterial therapy with a binary endpoint and a single historical dataset. A procedure to assess the relative performance of the different methods for borrowing information from historical control data is proposed, and practical questions related to the selection and implementation of methods are discussed. Based on our examination, we found that the methods have comparable performance, but we recommend the robust mixture prior for its ease of implementation.
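The ease of implementation is real: for a binary endpoint a robust mixture prior stays conjugate, so the posterior is again a Beta mixture with reweighted components. A minimal sketch with illustrative numbers (the historical Beta(24, 56) component and the 0.8 prior weight are assumptions of ours, not values from the paper):

```python
# Robust mixture prior for a binary endpoint: an informative Beta component
# from historical controls mixed with a vague Beta(1, 1). Conjugacy keeps the
# posterior a Beta mixture; only the component weights need updating.
import numpy as np
from scipy.special import betaln

def posterior_mixture(y, n, w=0.8, a1=24.0, b1=56.0, a0=1.0, b0=1.0):
    comps = [(w, a1, b1), (1 - w, a0, b0)]
    post = []
    for wk, a, b in comps:
        # Beta-binomial marginal likelihood of the new data under component k
        # (the binomial coefficient cancels across components).
        log_m = betaln(a + y, b + n - y) - betaln(a, b)
        post.append((wk * np.exp(log_m), a + y, b + n - y))
    total = sum(p[0] for p in post)
    return [(p0 / total, a, b) for p0, a, b in post]  # updated weights + Betas

# E.g. posterior_mixture(y=18, n=60): the vague component gains weight when
# the observed control rate conflicts with the historical Beta(24, 56).
```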

17.

The Mallows-type estimator, one of the most reasonable bounded influence estimators, often downweights leverage points regardless of the magnitude of the corresponding residual, and this can imply a loss of efficiency. In this article, we consider whether the efficiency of this bounded influence estimator can be improved by taking into account both the robust x-distance and the residual size. We develop a new robust procedure based on the ideas of the Mallows-type estimator and the general robust recipe in which data are cleaned by pulling outliers towards their fitted values. Our basic idea is to formulate the robust estimation as an allocation problem, where the objective function is a Huber-type "loss" function but the pulling resource is restricted. Using a mathematical programming technique, the pulling resource is optimally allocated to influential points (x_i, y_i) with respect to residual size and given weights w(x_i). Three previously published approaches are compared to our proposal via simulated experiments. In the case of data contaminated by regression outliers and "good" leverage points, the proposed robust estimator is a reasonable bounded influence estimator with respect to both efficiency and the norm of the bias. In addition, the proposed approach offers the potential to establish constraints for the regression parameters and may also provide insight regarding outlier detection.
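For contrast with the proposal above, the classical Mallows-type fit is easy to state: iteratively reweighted least squares with weights that are the product of fixed leverage-based weights w(x_i) and Huber weights on the scaled residuals. The sketch below is that baseline estimator, with conventional tuning constants of our choosing; it is not the paper's allocation method.

```python
# Mallows-type bounded influence regression via IRLS: Huber weights on the
# residuals times fixed leverage-based weights wx supplied by the caller.
import numpy as np

def mallows_irls(X, y, wx, c=1.345, n_iter=50):
    beta = np.linalg.lstsq(X, y, rcond=None)[0]        # OLS start
    for _ in range(n_iter):
        r = y - X @ beta
        scale = np.median(np.abs(r)) / 0.6745 + 1e-12  # robust scale (MAD)
        u = np.abs(r) / scale
        wr = np.minimum(1.0, c / np.maximum(u, 1e-12)) # Huber residual weights
        w = wx * wr                                    # Mallows: downweight by x too
        WX = X * w[:, None]
        beta = np.linalg.solve(X.T @ WX, WX.T @ y)
    return beta
```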

18.
Matched case-control designs are commonly used in epidemiological studies to estimate the effect of exposure variables on the risk of a disease while controlling the effect of confounding variables. Due to the retrospective nature of the study, information on a covariate may be missing for some subjects. A straightforward application of the conditional logistic likelihood to matched case-control data with a partially missing covariate may yield inefficient estimators of the parameters. A robust method has been proposed to handle this problem using an estimated conditional score approach when the missingness mechanism does not depend on the disease status. Within the conditional logistic likelihood framework, an empirical procedure is used to estimate the odds of the disease for the subjects with missing covariate values. The asymptotic distribution and the asymptotic variance of the estimator are derived for the case where the matching variables and the completely observed covariates are categorical. The finite sample performance of the proposed estimator is assessed through a simulation study. Finally, the proposed method is applied to the analysis of two matched case-control studies. The Canadian Journal of Statistics 38: 680–697; 2010 © 2010 Statistical Society of Canada
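For background on the likelihood being referred to: in the simplest 1:1 matched design each pair contributes 1/(1 + exp(−β'd_i)), with d_i the case-minus-control covariate difference, so the matched confounders cancel out. A minimal sketch that maximizes this conditional likelihood directly (an illustration of the standard complete-data likelihood, not the paper's estimated conditional score method):

```python
# Conditional logistic likelihood for 1:1 matched case-control pairs,
# maximized directly; d holds within-pair covariate differences.
import numpy as np
from scipy.optimize import minimize

def fit_conditional_logit(d):
    # d: (n_pairs, p) array of x_case - x_control differences.
    def negloglik(beta):
        # -sum_i log sigmoid(d_i' beta) = sum_i log(1 + exp(-d_i' beta))
        return np.sum(np.log1p(np.exp(-d @ beta)))
    res = minimize(negloglik, np.zeros(d.shape[1]), method="BFGS")
    return res.x  # estimated log odds ratios
```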

19.
A novel framework is proposed for the estimation of multiple sinusoids from irregularly sampled time series. This spectral analysis problem is addressed as an under-determined inverse problem, where the spectrum is discretized on an arbitrarily thin frequency grid. As we focus on line spectra estimation, the solution must be sparse, i.e., the amplitude of the spectrum must be zero almost everywhere. Such prior information is taken into account within the Bayesian framework. Two models are used to account for the prior sparseness of the solution, namely a Laplace prior and a Bernoulli-Gaussian prior, associated with optimization and stochastic sampling algorithms, respectively. Such approaches are efficient alternatives to the usual sequential prewhitening methods, especially in the case of strong sampling aliases perturbing the Fourier spectrum. Both methods should be intensively tested on real data sets by physicists.
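The optimization route is the more compact one to sketch: a Laplace prior on the amplitudes yields an l1-penalized least-squares problem, so one can build a cosine/sine dictionary on a fine frequency grid, evaluated at the irregular sample times, and solve a LASSO. The sketch below uses scikit-learn, with grid size and penalty as illustrative choices; it is a stand-in for the paper's algorithm, not a reproduction of it.

```python
# Sparse line-spectrum estimation from irregular samples: Laplace prior on
# amplitudes <=> l1 penalty, solved as a LASSO over a frequency dictionary.
import numpy as np
from sklearn.linear_model import Lasso

def line_spectrum(t, y, f_max, n_freq=2000, alpha=0.05):
    freqs = np.linspace(0.0, f_max, n_freq)
    # Dictionary of cosines and sines evaluated at the irregular times t.
    A = np.hstack([np.cos(2 * np.pi * t[:, None] * freqs),
                   np.sin(2 * np.pi * t[:, None] * freqs)])
    fit = Lasso(alpha=alpha, fit_intercept=True, max_iter=10000).fit(A, y)
    c, s = fit.coef_[:n_freq], fit.coef_[n_freq:]
    amp = np.hypot(c, s)                      # amplitude at each grid frequency
    keep = amp > 1e-8                         # nonzero coefficients = the lines
    return freqs[keep], amp[keep]
```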

20.
The modelling process in Bayesian statistics constitutes the fundamental stage of the analysis, since the inferences may vary considerably depending on the chosen probability laws. This is particularly true when conflicts arise between two or more sources of information. For instance, inference in the presence of an outlier (which conflicts with the information provided by the other observations) can be highly dependent on the assumed sampling distribution. When heavy-tailed (e.g. t) distributions are used, outliers may be rejected, whereas this kind of robust inference is not available when we use light-tailed (e.g. normal) distributions. A long literature has established sufficient conditions on location-parameter models to resolve conflict in various ways. In this work, we consider a location-scale parameter structure, which is more complex than the single-parameter cases because conflicts can arise between three sources of information, namely the likelihood, the prior distribution for the location parameter, and the prior for the scale parameter. We establish sufficient conditions on the distributions in a location-scale model to resolve conflicts in different ways as a single observation tends to infinity. In addition, for each case, we explicitly give the limiting posterior distributions as the conflict becomes more extreme.
