期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Sample size calculation for a proportional hazards mixture cure model with nonbinary covariates

Yihong Zhan Yanan Zhang Bo Cai James W. Hardin 《Journal of applied statistics》2019,46(3):468-483

Sample size calculation is a critical issue in clinical trials because a small sample size leads to a biased inference and a large sample size increases the cost. With the development of advanced medical technology, some patients can be cured of certain chronic diseases, and the proportional hazards mixture cure model has been developed to handle survival data with potential cure information. Given the needs of survival trials with potential cure proportions, a corresponding sample size formula based on the log-rank test statistic for binary covariates has been proposed by Wang et al. [25]. However, a sample size formula based on continuous variables has not been developed. Herein, we presented sample size and power calculations for the mixture cure model with continuous variables based on the log-rank method and further modified it by Ewell's method. The proposed approaches were evaluated using simulation studies for synthetic data from exponential and Weibull distributions. A program for calculating necessary sample size for continuous covariates in a mixture cure model was implemented in R. 相似文献

2.

On inference for Kendall's τ within a longitudinal data setting

Yan Ma 《Journal of applied statistics》2012,39(11):2441-2452

Kendall's τ is a non-parametric measure of correlation based on ranks and is used in a wide range of research disciplines. Although methods are available for making inference about Kendall's τ, none has been extended to modeling multiple Kendall's τs arising in longitudinal data analysis. Compounding this problem is the pervasive issue of missing data in such study designs. In this article, we develop a novel approach to provide inference about Kendall's τ within a longitudinal study setting under both complete and missing data. The proposed approach is illustrated with simulated data and applied to an HIV prevention study. 相似文献

3.

Comparison of Selected Methods for Modeling of Multi-State Disease Progression Processes: A Simulation Study

Ella Huszti Ahmadou Alioum Catherine Quantin 《统计学通讯:模拟与计算》2013,42(9):1402-1421

Prognostic studies are essential to understand the role of particular prognostic factors and, thus, improve prognosis. In most studies, disease progression trajectories of individual patients may end up with one of mutually exclusive endpoints or can involve a sequence of different events.

One challenge in such studies concerns separating the effects of putative prognostic factors on these different endpoints and testing the differences between these effects.

In this article, we systematically evaluate and compare, through simulations, the performance of three alternative multivariable regression approaches in analyzing competing risks and multiple-event longitudinal data. The three approaches are: (1) fitting separate event-specific Cox's proportional hazards models; (2) the extension of Cox's model to competing risks proposed by Lunn and McNeil; and (3) Markov multi-state model.

The simulation design is based on a prognostic study of cancer progression, and several simulated scenarios help investigate different methodological issues relevant to the modeling of multiple-event processes of disease progression. The results highlight some practically important issues. Specifically, the decreased precision of the observed timing of intermediary (non fatal) events has a strong negative impact on the accuracy of regression coefficients estimated with either the Cox's or Lunn-McNeil models, while the Markov model appears to be quite robust, under the same circumstances. Furthermore, the tests based on both Markov and Lunn-McNeil models had similar power for detecting a difference between the effects of the same covariate on the hazards of two mutually exclusive events. The Markov approach yields also accurate Type I error rate and good empirical power for testing the hypothesis that the effect of a prognostic factor on changes after an intermediary event, which cannot be directly tested with the Lunn-McNeil method. Bootstrap-based standard errors improve the coverage rates for Markov model estimates. Overall, the results of our simulations validate Markov multi-state model for a wide range of data structures encountered in prognostic studies of disease progression, and may guide end users regarding the choice of model(s) most appropriate for their specific application. 相似文献

4.

On estimands and the analysis of adverse events in the presence of varying follow‐up times within the benefit assessment of therapies

Steffen Unkel Marjan Amiri Norbert Benda Jan Beyersmann Dietrich Knoerzer Katrin Kupas Frank Langer Friedhelm Leverkus Anja Loos Claudia Ose Tanja Proctor Claudia Schmoor Carsten Schwenke Guido Skipka Kristina Unnebrink Florian Voss Tim Friede 《Pharmaceutical statistics》2019,18(2):166-183

The analysis of adverse events (AEs) is a key component in the assessment of a drug's safety profile. Inappropriate analysis methods may result in misleading conclusions about a therapy's safety and consequently its benefit‐risk ratio. The statistical analysis of AEs is complicated by the fact that the follow‐up times can vary between the patients included in a clinical trial. This paper takes as its focus the analysis of AE data in the presence of varying follow‐up times within the benefit assessment of therapeutic interventions. Instead of approaching this issue directly and solely from an analysis point of view, we first discuss what should be estimated in the context of safety data, leading to the concept of estimands. Although the current discussion on estimands is mainly related to efficacy evaluation, the concept is applicable to safety endpoints as well. Within the framework of estimands, we present statistical methods for analysing AEs with the focus being on the time to the occurrence of the first AE of a specific type. We give recommendations which estimators should be used for the estimands described. Furthermore, we state practical implications of the analysis of AEs in clinical trials and give an overview of examples across different indications. We also provide a review of current practices of health technology assessment (HTA) agencies with respect to the evaluation of safety data. Finally, we describe problems with meta‐analyses of AE data and sketch possible solutions. 相似文献

5.

Propensity score matching and stratification using multiparty data without pooling

Jixian Wang Roland Marion-Gallois 《Pharmaceutical statistics》2023,22(1):4-19

Matching and stratification based on confounding factors or propensity scores (PS) are powerful approaches for reducing confounding bias in indirect treatment comparisons. However, implementing these approaches requires pooled individual patient data (IPD). The research presented here was motivated by an indirect comparison between a single-armed trial in acute myeloid leukemia (AML), and two external AML registries with current treatments for a control. For confidentiality reasons, IPD cannot be pooled. Common approaches to adjusting confounding bias, such as PS matching or stratification, cannot be applied as 1) a model for PS, for example, a logistic model, cannot be fitted without pooling covariate data; 2) pooling response data may be necessary for some statistical inference (e.g., estimating the SE of mean difference of matched pairs) after PS matching. We propose a set of approaches that do not require pooling IPD, using a combination of methods including a linear discriminant for matching and stratification, and secure multiparty computation for estimation of within-pair sample variance and for calculations involving multiple control sources. The approaches only need to share aggregated data offline, rather than real-time secure data transfer, as required by typical secure multiparty computation for model fitting. For survival analysis, we propose an approach using restricted mean survival time. A simulation study was conducted to evaluate this approach in several scenarios, in particular, with a mixture of continuous and binary covariates. The results confirmed the robustness and efficiency of the proposed approach. A real data example is also provided for illustration. 相似文献

6.

Bayesian False Discovery Rate Wavelet Shrinkage: Theory and Applications

Ilya Lavrik Yoon Young Jung Brani Vidakovic 《统计学通讯:模拟与计算》2013,42(6):1086-1100

Statistical inference in the wavelet domain remains a vibrant area of contemporary statistical research because of desirable properties of wavelet representations and the need of scientific community to process, explore, and summarize massive data sets. Prime examples are biomedical, geophysical, and internet related data. We propose two new approaches to wavelet shrinkage/thresholding.

In the spirit of Efron and Tibshirani's recent work on local false discovery rate, we propose Bayesian Local False Discovery Rate (BLFDR), where the underlying model on wavelet coefficients does not assume known variances. This approach to wavelet shrinkage is shown to be connected with shrinkage based on Bayes factors. The second proposal, Bayesian False Discovery Rate (BaFDR), is based on ordering of posterior probabilities of hypotheses on true wavelets coefficients being null, in Bayesian testing of multiple hypotheses.

We demonstrate that both approaches result in competitive shrinkage methods by contrasting them to some popular shrinkage techniques. 相似文献

7.

Bayesian Joint Modeling of Multiple Gene Networks and Diverse Genomic Data to Identify Target Genes of a Transcription Factor

Wei P Pan W 《The annals of applied statistics》2012,6(1):334-355

相似文献

8.

Power and stability comparisons of multiple testing procedures with false discovery rate control

《Journal of Statistical Computation and Simulation》2012,82(14):2808-2822

High-throughput data analyses are widely used for examining differential gene expression, identifying single nucleotide polymorphisms, and detecting methylation loci. False discovery rate (FDR) has been considered a proper type I error rate to control for discovery-based high-throughput data analysis. Various multiple testing procedures have been proposed to control the FDR. The power and stability properties of some commonly used multiple testing procedures have not been extensively investigated yet, however. Simulation studies were conducted to compare power and stability properties of five widely used multiple testing procedures at different proportions of true discoveries for various sample sizes for both independent and dependent test statistics. Storey's two linear step-up procedures showed the best performance among all tested procedures considering FDR control, power, and variance of true discoveries. Leukaemia and ovarian cancer microarray studies were used to illustrate the power and stability characteristics of these five multiple testing procedures with FDR control. 相似文献

9.

Flexible Tweedie regression models for continuous data

Wagner Hugo Bonat Célestin C. Kokonendji 《Journal of Statistical Computation and Simulation》2017,87(11):2138-2152

Tweedie regression models (TRMs) provide a flexible family of distributions to deal with non-negative right-skewed data and can handle continuous data with probability mass at zero. Estimation and inference of TRMs based on the maximum likelihood (ML) method are challenged by the presence of an infinity sum in the probability function and non-trivial restrictions on the power parameter space. In this paper, we propose two approaches for fitting TRMs, namely quasi-likelihood (QML) and pseudo-likelihood (PML). We discuss their asymptotic properties and perform simulation studies to compare our methods with the ML method. We show that the QML method provides asymptotically efficient estimation for regression parameters. Simulation studies showed that the QML and PML approaches present estimates, standard errors and coverage rates similar to the ML method. Furthermore, the second-moment assumptions required by the QML and PML methods enable us to extend the TRMs to the class of quasi-TRMs in Wedderburn's style. It allows to eliminate the non-trivial restriction on the power parameter space, and thus provides a flexible regression model to deal with continuous data. We provide an R implementation and illustrate the application of TRMs using three data sets. 相似文献

10.

Variable selection for high-dimensional generalized linear model with block-missing data

Yifan He Yang Feng Xinyuan Song 《Scandinavian Journal of Statistics》2023,50(3):1279-1297

In modern scientific research, multiblock missing data emerges with synthesizing information across multiple studies. However, existing imputation methods for handling block-wise missing data either focus on the single-block missing pattern or heavily rely on the model structure. In this study, we propose a single regression-based imputation algorithm for multiblock missing data. First, we conduct a sparse precision matrix estimation based on the structure of block-wise missing data. Second, we impute the missing blocks with their means conditional on the observed blocks. Theoretical results about variable selection and estimation consistency are established in the context of a generalized linear model. Moreover, simulation studies show that compared with existing methods, the proposed imputation procedure is robust to various missing mechanisms because of the good properties of regression imputation. An application to Alzheimer's Disease Neuroimaging Initiative data also confirms the superiority of our proposed method. 相似文献

11.

Longitudinal covariate-adjusted response-adaptive randomized designs

Tao Huang Zhongqiang Liu Feifang Hu 《Journal of statistical planning and inference》2013

Clinical trials often involve longitudinal data set which has two important characteristics: repeated and correlated measurements and time-varying covariates. In this paper, we propose a general framework of longitudinal covariate-adjusted response-adaptive (LCARA) randomization procedures. We study their properties under widely satisfied conditions. This design skews the allocation probabilities which depend on both patients' first observed covariates and sequentially estimated parameters based on the accrued longitudinal responses and covariates. The asymptotic properties of estimators for the unknown parameters and allocation proportions are established. The special case of binary treatment and continuous responses is studied in detail. Simulation studies and an analysis of the National Cooperative Gallstone Study (NCGS) data are carried out to illustrate the advantages of the proposed LCARA randomization procedure. 相似文献

12.

A procedure for approximate negative binomial tolerance intervals

《Journal of Statistical Computation and Simulation》2012,82(2):438-450

In this article, we present a procedure for approximate negative binomial tolerance intervals. We utilize an approach that has been well-studied to approximate tolerance intervals for the binomial and Poisson settings, which is based on the confidence interval for the parameter in the respective distribution. A simulation study is performed to assess the coverage probabilities and expected widths of the tolerance intervals. The simulation study also compares eight different confidence interval approaches for the negative binomial proportions. We recommend using those in practice that perform the best based on our simulation results. The method is also illustrated using two real data examples. 相似文献

13.

Assessing safety at the end of clinical trials using system organ classes: A case and comparative study

Raymond Carragher Chris Robertson 《Pharmaceutical statistics》2021,20(6):1278-1287

Recent approaches to the statistical analysis of adverse event (AE) data in clinical trials have proposed the use of groupings of related AEs, such as by system organ class (SOC). These methods have opened up the possibility of scanning large numbers of AEs while controlling for multiple comparisons, making the comparative performance of the different methods in terms of AE detection and error rates of interest to investigators. We apply two Bayesian models and two procedures for controlling the false discovery rate (FDR), which use groupings of AEs, to real clinical trial safety data. We find that while the Bayesian models are appropriate for the full data set, the error controlling methods only give similar results to the Bayesian methods when low incidence AEs are removed. A simulation study is used to compare the relative performances of the methods. We investigate the differences between the methods over full trial data sets, and over data sets with low incidence AEs and SOCs removed. We find that while the removal of low incidence AEs increases the power of the error controlling procedures, the estimated power of the Bayesian methods remains relatively constant over all data sizes. Automatic removal of low-incidence AEs however does have an effect on the error rates of all the methods, and a clinically guided approach to their removal is needed. Overall we found that the Bayesian approaches are particularly useful for scanning the large amounts of AE data gathered. 相似文献

14.

Procedures for the identification of multiple influential observations in linear regression

A.A.M. Nurunnabi Ali S. Hadi A.H.M.R. Imon 《Journal of applied statistics》2014,41(6):1315-1331

Since the seminal paper by Cook (1977) in which he introduced Cook's distance, the identification of influential observations has received a great deal of interest and extensive investigation in linear regression. It is well documented that most of the popular diagnostic measures that are based on single-case deletion can mislead the analysis in the presence of multiple influential observations because of the well-known masking and/or swamping phenomena. Atkinson (1981) proposed a modification of Cook's distance. In this paper we propose a further modification of the Cook's distance for the identification of a single influential observation. We then propose new measures for the identification of multiple influential observations, which are not affected by the masking and swamping problems. The efficiency of the new statistics is presented through several well-known data sets and a simulation study. 相似文献

15.

Matrix decomposition in meta-analysis for extraction of adverse event pattern and patient-level safety profile

Kentaro Matsuura Jun Tsuchida Shuji Ando Takashi Sozu 《Pharmaceutical statistics》2021,20(4):806-819

The purpose of assessing adverse events (AEs) in clinical studies is to evaluate what AE patterns are likely to occur during treatment. In contrast, it is difficult to specify which of these patterns occurs in each patient. To tackle this challenging issue, we constructed a new statistical model including nonnegative matrix factorization by incorporating background knowledge of AE-specific structures such as severity and drug mechanism of action. The model uses a meta-analysis framework for integrating data from multiple clinical studies because insufficient information is derived from a single trial. We demonstrated the proposed method by applying it to real data consisting of three Phase III studies, two mechanisms of action, five anticancer treatments, 3317 patients, 848 AE types, and 99,546 AEs. The extracted typical treatment-specific AE patterns coincided with medical knowledge. We also demonstrated patient-level safety profiles using the data of AEs that were observed by the end of the second cycle. 相似文献

16.

A multiple-case deletion approach for detecting influential points in high-dimensional regression

Tao Wang Qun Li Qingpei Zang 《统计学通讯:模拟与计算》2013,42(7):2065-2082

ABSTRACT

In high-dimensional regression, the presence of influential observations may lead to inaccurate analysis results so that it is a prime and important issue to detect these unusual points before statistical regression analysis. Most of the traditional approaches are, however, based on single-case diagnostics, and they may fail due to the presence of multiple influential observations that suffer from masking effects. In this paper, an adaptive multiple-case deletion approach is proposed for detecting multiple influential observations in the presence of masking effects in high-dimensional regression. The procedure contains two stages. Firstly, we propose a multiple-case deletion technique, and obtain an approximate clean subset of the data that is presumably free of influential observations. To enhance efficiency, in the second stage, we refine the detection rule. Monte Carlo simulation studies and a real-life data analysis investigate the effective performance of the proposed procedure. 相似文献

17.

An Empirical Comparison of Multiple Imputation Methods for Categorical Data

Olanrewaju Akande Fan Li Jerome Reiter 《The American statistician》2017,71(2):162-170

Multiple imputation is a common approach for dealing with missing values in statistical databases. The imputer fills in missing values with draws from predictive models estimated from the observed data, resulting in multiple, completed versions of the database. Researchers have developed a variety of default routines to implement multiple imputation; however, there has been limited research comparing the performance of these methods, particularly for categorical data. We use simulation studies to compare repeated sampling properties of three default multiple imputation methods for categorical data, including chained equations using generalized linear models, chained equations using classification and regression trees, and a fully Bayesian joint distribution based on Dirichlet process mixture models. We base the simulations on categorical data from the American Community Survey. In the circumstances of this study, the results suggest that default chained equations approaches based on generalized linear models are dominated by the default regression tree and Bayesian mixture model approaches. They also suggest competing advantages for the regression tree and Bayesian mixture model approaches, making both reasonable default engines for multiple imputation of categorical data. Supplementary material for this article is available online. 相似文献

18.

Identification of multiple influential observations in logistic regression

A. A.M. Nurunnabi A. H.M. Rahmatullah Imon M. Nasser 《Journal of applied statistics》2010,37(10):1605-1624

The identification of influential observations in logistic regression has drawn a great deal of attention in recent years. Most of the available techniques like Cook's distance and difference of fits (DFFITS) are based on single-case deletion. But there is evidence that these techniques suffer from masking and swamping problems and consequently fail to detect multiple influential observations. In this paper, we have developed a new measure for the identification of multiple influential observations in logistic regression based on a generalized version of DFFITS. The advantage of the proposed method is then investigated through several well-referred data sets and a simulation study. 相似文献

19.

Modeling longitudinal count data with dropouts

Mohamed Alosh 《Pharmaceutical statistics》2010,9(1):35-45

This paper explores the utility of different approaches for modeling longitudinal count data with dropouts arising from a clinical study for the treatment of actinic keratosis lesions on the face and balding scalp. A feature of these data is that as the disease for subjects on the active arm improves their data show larger dispersion compared with those on the vehicle, exhibiting an over‐dispersion relative to the Poisson distribution. After fitting the marginal (or population averaged) model using the generalized estimating equation (GEE), we note that inferences from such a model might be biased as dropouts are treatment related. Then, we consider using a weighted GEE (WGEE) where each subject's contribution to the analysis is weighted inversely by the subject's probability of dropout. Based on the model findings, we argue that the WGEE might not address the concerns about the impact of dropouts on the efficacy findings when dropouts are treatment related. As an alternative, we consider likelihood‐based inference where random effects are added to the model to allow for heterogeneity across subjects. Finally, we consider a transition model where, unlike the previous approaches that model the log‐link function of the mean response, we model the subject's actual lesion counts. This model is an extension of the Poisson autoregressive model of order 1, where the autoregressive parameter is taken to be a function of treatment as well as other covariates to induce different dispersions and correlations for the two treatment arms. We conclude with a discussion about model selection. Published in 2009 by John Wiley & Sons, Ltd. 相似文献

20.

A Survey regarding the Reporting of Simulation Studies

Walter W. Hauck Sharon Anderson 《The American statistician》2013,67(3):214-216

Hoaglin and Andrews (1975) proposed standards for computational practice and the reporting of computation-based studies. They observed that “statisticians … often pay too little attention to their own principles of design.” To see if the design and reporting had improved, we surveyed five major statistical journals for 1975, 1978, and 1981 to ascertain whether reported simulation studies involved a specified design, justified the choice of the number of iterations, and specified the random number generator(s) used. Eighteen percent of the 1,198 papers surveyed included results based on simulation. We found that 9% of the papers including a simulation study justified the choice of the number of iterations and 44% at least partially specified the random number generator. Hoaglin and Andrews's observation appears still to be true. 相似文献