期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

The value of the aggregate data approach in meta-analysis with time-to-event outcomes

Catrin Tudur Paula R. Williamson Saboor Khan & Lesley Y. Best 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2001,164(2):357-370

Collecting individual patient data has been described as the 'gold standard' for undertaking meta-analysis. If studies involve time-to-event outcomes, conducting a meta-analysis based on aggregate data can be problematical. Two meta-analyses of randomized controlled trials with time-to-event outcomes are used to illustrate the practicality and value of several proposed methods to obtain summary statistic estimates. In the first example the results suggest that further effort should be made to find unpublished trials. In the second example the use of aggregate data for trials where no individual patient data have been supplied allows the totality of evidence to be assessed and indicates previously unrecognized heterogeneity. 相似文献

2.

Empirical Bayes Small-Area Estimation Using Logistic Regression Models and Summary Statistics

Patrick J. Farrell Brenda MacGibbon Thomas J. Tomberlin 《商业与经济统计学杂志》2013,31(1):101-108

Many of the available methods for estimating small-area parameters are model-based approaches in which auxiliary variables are used to predict the variable of interest. For models that are nonlinear, prediction is not straightforward. MacGibbon and Tomberlin and Farrell, MacGibbon, and Tomberlin have proposed approaches that require microdata for all individuals in a small area. In this article, we develop a method, based on a second-order Taylor-series expansion to obtain model-based predictions, that requires only local-area summary statistics for both continuous and categorical auxiliary variables. The methodology is evaluated using data based on a U.S. Census. 相似文献

3.

A Generalized Chain Binomial Model with Aggregated Data

Ying Xu Paul S. F. Yip Richard M. Huggins 《统计学通讯:理论与方法》2013,42(18):3325-3338

In large cohort studies it can be impractical to report individual data that only summary or aggregated data are available. Using aggregated data from Bernoulli trials is expected to result in overdispersion so that a quasi-binomial approach would seem feasible. We show that when applied to aggregated data arising from cohorts of individuals according to a chain binomial model, the quasi-binomial model results in biased estimates. We propose an alternate calibration estimator and demonstrate its improved performance by simulations. The calibration method is then applied to model the probability of leaving a personal emergency link service in Hong Kong. 相似文献

4.

Patient-specific meta-analysis for risk assessment using multivariate proportional hazards regression

Michael R. Crager Gong Tang 《Journal of applied statistics》2014,41(12):2676-2695

We propose a method for assessing an individual patient's risk of a future clinical event using clinical trial or cohort data and Cox proportional hazards regression, combining the information from several studies using meta-analysis techniques. The method combines patient-specific estimates of the log cumulative hazard across studies, weighting by the relative precision of the estimates, using either fixed- or random-effects meta-analysis calculations. Risk assessment can be done for any future patient using a few key summary statistics determined once and for all from each study. Generalizations of the method to logistic regression and linear models are immediate. We evaluate the methods using simulation studies and illustrate their application using real data. 相似文献

5.

Sufficiency Revisited: Rethinking Statistical Algorithms in the Big Data Era

Jarod Y. L. Lee James J. Brown Louise M. Ryan 《The American statistician》2017,71(3):202-208

The big data era demands new statistical analysis paradigms, since traditional methods often break down when datasets are too large to fit on a single desktop computer. Divide and Recombine (D&R) is becoming a popular approach for big data analysis, where results are combined over subanalyses performed in separate data subsets. In this article, we consider situations where unit record data cannot be made available by data custodians due to privacy concerns, and explore the concept of statistical sufficiency and summary statistics for model fitting. The resulting approach represents a type of D&R strategy, which we refer to as summary statistics D&R; as opposed to the standard approach, which we refer to as horizontal D&R. We demonstrate the concept via an extended Gamma–Poisson model, where summary statistics are extracted from different databases and incorporated directly into the fitting algorithm without having to combine unit record data. By exploiting the natural hierarchy of data, our approach has major benefits in terms of privacy protection. Incorporating the proposed modelling framework into data extraction tools such as TableBuilder by the Australian Bureau of Statistics allows for potential analysis at a finer geographical level, which we illustrate with a multilevel analysis of the Australian unemployment data. Supplementary materials for this article are available online. 相似文献

6.

An assessment of the use of the continuity correction for sparse data in meta-analysis

Steadman S. Sankey M.Sc. Lisa A. Weissfeld Ph.D. Michael J. Fine M.D. M.Sc. Wishwa Kapoor M.D. M.P.H. 《统计学通讯:模拟与计算》2013,42(4):1031-1056

A problem which occurs in the practice of meta-analysis is that one or more component studies may have sparse data, such as zero events in the treatment and control groups. Two possible approaches were explored using simulations. The corrected method, in which one half was added to each cell was compared to the uncorrected method. These methods were compared over a range of sparse data situations in terms of coverage rates using three summary statistics:the Mantel-Haenszel odds ratio and the dersimonian and Laird odds ratio and rate difference. The uncorrected method performed better only when using the Mantel-Haenszel odds ratio with very little heterogeneity present. For all other sparse data applications, the continuity correction performed better and is recommended for use in meta-analyses of similar scope 相似文献

7.

The relationship between osteoporotic fracture risk and a surrogate: Apparent discrepancies between analyses based on individual patient data and summary statistics

Ian Barton 《Pharmaceutical statistics》2004,3(3):205-212

There is debate within the osteoporosis research community about the relationship between the risk of osteoporotic fracture and the surrogate measures of fracture risk. Meta‐regression analyses based on summary data have shown a linear relationship between fracture risk and surrogate measures, whereas analyses based on individual patient data (IPD) have shown a nonlinear relationship. We investigated the association between changes in a surrogate measure of fracture incidence, in this case a bone turnover marker for resorption assessed in the three risedronate phase III clinical programmes, and incident osteoporosis‐related fracture risk using regression models based on patient‐level and trial‐level information. The relationship between osteoporosis‐related fracture risk and changes in bone resorption was different when analysed on the basis of IPD than when analysed on the basis of a meta‐analytic approach (i.e., meta‐regression) using summary data (e.g., treatment effect based on treatment group estimates). This inconsistency in our findings was consistent with those in the published literature. Meta‐regression based on summary statistics at the trial level is not expected to reflect causal relationships between a clinical outcome and surrogate measures. Analyses based on IPD make possible a more comprehensive analysis since all relevant data on a patient level are available. Copyright © 2004 John Wiley & Sons Ltd. 相似文献

8.

Score tests for independence in semiparametric competing risks models

Mériem Saïd Nadia Ghazzali Louis-Paul Rivest 《Lifetime data analysis》2009,15(4):413-440

A popular model for competing risks postulates the existence of a latent unobserved failure time for each risk. Assuming that these underlying failure times are independent is attractive since it allows standard statistical tools for right-censored lifetime data to be used in the analysis. This paper proposes simple independence score tests for the validity of this assumption when the individual risks are modeled using semiparametric proportional hazards regressions. It assumes that covariates are available, making the model identifiable. The score tests are derived for alternatives that specify that copulas are responsible for a possible dependency between the competing risks. The test statistics are constructed by adding to the partial likelihoods for the individual risks an explanatory variable for the dependency between the risks. A variance estimator is derived by writing the score function and the Fisher information matrix for the marginal models as stochastic integrals. Pitman efficiencies are used to compare test statistics. A simulation study and a numerical example illustrate the methodology proposed in this paper. 相似文献

9.

Hausman-type tests for individual and time effects in the panel regression model with incomplete data

Jing Chen Rongxian Yue Jianhong Wu 《Journal of the Korean Statistical Society》2018,47(3):347-363

By comparing estimators of the variance of idiosyncratic error at different robust levels, two Hausman-type test statistics are respectively constructed for the existence of individual and time effects in the panel regression model with incomplete data. The resultant test statistics have several desired properties. Firstly, they are robust to the presence of one effect when the other is tested. Secondly, they are immune to the non-normal distribution of the disturbances since the distributional conditions are not needed in the construction of the statistics. Thirdly, they have more robust performances than the main competitors in the literature when the covariates are correlated with the effects. Additionally, they are very simple and have no heavy computational burden. Joint tests for both of the two effects are also discussed. Monte Carlo evidence shows that the proposed tests have desired finite sample properties, and a real data analysis gives further support. 相似文献

10.

A generalized BLUE approach for combining location and scale information in a meta-analysis

Xin Yang Alan D. Hutson Dongliang Wang 《Journal of applied statistics》2022,49(15):3846

相似文献

11.

我国统计学高等教育与学科建设若干问题研究 总被引：7，自引：0，他引：7

下载免费PDF全文

熊俊顺《统计研究》2001,18(4):20-23

统计学是 2 0世纪人类取得突出成就的学科之一 ,可是目前我国统计学专业高等教育和学科建设面对的现实却很严峻 :随着企业纷纷裁减合并统计从业人员 ,撤消专职统计岗位以及政府统计部门机构精简、编制压缩 ,传统的社会经济统计在市场选择中处于尴尬地位 ;高校统计学专业生源锐减 ,毕业生就业困难 ;社会经济统计学专业发展步履维艰 ,令人困惑 ;在国家颁布实施的学科分类体系中 ,统计学的归属问题分歧较大 ,引发许多争论。如国家技术监督局制定的ＧＢＴ1 4745- 96《学科分类与代码》中 ,统计学作为与经济学并列的一级学科 ,归入哲学与人文科… 相似文献

12.

Logistic regression in meta-analysis using aggregate data

Bei-Hung Chang Stuart Lipsitz Christine Waternaux 《Journal of applied statistics》2000,27(4):411-424

We derived two methods to estimate the logistic regression coefficients in a meta-analysis when only the 'aggregate' data (mean values) from each study are available. The estimators we proposed are the discriminant function estimator and the reverse Taylor series approximation. These two methods of estimation gave similar estimators using an example of individual data. However, when aggregate data were used, the discriminant function estimators were quite different from the other two estimators. A simulation study was then performed to evaluate the performance of these two estimators as well as the estimator obtained from the model that simply uses the aggregate data in a logistic regression model. The simulation study showed that all three estimators are biased. The bias increases as the variance of the covariate increases. The distribution type of the covariates also affects the bias. In general, the estimator from the logistic regression using the aggregate data has less bias and better coverage probabilities than the other two estimators. We concluded that analysts should be cautious in using aggregate data to estimate the parameters of the logistic regression model for the underlying individual data. 相似文献

13.

Spectral density-based and measure-preserving ABC for partially observed diffusion processes. An illustration on Hamiltonian SDEs

Buckwar Evelyn Tamborrino Massimiliano Tubikanec Irene 《Statistics and Computing》2020,30(3):627-648

Approximate Bayesian computation (ABC) has become one of the major tools of likelihood-free statistical inference in complex mathematical models. Simultaneously, stochastic differential equations (SDEs) have developed to an established tool for modelling time-dependent, real-world phenomena with underlying random effects. When applying ABC to stochastic models, two major difficulties arise: First, the derivation of effective summary statistics and proper distances is particularly challenging, since simulations from the stochastic process under the same parameter configuration result in different trajectories. Second, exact simulation schemes to generate trajectories from the stochastic model are rarely available, requiring the derivation of suitable numerical methods for the synthetic data generation. To obtain summaries that are less sensitive to the intrinsic stochasticity of the model, we propose to build up the statistical method (e.g. the choice of the summary statistics) on the underlying structural properties of the model. Here, we focus on the existence of an invariant measure and we map the data to their estimated invariant density and invariant spectral density. Then, to ensure that these model properties are kept in the synthetic data generation, we adopt measure-preserving numerical splitting schemes. The derived property-based and measure-preserving ABC method is illustrated on the broad class of partially observed Hamiltonian type SDEs, both with simulated data and with real electroencephalography data. The derived summaries are particularly robust to the model simulation, and this fact, combined with the proposed reliable numerical scheme, yields accurate ABC inference. In contrast, the inference returned using standard numerical methods (Euler–Maruyama discretisation) fails. The proposed ingredients can be incorporated into any type of ABC algorithm and directly applied to all SDEs that are characterised by an invariant distribution and for which a measure-preserving numerical method can be derived.

相似文献

14.

Modeling Association Between Two or More Categorical Variables that Allow for Multiple Category Choices

Christopher R. Bilder Thomas M. Loughin 《统计学通讯:理论与方法》2013,42(2):433-451

Multiple-response (or pick any/c) categorical variables summarize responses to survey questions that ask “pick any” from a set of item responses. Extensions to loglinear model methodology are proposed to model associations between these variables across all their items simultaneously. Because individual item responses to a multiple-response categorical variable are likely to be correlated, the usual chi-square distributional approximations for model-comparison statistics are not appropriate. Adjusted statistics and a new bootstrap procedure are developed to facilitate distributional approximations. Odds ratio and standardized Pearson residual measures are also developed to estimate specific associations and examine deviations from a specified model. 相似文献

15.

Buckley–James-type estimation of quantile regression with recurrent gap time data

Jung-Yu Cheng 《Journal of applied statistics》2015,42(7):1383-1401

In longitudinal studies, an individual may potentially undergo a series of repeated recurrence events. The gap times, which are referred to as the times between successive recurrent events, are typically the outcome variables of interest. Various regression models have been developed in order to evaluate covariate effects on gap times based on recurrence event data. The proportional hazards model, additive hazards model, and the accelerated failure time model are all notable examples. Quantile regression is a useful alternative to the aforementioned models for survival analysis since it can provide great flexibility to assess covariate effects on the entire distribution of the gap time. In order to analyze recurrence gap time data, we must overcome the problem of the last gap time subjected to induced dependent censoring, when numbers of recurrent events exceed one time. In this paper, we adopt the Buckley–James-type estimation method in order to construct a weighted estimation equation for regression coefficients under the quantile model, and develop an iterative procedure to obtain the estimates. We use extensive simulation studies to evaluate the finite-sample performance of the proposed estimator. Finally, analysis of bladder cancer data is presented as an illustration of our proposed methodology. 相似文献

16.

PubPredict: Prediction of progression and survival in oncology leveraging publications and early efficacy data

Jianqi Zhang Yufei Guo Junyi Zhou Hans Erik Rasmussen 《Pharmaceutical statistics》2023,22(5):963-973

In oncology/hematology early phase clinical trials, efficacies were often observed in terms of response rate, depth, timing, and duration. However, the true clinical benefits that eventually support registrational purpose are progression-free survival (PFS) and/or overall survival (OS), the follow-up of which are typically not long enough in early phase trials. This gap imposes challenges in strategies for late phase drug development. In this article, we tackle the question by leveraging published study to establish a quantitative link between early efficacy outcomes and late phase efficacy endpoints. We used solid tumor cancer as disease model. We modeled the disease course of a RECISTv1.1 assessed solid tumor with a continuous Markov chain (CMC) model. We parameterize the transition intensity matrix of a CMC model based on published aggregate-level summary statistics, and then simulate subject-level time-to-event data. The simulated data is shown to have good approximation to published studies. PFS and/or OS could be predicted with the transition intensity matrix modified given clinical knowledge to reflect various assumptions on response rate, depth, timing, and duration. The authors have built a R shiny application named PubPredict, the tool implements the algorithm described above and allows customized features including multiple response levels, treatment crossover and varying follow-up duration. This toolset has been applied to advise phase 3 trial design when only early efficacy data are available from phase 1 or 2 studies. 相似文献

17.

On goodness of fit for time series regression models

《Journal of Statistical Computation and Simulation》2012,82(3):239-256

We Propose a Bayesian approach to chech the goodness of fit for time series regression models. The test statistics is proposed by Smith (1985) based on a sequence of random variables which are independently distributed standard normal if the model is correct. We estimate this sequence of random variables using several methods. The tests of goodness of fit are performed when either the error terms violate the Gaussian assumption, or the order is incorrect, or the model is misspecified. The methodology is illustrated using both a simulation study and three real date sets. 相似文献

18.

On double hysteretic heteroskedastic model

Cathy W.S. Chen Buu-Chau Truong 《Journal of Statistical Computation and Simulation》2016,86(13):2684-2705

ABSTRACT

This paper proposes a hysteretic autoregressive model with GARCH specification and a skew Student's t-error distribution for financial time series. With an integrated hysteresis zone, this model allows both the conditional mean and conditional volatility switching in a regime to be delayed when the hysteresis variable lies in a hysteresis zone. We perform Bayesian estimation via an adaptive Markov Chain Monte Carlo sampling scheme. The proposed Bayesian method allows simultaneous inferences for all unknown parameters, including threshold values and a delay parameter. To implement model selection, we propose a numerical approximation of the marginal likelihoods to posterior odds. The proposed methodology is illustrated using simulation studies and two major Asia stock basis series. We conduct a model comparison for variant hysteresis and threshold GARCH models based on the posterior odds ratios, finding strong evidence of the hysteretic effect and some asymmetric heavy-tailness. Versus multi-regime threshold GARCH models, this new collection of models is more suitable to describe real data sets. Finally, we employ Bayesian forecasting methods in a Value-at-Risk study of the return series. 相似文献

19.

Mantel test for spatial functional data

Ramón Giraldo William Caballero Jesús Camacho-Tamayo 《AStA Advances in Statistical Analysis》2018,102(1):21-39

Statistics for spatial functional data is an emerging field in statistics which combines methods of spatial statistics and functional data analysis to model spatially correlated functional data. Checking for spatial autocorrelation is an important step in the statistical analysis of spatial data. Several statistics to achieve this goal have been proposed. The test based on the Mantel statistic is widely known and used in this context. This paper proposes an application of this test to the case of spatial functional data. Although we focus particularly on geostatistical functional data, that is functional data observed in a region with spatial continuity, the test proposed can also be applied with functional data which can be measured on a discrete set of areas of a region (areal functional data) by defining properly the distance between the areas. Based on two simulation studies, we show that the proposed test has a good performance. We illustrate the methodology by applying it to an agronomic data set. 相似文献

20.

Determining Process Death Based on Censored Activity Data

Nicholas Evangelopoulos Anna Sidorova Stergios Fotopoulos Indushobha Chengalur-Smith 《统计学通讯:模拟与计算》2013,42(8):1647-1662

This article addresses the problem of estimating the time of apparent death in a binary stochastic process. We show that, when only censored data are available, a fitted logistic regression model may estimate the time of death incorrectly. We improve this estimation by utilizing discrete-event simulation to produce simulated complete time series data. The proposed methodology may be applied to situations where time of death cannot be formally determined and has to be estimated based on prolonged inactivity. As an illustration, we use observed monthly activity patterns from 300 real Open Source Software development projects sampled from Sourceforge.net. 相似文献