期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Recurrent events analysis in the presence of time-dependent covariates and dependent censoring

Maja Miloslavsky Sündüz Kele&#; Mark J. van der Laan Steve Butler 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2004,66(1):239-257

Summary. Recurrent events models have had considerable attention recently. The majority of approaches show the consistency of parameter estimates under the assumption that censoring is independent of the recurrent events process of interest conditional on the covariates that are included in the model. We provide an overview of available recurrent events analysis methods and present an inverse probability of censoring weighted estimator for the regression parameters in the Andersen–Gill model that is commonly used for recurrent event analysis. This estimator remains consistent under informative censoring if the censoring mechanism is estimated consistently, and it generally improves on the naïve estimator for the Andersen–Gill model in the case of independent censoring. We illustrate the bias of ad hoc estimators in the presence of informative censoring with a simulation study and provide a data analysis of recurrent lung exacerbations in cystic fibrosis patients when some patients are lost to follow-up. 相似文献

2.

Non-stationary spatiotemporal analysis of karst water levels

I. L. Dryden L. Márkus C. C. Taylor J. Kovács 《Journal of the Royal Statistical Society. Series C, Applied statistics》2005,54(3):673-690

Summary. We consider non-stationary spatiotemporal modelling in an investigation into karst water levels in western Hungary. A strong feature of the data set is the extraction of large amounts of water from mines, which caused the water levels to reduce until about 1990 when the mining ceased, and then the levels increased quickly. We discuss some traditional hydrogeological models which might be considered to be appropriate for this situation, and various alternative stochastic models. In particular, a separable space–time covariance model is proposed which is then deformed in time to account for the non-stationary nature of the lagged correlations between sites. Suitable covariance functions are investigated and then the models are fitted by using weighted least squares and cross-validation. Forecasting and prediction are carried out by using spatiotemporal kriging. We assess the performance of the method with one-step-ahead forecasting and make comparisons with naïve estimators. We also consider spatiotemporal prediction at a set of new sites. The new model performs favourably compared with the deterministic model and the naïve estimators, and the deformation by time shifting is worthwhile. 相似文献

3.

Predicting the outcomes of annual sporting contests

Rose Baker Philip Scarf 《Journal of the Royal Statistical Society. Series C, Applied statistics》2006,55(2):225-239

Summary. Data from 20 sporting contests in which the same two teams compete regularly are studied. Strong and weak symmetry requirements for possible models are identified, and some simple models are proposed and fitted to the data. The need to compute the exact likelihood function and the presence of missing values make this non-trivial. Forecasting match outcomes by using the models can give a modest improvement over a naïve forecast. Significance tests for studying the effect of 'match covariates' such as playing at home or away or winning the toss are introduced, and the effect of these covariates is in general found to be quite large. 相似文献

4.

On variable bandwidth selection in local polynomial regression

Kjell Doksum Derick Peterson & Alex Samarov 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2000,62(3):431-448

The performances of data-driven bandwidth selection procedures in local polynomial regression are investigated by using asymptotic methods and simulation. The bandwidth selection procedures considered are based on minimizing 'prelimit' approximations to the (conditional) mean-squared error (MSE) when the MSE is considered as a function of the bandwidth h . We first consider approximations to the MSE that are based on Taylor expansions around h=0 of the bias part of the MSE. These approximations lead to estimators of the MSE that are accurate only for small bandwidths h . We also consider a bias estimator which instead of using small h approximations to bias naïvely estimates bias as the difference of two local polynomial estimators of different order and we show that this estimator performs well only for moderate to large h . We next define a hybrid bias estimator which equals the Taylor-expansion-based estimator for small h and the difference estimator for moderate to large h . We find that the MSE estimator based on this hybrid bias estimator leads to a bandwidth selection procedure with good asymptotic and, for our Monte Carlo examples, finite sample properties. 相似文献

5.

Anticipating catastrophes through extreme value modelling 总被引：11，自引：0，他引：11

Stuart Coles Luis Pericchi 《Journal of the Royal Statistical Society. Series C, Applied statistics》2003,52(4):405-416

Summary. When catastrophes strike it is easy to be wise after the event. It is also often argued that such catastrophic events are unforeseeable, or at least so implausible as to be negligible for planning purposes. We consider these issues in the context of daily rainfall measurements recorded in Venezuela. Before 1999 simple extreme value techniques were used to assess likely future levels of extreme rainfall, and these gave no particular cause for concern. In December 1999 a daily precipitation event of more than 410 mm, almost three times the magnitude of the previously recorded maximum, caused devastation and an estimated 30000 deaths. We look carefully at the previous history of the process and offer an extreme value analysis of the data—with some methodological novelty—that suggests that the 1999 event was much more plausible than the previous analyses had claimed. Deriving design parameters from the results of such an analysis may have had some mitigating effects on the consequences of the subsequent disaster. The themes of the new analysis are simple: the full exploitation of available data, proper accounting of uncertainty, careful interpretation of asymptotic limit laws and allowance for non-stationarity. The effect on the Venezuelan data analysis is dramatic. The broader implications are equally dramatic; that a naïve use of extreme value techniques is likely to lead to a false sense of security that might have devastating consequences in practice. 相似文献

6.

Variance of the number of false discoveries

Art B. Owen 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2005,67(3):411-426

Summary. In high throughput genomic work, a very large number d of hypotheses are tested based on n ≪ d data samples. The large number of tests necessitates an adjustment for false discoveries in which a true null hypothesis was rejected. The expected number of false discoveries is easy to obtain. Dependences between the hypothesis tests greatly affect the variance of the number of false discoveries. Assuming that the tests are independent gives an inadequate variance formula. The paper presents a variance formula that takes account of the correlations between test statistics. That formula involves O ( d ²) correlations, and so a naïve implementation has cost O ( nd ²). A method based on sampling pairs of tests allows the variance to be approximated at a cost that is independent of d . 相似文献

7.

Bootstrap adjustments for empirical bayes interval estimates of small-area proportions

Patrick J. Farrell Brenda Macgibbon Thomas J. Tomberlin 《Revue canadienne de statistique》1997,25(1):75-89

Empirical Bayes approaches have often been applied to the problem of estimating small-area parameters. As a compromise between synthetic and direct survey estimators, an estimator based on an empirical Bayes procedure is not subject to the large bias that is sometimes associated with a synthetic estimator, nor is it as variable as a direct survey estimator. Although the point estimates perform very well, naïve empirical Bayes confidence intervals tend to be too short to attain the desired coverage probability, since they fail to incorporate the uncertainty which results from having to estimate the prior distribution. Several alternative methodologies for interval estimation which correct for the deficiencies associated with the naïve approach have been suggested. Laird and Louis (1987) proposed three types of bootstrap for correcting naïve empirical Bayes confidence intervals. Calling the methodology of Laird and Louis (1987) an unconditional bias-corrected naïve approach, Carlin and Gelfand (1991) suggested a modification to the Type III parametric bootstrap which corrects for bias in the naïve intervals by conditioning on the data. Here we empirically evaluate the Type II and Type III bootstrap proposed by Laird and Louis, as well as the modification suggested by Carlin and Gelfand (1991), with the objective of examining coverage properties of empirical Bayes confidence intervals for small-area proportions. 相似文献

8.

Linkage bias in estimating the association between childhood exposures and propensity to become a mother: an example of simple sensitivity analyses

D. Nitsch B. L. DeStavola S. M. B. Morton D. A. Leon 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2006,169(3):493-505

Summary. Record linkage is a powerful tool to obtain individual follow-up information that is held in routinely collected databases. However, this method is potentially limited not only by the quality of the original data but also by the temporal and geographic coverage of the routine data. Migration in particular is a factor that might introduce systematic bias even in analyses of data covering relatively large geographical areas. We describe a linkage application where emigration bias might be an issue and use the sensitivity analysis approach that has been described by Molenberghs and co-workers and Kenward and co-workers to assess the extent of this bias. 相似文献

9.

Assessing accuracy of a continuous screening test in the presence of verification bias 总被引：1，自引：1，他引：0

Todd A. Alonzo Margaret Sullivan Pepe 《Journal of the Royal Statistical Society. Series C, Applied statistics》2005,54(1):173-190

Summary. In studies to assess the accuracy of a screening test, often definitive disease assessment is too invasive or expensive to be ascertained on all the study subjects. Although it may be more ethical or cost effective to ascertain the true disease status with a higher rate in study subjects where the screening test or additional information is suggestive of disease, estimates of accuracy can be biased in a study with such a design. This bias is known as verification bias. Verification bias correction methods that accommodate screening tests with binary or ordinal responses have been developed; however, no verification bias correction methods exist for tests with continuous results. We propose and compare imputation and reweighting bias-corrected estimators of true and false positive rates, receiver operating characteristic curves and area under the receiver operating characteristic curve for continuous tests. Distribution theory and simulation studies are used to compare the proposed estimators with respect to bias, relative efficiency and robustness to model misspecification. The bias correction estimators proposed are applied to data from a study of screening tests for neonatal hearing loss. 相似文献

10.

Adjusting ROC curves for covariates in the presence of verification bias

Ronen Fluss Benjamin ReiserDavid Faraggi 《Journal of statistical planning and inference》2012,142(1):1-11

The ROC (receiver operating characteristic) curve is frequently used for describing effectiveness of a diagnostic marker or test. Classical estimation of the ROC curve uses independent identically distributed samples taken randomly from the healthy and diseased populations. Frequently not all subjects undergo a definitive gold standard assessment of disease status (verification). Estimation of the ROC curve based on data only from subjects with verified disease status may be badly biased (verification bias). In this work we investigate the properties of the doubly robust (DR) method for estimating the ROC curve adjusted for covariates (ROC regression) under verification bias. We develop the estimator's asymptotic distribution and examine its finite sample size properties via a simulation study. We apply this procedure to fingerstick postprandial blood glucose measurement data adjusting for age. 相似文献

11.

Trends in inequality in infant mortality in the north of England, 1921–1973, and their association with urban and social structure

P. Congdon H. Southall 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2005,168(4):679-700

Summary. The paper analyses a time series of infant mortality rates in the north of England from 1921 to the early 1970s at a spatial scale that is more disaggregated than in previous studies of infant mortality trends in this period. The paper describes regression methods to obtain mortality gradients over socioeconomic indicators from the censuses of 1931, 1951, 1961 and 1971 and to assess whether there is any evidence for widening spatial inequalities in infant mortality outcomes against a background of an overall reduction in the infant mortality rate. Changes in the degree of inequality are also formally assessed by inequality measures such as the Gini and Theil indices, for which sampling densities are obtained and significant changes assessed. The analysis concerns a relatively infrequent outcome (especially towards the end of the period that is considered) and a high proportion of districts with small populations, so necessitating the use of appropriate methods for deriving indices of inequality and for regression modelling. 相似文献

12.

Hypothesis Tests for Neyman's Bias in Case–Control Studies

D. M. Swanson C. D. Anderson R. A. Betensky 《Journal of applied statistics》2018,45(11):1956-1977

Survival bias is a long recognized problem in case–control studies, and many varieties of bias can come under this umbrella term. We focus on one of them, termed Neyman's bias or ‘prevalence–incidence bias’. It occurs in case–control studies when exposure affects both disease and disease-induced mortality, and we give a formula for the observed, biased odds ratio under such conditions. We compare our result with previous investigations into this phenomenon and consider models under which this bias may or may not be important. Finally, we propose three hypothesis tests to identify when Neyman's bias may be present in case–control studies. We apply these tests to three data sets, one of stroke mortality, another of brain tumors, and the last of atrial fibrillation, and find some evidence of Neyman's bias in the former two cases, but not the last case. 相似文献

13.

Inference Methods for Correlated Left Truncated Lifetimes: Parent and Offspring Relations in an Adoption Study

Petersen L Sørensen TI Nielsen GG Andersen PK 《Lifetime data analysis》2006,12(1):5-20

The associations in mortality of adult adoptees and their biological or adoptive parents have been studied in order to separate genetic and environmental influences. The 1003 Danish adoptees born 1924–26 have previously been analysed in a Cox regression model, using dichotomised versions of the parents’ lifetimes as covariates. This model will be referred to as the conditional Cox model, as it analyses lifetimes of adoptees conditional on parental lifetimes. Shared frailty models may be more satisfactory by using the entire observed lifetime of the parents. In a simulation study, sample size, distribution of lifetimes, truncation- and censoring patterns were chosen to illustrate aspects of the adoption dataset, and were generated from the conditional Cox model or a shared frailty model with gamma distributed frailties. First, efficiency was compared in the conditional Cox model and a shared frailty model, based on the conditional approach. For data with type 1 censoring the models showed no differences, whereas in data with random or no censoring, the models had different power in favour of the one from which data were generated. Secondly, estimation in the shared frailty model by a conditional approach or a two-stage copula approach was compared. Both approaches worked well, with no sign of dependence upon the truncation pattern, but some sign of bias depending on the censoring. For frailty parameters close to zero, we found bias when the estimation procedure used did not allow negative estimates. Based on this evaluation, we prefer to use frailty models allowing for negative frailty parameter estimates. The conclusions from earlier analyses of the adoption study were confirmed, though without greater precision than using the conditional Cox model. Analyses of associations between parental lifetimes are also presented. 相似文献

14.

Asymptotic bias in the linear mixed effects model under non-ignorable missing data mechanisms

Chandan Saha Michael P. Jones 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2005,67(1):167-182

Summary. In longitudinal studies, missingness of data is often an unavoidable problem. Estimators from the linear mixed effects model assume that missing data are missing at random. However, estimators are biased when this assumption is not met. In the paper, theoretical results for the asymptotic bias are established under non-ignorable drop-out, drop-in and other missing data patterns. The asymptotic bias is large when the drop-out subjects have only one or no observation, especially for slope-related parameters of the linear mixed effects model. In the drop-in case, intercept-related parameter estimators show substantial asymptotic bias when subjects enter late in the study. Eight other missing data patterns are considered and these produce asymptotic biases of a variety of magnitudes. 相似文献

15.

Variable Selection for Panel Count Data via Non-Concave Penalized Estimating Function

XINGWEI TONG XIN HE LIUQUAN SUN JIANGUO SUN 《Scandinavian Journal of Statistics》2009,36(4):620-635

Abstract. Variable selection is an important issue in all regression analyses, and in this paper we discuss this in the context of regression analysis of panel count data. Panel count data often occur in long-term studies that concern occurrence rate of a recurrent event, and their analysis has recently attracted a great deal of attention. However, there does not seem to exist any established approach for variable selection with respect to panel count data. For the problem, we adopt the idea behind the non-concave penalized likelihood approach and develop a non-concave penalized estimating function approach. The proposed methodology selects variables and estimates regression coefficients simultaneously, and an algorithm is presented for this process. We show that the proposed procedure performs as well as the oracle procedure in that it yields the estimates as if the correct submodel were known. Simulation studies are conducted for assessing the performance of the proposed approach and suggest that it works well for practical situations. An illustrative example from a cancer study is provided. 相似文献

16.

On difference-based variance estimation in nonparametric regression when the covariate is high dimensional

Axel Munk Nicolai Bissantz Thorsten Wagner Gudrun Freitag 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2005,67(1):19-41

Summary. We consider the problem of estimating the noise variance in homoscedastic nonparametric regression models. For low dimensional covariates t ∈ R ^d, d =1, 2, difference-based estimators have been investigated in a series of papers. For a given length of such an estimator, difference schemes which minimize the asymptotic mean-squared error can be computed for d =1 and d =2. However, from numerical studies it is known that for finite sample sizes the performance of these estimators may be deficient owing to a large finite sample bias. We provide theoretical support for these findings. In particular, we show that with increasing dimension d this becomes more drastic. If d 4, these estimators even fail to be consistent. A different class of estimators is discussed which allow better control of the bias and remain consistent when d 4. These estimators are compared numerically with kernel-type estimators (which are asymptotically efficient), and some guidance is given about when their use becomes necessary. 相似文献

17.

Does social mobility affect the size of the socioeconomic mortality differential?: evidence from the Office for National Statistics Longitudinal Study 总被引：2，自引：0，他引：2

D. Blane S. Harding & M. Rosato 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》1999,162(1):59-70

The effect of social mobility on the socioeconomic differential in mortality is examined with data from the Office for National Statistics Longitudinal Study. The analyses involve 46 980 men aged 45–64 years in 1981. The mortality risk of the socially mobile is compared with the mortality risk of the socially stable after adjustment for their class of origin (their social class in 1971) and class of destination (their social class in 1981) separately. Among those in employment there is some evidence that movement out of their class of origin is in the direction predicted by the idea of health-related social mobility. This evidence, however, seems strongest for causes of death which are least likely to have been preceded by prolonged incapacity. Movement into the class of destination, however, shows the opposite relationship with mortality. Compared with the socially stable members of their class of destination, the upwardly mobile tend to have higher mortality and the downwardly mobile tend to have lower mortality. This relationship with the class of destination, it is suggested, may explain why socioeconomic mortality differentials do not widen with increasing age. 相似文献

18.

Landmark estimation of survival and treatment effects in observational studies

Layla Parast Beth Ann Griffin 《Lifetime data analysis》2017,23(2):161-182

Clinical studies aimed at identifying effective treatments to reduce the risk of disease or death often require long term follow-up of participants in order to observe a sufficient number of events to precisely estimate the treatment effect. In such studies, observing the outcome of interest during follow-up may be difficult and high rates of censoring may be observed which often leads to reduced power when applying straightforward statistical methods developed for time-to-event data. Alternative methods have been proposed to take advantage of auxiliary information that may potentially improve efficiency when estimating marginal survival and improve power when testing for a treatment effect. Recently, Parast et al. (J Am Stat Assoc 109(505):384–394, 2014) proposed a landmark estimation procedure for the estimation of survival and treatment effects in a randomized clinical trial setting and demonstrated that significant gains in efficiency and power could be obtained by incorporating intermediate event information as well as baseline covariates. However, the procedure requires the assumption that the potential outcomes for each individual under treatment and control are independent of treatment group assignment which is unlikely to hold in an observational study setting. In this paper we develop the landmark estimation procedure for use in an observational setting. In particular, we incorporate inverse probability of treatment weights (IPTW) in the landmark estimation procedure to account for selection bias on observed baseline (pretreatment) covariates. We demonstrate that consistent estimates of survival and treatment effects can be obtained by using IPTW and that there is improved efficiency by using auxiliary intermediate event and baseline information. We compare our proposed estimates to those obtained using the Kaplan–Meier estimator, the original landmark estimation procedure, and the IPTW Kaplan–Meier estimator. We illustrate our resulting reduction in bias and gains in efficiency through a simulation study and apply our procedure to an AIDS dataset to examine the effect of previous antiretroviral therapy on survival. 相似文献

19.

Likelihood inference for correlated binary data without any information about the joint distributions

Tsung-Shan Tsou Wei-Cheng Hsiao 《统计学通讯:理论与方法》2017,46(5):2151-2160

We propose a universal robust likelihood that is able to accommodate correlated binary data without any information about the underlying joint distributions. This likelihood function is asymptotically valid for the regression parameter for any underlying correlation configurations, including varying under- or over-dispersion situations, which undermines one of the regularity conditions ensuring the validity of crucial large sample theories. This robust likelihood procedure can be easily implemented by using any statistical software that provides naïve and sandwich covariance matrices for regression parameter estimates. Simulations and real data analyses are used to demonstrate the efficacy of this parametric robust method. 相似文献

20.

The Effect of Verification Bias in the Naïve Estimators of Accuracy of a Binary Diagnostic Test

J. A. Roldán Nofuentes J. D. Luna del Castillo 《统计学通讯:模拟与计算》2013,42(5):959-972

The assessment of a binary diagnostic test requires a knowledge of the disease status of all the patients in the sample through the application of a gold standard. In practice, the gold standard is not always applied to all of the patients, which leads to the problem of partial verification of the disease. When the accuracy of the diagnostic test is assessed using only those patients whose disease status has been verified using the gold standard, the estimators obtained in this way, known as Naïve estimators, may be biased. In this study, we obtain the explicit expressions of the bias of the Naïve estimators of sensitivity and specificity of a binary diagnostic test. We also carry out simulation experiments in order to study the effect of the verification probabilities on the Naïve estimators of sensitivity and specificity. 相似文献