期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Fusing point and areal level space–time data with application to wet deposition

Sujit K. Sahu Alan E. Gelfand David M. Holland 《Journal of the Royal Statistical Society. Series C, Applied statistics》2010,59(1):77-103

Summary. Motivated by the problem of predicting chemical deposition in eastern USA at weekly, seasonal and annual scales, the paper develops a framework for joint modelling of point- and grid-referenced spatiotemporal data in this context. The hierarchical model proposed can provide accurate spatial interpolation and temporal aggregation by combining information from observed point-referenced monitoring data and gridded output from a numerical simulation model known as the 'community multi-scale air quality model'. The technique avoids the change-of-support problem which arises in other hierarchical models for data fusion settings to combine point- and grid-referenced data. The hierarchical space–time model is fitted to weekly wet sulphate and nitrate deposition data over eastern USA. The model is validated with set-aside data from a number of monitoring sites. Predictive Bayesian methods are developed and illustrated for inference on aggregated summaries such as quarterly and annual sulphate and nitrate deposition maps. The highest wet sulphate deposition occurs near major emissions sources such as fossil-fuelled power plants whereas lower values occur near background monitoring sites. 相似文献

2.

Modelling mercury deposition through latent space–time processes

Ana G. Rappold Alan E. Gelfand David M. Holland 《Journal of the Royal Statistical Society. Series C, Applied statistics》2008,57(2):187-205

Summary. The paper provides a space–time process model for total wet mercury deposition. Key methodological features that are introduced include direct modelling of deposition rather than of expected deposition, the utilization of precipitation information (there is no deposition without precipitation) without having to construct a precipitation model and the handling of point masses at 0 in the distributions of both precipitation and deposition. The result is a specification that enables spatial interpolation and temporal prediction of deposition as well as aggregation in space or time to see patterns and trends in deposition. We use weekly deposition monitoring data from the National Atmospheric Deposition Program–Mercury Deposition Network for 2003 restricted to the eastern USA and Canada. Our spatiotemporal hierarchical model allows us to interpolate to arbitrary locations and, hence, to an arbitrary grid, enabling weekly deposition surfaces (with associated uncertainties) for this region. It also allows us to aggregate weekly depositions at coarser, quarterly and annual, temporal levels. 相似文献

3.

A Bayesian model for longitudinal count data with non-ignorable dropout

Kaciroti NA Raghunathan TE Schork MA Clark NM 《Journal of the Royal Statistical Society. Series C, Applied statistics》2008,57(5):521-534

Asthma is an important chronic disease of childhood. An intervention programme for managing asthma was designed on principles of self-regulation and was evaluated by a randomized longitudinal study.The study focused on several outcomes, and, typically, missing data remained a pervasive problem. We develop a pattern-mixture model to evaluate the outcome of intervention on the number of hospitalizations with non-ignorable dropouts. Pattern-mixture models are not generally identifiable as no data may be available to estimate a number of model parameters. Sensitivity analyses are performed by imposing structures on the unidentified parameters.We propose a parameterization which permits sensitivity analyses on clustered longitudinal count data that have missing values due to non-ignorable missing data mechanisms. This parameterization is expressed as ratios between event rates across missing data patterns and the observed data pattern and thus measures departures from an ignorable missing data mechanism. Sensitivity analyses are performed within a Bayesian framework by averaging over different prior distributions on the event ratios. This model has the advantage of providing an intuitive and flexible framework for incorporating the uncertainty of the missing data mechanism in the final analysis. 相似文献

4.

Likelihood-based confidence sets for partially identified parameters

Zhiwei Zhang 《Journal of statistical planning and inference》2009

There has been growing interest in partial identification of probability distributions and parameters. This paper considers statistical inference on parameters that are partially identified because data are incompletely observed, due to nonresponse or censoring, for instance. A method based on likelihood ratios is proposed for constructing confidence sets for partially identified parameters. The method can be used to estimate a proportion or a mean in the presence of missing data, without assuming missing-at-random or modeling the missing-data mechanism. It can also be used to estimate a survival probability with censored data without assuming independent censoring or modeling the censoring mechanism. A version of the verification bias problem is studied as well. 相似文献

5.

The Effects of Imputing the Missing Standard Deviations on the Standard Error of Meta Analysis Estimates

Nik Ruzni Nik Idris Chris Robertson 《统计学通讯:模拟与计算》2013,42(3):513-526

A common problem in the meta analysis of continuous data is that some studies do not report sufficient information to calculate the standard deviation (SDs) of the treatment effect. One of the approaches in handling this problem is through imputation. This article examines the empirical implications of imputing the missing SDs on the standard error (SE) of the overall meta analysis estimate. The simulation results show that if the SDs are missing under Missing Completely at Random and Missing at Random mechanism, imputation is recommended. With non random missing, imputation can lead to overestimation of the SE of the estimate. 相似文献

6.

Partially linear single-index model with missing responses at random

Peng Lai Qihua Wang 《Journal of statistical planning and inference》2011,141(2):1047-1058

This paper considers semiparametric partially linear single-index model with missing responses at random. Imputation approach is developed to estimate the regression coefficients, single-index coefficients and the nonparametric function, respectively. The imputation estimators for the regression coefficients and single-index coefficients are obtained by a stepwise approach. These estimators are shown to be asymptotically normal, and the estimator for the nonparametric function is proved to be asymptotically normal at any fixed point. The bandwidth problem is also considered in this paper, a delete-one cross validation method is used to select the optimal bandwidth. A simulation study is conducted to evaluate the proposed methods. 相似文献

7.

PARAMETRIC FRACTIONAL IMPUTATION FOR NON‐IGNORABLE CATEGORICAL MISSING DATA WITH FOLLOW‐UP

Ji Young Kim 《Australian & New Zealand Journal of Statistics》2012,54(2):239-250

Incomplete data subject to non‐ignorable non‐response are often encountered in practice and have a non‐identifiability problem. A follow‐up sample is randomly selected from the set of non‐respondents to avoid the non‐identifiability problem and get complete responses. Glynn, Laird, & Rubin analyzed non‐ignorable missing data with a follow‐up sample under a pattern mixture model. In this article, maximum likelihood estimation of parameters of the categorical missing data is considered with a follow‐up sample under a selection model. To estimate the parameters with non‐ignorable missing data, the EM algorithm with weighting, proposed by Ibrahim, is used. That is, in the E‐step, the weighted mean is calculated using the fractional weights for imputed data. Variances are estimated using the approximated jacknife method. Simulation results are presented to compare the proposed method with previously presented methods. 相似文献

8.

Missing data in clinical trials: control‐based mean imputation and sensitivity analysis

下载免费PDF全文

Devan V. Mehrotra Fang Liu Thomas Permutt 《Pharmaceutical statistics》2017,16(5):378-392

In some randomized (drug versus placebo) clinical trials, the estimand of interest is the between‐treatment difference in population means of a clinical endpoint that is free from the confounding effects of “rescue” medication (e.g., HbA1c change from baseline at 24 weeks that would be observed without rescue medication regardless of whether or when the assigned treatment was discontinued). In such settings, a missing data problem arises if some patients prematurely discontinue from the trial or initiate rescue medication while in the trial, the latter necessitating the discarding of post‐rescue data. We caution that the commonly used mixed‐effects model repeated measures analysis with the embedded missing at random assumption can deliver an exaggerated estimate of the aforementioned estimand of interest. This happens, in part, due to implicit imputation of an overly optimistic mean for “dropouts” (i.e., patients with missing endpoint data of interest) in the drug arm. We propose an alternative approach in which the missing mean for the drug arm dropouts is explicitly replaced with either the estimated mean of the entire endpoint distribution under placebo (primary analysis) or a sequence of increasingly more conservative means within a tipping point framework (sensitivity analysis); patient‐level imputation is not required. A supplemental “dropout = failure” analysis is considered in which a common poor outcome is imputed for all dropouts followed by a between‐treatment comparison using quantile regression. All analyses address the same estimand and can adjust for baseline covariates. Three examples and simulation results are used to support our recommendations. 相似文献

9.

Semiparametric Analysis of Isotonic Errors-in-Variables Regression Models with Missing Response

Zhimeng Sun Zhongzhan Zhang Jiang Du 《统计学通讯:理论与方法》2013,42(11):2034-2060

This article is concerned with the estimation problem in the semiparametric isotonic regression model when the covariates are measured with additive errors and the response is missing at random. An inverse marginal probability weighted imputation approach is developed to estimate the regression parameters and a least-square approach under monotone constraint is employed to estimate the functional component. We show that the proposed estimator of the regression parameter is root-n consistent and asymptotically normal and the isotonic estimator of the functional component, at a fixed point, is cubic root-n consistent. A simulation study is conducted to examine the finite-sample properties of the proposed estimators. A data set is used to demonstrate the proposed approach. 相似文献

10.

Identifying Guttman Structures in Incomplete Rasch Datasets

Lucio Bertoli-Barsotti Silvia Bacci 《统计学通讯:理论与方法》2014,43(3):470-497

In applications of IRT, it often happens that many examinees omit a substantial proportion of item responses. This can occur for various reasons, though it may well be due to no more than the simple fact of design incompleteness. In such circumstances, literature not infrequently refers to various types of estimation problem, often in terms of generic “convergence problems” in the software used to estimate model parameters. With reference to the Partial Credit Model and the instance of data missing at random, this article demonstrates that as their number increases, so does that of anomalous datasets, intended as those not corresponding to a finite estimate of (the vector parameter that identifies) the model. Moreover, the necessary and sufficient conditions for the existence and uniqueness of the maximum likelihood estimation of the Partial Credit Model (and hence, in particular, the Rasch model) in the case of incomplete data are given – with reference to the model in its more general form, the number of response categories varying according to item. A taxonomy of possible cases of anomaly is then presented, together with an algorithm useful in diagnostics. 相似文献

11.

A note on dealing with missing standard errors in meta-analyses of continuous outcome measures in WinBUGS

Stevens JW 《Pharmaceutical statistics》2011,10(4):374-378

A meta-analysis of a continuous outcome measure may involve missing standard errors. This is not a problem depending on assumptions made about the population standard deviation. Multiple imputation can be used to impute missing values while allowing for uncertainty in the imputation. Markov chain Monte Carlo simulation is a multiple imputation technique for generating posterior predictive distributions for missing data. We present an example of imputing missing variances using WinBUGS. The example highlights the importance of checking model assumptions, whether for missing or observed data. 相似文献

12.

Efficiency gains due to using missing data procedures in regression models

Theo Nijman Franz Palm 《Statistical Papers》1988,29(1):249-256

The problem of missing observations in regression models is often solved by using imputed values to complete the sample. As an alternative for static models, it has been suggested to limit the analysis to the periods or units for which all relevant variables are observed. The choice of an imputation procedure affects the asymptotic efficiency of the method used to subsequently estimate the parameters of the model. In this note, we show that the relative asymptotic efficiency of three estimators designed to handle incomplete samples depends on parameters that have a straightforward statistical interpretation. In terms of a gain of asymptotic efficiency, the use of these estimators is equivalent to the observation of a percentage of the values which are actually missing. This percentage depends on three R²-measures only, which can be straightforwardly computed in applied work. Therefore it should be easy in practice to check whether it is worthwhile to use a more elaborate estimator. 相似文献

13.

Estimating the underlying change in unemployment in the UK 总被引：2，自引：0，他引：2

Andrew Harvey & Chia-Hui Chung 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2000,163(3):303-309

By setting up a suitable time series model in state space form, the latest estimate of the underlying current change in a series may be computed by the Kalman filter. This may be done even if the observations are only available in a time-aggregated form subject to survey sampling error. A related series, possibly observed more frequently, may be used to improve the estimate of change further. The paper applies these techniques to the important problem of estimating the underlying monthly change in unemployment in the UK measured according to the definition of the International Labour Organisation by the Labour Force Survey. The fitted models suggest a reduction in root-mean-squared error of around 10% over a simple estimate based on differences if a univariate model is used and a further reduction of 50% if information on claimant counts is taken into account. With seasonally unadjusted data, the bivariate model offers a gain of roughly 40% over the use of annual differences. For both adjusted and unadjusted data, there is a further gain of around 10% if the next month's figure on claimant counts is used. The method preferred is based on a bivariate model with unadjusted data. If the next month's claimant count is known, the root-mean-squared error for the estimate of change is just over 10000. 相似文献

14.

An estimated‐score approach for dealing with missing covariate data in matched case–control studies

Samiran Sinha 《Revue canadienne de statistique》2010,38(4):680-697

Matched case–control designs are commonly used in epidemiological studies for estimating the effect of exposure variables on the risk of a disease by controlling the effect of confounding variables. Due to retrospective nature of the study, information on a covariate could be missing for some subjects. A straightforward application of the conditional logistic likelihood for analyzing matched case–control data with the partially missing covariate may yield inefficient estimators of the parameters. A robust method has been proposed to handle this problem using an estimated conditional score approach when the missingness mechanism does not depend on the disease status. Within the conditional logistic likelihood framework, an empirical procedure is used to estimate the odds of the disease for the subjects with missing covariate values. The asymptotic distribution and the asymptotic variance of the estimator when the matching variables and the completely observed covariates are categorical. The finite sample performance of the proposed estimator is assessed through a simulation study. Finally, the proposed method has been applied to analyze two matched case–control studies. The Canadian Journal of Statistics 38: 680–697; 2010 © 2010 Statistical Society of Canada 相似文献

15.

Maximum-Entropy Prior Uncertainty and Correlation of Statistical Economic Data

João D. F. Rodrigues 《商业与经济统计学杂志》2016,34(3):357-367

Empirical estimates of source statistical economic data such as trade flows, greenhouse gas emissions, or employment figures are always subject to uncertainty (stemming from measurement errors or confidentiality) but information concerning that uncertainty is often missing. This article uses concepts from Bayesian inference and the maximum entropy principle to estimate the prior probability distribution, uncertainty, and correlations of source data when such information is not explicitly provided. In the absence of additional information, an isolated datum is described by a truncated Gaussian distribution, and if an uncertainty estimate is missing, its prior equals the best guess. When the sum of a set of disaggregate data is constrained to match an aggregate datum, it is possible to determine the prior correlations among disaggregate data. If aggregate uncertainty is missing, all prior correlations are positive. If aggregate uncertainty is available, prior correlations can be either all positive, all negative, or a mix of both. An empirical example is presented, which reports relative uncertainties and correlation priors for the County Business Patterns database. In this example, relative uncertainties range from 1% to 80% and 20% of data pairs exhibit correlations below ?0.9 or above 0.9. Supplementary materials for this article are available online. 相似文献

16.

A SEMIPARAMETRIC REGRESSION MODEL WITH MISSING COVARIATES IN CONTINUOUS-TIME CAPTURE-RECAPTURE STUDIES

Yan Wang 《Australian & New Zealand Journal of Statistics》2005,47(3):287-297

Covariate data were missing when a semiparametric regression model was used to study bird abundance in the Mai Po Sanctuary, Hong Kong. This paper proposes an EM‐type algorithm to estimate the regression parameters for that study. Analytical calculation of the expectation in the EM method is difficult, or even impossible, especially when missing covariates are continuous. A Monte Carlo method is used in the EM algorithm to ease the calculation complexity. Asymptotic variances of the parameter estimates are also derived. Properties of the proposed estimators are assessed through numerical simulations and a real example. 相似文献

17.

Empirical likelihood method for non-ignorable missing data problems

Zhong Guan Jing Qin 《Lifetime data analysis》2017,23(1):113-135

Missing response problem is ubiquitous in survey sampling, medical, social science and epidemiology studies. It is well known that non-ignorable missing is the most difficult missing data problem where the missing of a response depends on its own value. In statistical literature, unlike the ignorable missing data problem, not many papers on non-ignorable missing data are available except for the full parametric model based approach. In this paper we study a semiparametric model for non-ignorable missing data in which the missing probability is known up to some parameters, but the underlying distributions are not specified. By employing Owen (1988)’s empirical likelihood method we can obtain the constrained maximum empirical likelihood estimators of the parameters in the missing probability and the mean response which are shown to be asymptotically normal. Moreover the likelihood ratio statistic can be used to test whether the missing of the responses is non-ignorable or completely at random. The theoretical results are confirmed by a simulation study. As an illustration, the analysis of a real AIDS trial data shows that the missing of CD4 counts around two years are non-ignorable and the sample mean based on observed data only is biased. 相似文献

18.

Handling of missing data in long‐term clinical trials: a case study

Mark Janssens Geert Molenberghs René Kerstens 《Pharmaceutical statistics》2012,11(6):442-448

Missing data in clinical trials is a well‐known problem, and the classical statistical methods used can be overly simple. This case study shows how well‐established missing data theory can be applied to efficacy data collected in a long‐term open‐label trial with a discontinuation rate of almost 50%. Satisfaction with treatment in chronically constipated patients was the efficacy measure assessed at baseline and every 3 months postbaseline. The improvement in treatment satisfaction from baseline was originally analyzed with a paired t‐test ignoring missing data and discarding the correlation structure of the longitudinal data. As the original analysis started from missing completely at random assumptions regarding the missing data process, the satisfaction data were re‐examined, and several missing at random (MAR) and missing not at random (MNAR) techniques resulted in adjusted estimate for the improvement in satisfaction over 12 months. Throughout the different sensitivity analyses, the effect sizes remained significant and clinically relevant. Thus, even for an open‐label trial design, sensitivity analysis, with different assumptions for the nature of dropouts (MAR or MNAR) and with different classes of models (selection, pattern‐mixture, or multiple imputation models), has been found useful and provides evidence towards the robustness of the original analyses; additional sensitivity analyses could be undertaken to further qualify robustness. Copyright © 2012 John Wiley & Sons, Ltd. 相似文献

19.

On the analysis of long-term experiments

Thomas M. Loughin Mollie Poehlman Roediger George A. Milliken John P. Schmidt 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2007,170(1):29-42

Summary. Long-term experiments are commonly used tools in agronomy, soil science and other disciplines for comparing the effects of different treatment regimes over an extended length of time. Periodic measurements, typically annual, are taken on experimental units and are often analysed by using customary tools and models for repeated measures. These models contain nothing that accounts for the random environmental variations that typically affect all experimental units simultaneously and can alter treatment effects. This added variability can dominate that from all other sources and can adversely influence the results of a statistical analysis and interfere with its interpretation. The effect that this has on the standard repeated measures analysis is quantified by using an alternative model that allows for random variations over time. This model, however, is not useful for analysis because the random effects are confounded with fixed effects that are already in the repeated measures model. Possible solutions are reviewed and recommendations are made for improving statistical analysis and interpretation in the presence of these extra random variations. 相似文献

20.

Did the Military Interventions in the Mexican Drug War Increase Violence?

Valeria Espinosa Donald B. Rubin 《The American statistician》2015,69(1):17-27

We analyze publicly available data to estimate the causal effects of military interventions on the homicide rates in certain problematic regions in Mexico. We use the Rubin causal model to compare the post-intervention homicide rate in each intervened region to the hypothetical homicide rate for that same year had the military intervention not taken place. Because the effect of a military intervention is not confined to the municipality subject to the intervention, a nonstandard definition of units is necessary to estimate the causal effect of the intervention under the standard no-interference assumption of stable-unit treatment value assumption (SUTVA). Donor pools are created for each missing potential outcome under no intervention, thereby allowing for the estimation of unit-level causal effects. A multiple imputation approach accounts for uncertainty about the missing potential outcomes. 相似文献