1.

A three-sample multiple-recapture approach to census population estimation with heterogeneous catchability

Darroch JN Fienberg SE Glonek GF Junker BW 《Journal of the American Statistical Association》1993,88(423):1,137-1,148

"A central assumption in the standard capture-recapture approach to the estimation of the size of a closed population is the homogeneity of the 'capture' probabilities. In this article we develop an approach that allows for varying susceptibility to capture through individual parameters using a variant of the Rasch model from psychological measurement situations. Our approach requires an additional recapture. In the context of census undercount estimation, this requirement amounts to the use of a second independent sample or alternative data source to be matched with census and Post-Enumeration Survey (PES) data.... We illustrate [our] models and their estimation using data from a 1988 dress-rehearsal study for the 1990 census conducted by the U.S. Bureau of the Census, which explored the use of administrative data as a supplement to the PES. The article includes a discussion of extensions and related models."

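As a point of reference for the "standard capture-recapture approach" mentioned in the abstract above, the following is a minimal sketch of the classical two-sample (Lincoln-Petersen, or dual-system) estimator, which relies on the homogeneity assumption the article relaxes. It is not the three-sample Rasch-type model of the article; the function name and the counts are hypothetical and chosen only for illustration.

```python
def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Two-sample estimate of a closed population's size.

    n1: individuals counted in the first list (e.g. a census)
    n2: individuals counted in the second list (e.g. a follow-up survey)
    m:  individuals matched in both lists
    Assumes equal capture probabilities and independent lists.
    """
    if m <= 0:
        raise ValueError("at least one matched individual is required")
    return n1 * n2 / m


# Hypothetical counts, for illustration only.
estimate = lincoln_petersen(n1=900, n2=100, m=90)
print(estimate)  # 1000.0
```

When capture probabilities vary across individuals, this estimator is generally biased, which is the motivation for the additional recapture described in the abstract.
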
2.

A comparison of using weighted distribution and joint modeling for analyzing non-ignorable missing responses

Zahra Sadat Meshkani Farahani Mojtaba Ganjali 《Communications in Statistics - Simulation and Computation》2019,48(3):704-722

In this study, we reconsider weighted distributions from the perspective of the missing-data mechanism, since a weighted distribution is the distribution of the respondents (a sub-population) rather than of the whole population of interest. After defining some weighted distributions by different mechanisms for the response indicator, we show through simulation studies that using weighted distributions may lead to biased parameter estimates under a non-ignorable missing mechanism. On the other hand, joint modeling of the response and the selection mechanism can yield more efficient and valid parameter estimates. The root mean squared errors of the estimates from the joint modeling approach are lower than those from the weighted distribution approach, which supports the claim that joint modeling is the more efficient method; this is shown by diverse simulation studies throughout the article. However, the two methods give similar results if the selection mechanism is at random. Finally, the methods are applied and compared in the analysis of a widely used real dataset.
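
The bias that arises when only respondents are analysed under a non-ignorable mechanism can be illustrated with a small simulation. This is a sketch under assumed settings, not the authors' simulation design: the standard normal outcome, the logistic response model, and the function name `respondent_mean` are all assumptions made for illustration.

```python
import math
import random


def respondent_mean(n: int, slope: float, seed: int = 0) -> float:
    """Mean of the observed outcomes when the response probability depends on y.

    y ~ N(0, 1), so the true population mean is 0.
    P(respond | y) = 1 / (1 + exp(-slope * y)).
    slope = 0 gives a response probability of 0.5 for everyone
    (missing completely at random); slope != 0 makes the
    missingness depend on the unobserved value itself (non-ignorable).
    """
    rng = random.Random(seed)
    observed = []
    for _ in range(n):
        y = rng.gauss(0.0, 1.0)
        p_respond = 1.0 / (1.0 + math.exp(-slope * y))
        if rng.random() < p_respond:
            observed.append(y)
    return sum(observed) / len(observed)


print(respondent_mean(20000, slope=0.0))  # close to the true mean of 0
print(respondent_mean(20000, slope=2.0))  # clearly above 0: respondents over-represent large y
```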

3.

On inference for Kendall's τ within a longitudinal data setting

Yan Ma 《Journal of applied statistics》2012,39(11):2441-2452

Kendall's τ is a non-parametric measure of correlation based on ranks and is used in a wide range of research disciplines. Although methods are available for making inference about Kendall's τ, none has been extended to modeling multiple Kendall's τs arising in longitudinal data analysis. Compounding this problem is the pervasive issue of missing data in such study designs. In this article, we develop a novel approach to provide inference about Kendall's τ within a longitudinal study setting under both complete and missing data. The proposed approach is illustrated with simulated data and applied to an HIV prevention study.
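
For readers unfamiliar with the statistic, the following sketch computes the basic sample Kendall's τ (the τ-a variant, with no tie correction) for a single pair of complete variables by counting concordant and discordant pairs. It does not implement the longitudinal or missing-data inference developed in the article; the function name is an assumption for illustration.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Pairs tied in x or y count as neither concordant nor discordant.
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


print(kendall_tau_a([1, 2, 3, 4], [1, 2, 3, 4]))  # 1.0
print(kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1]))  # -1.0
```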

4.

Assessing between-block heterogeneity within the post-strata of the 1990 Post-Enumeration Survey

Hengartner N Speed TP 《Journal of the American Statistical Association》1993,88(423):1,119-1,129

"The 1990 [U.S.] Post-Enumeration Survey (PES) stratified the population into 1,392 subpopulations called post-strata based on location, race, tenure, sex and age, in the hope that these subpopulations were homogeneous in relation to factors affecting the Census coverage.... With block-level data from the PES for sites around Detroit and Texas, we are able to examine empirically the extent to which this hope was realized. Using various measures, we find that between-block variation in erroneous enumeration and gross omission rates is about the same magnitude as, and largely in addition to, the corresponding between-post-stratum variation." Comments by Joseph L. Schafer and Donald Ylvisaker and a rejoinder by the authors are included (pp. 1,125-9).

5.

A pseudo likelihood approach to analyze rate difference of binary response data in longitudinal factorial studies

Chunpeng Fan 《Communications in Statistics - Simulation and Computation》2018,47(7):2169-2183

Although Fan showed that the mixed-effects model for repeated measures (MMRM) is appropriate for analyzing complete longitudinal binary data in terms of the rate difference, that work focused on using generalized estimating equations (GEE) to make statistical inference. The current article emphasizes the validity of the MMRM when the normal-distribution-based pseudo likelihood approach is used to make inference for complete longitudinal binary data. For incomplete longitudinal binary data under a missing at random mechanism, however, the MMRM, using either the GEE or the normal-distribution-based pseudo likelihood inferential procedure, gives biased results in general and should not be used for analysis.

6.

Multiple imputation compared with restricted pseudo‐likelihood and generalized estimating equations for analysis of binary repeated measures in clinical studies

Ilya Lipkovich Yuyan Duan Saeeduddin Ahmed 《Pharmaceutical statistics》2005,4(4):267-285

Non‐likelihood‐based methods for repeated measures analysis of binary data in clinical trials can result in biased estimates of treatment effects and associated standard errors when the dropout process is not completely at random. We tested the utility of a multiple imputation approach in reducing these biases. Simulations were used to compare performance of multiple imputation with generalized estimating equations and restricted pseudo‐likelihood in five representative clinical trial profiles for estimating (a) overall treatment effects and (b) treatment differences at the last scheduled visit. In clinical trials with moderate to high (40–60%) dropout rates with dropouts missing at random, multiple imputation led to less biased and more precise estimates of treatment differences for binary outcomes based on underlying continuous scores. Copyright © 2005 John Wiley & Sons, Ltd.

7.

Some asymptotic results for semiparametric nonlinear mixed-effects models with incomplete data

Wei Liu Lang Wu 《Journal of statistical planning and inference》2010,140(1):52-64

In modeling complex longitudinal data, semiparametric nonlinear mixed-effects (SNLME) models are very flexible and useful. Covariates are often introduced in the models to partially explain the inter-individual variations. In practice, data are often incomplete in the sense that there are often measurement errors and missing data in longitudinal studies. The likelihood method is a standard approach for inference for these models but it can be computationally very challenging, so computationally efficient approximate methods are quite valuable. However, the performance of these approximate methods is often based on limited simulation studies, and theoretical results are unavailable for many approximate methods. In this article, we consider a computationally efficient approximate method for a class of SNLME models with incomplete data and investigate its theoretical properties. We show that the estimates based on the approximate method are consistent and asymptotically normally distributed.

8.

Estimating heterogeneity in the probabilities of enumeration for dual-system estimation

Alho JM Mulry MH Wurdeman K Kim J 《Journal of the American Statistical Association》1993,88(423):1,130-1,136

"We show how conditional logistic regression can be used to estimate the probability of being enumerated in a census and apply the model to the 1990 Post-Enumeration Survey (PES) in the United States.... We discuss some special problems caused by the fact that the PES sample area is open to migration between the captures. We also consider the effect of data errors in estimation. We characterize hard-to-enumerate populations and give some tentative estimates of correlation bias."

9.

Estimation in Regressive Logistic Regression Analyses of Familial Data with Missing Outcomes

Patrick E.B. FitzGerald & Matthew W. Knuiman 《Australian & New Zealand Journal of Statistics》1998,40(3):305-316

This paper examines a number of methods of handling missing outcomes in regressive logistic regression modelling of familial binary data, and compares them with an EM algorithm approach via a simulation study. The results indicate that a strategy based on imputation of missing values leads to biased estimates, and that a strategy of excluding incomplete families has a substantial effect on the variability of the parameter estimates. Recommendations are made which depend, amongst other factors, on the amount of missing data and on the availability of software.

10.

Modeling sensitivity and specificity with a time-varying reference standard within a longitudinal setting

Qin Yu Wan Tang Sue Marcus Yan Ma Hui Zhang 《Journal of applied statistics》2010,37(7):1213-1230

Diagnostic tests are used in a wide range of behavioral, medical, psychosocial, and healthcare-related research. Test sensitivity and specificity are the most popular measures of accuracy for diagnostic tests. Available methods for analyzing longitudinal study designs assume fixed gold or reference standards and as such do not apply to studies with dynamically changing reference standards, which are especially popular in psychosocial research. In this article, we develop a novel approach to address missing data and other related issues for modeling sensitivity and specificity within such a time-varying reference standard setting. The approach is illustrated with real as well as simulated data.

11.

The impact of missing data on the results of a schizophrenia study

Denis Rybin Gheorghe Doros Robert Rosenheck Robert Lew 《Pharmaceutical statistics》2015,14(1):4-10

Missing data pose a serious challenge to the integrity of randomized clinical trials, especially of treatments for prolonged illnesses such as schizophrenia, in which long‐term impact assessment is of great importance, but the follow‐up rates are often no more than 50%. Sensitivity analysis using Bayesian modeling for missing data offers a systematic approach to assessing the sensitivity of the inferences made on the basis of observed data. This paper uses data from an 18‐month study of veterans with schizophrenia to demonstrate this approach. Data were obtained from a randomized clinical trial involving 369 patients diagnosed with schizophrenia that compared long‐acting injectable risperidone with a psychiatrist's choice of oral treatment. Bayesian analysis utilizing a pattern‐mixture modeling approach was used to validate the reported results by detecting bias due to non‐random patterns of missing data. The analysis was applied to several outcomes including standard measures of schizophrenia symptoms, quality of life, alcohol use, and global mental status. The original study results for several measures were confirmed against a wide range of patterns of non‐random missingness. Robustness of the conclusions was assessed using sensitivity parameters. The missing data in the trial did not likely threaten the validity of previously reported results. Copyright © 2014 John Wiley & Sons, Ltd.

12.

Addressing the problem of missing data in decision tree modeling

Saiedeh Haji-Maghsoudi Azam Rastegari Behshid Garrusi 《Journal of applied statistics》2018,45(3):547-557

Tree-based models (TBMs) can substitute missing data using the surrogate approach (SUR). The aim of this study is to compare the performance of statistical imputation against the performance of SUR in TBMs. Employing empirical data, a TBM was constructed. Thereafter, 10%, 20%, and 40% of the values of the variable that appeared as the first split were deleted and then imputed without and with the use of the outcome variable in the imputation model (IMP− and IMP+, respectively). This was repeated one thousand times. An absolute relative bias above 0.10 was defined as severe (SARB). Subsequently, in a series of simulations, the following parameters were changed: the degree of correlation among variables, the number of variables truly associated with the outcome, and the missing rate. At a 10% missing rate, the proportion of times SARB was observed in either SUR or IMP− was two times higher than in IMP+ (28% versus 13%). When the missing rate was increased to 20%, all these proportions were approximately doubled. Irrespective of the missing rate, IMP+ was about 65% less likely to produce SARB than SUR. Results of IMP− and SUR were comparable up to a 20% missing rate. At a high missing rate, IMP− was 76% more likely to provide SARB estimates. Statistical imputation of missing data with the outcome variable included in the imputation model is recommended, even in the context of TBMs.

13.

Power and sample size for GEE analysis of incomplete paired outcomes in 2 × 2 crossover trials

Yongqiang Tang 《Pharmaceutical statistics》2021,20(4):820-839

The 2 × 2 crossover trial uses subjects as their own control to reduce the intersubject variability in the treatment comparison, and typically requires fewer subjects than a parallel design. The generalized estimating equations (GEE) methodology has been commonly used to analyze incomplete discrete outcomes from crossover trials. We propose a unified approach to the power and sample size determination for the Wald Z-test and t-test from GEE analysis of paired binary, ordinal and count outcomes in crossover trials. The proposed method allows misspecification of the variance and correlation of the outcomes, missing outcomes, and adjustment for the period effect. We demonstrate that misspecification of the working variance and correlation functions leads to no or minimal efficiency loss in GEE analysis of paired outcomes. In general, GEE requires the assumption of missing completely at random. For bivariate binary outcomes, we show by simulation that the GEE estimate is asymptotically unbiased or only minimally biased, and the proposed sample size method is suitable under missing at random (MAR) if the working correlation is correctly specified. The performance of the proposed method is illustrated with several numerical examples. Adaptation of the method to other paired outcomes is discussed.

14.

Collective Labor Supply, Taxes, and Intrahousehold Allocation: An Empirical Approach

Hans G. Bloemen 《Journal of Business & Economic Statistics》2013,31(3):471-483

Most empirical studies of the impact of labor income taxation on the labor supply behavior of households use a unitary modeling approach. In this article, we empirically analyze income taxation and the choice of working hours by combining the collective approach for household behavior and the discrete hours choice framework with fixed costs of work. We identify the sharing rule parameters with data on working hours of both the husband and the wife within a couple. Parameter estimates are used to evaluate various model outcomes, like the wage elasticities of labor supply and the impacts of wage changes on the intrahousehold allocation of income. We also simulate the consequences of a policy change in the tax system. We find that the collective model has different empirical outcomes of income sharing than a restricted model that imposes income pooling. In particular, a specification with income pooling fails to capture asymmetries in the income sharing across spouses. These differences in outcomes have consequences for the evaluation of policy changes in the tax system and shed light on the effectiveness of certain policies.

15.

Multiple imputation for gamma outcome variable using generalized linear model

Vinay K. Gupta Gurprit Grover 《Journal of Statistical Computation and Simulation》2017,87(10):1980-1988

We used a proper multiple imputation (MI) approach based on Gibbs sampling to impute missing values of a gamma-distributed outcome variable that were missing at random, using a generalized linear model (GLM) with an identity link function. The missing values of the outcome variable were multiply imputed using the GLM, and the complete data sets obtained after MI were then analysed with the GLM again for estimation. We examined the performance of the proposed technique through a simulation study with data sets having four moderate and large proportions of missing values: 10%, 20%, 30% and 50%. We also applied this technique to a real-life data set and compared the results with those obtained by applying the GLM only to the observed cases. The results showed that the proposed technique gave better results for moderate proportions of missing values.
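
After each completed data set has been analysed, the per-imputation results are usually combined with Rubin's rules. The abstract does not state how the authors pooled their estimates, so the following is only a sketch of that standard combining step, with a hypothetical function name and made-up inputs.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m point estimates and their variances with Rubin's rules.

    Returns (pooled estimate, total variance), where
    total = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # sample variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Made-up results from three imputed data sets, for illustration only.
pooled, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total)
```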

16.

Latent class based multiple imputation approach for missing categorical data

Mulugeta Gebregziabher Stacia M. DeSantis 《Journal of statistical planning and inference》2010

In this paper we propose a latent class based multiple imputation approach for analyzing missing categorical covariate data in a highly stratified data model. In this approach, we impute the missing data assuming a latent class imputation model and we use likelihood methods to analyze the imputed data. Via extensive simulations, we study its statistical properties and make comparisons with complete case analysis, multiple imputation, saturated log-linear multiple imputation and the Expectation–Maximization approach under seven missing data mechanisms (including missing completely at random, missing at random and not missing at random). These methods are compared with respect to bias, asymptotic standard error, type I error, and 95% coverage probabilities of parameter estimates. Simulations show that, under many missingness scenarios, latent class multiple imputation performs favorably when jointly considering these criteria. A data example from a matched case–control study of the association between multiple myeloma and polymorphisms of the interleukin-6 genes is considered.

17.

Bayesian semiparametric models for nonignorable missing mechanisms in generalized linear models

Z. I. Kalaylioglu O. Ozturk 《Journal of applied statistics》2013,40(8):1746-1763

Semiparametric models provide a more flexible form for modeling the relationship between the response and the explanatory variables. In the literature on modeling missing variables, on the other hand, the canonical form of the probability of a variable being missing (p) is modeled with a fully parametric approach. Here we consider a regression-spline-based semiparametric approach to model the missingness mechanism of nonignorably missing covariates. In this model the relationship between a suitable canonical form of p (e.g. probit p) and the missing covariate is modeled through several splines. A Bayesian procedure is developed to efficiently estimate the parameters. A computationally advantageous prior construction is proposed for the parameters of the semiparametric part. WinBUGS code is constructed to apply Gibbs sampling to obtain the posterior distributions. We show through an extensive Monte Carlo simulation experiment that the response model coefficient estimators maintain better (when the true missingness mechanism is nonlinear) or equivalent (when the true missingness mechanism is linear) bias and efficiency properties with the proposed semiparametric missingness model than with the conventional model.

18.

Inference methods for saturated models in longitudinal clinical trials with incomplete binary data

Song JX 《Pharmaceutical statistics》2006,5(4):295-304

In longitudinal studies with a binary response, it is often of interest to estimate the percentage of positive responses at each time point and the percentage of subjects with at least one positive response by each time point. When missing data exist, the conventional method based on observed percentages could result in erroneous estimates. This study demonstrates two methods of using expectation-maximization (EM) and data augmentation (DA) algorithms in the estimation of the marginal and cumulative probabilities for incomplete longitudinal binary response data. Both methods provide unbiased estimates when the missing at random (MAR) assumption holds. Sensitivity analyses have been performed for cases when the MAR assumption is in question.

19.

Multivariate forests with missing mixed outcomes

Abdessamad Dine François Bellavance 《Communications in Statistics - Theory and Methods》2017,46(23):11500-11513

In this article, we propose a multivariate random forest method for multiple responses of mixed types with missing responses. Imputation is performed for each bootstrap sample used to build the individual trees that form the forest. The individual trees are built using a weighted splitting rule allowing downweighting of imputed observations. A simulation study shows the benefits of this approach over complete case analysis when responses are missing completely at random and missing at random (MAR). In particular, the gain in prediction accuracy of the proposed method is larger in the MAR case and also increases as the proportion of missing responses increases.

20.

Marginal and association regression models for longitudinal binary data with drop‐outs: A likelihood‐based approach

Grace Y. Yi Mary E. Thompson 《Revue canadienne de statistique》2005,33(1):3-20

Longitudinal data often contain missing observations, and it is in general difficult to justify particular missing data mechanisms, whether random or not, that may be hard to distinguish. The authors describe a likelihood‐based approach to estimating both the mean response and association parameters for longitudinal binary data with drop‐outs. They specify marginal and dependence structures as regression models which link the responses to the covariates. They illustrate their approach using a data set from the Waterloo Smoking Prevention Project. They also report the results of simulation studies carried out to assess the performance of their technique under various circumstances.