1.

Estimation of a Semiparametric Recursive Bivariate Probit Model with Nonparametric Mixing

Giampiero Marra Georgios Papageorgiou Rosalba Radice 《Australian & New Zealand Journal of Statistics》2013,55(3):321-342

We consider an extension of the recursive bivariate probit model for estimating the effect of a binary variable on a binary outcome in the presence of unobserved confounders, nonlinear covariate effects and overdispersion. Specifically, the model consists of a system of two binary outcomes with a binary endogenous regressor which includes smooth functions of covariates, hence allowing for flexible functional dependence of the responses on the continuous regressors, and arbitrary random intercepts to deal with overdispersion arising from correlated observations on clusters or from the omission of non‐confounding covariates. We fit the model by maximizing a penalized likelihood using an Expectation‐Maximisation algorithm. The issues of automatic multiple smoothing parameter selection and inference are also addressed. The empirical properties of the proposed algorithm are examined in a simulation study. The method is then illustrated using data from a survey on health, aging and wealth.
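For orientation, the standard parametric recursive bivariate probit model that this abstract extends is usually written as below. This is the textbook form, not the article's specification: the notation (latent variables y*, coefficient vectors β, endogenous-regressor effect θ, error correlation ρ) is assumed here, and, going by the abstract, the article replaces the linear predictors with smooth functions of covariates and adds random intercepts.

```latex
\begin{aligned}
y_{1i}^{*} &= \mathbf{x}_{1i}^{\top}\boldsymbol{\beta}_1 + \varepsilon_{1i}, & y_{1i} &= \mathbb{1}(y_{1i}^{*} > 0),\\
y_{2i}^{*} &= \theta\, y_{1i} + \mathbf{x}_{2i}^{\top}\boldsymbol{\beta}_2 + \varepsilon_{2i}, & y_{2i} &= \mathbb{1}(y_{2i}^{*} > 0),\\
(\varepsilon_{1i}, \varepsilon_{2i})^{\top} &\sim N_2\!\left(\mathbf{0}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right).
\end{aligned}
```

A nonzero ρ corresponds to unobserved confounding between the binary regressor and the outcome; with ρ = 0 the two equations can be fitted as separate probit models.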

2.

Multistage sampling for latent variable models

Thomas DC 《Lifetime data analysis》2007,13(4):565-581

I consider the design of multistage sampling schemes for epidemiologic studies involving latent variable models, with surrogate measurements of the latent variables on a subset of subjects. Such models arise in various situations: when detailed exposure measurements are combined with variables that can be used to assign exposures to unmeasured subjects; when biomarkers are obtained to assess an unobserved pathophysiologic process; or when additional information is to be obtained on confounding or modifying variables. In such situations, it may be possible to stratify the subsample on data available for all subjects in the main study, such as outcomes, exposure predictors, or geographic locations. Three circumstances where analytic calculations of the optimal design are possible are considered: (i) when all variables are binary; (ii) when all are normally distributed; and (iii) when the latent variable and its measurement are normally distributed, but the outcome is binary. In each of these cases, it is often possible to considerably improve the cost efficiency of the design by appropriate selection of the sampling fractions. More complex situations arise when the data are spatially distributed: the spatial correlation can be exploited to improve exposure assignment for unmeasured locations using available measurements on neighboring locations; some approaches for informative selection of the measurement sample using location and/or exposure predictor data are considered.

3.

Estimating a Marginal Causal Odds Ratio Subject to Confounding

Zhiwei Zhang 《Communications in Statistics - Theory and Methods》2013,42(3):309-321

Odds ratios are frequently used to describe the relationship between a binary treatment or exposure and a binary outcome. An odds ratio can be interpreted as a causal effect or a measure of association, depending on whether it involves potential outcomes or the actual outcome. An odds ratio can also be characterized as marginal versus conditional, depending on whether it involves conditioning on covariates. This article proposes a method for estimating a marginal causal odds ratio subject to confounding. The proposed method is based on a logistic regression model relating the outcome to the treatment indicator and potential confounders. Simulation results show that the proposed method performs reasonably well in moderate-sized samples and may even offer an efficiency gain over the direct method based on the sample odds ratio in the absence of confounding. The method is illustrated with a real example concerning coronary heart disease.
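The marginal versus conditional distinction can be made concrete with a small sketch. It assumes a logistic model whose coefficients are already known (the names `beta0`, `beta_t`, `beta_x` and the single covariate are hypothetical) and obtains a marginal odds ratio by standardization, that is, by averaging predicted risks over the covariate sample under treatment and under no treatment. This is one common route and is not claimed to be the article's estimator.

```python
import math


def expit(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def marginal_odds_ratio(beta0, beta_t, beta_x, covariates):
    """Standardize a logistic model over the observed covariate values.

    beta0, beta_t, beta_x are assumed (hypothetical) coefficients of
    logit P(Y=1 | T, X) = beta0 + beta_t*T + beta_x*X.
    """
    n = len(covariates)
    # Average risk if everyone were treated (T=1) or untreated (T=0).
    p1 = sum(expit(beta0 + beta_t + beta_x * x) for x in covariates) / n
    p0 = sum(expit(beta0 + beta_x * x) for x in covariates) / n
    return (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))


# Conditional odds ratio is exp(beta_t); the marginal one is generally
# closer to 1 when the covariate affects the outcome.
conditional_or = math.exp(1.0)
marginal_or = marginal_odds_ratio(0.0, 1.0, 2.0, [-1.0, 1.0])
```

When `beta_x` is zero the two odds ratios coincide; otherwise they differ even without confounding, which is why the abstract treats the marginal odds ratio as a separate estimand.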

4.

Addressing misclassification for binary data: probit and t-link regressions

《Journal of Statistical Computation and Simulation》2012,82(10):2187-2213

Generalized linear models are addressed to describe the dependence of data on explanatory variables when the binary outcome is subject to misclassification. Both probit and t-link regressions for misclassified binary data under Bayesian methodology are proposed. The computational difficulties have been avoided by using data augmentation. The idea of using a data augmentation framework (with two types of latent variables) is exploited to derive efficient Gibbs sampling and expectation–maximization algorithms. Besides, this formulation has made it possible to obtain the probit model as a particular case of the t-link model. Simulation examples are presented to illustrate the model performance when compared with standard methods that do not consider misclassification. In order to show the potential of the proposed approaches, a real data problem arising when studying hearing loss caused by exposure to occupational noise is analysed.

5.

Controlled Direct and Mediated Effects: Definition, Identification and Bounds

TYLER J. VANDERWEELE 《Scandinavian Journal of Statistics》2011,38(3):551-563

Abstract. Results are given which provide bounds for controlled direct effects when no-unmeasured-confounding assumptions required for the identification of these effects do not hold. Previous results concerning bounds for controlled direct effects rely on monotonicity relationships between the treatment, mediator and the outcome themselves; the results presented in this article instead assume that monotonicity relationships hold between the unmeasured confounding variable or variables and the treatment, mediator and outcome. Whereas prior results give bounds that contain the null hypothesis of no direct effect, the results presented here will in many instances yield bounds that do not contain the null hypothesis of no direct effect. For contexts in which a set of variables intercepts all paths between a treatment and an outcome, it is possible to provide a definition for a controlled mediated effect. We discuss the identification of these controlled mediated effects; the bounds for controlled direct effects are applicable also to controlled mediated effects. An example is given to illustrate how the results in the article can be used to draw inferences about direct and mediated effects in the presence of unmeasured confounding variables.

6.

Inclusion of binary proxy variables in logistic regression improves treatment effect estimation in observational studies in the presence of binary unmeasured confounding variables

Cornelius Rosenbaum Qingzhao Yu Sarah Buzhardt Elizabeth Sutton Andrew G. Chapple 《Pharmaceutical statistics》2023,22(6):995-1015

We present a simulation study and application showing that inclusion of binary proxy variables related to binary unmeasured confounders improves the estimate of a related treatment effect in binary logistic regression. The simulation study included 60,000 randomly generated parameter scenarios of sample size 10,000 across six different simulation structures. We assessed bias by comparing the probability of finding the expected treatment effect relative to the modeled treatment effect with and without the proxy variable. Inclusion of a proxy variable in the logistic regression model significantly reduced the bias of the treatment or exposure effect when compared to logistic regression without the proxy variable. Including proxy variables in the logistic regression model improves the estimation of the treatment effect at weak, moderate, and strong association with unmeasured confounders and the outcome, treatment, or proxy variables. Comparative advantages held for weakly and strongly collapsible situations, as the number of unmeasured confounders increased, and as the number of proxy variables adjusted for increased.

7.

Selection effects of source of contraceptive supply in an analysis of discontinuation of contraception: multilevel modelling when random effects are correlated with an explanatory variable

Fiona Steele 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2003,166(3):407-423

Summary. Conventional multilevel models assume that the explanatory variables are uncorrelated with the random effects. In some situations, this assumption may be invalid. One such example is the evaluation of a health or social programme that is non-randomly placed and/or in which participation is voluntary. In this case, there may be unobserved factors influencing the placement of the programme and the decision to participate that are correlated with the unobserved factors that influence the outcome of interest. The paper presents an application of a multiprocess multilevel model to assess the difference in rates of discontinuation of contraception between private and Government family planning providers, while accounting for the possibility that there may be unobserved individual and community level factors that influence both a couple's choice of provider and their probability of discontinuation.

8.

Monte-Carlo Sensitivity Analysis for Controlled Direct Effects Using Marginal Structural Models in the Presence of Confounded Mediators

Yasutaka Chiba 《Communications in Statistics - Theory and Methods》2013,42(10):1739-1749

In randomized trials, investigators are frequently interested in estimating the direct effect of a treatment on an outcome that is not relayed by intermediate variables, in addition to the usual intention-to-treat (ITT) effect. Even if the ITT effect is not confounded due to randomization, the direct effect is not identified when unmeasured variables affect the intermediate and outcome variables. Although the unmeasured variables cannot be adjusted for in the models, it is still important to evaluate the potential bias of these variables quantitatively. This article proposes a sensitivity analysis method for controlled direct effects using a marginal structural model that is an extension of the sensitivity analysis method of unmeasured confounding introduced in the context of observational studies. The proposed method is illustrated using a randomized trial of depression.

9.

A generalized bivariate Bernoulli model with covariate dependence

M. Ataharul Islam Abdulhamid A. Alzaid Rafiqul I. Chowdhury Khalaf S. Sultan 《Journal of applied statistics》2013,40(5):1064-1075

Dependence in outcome variables may pose formidable difficulty in analyzing data in longitudinal studies. In the past, most of the studies made attempts to address this problem using the marginal models. However, using the marginal models alone, it is difficult to specify the measures of dependence in outcomes due to association between outcomes as well as between outcomes and explanatory variables. In this paper, a generalized approach is demonstrated using both the conditional and marginal models. This model uses link functions to test for dependence in outcome variables. The estimation and test procedures are illustrated with an application to the mobility index data from the Health and Retirement Survey and also simulations are performed for correlated binary data generated from the bivariate Bernoulli distributions. The results indicate the usefulness of the proposed method.

10.

Decomposition analysis as a framework for understanding heterogeneity of treatment effects in non-randomized health care studies

William H. Crown 《Pharmaceutical statistics》2021,20(5):945-951

This paper uses the decomposition framework from the economics literature to examine the statistical structure of treatment effects estimated with observational data compared to those estimated from randomized studies. It begins with the estimation of treatment effects using a dummy variable in regression models and then presents the decomposition method from economics, which estimates separate regression models for the comparison groups and recovers the treatment effect using bootstrapping methods. This method shows that the overall treatment effect is a weighted average of structural relationships of patient features with outcomes within each treatment arm and differences in the distributions of these features across the arms. In large randomized trials, it is assumed that the distribution of features across arms is very similar. Importantly, randomization balances not only observed features but also unobserved ones. Applying high dimensional balancing methods such as propensity score matching to the observational data causes the distributional terms of the decomposition model to be eliminated, but unobserved features may still not be balanced in the observational data. Finally, a correction for non-random selection into the treatment groups is introduced via a switching regime model. Theoretically, the treatment effect estimates obtained from this model should be the same as those from a randomized trial. However, there are significant challenges in identifying instrumental variables that are necessary for estimating such models. At a minimum, decomposition models are useful tools for understanding the relationship between treatment effects estimated from observational versus randomized data.
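The split into a structural part and a distributional part can be sketched with a two-fold decomposition in the Oaxaca-Blinder style. This is an assumption about which decomposition is meant, since the abstract does not name it; the single covariate, the function names and the choice of the comparison arm as the reference are all illustrative, and the bootstrap step mentioned in the abstract is left out.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x with one covariate."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def decompose(x0, y0, x1, y1):
    """Split mean(y1) - mean(y0) into structural and distributional parts.

    Arm 0 is the reference arm (an assumption of this sketch).
    """
    a0, b0 = fit_line(x0, y0)
    a1, b1 = fit_line(x1, y1)
    mean_x0 = sum(x0) / len(x0)
    mean_x1 = sum(x1) / len(x1)
    # Part due to different covariate distributions across arms.
    distribution = b0 * (mean_x1 - mean_x0)
    # Part due to different outcome relationships within each arm.
    structural = (a1 - a0) + mean_x1 * (b1 - b0)
    return structural, distribution


structural, distribution = decompose(
    [1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 5.0, 6.0],
    [2.0, 3.0, 4.0, 5.0], [5.0, 6.0, 8.0, 11.0],
)
```

Because each fitted line passes through its arm's means, the two parts sum exactly to the raw difference in mean outcomes. When the covariate means are equal across arms, as randomization or successful balancing would aim for, the distributional part is zero.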

11.

A semiparametric approach to hidden Markov models under longitudinal observations

Antonello Maruotti Tobias Rydén 《Statistics and Computing》2009,19(4):381-393

We propose a hidden Markov model for longitudinal count data where sources of unobserved heterogeneity arise, making data overdispersed. The observed process, conditionally on the hidden states, is assumed to follow an inhomogeneous Poisson kernel, where the unobserved heterogeneity is modeled in a generalized linear model (GLM) framework by adding individual-specific random effects in the link function. Due to the complexity of the likelihood within the GLM framework, model parameters may be estimated by numerical maximization of the log-likelihood function or by simulation methods; we propose a more flexible approach based on the Expectation Maximization (EM) algorithm. Parameter estimation is carried out using a non-parametric maximum likelihood (NPML) approach in a finite mixture context. Simulation results and two empirical examples are provided.

12.

Identification of the Direction of a Causal Effect by Instrumental Variables

Brendan Kline 《Journal of Business & Economic Statistics》2016,34(2):176-184

This article provides a strategy to identify the existence and direction of a causal effect in a generalized nonparametric and nonseparable model identified by instrumental variables. The causal effect concerns how the outcome depends on the endogenous treatment variable. The outcome variable, treatment variable, other explanatory variables, and the instrumental variable can be essentially any combination of continuous, discrete, or “other” variables. In particular, it is not necessary to have any continuous variables, none of the variables need to have large support, and the instrument can be binary even if the corresponding endogenous treatment variable and/or outcome is continuous. The outcome can be mismeasured or interval-measured, and the endogenous treatment variable need not even be observed. The identification results are constructive, and can be empirically implemented using standard estimation results.

13.

A causal proportional hazards estimator under homogeneous or heterogeneous selection in an IV setting

Ditte Nørbo Sørensen Torben Martinussen Eric Tchetgen Tchetgen 《Lifetime data analysis》2019,25(4):639-659

In this paper we present a framework to do estimation in a structural Cox model when there may be unobserved confounding. The model is phrased in terms of a selection bias...

14.

Structural equation models for area health outcomes with model selection

Peter Congdon 《Journal of applied statistics》2011,38(4):745-767

Recent analyses seeking to explain variation in area health outcomes often consider the impact on them of latent measures (i.e. unobserved constructs) of population health risk. The latter are typically obtained by forms of multivariate analysis, with a small set of latent constructs derived from a collection of observed indicators, and a few recent area studies take such constructs to be spatially structured rather than independent over areas. A confirmatory approach is often applicable to the model linking indicators to constructs, based on substantive knowledge of relevant risks for particular diseases or outcomes. In this paper, population constructs relevant to a particular set of health outcomes are derived using an integrated model containing all the manifest variables, namely health outcome variables, as well as indicator variables underlying the latent constructs. A further feature of the approach is the use of variable selection techniques to select significant loadings and factors (especially in terms of effects of constructs on health outcomes), so ensuring parsimonious models are selected. A case study considers suicide mortality and self-harm contrasts in the East of England in relation to three latent constructs: deprivation, fragmentation and urbanicity.

15.

A mixture latent variable model for modeling mixed data in heterogeneous populations and its applications

Leila Amiri Mojtaba Khazaei Mojtaba Ganjali 《AStA Advances in Statistical Analysis》2018,102(1):95-115

Latent variable models are widely used for joint modeling of mixed data including nominal, ordinal, count and continuous data. In this paper, we consider a latent variable model for jointly modeling relationships between mixed binary, count and continuous variables with some observed covariates. We assume that, given a latent variable, the mixed variables of interest are independent and that the count and continuous variables follow Poisson and normal distributions, respectively. As such data may be extracted from different subpopulations, unobserved heterogeneity has to be taken into account. A mixture distribution is considered (for the distribution of the latent variable) which accounts for the heterogeneity. The generalized EM algorithm, which uses the Newton–Raphson algorithm inside the EM algorithm, is used to compute the maximum likelihood estimates of parameters. The standard errors of the maximum likelihood estimates are computed by using the supplemented EM algorithm. Analysis of the primary biliary cirrhosis data is presented as an application of the proposed model.

16.

Hierarchical likelihood approach to non-Gaussian factor analysis

Maengseok Noh Johan H.L. Oud Toni Toharudin 《Journal of Statistical Computation and Simulation》2019,89(9):1555-1573

Factor models, structural equation models (SEMs) and random-effect models share the common feature that they assume latent or unobserved random variables. Factor models and SEMs allow well developed procedures for a rich class of covariance models with many parameters, while random-effect models allow well developed procedures for non-normal models including heavy-tailed distributions for responses and random effects. In this paper, we show how these two developments can be combined to result in an extremely rich class of models, which can be beneficial to both areas. A new fitting procedure for binary factor models and a robust estimation approach for continuous factor models are proposed.

17.

Dynamic latent trait models with mixed hidden Markov structure for mixed longitudinal outcomes

Yue Zhang Kiros Berhane 《Journal of applied statistics》2016,43(4):704-720

We propose a general Bayesian joint modeling approach to model mixed longitudinal outcomes from the exponential family for taking into account any differential misclassification that may exist among categorical outcomes. Under this framework, outcomes observed without measurement error are related to latent trait variables through generalized linear mixed effect models. The misclassified outcomes are related to the latent class variables, which represent unobserved real states, using mixed hidden Markov models (MHMMs). In addition to enabling the estimation of parameters in prevalence, transition and misclassification probabilities, MHMMs capture cluster level heterogeneity. A transition modeling structure allows the latent trait and latent class variables to depend on observed predictors at the same time period and also on latent trait and latent class variables at previous time periods for each individual. Simulation studies are conducted to make comparisons with traditional models in order to illustrate the gains from the proposed approach. The new approach is applied to data from the Southern California Children Health Study to jointly model questionnaire-based asthma state and multiple lung function measurements in order to gain better insight about the underlying biological mechanism that governs the inter-relationship between asthma state and lung function development.

18.

Generalized linear models with covariate measurement error and unknown link function

Nels Johnson 《Journal of applied statistics》2017,44(5):833-852

Generalized linear models (GLMs) with error-in-covariates are useful in epidemiological research due to the ubiquity of non-normal response variables and inaccurate measurements. The link function in GLMs is chosen by the user depending on the type of response variable, frequently the canonical link function. When covariates are measured with error, incorrect inference can be made, compounded by incorrect choice of link function. In this article we propose three flexible approaches for handling error-in-covariates and estimating an unknown link simultaneously. The first approach uses a fully Bayesian (FB) hierarchical framework, treating the unobserved covariate as a latent variable to be integrated over. The second and third are approximate Bayesian approaches which use a Laplace approximation to marginalize the variables measured with error out of the likelihood. Our simulation results support the view that the FB approach is often a better choice than the approximate Bayesian approaches for adjusting for measurement error, particularly when the measurement error distribution is misspecified. These approaches are demonstrated on an application with binary response.

19.

Theoretical Extension of the Treatment Effect Model and Its Application in Policy Evaluation

Ji Yuanyuan et al. 《Statistical Research》(统计研究) 2020,37(9):106-119

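When existing studies use treatment effect models to evaluate policies, the assumptions imposed on the model are mostly quite restrictive, are difficult to verify in practice, and, if wrong, make the parameter estimates inconsistent. This paper first proposes, within a nonparametric framework, a semiparametric estimation method for the treatment effect model that makes no assumption about the functional forms in the model and allows the joint distribution of the error terms to exhibit general heteroskedasticity, thereby greatly reducing the estimation bias caused by model misspecification. To address the endogeneity of the treatment, a two-step estimator is proposed: the first step estimates the selection equation nonparametrically; the second step estimates the average treatment effect in the outcome equation using instrumental variables. Second, the large-sample properties of the estimator are analyzed, establishing its consistency and asymptotic normality. Third, Monte Carlo simulations comparing it with existing estimation methods show that the proposed method is highly robust. Finally, the method is applied to study the effect of the high-tech enterprise certification policy on firm profitability; the study finds that the policy raised the profitability of high-tech enterprises and that its promoting effect was larger for private enterprises than for state-owned enterprises.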

20.

Trends in smoking cessation: a Markov model approach

Charles G. Minard David W. Wetter Carol J. Etzel 《Journal of applied statistics》2012,39(1):113-127

Intervention trials such as studies on smoking cessation may observe multiple, discrete outcomes over time. When the outcome is binary, participant observations may alternate between two states over the course of the study. The generalized estimating equation (GEE) approach is commonly used to analyze binary, longitudinal data in the context of independent variables. However, the sequence of observations may be assumed to follow a Markov chain with stationary transition probabilities when observations are made at fixed time points. Participants favoring the transition to one particular state over the other would be evidence of a trend in the observations. Using a log-transformed trend parameter, the determinants of a trend in a binary, longitudinal study may be evaluated by maximizing the likelihood function. A new methodology is presented here to test for the presence and determinants of a trend in binary, longitudinal observations. Empirical studies are evaluated and comparisons are made with the GEE approach. Practical application of the proposed method is made to the data available from an intervention study on smoking cessation.
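The Markov-chain view of binary longitudinal data can be illustrated with a minimal sketch that estimates stationary transition probabilities by counting observed transitions, which is the maximum likelihood estimate for a two-state chain. The function name and the toy sequences are hypothetical, and the article's log-transformed trend parameter is not reproduced here because the abstract does not define it.

```python
def transition_mle(sequences):
    """Estimate a 2x2 transition matrix from binary sequences.

    Each sequence is one participant's 0/1 observations at fixed
    time points; rows index the previous state, columns the next.
    """
    counts = [[0, 0], [0, 0]]
    for seq in sequences:
        # Pair each observation with the one that follows it.
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    probs = []
    for row in counts:
        total = sum(row)
        # A state that is never left has no estimable row.
        probs.append([c / total if total else float("nan") for c in row])
    return probs


# Toy data: 1 = abstinent, 0 = smoking (labels assumed for illustration).
estimate = transition_mle([[0, 0, 1, 1], [1, 0, 1, 1]])
```

Comparing the off-diagonal entries, the chance of moving from 0 to 1 against the chance of moving from 1 to 0, gives an informal reading of whether participants favor one transition over the other.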