期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

The magnitude of the difference of crude and adjusted log odds ratios

Ruth M. Mickey 《统计学通讯:理论与方法》2013,42(11):3403-3415

Crude and adjusted odds ratios, calculated from a collapsed 2×2 table or a stratified 2×2×K table, can be very similar or quite different when significant associations are found between each dichotomous variable and the K-level stratifying variable. It is demonstrated here that the magnitude of the difference between the logs of the two estimators can be approximated by 4 times the covariance between log linear interactions describing the associations of each of the binary variables with the stratifying variable. Two data examples illustrate the usefulness of the variability and covariability of the interactions in providing a statistical accounting for the magnitude of the difference between the logs of the crude and adjusted odds ratios. Other interpretations and applications of the variances and covariances of the log linear interactions are discussed. 相似文献

2.

Inclusion of binary proxy variables in logistic regression improves treatment effect estimation in observational studies in the presence of binary unmeasured confounding variables

Cornelius Rosenbaum Qingzhao Yu Sarah Buzhardt Elizabeth Sutton Andrew G. Chapple 《Pharmaceutical statistics》2023,22(6):995-1015

We present a simulation study and application that shows inclusion of binary proxy variables related to binary unmeasured confounders improves the estimate of a related treatment effect in binary logistic regression. The simulation study included 60,000 randomly generated parameter scenarios of sample size 10,000 across six different simulation structures. We assessed bias by comparing the probability of finding the expected treatment effect relative to the modeled treatment effect with and without the proxy variable. Inclusion of a proxy variable in the logistic regression model significantly reduced the bias of the treatment or exposure effect when compared to logistic regression without the proxy variable. Including proxy variables in the logistic regression model improves the estimation of the treatment effect at weak, moderate, and strong association with unmeasured confounders and the outcome, treatment, or proxy variables. Comparative advantages held for weakly and strongly collapsible situations, as the number of unmeasured confounders increased, and as the number of proxy variables adjusted for increased. 相似文献

3.

Variable selection and importance in presence of high collinearity: an application to the prediction of lean body mass from multi-frequency bioelectrical impedance

Camillo Cammarota Alessandro Pinto 《Journal of applied statistics》2021,48(9):1644

In prediction problems both response and covariates may have high correlation with a second group of influential regressors, that can be considered as background variables. An important challenge is to perform variable selection and importance assessment among the covariates in the presence of these variables. A clinical example is the prediction of the lean body mass (response) from bioimpedance (covariates), where anthropometric measures play the role of background variables. We introduce a reduced dataset in which the variables are defined as the residuals with respect to the background, and perform variable selection and importance assessment both in linear and random forest models. Using a clinical dataset of multi-frequency bioimpedance, we show the effectiveness of this method to select the most relevant predictors of the lean body mass beyond anthropometry. 相似文献

4.

On Multi-Treatment Adaptive Allocation Design for Dichotomous Response

Uttam Bandyopadhyay Saurav De 《统计学通讯:理论与方法》2013,42(22):4104-4124

The use of ridit, as a probability score, is a very common practice to compare discrete random variables in discrete data analysis. In the present work we formulate ridit reliability functionals for some comparison of K independent binary random variables. We use such functionals to provide a generalized response-adaptive design (GRAD) on K(≥ +2) treatment-arms for dichotomous response variables. We exhibit some properties of the proposed design and compare it with some of the existing competitors by computing its various performance measures. We also provide a discussion towards a possible modification of the GRAD in the presence of covariates. 相似文献

5.

Modified SEE variable selection for varying coefficient instrumental variable models

《Statistical Methodology》2013

We consider the problem of variable selection for a class of varying coefficient models with instrumental variables. We focus on the case that some covariates are endogenous variables, and some auxiliary instrumental variables are available. An instrumental variable based variable selection procedure is proposed by using modified smooth-threshold estimating equations (SEEs). The proposed procedure can automatically eliminate the irrelevant covariates by setting the corresponding coefficient functions as zero, and simultaneously estimate the nonzero regression coefficients by solving the smooth-threshold estimating equations. The proposed variable selection procedure avoids the convex optimization problem, and is flexible and easy to implement. Simulation studies are carried out to assess the performance of the proposed variable selection method. 相似文献

6.

Applications of the Bootstrap in ROC Analysis

《统计学通讯:模拟与计算》2012,41(6):865-877

The problem of estimating standard errors for diagnostic accuracy measures might be challenging for many complicated models. We can address such a problem by using the Bootstrap methods to blunt its technical edge with resampled empirical distributions. We consider two cases where bootstrap methods can successfully improve our knowledge of the sampling variability of the diagnostic accuracy estimators. The first application is to make inference for the area under the ROC curve resulted from a functional logistic regression model which is a sophisticated modelling device to describe the relationship between a dichotomous response and multiple covariates. We consider using this regression method to model the predictive effects of multiple independent variables on the occurrence of a disease. The accuracy measures, such as the area under the ROC curve (AUC) are developed from the functional regression. Asymptotical results for the empirical estimators are provided to facilitate inferences. The second application is to test the difference of two weighted areas under the ROC curve (WAUC) from a paired two sample study. The correlation between the two WAUC complicates the asymptotic distribution of the test statistic. We then employ the bootstrap methods to gain satisfactory inference results. Simulations and examples are supplied in this article to confirm the merits of the bootstrap methods. 相似文献

7.

A maximin criterion for the logistic random intercept model with covariates

《Journal of statistical planning and inference》2006,136(3):962-981

A maximin criterion is used to find optimal designs for the logistic random intercept model with dichotomous independent variables. The dichotomous independent variables can be subdivided into variables for which the distribution is specified prior to data sampling, called variates, and into variables for which the distribution is not specified prior to data sampling, but is obtained from data sampling, called covariates. The proposed maximin criterion maximizes the smallest possible relative efficiency not only with respect to all possible values of the model parameters, but also with respect to the joint distribution of the covariates. We have shown that, under certain conditions, the maximin design is balanced with respect to the joint distribution of the variates. The proposed method will be used to plan a (stratified) clinical trial where variates and covariates are involved. 相似文献

8.

A Contrasting Study of Likelihood Methods for the Analysis of Longitudinal Binary Data

Weiming Yang 《统计学通讯:理论与方法》2014,43(14):3027-3046

相似文献

9.

Quantifying R 2 bias in the presence of measurement error

Karl D. Majeske Terri Lynch-Caris Janet Brelin-Fornari 《Journal of applied statistics》2010,37(4):667-677

相似文献

10.

Fast robust feature screening for ultrahigh-dimensional varying coefficient models

Xuejun Ma Xin Chen 《Journal of Statistical Computation and Simulation》2017,87(4):724-732

In this paper, we propose a new partial correlation, the so-called composite quantile partial correlation, to measure the relationship of two variables given other variables. We further use this correlation to screen variables in ultrahigh-dimensional varying coefficient models. Our proposed method is fast and robust against outliers and can be efficiently employed in both single index variable and multiple index variable varying coefficient models. Numerical results indicate the preference of our proposed method. 相似文献

11.

Pseudo latent models: Goodness of fit measures, residuals, estimation, testing, and simulation

Olaf Hübler 《Statistical Papers》1997,38(3):271-285

Binary response models consider pseudo-R ² measures which are not based on residuals while several concepts of residuals were developed for tests. In this paper the endogenous variable of the latent model corresponding to the binary observable model is substituted by a pseudo variable. Then goodness of fit measures and tests can be based on a joint concept of residuals as for linear models. Different kinds of residuals based on probit ML estimates are employed. The analytical investigations and the simulation results lead to the recommendation to use standardized residuals where there is no difference between observed and generalized residuals. In none of the investigated situations this estimator is far away from the best result. While in large samples all considered estimators are very similar, small sample properties speak in favour of residuals which are modifications of those suggested in the literature. An empirical application demonstrates that it is not necessary to develop new testing procedures for the observable models with dichotomous regressands. Well-know approaches for linear models with continuous endogenous variables which are implemented in usual econometric packages can be used for pseudo latent models. An erratum to this article is available at . 相似文献

12.

A Bayesian Adjustment for Covariate Misclassification with Correlated Binary Outcome Data

Dianxu Ren Roslyn A. Stone 《Journal of applied statistics》2007,34(9):1019-1034

Estimated associations between an outcome variable and misclassified covariates tend to be biased when the methods of estimation that ignore the classification error are applied. Available methods to account for misclassification often require the use of a validation sample (i.e. a gold standard). In practice, however, such a gold standard may be unavailable or impractical. We propose a Bayesian approach to adjust for misclassification in a binary covariate in the random effect logistic model when a gold standard is not available. This Markov Chain Monte Carlo (MCMC) approach uses two imperfect measures of a dichotomous exposure under the assumptions of conditional independence and non-differential misclassification. A simulated numerical example and a real clinical example are given to illustrate the proposed approach. Our results suggest that the estimated log odds of inpatient care and the corresponding standard deviation are much larger in our proposed method compared with the models ignoring misclassification. Ignoring misclassification produces downwardly biased estimates and underestimate uncertainty. 相似文献

13.

Goodness-of-fit measures in binary choice models

Frank A. G. Windmeijer 《Econometric Reviews》1995,14(1):101-116

In this paper, a review is given of various goodness-of-fit measures that have been proposed for the binary choice model in the last two decades. The relative behaviour of several pseudo-R² measures is analysed in a series of misspecified binary choice models, the misspecification being omitted variables or an included irrelevant variable. A comparison is made with the OLS-R² of the underlying latent variable model and with the squared sample correlation coefficient of the true and predicted probabilities. Further, it is investigated how the values of the measures change with a changing frequency rate of successes. 相似文献

14.

Confidence intervals for the difference between two median survival times for clustered survival data

Yu-Mei Chang Pao-Sheng Shen Guan-Wei Liu 《Journal of applied statistics》2016,43(12):2325-2345

Clustered survival data arise often in clinical trial design, where the correlated subunits from the same cluster are randomized to different treatment groups. Under such design, we consider the problem of constructing confidence interval for the difference of two median survival time given the covariates. We use Cox gamma frailty model to account for the within-cluster correlation. Based on the conditional confidence intervals, we can identify the possible range of covariates over which the two groups would provide different median survival times. The associated coverage probability and the expected length of the proposed interval are investigated via a simulation study. The implementation of the confidence intervals is illustrated using a real data set. 相似文献

15.

Testing for constancy in varying coefficient models

Mohamed Ahkim 《统计学通讯:理论与方法》2018,47(4):890-911

We consider varying coefficient models, which are an extension of the classical linear regression models in the sense that the regression coefficients are replaced by functions in certain variables (for example, time), the covariates are also allowed to depend on other variables. Varying coefficient models are popular in longitudinal data and panel data studies, and have been applied in fields such as finance and health sciences. We consider longitudinal data and estimate the coefficient functions by the flexible B-spline technique. An important question in a varying coefficient model is whether an estimated coefficient function is statistically different from a constant (or zero). We develop testing procedures based on the estimated B-spline coefficients by making use of nice properties of a B-spline basis. Our method allows longitudinal data where repeated measurements for an individual can be correlated. We obtain the asymptotic null distribution of the test statistic. The power of the proposed testing procedures are illustrated on simulated data where we highlight the importance of including the correlation structure of the response variable and on real data. 相似文献

16.

Bayesian testing for independence of two categorical variables under two-stage cluster sampling with covariates

Dilli Bhatta Balgobin Nandram Joseph Sedransk 《Journal of applied statistics》2018,45(13):2365-2393

We consider Bayesian testing for independence of two categorical variables with covariates for a two-stage cluster sample. This is a difficult problem because we have a complex sample (i.e. cluster sample), not a simple random sample. Our approach is to convert the cluster sample with covariates into an equivalent simple random sample without covariates, which provides a surrogate of the original sample. Then, this surrogate sample is used to compute the Bayes factor to make an inference about independence. We apply our methodology to the data from the Trend in International Mathematics and Science Study [30] for fourth grade US students to assess the association between the mathematics and science scores represented as categorical variables. We show that if there is strong association between two categorical variables, there is no significant difference between the tests with and without the covariates. We also performed a simulation study to further understand the effect of covariates in various situations. We found that for borderline cases (moderate association between the two categorical variables), there are noticeable differences in the test with and without covariates. 相似文献

17.

Sensitivity analysis for two control groups

Paul R Rosenbaum 《统计学通讯:理论与方法》2013,42(11):2687-2698

In many experiments where data have been collected at two points in time (pre-treatment and post-treatment), investigators wish to determine if there is a difference between two treatment groups. In recent years it has been proposed that an appropriate statistical analysis to determine if treatment differences exist is to use the post-treatment values as the primary comparison variables and the pre-treatment values as covariates. When there are several outcome variables, we propose new tests based on residuals as alternatives to existing methods and investigate how the powers of the new and existing tests are affected by various choices of covariates. The limiting distribution of the test statistic of the new test based on residuals is given. Monte Carlo simulations are employed in the power comparisons. 相似文献

18.

Covariate selection for estimating the causal effect of control plans by using causal diagrams

Manabu Kuroki Masami Miyakawa 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2003,65(1):209-222

Summary. Consider a case where cause–effect relationships between variables can be described by a causal path diagram and the corresponding linear structural equation model. The paper proposes a graphical selection criterion for covariates to estimate the causal effect of a control plan. For designing the control plan, it is essential to determine both covariates that are used for control and covariates that are used for identification. The selection of covariates used for control is only constrained by the requirement that the covariates be non-descendants of a treatment variable. However, the selection of covariates used for identification is dependent on the selection of covariates used for control and is not unique. In the paper, the difference between covariates that are used for identification is evaluated on the basis of the asymptotic variance of the estimated causal effect of an effective control plan. Furthermore, the results can be also described in terms of a graph structure. 相似文献

19.

Optimal caliper widths for propensity-score matching when estimating differences in means and differences in proportions in observational studies

Austin PC 《Pharmaceutical statistics》2011,10(2):150-161

In a study comparing the effects of two treatments, the propensity score is the probability of assignment to one treatment conditional on a subject's measured baseline covariates. Propensity-score matching is increasingly being used to estimate the effects of exposures using observational data. In the most common implementation of propensity-score matching, pairs of treated and untreated subjects are formed whose propensity scores differ by at most a pre-specified amount (the caliper width). There has been a little research into the optimal caliper width. We conducted an extensive series of Monte Carlo simulations to determine the optimal caliper width for estimating differences in means (for continuous outcomes) and risk differences (for binary outcomes). When estimating differences in means or risk differences, we recommend that researchers match on the logit of the propensity score using calipers of width equal to 0.2 of the standard deviation of the logit of the propensity score. When at least some of the covariates were continuous, then either this value, or one close to it, minimized the mean square error of the resultant estimated treatment effect. It also eliminated at least 98% of the bias in the crude estimator, and it resulted in confidence intervals with approximately the correct coverage rates. Furthermore, the empirical type I error rate was approximately correct. When all of the covariates were binary, then the choice of caliper width had a much smaller impact on the performance of estimation of risk differences and differences in means. 相似文献

20.

Joint GEEs for multivariate correlated data with incomplete binary outcomes

G. Inan R. Yucel 《Journal of applied statistics》2017,44(11):1920-1937

This study considers a fully-parametric but uncongenial multiple imputation (MI) inference to jointly analyze incomplete binary response variables observed in a correlated data settings. Multiple imputation model is specified as a fully-parametric model based on a multivariate extension of mixed-effects models. Dichotomized imputed datasets are then analyzed using joint GEE models where covariates are associated with the marginal mean of responses with response-specific regression coefficients and a Kronecker product is accommodated for cluster-specific correlation structure for a given response variable and correlation structure between multiple response variables. The validity of the proposed MI-based JGEE (MI-JGEE) approach is assessed through a Monte Carlo simulation study under different scenarios. The simulation results, which are evaluated in terms of bias, mean-squared error, and coverage rate, show that MI-JGEE has promising inferential properties even when the underlying multiple imputation is misspecified. Finally, Adolescent Alcohol Prevention Trial data are used for illustration. 相似文献