Similar references
20 similar references found (search time: 46 ms)
1.
The individual causal association (ICA) has recently been introduced as a metric of surrogacy in a causal-inference framework. The ICA is defined on the unit interval and quantifies the association between the individual causal effect on the surrogate (ΔS) and true (ΔT) endpoint. In addition, the ICA offers a general assessment of the surrogate predictive value, taking value 1 when there is a deterministic relationship between ΔT and ΔS, and value 0 when both causal effects are independent. However, when one moves away from these two extreme scenarios, the interpretation of the ICA becomes challenging. In the present work, a new metric of surrogacy, the minimum probability of a prediction error (PPE), is introduced for the setting where both endpoints are binary, i.e., the probability of erroneously predicting the value of ΔT using ΔS. Although the PPE has a more straightforward interpretation than the ICA, its magnitude is bounded above by a quantity that depends on the true endpoint. For this reason, the reduction in prediction error (RPE) attributed to the surrogate is defined. The RPE always lies in the unit interval, taking value 1 if prediction is perfect and 0 if ΔS conveys no information on ΔT. The methodology is illustrated using data from two clinical trials, and a user-friendly R package, Surrogate, is provided to carry out the validation exercise.
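Under one plausible reading of these definitions (an assumption of this sketch, not the Surrogate package's exact implementation), the PPE is the Bayes error of predicting ΔT from ΔS given their joint distribution, and the RPE compares it with the error incurred when the surrogate is ignored. A minimal numpy sketch with a hypothetical joint pmf, rows indexing ΔS values and columns indexing ΔT values:

```python
import numpy as np

def ppe_rpe(joint):
    """joint[s, t] = P(Delta_S = s, Delta_T = t) (hypothetical encoding).

    PPE  : Bayes error when predicting Delta_T from Delta_S.
    PPE0 : Bayes error using only the marginal of Delta_T.
    RPE  : relative reduction in prediction error, in [0, 1].
    """
    joint = np.asarray(joint, dtype=float)
    ppe = 1.0 - joint.max(axis=1).sum()    # best guess of Delta_T for each Delta_S value
    ppe0 = 1.0 - joint.sum(axis=0).max()   # best guess ignoring the surrogate
    rpe = (ppe0 - ppe) / ppe0 if ppe0 > 0 else 0.0
    return ppe, ppe0, rpe

# Deterministic relationship: Delta_S pins down Delta_T, so RPE = 1.
print(ppe_rpe([[0.5, 0.0], [0.0, 0.5]]))          # rpe = 1.0

# Independence: the surrogate carries no information, so RPE = 0.
print(ppe_rpe(np.outer([0.5, 0.5], [0.7, 0.3])))  # rpe = 0.0
```

The two extreme cases reproduce the boundary behavior described in the abstract; intermediate joint distributions give an RPE strictly between 0 and 1.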

2.
The posterior predictive p value (ppp) was invented as a Bayesian counterpart to classical p values. The methodology can be applied to discrepancy measures involving both data and parameters and can hence be targeted to check various modeling assumptions. The interpretation can, however, be difficult, since the distribution of the ppp value under the modeling assumptions varies substantially between cases. A calibration procedure has been suggested, treating the ppp value as a test statistic in a prior predictive test. In this paper, we suggest that a prior predictive test may instead be based on the expected posterior discrepancy, which is somewhat simpler, both conceptually and computationally. Since both of these methods require simulating a large posterior parameter sample for each member of an equally large prior predictive data sample, we further suggest looking for ways to match the given discrepancy with a computation-saving conflict measure. This approach is also based on simulations but only requires sampling from two different distributions representing two contrasting information sources about a model parameter. The conflict measure methodology is also more flexible in that it handles non-informative priors without difficulty. We compare the different approaches theoretically in some simple models and in a more complex applied example.
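The double-simulation cost mentioned above can be made concrete in a toy conjugate model. The sketch below (the sample size, prior variance, and discrepancy choice are all hypothetical) computes a ppp value by posterior simulation, then calibrates it against a prior predictive reference distribution of ppp values:

```python
import numpy as np

rng = np.random.default_rng(0)
n, tau2 = 20, 4.0   # hypothetical sample size and prior variance

def ppp(y, n_sim=500):
    """ppp value for discrepancy D(y, theta) = sum (y_i - theta)^2
    under the model y_i ~ N(theta, 1), theta ~ N(0, tau2)."""
    v = 1.0 / (n + 1.0 / tau2)           # posterior variance of theta
    m = v * y.sum()                      # posterior mean of theta
    theta = rng.normal(m, np.sqrt(v), size=n_sim)
    y_rep = rng.normal(theta[:, None], 1.0, size=(n_sim, n))
    d_obs = ((y[None, :] - theta[:, None]) ** 2).sum(axis=1)
    d_rep = ((y_rep - theta[:, None]) ** 2).sum(axis=1)
    return (d_rep >= d_obs).mean()

y_obs = rng.normal(0.0, 2.0, size=n)     # overdispersed relative to the model
ppp_obs = ppp(y_obs)

# Calibration: each prior predictive dataset needs its own posterior sample,
# which is exactly the nested-simulation cost the conflict measure avoids.
ref = [ppp(rng.normal(rng.normal(0, np.sqrt(tau2)), 1.0, size=n)) for _ in range(200)]
p_cal = np.mean([r <= ppp_obs for r in ref])
```

Here `p_cal` is the calibrated tail probability: the fraction of prior predictive ppp values at least as extreme (small) as the observed one.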

3.
Predictive criteria, including the adjusted squared multiple correlation coefficient, the adjusted concordance correlation coefficient, and the predictive error sum of squares, are available for model selection in the linear mixed model. These criteria all involve some sort of comparison of observed values and predicted values, adjusted for the complexity of the model. The predicted values can be conditional on the random effects or marginal, i.e., based on averages over the random effects. These criteria have not been investigated for model selection success.

We used simulations to investigate selection success rates for several versions of these predictive criteria as well as several versions of Akaike's information criterion and the Bayesian information criterion, and the pseudo F-test. The simulations involved the simple scenario of selection of a fixed parameter when the covariance structure is known.

Several variance–covariance structures were used. For compound symmetry structures, higher success rates for the predictive criteria were obtained when marginal rather than conditional predicted values were used. Information criteria had higher success rates when a certain term (normally left out in SAS MIXED computations) was included in the criteria. Various penalty functions were used in the information criteria, but these had little effect on success rates. The pseudo F-test performed as expected. For the autoregressive with random effects structure, the results were the same except that success rates were higher for the conditional version of the predictive error sum of squares.

Characteristics of the data, such as the covariance structure, parameter values, and sample size, greatly impacted performance of various model selection criteria. No one criterion was consistently better than the others.

4.
Process capability indices (PCIs) have been widely used in manufacturing industries to provide a quantitative measure of process potential and performance. While some efforts have been dedicated in the literature to the statistical properties of PCI estimators, scant attention has been given to the evaluation of these properties when sample data are affected by measurement errors. In this work we deal with the problem of measurement error effects on the performance of PCIs. The analysis is illustrated with reference to Cp, i.e., the simplest and most common measure suggested to evaluate process capability. The authors would like to thank two anonymous referees for their comments and suggestions, which were useful in the preparation and improvement of this paper. This work was partially supported by a MURST research grant.

5.
For right-censored data, the accelerated failure time (AFT) model is an alternative to the commonly used proportional hazards regression model. It is a linear model for the (log-transformed) outcome of interest, and is particularly useful for censored outcomes that are not time-to-event, such as laboratory measurements. We provide a general and easily computable definition of the R2 measure of explained variation under the AFT model for right-censored data. We study its behavior under different censoring scenarios and under different error distributions; in particular, we also study its robustness when the parametric error distribution is misspecified. Based on Monte Carlo investigation results, we recommend the log-normal distribution as a robust error distribution to be used in practice for the parametric AFT model when the R2 measure is of interest. We apply our methodology to a data set on alcohol consumption during pregnancy from Ukraine.

6.
In drug development, after completion of phase II proof-of-concept trials, the sponsor needs to make a go/no-go decision to start expensive phase III trials. The probability of statistical success (PoSS) of the phase III trials based on data from earlier studies is an important factor in that decision-making process. Instead of statistical power, the predictive power of a phase III trial, which takes into account the uncertainty in the estimation of treatment effect from earlier studies, has been proposed to evaluate the PoSS of a single trial. However, regulatory authorities generally require statistical significance in two (or more) trials for marketing licensure. We show that the predictive statistics of two future trials are statistically correlated through use of the common observed data from earlier studies. Thus, the joint predictive power should not be evaluated as a simplistic product of the predictive powers of the individual trials. We develop the relevant formulae for the appropriate evaluation of the joint predictive power and provide numerical examples. Our methodology is further extended to the more complex phase III development scenario comprising more than two (K > 2) trials, that is, the evaluation of the PoSS of at least k0 of the K trials in a program. Copyright © 2013 John Wiley & Sons, Ltd.
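A small Monte Carlo sketch (hypothetical phase II estimate, standard errors, and significance threshold, not the paper's formulae) illustrates why the joint predictive power of two phase III trials differs from the simplistic product: both trials share the same uncertain true effect carried forward from phase II:

```python
import numpy as np

rng = np.random.default_rng(1)
delta_hat, se2 = 0.3, 0.15   # hypothetical phase II estimate and its standard error
se3 = 0.10                   # standard error of the estimate in each phase III trial
z_crit = 1.959964            # one-sided 2.5% significance threshold

n_sim = 200_000
delta = rng.normal(delta_hat, se2, size=n_sim)   # phase II uncertainty about the effect
z1 = rng.normal(delta, se3) / se3                # z statistic of phase III trial 1
z2 = rng.normal(delta, se3) / se3                # z statistic of phase III trial 2
ok1, ok2 = z1 > z_crit, z2 > z_crit

joint = (ok1 & ok2).mean()        # joint predictive power of requiring both successes
naive = ok1.mean() * ok2.mean()   # simplistic product of individual predictive powers
print(joint, naive)
```

Because the common phase II data induce positive correlation between the two predictive statistics, `joint` exceeds `naive` here; ignoring the correlation understates the PoSS.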

7.
We consider an approach to prediction in the linear model when values of the future explanatory variables are unavailable: we predict a future response y_f at a future sample point x_f when some components of x_f are missing. We consider both the case where the components of x_f are dependent and the case where they are independent, assuming normality in each. A Taylor expansion is used to derive an approximation to the predictive density, and the influence of missing future explanatory variables (the loss, or discrepancy) is assessed using the Kullback–Leibler measure of divergence. This discrepancy is compared in different scenarios, including the situation where the missing variables are dropped entirely.

8.
A number of results have been derived recently concerning the influence of individual observations in a principal component analysis. Some of these results, particularly those based on the correlation matrix, are applied to data consisting of seven anatomical measurements on students. The data have a correlation structure which is fairly typical of many found in allometry. This case study shows that theoretical influence functions often provide good estimates of the actual changes observed when individual observations are deleted from a principal component analysis. Different observations may be influential for different aspects of the principal component analysis (coefficients, variances and scores of principal components); these differences, and the distinction between outlying and influential observations are discussed in the context of the case study. A number of other complications, such as switching and rotation of principal components when an observation is deleted, are also illustrated.

9.
A well-known difficulty in survey research is that respondents’ answers to questions can depend on arbitrary features of a survey’s design, such as the wording of questions or the ordering of answer choices. In this paper, we describe a novel set of tools for analyzing survey data characterized by such framing effects. We show that the conventional approach to analyzing data with framing effects—randomizing survey-takers across frames and pooling the responses—generally does not identify a useful parameter. In its place, we propose an alternative approach and provide conditions under which it identifies the responses that are unaffected by framing. We also present several results that shed light on the population distribution of the individual characteristic the survey is designed to measure.

10.
The coefficient of determination, a.k.a. R2, is well-defined in linear regression models, and measures the proportion of variation in the dependent variable explained by the predictors included in the model. To extend it to generalized linear models, we use the variance function to define the total variation of the dependent variable, as well as the variation of the dependent variable remaining after modeling the predictive effects of the independent variables. Unlike other definitions that demand complete specification of the likelihood function, our definition of R2 requires only the mean and variance functions, and is thus applicable to more general quasi-models. It is consistent with the classical measure of uncertainty using variance, and reduces to the classical definition of the coefficient of determination when linear regression models are considered.
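As a rough sketch only (the particular variance-function scaling below is an assumed form for illustration, not necessarily the authors' exact definition), one can contrast variance-function-weighted residual and total variation; with a constant variance function this reduces to the classical R2:

```python
import numpy as np

def r2_quasi(y, mu_hat, var_fun):
    """Pearson-style R^2 sketch: residual vs. total variation, each scaled
    by a variance function V(mu). Only the mean fit mu_hat and V are needed,
    not a full likelihood."""
    y, mu_hat = np.asarray(y, float), np.asarray(mu_hat, float)
    ybar = y.mean()
    resid = np.sum((y - mu_hat) ** 2 / var_fun(mu_hat))
    total = np.sum((y - ybar) ** 2 / var_fun(ybar))
    return 1.0 - resid / total

# Gaussian case with constant variance function V(mu) = 1:
# reduces exactly to the classical coefficient of determination.
x = np.arange(10, dtype=float)
y = 2.0 * x + np.array([0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.1, 0.0, -0.1])
b = np.polyfit(x, y, 1)
mu = np.polyval(b, x)
print(r2_quasi(y, mu, lambda m: np.ones_like(m)))  # equals the classical R^2 here
```

For a Poisson-type quasi-model one would instead pass `var_fun=lambda m: m`.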

11.
In clinical studies, researchers measure patients' responses longitudinally. Recent studies have used mixed models to determine effects at the individual level. Henderson et al. [3,4], on the other hand, developed a joint likelihood function that combines the likelihood functions of longitudinal biomarkers and survival times. They place random effects in the longitudinal component to determine whether a longitudinal biomarker is associated with time to an event. In this paper, we treat a longitudinal biomarker as a growth curve and extend Henderson's method to determine whether a longitudinal biomarker is associated with time to an event for multivariate survival data.

12.
We show that the maximum likelihood estimators (MLEs) of the fixed effects and within-cluster correlation are consistent in a heteroscedastic nested-error regression (HNER) model with completely unknown within-cluster variances under mild conditions. The result implies that the empirical best linear unbiased prediction (EBLUP) method for small area estimation is valid in such a case. We also show that ignoring the heteroscedasticity can lead to inconsistent estimation of the within-cluster correlation and inferior predictive performance. A jackknife measure of uncertainty for the EBLUP is developed under the HNER model. Simulation studies are carried out to investigate the finite-sample performance of the EBLUP and MLE under the HNER model, with comparisons to those under the nested-error regression model in various situations, as well as that of the jackknife measure of uncertainty. The well-known Iowa crops data is used for illustration. The Canadian Journal of Statistics 40: 588–603; 2012 © 2012 Statistical Society of Canada

13.
Data envelopment analysis (DEA) is a deterministic econometric model for calculating efficiency by using data from an observed set of decision-making units (DMUs). We propose a method for calculating the distribution of efficiency scores. Our framework relies on estimating data from an unobserved set of DMUs. The model provides posterior predictive data for the unobserved DMUs to augment the DEA frontier, which yields a posterior predictive distribution for the efficiency scores. We explore the method on a multiple-input and multiple-output DEA model. The data for the example are from a comprehensive examination of how nursing homes complete a standardized mandatory assessment of residents.
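For intuition, the single-input, single-output special case of DEA efficiency can be computed without linear programming; each DMU's score is its output-per-input ratio relative to the best observed ratio. The DMU data below are hypothetical, and the paper's actual model is multi-input, multi-output with a posterior predictive augmentation of the frontier:

```python
import numpy as np

def dea_ccr_1in_1out(x, y):
    """Single-input, single-output CCR-DEA efficiency: each DMU's
    output/input ratio divided by the maximum observed ratio, so the
    frontier DMU scores exactly 1."""
    ratio = np.asarray(y, float) / np.asarray(x, float)
    return ratio / ratio.max()

# Hypothetical DMUs: staff hours (input) and completed assessments (output).
eff = dea_ccr_1in_1out([10, 20, 30], [5, 14, 15])
print(eff)  # the second DMU defines the frontier
```

Augmenting `x` and `y` with simulated (posterior predictive) DMUs before recomputing would move the frontier, which is the mechanism behind the distribution of efficiency scores described above.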

14.
Consider panel data modelled by a linear random intercept model that includes a time-varying covariate. Suppose that our aim is to construct a confidence interval for the slope parameter. Commonly, a Hausman pretest is used to decide whether this confidence interval is constructed using the random effects model or the fixed effects model. This post-model-selection confidence interval has the attractive features that it (a) is relatively short when the random effects model is correct and (b) reduces to the confidence interval based on the fixed effects model when the data and the random effects model are highly discordant. However, this confidence interval has the drawbacks that (i) its endpoints are discontinuous functions of the data and (ii) its minimum coverage can be far below its nominal coverage probability. We construct a new confidence interval that possesses these attractive features, but does not suffer from these drawbacks. This new confidence interval provides an intermediate between the post-model-selection confidence interval and the confidence interval obtained by always using the fixed effects model. The endpoints of the new confidence interval are smooth functions of the Hausman test statistic, whereas the endpoints of the post-model-selection confidence interval are discontinuous functions of this statistic.

15.
We discuss the case of the multivariate linear model Y = XB + E, with Y an (n × p) matrix, and so on, when there are missing observations in the Y matrix in a so-called nested pattern. We propose an analysis that arises by incorporating the predictive density of the missing observations in determining the posterior distribution of B, and its mean and variance matrix. This involves working with matric-T variables. The resulting analysis is illustrated with some Canadian economic data.

16.
A random effects model for analyzing mixed longitudinal count and ordinal data is presented, where the count response is inflated at two points (k and l) and a (k,l)-inflated power series distribution is used as its distribution. A full likelihood-based approach is used to obtain maximum likelihood estimates of the model parameters. For data with non-ignorable missing values, models with a probit model for the missing mechanism are used. The dependence between the longitudinal sequences of responses and the inflation parameters is investigated using a random effects approach. Also, to investigate the correlation between the mixed ordinal and count responses of each individual at each time, a shared random effect is used. In order to assess the performance of the model, a simulation study is performed for the case in which the count response has a (k,l)-inflated binomial distribution. Performance comparisons of the count-ordinal random effects model, the zero-inflated ordinal random effects model, and the (k,l)-inflated ordinal random effects model are also given. The model is applied to a real social data set from the first two waves of the National Longitudinal Study of Adolescent to Adult Health (Add Health study). In this data set, the joint responses are the number of days in a month that each individual smoked, as the count response, and the general health condition of each individual, as the ordinal response. For the count response, there is an excess of the values 0 and 30.

17.
This article considers a circular regression model for clustered data, where both the cluster effects and the regression errors have von Mises distributions. It involves β, a vector of parameters for the fixed effects, and two concentration parameters for the error distribution. A measure of intra-cluster circular correlation and a predictor for an unobserved cluster random effect are studied. Preliminary estimators for the vector β and the two concentration parameters are proposed, and their performance is compared with that of the maximum likelihood estimators in a simulation study. A numerical example investigating the factors impacting the orientation taken by a sand hopper when released is presented. The Canadian Journal of Statistics 47: 712–728; 2019 © 2019 Statistical Society of Canada

18.
The concordance statistic (C-statistic) is commonly used to assess the predictive performance (discriminatory ability) of a logistic regression model. Although there are several approaches to the C-statistic, their performance in quantifying the improvement in predictive accuracy due to the inclusion of novel risk factors or biomarkers in the model has been strongly criticized in the literature. This paper proposes a model-based concordance-type index, CK, for use with the logistic regression model. The CK and its asymptotic sampling distribution are derived following Gonen and Heller's approach for the Cox PH model for survival data, with the necessary modifications for use with binary data. Unlike the existing C-statistics for the logistic model, it quantifies the concordance probability by taking the difference in the predicted risks between the two subjects in a pair rather than ranking them, and hence is able to quantify the equivalent incremental value from a new risk factor or marker. The simulation study revealed that CK performs well when the model parameters are correctly estimated for large samples, and shows greater improvement in quantifying the additional predictive value from a new risk factor or marker than the existing C-statistics. Furthermore, an illustration using three datasets supports the findings from the simulation study.
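The conventional rank-based C-statistic that the abstract contrasts with CK can be sketched directly from predicted risks; CK itself, per the abstract, averages predicted-risk differences rather than ranks. The data here are hypothetical:

```python
import numpy as np

def c_statistic(p, y):
    """Rank-based C-statistic (AUC): proportion of (event, non-event) pairs
    in which the event subject has the higher predicted risk; ties count 1/2."""
    p, y = np.asarray(p, float), np.asarray(y, int)
    pe, pn = p[y == 1], p[y == 0]                  # risks for events / non-events
    diff = pe[:, None] - pn[None, :]               # all pairwise risk differences
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

# Perfect separation of predicted risks gives C = 1.
print(c_statistic([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))
```

Because this estimator uses only the sign of each pairwise difference, small but systematic gains in predicted risk from a new marker barely move it, which is the limitation CK is designed to address.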

19.
In this paper, we consider the influence of individual observations on inferences about the Box–Cox power transformation parameter from a Bayesian point of view. We compare Bayesian diagnostic measures with the ‘forward’ method of analysis due to Riani and Atkinson. In particular, we look at the effect of omitting observations on the inference by comparing particular choices of transformation using the conditional predictive ordinate and the k_d measure of Pettit and Young. We illustrate the methods using a designed experiment. We show that a group of masked outliers can be detected using these single-deletion diagnostics. Also, we show that Bayesian diagnostic measures are simpler to use for investigating the effect of observations on transformations than the forward search method.

20.
Measuring a statistical model's complexity is important for model criticism and comparison. However, it is unclear how to do this for hierarchical models due to uncertainty about how to count the random effects. The authors develop a complexity measure for generalized linear hierarchical models based on linear model theory. They demonstrate the new measure for binomial and Poisson observables modeled using various hierarchical structures, including a longitudinal model and an areal-data model having both spatial clustering and pure heterogeneity random effects. They compare their new measure to a Bayesian index of model complexity, the effective number pD of parameters (Spiegelhalter, Best, Carlin & van der Linde 2002); the comparisons are made in the binomial and Poisson cases via simulation and two real data examples. The two measures are usually close, but differ markedly in some instances where pD is arguably inappropriate. Finally, the authors show how the new measure can be used to approach the difficult task of specifying prior distributions for variance components, and in the process cast further doubt on the commonly-used vague inverse gamma prior.
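The linear-model-theory idea of counting shrunken random effects only partially can be sketched with the trace of a ridge-type smoother. The penalty `lam`, standing in for a ratio of variance components, is an assumption of this illustration, not the authors' exact measure:

```python
import numpy as np

def effective_dof(X, lam):
    """Effective number of parameters from linear model theory: trace of the
    smoother X (X'X + lam*I)^{-1} X'. With lam = 0 this is the full fixed-effect
    count; shrinkage (lam > 0) counts each coefficient only partially, which is
    the intuition behind complexity measures for hierarchical models."""
    X = np.asarray(X, float)
    p = X.shape[1]
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    return np.trace(H)

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 5))
print(effective_dof(X, 0.0))    # no shrinkage: full parameter count of 5
print(effective_dof(X, 10.0))   # random-effect-style shrinkage: fewer than 5
```

Strong shrinkage drives the effective count toward zero, and weak shrinkage toward the raw number of effects, which is why naive parameter counting is ambiguous for hierarchical models.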
