Found 20 similar articles (search time: 15 ms)
1.
Shelley B. Bull, Celia M.T. Greenwood, Allan Donner 《Revue canadienne de statistique》1994,22(3):319-334
One feature of the usual polychotomous logistic regression model for categorical outcomes is that a covariate must be included in all the regression equations. If a covariate is not important in all of them, the procedure will estimate unnecessary parameters. More flexible approaches allow different subsets of covariates in different regressions. One alternative uses individualized regressions which express the polychotomous model as a series of dichotomous models. Another uses a model in which a reduced set of parameters is simultaneously estimated for all the regressions. Large-sample efficiencies of these procedures were compared in a variety of circumstances in which there was a common baseline category for the outcome and the covariates were normally distributed. For a correctly specified model, the reduced estimates were over 100% efficient for nonzero slope parameters and up to 500% efficient when the baseline frequency and the effect of interest were small. The individualized estimates could have efficiencies less than 50% when the effect of interest was large, but were also up to 130% efficient when the baseline frequency was large and the effect of interest was small. Efficiency was usually enhanced by correlation among the covariates. For an underspecified reduced model, asymptotic bias in the reduced estimates was approximately proportional to the magnitude of the omitted parameter and to the reciprocal of the baseline frequency.
2.
Yizheng Wei, Yanyuan Ma, Tanya P. Garcia, Samiran Sinha 《Revue canadienne de statistique》2019,47(2):140-156
We propose a consistent and locally efficient method of estimating the model parameters of a logistic mixed effect model with random slopes. Our approach relaxes two typical assumptions: the random effects being normally distributed, and the covariates and random effects being independent of each other. Adhering to these assumptions is particularly difficult in health studies where, in many cases, we have limited resources to design experiments and gather data in long‐term studies, while new findings from other fields might emerge, suggesting the violation of such assumptions. So it is crucial to have an estimator that is robust to such violations; then we could make better use of current data harvested using various valuable resources. Our method generalizes the framework presented in Garcia & Ma (2016) which also deals with a logistic mixed effect model but only considers a random intercept. A simulation study reveals that our proposed estimator remains consistent even when the independence and normality assumptions are violated. This contrasts favourably with the traditional maximum likelihood estimator which is likely to be inconsistent when there is dependence between the covariates and random effects. Application of this work to a study of Huntington's disease reveals that disease diagnosis can be enhanced using assessments of cognitive performance. The Canadian Journal of Statistics 47: 140–156; 2019 © 2019 Statistical Society of Canada
3.
This paper provides a partial solution to a problem posed by J. Neyman (1965) regarding the characterization of the multivariate negative binomial distribution based on the properties of regression. It is shown that some of the properties of regression characterize the form of the nonsingular dispersion matrix of the parent distribution, which, interestingly enough, corresponds to only two types, viz. those of positive and negative multivariate binomial distributions.
4.
A table of expected success rates under normally distributed success logit, used in conjunction with logistic regression analysis, enables easy calculation of expected win for betting on success of a future dichotomous trial.
5.
The paper provides a novel application of the probabilistic reduction (PR) approach to the analysis of multi-categorical outcomes. The PR approach, which systematically takes account of heterogeneity and functional form concerns, can improve the specification of binary regression models. However, its utility for systematically enriching the specification of and inference from models of multi-categorical outcomes has not been examined, while multinomial logistic regression models are commonly used for inference and, increasingly, prediction. Following a theoretical derivation of the PR-based multinomial logistic model (MLM), we compare functional specification and marginal effects from a traditional specification and a PR-based specification in a model of post-stroke hospital discharge disposition and find that the traditional MLM is misspecified. Results suggest that the impact on the reliability of substantive inferences from a misspecified model may be significant, even when model fit statistics do not suggest a strong lack of fit compared with a properly specified model using the PR approach. We identify situations under which a PR-based MLM specification can be advantageous to the applied researcher.
6.
7.
E. M. Hashimoto, E. M. M. Ortega, G. M. Cordeiro, A. K. Suzuki, M. W. Kattan 《Journal of applied statistics》2020,47(12):2159
The multinomial logistic regression model (MLRM) can be interpreted as a natural extension of the binomial model with logit link function to situations where the response variable can have three or more possible outcomes. In addition, when the categories of the response variable are nominal, the MLRM can be expressed in terms of two or more logistic models and analyzed in both frequentist and Bayesian approaches. However, few discussions about post modeling in categorical data models are found in the literature, and they mainly use Bayesian inference. The objective of this work is to present classic and Bayesian diagnostic measures for categorical data models. These measures are applied to a dataset (status) of patients undergoing kidney transplantation.
8.
9.
《Journal of Statistical Computation and Simulation》2012,82(7):1412-1426
In the multinomial regression model, we consider the methodology for simultaneous model selection and parameter estimation by using the shrinkage and LASSO (least absolute shrinkage and selection operator) [R. Tibshirani, Regression shrinkage and selection via the LASSO, J. R. Statist. Soc. Ser. B 58 (1996), pp. 267–288] strategies. The shrinkage estimators (SEs) provide significant improvement over their classical counterparts in the case where some of the predictors may or may not be active for the response of interest. The asymptotic properties of the SEs are developed using the notion of asymptotic distributional risk. We then compare the relative performance of the LASSO estimator with two SEs in terms of simulated relative efficiency. A simulation study shows that the shrinkage and LASSO estimators dominate the full model estimator. Further, both SEs perform better than the LASSO estimators when there are many inactive predictors in the model. A real-life data set is used to illustrate the suggested shrinkage and LASSO estimators.
10.
11.
S. Karagulle 《Journal of applied statistics》2016,43(3):538-549
Current statistical methods for analyzing epidemiological data with disease subtype information allow us to acquire knowledge not only about risk factor-disease subtype associations but also, more profoundly, about heterogeneity in these associations by multiple disease characteristics (so-called etiologic heterogeneity of the disease). Current interest, particularly in cancer epidemiology, lies in obtaining a valid p-value for testing the hypothesis that a particular cancer is etiologically heterogeneous. We consider the two-stage logistic regression model along with the pseudo-conditional likelihood estimation method and design a testing strategy based on Rao's score test. An extensive Monte Carlo simulation study is carried out, and the false discovery rate and statistical power of the suggested test are investigated. Simulation results indicate that, with the proposed testing strategy, even a small degree of true etiologic heterogeneity can be recovered with large statistical power from the sampled data. The strategy is then applied to a breast cancer data set to illustrate its use in practice where there are multiple risk factors and multiple disease characteristics of simultaneous concern.
12.
Henrick J. Malik 《Communications in Statistics - Theory and Methods》2013,42(14):1527-1534
In this paper we obtain an exact formula for the cumulative distribution function of the rth quasi-range from the logistic distribution. For the special case r = 0, the result agrees with that for the range given by Gupta and Shah (1965).
13.
In this paper, we obtain a new approximation of the Student's t distribution by using the symmetric generalized logistic (SGL) distribution function. The error of this approximation is shown to be O(1/n^2), where n is the degrees of freedom of the t distribution. In comparison to similar approximations by George and Ojo and George et al. (1986), this new approximation is much simpler and more accurate. It is also shown that under some conditions, the t distribution is a good approximation of the SGL distribution. Therefore, the complicated expressions for the cumulants and moments of the SGL can be approximated by those of the t distribution. Finally, numerical results are given.
14.
Kadri Ulas Akay 《Journal of applied statistics》2014,41(6):1217-1232
In comparison to other experimental studies, multicollinearity appears frequently in mixture experiments, a special study area of response surface methodology, due to the constraints on the components composing the mixture. In the analysis of mixture experiments using the logistic regression model, a special generalized linear model, multicollinearity causes precision problems in the maximum-likelihood logistic regression estimate. The effects of multicollinearity can be reduced to a certain extent by using alternative approaches. One of these approaches is to use biased estimators for the estimation of the coefficients. In this paper, we suggest the use of the logistic ridge regression (RR) estimator in cases where there is multicollinearity during the analysis of mixture experiments using logistic regression. Also, for the selection of the biasing parameter, we use fraction of design space plots for evaluating the effect of the logistic RR estimator with respect to the scaled mean squared error of prediction. The suggested graphical approaches are illustrated on the tumor incidence data set.
15.
Ozge Tanju 《Journal of Statistical Computation and Simulation》2018,88(7):1394-1414
Model selection methods are important for identifying the best approximating model. To identify the best meaningful model, the purpose of the model should be clearly pre-stated. The focus of this paper is model selection when the modelling purpose is classification. We propose a new model selection approach designed for logistic regression model selection where the main modelling purpose is classification. The method is based on the distance between the two clustering trees. We also question and evaluate the performance of conventional model selection methods based on information theory concepts in determining the best logistic regression classifier. An extensive simulation study is used to assess the finite sample performance of the cluster tree based and the information theoretic model selection methods. Simulations are adjusted for whether the true model is in the candidate set or not. Results show that the new approach is highly promising. Finally, the methods are applied to a real data set to select a binary model as a means of classifying the subjects with respect to their risk of breast cancer.
16.
The logistic distribution is a simple distribution possessing many useful properties and has been used extensively for analyzing growth. Recently, van Staden and King proposed a quantile-based skew logistic distribution. In this paper, we introduce an alternative skew logistic distribution. We then establish recurrence relations for the computation of the single and product moments of order statistics from the standard skew logistic distribution by using the moments of order statistics from the standard half logistic distribution. These enable an efficient computation of means, variances and covariances of order statistics from the skew logistic distribution for all sample sizes. The results become useful in determining the best linear unbiased estimators of the location and scale parameters of the skew logistic distribution. Finally, we provide an example to illustrate the usefulness of the developed model and then compare its fit with that provided by the model of van Staden and King.
17.
Brandi N. Falley, James D. Stamey, A. Alexander Beaujean 《Journal of applied statistics》2018,45(10):1756-1769
Measurement error is a commonly addressed problem in psychometrics and the behavioral sciences, particularly where gold standard data either do not exist or are too expensive. The Bayesian approach can be utilized to adjust for the bias that results from measurement error in tests. Bayesian methods offer other practical advantages for the analysis of epidemiological data including the possibility of incorporating relevant prior scientific information and the ability to make inferences that do not rely on large sample assumptions. In this paper we consider a logistic regression model where both the response and a binary covariate are subject to misclassification. We assume both a continuous measure and a binary diagnostic test are available for the response variable but no gold standard test is assumed available. We consider a fully Bayesian analysis that affords such adjustments, accounting for the sources of error and correcting estimates of the regression parameters. Based on the results from our example and simulations, the models that account for misclassification produce more statistically significant results than the models that ignore misclassification. A real data example on math disorders is considered.
18.
In epidemiologic studies where the outcome is binary, the data often arise as clusters, as when siblings, friends or neighbors are used as matched controls in a case-control study. Conditional logistic regression (CLR) is typically used for such studies to estimate the odds ratio for an exposure of interest. However, CLR assumes the exposure coefficient is the same in every cluster, and CLR-based inference can be badly biased when homogeneity is violated. Existing methods for testing goodness-of-fit for CLR are not designed to detect such violations. Good alternative methods of analysis exist if one suspects there is heterogeneity across clusters. However, routine use of alternative robust approaches when there is no appreciable heterogeneity could cause loss of precision and be computationally difficult, particularly if the clusters are small. We propose a simple non-parametric test, the test of heterogeneous susceptibility (THS), to assess the assumption of homogeneity of a coefficient across clusters. The test is easy to apply and provides guidance as to the appropriate method of analysis. Simulations demonstrate that the THS has reasonable power to reveal violations of homogeneity. We illustrate by applying the THS to a study of periodontal disease.
19.
20.
Guoping Zeng 《Communications in Statistics - Theory and Methods》2017,46(22):11194-11203
The problems of existence and uniqueness of maximum likelihood estimates for logistic regression were completely solved by Silvapulle in 1981 and Albert and Anderson in 1984. In this paper, we extend the well-known results by Silvapulle and by Albert and Anderson to weighted logistic regression. We analytically prove the equivalence between the overlap condition used by Albert and Anderson and that used by Silvapulle. We show that the maximum likelihood estimate of weighted logistic regression does not exist if there is a complete separation or a quasicomplete separation of the data points, and exists and is unique if there is an overlap of data points. Our proofs and results for weighted logistic regression apply to unweighted logistic regression.
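
The separation result described in the last abstract can be illustrated numerically. The following is a minimal sketch, not the paper's proof or code: the toy data, the weights, and the function name `weighted_log_likelihood` are all assumptions made for illustration. It evaluates a weighted logistic log-likelihood along a single direction (intercept 0, slope s) for a completely separated sample and for an overlapping one. Under complete separation the log-likelihood keeps rising toward 0 as s grows, so no finite maximizer exists along that direction; with overlap it eventually falls.

```python
import math

def weighted_log_likelihood(beta0, beta1, xs, ys, ws):
    """Weighted logistic log-likelihood: sum_i w_i * (y_i*eta_i - log(1 + exp(eta_i)))."""
    total = 0.0
    for x, y, w in zip(xs, ys, ws):
        eta = beta0 + beta1 * x
        # Numerically stable evaluation of log(1 + exp(eta)).
        log1pexp = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
        total += w * (y * eta - log1pexp)
    return total

# Hypothetical toy data (assumed for illustration only).
xs = [-2.0, -1.0, 1.0, 2.0]
ws = [1.0, 2.0, 0.5, 1.5]      # positive observation weights
ys_separated = [0, 0, 1, 1]    # x < 0 always gives y = 0, x > 0 always gives y = 1
ys_overlap = [0, 1, 0, 1]      # the two classes interleave on the x axis

# Scan the slope along one direction with the intercept fixed at 0.
scales = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
sep_path = [weighted_log_likelihood(0.0, s, xs, ys_separated, ws) for s in scales]
ovl_path = [weighted_log_likelihood(0.0, s, xs, ys_overlap, ws) for s in scales]

for s, a, b in zip(scales, sep_path, ovl_path):
    print(f"slope={s:5.1f}  separated={a:.6g}  overlap={b:.6g}")
```

Scanning one direction only demonstrates the behaviour; it does not establish existence or uniqueness in general, which is what the cited overlap conditions address.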