Similar literature
1.
A sampling experiment performed using data collected for a large clinical trial shows that the discriminant function estimates of the logistic regression coefficients for discrete variables may be severely biased. The simulations show that the mixed variable location model coefficient estimates have bias of the same magnitude as the bias in the coefficient estimates obtained by conditional maximum likelihood, but require about one-tenth of the computer time.

2.
This paper discusses recovery of information regarding logistic regression parameters in cases when maximum likelihood estimates of some parameters are infinite. An algorithm for detecting such cases and characterizing the divergence of the parameter estimates is presented. A method for fitting the remaining parameters is also presented. All of these methods rely only on sufficient statistics rather than less aggregated quantities, as required for inference according to the method of Kolassa & Tanner (1994). These results are applied to approximate conditional inference via saddlepoint methods. Specifically, the double saddlepoint method of Skovgaard (1987) is adapted to the case when the solution to the saddlepoint equations exists as a point at infinity.

3.
We derived two methods to estimate the logistic regression coefficients in a meta-analysis when only the 'aggregate' data (mean values) from each study are available. The estimators we proposed are the discriminant function estimator and the reverse Taylor series approximation. In an example with individual data, these two methods of estimation gave similar estimates. However, when aggregate data were used, the discriminant function estimators were quite different from the other two estimators. A simulation study was then performed to evaluate the performance of these two estimators as well as the estimator obtained from the model that simply uses the aggregate data in a logistic regression model. The simulation study showed that all three estimators are biased. The bias increases as the variance of the covariate increases. The distribution type of the covariates also affects the bias. In general, the estimator from the logistic regression using the aggregate data has less bias and better coverage probabilities than the other two estimators. We concluded that analysts should be cautious in using aggregate data to estimate the parameters of the logistic regression model for the underlying individual data.
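For orientation, the textbook form of the discriminant function estimator is sketched below. It assumes the covariates are multivariate normal within each outcome group with a common covariance matrix; the symbols (group means, pooled covariance S, group sizes n_1 and n_0) are introduced here for illustration, and the aggregate-data version used in the paper may differ.

```latex
\hat{\beta} = S^{-1}\,(\bar{x}_1 - \bar{x}_0),
\qquad
\hat{\beta}_0 = \log\frac{n_1}{n_0}
  - \tfrac{1}{2}\,(\bar{x}_1 + \bar{x}_0)^{\top} S^{-1}\,(\bar{x}_1 - \bar{x}_0)
```

Here \bar{x}_1 and \bar{x}_0 are the covariate means among responders and non-responders. The expressions follow from writing the log odds as the log prior odds plus the log ratio of the two normal densities.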

4.
A number of procedures have been used to analyze non-monotonic binary data to predict the probability of response. Some classical procedures are the Up and Down strategy, the Robbins–Monro procedure, and other sequential optimization designs. Recently, nonparametric procedures such as kernel regression and local linear regression (llogr) have been applied to this type of data. It is well known that kernel regression has problems fitting the data near the boundaries, and a drawback of local linear regression is that it may be “too linear” when fitting data from a curvilinear function. The procedure introduced in this paper is called local logistic regression, which fits a logistic regression function at each of the data points. An example is given using United States Army projectile data that supports the use of local logistic regression when analyzing non-monotonic binary data for certain response curves. Properties of local logistic regression will be presented along with simulation results that indicate some of the strengths of the procedure.
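The idea of fitting a logistic regression at each point can be sketched as a kernel-weighted fit centred at the point of interest. This is a minimal illustration, not the paper's procedure: the Gaussian kernel, the bandwidth, the Newton solver, and the function name `local_logistic_fit` are all assumptions made here.

```python
import numpy as np

def local_logistic_fit(x, y, x0, bandwidth, n_iter=25, ridge=1e-8):
    """Estimate P(y = 1 | x0) from a kernel-weighted logistic fit centred at x0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Gaussian kernel weights: observations near x0 count more (assumed choice).
    w = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)
    # Local design: intercept plus covariate centred at x0.
    X = np.column_stack([np.ones_like(x), x - x0])
    beta = np.zeros(2)
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30.0, 30.0)   # guard against overflow
        p = 1.0 / (1.0 + np.exp(-eta))
        grad = X.T @ (w * (y - p))             # weighted score vector
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X + ridge * np.eye(2)
        step = np.linalg.solve(hess, grad)     # Newton step
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    # Because the covariate is centred, the intercept is the logit at x0.
    return float(1.0 / (1.0 + np.exp(-beta[0])))

if __name__ == "__main__":
    # Simulated non-monotonic response curve (hypothetical data).
    rng = np.random.default_rng(0)
    x = np.linspace(-3, 3, 601)
    y = (rng.random(x.size) < 0.1 + 0.8 * np.exp(-x ** 2)).astype(float)
    for x0 in (-2.5, 0.0, 2.5):
        print(x0, local_logistic_fit(x, y, x0, bandwidth=0.5))
```

Repeating the fit over a grid of `x0` values traces out the estimated response curve. The small ridge term only stabilises the Newton step and is not part of the method described in the abstract.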

5.
Goodness of fit tests for the multiple logistic regression model   Total citations: 1 (self-citations: 0; citations by others: 1)
Several test statistics are proposed for the purpose of assessing the goodness of fit of the multiple logistic regression model. The test statistics are obtained by applying a chi-square test for a contingency table in which the expected frequencies are determined using two different grouping strategies and two different sets of distributional assumptions. The null distributions of these statistics are examined by applying the theory for chi-square tests of Moore and Spruill (1975) and through computer simulations. All statistics are shown to have a chi-square distribution or a distribution which can be well approximated by a chi-square. The degrees of freedom are shown to depend on the particular statistic and the distributional assumptions.

The power of each of the proposed statistics is examined for the normal, linear, and exponential alternative models using computer simulations.

6.
This study investigates the use of stratification to improve discrimination when prior probabilities vary across strata of a population of interest. Sources of heterogeneity in prior probabilities include differences in geographic locale, age differences in the population studied, or differences in the time component of the data collected. The article suggests using logistic regression both to identify the underlying stratification and to estimate prior probabilities. A simulation study compares misclassification rates under two alternative stratification schemes with the traditional discriminant approach that ignores stratification in favor of pooled prior estimates. The simulations show that large asymptotic gains can be realized by stratification, and that these gains can be realized in finite samples, given moderate differences in prior probabilities.

7.
This paper proposes a class of lack-of-fit tests for fitting a linear regression model when some response variables are missing at random. These tests are based on a class of minimum integrated square distances between a kernel type estimator of a regression function and the parametric regression function being fitted. These tests are shown to be consistent against a large class of fixed alternatives. The corresponding test statistics are shown to have asymptotic normal distributions under the null hypothesis and under a class of nonparametric local alternatives. Some simulation results are also presented.

8.
The problem of estimating the parameters of a logistic regression model is considered in the presence of multicollinearity, when it is suspected that the parameters of the model may be restricted to a subspace. We study the properties of the preliminary test based on the minimum ϕ-divergence estimator as well as on the ϕ-divergence test statistic. The minimum ϕ-divergence estimator is a natural extension of the maximum likelihood estimator, and the ϕ-divergence test statistics form a family of test statistics for testing the hypothesis that the regression coefficients may be restricted to a subspace.

9.
We propose two retrospective test statistics for testing the vector of odds ratio parameters under the logistic regression model based on case–control data by exploiting the density ratio structure under a two-sample semiparametric model, which is equivalent to the assumed logistic regression model. The proposed test statistics are based on Kullback–Leibler entropy distance and are particularly relevant to the case–control sampling plan. These two test statistics have identical asymptotic chi-squared distributions under the null hypothesis and identical asymptotic noncentral chi-squared distributions under local alternatives to the null hypothesis. Moreover, the proposed test statistics require computation of the maximum semiparametric likelihood estimators of the underlying parameters, but are otherwise easily computed. We present some results on simulation and on the analysis of two real data sets.

10.
Outcome-dependent sampling increases the efficiency of studies of rare outcomes, examples being case–control studies in epidemiology and choice-based sampling in econometrics. Two-phase or double sampling is a standard technique for drawing efficient stratified samples. We develop maximum likelihood estimation of logistic regression coefficients for a hybrid two-phase, outcome-dependent sampling design. An algorithm is given for determining the estimates by repeated fitting of ordinary logistic regression models. Simulation results demonstrate the efficiency loss associated with alternative pseudolikelihood and weighted likelihood methods for certain data configurations. These results provide an efficient solution to the measurement error problem with validation sampling based on a discrete surrogate.

11.
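Ramie knitted fabrics produce a strong prickling sensation when worn next to the skin, which to some extent limits the range of apparel uses for ramie fabrics. To address this problem, the article briefly introduces the evaluation of fabric prickle and studies how to reduce or even eliminate the prickle of ramie knitted fabrics. The range of each influencing factor was first determined through single-factor experiments. An orthogonal experiment, combined with the forearm test and a scoring method, then yielded an optimal set of enzyme treatment conditions: pH 5, enzyme dosage 3% (owf), liquor ratio 1:20, temperature 45 °C, and time 45 min.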

12.
Monotonic transformations of continuous explanatory variables are often used to improve the fit of the logistic regression model to the data. However, no analytic studies have examined the impact of such transformations. In this paper, we study invariant properties of the logistic regression model under monotonic transformations. We prove that the maximum likelihood estimates, information value, mutual information, Kolmogorov–Smirnov (KS) statistics, and lift table are all invariant under certain monotonic transformations.
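One ingredient behind this kind of invariance can be checked numerically: the two-sample KS statistic depends only on the ordering of the values, so a strictly increasing transformation leaves it unchanged. The sketch below illustrates only that rank-based fact on simulated scores; it does not reproduce the paper's proofs about covariate transformations, and the function name `ks_statistic` is an assumption made here.

```python
import numpy as np

def ks_statistic(scores_pos, scores_neg):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a = np.sort(np.asarray(scores_pos, dtype=float))
    b = np.sort(np.asarray(scores_neg, dtype=float))
    grid = np.concatenate([a, b])                      # evaluate at every observed value
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    pos = rng.normal(1.0, 1.0, 200)   # hypothetical scores for responders
    neg = rng.normal(0.0, 1.0, 300)   # hypothetical scores for non-responders
    # exp is strictly increasing, so the ordering of all scores is preserved.
    print(ks_statistic(pos, neg), ks_statistic(np.exp(pos), np.exp(neg)))
```

The two printed values should agree, because the empirical CDF gaps are ratios of counts that depend only on ranks.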

13.
Neuroimaging studies aim to analyze imaging data with complex spatial patterns in a large number of locations (called voxels) on a two-dimensional (2D) surface or in a 3D volume. Conventional analyses of imaging data include two sequential steps: spatially smoothing imaging data and then independently fitting a statistical model at each voxel. However, conventional analyses suffer from the same amount of smoothing throughout the whole image, the arbitrary choice of smoothing extent, and low statistical power in detecting spatial patterns. We propose a multiscale adaptive regression model (MARM) to integrate the propagation-separation (PS) approach (Polzehl and Spokoiny, 2000, 2006) with statistical modeling at each voxel for spatial and adaptive analysis of neuroimaging data from multiple subjects. MARM has three features: being spatial, being hierarchical, and being adaptive. We use a multiscale adaptive estimation and testing procedure (MAET) to utilize imaging observations from the neighboring voxels of the current voxel to adaptively calculate parameter estimates and test statistics. Theoretically, we establish consistency and asymptotic normality of the adaptive parameter estimates and the asymptotic distribution of the adaptive test statistics. Our simulation studies and real data analysis confirm that MARM significantly outperforms conventional analyses of imaging data.

14.
Violation of correct specification may cause undesirable results such as biased logistic regression coefficients and less efficient test statistics. In this paper, the asymptotic relative efficiency (ARE) of various coefficients of determination in misspecified binary logistic regression models is investigated. Seven types of misspecification are included. The ARE of test statistics for exponential and Weibull distributions as a method of calculating optimal cutpoints is derived to demonstrate misspecification. Theoretical relationships between coefficients of determination are also analyzed. Extensive simulations using the bootstrap method and a real data application reveal which coefficient is more efficient under various modeling scenarios.

15.
It is suggested that inference under the proportional hazard model can be carried out by programs for exact inference under the logistic regression model. Advantages of such inference are that software is available and that multivariate models can be addressed. The method has been evaluated by means of coverage and power calculations in certain situations. In all situations coverage was above the nominal level, but inference was rather conservative. A different type of exact inference is developed under Type II censoring. Inference was then less conservative; however, there are limitations with respect to the censoring mechanism and multivariate generalizations, and software is not available. This method also requires extensive computational power. Performance of large sample Wald, score and likelihood inference was also considered. Large sample methods work remarkably well with small data sets, but inference by score statistics seems to be the best choice. There seem to be some problems with likelihood ratio inference that may originate from how this method works with infinite estimates of the regression parameter. Inference by Wald statistics can be quite conservative with very small data sets.

16.
We propose a hybrid two-group classification method that integrates linear discriminant analysis, a polynomial expansion of the basis (or variable space), and a genetic algorithm with multiple crossover operations to select variables from the expanded basis. Using new product launch data from the biochemical industry, we found that the proposed algorithm offers mean percentage decreases in the misclassification error rate of 50%, 56%, 59%, 77%, and 78% in comparison to a support vector machine, artificial neural network, quadratic discriminant analysis, linear discriminant analysis, and logistic regression, respectively. These improvements correspond to annual cost savings of $4.40–$25.73 million.

17.
Fisher's linear discriminant analysis (FLDA) is known as a method to find a discriminative feature space for multi-class classification. As a theory of extending FLDA to an ultimate nonlinear form, optimal nonlinear discriminant analysis (ONDA) has been proposed. ONDA indicates that the best theoretical nonlinear map for maximizing Fisher's discriminant criterion is formulated by using the Bayesian a posteriori probabilities. In addition, the theory proves that FLDA is equivalent to ONDA when the Bayesian a posteriori probabilities are approximated by linear regression (LR). Due to some limitations of the linear model, there is room to modify FLDA by using stronger approximation/estimation methods. For the purpose of probability estimation, multinomial logistic regression (MLR) is more suitable than LR. Along this line, in this paper, we develop a nonlinear discriminant analysis (NDA) in which the posterior probabilities in ONDA are estimated by MLR. We also develop a way to introduce sparseness into discriminant analysis. By applying L1 or L2 regularization to LR or MLR, we can incorporate sparseness in FLDA and our NDA to increase generalization performance. The performance of these methods is evaluated by benchmark experiments using last_exam17 standard datasets and a face classification experiment.

18.
19.
We propose a method for assessing an individual patient's risk of a future clinical event using clinical trial or cohort data and Cox proportional hazards regression, combining the information from several studies using meta-analysis techniques. The method combines patient-specific estimates of the log cumulative hazard across studies, weighting by the relative precision of the estimates, using either fixed- or random-effects meta-analysis calculations. Risk assessment can be done for any future patient using a few key summary statistics determined once and for all from each study. Generalizations of the method to logistic regression and linear models are immediate. We evaluate the methods using simulation studies and illustrate their application using real data.
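Weighting study-specific estimates by their relative precision can be sketched with the standard fixed-effect (inverse-variance) calculation. This is a generic illustration under the assumption that each study supplies an estimate and its standard error; the function name `fixed_effect_pool` and the numbers are hypothetical, and the random-effects variant mentioned in the abstract is not shown.

```python
import math

def fixed_effect_pool(estimates, std_errors):
    """Inverse-variance weighted average of per-study estimates.

    Returns the pooled estimate and its standard error.
    """
    # Precision weights: studies with smaller standard errors count more.
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)

if __name__ == "__main__":
    # Hypothetical per-study estimates of a patient's log cumulative hazard.
    log_cum_hazards = [-1.20, -0.95, -1.40]
    std_errors = [0.30, 0.20, 0.45]
    pooled, se = fixed_effect_pool(log_cum_hazards, std_errors)
    print(pooled, se)
```

For a Cox model, a pooled log cumulative hazard H can be converted to an event risk as 1 - exp(-exp(H)).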

20.
Classical statistical approaches for multiclass probability estimation are typically based on regression techniques such as multiple logistic regression, or density estimation approaches such as linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA). These methods often make certain assumptions on the form of probability functions or on the underlying distributions of subclasses. In this article, we develop a model-free procedure to estimate multiclass probabilities based on large-margin classifiers. In particular, the new estimation scheme is employed by solving a series of weighted large-margin classifiers and then systematically extracting the probability information from these multiple classification rules. A main advantage of the proposed probability estimation technique is that it does not impose any strong parametric assumption on the underlying distribution and can be applied for a wide range of large-margin classification methods. A general computational algorithm is developed for class probability estimation. Furthermore, we establish asymptotic consistency of the probability estimates. Both simulated and real data examples are presented to illustrate competitive performance of the new approach and compare it with several other existing methods.
