首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 250 毫秒
1.
Outcome-dependent sampling increases the efficiency of studies of rare outcomes, examples being case—control studies in epidemiology and choice–based sampling in econometrics. Two-phase or double sampling is a standard technique for drawing efficient stratified samples. We develop maximum likelihood estimation of logistic regression coefficients for a hybrid two-phase, outcome–dependent sampling design. An algorithm is given for determining the estimates by repeated fitting of ordinary logistic regression models. Simulation results demonstrate the efficiency loss associated with alternative pseudolikelihood and weighted likelihood methods for certain data configurations. These results provide an efficient solution to the measurement error problem with validation sampling based on a discrete surrogate.  相似文献   

2.
Likelihood-ratio tests (LRTs) are often used for inferences on one or more logistic regression coefficients. Conventionally, for given parameters of interest, the nuisance parameters of the likelihood function are replaced by their maximum likelihood estimates. The new function created is called the profile likelihood function, and is used for inference from LRT. In small samples, LRT based on the profile likelihood does not follow χ2 distribution. Several corrections have been proposed to improve LRT when used with small-sample data. Additionally, complete or quasi-complete separation is a common geometric feature for small-sample binary data. In this article, for small-sample binary data, we have derived explicitly the correction factors of LRT for models with and without separation, and proposed an algorithm to construct confidence intervals. We have investigated the performances of different LRT corrections, and the corresponding confidence intervals through simulations. Based on the simulation results, we propose an empirical rule of thumb on the use of these methods. Our simulation findings are also supported by real-world data.  相似文献   

3.
Logistic regression is used by practitioners and researchers in many fields, but is undoubtedly used most frequently in medical and biostatistical applications. Maximum likelihood is generally the estimation method of choice, but we show that maximum likelihood can produce very poor results under certain conditions. Specifically, the poor performance of maximum likelihood in the case of rare events is known and we review research on this topic. We primarily examine the performance of maximum likelihood in the presence of near separation, which has apparently not been studied. Exact logistic regression is the logical alternative to maximum likelihood. We offer a comparison of the two methods of estimation.  相似文献   

4.
Two-phase stratified sampling is used to select subjects for the collection of additional data, e.g. validation data in measurement error problems. Stratification jointly by outcome and covariates, with sampling fractions chosen to achieve approximately equal numbers per stratum at the second phase of sampling, enhances efficiency compared with stratification based on the outcome or covariates alone. Nonparametric maximum likelihood may result in substantially more efficient estimates of logistic regression coefficients than weighted or pseudolikelihood procedures. Software to implement all three procedures is available. We demonstrate the practical importance of these design and analysis principles by an analysis of, and simulations based on, data from the US National Wilms Tumor Study.  相似文献   

5.
Motivated by an application with complex survey data, we show that for logistic regression with a simple matched-pairs design, infinitely replicating observations and maximizing the conditional likelihood results in an estimator exactly identical to the unconditional maximum likelihood estimator based on the original sample, which is inconsistent. Therefore, applying conditional likelihood methods to a pseudosample with observations replicated a large number of times can lead to an inconsistent estimator; this casts doubt on one possible approach to conditional logistic regression with complex survey data. We speculate that for more general designs, an asymptotic equivalence holds.  相似文献   

6.
The logistic regression model is used when the response variables are dichotomous. In the presence of multicollinearity, the variance of the maximum likelihood estimator (MLE) becomes inflated. The Liu estimator for the linear regression model is proposed by Liu to remedy this problem. Urgan and Tez and Mansson et al. examined the Liu estimator (LE) for the logistic regression model. We introduced the restricted Liu estimator (RLE) for the logistic regression model. Moreover, a Monte Carlo simulation study is conducted for comparing the performances of the MLE, restricted maximum likelihood estimator (RMLE), LE, and RLE for the logistic regression model.  相似文献   

7.
In two-parameter family of distribution, conditions for a modified maximum likelihood estimator to be second-order admissible are given. Applying these results to two-parameter logistic regression model, it is shown that the maximum likelihood estimator is always second-order inadmissible and the Rao-Blackwellized minimum logit chi-squared estimator is second-order admissible if and only if the number of the doses is greater than or equal to 6.  相似文献   

8.
Logistic regression is frequently used for classifying observations into two groups. Unfortunately there are often outlying observations in a data set and these might affect the estimated model and the associated classification error rate. In this paper, the authors study the effect of observations in the training sample on the error rate by deriving influence functions. They obtain a general expression for the influence function of the error rate, and they compute it for the maximum likelihood estimator as well as for several robust logistic discrimination procedures. Besides being of interest in their own right, the influence functions are also used to derive asymptotic classification efficiencies of different logistic discrimination rules. The authors also show how influential points can be detected by means of a diagnostic plot based on the values of the influence function  相似文献   

9.
Goodness-of-fit tests for logistic regression models using extreme residuals are considered. Approximations to the moments of the Pearson residuals are given for model fits made by maximum likelihood, minimum chi-square and weighted least squares and used to define modified residuals. Approximations to the critical values of the extreme statistics based on the ordinary and modified Pearson residuals are developed and assessed for the case of a single explanatory variable.  相似文献   

10.
This article applies and investigates a number of logistic ridge regression (RR) parameters that are estimable by using the maximum likelihood (ML) method. By conducting an extensive Monte Carlo study, the performances of ML and logistic RR are investigated in the presence of multicollinearity and under different conditions. The simulation study evaluates a number of methods of estimating the RR parameter k that has recently been developed for use in linear regression analysis. The results from the simulation study show that there is at least one RR estimator that has a lower mean squared error (MSE) than the ML method for all the different evaluated situations.  相似文献   

11.
The binary logistic regression is a commonly used statistical method when the outcome variable is dichotomous or binary. The explanatory variables are correlated in some situations of the logit model. This problem is called multicollinearity. It is known that the variance of the maximum likelihood estimator (MLE) is inflated in the presence of multicollinearity. Therefore, in this study, we define a new two-parameter ridge estimator for the logistic regression model to decrease the variance and overcome multicollinearity problem. We compare the new estimator to the other well-known estimators by studying their mean squared error (MSE) properties. Moreover, a Monte Carlo simulation is designed to evaluate the performances of the estimators. Finally, a real data application is illustrated to show the applicability of the new method. According to the results of the simulation and real application, the new estimator outperforms the other estimators for all of the situations considered.  相似文献   

12.
In this study, we present different estimation procedures for the parameters of the Poisson–exponential distribution, such as the maximum likelihood, method of moments, modified moments, ordinary and weighted least-squares, percentile, maximum product of spacings, Cramer–von Mises and the Anderson–Darling maximum goodness-of-fit estimators and compare them using extensive numerical simulations. We showed that the Anderson–Darling estimator is the most efficient for estimating the parameters of the proposed distribution. Our proposed methodology was also illustrated in three real data sets related to the minimum, average and the maximum flows during October at São Carlos River in Brazil demonstrating that the PE distribution is a simple alternative to be used in hydrological applications.  相似文献   

13.
Empirical Likelihood for Censored Linear Regression   总被引:5,自引:0,他引:5  
In this paper we investigate the empirical likelihood method in a linear regression model when the observations are subject to random censoring. An empirical likelihood ratio for the slope parameter vector is defined and it is shown that its limiting distribution is a weighted sum of independent chi-square distributions. This reduces to the empirical likelihood to the linear regression model first studied by Owen (1991) if there is no censoring present. Some simulation studies are presented to compare the empirical likelihood method with the normal approximation based method proposed in Lai et al. (1995). It was found that the empirical likelihood method performs much better than the normal approximation method.  相似文献   

14.
Local maximum likelihood estimation is a nonparametric counterpart of the widely used parametric maximum likelihood technique. It extends the scope of the parametric maximum likelihood method to a much wider class of parametric spaces. Associated with this nonparametric estimation scheme is the issue of bandwidth selection and bias and variance assessment. This paper provides a unified approach to selecting a bandwidth and constructing confidence intervals in local maximum likelihood estimation. The approach is then applied to least squares nonparametric regression and to nonparametric logistic regression. Our experiences in these two settings show that the general idea outlined here is powerful and encouraging.  相似文献   

15.
The penalized logistic regression is a useful tool for classifying samples and feature selection. Although the methodology has been widely used in various fields of research, their performance takes a sudden turn for the worst in the presence of outlier, since the logistic regression is based on the maximum log-likelihood method which is sensitive to outliers. It implies that we cannot accurately classify samples and find important factors having crucial information for classification. To overcome the problem, we propose a robust penalized logistic regression based on a weighted likelihood methodology. We also derive an information criterion for choosing the tuning parameters, which is a vital matter in robust penalized logistic regression modelling in line with generalized information criteria. We demonstrate through Monte Carlo simulations and real-world example that the proposed robust modelling strategies perform well for sparse logistic regression modelling even in the presence of outliers.  相似文献   

16.
Estimation for the log-logistic and Weibull distributions can be performed by using the equations used for probability plotting, and this technique outperforms the maximum likelihood (ML) estimation often in small samples. This leads to a highly heteroskedastic regression problem. Exact expressions for the variances of the residuals are derived which can be used to perform weighted regression. In large samples, the ML performs best, but it is shown that in smaller samples, the weighted regression outperforms the ML estimation with respect to bias and mean square error.  相似文献   

17.
This article develops three empirical likelihood (EL) approaches to estimate parameters in nonlinear regression models in the presence of nonignorable missing responses. These are based on the inverse probability weighted (IPW) method, the augmented IPW (AIPW) method and the imputation technique. A logistic regression model is adopted to specify the propensity score. Maximum likelihood estimation is used to estimate parameters in the propensity score by combining the idea of importance sampling and imputing estimating equations. Under some regularity conditions, we obtain the asymptotic properties of the maximum EL estimators of these unknown parameters. Simulation studies are conducted to investigate the finite sample performance of our proposed estimation procedures. Empirical results provide evidence that the AIPW procedure exhibits better performance than the other two procedures. Data from a survey conducted in 2002 are used to illustrate the proposed estimation procedure. The Canadian Journal of Statistics 48: 386–416; 2020 © 2020 Statistical Society of Canada  相似文献   

18.

We consider nonparametric logistic regression and propose a generalized likelihood test for detecting a threshold effect that indicates a relationship between some risk factor and a defined outcome above the threshold but none below it. One important field of application is occupational medicine and in particular, epidemiological studies. In epidemiological studies, segmented fully parametric logistic regression models are often threshold models, where it is assumed that the exposure has no influence on a response up to a possible unknown threshold, and has an effect beyond that threshold. Finding efficient methods for detection and estimation of a threshold is a very important task in these studies. This article proposes such methods in a context of nonparametric logistic regression. We use a local version of unknown likelihood functions and show that under rather common assumptions the asymptotic power of our test is one. We present a guaranteed non asymptotic upper bound for the significance level of the proposed test. If applying the test yields the acceptance of the conclusion that there was a change point (and hence a threshold limit value), we suggest using the local maximum likelihood estimator of the change point and consider the asymptotic properties of this estimator.  相似文献   

19.
Logistic regression using conditional maximum likelihood estimation has recently gained widespread use. Many of the applications of logistic regression have been in situations in which the independent variables are collinear. It is shown that collinearity among the independent variables seriously effects the conditional maximum likelihood estimator in that the variance of this estimator is inflated in much the same way that collinearity inflates the variance of the least squares estimator in multiple regression. Drawing on the similarities between multiple and logistic regression several alternative estimators, which reduce the effect of the collinearity and are easy to obtain in practice, are suggested and compared in a simulation study.  相似文献   

20.
Monotonic transformations of explanatory continuous variables are often used to improve the fit of the logistic regression model to the data. However, no analytic studies have been done to study the impact of such transformations. In this paper, we study invariant properties of the logistic regression model under monotonic transformations. We prove that the maximum likelihood estimates, information value, mutual information, Kolmogorov–Smirnov (KS) statistics, and lift table are all invariant under certain monotonic transformations.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号