1.
Logistic regression modeling has received a great deal of attention in the literature in recent years. This attention covers all aspects of the model, including the identification of outliers. A variety of methods for identifying outliers, such as the standardized Pearson residuals, are now available. These methods, however, are successful only if the data contain a single outlier. When the data contain multiple outliers, which is often the case in practice, these methods fail to detect them. This is due to the well-known problems of masking (false negative) and swamping (false positive) effects. In this article, we propose a new method for the identification of multiple outliers in logistic regression. We develop a generalized version of standardized Pearson residuals based on group deletion and then propose a technique for identifying multiple outliers. The performance of the proposed method is then investigated through several examples.
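For reference, the ordinary single-case standardized Pearson residual that the abstract takes as its starting point is usually written as below. This is the textbook definition for a binary response, not the authors' group-deletion generalization, which the abstract does not spell out; the symbols $\hat{\pi}_i$ (fitted probability), $h_i$ (leverage) and $\hat{V}$ are notation assumed here.

```latex
r_{s,i} = \frac{y_i - \hat{\pi}_i}{\sqrt{\hat{\pi}_i\,(1-\hat{\pi}_i)\,(1-h_i)}},
\qquad
h_i = \left[\hat{V}^{1/2} X \left(X^{\top}\hat{V}X\right)^{-1} X^{\top}\hat{V}^{1/2}\right]_{ii},
\qquad
\hat{V} = \operatorname{diag}\{\hat{\pi}_i(1-\hat{\pi}_i)\}.
```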
2.
Several variations of monotone nonparametric regression have been developed over the past 30 years. One approach is to first apply nonparametric regression to data and then monotone smooth the initial estimates to “iron out” violations of the assumed order. Here, such estimators are considered, where local polynomial regression is first used, followed by either least squares isotonic regression or a monotone method using simple averages. The primary focus of this work is to evaluate different types of confidence intervals for these monotone nonparametric regression estimators through Monte Carlo simulation. Most of the confidence intervals use bootstrap or jackknife procedures. Estimation of a response variable as a function of two continuous predictor variables is considered, where the estimation is performed at the observed values of the predictors (instead of on a grid). The methods are then applied to data on workers at plants that use beryllium metal who developed chronic beryllium disease.
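The "ironing out" step can be illustrated with a minimal sketch of least squares isotonic regression in one dimension, using the pool-adjacent-violators algorithm. This is only an illustration under simplifying assumptions: the abstract works with two predictors and a local polynomial first stage, neither of which is reproduced here, and the function name `pava` and the input values are hypothetical.

```python
def pava(values, weights=None):
    """Least squares non-decreasing fit via pool-adjacent-violators (1-D sketch)."""
    if weights is None:
        weights = [1.0] * len(values)
    # Each block holds [weighted mean, total weight, number of pooled points].
    blocks = []
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Pool backwards while the last two blocks violate the assumed order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    # Expand the pooled blocks back to one fitted value per input point.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Hypothetical initial (unconstrained) estimates, ordered by the predictor.
initial_estimates = [1.0, 3.0, 2.0, 4.0, 3.5]
print(pava(initial_estimates))
```

Adjacent estimates that violate the order are replaced by their weighted average, so the output is non-decreasing while staying as close as possible to the input in the least squares sense.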
3.
The purpose of this article is to present a new method for predicting the response variable of an observation in a new cluster under a multilevel logistic regression model. The central idea is based on the empirical best estimator for the random effect. Two estimation methods for the multilevel model are compared: penalized quasi-likelihood and Gauss–Hermite quadrature. Simulations and an application are used to examine how well the multilevel logistic model predicts the probability for an observation in a new cluster, compared with the usual logistic model.
4.
Lorenzo Camponovo 《Econometric Reviews》2013,32(3):352-393
This paper studies robustness of bootstrap inference methods for instrumental variable (IV) regression models. We consider test statistics for parameter hypotheses based on the IV estimator and generalized method of trimmed moments (GMTM) estimator introduced by Čížek (2008, 2009), and compare the pairs and implied probability bootstrap approximations for these statistics by applying the finite sample breakdown point theory. In particular, we study limiting behaviors of the bootstrap quantiles when the values of outliers diverge to infinity but the sample size is held fixed. The outliers are defined as anomalous observations that can arbitrarily change the value of the statistic of interest. We analyze both just- and overidentified cases and discuss implications of the breakdown point analysis for the size and power properties of bootstrap tests. We conclude that the implied probability bootstrap test using the statistic based on the GMTM estimator shows desirable robustness properties. Simulation studies endorse this conclusion. An empirical example based on Romer's (1993) study on the effect of openness of countries on inflation rates is presented. Several extensions including the analysis for the residual bootstrap are provided.
5.
Kristofer Månsson 《Communications in Statistics - Theory and Methods》2013,42(18):3366-3381
This article applies and investigates a number of logistic ridge regression (RR) parameters that are estimable by using the maximum likelihood (ML) method. By conducting an extensive Monte Carlo study, the performances of ML and logistic RR are investigated in the presence of multicollinearity and under different conditions. The simulation study evaluates a number of methods of estimating the RR parameter k that have recently been developed for use in linear regression analysis. The results from the simulation study show that there is at least one RR estimator that has a lower mean squared error (MSE) than the ML method for all the different evaluated situations.
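As a point of reference, the logistic ridge estimator is commonly written in the following form, usually attributed to Schaefer et al. The abstract does not state the formula, so the notation here is an assumption: $\hat{W}$ is the diagonal weight matrix $\operatorname{diag}\{\hat{\pi}_i(1-\hat{\pi}_i)\}$ evaluated at the ML fit, and $k \ge 0$ is the RR parameter whose estimation the study evaluates.

```latex
\hat{\beta}_{RR}(k) = \left(X^{\top}\hat{W}X + kI\right)^{-1} X^{\top}\hat{W}X\,\hat{\beta}_{ML},
\qquad k \ge 0 .
```

Setting $k = 0$ recovers the ML estimator; larger $k$ shrinks the coefficients, trading bias for reduced variance under multicollinearity.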
6.
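Using a self-designed test-anxiety self-assessment questionnaire, this article surveys a sample of enrolled university students and applies cluster analysis and logistic regression to explore the causes of test anxiety. Three main factors emerge: worry about how others evaluate oneself, worry about one's future prospects, and worry about being insufficiently prepared for the exam.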
7.
《Communications in Statistics - Theory and Methods》2013,42(9):1817-1833
It is known that due to the existence of the nonparametric component, the usual estimators for the parametric component or its function in partially linear regression models are biased. Sometimes this bias is severe. To reduce the bias, we propose two jackknife estimators and compare them with the naive estimator. All three estimators are shown to be asymptotically equivalent and asymptotically normally distributed under some regularity conditions. However, through simulation we demonstrate that the jackknife estimators perform better than the naive estimator in terms of bias when the sample size is small to moderate. To make our results more useful, we also construct consistent estimators of the asymptotic variance, which are robust against heterogeneity of the error variances.
8.
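Chaos theory holds that most human behavior is nonlinear. Accounting fraud falls within behavioral accounting research, yet traditional fraud detection models built on statistical theory are mostly constrained by linearity assumptions and may suffer from model misspecification and incomplete information extraction. Using Shanghai and Shenzhen A-share listed companies penalized by regulators, together with their matched companies, as the sample, this study draws on the nonlinear idea of the Taylor expansion and applies principal component analysis to remove multicollinearity among the variables, building a nonlinear principal-component logistic regression model for accounting fraud detection. Compared with the linear regression model, the proposed model achieves a higher rate of correct fraud identification and a better fit. Applying this model helps extract fraud-related information more fully and improves the efficiency of fraud detection.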
9.
In this paper we introduce and study two new families of statistics for the problem of testing linear combinations of the parameters in logistic regression models. These families are based on the phi-divergence measures. One of them includes the classical likelihood ratio statistic and the other the classical Pearson's statistic for this problem. It is interesting to note that the vector of unknown parameters, in the two new families of phi-divergence statistics considered in this paper, is estimated using the minimum phi-divergence estimator instead of the maximum likelihood estimator. Minimum phi-divergence estimators are a natural extension of the maximum likelihood estimator.
10.
The logistic regression model is used when the response variable is dichotomous. In the presence of multicollinearity, the variance of the maximum likelihood estimator (MLE) becomes inflated. The Liu estimator for the linear regression model was proposed by Liu to remedy this problem. Urgan and Tez and Mansson et al. examined the Liu estimator (LE) for the logistic regression model. We introduce the restricted Liu estimator (RLE) for the logistic regression model. Moreover, a Monte Carlo simulation study is conducted to compare the performances of the MLE, restricted maximum likelihood estimator (RMLE), LE, and RLE for the logistic regression model.
11.
Singh et al. (1986) proposed an almost unbiased ridge estimator using the Jackknife method, which required transformation of the regression parameters. This article shows that the same method can be used to derive the Jackknifed ridge estimator of the original (untransformed) parameter without transformation. The method also makes it easy to derive the second-order Jackknifed ridge estimator, which may reduce the bias further. We further investigate the performance of these estimators, theoretically as well as numerically, along with a recent method by Batah et al. (2008) called the modified Jackknifed ridge.
12.
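A study of the sample size problem in hierarchical linear models (total citations: 1; self-citations: 0; citations by others: 1)
This article uses the Jackknife and Bootstrap methods to improve the variance of the parameter estimates and constructs suitable confidence intervals for them. A simulation study that varies the number of groups and the number of individuals per group shows that the reliability of the parameter estimates depends heavily on the number of groups. For the fixed-effect parameters, 30 groups are enough to obtain reliable estimates; for σ and the variance-covariance component T, 50 and 70 groups, respectively, are needed.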
13.
ANDERS BJÖRKSTRÖM 《Scandinavian Journal of Statistics》2010,37(1):166-175
We use Krylov sequences to analyse a class of regression methods based on successive identification of latent factors. Some results already proved for partial least squares regression (PLSR) are shown to hold for other methods also. We prove that the well-known peculiar pattern of alternating shrinkage and inflation of the principal components is not unique to PLSR. We also show that for any method in the class under study, the coefficient of determination is always at least as high as for principal components regression with the same number of factors.
14.
The weighted generalized estimating equation (WGEE), an extension of the generalized estimating equation (GEE) method, is a method for analyzing incomplete longitudinal data. An inappropriate specification of the working correlation structure results in a loss of efficiency of the GEE estimation. In this study, we evaluated the efficiency of WGEE estimation for incomplete longitudinal data when the working correlation structure was misspecified. We found that the efficiency of the WGEE estimation was lower when an improper working correlation structure was selected, similar to the case of the GEE method. Furthermore, we modified the criterion proposed by Gosho et al. (2011) for selecting a working correlation structure, such that the GEE and WGEE methods can be applied to incomplete longitudinal data, and we investigated the performance of the modified criterion. The results revealed that, under the modified criterion, the true correlation structure tended to be selected more often than under other competing approaches.
15.
In the literature, there are only a few reports on goodness-of-fit tests for logistic regression models specifically derived for case-control studies. In this article, we propose a goodness-of-fit test for logistic regression models in stratified case-control studies using an empirical likelihood approach. The proposed statistic is an alternative to the statistic G_o, recently proposed by Arbogast and Lin (2005). Simulation results show that the proposed statistic is often slightly more powerful than G_o, although their performances are always close to each other. Moreover, implementation of our method is easy since the usual stratified logistic regression procedures in many statistical software packages can be employed. Some asymptotic results and an application of the proposed statistic to two real datasets are also presented.
16.
Four strategies for bias correction of the maximum likelihood estimator of the parameters in the Type I generalized logistic distribution are studied. First, we consider an analytic bias-corrected estimator, which is obtained by deriving an analytic expression for the bias to order n^{-1}; second, a method based on modifying the likelihood equations; third, we consider the jackknife bias-corrected estimator; and fourth, we consider two bootstrap bias-corrected estimators. All bias correction estimators are compared by simulation. Finally, an example with a real data set is also presented.
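The third strategy, the jackknife bias-corrected estimator, can be sketched generically as n times the full-sample estimate minus (n - 1) times the mean of the leave-one-out estimates. The example below does not fit the Type I generalized logistic distribution; as a stand-in, it applies the correction to the ML (divide-by-n) variance estimator, whose bias is known to be of order n^{-1}. The function names and the sample values are hypothetical.

```python
def jackknife_bias_corrected(data, estimator):
    """Return n * theta_hat - (n - 1) * mean of the leave-one-out estimates."""
    n = len(data)
    full_estimate = estimator(data)
    # Re-estimate n times, each time leaving out one observation.
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    loo_mean = sum(leave_one_out) / n
    return n * full_estimate - (n - 1) * loo_mean


def ml_variance(xs):
    """Stand-in biased estimator: the ML variance, which divides by n."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


# Hypothetical sample, used only to show the mechanics.
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
print(ml_variance(sample))
print(jackknife_bias_corrected(sample, ml_variance))
```

For this particular stand-in estimator the jackknife correction reproduces the usual unbiased (divide-by-(n - 1)) sample variance, which makes it a convenient sanity check.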
17.
18.
It is known that multicollinearity inflates the variance of the maximum likelihood estimator in logistic regression. The impact of collinearity can be especially serious when the primary interest is in the coefficients. To deal with collinearity, a ridge estimator was proposed by Schaefer et al. The primary aim of this article is to introduce a Liu-type estimator that has a smaller total mean squared error (MSE) than Schaefer's ridge estimator under certain conditions. Simulation studies were conducted to evaluate the performance of this estimator. Furthermore, the proposed estimator was applied to a real-life dataset.
19.
Based on the large-sample normal distribution of the sample log odds ratio and its asymptotic variance from maximum likelihood logistic regression, shortest 95% confidence intervals for the odds ratio are developed. Although the usual confidence interval on the odds ratio is unbiased, the shortest interval is not. That is, while covering the true odds ratio with the stated probability, the shortest interval covers some values below the true odds ratio with higher probability. The upper and lower limits of the shortest interval are shifted to the left of those of the usual interval, with greater shifts in the upper limits. With the log odds model γ + Xβ, in which X is binary, simulation studies showed that the approximate average percent difference in length is 7.4% for n (sample size) = 100, and 3.8% for n = 200. Precise estimates of the covering probabilities of the two types of intervals were obtained from simulation studies, and are compared graphically. For odds ratio estimates greater (less) than one, shortest intervals are more (less) likely to include one than are the usual intervals. The usual intervals are likelihood-based and the shortest intervals are not. The usual intervals have minimum expected length among the class of unbiased intervals. Shortest intervals do not provide important advantages over the usual intervals, which we recommend for practical use.
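A minimal sketch of the usual interval for the case of a single binary X may help here. In that case the ML estimate of β equals the log of the sample odds ratio from the 2x2 table, and its large-sample variance is the sum of the reciprocal cell counts. The shortest interval studied in the abstract is not reproduced; the function name and the counts are hypothetical.

```python
import math


def odds_ratio_usual_ci(a, b, c, d, z=1.959964):
    """Usual large-sample 95% interval for the odds ratio from a 2x2 table.

    a, b: events and non-events when X = 1
    c, d: events and non-events when X = 0
    z:    standard normal quantile (default gives a 95% interval)
    """
    log_or = math.log((a * d) / (b * c))
    # Asymptotic standard error of the sample log odds ratio.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    return math.exp(log_or), lower, upper


# Hypothetical counts for n = 100 observations.
estimate, lower, upper = odds_ratio_usual_ci(20, 30, 10, 40)
print(estimate, lower, upper)
```

Because the interval is symmetric on the log scale, the point estimate is the geometric mean of the two limits, so the upper limit lies further from the estimate than the lower limit on the odds ratio scale.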
20.
Methods are developed for combining data collected by satellite with data collected in an area survey to estimate crop acreages. The basic procedure is that of survey regression estimation. Two methods of transforming the satellite information prior to regression estimation are compared.