Penalized logistic regression (PLR) is a powerful statistical tool for classification and is commonly used in many practical problems. Despite its success, the loss function of the PLR is unbounded, so the resulting classifiers can be sensitive to outliers. To build more robust classifiers, we propose the robust PLR (RPLR), which uses truncated logistic loss functions, and suggest three schemes to estimate conditional class probabilities. Connections of the RPLR with other existing work on robust logistic regression are discussed. Our theoretical results indicate that the RPLR is Fisher consistent and more robust to outliers. Moreover, we develop estimated generalized approximate cross validation (EGACV) for tuning parameter selection. Through numerical examples, we demonstrate that truncating the loss function indeed yields better performance in terms of classification accuracy and class probability estimation.
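
To make the idea of a truncated loss concrete, here is a minimal sketch in Python. It assumes the logistic loss is written in terms of the margin u = y * f(x) and that truncation simply caps the loss at its value at a chosen point s <= 0. The function names, the capping form, and the default s = -1 are illustrative assumptions; the abstract does not give the exact definition used for the RPLR.

```python
import math


def logistic_loss(u):
    """Logistic loss log(1 + exp(-u)) of the margin u = y * f(x).

    The two branches avoid overflow in exp() for large |u|.
    """
    if u >= 0:
        return math.log1p(math.exp(-u))
    return -u + math.log1p(math.exp(u))


def truncated_logistic_loss(u, s=-1.0):
    """Logistic loss capped at its value at the truncation point s.

    Assumption for illustration: points with margin below s all incur the
    same bounded loss, so a badly misclassified outlier cannot dominate.
    """
    return min(logistic_loss(u), logistic_loss(s))


if __name__ == "__main__":
    for margin in (2.0, 0.0, -1.0, -10.0, -1000.0):
        print(margin, logistic_loss(margin), truncated_logistic_loss(margin))
```

For margins above s the two losses coincide; below s the untruncated loss keeps growing roughly linearly while the truncated one stays flat, which is the bounded-loss property the abstract credits for the robustness to outliers.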

Marginal imputation, which consists of imputing items separately, generally leads to biased estimators of bivariate parameters such as finite population coefficients of correlation. To overcome this problem, two main approaches have been considered in the literature. The first consists of using customary imputation methods, such as random hot‐deck imputation, and adjusting for the bias at the estimation stage. This approach was studied in Skinner & Rao (2002). In this paper, we extend the results of Skinner & Rao (2002) to the case of arbitrary sampling designs and three variants of random hot‐deck imputation. The second approach consists of using an imputation method that preserves the relationship between variables. Shao & Wang (2002) proposed a joint random regression imputation procedure that succeeds in preserving the relationship between two study variables. One drawback of the Shao–Wang procedure is that it suffers from additional variability (called the imputation variance) due to the random selection of residuals, resulting in potentially inefficient estimators. Following Chauvet, Deville, & Haziza (2011), we propose a fully efficient version of the Shao–Wang procedure that preserves the relationship between two study variables while virtually eliminating the imputation variance. Results of a simulation study support our findings. An application using data from the Workplace and Employees Survey is also presented. The Canadian Journal of Statistics 40: 124–149; 2012 © 2011 Statistical Society of Canada

We consider the maximum likelihood estimator $\hat{F}_n$ of a distribution function in a class of deconvolution models where the known density of the noise variable is of bounded variation. This class of noise densities contains in particular bounded, decreasing densities. The estimator $\hat{F}_n$ is defined, characterized in terms of Fenchel optimality conditions and computed. Under appropriate conditions, various consistency results for $\hat{F}_n$ are derived, including uniform strong consistency. The Canadian Journal of Statistics 41: 98–110; 2013 © 2012 Statistical Society of Canada

In this article, we address the problem of testing for additivity in nonparametric regression models. We develop a kernel‐based consistent test of the hypothesis of additivity in nonparametric regression, and establish its asymptotic distribution under a sequence of local alternatives. Compared to other existing kernel‐based tests, the proposed test is shown to effectively reduce the influence of estimation bias in the additive component of the nonparametric regression, and hence to increase its efficiency. Most importantly, it avoids tuning difficulties by using estimation‐based optimal criteria, whereas there is no direct tuning strategy for other existing kernel‐based testing methods. We discuss the usage of the new test and give numerical examples to demonstrate its practical performance. The Canadian Journal of Statistics 39: 632–655; 2011 © 2011 Statistical Society of Canada

Many methods have been developed for the nonparametric estimation of a mean response function, but most of these methods do not lend themselves to simultaneous estimation of the mean response function and its derivatives. Recovering derivatives is important for analyzing human growth data, studying physical systems described by differential equations, and characterizing nanoparticles from scattering data. In this article the authors propose a new compound estimator that synthesizes information from numerous pointwise estimators indexed by a discrete set. Unlike spline and kernel smooths, the compound estimator is infinitely differentiable; unlike local regression smooths, the compound estimator is self‐consistent in that its derivatives estimate the derivatives of the mean response function. The authors show that the compound estimator and its derivatives can attain essentially optimal convergence rates in consistency. They also provide a filtration and extrapolation enhancement for finite samples, and assess the empirical performance of the compound estimator and its derivatives via a simulation study and an application to real data. The Canadian Journal of Statistics 39: 280–299; 2011 © 2011 Statistical Society of Canada

The class of joint mean‐covariance models uses the modified Cholesky decomposition of the within‐subject covariance matrix in order to arrive at an unconstrained, statistically meaningful reparameterisation. The new parameterisation of the covariance matrix has two sets of parameters that separately describe the variances and correlations. Thus, together with the mean or regression parameters, these models have three distinct sets of parameters. In order to alleviate the problem of inefficient estimation and downward bias in the variance estimates, inherent in the maximum likelihood estimation procedure, the usual REML estimation procedure adjusts for the degrees of freedom lost due to the estimation of the mean parameters. Because of the parameterisation of the joint mean‐covariance models, it is possible to adapt the usual REML procedure in order to estimate the variance (correlation) parameters by taking into account the degrees of freedom lost by the estimation of both the mean and correlation (variance) parameters. To this end, we propose adjustments to the estimation procedures based on the modified and adjusted profile likelihoods. The methods are illustrated by an application to a real data set and by simulation studies. The Canadian Journal of Statistics 40: 225–242; 2012 © 2012 Statistical Society of Canada
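
The modified Cholesky decomposition mentioned above can be sketched as follows. This assumes the common formulation T Σ T' = D, with T unit lower triangular and D diagonal; the abstract does not spell out the form, and the function name and pure-Python implementation are illustrative only. In that formulation the below-diagonal entries of T are the negatives of the coefficients from regressing each measurement on its predecessors, and the diagonal of D holds the corresponding innovation variances, which is what makes the two parameter sets unconstrained and interpretable.

```python
def modified_cholesky(sigma):
    """Return (T, d) with T unit lower triangular and T * sigma * T' = diag(d).

    sigma is a symmetric positive definite matrix given as a list of lists.
    """
    n = len(sigma)

    # Step 1: factor sigma = L * diag(d) * L' with L unit lower triangular.
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = sigma[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (sigma[i][j] - s) / d[j]

    # Step 2: invert L by forward substitution to obtain T = L^{-1}.
    T = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for c in range(n):
        for i in range(c + 1, n):
            T[i][c] = -sum(L[i][k] * T[k][c] for k in range(c, i))
    return T, d


if __name__ == "__main__":
    # AR(1)-type covariance with unit variances and correlation 0.5.
    sigma = [[1.0, 0.5, 0.25],
             [0.5, 1.0, 0.5],
             [0.25, 0.5, 1.0]]
    T, d = modified_cholesky(sigma)
    print(T)
    print(d)
```

For the AR(1)-type example, T has -0.5 on the first subdiagonal and zeros elsewhere below the diagonal, and the innovation variances are 1, 0.75 and 0.75: each measurement depends only on its immediate predecessor, with coefficient 0.5.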

A precision matrix is an important parameter of interest because its elements describe useful association information among multiple variables, which has a wide variety of applications. For example, it is used for inferring gene regulation networks in genomic studies and stock association networks in financial studies. However, in many cases, the precision matrix needs to be robustly estimated due to the presence of outliers. We propose estimating a sparse scaled precision matrix via weighted median regression with regularization. Our weighted median regression approach is consistent under various distributional assumptions, including multivariate t‐ or contaminated Gaussian distributions. This fact is illustrated with simulation studies and a real data analysis with monthly stock return data. The Canadian Journal of Statistics 46: 265–278; 2018 © 2018 Statistical Society of Canada

The authors propose a robust transformation linear mixed‐effects model for longitudinal continuous proportional data when some of the subjects exhibit outlying trajectories over time. Such subjects become troublesome when including or excluding them in the data analysis leads to different statistical conclusions. To robustify the longitudinal analysis using the mixed‐effects model, they utilize the multivariate t distribution for random effects and/or error terms. Estimation and inference in the proposed model are established and illustrated by a real data example from an ophthalmology study. Simulation studies show a substantial robustness gain by the proposed model in comparison to the mixed‐effects model based on Aitchison's logit‐normal approach. As a result, the data analysis yields consistent conclusions in the presence of influential outliers. The Canadian Journal of Statistics © 2009 Statistical Society of Canada

Accurate diagnosis of disease is a critical part of health care. New diagnostic and screening tests must be evaluated based on their abilities to discriminate diseased conditions from non‐diseased conditions. For a continuous‐scale diagnostic test, a popular summary index of the receiver operating characteristic (ROC) curve is the area under the curve (AUC). However, when the focus is on a certain region of false positive rates, the partial AUC is often used instead. In this paper we derive the asymptotic normal distribution for the non‐parametric estimator of the partial AUC with an explicit variance formula. The empirical likelihood (EL) ratio for the partial AUC is defined, and its limiting distribution is shown to be a scaled chi‐square distribution. Hybrid bootstrap and EL confidence intervals for the partial AUC are proposed using the newly developed EL theory. We also conduct extensive simulation studies to compare the relative performance of the proposed intervals and existing intervals for the partial AUC. A real example is used to illustrate the application of the recommended intervals. The Canadian Journal of Statistics 39: 17–33; 2011 © 2011 Statistical Society of Canada
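
As a rough illustration of what a non‐parametric partial AUC computes, the sketch below integrates the empirical ROC step function over false positive rates from 0 to a chosen upper limit. It assumes that higher test scores indicate disease and that there are no ties between scores; the function name `partial_auc` and the restriction to an interval starting at 0 are illustrative choices, and the estimator analysed in the paper may be defined differently.

```python
def partial_auc(diseased, nondiseased, max_fpr=1.0):
    """Area under the empirical ROC curve for false positive rates in [0, max_fpr].

    Assumes higher scores indicate disease and that no scores are tied.
    """
    m, n = len(diseased), len(nondiseased)
    # Lowering the threshold past the k-th largest non-diseased score
    # moves the false positive rate from k/n to (k+1)/n.
    thresholds = sorted(nondiseased, reverse=True)
    area = 0.0
    for k, y in enumerate(thresholds):
        left = k / n
        right = (k + 1) / n
        width = min(right, max_fpr) - left
        if width <= 0:
            break
        # True positive rate on this step of the empirical ROC curve.
        tpr = sum(1 for x in diseased if x > y) / m
        area += tpr * width
    return area


if __name__ == "__main__":
    diseased = [3, 5, 7, 9]
    nondiseased = [1, 2, 4, 6]
    print(partial_auc(diseased, nondiseased))        # full AUC
    print(partial_auc(diseased, nondiseased, 0.5))   # FPR restricted to [0, 0.5]
```

With `max_fpr=1.0` the function reduces to the usual Mann–Whitney estimate of the full AUC (13 of the 16 diseased/non‐diseased pairs are correctly ordered in the toy data, giving 0.8125), while restricting to false positive rates up to 0.5 gives 0.3125.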

This paper treats an abstract parametric family of symmetric linear estimators for the mean vector of a standard linear model. The estimator in this family that has smallest estimated quadratic risk is shown to attain, asymptotically, the smallest risk achievable over all candidate estimators in the family. The asymptotic analysis is carried out under a strong Gauss–Markov form of the linear model in which the dimension of the regression space tends to infinity. Leading examples to which the results apply include: (a) penalized least squares fits constrained by multiple, weighted, quadratic penalties; and (b) running, symmetrically weighted, means. In both instances, the weights define a parameter vector whose natural domain is a continuum.
