Similar articles
20 similar articles found
1.
Empirical likelihood for generalized linear models with missing responses (cited by 1; self-citations: 0; citations by others: 1)
The paper uses the empirical likelihood method to study the construction of confidence intervals and regions for regression coefficients and the response mean in generalized linear models with missing responses. By using the inverse selection probability weighted imputation technique, the proposed empirical likelihood ratios are asymptotically chi-squared. Our approach is to directly calibrate the empirical likelihood ratio, which is called a bias-correction method. Also, a class of estimators for the parameters of interest is constructed, and the asymptotic distributions of the proposed estimators are obtained. A simulation study indicates that the proposed methods are comparable in terms of coverage probabilities and average lengths/areas of confidence intervals/regions. A real data set is used to illustrate our methods.
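
As a rough illustration of the inverse selection probability weighting mentioned in the abstract above, the following sketch estimates a response mean when some responses are missing. It is not the paper's empirical likelihood procedure: the function name `ipw_mean` is hypothetical, and the selection probabilities are assumed to be known rather than estimated.

```python
def ipw_mean(responses, observed, selection_probs):
    """Inverse-probability-weighted estimate of the response mean.

    responses       -- response values (entries for missing cases are ignored)
    observed        -- True where the response was observed, False where missing
    selection_probs -- assumed-known probability that each response is observed
    """
    n = len(responses)
    if n == 0:
        raise ValueError("need at least one unit")
    total = 0.0
    for y, is_observed, prob in zip(responses, observed, selection_probs):
        if is_observed:
            # Each observed response stands in for 1/prob units of the sample.
            total += y / prob
    return total / n


if __name__ == "__main__":
    # Two of four responses are missing; each unit is observed with probability 0.5.
    print(ipw_mean([2.0, None, 4.0, None], [True, False, True, False], [0.5] * 4))
```

Dividing by the full sample size rather than the number of observed cases is what lets the weights compensate for the missing responses.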

2.
Many sparse linear discriminant analysis (LDA) methods have been proposed to overcome the major problems of the classic LDA in high-dimensional settings. However, the asymptotic optimality results are limited to the case with only two classes. When there are more than two classes, the classification boundary is complicated and no explicit formulas for the classification errors exist. We consider the asymptotic optimality in high-dimensional settings for a large family of linear classification rules with an arbitrary number of classes. Our main theorem provides easy-to-check criteria for the asymptotic optimality of a general classification rule in this family as dimensionality and sample size both go to infinity and the number of classes is arbitrary. We establish the corresponding convergence rates. The general theory is applied to the classic LDA and the extensions of two recently proposed sparse LDA methods to obtain the asymptotic optimality.

3.
Fisher's linear discriminant function can be used to classify an individual who has been sampled from one of two multivariate normal populations. In the following, this function is viewed as the posterior log-odds of one population over the other given his data vector; it is assumed that the population means and common covariance matrix are unknown. The vector of discriminant coefficients β (p×1) is the gradient of the posterior log-odds, and certain of its linear functions are directional derivatives which have a practical meaning. Accordingly, we treat the problem of estimating several linear functions of β. The usual estimators of these functions are scaled versions of the unbiased estimators. In this paper, these estimators are dominated by explicit alternatives under a quadratic loss function. We reduce the problem of estimating β to that of estimating the inverse covariance matrix.
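
The statement that β is the gradient of the posterior log-odds can be written out for two normal populations with a common covariance matrix. The notation below (populations π₁, π₂, prior probabilities p₁, p₂, means μ₁, μ₂, covariance Σ, unit vector a) is an assumption made for illustration and is not taken from the paper.

```latex
\[
\log\frac{P(\pi_1 \mid x)}{P(\pi_2 \mid x)}
  = \log\frac{p_1}{p_2}
  + \beta^{\top}\!\left(x - \tfrac{1}{2}(\mu_1 + \mu_2)\right),
\qquad
\beta = \Sigma^{-1}(\mu_1 - \mu_2).
\]
\[
\nabla_x \log\frac{P(\pi_1 \mid x)}{P(\pi_2 \mid x)} = \beta,
\qquad
\text{directional derivative along a unit vector } a:\quad a^{\top}\beta .
\]
```

Because β depends on the data only through Σ⁻¹ and the mean difference, estimating β is closely tied to estimating the inverse covariance matrix.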

4.
The location linear discriminant function is used in a two-population classification problem when the available data are generated from both binary and continuous random variables. The asymptotic distribution of the studentized location linear discriminant function is derived directly without the inversion of the corresponding characteristic function. The resulting plug-in estimate of the overall error of misclassification consists of the estimate based on the limiting distribution of the discriminant plus a correction term up to the second order. By comparison, our estimate avoids exact knowledge of the Mahalanobis distances, which is necessary when the expansions of Vlachonikolis (1985) are used in the case of an arbitrary cut-off point. An example is re-examined and analysed in the present context.

5.
We propose a hybrid two-group classification method that integrates linear discriminant analysis, a polynomial expansion of the basis (or variable space), and a genetic algorithm with multiple crossover operations to select variables from the expanded basis. Using new product launch data from the biochemical industry, we found that the proposed algorithm offers mean percentage decreases in the misclassification error rate of 50%, 56%, 59%, 77%, and 78% in comparison to a support vector machine, artificial neural network, quadratic discriminant analysis, linear discriminant analysis, and logistic regression, respectively. These improvements correspond to annual cost savings of $4.40–$25.73 million.

6.
Earlier investigations used a one-sided inequality to construct confidence regions for the variance ratios of balanced random models. In this study, confidence regions are based on a two-sided generalisation of this inequality and the results are illustrated by estimating the parameters of some elementary random models.

7.
The restrictive properties of compositional data, that is, multivariate data with positive parts that carry only relative information in their components, call for special care when performing standard statistical methods, for example, regression analysis. Among the special methods suitable for handling this problem is the total least squares procedure (TLS, orthogonal regression, regression with errors in variables, calibration problem), performed after an appropriate log-ratio transformation. The difficulty or even impossibility of deeper statistical analysis (confidence regions, hypothesis testing) using the standard TLS techniques can be overcome by a calibration solution based on linear regression. This approach can be combined with standard statistical inference, for example, confidence and prediction regions and bounds, hypothesis testing, etc., suitable for interpretation of results. Here, we deal with the simplest TLS problem where we assume a linear relationship between two errorless measurements of the same object (substance, quantity). We propose an iterative algorithm for estimating the calibration line and also give confidence ellipses for the location of unknown errorless results of measurement. Moreover, illustrative examples from the fields of geology, geochemistry and medicine are included. It is shown that the iterative algorithm converges to the same values as those obtained using the standard TLS techniques. Fitted lines and confidence regions are presented for both original and transformed compositional data. The paper contains basic principles of linear models and addresses many related problems.
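
For orientation, the standard two-variable TLS (orthogonal regression) fit that the abstract above uses as its reference point has a closed form. The sketch below implements only that textbook formula; it is not the paper's iterative calibration algorithm, it omits the log-ratio transformation, and the function name `orthogonal_regression` is hypothetical.

```python
import math


def orthogonal_regression(xs, ys):
    """Fit y = intercept + slope * x by minimising perpendicular distances."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two equally long samples with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Centred sums of squares and cross-products.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxy == 0:
        raise ValueError("slope is undefined when the cross-product sum is zero")
    # Closed-form slope of the first principal axis of the centred data.
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Points lying exactly on y = 2x + 1.
    print(orthogonal_regression([0, 1, 2, 3], [1, 3, 5, 7]))
```

Unlike ordinary least squares, this fit treats both variables symmetrically, which is why it suits problems where both measurements carry error.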

8.
In simulation studies for discriminant analysis, misclassification errors are often computed using the Monte Carlo method, by testing a classifier on large samples generated from known populations. Although large samples are expected to behave closely to the underlying distributions, they may not do so in a small interval or region, and thus may lead to unexpected results. We demonstrate with an example that the LDA misclassification error computed via the Monte Carlo method may often be smaller than the Bayes error. We give a rigorous explanation and recommend a method to properly compute misclassification errors.
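
The gap between a Monte Carlo error estimate and the exact Bayes error can be seen in a toy setting. The sketch below assumes two univariate normal populations N(0, 1) and N(delta, 1) with equal priors, for which the Bayes error is Φ(−delta/2), and compares it with a simulated error rate of the same midpoint rule. The setting and all function names are assumptions for illustration; the sketch does not reproduce the paper's example or its recommended method.

```python
import math
import random


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bayes_error(delta):
    """Exact Bayes error for N(0, 1) versus N(delta, 1) with equal priors."""
    return normal_cdf(-delta / 2.0)


def monte_carlo_error(delta, n_per_class, seed=0):
    """Simulated error rate of the midpoint rule on n_per_class draws per population."""
    rng = random.Random(seed)
    cutoff = delta / 2.0
    errors = 0
    for _ in range(n_per_class):
        if rng.gauss(0.0, 1.0) > cutoff:  # population 1 draw assigned to population 2
            errors += 1
        if rng.gauss(delta, 1.0) <= cutoff:  # population 2 draw assigned to population 1
            errors += 1
    return errors / (2 * n_per_class)


if __name__ == "__main__":
    print("Bayes error:", bayes_error(2.0))
    for seed in range(3):
        print("Monte Carlo estimate, seed", seed, ":", monte_carlo_error(2.0, 50_000, seed))
```

Because the simulated rate is a random quantity, individual runs can land on either side of the exact value; evaluating the normal distribution function directly avoids that sampling noise.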

9.
Fisher's linear discriminant analysis (FLDA) is known as a method to find a discriminative feature space for multi-class classification. As a theory of extending FLDA to an ultimate nonlinear form, optimal nonlinear discriminant analysis (ONDA) has been proposed. ONDA indicates that the best theoretical nonlinear map for maximizing Fisher's discriminant criterion is formulated by using the Bayesian a posteriori probabilities. In addition, the theory proves that FLDA is equivalent to ONDA when the Bayesian a posteriori probabilities are approximated by linear regression (LR). Due to some limitations of the linear model, there is room to modify FLDA by using stronger approximation/estimation methods. For the purpose of probability estimation, multinomial logistic regression (MLR) is more suitable than LR. Along this line, we develop a nonlinear discriminant analysis (NDA) in which the posterior probabilities in ONDA are estimated by MLR. We also develop a way to introduce sparseness into discriminant analysis. By applying L1 or L2 regularization to LR or MLR, we can incorporate sparseness in FLDA and our NDA to increase generalization performance. The performance of these methods is evaluated by benchmark experiments using last_exam17 standard datasets and a face classification experiment.

10.
We study the design problem for the optimal classification of functional data. The goal is to select sampling time points so that functional data observed at these time points can be classified accurately. We propose optimal designs that are applicable to either dense or sparse functional data. Using linear discriminant analysis, we formulate our design objectives as explicit functions of the sampling points. We study the theoretical properties of the proposed design objectives and provide a practical implementation. The performance of the proposed design is evaluated through simulations and real data applications. The Canadian Journal of Statistics 48: 285–307; 2020 © 2019 Statistical Society of Canada

11.
The problem of updating discriminant functions estimated from inverse Gaussian populations is investigated in situations when the additional observations are mixed (unclassified) or classified. In each case two types of discriminant functions, linear and quadratic, are considered. Using simulation experiments the performance of the updating procedures is evaluated by means of relative efficiencies.

12.
The aim of this article is to improve the quality of cookie production by classifying cookies as good or bad from the curves of resistance of dough observed during the kneading process. As the predictor variable is functional, functional classification methodologies such as functional logit regression and functional discriminant analysis are considered. A P-spline approximation of the sample curves is proposed to improve the classification ability of these models and to suitably estimate the relationship between the quality of cookies and the resistance of dough. Inference results on the functional parameters and related odds ratios are obtained using the asymptotic normality of the maximum likelihood estimators under the classical regularity conditions. Finally, the classification results are compared with alternative functional data analysis approaches such as componentwise classification on the logit regression model.

13.
We introduce a technique for extending the classical method of linear discriminant analysis (LDA) to data sets where the predictor variables are curves or functions. This procedure, which we call functional linear discriminant analysis (FLDA), is particularly useful when only fragments of the curves are observed. All the techniques associated with LDA can be extended for use with FLDA. In particular, FLDA can be used to produce classifications on new (test) curves, give an estimate of the discriminant function between classes and provide a one- or two-dimensional pictorial representation of a set of curves. We also extend this procedure to provide generalizations of quadratic and regularized discriminant analysis.

14.
Kernel discriminant analysis translates the original classification problem into feature space and solves the problem with dimension and sample size interchanged. In high-dimension low sample size (HDLSS) settings, this reduces the 'dimension' to that of the sample size. For HDLSS two-class problems we modify Mika's kernel Fisher discriminant function which, in general, remains ill-posed even in a kernel setting; see Mika et al. (1999). We propose a kernel naive Bayes discriminant function and its smoothed version, using first- and second-degree polynomial kernels. For fixed sample size and increasing dimension, we present asymptotic expressions for the kernel discriminant functions, discriminant directions and for the error probability of our kernel discriminant functions. The theoretical calculations are complemented by simulations which show the convergence of the estimators to the population quantities as the dimension grows. We illustrate the performance of the new discriminant rules, which are easy to implement, on real HDLSS data. For such data, our results clearly demonstrate the superior performance of the new discriminant rules, and especially their smoothed versions, over Mika's kernel Fisher version, and typically also over the commonly used naive Bayes discriminant rule.

15.
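Discriminant analysis has received growing attention and has produced important applied results, but in practice it is often applied mechanically, with too little attention paid to testing the applicability of discriminant analysis, the significance of the discrimination effect, the discriminating power of the discriminant variables, and the discriminating power of the discriminant functions. To apply discriminant analysis better, it should be subjected to statistical testing, and a system of statistical tests should be established. This system should include: a test of the applicability of discriminant analysis, a test of the significance of the discrimination effect, a test of the discriminating power of the discriminant variables, and a test of the discriminating power of the discriminant functions.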

16.
In a recent paper, Broemeling (1978) extended his earlier work on one-sided confidence regions for the variance ratios of balanced random-effects models to the two-sided case. The extension depends on a probability inequality which was claimed to be true. We show here that it is false, hence the proof of the main result given in his paper is in error. We also show that the statement of his result remains true in certain special cases.

17.
In this article, the generalized linear model for longitudinal data is studied. A generalized empirical likelihood method is proposed by combining generalized estimating equations and quadratic inference functions based on the working correlation matrix. It is proved that the proposed generalized empirical likelihood ratios are asymptotically chi-squared under some suitable conditions, and hence they can be used to construct confidence regions for the parameters. In addition, the maximum empirical likelihood estimates of the parameters are obtained, and their asymptotic normality is proved. Some simulations are undertaken to compare the generalized empirical likelihood and normal approximation-based methods in terms of coverage accuracies and average areas/lengths of confidence regions/intervals. A real data example is used to illustrate our methods.

18.
In this article, a sequential correction of two linear methods, linear discriminant analysis (LDA) and the perceptron, is proposed. This correction relies on sequential joining of additional features on which the classifier is trained. These new features are posterior probabilities determined by a basic classification method such as LDA or the perceptron. In each step, we add the probabilities obtained on a slightly different data set, because the vector of added probabilities varies at each step. We therefore have many classifiers of the same type trained on slightly different data sets. Four different sequential correction methods are presented based on different combining schemes (e.g. mean rule and product rule). Experimental results on different data sets demonstrate that the improvements are efficient, and that this approach outperforms classical linear methods, providing a significant reduction in the mean classification error rate.

19.
We develop exact inference for the location and scale parameters of the Laplace (double exponential) distribution based on their maximum likelihood estimators from a Type-II censored sample. Based on some pivotal quantities, exact confidence intervals and tests of hypotheses are constructed. Upon conditioning first on the number of observations that are below the population median, exact distributions of the pivotal quantities are expressed as mixtures of linear combinations and of ratios of linear combinations of standard exponential random variables, which facilitates the computation of quantiles of these pivotal quantities. Tables of quantiles are presented for the complete sample case.
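
For the complete sample case mentioned at the end of the abstract above, the Laplace maximum likelihood estimators have a simple form: the location estimate is a sample median and the scale estimate is the mean absolute deviation from it. The sketch below covers only that complete-sample case, not the Type-II censored setting or the exact inference developed in the paper, and the function name `laplace_mle` is hypothetical.

```python
import statistics


def laplace_mle(sample):
    """Maximum likelihood estimates (location, scale) for a complete Laplace sample."""
    if not sample:
        raise ValueError("sample must not be empty")
    # The sample median maximises the Laplace likelihood in the location parameter.
    location = statistics.median(sample)
    # Given the location, the scale MLE is the mean absolute deviation from it.
    scale = sum(abs(x - location) for x in sample) / len(sample)
    return location, scale


if __name__ == "__main__":
    print(laplace_mle([1, 2, 3, 4, 10]))
```

With an even sample size any value between the two middle order statistics maximises the likelihood; `statistics.median` picks their midpoint.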

20.
Classification of data consisting of both categorical and continuous variables between two groups is often handled by the sample location linear discriminant function confined to each of the locations specified by the observed values of the categorical variables. Homoscedasticity of across-location conditional dispersion matrices of the continuous variables is often assumed. Quite often, interactions between continuous and categorical variables cause across-location heteroscedasticity. In this article, we examine the effect of heterogeneous across-location conditional dispersion matrices on the overall expected and actual error rates associated with the sample location linear discriminant function. Performance of the sample location linear discriminant function is evaluated against the results for the restrictive classifier adjusted for across-location heteroscedasticity. Conclusions based on a Monte Carlo study are reported.
