Similar literature
 20 similar documents found (search time: 93 ms)
1.
The performance of the sample linear discriminant function with known, proportional, covariance matrices and equal but unknown mean vectors is considered. Unconditional misclassification rates are obtained from the Student-t distribution. These results can be used as an aid in verifying simulation programs incorporating the linear discriminant function when Gaussian densities with unequal covariance matrices are used.

2.
This paper derives the distribution of the probabilities of misclassification that result from using the linear discriminant function. The statistical background is two independent doubly truncated t populations with distinct location parameters and a common scale parameter and degrees of freedom. The behavior of the linear discriminant function is studied by comparing the distribution function of the errors of misclassification under the truncated t and truncated normal models.

3.
Several methods have been proposed to estimate the misclassification probabilities when a linear discriminant function is used to classify an observation into one of several populations. We describe the application of bootstrap sampling to this problem. The proposed method has the advantage of furnishing not only estimates of the misclassification probabilities but also an estimate of their standard error. The method is illustrated by a small simulation experiment. It is then applied to three published, readily accessible data sets, which are typical of the large, medium and small data sets encountered in practice.
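
The bootstrap idea in this abstract can be sketched in a few lines. The abstract does not give the exact resampling scheme, so everything here is an assumption made for illustration: the data are univariate, the two groups have equal priors and a common variance (so the linear discriminant reduces to a midpoint cutoff between the sample means), each group is resampled separately, and the rule fitted on each bootstrap sample is scored on the original data. The function names `fit_midpoint_rule`, `error_rate` and `bootstrap_error` are hypothetical.

```python
import random
import statistics


def fit_midpoint_rule(group_a, group_b):
    """Univariate two-group linear discriminant (equal priors, common
    variance): assign x to the group whose sample mean is nearer."""
    mean_a = statistics.fmean(group_a)
    mean_b = statistics.fmean(group_b)
    cutoff = (mean_a + mean_b) / 2.0
    a_is_low = mean_a <= mean_b

    def classify(x):
        # Points on the "a" side of the cutoff are labelled "a".
        return "a" if (x <= cutoff) == a_is_low else "b"

    return classify


def error_rate(classify, group_a, group_b):
    """Fraction of observations that the rule assigns to the wrong group."""
    wrong = sum(classify(x) != "a" for x in group_a)
    wrong += sum(classify(x) != "b" for x in group_b)
    return wrong / (len(group_a) + len(group_b))


def bootstrap_error(group_a, group_b, n_boot=500, seed=0):
    """Return (mean error rate, spread across replicates).

    Assumed scheme: resample each group with replacement, refit the rule,
    and score it on the original data. The standard deviation of the
    replicate error rates serves as a rough standard error.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(n_boot):
        boot_a = rng.choices(group_a, k=len(group_a))
        boot_b = rng.choices(group_b, k=len(group_b))
        rule = fit_midpoint_rule(boot_a, boot_b)
        rates.append(error_rate(rule, group_a, group_b))
    return statistics.fmean(rates), statistics.stdev(rates)
```

Calling `bootstrap_error(sample_a, sample_b)` returns both the averaged error rate and its spread, which mirrors the point made in the abstract that the bootstrap yields a standard error alongside the estimate.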

4.
Classification of data consisting of both categorical and continuous variables between two groups is often handled by the sample location linear discriminant function confined to each of the locations specified by the observed values of the categorical variables. Homoscedasticity of across-location conditional dispersion matrices of the continuous variables is often assumed. Quite often, interactions between continuous and categorical variables cause across-location heteroscedasticity. In this article, we examine the effect of heterogeneous across-location conditional dispersion matrices on the overall expected and actual error rates associated with the sample location linear discriminant function. Performance of the sample location linear discriminant function is evaluated against the results for the restrictive classifier adjusted for across-location heteroscedasticity. Conclusions based on a Monte Carlo study are reported.

5.
We introduce a technique for extending the classical method of linear discriminant analysis (LDA) to data sets where the predictor variables are curves or functions. This procedure, which we call functional linear discriminant analysis (FLDA), is particularly useful when only fragments of the curves are observed. All the techniques associated with LDA can be extended for use with FLDA. In particular, FLDA can be used to produce classifications on new (test) curves, give an estimate of the discriminant function between classes and provide a one- or two-dimensional pictorial representation of a set of curves. We also extend this procedure to provide generalizations of quadratic and regularized discriminant analysis.

6.
We derived two methods to estimate the logistic regression coefficients in a meta-analysis when only the 'aggregate' data (mean values) from each study are available. The estimators we proposed are the discriminant function estimator and the reverse Taylor series approximation. These two methods of estimation gave similar estimators using an example of individual data. However, when aggregate data were used, the discriminant function estimators were quite different from the other two estimators. A simulation study was then performed to evaluate the performance of these two estimators as well as the estimator obtained from the model that simply uses the aggregate data in a logistic regression model. The simulation study showed that all three estimators are biased. The bias increases as the variance of the covariate increases. The distribution type of the covariates also affects the bias. In general, the estimator from the logistic regression using the aggregate data has less bias and better coverage probabilities than the other two estimators. We concluded that analysts should be cautious in using aggregate data to estimate the parameters of the logistic regression model for the underlying individual data.

7.
A discrimination procedure based on the location model is described and suggested for use in situations where the discriminating variables are mixtures of continuous and binary variables. Some procedures that have previously been employed in similar situations, such as Fisher's linear discriminant function and logistic regression, were compared with this method using the error rate (ER). Optimal ERs for these procedures are reported using real and simulated data for varying sample sizes and numbers of continuous and binary variables, and were used as a measure for assessing the performance of the various procedures. The suggested procedure performed considerably better in the cases considered and never produced a poor result when compared with the other procedures. Hence, the suggested procedure might be considered for such situations.

8.
The authors consider a robust linear discriminant function based on high breakdown location and covariance matrix estimators. They derive influence functions for the estimators of the parameters of the discriminant function and for the associated classification error. The most B‐robust estimator is determined within the class of multivariate S‐estimators. This estimator, which minimizes the maximal influence that an outlier can have on the classification error, is also the most B‐robust location S‐estimator. A comparison of the most B‐robust estimator with the more familiar biweight S‐estimator is made.

9.
A Mann-Whitney type statistic is used to estimate a change-point when a change, at an unknown point in a sequence of random variables, has taken place. This estimate is compared, using Monte Carlo techniques, with the normal theory maximum likelihood estimate, when a location change has occurred, for different underlying distributions ranging from the normal to the long-tailed “normal over uniform” distribution. The distribution of the Mann-Whitney type estimate remains fairly constant over the various distributions. Two generalisations of the statistic are considered and investigated.
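
One common form of a Mann-Whitney type change-point estimate can be sketched as follows. The abstract does not spell out its statistic, so this is an assumption: for each candidate split t, every observation before the split is compared with every observation after it through the sign of their difference, and the split with the largest absolute sum is taken as the estimate. The function names are hypothetical.

```python
def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def mann_whitney_change_point(xs):
    """Estimate the number of observations that precede the change.

    For each split t (1 <= t < n) compute
        U_t = sum over i < t <= j of sign(xs[i] - xs[j])
    and return the t that maximises |U_t| (first maximiser on ties).
    """
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    best_t, best_abs = 1, -1
    for t in range(1, n):
        u = sum(sign(xs[i] - xs[j]) for i in range(t) for j in range(t, n))
        if abs(u) > best_abs:
            best_t, best_abs = t, abs(u)
    return best_t
```

Because the statistic uses only the signs of pairwise differences, it depends on the data through their ranks, which fits the abstract's remark that the estimate behaves similarly across normal and long-tailed distributions. This direct version costs O(n^3) operations and is meant for illustration only.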

10.
Discrimination between two Gaussian time series is examined assuming that the important difference between the alternative processes is their covariance (spectral) structure. Using the likelihood ratio method in the frequency domain, a discriminant function is derived and its approximate distribution is obtained. It is demonstrated that, utilizing the Kullback-Leibler information measure, the frequencies or frequency bands which carry information for discrimination can be determined. Using this, it is shown that when the mean functions are equal, discrimination based on the frequency with the largest discrimination information is equivalent to the classification procedure based on the best linear discriminant. Application to seismology is described by including a discussion concerning the spectral ratio discriminant for underground nuclear explosions and natural earthquakes, and is illustrated numerically using Rayleigh wave data from an underground and an atmospheric explosion.

11.
We study the problem of classifying an individual into one of several populations based on mixed nominal, continuous, and ordinal data. Specifically, we obtain a classification procedure as an extension to the so-called location linear discriminant function, by specifying a general mixed-data model for the joint distribution of the mixed discrete and continuous variables. We outline methods for estimating misclassification error rates. Results of simulations of the performance of proposed classification rules in various settings vis-à-vis a robust mixed-data discrimination method are reported as well. We give an example utilizing data on croup in children.

12.
Partially linear models are extensions of linear models that include a nonparametric function of some covariate, allowing an adequate and more flexible handling of explanatory variables than in linear models. Difference-based estimation in partially linear models is an approach designed to estimate the parametric component by using the ordinary least squares estimator after removing the nonparametric component from the model by differencing. However, it is known that least squares estimates do not provide useful information for the majority of the data when the error distribution is not normal, particularly when the errors are heavy-tailed and when outliers are present in the dataset. This paper aims to find an outlier-resistant fit that represents the information in the majority of the data by robustly estimating the parametric and the nonparametric components of the partially linear model. Simulations and a real data example are used to illustrate the feasibility of the proposed methodology and to compare it with the classical difference-based estimator when outliers exist.
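
The classical (non-robust) difference-based estimator described in this abstract can be sketched for the simplest case. The assumptions are not taken from the paper: there is a single parametric covariate x, the model is y_i = beta * x_i + f(t_i) + e_i, and first-order differencing is used. Once the data are sorted by t, neighbouring values of a smooth f nearly cancel, so ordinary least squares on the differences recovers beta. The function name is hypothetical, and the robust fit proposed in the paper is not reproduced here.

```python
def difference_based_slope(t, x, y):
    """Classical first-order difference-based estimate of beta in
    y = beta * x + f(t) + error, with f smooth and unspecified."""
    if not (len(t) == len(x) == len(y)) or len(t) < 2:
        raise ValueError("t, x and y must have equal length of at least 2")
    # Sort by the nonparametric covariate so that f(t) varies slowly
    # between neighbouring observations.
    order = sorted(range(len(t)), key=lambda i: t[i])
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    # Differencing approximately removes f(t).
    dx = [xs[i] - xs[i - 1] for i in range(1, len(xs))]
    dy = [ys[i] - ys[i - 1] for i in range(1, len(ys))]
    denom = sum(d * d for d in dx)
    if denom == 0:
        raise ValueError("x does not vary between neighbouring observations")
    # Ordinary least squares through the origin on the differenced data.
    return sum(a * b for a, b in zip(dx, dy)) / denom
```

Since the final step is ordinary least squares, a single gross outlier in y can move the estimate substantially, which is the weakness the abstract sets out to address.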

13.
The quadratic discriminant function is commonly used for the two-group classification problem when the covariance matrices in the two populations are substantially unequal. This procedure is optimal when both populations are multivariate normal with known means and covariance matrices. This study examined the robustness of the QDF to non-normality. Sampling experiments were conducted to estimate expected actual error rates for the QDF when sampling from a variety of non-normal distributions. Results indicated that the QDF was robust to non-normality except when the distributions were highly skewed, in which case relatively large deviations from optimal were observed. In all cases studied the average probabilities of misclassification were relatively stable while the individual population error rates exhibited considerable variability.
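
For orientation, the quadratic discriminant function has a compact form in the univariate case. The sketch below makes assumptions for illustration only: both populations are normal with known means and standard deviations, priors are equal, and the QDF is the difference of the two log densities. An observation is assigned to population 1 when the value is positive. The function names are hypothetical.

```python
import math


def qdf(x, mean1, sd1, mean2, sd2):
    """Univariate quadratic discriminant: log f1(x) - log f2(x) for two
    normal densities with known parameters and equal priors."""
    if sd1 <= 0 or sd2 <= 0:
        raise ValueError("standard deviations must be positive")
    return (
        math.log(sd2 / sd1)
        - (x - mean1) ** 2 / (2.0 * sd1 ** 2)
        + (x - mean2) ** 2 / (2.0 * sd2 ** 2)
    )


def classify(x, mean1, sd1, mean2, sd2):
    """Return 1 or 2, the population with the larger density at x."""
    return 1 if qdf(x, mean1, sd1, mean2, sd2) > 0 else 2
```

With equal means and unequal variances a linear rule cannot separate the populations, whereas the quadratic rule assigns central values to the low-variance population and extreme values to the high-variance one.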

14.
Summary.  We propose covariance-regularized regression, a family of methods for prediction in high dimensional settings that uses a shrunken estimate of the inverse covariance matrix of the features to achieve superior prediction. An estimate of the inverse covariance matrix is obtained by maximizing the log-likelihood of the data, under a multivariate normal model, subject to a penalty; it is then used to estimate coefficients for the regression of the response onto the features. We show that ridge regression, the lasso and the elastic net are special cases of covariance-regularized regression, and we demonstrate that certain previously unexplored forms of covariance-regularized regression can outperform existing methods in a range of situations. The covariance-regularized regression framework is extended to generalized linear models and linear discriminant analysis, and is used to analyse gene expression data sets with multiple class and survival outcomes.

15.
The linear discriminant function is transformed into a linear combination of independent random variables. It is shown that reducing dimensionality using the smallest distance criterion results in a smaller increase in the error rate than using the smallest variance criterion. Three error rates are used to prove this.

16.
In classical discriminant analysis, when two multivariate normal distributions with equal variance–covariance matrices are assumed for two groups, the classical linear discriminant function is optimal with respect to maximizing the standardized difference between the means of the two groups. However, for a typical case‐control study, the distributional assumption for the case group often needs to be relaxed in practice. Komori et al. (Generalized t‐statistic for two‐group classification. Biometrics 2015, 71: 404–416) proposed the generalized t‐statistic to obtain a linear discriminant function, which allows for heterogeneity of the case group. Their procedure has an optimality property in the class of consideration. We perform a further study of the problem and show that additional improvement is achievable. The approach we propose does not require a parametric distributional assumption on the case group. We further show that the new estimator is efficient, in that no further improvement is possible to construct the linear discriminant function more efficiently. We conduct simulation studies and real data examples to illustrate the finite sample performance and the gain that it produces in comparison with existing methods.

17.
The consistency and asymptotic normality of a linear least squares estimate of the form (X'X)^- X'Y when the mean is not Xβ is investigated in this paper. The least squares estimate is a consistent estimate of the best linear approximation of the true mean function for the design chosen. The asymptotic normality of the least squares estimate depends on the design and the asymptotic mean may not be the best linear approximation of the true mean function. Choices of designs which allow large sample inferences to be made about the best linear approximation of the true mean function are discussed.

18.
In this paper we consider the risk performances of some estimators for both location and scale parameters in a linear regression model under Inagaki’s loss function. We prove that the pre-test estimator for the location parameter is dominated by the Stein-rule estimator under Inagaki’s loss function when the distribution of the error terms is expressed by a scale mixture of normal distributions and the variance of the error terms is unknown. It is an extension of the results in Nagata (1983) to our situation. We also perform numerical calculations to draw the shapes of the risks.

19.
The parameters of stable laws can be estimated using a regression-based approach involving the empirical characteristic function. One approach is to use a fixed number of points for all parameters of the distribution to estimate the characteristic function. In this work, results are derived for the case where all points in an interval are used to estimate the empirical characteristic function, which gives least squares estimators of a linear function of the parameters based on an infinite number of observations. It was found that the procedure performs very well in small samples.

20.
To define a Modified Maximum Likelihood Estimate of the scale parameter of the Rayleigh distribution, a hyperbolic approximation is used instead of a linear approximation for a function which appears in the Maximum Likelihood equation. This estimate is shown to perform better, in the sense of accuracy and simplicity of calculation, than the one based on a linear approximation for the same function. The estimate of the scale parameter obtained is also shown to be asymptotically unbiased. Numerical computation for random samples of different sizes from the Rayleigh distribution, using type II censoring, is carried out, and the estimate is shown to be better than that obtained by Lee et al. (1980).
