Similar Articles
20 similar articles retrieved (search time: 500 ms).
1.
Yanfang Li & Jing Lei. Statistics, 2018, 52(4): 782–800
We study high dimensional multigroup classification from a sparse subspace estimation perspective, unifying linear discriminant analysis (LDA) with other recent developments in high dimensional multivariate analysis that use similar tools, such as penalization methods. We develop two two-stage sparse LDA models: in the first stage, convex relaxation converts two classical formulations of LDA into semidefinite programs, and the subspace perspective allows for straightforward regularization and estimation. After the initial convex relaxation, a refinement stage improves the accuracy. For the first model, a penalized quadratic program with a group lasso penalty is used for refinement, whereas a sparse version of the power method is used for the second model. We carefully examine the theoretical properties of both methods, alongside simulations and real data analysis.
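As a loose illustration of the kind of refinement used in the second model, the sketch below implements a generic truncated (sparse) power method: power iteration in which all but the k largest entries are zeroed at every step. It is not the authors' algorithm; the matrix Sigma_b, the sparsity level k and the toy data are assumptions made only for this example.

```python
import numpy as np

def sparse_power_method(M, k, n_iter=100, tol=1e-8):
    """Truncated power iteration: approximate leading eigenvector of a
    symmetric matrix M, constrained to at most k non-zero entries."""
    p = M.shape[0]
    v = np.ones(p) / np.sqrt(p)                 # dense starting direction
    for _ in range(n_iter):
        w = M @ v
        w[np.argsort(np.abs(w))[:-k]] = 0.0     # keep only the k largest entries
        norm = np.linalg.norm(w)
        if norm == 0:
            break
        w /= norm
        if np.linalg.norm(w - v) < tol:
            return w
        v = w
    return v

# Toy example: three group means in 50 dimensions that differ only in the
# first five coordinates; a crude between-class scatter matrix.
rng = np.random.default_rng(0)
means = np.zeros((3, 50))
means[:, :5] = rng.normal(scale=2.0, size=(3, 5))
Sigma_b = np.cov(means.T, bias=True)
direction = sparse_power_method(Sigma_b, k=5)
print(np.nonzero(direction)[0])                 # support should concentrate on 0..4
```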

2.
In this article, a sequential correction of two linear methods, linear discriminant analysis (LDA) and the perceptron, is proposed. The correction relies on sequentially appending additional features on which the classifier is trained. These new features are posterior probabilities produced by a basic classification method such as LDA or the perceptron. At each step, the probabilities are obtained on a slightly different data set, because the vector of appended probabilities changes from step to step; we therefore have many classifiers of the same type trained on slightly different data sets. Four sequential correction methods are presented, based on different combining schemes (e.g. the mean rule and the product rule). Experimental results on several data sets demonstrate that the corrections are effective and that the approach outperforms classical linear methods, providing a significant reduction in the mean classification error rate.
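A minimal sketch of the posterior-augmentation idea using scikit-learn's LDA: at each step the current classifier's class posteriors are appended as extra columns, so subsequent classifiers are trained on slightly different data sets. The number of steps, the final single refit (in place of a mean or product combining rule) and the synthetic data are assumptions for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

def sequential_posterior_augmentation(X_train, y_train, X_test, n_steps=3):
    """Repeatedly fit LDA and append its class-posterior probabilities as
    new features, then fit a final LDA on the fully augmented data."""
    Xtr, Xte = X_train.copy(), X_test.copy()
    for _ in range(n_steps):
        clf = LinearDiscriminantAnalysis().fit(Xtr, y_train)
        Xtr = np.hstack([Xtr, clf.predict_proba(Xtr)])
        Xte = np.hstack([Xte, clf.predict_proba(Xte)])
    return LinearDiscriminantAnalysis().fit(Xtr, y_train), Xte

X, y = make_classification(n_samples=400, n_features=20, n_informative=5,
                           n_classes=3, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
final_clf, X_te_aug = sequential_posterior_augmentation(X_tr, y_tr, X_te)
print("test accuracy:", final_clf.score(X_te_aug, y_te))
```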

3.
Summary.  We present a new class of methods for high dimensional non-parametric regression and classification called sparse additive models. Our methods combine ideas from sparse linear modelling and additive non-parametric regression. We derive an algorithm for fitting the models that is practical and effective even when the number of covariates is larger than the sample size. Sparse additive models are essentially a functional version of the grouped lasso of Yuan and Lin. They are also closely related to the COSSO model of Lin and Zhang but decouple smoothing and sparsity, enabling the use of arbitrary non-parametric smoothers. We give an analysis of the theoretical properties of sparse additive models and present empirical results on synthetic and real data, showing that they can be effective in fitting sparse non-parametric models in high dimensional data.
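The following sketch shows a soft-thresholded backfitting loop in the spirit of sparse additive models: each component function is smoothed against its covariate and then shrunk towards zero, so irrelevant covariates drop out. The Gaussian-kernel smoother, bandwidth, penalty level and toy data are illustrative assumptions, not the paper's exact algorithm or tuning.

```python
import numpy as np

def nw_smooth(x, r, bandwidth=0.3):
    """Simple Nadaraya-Watson smoother of residuals r against covariate x."""
    d = (x[:, None] - x[None, :]) / bandwidth
    W = np.exp(-0.5 * d ** 2)
    return (W @ r) / W.sum(axis=1)

def spam_backfit(X, y, lam, n_iter=20):
    """Soft-thresholded backfitting sketch for a sparse additive model."""
    n, p = X.shape
    f = np.zeros((n, p))                          # fitted component functions
    for _ in range(n_iter):
        for j in range(p):
            resid = y - y.mean() - f.sum(axis=1) + f[:, j]
            pj = nw_smooth(X[:, j], resid)        # smooth the partial residual
            sj = np.sqrt(np.mean(pj ** 2))
            shrink = max(0.0, 1.0 - lam / sj) if sj > 0 else 0.0
            f[:, j] = shrink * (pj - pj.mean())   # shrink towards zero and centre
    return f

# Toy example: only the first two of ten covariates matter.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 10))
y = np.sin(np.pi * X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.normal(size=200)
f_hat = spam_backfit(X, y, lam=0.1)
print(np.round(np.sqrt(np.mean(f_hat ** 2, axis=0)), 2))  # component norms
```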

4.
Abstract. We review and extend some statistical tools that have proved useful for analysing functional data. Functional data analysis is primarily designed for the analysis of random trajectories and infinite-dimensional data, and there is a need for adequate statistical estimation and inference techniques. While the field is in flux, some methods have proven useful. These include warping methods, functional principal component analysis, and conditioning under Gaussian assumptions for the case of sparse data. The latter is a recent development that may provide a bridge between functional and more classical longitudinal data analysis. Besides presenting a brief review of functional principal components and functional regression, we develop some concepts for estimating functional principal component scores in the sparse situation. An extension of the so-called generalized functional linear model to the case of sparse longitudinal predictors is proposed. This extension includes functional binary regression models for longitudinal data and is illustrated with data on primary biliary cirrhosis.
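The conditioning step for sparse data can be summarized by the best linear predictor of a subject's principal component scores given its few noisy observations. The sketch below assumes the mean function, eigenfunctions, eigenvalues and noise variance have already been estimated elsewhere; the toy curve and the numbers are illustrative.

```python
import numpy as np

def conditional_fpc_scores(y_obs, mu_obs, phi_obs, lam, sigma2):
    """Conditional expectation (best linear predictor under Gaussian
    assumptions) of one subject's FPC scores given sparse, noisy
    observations of that subject's curve.
    y_obs  : observed values at the subject's time points
    mu_obs : mean function evaluated at those time points
    phi_obs: (n_obs, K) eigenfunctions evaluated at those time points
    lam    : (K,) eigenvalues; sigma2: measurement-error variance."""
    Lam = np.diag(lam)
    Sigma_y = phi_obs @ Lam @ phi_obs.T + sigma2 * np.eye(len(y_obs))
    return Lam @ phi_obs.T @ np.linalg.solve(Sigma_y, y_obs - mu_obs)

# Toy example: four irregular observation times, two known eigenfunctions.
t = np.array([0.1, 0.35, 0.6, 0.9])
phi = np.column_stack([np.sqrt(2) * np.sin(np.pi * t),
                       np.sqrt(2) * np.sin(2 * np.pi * t)])
true_scores = np.array([1.5, -0.5])
y = phi @ true_scores + 0.1 * np.random.default_rng(10).normal(size=4)
print(conditional_fpc_scores(y, mu_obs=np.zeros(4), phi_obs=phi,
                             lam=np.array([2.0, 1.0]), sigma2=0.01))
```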

5.
For time series data with obvious periodicity (e.g., electric motor systems and cardiac monitors) or vague periodicity (e.g., earthquake and explosion records, speech, and stock data), frequency-based techniques using spectral analysis can usually capture the features of the series. With this approach we not only reduce the data dimension by moving to the frequency domain but can also feed the resulting frequencies into general classification methods such as linear discriminant analysis (LDA) and k-nearest neighbours (KNN) to classify the time series. This is a combination of two classical approaches. However, using LDA and KNN in the frequency domain is difficult because the dimension of the data remains excessive. We overcome this obstacle by using the singular value decomposition (SVD) to select the essential frequencies. Two data sets are used to illustrate our approach. The classification error rates of our simple approach are comparable to those of several more complicated methods.
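A minimal sketch of such a pipeline: log-periodogram features, an SVD projection onto a few directions, then off-the-shelf LDA and KNN. The number of retained components and the two synthetic classes (different dominant frequencies) are assumptions made for illustration, not the paper's data or tuning.

```python
import numpy as np
from numpy.fft import rfft
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def log_periodogram(series):
    """Log-periodogram of each row; columns index Fourier frequencies."""
    spec = np.abs(rfft(series, axis=1)) ** 2 / series.shape[1]
    return np.log(spec[:, 1:] + 1e-12)          # drop the zero frequency

# Toy data: two classes driven by different dominant frequencies.
rng = np.random.default_rng(2)
t = np.arange(256)
X0 = np.sin(2 * np.pi * 5 * t / 256) + rng.normal(scale=0.5, size=(50, 256))
X1 = np.sin(2 * np.pi * 20 * t / 256) + rng.normal(scale=0.5, size=(50, 256))
X, y = np.vstack([X0, X1]), np.repeat([0, 1], 50)

Fc = log_periodogram(X)
Fc = Fc - Fc.mean(axis=0)
_, _, Vt = np.linalg.svd(Fc, full_matrices=False)   # SVD of the centred spectra
Z = Fc @ Vt[:5].T                                   # keep 5 essential directions

for clf in (LinearDiscriminantAnalysis(), KNeighborsClassifier(5)):
    print(type(clf).__name__, cross_val_score(clf, Z, y, cv=5).mean())
```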

6.
We propose two new procedures based on multiple hypothesis testing for correct support estimation in high-dimensional sparse linear models. We prove that both procedures are powerful and do not require the sample size to be large. The first procedure tackles the atypical setting of ordered variable selection through an extension of a testing procedure previously developed in the context of a linear hypothesis. The second procedure is the main contribution of this paper: it enables data analysts to perform support estimation in the general high-dimensional framework of non-ordered variable selection. A thorough simulation study and applications to real datasets using the R package mht show that our non-ordered variable selection procedure produces excellent results in terms of correct support estimation, mean squared error and false discovery rate, when compared with common methods such as the Lasso, the SCAD penalty, forward regression or the false discovery rate (FDR) procedure.

7.
In classical discriminant analysis, when two multivariate normal distributions with equal variance–covariance matrices are assumed for the two groups, the classical linear discriminant function is optimal with respect to maximizing the standardized difference between the group means. However, for a typical case–control study, the distributional assumption for the case group often needs to be relaxed in practice. Komori et al. (Generalized t-statistic for two-group classification. Biometrics 2015, 71: 404–416) proposed the generalized t-statistic to obtain a linear discriminant function that allows for heterogeneity of the case group. Their procedure has an optimality property within the class under consideration. We study the problem further and show that additional improvement is achievable. The approach we propose does not require a parametric distributional assumption on the case group. We further show that the new estimator is efficient, in the sense that no linear discriminant function can be constructed more efficiently. We conduct simulation studies and real data examples to illustrate the finite sample performance and the gain it produces in comparison with existing methods.

8.
We study the design problem for the optimal classification of functional data. The goal is to select sampling time points so that functional data observed at these time points can be classified accurately. We propose optimal designs that are applicable to either dense or sparse functional data. Using linear discriminant analysis, we formulate our design objectives as explicit functions of the sampling points. We study the theoretical properties of the proposed design objectives and provide a practical implementation. The performance of the proposed design is evaluated through simulations and real data applications. The Canadian Journal of Statistics 48: 285–307; 2020 © 2019 Statistical Society of Canada

9.
We deal with smoothed estimators for conditional probability functions of discrete-valued time series {Y_t} under two different settings. When the conditional distribution of Y_t given its lagged values falls in a parametric family and depends on exogenous random variables, a smoothed maximum (partial) likelihood estimator for the unknown parameter is proposed. When there is no prior information on the distribution, various nonparametric estimation methods are compared, and the adjusted Nadaraya–Watson estimator stands out because it shares the advantages of both the Nadaraya–Watson and local linear regression estimators. The asymptotic normality of the proposed estimators is established in the manner of sparse asymptotics, which shows that the proposed smoothed methods outperform their conventional, unsmoothed, parametric counterparts under very mild conditions. Simulation results lend further support to this assertion. Finally, the new method is illustrated on a real data set concerning the relationship between the number of daily hospital admissions and the levels of pollutants in Hong Kong in 1994–1995. An ad hoc model selection procedure based on a local Akaike information criterion is proposed to select the significant pollutant indices.
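For the fully nonparametric setting, the basic (unadjusted) Nadaraya–Watson estimator of a one-step transition probability is easy to write down; the paper's adjusted version replaces the plain kernel weights with data-driven ones, which is not shown here. The bandwidth and the toy count series are assumptions for the example.

```python
import numpy as np

def nw_conditional_prob(y, target, lag_value, bandwidth=0.5):
    """Plain Nadaraya-Watson estimate of P(Y_t = target | Y_{t-1} = lag_value)
    for a discrete-valued time series y."""
    lagged, current = y[:-1], y[1:]
    w = np.exp(-0.5 * ((lagged - lag_value) / bandwidth) ** 2)
    return np.sum(w * (current == target)) / np.sum(w)

# Toy count series: a Poisson chain whose mean depends on the previous value.
rng = np.random.default_rng(3)
y = np.zeros(2000, dtype=int)
for t in range(1, 2000):
    y[t] = rng.poisson(1.0 + 0.5 * y[t - 1])
print(nw_conditional_prob(y, target=2, lag_value=1))
```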

10.
Summary.  The problem of component choice in regression-based prediction has a long history. The main cases where important choices must be made are functional data analysis, and problems in which the explanatory variables are relatively high dimensional vectors. Indeed, principal component analysis has become the basis for methods for functional linear regression. In this context the number of components can also be interpreted as a smoothing parameter, and so the viewpoint is a little different from that for standard linear regression. However, arguments for and against conventional component choice methods are relevant to both settings and have received significant recent attention. We give a theoretical argument, which is applicable in a wide variety of settings, justifying the conventional approach. Although our result is of minimax type, it is not asymptotic in nature; it holds for each sample size. Motivated by the insight that is gained from this analysis, we give theoretical and numerical justification for cross-validation choice of the number of components that is used for prediction. In particular we show that cross-validation leads to asymptotic minimization of mean summed squared error, in settings which include functional data analysis.
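In a finite-dimensional analogue, cross-validation choice of the number of components amounts to picking the principal component regression order that minimizes estimated prediction error. The sketch below is that multivariate analogue, not the functional-data setting of the paper; the component grid and the synthetic data are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Toy high-dimensional regression in which only a few directions matter.
rng = np.random.default_rng(4)
X = rng.normal(size=(150, 60))
beta = np.zeros(60)
beta[:3] = [2.0, -1.0, 0.5]
y = X @ beta + rng.normal(scale=0.5, size=150)

# Choose the number of components by minimizing cross-validated prediction
# error, rather than by a variance-explained criterion.
scores = {}
for k in range(1, 21):
    model = make_pipeline(PCA(n_components=k), LinearRegression())
    scores[k] = -cross_val_score(model, X, y, cv=5,
                                 scoring="neg_mean_squared_error").mean()
best_k = min(scores, key=scores.get)
print("number of components chosen by CV:", best_k)
```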

11.
This paper is concerned with testing the equality of two high-dimensional spatial sign covariance matrices, with applications to testing the proportionality of two high-dimensional covariance matrices. Interestingly, these two testing problems are completely equivalent for the class of elliptically symmetric distributions. We develop a new test for the equality of two high-dimensional spatial sign covariance matrices based on the Frobenius norm of the difference between the two matrices. The asymptotic normality of the proposed test statistic is derived under the null and alternative hypotheses when the dimension and the sample sizes both tend to infinity, and the asymptotic power function is also presented. Simulation studies show that the proposed test performs very well in a wide range of settings and can accommodate large dimensions with small sample sizes.
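A small sketch of the ingredients: the spatial sign covariance matrix of each sample and the squared Frobenius norm of their difference. For simplicity it centres at the sample mean (a spatial median is more standard) and calibrates the statistic by permutation rather than by the paper's asymptotic normal approximation; the data-generating choices are illustrative.

```python
import numpy as np

def spatial_sign_cov(X, center=None):
    """Average of outer products of the unit vectors (x_i - center)/||x_i - center||."""
    if center is None:
        center = X.mean(axis=0)        # a spatial median is the usual choice
    U = X - center
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    return U.T @ U / len(X)

def frobenius_stat(X1, X2):
    """Squared Frobenius norm of the difference of the two sign covariances."""
    return np.sum((spatial_sign_cov(X1) - spatial_sign_cov(X2)) ** 2)

def permutation_test(X1, X2, n_perm=500, seed=0):
    """Permutation calibration of the statistic (in place of asymptotics)."""
    rng = np.random.default_rng(seed)
    obs = frobenius_stat(X1, X2)
    pooled, n1 = np.vstack([X1, X2]), len(X1)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        if frobenius_stat(pooled[idx[:n1]], pooled[idx[n1:]]) >= obs:
            count += 1
    return obs, (count + 1) / (n_perm + 1)

rng = np.random.default_rng(5)
X1 = rng.normal(size=(40, 100))                            # dimension exceeds sample size
X2 = rng.normal(size=(40, 100)) * np.linspace(1, 2, 100)   # different coordinate scales
print(permutation_test(X1, X2))
```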

12.
We introduce a technique for extending the classical method of linear discriminant analysis (LDA) to data sets where the predictor variables are curves or functions. This procedure, which we call functional linear discriminant analysis (FLDA), is particularly useful when only fragments of the curves are observed. All the techniques associated with LDA can be extended for use with FLDA. In particular, FLDA can be used to produce classifications on new (test) curves, give an estimate of the discriminant function between classes and provide a one- or two-dimensional pictorial representation of a set of curves. We also extend this procedure to provide generalizations of quadratic and regularized discriminant analysis.

13.
We find the asymptotic distribution of multi-dimensional multi-scale and kernel estimators for high-frequency financial data with microstructure noise. Sampling times are allowed to be asynchronous and endogenous. In the process, we show that the classes of multi-scale and kernel estimators for smoothing out the noise perturbation are asymptotically equivalent, in the sense of having the same asymptotic distribution for corresponding kernel and weight functions. The theory leads to multi-dimensional stable central limit theorems and feasible versions, which allow us to draw statistical inference for a broad class of multivariate models and pave the way to tests and confidence intervals in risk measurement for arbitrary portfolios composed of assets observed at high frequency. As an application, we extend the approach to construct a test of the hypothesis that correlated assets are independent conditionally on a common factor.

14.
Classification of gene expression microarray data is important in the diagnosis of diseases such as cancer, but often the analysis of microarray data presents difficult challenges because the gene expression dimension is typically much larger than the sample size. Consequently, classification methods for microarray data often rely on regularization techniques to stabilize the classifier for improved classification performance. In particular, numerous regularization techniques, such as covariance-matrix regularization, are available, which, in practice, lead to a difficult choice of regularization methods. In this paper, we compare the classification performance of five covariance-matrix regularization methods applied to the linear discriminant function using two simulated high-dimensional data sets and five well-known, high-dimensional microarray data sets. In our simulation study, we found the minimum distance empirical Bayes method reported in Srivastava and Kubokawa [Comparison of discrimination methods for high dimensional data, J. Japan Statist. Soc. 37(1) (2007), pp. 123–134], and the new linear discriminant analysis reported in Thomaz, Kitani, and Gillies [A Maximum Uncertainty LDA-based approach for Limited Sample Size problems – with application to Face Recognition, J. Braz. Comput. Soc. 12(1) (2006), pp. 1–12], to perform consistently well and often outperform three other prominent regularization methods. Finally, we conclude with some recommendations for practitioners.
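A short sketch of the general idea of covariance-matrix regularization in the linear discriminant function: shrink the pooled covariance towards a multiple of the identity and compare shrinkage intensities by cross-validation. These are not the five methods compared in the paper; the shrinkage grid and the synthetic "microarray-like" data are assumptions.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# Toy data with p much larger than n: two classes whose means differ
# in a handful of "genes".
rng = np.random.default_rng(6)
n, p = 60, 1000
X = rng.normal(size=(n, p))
y = np.repeat([0, 1], n // 2)
X[y == 1, :20] += 1.0

# Ridge-type covariance regularization inside LDA: shrink the pooled
# covariance towards the identity; shrinkage='auto' uses the Ledoit-Wolf
# estimate of the shrinkage intensity.
for shrink in (0.1, 0.5, 0.9, "auto"):
    lda = LinearDiscriminantAnalysis(solver="lsqr", shrinkage=shrink)
    acc = cross_val_score(lda, X, y, cv=5).mean()
    print(f"shrinkage={shrink}: CV accuracy = {acc:.2f}")
```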

15.
Abstract. The conditional score approach is proposed for the analysis of errors-in-variables current status data under the proportional odds model. Distinct from the conditional scores in other applications, the proposed conditional score involves a high-dimensional nuisance parameter, which poses challenges in both asymptotic theory and computation. We propose a composite algorithm combining the Newton–Raphson and self-consistency algorithms for computation, and develop an efficient conditional score, analogous to the efficient score from a typical semiparametric likelihood, for building an asymptotic linear expansion and hence the asymptotic distribution of the conditional-score estimator of the regression parameter. Our proposal is shown to perform well in simulation studies and is applied to a zebrafish basal cell carcinoma data set involving measurement errors in gene expression levels.

16.
The mode of a distribution provides an important summary of data and is often estimated on the basis of some non-parametric kernel density estimator. This article develops a new data analysis tool, called modal linear regression, for exploring high-dimensional data. Modal linear regression models the conditional mode of a response Y given a set of predictors x as a linear function of x; it differs from standard linear regression, which models the conditional mean (as opposed to the mode) of Y as a linear function of x. We propose an expectation–maximization algorithm to estimate the regression coefficients of modal linear regression, and we provide asymptotic properties for the proposed estimator without assuming symmetry of the error density. Our empirical studies with simulated and real data demonstrate that the proposed modal regression gives shorter predictive intervals than mean linear regression, median linear regression and MM-estimators.
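An EM-type iteration for modal linear regression alternates kernel weights on the current residuals with weighted least squares; the sketch below follows that pattern with a fixed Gaussian kernel bandwidth. The bandwidth, starting value and skewed-noise example are assumptions made for illustration.

```python
import numpy as np

def modal_linear_regression(X, y, bandwidth=0.5, n_iter=200, tol=1e-8):
    """EM-type iteration: kernel weights on residuals (E-step) followed by
    weighted least squares (M-step).  X should include an intercept column."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]        # start from OLS
    for _ in range(n_iter):
        resid = y - X @ beta
        w = np.exp(-0.5 * (resid / bandwidth) ** 2)    # Gaussian kernel weights
        W = w / w.sum()
        beta_new = np.linalg.solve(X.T @ (X * W[:, None]), X.T @ (W * y))
        if np.linalg.norm(beta_new - beta) < tol:
            return beta_new
        beta = beta_new
    return beta

# Toy example with right-skewed errors: the conditional mode differs from the mean.
rng = np.random.default_rng(7)
x = rng.uniform(0, 2, size=500)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + (rng.exponential(1.0, size=500) - 1.0)
print("modal fit:", modal_linear_regression(X, y))
print("OLS fit:  ", np.linalg.lstsq(X, y, rcond=None)[0])
```

With this skewed noise the modal fit's intercept moves towards the conditional mode, so it differs visibly from the OLS intercept while the slopes stay close.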

17.
Kernel discriminant analysis translates the original classification problem into feature space and solves the problem with dimension and sample size interchanged. In high‐dimension low sample size (HDLSS) settings, this reduces the ‘dimension’ to that of the sample size. For HDLSS two‐class problems we modify Mika's kernel Fisher discriminant function which – in general – remains ill‐posed even in a kernel setting; see Mika et al. (1999). We propose a kernel naive Bayes discriminant function and its smoothed version, using first‐ and second‐degree polynomial kernels. For fixed sample size and increasing dimension, we present asymptotic expressions for the kernel discriminant functions, discriminant directions and for the error probability of our kernel discriminant functions. The theoretical calculations are complemented by simulations which show the convergence of the estimators to the population quantities as the dimension grows. We illustrate the performance of the new discriminant rules, which are easy to implement, on real HDLSS data. For such data, our results clearly demonstrate the superior performance of the new discriminant rules, and especially their smoothed versions, over Mika's kernel Fisher version, and typically also over the commonly used naive Bayes discriminant rule.

18.
Fisher's linear discriminant analysis (FLDA) is known as a method for finding a discriminative feature space for multi-class classification. As a theory extending FLDA to an ultimate nonlinear form, optimal nonlinear discriminant analysis (ONDA) has been proposed. ONDA shows that the best theoretical nonlinear map for maximizing Fisher's discriminant criterion is formulated using the Bayesian a posteriori probabilities. In addition, the theory proves that FLDA is equivalent to ONDA when the Bayesian a posteriori probabilities are approximated by linear regression (LR). Due to some limitations of the linear model, there is room to improve FLDA by using stronger approximation/estimation methods. For the purpose of probability estimation, multinomial logistic regression (MLR) is more suitable than LR. Along this line, we develop a nonlinear discriminant analysis (NDA) in which the posterior probabilities in ONDA are estimated by MLR. We also develop a way to introduce sparseness into discriminant analysis: by applying L1 or L2 regularization to LR or MLR, we can incorporate sparseness into FLDA and our NDA to increase generalization performance. The performance of these methods is evaluated by benchmark experiments using last_exam17 standard datasets and a face classification experiment.
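As a loose illustration of the ONDA viewpoint (posterior probabilities as the nonlinear map), the sketch below estimates class posteriors with L1-regularized multinomial logistic regression and then builds an ordinary LDA on those posteriors. The dataset, the regularization strength C and the two-stage arrangement are assumptions; the paper's exact formulation is not reproduced here.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Step 1: estimate class-posterior probabilities with sparse (L1) MLR.
X, y = load_digits(return_X_y=True)
X = X / 16.0                                   # scale pixel values to [0, 1]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
mlr = LogisticRegression(penalty="l1", C=0.1, solver="saga",
                         max_iter=5000).fit(X_tr, y_tr)
P_tr, P_te = mlr.predict_proba(X_tr), mlr.predict_proba(X_te)

# Step 2: a Fisher-type discriminant built on the posterior features.
lda = LinearDiscriminantAnalysis().fit(P_tr, y_tr)
print("accuracy of LDA on posterior features:", lda.score(P_te, y_te))
print("non-zero MLR coefficients:", np.count_nonzero(mlr.coef_))
```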

19.
In this paper, we propose the hard thresholding regression (HTR) for estimating high-dimensional sparse linear regression models. HTR uses a two-stage convex algorithm to approximate the ℓ0-penalized regression: the first stage calculates a coarse initial estimator, and the second stage identifies the oracle estimator by borrowing information from the first one. Theoretically, the HTR estimator achieves the strong oracle property over a wide range of regularization parameters. Numerical examples and a real data example lend further support to our proposed methodology.
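A generic two-stage sketch in the same spirit: a coarse initial estimate (here a lasso, standing in for the paper's first-stage convex program), followed by hard thresholding of the coefficients and a least-squares refit on the selected support. The threshold tau, the lasso penalty alpha and the simulated design are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

def hard_threshold_regression(X, y, tau, alpha=0.05):
    """Stage 1: coarse lasso estimate.  Stage 2: keep coefficients above
    the threshold tau and refit by least squares on that support."""
    init = Lasso(alpha=alpha).fit(X, y).coef_
    support = np.flatnonzero(np.abs(init) > tau)
    beta = np.zeros(X.shape[1])
    if support.size:
        beta[support] = LinearRegression().fit(X[:, support], y).coef_
    return beta, support

rng = np.random.default_rng(8)
n, p = 100, 300
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[[3, 50, 120]] = [3.0, -2.0, 1.5]
y = X @ beta_true + rng.normal(scale=0.5, size=n)
beta_hat, support = hard_threshold_regression(X, y, tau=0.3)
print("selected support:", support)
```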

20.
M. Nussbaum. Statistics, 2013, 47(2): 173–198
For the problem of estimating a linear functional relation when the ratio of the error variances is known, a general class of estimators is introduced. It includes the instrumental variable and replication cases, among others, as special cases. Conditions are given for consistency, asymptotic normality and asymptotic optimality within this class, based on the variance of the limit distribution. Fisher's lower bound for asymptotic variances is established, and under normality the asymptotically optimal estimators are shown to be best asymptotically normal. For an inhomogeneous linear relation, only estimators that are invariant with respect to a translation of the origin are considered, and asymptotically optimal invariant and, under normality, best asymptotically normal invariant estimators are obtained. Several special cases are discussed.
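A familiar member of this family is the maximum-likelihood (Deming-type) estimator for a straight-line functional relation with a known error-variance ratio; the closed-form slope below is that classical estimator, shown only to make the setting concrete. The simulated data and the ratio lam are assumptions.

```python
import numpy as np

def deming_fit(x, y, lam=1.0):
    """Slope and intercept of the linear functional relation y* = a + b x*
    when both variables are observed with error and lam is the known ratio
    var(error in y) / var(error in x)."""
    sxx = np.var(x, ddof=1)
    syy = np.var(y, ddof=1)
    sxy = np.cov(x, y, ddof=1)[0, 1]
    d = syy - lam * sxx
    b = (d + np.sqrt(d ** 2 + 4 * lam * sxy ** 2)) / (2 * sxy)
    return y.mean() - b * x.mean(), b

# Toy data: true relation y* = 1 + 2 x*, both variables observed with noise.
rng = np.random.default_rng(9)
xi = rng.uniform(0, 10, size=500)
x = xi + rng.normal(scale=1.0, size=500)               # error in x
y = 1.0 + 2.0 * xi + rng.normal(scale=1.0, size=500)   # error in y, so lam = 1
print("variance-ratio fit (a, b):", deming_fit(x, y, lam=1.0))
print("naive OLS slope:          ", np.polyfit(x, y, 1)[0])
```

The naive least-squares slope is attenuated towards zero because it ignores the error in x, whereas the variance-ratio-corrected slope is consistent for the true value.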
