Similar Literature
 Found 20 similar articles; search time 31 ms
1.
Discrimination between two Gaussian time series is examined assuming that the important difference between the alternative processes is their covariance (spectral) structure. Using the likelihood ratio method in the frequency domain, a discriminant function is derived and its approximate distribution is obtained. It is demonstrated that, utilizing the Kullback-Leibler information measure, the frequencies or frequency bands which carry information for discrimination can be determined. Using this, it is shown that when mean functions are equal, discrimination based on the frequency with the largest discrimination information is equivalent to the classification procedure based on the best linear discriminant. Application to seismology is described by including a discussion concerning the spectral ratio discriminant for underground nuclear explosions and natural earthquakes, and is illustrated numerically using Rayleigh wave data from an underground and an atmospheric explosion.
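
The idea of ranking frequencies by discrimination information can be sketched as follows. This is only an illustration, not the paper's procedure: it assumes zero-mean processes, uses the standard per-frequency Kullback-Leibler integrand f1/f2 - log(f1/f2) - 1 for two Gaussian spectra, and takes two AR(1) spectral densities as hypothetical stand-ins for the competing processes. All function names and parameter values are assumptions made for the example.

```python
import math


def ar1_spectral_density(lam, phi, sigma2=1.0):
    """Spectral density at frequency lam of x_t = phi * x_{t-1} + e_t, Var(e_t) = sigma2."""
    return sigma2 / (2.0 * math.pi * (1.0 - 2.0 * phi * math.cos(lam) + phi * phi))


def discrimination_information(f1, f2):
    """Per-frequency Kullback-Leibler integrand for two zero-mean Gaussian spectra.

    The value is non-negative and equals zero exactly when f1 == f2.
    """
    ratio = f1 / f2
    return ratio - math.log(ratio) - 1.0


def most_informative_frequency(phi1, phi2, n_freq=256):
    """Return (frequency, information) maximising the integrand on a grid over [0, pi]."""
    grid = [math.pi * k / (n_freq - 1) for k in range(n_freq)]
    scored = [
        (discrimination_information(ar1_spectral_density(lam, phi1),
                                    ar1_spectral_density(lam, phi2)), lam)
        for lam in grid
    ]
    info, lam = max(scored)
    return lam, info


if __name__ == "__main__":
    # Hypothetical AR(1) coefficients chosen only to make the two spectra differ.
    lam, info = most_informative_frequency(phi1=0.8, phi2=-0.5)
    print(f"most informative frequency: {lam:.3f} rad, information: {info:.3f}")
```

In practice the spectra would be estimated from training series of each class rather than specified in closed form, and bands of adjacent frequencies would be compared in the same way.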

2.
Food authenticity studies are concerned with determining if food samples have been correctly labeled or not. Discriminant analysis methods are an integral part of the methodology for food authentication. Motivated by food authenticity applications, a model-based discriminant analysis method that includes variable selection is presented. The discriminant analysis model is fitted in a semi-supervised manner using both labeled and unlabeled data. The method is shown to give excellent classification performance on several high-dimensional multiclass food authenticity datasets with more variables than observations. The variables selected by the proposed method provide information about which variables are meaningful for classification purposes. A headlong search strategy for variable selection is shown to be efficient in terms of computation and achieves excellent classification performance. In applications to several food authenticity datasets, our proposed method outperformed default implementations of Random Forests, AdaBoost, transductive SVMs and Bayesian Multinomial Regression by substantial margins.

3.
In this article, a sequential correction of two linear methods, linear discriminant analysis (LDA) and the perceptron, is proposed. This correction relies on sequential joining of additional features on which the classifier is trained. These new features are posterior probabilities determined by a basic classification method such as LDA or the perceptron. In each step, we add the probabilities obtained on a slightly different data set, because the vector of added probabilities varies at each step. We therefore have many classifiers of the same type trained on slightly different data sets. Four different sequential correction methods are presented based on different combining schemas (e.g. mean rule and product rule). Experimental results on different data sets demonstrate that the improvements are efficient, and that this approach outperforms classical linear methods, providing a significant reduction in the mean classification error rate.

4.
An assumption made in the classification problem is that the distribution of the data being classified has the same parameters as the data used to obtain the discriminant functions. A method based on mixtures of two normal distributions is proposed as a method of checking this assumption and modifying the discriminant functions accordingly. As a first step, the case considered in this paper is that of a shift in the mean of one or two univariate normal distributions with all other parameters remaining fixed and known. Calculations based on the asymptotic results suggest that the proposed method works well even for small shifts.

5.
Summary.  An authentic food is one that is what it purports to be. Food processors and consumers need to be assured that, when they pay for a specific product or ingredient, they are receiving exactly what they pay for. Classification methods are an important tool in food authenticity studies where they are used to assign food samples of unknown type to known types. A classification method is developed where the classification rule is estimated by using both the labelled and the unlabelled data, in contrast with many classical methods which use only the labelled data for estimation. This methodology models the data as arising from a Gaussian mixture model with parsimonious covariance structure, as is done in model-based clustering. A missing data formulation of the mixture model is used and the models are fitted by using the EM and classification EM algorithms. The methods are applied to the analysis of spectra of food-stuffs recorded over the visible and near infra-red wavelength range in food authenticity studies. A comparison of the performance of model-based discriminant analysis and the method of classification proposed is given. The classification method proposed is shown to yield very good misclassification rates. The correct classification rate was observed to be as much as 15% higher than the correct classification rate for model-based discriminant analysis.

6.
For time series data with obvious periodicity (e.g., electric motor systems and cardiac monitors) or vague periodicity (e.g., earthquakes and explosions, speech, and stock data), frequency-based techniques using spectral analysis can usually capture the features of the series. With this approach, we are able not only to reduce the dimension of the data by moving to the frequency domain but also to use these frequencies with general classification methods such as linear discriminant analysis (LDA) and k-nearest-neighbor (KNN) to classify the time series. This is a combination of two classical approaches. However, using LDA and KNN in the frequency domain is difficult because of the excessive dimension of the data. We overcome this obstacle by using Singular Value Decomposition to select essential frequencies. Two data sets are used to illustrate our approach. The classification error rates of our simple approach are comparable to those of several more complicated methods.

7.
8.
In this paper we propose a new nonparametric estimator of the spectral matrix of a multivariate stationary stochastic process, with the main goal of locally improving on the deficiencies of the smoothed periodogram in terms of the mean square error of the estimates. Our estimator is based on a convex linear combination of the frequency-averaged periodogram and an estimate of the true mean spectral matrix across frequencies. In a wide simulation study we show that our estimator markedly improves on the frequency-averaged periodogram, especially at central frequencies.

9.
The circulant embedding method for generating statistically exact simulations of time series from certain Gaussian distributed stationary processes is attractive because of its advantage in computational speed over a competitive method based upon the modified Cholesky decomposition. We demonstrate that the circulant embedding method can be used to generate simulations from stationary processes whose spectral density functions are dictated by a number of popular nonparametric estimators, including all direct spectral estimators (a special case being the periodogram), certain lag window spectral estimators, all forms of Welch's overlapped segment averaging spectral estimator and all basic multitaper spectral estimators. One application for this technique is to generate time series for bootstrapping various statistics. When used with bootstrapping, our proposed technique avoids some – but not all – of the pitfalls of previously proposed frequency domain methods for simulating time series.
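
For readers unfamiliar with circulant embedding, a minimal sketch of the basic construction is given here. It is not the authors' code: it assumes the autocovariances s_0, ..., s_{n-1} are known (a hypothetical AR(1) sequence is used in the demo), embeds them in a circulant matrix of size 2(n-1), and requires that matrix to be non-negative definite. A naive O(m^2) DFT keeps the example dependency-free; an FFT routine such as `numpy.fft.fft` would be the practical choice. All function names are illustrative.

```python
import cmath
import math
import random


def dft(values):
    """Naive O(m^2) discrete Fourier transform (dependency-free stand-in for an FFT)."""
    m = len(values)
    return [
        sum(v * cmath.exp(-2j * math.pi * j * k / m) for j, v in enumerate(values))
        for k in range(m)
    ]


def circulant_eigenvalues(acvs):
    """Eigenvalues of the circulant matrix whose first row is s_0..s_{n-1}, s_{n-2}..s_1."""
    first_row = list(acvs) + list(acvs[-2:0:-1])
    # The first row is symmetric, so its DFT is real up to rounding error.
    return [z.real for z in dft(first_row)]


def simulate_pair(acvs, rng):
    """Return two independent Gaussian series of length n with autocovariances acvs."""
    eig = circulant_eigenvalues(acvs)
    if min(eig) < -1e-9:
        raise ValueError("circulant embedding is not non-negative definite")
    m = len(eig)
    # Weight independent complex Gaussians by the square roots of the scaled eigenvalues.
    weighted = [
        math.sqrt(max(lam, 0.0) / m) * complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        for lam in eig
    ]
    y = dft(weighted)
    n = len(acvs)
    # Real and imaginary parts are independent, each with the target covariance.
    return [z.real for z in y[:n]], [z.imag for z in y[:n]]


if __name__ == "__main__":
    phi = 0.6  # hypothetical AR(1) coefficient for the demo
    acvs = [phi ** k / (1.0 - phi * phi) for k in range(16)]
    series_a, series_b = simulate_pair(acvs, random.Random(0))
    print(series_a[:4])
```

To simulate from an estimated spectrum, as the abstract describes, the `acvs` argument would instead hold the autocovariance sequence implied by the chosen spectral estimator.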

10.
This paper discusses a supervised classification approach for the differential diagnosis of Raynaud's phenomenon (RP). The classification of data from healthy subjects and from patients suffering from primary and secondary RP is obtained by means of a set of classifiers derived within the framework of linear discriminant analysis. A set of functional variables and shape measures extracted from rewarming/reperfusion curves are proposed as discriminant features. Since the prediction of group membership is based on a large number of these features, the high dimension/small sample size problem is considered to overcome the singularity problem of the within-group covariance matrix. Results on a data set of 72 subjects demonstrate that a satisfactory classification of the subjects can be achieved through the proposed methodology.

11.
The purpose of this study was to predict placement and nonplacement outcomes for mildly handicapped three- through five-year-old children given knowledge of developmental screening test data. Discrete discriminant analysis (Anderson, 1951; Cochran & Hopkins, 1961; Goldstein & Dillon, 1978) was used to classify children into either a placement or nonplacement group using developmental information retrieved from longitudinal Child Find records (1982-89). These records were located at the Florida Diagnostic and Learning Resource System (FDLRS) in Sarasota, Florida and provided usable data for 602 children. The developmental variables included performance on screening test activities from the Comprehensive Identification Process (Zehrbach, 1975), and consisted of: (a) gross motor skills, (b) expressive language skills, and (c) social-emotional skills. These three dichotomously scored developmental variables generated eight mutually exclusive and exhaustive combinations of screening data. Combined with one of three different types of cost-of-misclassification functions, each child in a random cross-validation sample of 100 was classified into one of the two outcome groups minimizing the expected cost of misclassification based on the remaining 502 children. For each cost function designed by the researchers, a comparison was made between classifications from the discrete discriminant analysis procedure and actual placement outcomes for the 100 children. A logit analysis and a standard discriminant analysis were likewise conducted using the 502 children and compared with results of the discrete discriminant analysis for selected cost functions.
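
The minimum-expected-cost rule over the eight screening patterns can be sketched as below. This is an illustrative reading of the abstract, not the study's procedure: the cell counts are made-up toy numbers, the two cost parameters are hypothetical, and joint probabilities are estimated simply by the training counts in each cell.

```python
from itertools import product


def minimum_cost_rule(counts_placed, counts_not_placed,
                      cost_missed_placement=1.0, cost_unneeded_placement=1.0):
    """Assign each of the 8 binary screening patterns to the cheaper outcome group.

    counts_* map a pattern (gross motor, expressive language, social-emotional),
    each scored 0 or 1, to the number of training children with that pattern.
    """
    rule = {}
    for cell in product((0, 1), repeat=3):
        n_placed = counts_placed.get(cell, 0)
        n_not_placed = counts_not_placed.get(cell, 0)
        # Expected costs up to the common factor 1 / (training sample size).
        cost_if_nonplacement = cost_missed_placement * n_placed
        cost_if_placement = cost_unneeded_placement * n_not_placed
        # Ties (including empty cells) are sent to "placement" by convention here.
        if cost_if_placement <= cost_if_nonplacement:
            rule[cell] = "placement"
        else:
            rule[cell] = "nonplacement"
    return rule


if __name__ == "__main__":
    # Toy counts invented for illustration only; they are not the study's data.
    placed = {(0, 0, 0): 40, (0, 0, 1): 10, (1, 1, 1): 2}
    not_placed = {(0, 0, 0): 5, (0, 0, 1): 15, (1, 1, 1): 60}
    print(minimum_cost_rule(placed, not_placed)[(0, 0, 1)])
    print(minimum_cost_rule(placed, not_placed, cost_missed_placement=3.0)[(0, 0, 1)])
```

Raising the cost of a missed placement moves borderline patterns into the placement group, which is how different cost functions lead to different classifications of the same child.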

12.
This article proposes an asymptotic expansion for the Studentized linear discriminant function using two-step monotone missing samples under multivariate normality. Asymptotic expansions related to the discriminant function have been obtained for complete data under multivariate normality. The result derived by Anderson (1973; "An asymptotic expansion of the distribution of the Studentized classification statistic W", The Annals of Statistics 1: 964-972) plays an important role in deciding the cut-off point that controls the probabilities of misclassification. This article extends the result of Anderson (1973) to the case of two-step monotone missing samples under multivariate normality. Finally, numerical evaluations by Monte Carlo simulation are presented.

13.
We study the design problem for the optimal classification of functional data. The goal is to select sampling time points so that functional data observed at these time points can be classified accurately. We propose optimal designs that are applicable to either dense or sparse functional data. Using linear discriminant analysis, we formulate our design objectives as explicit functions of the sampling points. We study the theoretical properties of the proposed design objectives and provide a practical implementation. The performance of the proposed design is evaluated through simulations and real data applications. The Canadian Journal of Statistics 48: 285–307; 2020 © 2019 Statistical Society of Canada

14.
Non-Gaussian Conditional Linear AR(1) Models   (Total citations: 2; self-citations: 0; citations by others: 2)
This paper gives a general formulation of a non-Gaussian conditional linear AR(1) model subsuming most of the non-Gaussian AR(1) models that have appeared in the literature. It derives some general results giving properties for the stationary process mean, variance and correlation structure, and conditions for stationarity. These results highlight similarities with and differences from the Gaussian AR(1) model, and unify many separate results appearing in the literature. Examples illustrate the wide range of properties that can appear under the conditional linear autoregressive assumption. These results are used in analysing three real datasets, illustrating general methods of estimation, model diagnostics and model selection. In particular, the theoretical results can be used to develop diagnostics for deciding if a time series can be modelled by some linear autoregressive model, and for selecting among several candidate models.

15.
We propose an adaptive functional autoregressive (AFAR) forecast model to predict electricity price curves. With time-varying operators, the AFAR model can be safely used in both stationary and nonstationary situations. A closed-form maximum likelihood (ML) estimator is derived under stationarity. The result is further extended to nonstationarity, where the time-dependent operators are adaptively estimated under local homogeneity. We provide theoretical results for the ML estimator and the adaptive estimator. A simulation study illustrates good finite-sample performance of the AFAR modeling. The AFAR model also exhibits superior accuracy in forecasting the California electricity daily price curves compared to several alternatives.

16.
This article deals with the efficiency of fractional integration parameter estimators. The study was based on Monte Carlo experiments involving simulated stochastic processes with integration orders in the open interval (-1, 1). The evaluated estimation methods were classified into two groups: heuristics and semiparametric/maximum likelihood (ML). The study revealed that the comparative efficiency of the estimators, measured by the smaller mean squared error, depends on the stationary/non-stationary and persistency/anti-persistency conditions of the series. The ML estimator was shown to be superior for stationary persistent processes; the wavelet spectrum-based estimators were better for non-stationary mean reversible and invertible anti-persistent processes; the weighted periodogram-based estimator was shown to be superior for non-invertible anti-persistent processes.

17.
The problem of constructing classification methods based on both labeled and unlabeled data sets is considered for analyzing data with complex structures. We introduce a semi-supervised logistic discriminant model with Gaussian basis expansions. Unknown parameters included in the logistic model are estimated by a regularization method together with the EM algorithm. For the selection of adjusted parameters, we derive a model selection criterion from a Bayesian viewpoint. Numerical studies are conducted to investigate the effectiveness of our proposed modeling procedures.

18.
This article is concerned with inference for the parameter vector in stationary time series models based on the frequency domain maximum likelihood estimator. The traditional method consistently estimates the asymptotic covariance matrix of the parameter estimator and usually assumes the independence of the innovation process. For dependent innovations, the asymptotic covariance matrix of the estimator depends on the fourth-order cumulants of the unobserved innovation process, a consistent estimation of which is a difficult task. In this article, we propose a novel self-normalization-based approach to constructing a confidence region for the parameter vector in such models. The proposed procedure involves no smoothing parameter, and is widely applicable to a large class of long/short memory time series models with weakly dependent innovations. In simulation studies, we demonstrate favourable finite sample performance of our method in comparison with the traditional method and a residual block bootstrap approach.

19.
Several mathematical programming approaches to the classification problem in discriminant analysis have recently been introduced. This paper empirically compares these newly introduced classification techniques with Fisher's linear discriminant analysis (FLDA), quadratic discriminant analysis (QDA), logit analysis, and several rank-based procedures for a variety of symmetric and skewed distributions. The percentage of observations in a holdout sample correctly classified by each procedure indicates that, while the linear programming approaches compete well with the classical procedures under some experimental conditions, their overall performance lags behind that of the classical procedures.

20.
In longitudinal studies, as repeated observations are made on the same individual the response variables will usually be correlated. In analyzing such data, this dependence must be taken into account to avoid misleading inferences. The focus of this paper is to apply a logistic marginal model with Markovian dependence proposed by Azzalini [A. Azzalini, Logistic regression for autocorrelated data with application to repeated measures, Biometrika 81 (1994) 767–775] to the study of the influence of time-dependent covariates on the marginal distribution of the binary response in serially correlated binary data. We have shown how to construct the model so that the covariates relate only to the mean value of the process, independent of the association parameters. After formulating the proposed model for repeated measures data, the same approach is applied to missing data. An application is provided to the diabetes mellitus data of registered patients at the Bangladesh Institute of Research and Rehabilitation in Diabetes, Endocrine and Metabolic Disorders (BIRDEM) in 1984, using both time stationary and time varying covariates.
