Similar literature
 20 similar documents found (search time: 15 ms)
1.
We develop classification rules for data that have an autoregressive circulant covariance structure under the assumption of multivariate normality. We also develop classification rules assuming a general circulant covariance structure. The new classification rules are efficient in reducing the misclassification error rates when the number of observations is not large enough to estimate the unknown variance–covariance matrix. The validity of the proposed classification rules is demonstrated in a simulation study, and their use is illustrated with a real data analysis. Analyses of both simulated data and real data show the effectiveness of our new classification rules.

2.
In this article we study the problem of classification of three-level multivariate data, where multiple q-variate observations are measured on u-sites and over p-time points, under the assumption of multivariate normality. The new classification rules with certain structured and unstructured mean vectors and covariance structures are very efficient in small-sample scenarios, when the number of observations is not adequate to estimate the unknown variance–covariance matrix. These classification rules successfully model the correlation structure on successive repeated measurements over time. Computation algorithms for maximum likelihood estimates of the unknown population parameters are presented. Simulation results show that the introduction of sites in the classification rules improves their performance over the existing classification rules without the sites.

3.
Although devised in 1936 by Fisher, discriminant analysis is still rapidly evolving, as the complexity of contemporary data sets grows exponentially. Our classification rules explore these complexities by modeling various correlations in higher-order data. Moreover, our classification rules are suitable for data sets where the number of response variables is comparable to or larger than the number of observations. We assume that the higher-order observations have a separable variance-covariance matrix and two different Kronecker product structures on the mean vector. In this article, we develop quadratic classification rules among g different populations where each individual has κth-order (κ ≥ 2) measurements. We also provide the computational algorithms to compute the maximum likelihood estimates for the model parameters and eventually the sample classification rules.

4.
We propose a new criterion for model selection in prediction problems. The covariance inflation criterion adjusts the training error by the average covariance of the predictions and responses, when the prediction rule is applied to permuted versions of the data set. This criterion can be applied to general prediction problems (e.g. regression or classification) and to general prediction rules (e.g. stepwise regression, tree-based models and neural nets). As a by-product we obtain a measure of the effective number of parameters used by an adaptive procedure. We relate the covariance inflation criterion to other model selection procedures and illustrate its use in some regression and classification problems. We also revisit the conditional bootstrap approach to model selection.

5.
We consider classification of the realization of a multivariate spatial–temporal Gaussian random field into one of two populations with different regression mean models and factorized covariance matrices. Unknown means and common feature vector covariance matrix are estimated from training samples with observations correlated in space and time, assuming spatial–temporal correlations to be known. We present the first-order asymptotic expansion of the expected error rate associated with a linear plug-in discriminant function. Our results are applied to ecological data collected from the Lithuanian Economic Zone in the Baltic Sea.

6.
We develop a new score-driven model for the joint dynamics of fat-tailed realized covariance matrix observations and daily returns. The score dynamics for the unobserved true covariance matrix are robust to outliers and incidental large observations in both types of data by assuming a matrix-F distribution for the realized covariance measures and a multivariate Student's t distribution for the daily returns. The filter for the unknown covariance matrix has a computationally efficient matrix formulation, which proves beneficial for estimation and simulation purposes. We formulate parameter restrictions for stationarity and positive definiteness. Our simulation study shows that the new model is able to deal with high-dimensional settings (50 or more) and captures unobserved volatility dynamics even if the model is misspecified. We provide an empirical application to daily equity returns and realized covariance matrices up to 30 dimensions. The model statistically and economically outperforms competing multivariate volatility models out-of-sample. Supplementary materials for this article are available online.

7.
The last decade has seen an explosion of work on the use of mixture models for clustering. The use of the Gaussian mixture model has been common practice, with constraints sometimes imposed upon the component covariance matrices to give families of mixture models. Similar approaches have also been applied, albeit with less fecundity, to classification and discriminant analysis. In this paper, we begin with an introduction to model-based clustering and a succinct account of the state-of-the-art. We then put forth a novel family of mixture models wherein each component is modeled using a multivariate t-distribution with an eigen-decomposed covariance structure. This family, which is largely a t-analogue of the well-known MCLUST family, is known as the tEIGEN family. The efficacy of this family for clustering, classification, and discriminant analysis is illustrated with both real and simulated data. The performance of this family is compared to its Gaussian counterpart on three real data sets.

8.
Necessary and sufficient conditions are given for the covariance structure of all the observations in a multivariate factorial experiment under which certain multivariate quadratic forms are independent and distributed as a constant times a Wishart. It is also shown that exact multivariate test statistics can be formed for certain covariance structures of the observations when the assumption of equal covariance matrices for each normal population is relaxed. A characterization is given for the dependency structure between random vectors in which the sample mean and sample covariance matrix have certain properties.

9.
In this article we study a linear discriminant function of multiple m-variate observations at u-sites and over v-time points under the assumption of multivariate normality. We assume that the m-variate observations have a separable mean vector structure and a “jointly equicorrelated covariance” structure. The new discriminant function is very effective in discriminating individuals in a small sample scenario. No closed-form expression exists for the maximum likelihood estimates of the unknown population parameters, and their direct computation is nontrivial. An iterative algorithm is proposed to calculate the maximum likelihood estimates of these unknown parameters. A discriminant function is also developed for unstructured mean vectors. The new discriminant functions are applied to simulated data sets as well as to a real data set. Results illustrating the benefits of the new classification methods over the traditional one are presented.

10.
11.
The Linear Discriminant Rule (LD) is theoretically justified for use in classification when the population within-groups covariance matrices are equal, something rarely known in practice. As an alternative, the Quadratic Discriminant Rule (QD) avoids assuming equal covariance matrices, but requires the estimation of a large number of parameters. Hence, the performance of QD may be poor if the training set sizes are small or moderate. In fact, simulation studies have shown that in the two-groups case LD often outperforms QD for small training sets even when the within-groups covariance matrices differ substantially. The present article shows this to be true when there are more than two groups, as well. Thus, it would seem reasonable and useful to develop a data-based method of classification that, in effect, represents a compromise between QD and LD. In this article we develop such a method based on an empirical Bayes formulation in which the within-groups covariance matrices are assumed to be outcomes of a common prior distribution whose parameters are estimated from the data. Two classification rules are developed under this framework and, through the use of extensive simulations, are compared to existing methods when the number of groups is moderate.
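As a rough illustration of the LD/QD contrast described in this abstract, the following minimal sketch applies both rules to univariate toy data. It is not the empirical Bayes method of the article. The function names (`fit`, `classify`, `ld_score`, `qd_score`) and the toy samples are hypothetical, and the priors are assumed to equal the group proportions.

```python
import math

def qd_score(x, mean, var, prior):
    """Quadratic discriminant score: each group keeps its own variance."""
    return -0.5 * math.log(var) - 0.5 * (x - mean) ** 2 / var + math.log(prior)

def ld_score(x, mean, pooled_var, prior):
    """Linear discriminant score: one pooled variance shared by all groups."""
    return x * mean / pooled_var - 0.5 * mean ** 2 / pooled_var + math.log(prior)

def fit(groups):
    """Estimate (mean, variance, prior) per group and the pooled variance."""
    n_total = sum(len(g) for g in groups)
    stats, pooled_ss = [], 0.0
    for g in groups:
        m = sum(g) / len(g)
        ss = sum((v - m) ** 2 for v in g)
        # Assumption: prior probability = group share of the training data.
        stats.append((m, ss / (len(g) - 1), len(g) / n_total))
        pooled_ss += ss
    pooled_var = pooled_ss / (n_total - len(groups))
    return stats, pooled_var

def classify(x, stats, pooled_var, rule="LD"):
    """Return the index of the group with the highest discriminant score."""
    if rule == "LD":
        scores = [ld_score(x, m, pooled_var, p) for m, _, p in stats]
    else:
        scores = [qd_score(x, m, v, p) for m, v, p in stats]
    return max(range(len(scores)), key=scores.__getitem__)

# Hypothetical toy training samples: a tight group and a widely spread group.
groups = [[0.0, 1.0, 2.0], [4.0, 5.0, 9.0]]
stats, pooled_var = fit(groups)
```

With these toy samples, the two rules disagree for a point far to the left of both means (for example `x = -3.0`): LD assigns it to the nearer group, while QD favours the group with the larger estimated variance. QD needs one variance per group, which is the extra estimation burden the abstract refers to.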

12.
A novel family of mixture models is introduced based on modified t-factor analyzers. Modified factor analyzers were recently introduced within the Gaussian context and our work presents a more flexible and robust alternative. We introduce a family of mixtures of modified t-factor analyzers that uses this generalized version of the factor analysis covariance structure. We apply this family within three paradigms: model-based clustering; model-based classification; and model-based discriminant analysis. In addition, we apply the recently published Gaussian analogue to this family under the model-based classification and discriminant analysis paradigms for the first time. Parameter estimation is carried out within the alternating expectation-conditional maximization framework and the Bayesian information criterion is used for model selection. Two real data sets are used to compare our approach to other popular model-based approaches; in these comparisons, the chosen mixtures of modified t-factor analyzers model performs favourably. We conclude with a summary and suggestions for future work.

13.
Clustering gene expression time course data is an important problem in bioinformatics because understanding which genes behave similarly can lead to the discovery of important biological information. Statistically, the problem of clustering time course data is a special case of the more general problem of clustering longitudinal data. In this paper, a very general and flexible model-based technique is used to cluster longitudinal data. Mixtures of multivariate t-distributions are utilized, with a linear model for the mean and a modified Cholesky-decomposed covariance structure. Constraints are placed upon the covariance structure, leading to a novel family of mixture models, including parsimonious models. In addition to model-based clustering, these models are also used for model-based classification, i.e., semi-supervised clustering. Parameters, including the component degrees of freedom, are estimated using an expectation-maximization algorithm and two different approaches to model selection are considered. The models are applied to simulated data to illustrate their efficacy, including a comparison with their Gaussian analogues; the use of these Gaussian analogues with a linear model for the mean is novel in itself. Our family of multivariate t mixture models is then applied to two real gene expression time course data sets and the results are discussed. We conclude with a summary, suggestions for future work, and a discussion about constraining the degrees of freedom parameter.

14.
The analysis of crossover designs assuming i.i.d. errors leads to biased variance estimates whenever the true covariance structure is not spherical. As a result, the OLS F-test for the equality of the direct effects of the treatments is not valid. Bellavance et al. [1996. Biometrics 52, 607–612] use simulations to show that a modified F-test based on an estimate of the within subjects covariance matrix allows for nearly unbiased tests. Kunert and Utzig [1993. JRSS B 55, 919–927] propose an alternative test that does not need an estimate of the covariance matrix. Instead, they correct the F-statistic by multiplying by a constant based on the worst-case scenario. However, for designs with more than three observations per subject, Kunert and Utzig (1993) only give a rough upper bound for the worst-case variance bias. This may lead to overly conservative tests. In this paper we derive an exact upper limit for the variance bias due to carry-over for an arbitrary number of observations per subject. The result holds for a certain class of highly efficient balanced crossover designs.

15.
We consider the problem of estimating the parameters of the covariance function of a stationary spatial random process. In spatial statistics, there are widely used parametric forms for the covariance functions, and various methods for estimating the parameters have been proposed in the literature. We develop a method for estimating the parameters of the covariance function that is based on a regression approach. Our method utilizes pairs of observations whose distances are closest to a value h > 0 which is chosen in a way that the estimated correlation at distance h is a predetermined value. We demonstrate the effectiveness of our procedure by simulation studies and an application to a water pH data set. Simulation studies show that our method outperforms all well-known least squares-based approaches to the variogram estimation and is comparable to the maximum likelihood estimation of the parameters of the covariance function. We also show that under a mixing condition on the random field, the proposed estimator is consistent for standard one parameter models for stationary correlation functions.

16.
Multivariate failure time data arises when each study subject can potentially experience several types of failures or recurrences of a certain phenomenon, or when failure times are sampled in clusters. We formulate the marginal distributions of such multivariate data with semiparametric accelerated failure time models (i.e. linear regression models for log-transformed failure times with arbitrary error distributions) while leaving the dependence structures for related failure times completely unspecified. We develop rank-based monotone estimating functions for the regression parameters of these marginal models based on right-censored observations. The estimating equations can be easily solved via linear programming. The resultant estimators are consistent and asymptotically normal. The limiting covariance matrices can be readily estimated by a novel resampling approach, which does not involve non-parametric density estimation or evaluation of numerical derivatives. The proposed estimators represent consistent roots to the potentially non-monotone estimating equations based on weighted log-rank statistics. Simulation studies show that the new inference procedures perform well in small samples. Illustrations with real medical data are provided.

17.
This paper proposes new classifiers under the assumption of multivariate normality for multivariate repeated measures data with Kronecker product covariance structures. These classifiers are especially effective when the number of observations is not large enough to estimate the covariance matrices, and thus the traditional classifiers fail. A computational scheme for maximum likelihood estimates of the required class parameters is also given. The quality of these new classifiers is examined on some real data.
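To make concrete why a Kronecker product covariance structure helps when observations are scarce, the sketch below builds a separable covariance matrix from two small factors and compares the number of free parameters with the unstructured case. It is illustrative only and is not the classifier proposed in the paper. The symbols `p` (time points) and `q` (variables), the function names, and the example matrices are assumptions introduced here.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]

def unstructured_params(p, q):
    """Free parameters of an unstructured (pq x pq) symmetric covariance."""
    d = p * q
    return d * (d + 1) // 2

def kronecker_params(p, q):
    """Free parameters of a (p x p) and a (q x q) symmetric factor.

    One parameter is subtracted because scaling one factor by c and the
    other by 1/c leaves the Kronecker product unchanged.
    """
    return p * (p + 1) // 2 + q * (q + 1) // 2 - 1

# Hypothetical factors: correlation over 2 time points, covariance of 2 variables.
time_factor = [[1.0, 0.5], [0.5, 1.0]]
variable_factor = [[2.0, 1.0], [1.0, 3.0]]
sigma = kron(time_factor, variable_factor)  # 4 x 4 separable covariance
```

For instance, with `p = 4` and `q = 3` the unstructured matrix has 78 free parameters while the Kronecker structure has 15, so far fewer observations are needed to estimate it.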

18.
We propose a new estimator for the spot covariance matrix of a multi-dimensional continuous semimartingale log asset price process, which is subject to noise and nonsynchronous observations. The estimator is constructed based on a local average of block-wise parametric spectral covariance estimates. The latter originate from a local method of moments (LMM), which recently has been introduced by Bibinger et al. We prove consistency and a point-wise stable central limit theorem for the proposed spot covariance estimator in a very general setup with stochastic volatility, leverage effects, and general noise distributions. Moreover, we extend the LMM estimator to be robust against autocorrelated noise and propose a method to adaptively infer the autocorrelations from the data. Based on simulations we provide empirical guidance on the effective implementation of the estimator and apply it to high-frequency data of a cross-section of Nasdaq blue chip stocks. Employing the estimator to estimate spot covariances, correlations, and volatilities in normal but also unusual periods yields novel insights into intraday covariance and correlation dynamics. We show that intraday (co-)variations (i) follow underlying periodicity patterns, (ii) reveal substantial intraday variability associated with (co-)variation risk, and (iii) can increase strongly and nearly instantaneously if new information arrives. Supplementary materials for this article are available online.

19.
Building new and flexible classes of nonseparable spatio-temporal covariances and variograms has become a key point of research in recent years. The goal of this paper is to present an up-to-date overview of recent spatio-temporal covariance models taking into account the problem of spatial anisotropy. The resulting structures are proved to have certain interesting mathematical properties, together with a considerable applicability. In particular, we focus on the problem of modelling anisotropy through isotropy within components. We present the Bernstein class, and a generalisation of Gneiting’s approach (2002a) to obtain new classes of space–time covariance functions which are spatially anisotropic. We also discuss some methods for building covariance functions that attain negative values. We finally present several differentiation and integration operators acting on particular space–time covariance classes.

20.
An analysis of the 1-stage classification decision with two candidate populations is provided in this paper. When the successive posterior probabilities follow a first-order Markov process, it is shown that the optimal classification rules are greatly simplified. A detailed analysis and example are provided for the important case of multivariate normality with equal covariance matrices.
