Similar Documents
20 similar documents found (search time: 62 ms).
1.
This article introduces principal component analysis for multidimensional sparse functional data, utilizing Gaussian basis functions. Our multidimensional model is estimated by maximizing a penalized log-likelihood function, while previous mixed-type models were estimated by maximum likelihood methods for one-dimensional data. The penalized estimation performs well for our multidimensional model, whereas maximum likelihood methods yield unstable parameter estimates, some of which are infinite. Numerical experiments are conducted to investigate the effectiveness of our method for several types of missing data. The proposed method is applied to handwriting data, which consist of the XY coordinate values recorded during handwriting.
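The abstract gives no implementation details; purely as a rough illustration of the kind of basis representation involved, the sketch below fits one sparsely observed curve with Gaussian basis functions and a simple ridge-type penalty. The centers, width, penalty weight, and data are arbitrary assumptions, not the authors' penalized maximum-likelihood procedure.

import numpy as np

rng = np.random.default_rng(0)

def gaussian_basis(t, centers, width):
    # phi_k(t) = exp(-(t - c_k)^2 / (2 * width^2)), evaluated for all k at once
    return np.exp(-(t[:, None] - centers[None, :]) ** 2 / (2.0 * width ** 2))

# hypothetical sparse observations of a single curve
t_obs = np.array([0.05, 0.20, 0.45, 0.70, 0.90])
y_obs = np.sin(2 * np.pi * t_obs) + 0.05 * rng.standard_normal(t_obs.size)

centers = np.linspace(0, 1, 10)          # assumed basis centers
width, lam = 0.15, 1e-2                  # assumed width and penalty weight
Phi = gaussian_basis(t_obs, centers, width)
coef = np.linalg.solve(Phi.T @ Phi + lam * np.eye(centers.size), Phi.T @ y_obs)

t_grid = np.linspace(0, 1, 200)
curve_hat = gaussian_basis(t_grid, centers, width) @ coef   # reconstructed curve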

2.
Functional principal component analysis (FPCA), as a data reduction technique for a finite number T of functions, can be used to identify the dominant modes of variation in numeric three-way data.

We carry out FPCA on multidimensional probability density functions, relate this method to other standard methods, and define its centered and standardized versions. Building on the relationships between FPCA of densities, FPCA of their corresponding characteristic functions, PCA of the Maclaurin expansions of these characteristic functions, and the dual STATIS method applied to their variance matrices, we propose a method for interpreting the results of the FPCA of densities. This method is based on investigating the relationships between the FPCA scores and the moments associated with the densities.

The method is illustrated using known Gaussian densities. In practice, FPCA of densities deals with observations of multidimensional variables on T occasions. These observations can be used to estimate the T associated densities (i) by estimating the parameters of these densities, assuming that they are Gaussian, or (ii) by using the Gaussian kernel method and choosing the bandwidth matrix by the normal reference rule. Thereafter, the FPCA estimate is derived from these density estimates, and the interpretation method is applied to explore the dominant modes of variation in the types of three-way data encountered in sensory analysis and archaeology.
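As a minimal sketch of step (ii) followed by a discretized FPCA, the code below estimates T one-dimensional densities with Gaussian kernels on a common grid and diagonalizes their empirical covariance. The simulated samples, the grid, and the use of scipy's default bandwidth rule instead of the normal reference rule are simplifying assumptions.

import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
# samples[i] is the data observed on occasion i (T occasions, one-dimensional here)
samples = [rng.normal(loc=m, scale=1.0, size=200) for m in (0.0, 0.5, 1.0, 1.5)]

grid = np.linspace(-4, 6, 400)
dens = np.array([gaussian_kde(x)(grid) for x in samples])   # T x len(grid) density estimates

# centre the densities and diagonalize the empirical covariance (discretized FPCA)
mean_dens = dens.mean(axis=0)
centred = dens - mean_dens
dx = grid[1] - grid[0]
cov = centred.T @ centred * dx / len(samples)
eigval, eigvec = np.linalg.eigh(cov)
eigval, eigvec = eigval[::-1], eigvec[:, ::-1]               # sort in decreasing order

scores = centred @ eigvec[:, :2] * dx                        # first two FPC scores per occasion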

3.
Through the use of a matrix representation for B-splines presented by Qin (Vis. Comput. 16:177–186, 2000), we are able to reexamine calculus operations on B-spline basis functions. In this matrix framework, the problem of generating orthogonal splines is revisited, and we show that this approach can reduce the operations involved to linear matrix operations. We apply these results to a recent paper (Zhou et al. in Biometrika 95:601–619, 2008) on hierarchical functional data analysis using a principal components approach, where a numerical integration scheme was used to orthogonalize a set of B-spline basis functions. These orthogonalized basis functions, along with their estimated derivatives, are then used to construct estimates of mean functions and functional principal components. By applying the methods presented here, such algorithms can benefit from increased speed and precision. An R package is available for the computations.
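The exact matrix formulas are in the cited papers; the sketch below only illustrates the underlying idea of orthogonalizing a cubic B-spline basis through its Gram matrix and a Cholesky factor. The knot sequence and the midpoint-rule quadrature used to form the Gram matrix are assumptions, not the paper's closed-form matrix computation.

import numpy as np
from scipy.interpolate import BSpline

k = 3                                            # cubic B-splines
knots = np.r_[[0.0] * k, np.linspace(0, 1, 8), [1.0] * k]
n_basis = len(knots) - k - 1

ngrid = 2000
grid = (np.arange(ngrid) + 0.5) / ngrid          # midpoint-rule nodes on (0, 1)
B = np.column_stack([BSpline(knots, np.eye(n_basis)[j], k)(grid) for j in range(n_basis)])

G = B.T @ B / ngrid                              # Gram matrix, approx. int B_i(t) B_j(t) dt
L = np.linalg.cholesky(G)
B_orth = B @ np.linalg.inv(L).T                  # columns approximately orthonormal in L2
# check: B_orth.T @ B_orth / ngrid is approximately the identity matrix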

4.
In this paper we introduce the continuous tree mixture model, a mixture of undirected graphical models with tree-structured graphs, viewed as a nonparametric approach to multivariate analysis. We estimate its parameters, the component edge sets, and the mixture proportions through a regularized maximum likelihood procedure. Our new algorithm, which uses the expectation-maximization algorithm and a modified version of Kruskal's algorithm, simultaneously estimates and prunes the mixture component trees. Simulation studies indicate that this method performs better than the alternative Gaussian graphical mixture model. The proposed method is also applied to a water-level data set and compared with the results of a Gaussian mixture model.

5.
The nonparametric component in a partially linear model is approximated via cubic B-splines with a second-order difference penalty on adjacent B-spline coefficients to avoid undersmoothing. A Wald-type spline-based test statistic is constructed for the null hypothesis of no effect of a continuous covariate. When the number of knots is fixed, the limiting null distribution of the test statistic is the distribution of a linear combination of independent chi-squared random variables, each with one degree of freedom. A real-life dataset is provided to illustrate the practical use of the test statistic.
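A minimal sketch of the penalized B-spline fit described here (fixed-knot cubic B-splines with a second-order difference penalty on the coefficients); the simulated data, knot choice, and smoothing parameter are assumptions, and the Wald-type test statistic itself is not reproduced.

import numpy as np
from scipy.interpolate import BSpline

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 1, 120))
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(x.size)

k = 3
knots = np.r_[[0.0] * k, np.linspace(0, 1, 12), [1.0] * k]
n_basis = len(knots) - k - 1
B = np.column_stack([BSpline(knots, np.eye(n_basis)[j], k)(x) for j in range(n_basis)])

D = np.diff(np.eye(n_basis), n=2, axis=0)        # second-order difference operator
lam = 1.0                                        # smoothing parameter (assumed value)
coef = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
fitted = B @ coef                                # penalized spline estimate of the component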

6.
Count data often contain many zeros. In parametric regression analysis of zero-inflated count data, the effect of a covariate of interest is typically modelled via a linear predictor. This approach imposes a restrictive, and potentially questionable, functional form on the relation between the independent and dependent variables. To address the noted restrictions, a flexible parametric procedure is employed to model the covariate effect as a linear combination of fixed-knot cubic basis splines or B-splines. The semiparametric zero-inflated Poisson regression model is fitted by maximizing the likelihood function through an expectation–maximization algorithm. The smooth estimate of the functional form of the covariate effect can enhance modelling flexibility. Within this modelling framework, a log-likelihood ratio test is used to assess the adequacy of the covariate function. Simulation results show that the proposed test has excellent power in detecting the lack of fit of a linear predictor. A real-life data set is used to illustrate the practicality of the methodology.
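For orientation only, the sketch below fits a zero-inflated Poisson model with a B-spline covariate effect by direct numerical maximization of the likelihood rather than the EM algorithm used in the paper; the simulated data, knots, constant zero-inflation probability, and optimizer are all assumptions.

import numpy as np
from scipy.interpolate import BSpline
from scipy.optimize import minimize
from scipy.special import expit, gammaln

rng = np.random.default_rng(3)
n = 400
x = rng.uniform(0, 1, n)
mu_true = np.exp(np.sin(2 * np.pi * x))          # nonlinear covariate effect
z = rng.uniform(size=n) < 0.3                    # structural zeros with probability 0.3
y = np.where(z, 0, rng.poisson(mu_true))

k = 3
knots = np.r_[[0.0] * k, np.linspace(0, 1, 6), [1.0] * k]
p = len(knots) - k - 1
B = np.column_stack([BSpline(knots, np.eye(p)[j], k)(x) for j in range(p)])

def negloglik(theta):
    beta, gamma = theta[:p], theta[p]            # spline coefficients, zero-inflation logit
    mu = np.exp(B @ beta)
    pi = expit(gamma)
    ll_zero = np.log(pi + (1 - pi) * np.exp(-mu))            # observed zeros
    ll_pos = np.log(1 - pi) - mu + y * np.log(mu) - gammaln(y + 1)   # positive counts
    return -np.sum(np.where(y == 0, ll_zero, ll_pos))

res = minimize(negloglik, np.zeros(p + 1), method="BFGS")
beta_hat = res.x[:p]                             # smooth estimate of the covariate effect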

7.
In order to explore and compare a finite number T of data sets by applying functional principal component analysis (FPCA) to the T associated probability density functions, we estimate these density functions using the multivariate kernel method. With the data set sizes fixed, we study the behaviour of this FPCA under the assumption that all the bandwidth matrices used in the density estimation are proportional to a common parameter h and proportional to either the variance matrices or the identity matrix. In this context, we propose a selection criterion for the parameter h that depends only on the data and the FPCA method. Then, on simulated examples, we compare the quality of approximation of the FPCA when the bandwidth matrices are selected using either this criterion or two other classical bandwidth selection methods, namely a plug-in method and a cross-validation method.

8.
Dynamic principal component analysis (DPCA), also known as frequency domain principal component analysis, was developed by Brillinger [Time Series: Data Analysis and Theory, Vol. 36, SIAM, 1981] to decompose multivariate time-series data into a few principal component series. A primary advantage of DPCA is its capability of extracting essential components from the data by reflecting their serial dependence. It is also used to estimate the common component in a dynamic factor model, which is frequently used in econometrics. However, this beneficial property cannot be exploited when missing values are present, and such values should not simply be ignored when estimating the spectral density matrix in the DPCA procedure. Based on a novel combination of conventional DPCA and the self-consistency concept, we propose a DPCA method for the case in which missing values are present. We demonstrate the advantage of the proposed method over some existing imputation methods through Monte Carlo experiments and a real data analysis.
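The sketch below covers only the complete-data ingredient of conventional DPCA, namely a smoothed-periodogram estimate of the spectral density matrix and its leading eigenvalue at each frequency; the simulated series, smoothing span, and moving-average smoother are assumptions, and the paper's self-consistency treatment of missing values is not shown.

import numpy as np

def spectral_density(X, span=5):
    # Smoothed-periodogram estimate of the spectral density matrix of a complete
    # (n, p) multivariate series; returns an array of shape (n_freq, p, p).
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    d = np.fft.rfft(Xc, axis=0)
    I = np.einsum('fi,fj->fij', d, d.conj()) / (2 * np.pi * n)   # periodogram matrices
    kernel = np.ones(2 * span + 1) / (2 * span + 1)              # moving-average smoother
    f = np.empty_like(I)
    for i in range(p):
        for j in range(p):
            f[:, i, j] = np.convolve(I[:, i, j], kernel, mode='same')
    return f

rng = np.random.default_rng(4)
t = np.arange(500)
common = np.sin(2 * np.pi * t / 50)                              # a shared dynamic component
X = np.outer(common, rng.uniform(0.5, 1.5, 4)) + 0.5 * rng.standard_normal((500, 4))

f = spectral_density(X)
# leading dynamic eigenvalue at each Fourier frequency
lead_eig = np.array([np.linalg.eigvalsh(f[k])[-1] for k in range(f.shape[0])])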

9.
In practice, it is not uncommon to encounter the situation in which a discrete response is related to both a functional random variable and multiple real-valued random variables whose impact on the response is nonlinear. In this paper, we consider the generalized partial functional linear additive models (GPFLAM) and present the estimation procedure. In GPFLAM, the nonparametric functions are approximated by polynomial splines and the infinite-dimensional slope function is estimated based on principal component basis function approximations. We obtain the estimator by maximizing the quasi-likelihood function. We investigate the finite sample properties of the estimation procedure via Monte Carlo simulation studies and illustrate our proposed model by a real data analysis.

10.
Tanaka (1988) derived two influence functions related to an ordinary eigenvalue problem \((A - \lambda_s I)v_s = 0\) of a real symmetric matrix A and used them for sensitivity analysis in principal component analysis. One of these influence functions was used to develop sensitivity analysis in factor analysis (see e.g. Tanaka and Odaka, 1988a). The present paper derives some additional influence functions related to the ordinary eigenvalue problem and also several influence functions related to a generalized eigenvalue problem \((A - \theta_s B)u_s = 0\), where A and B are real symmetric and real symmetric positive definite matrices, respectively. These influence functions are applicable not only to the case where the eigenvalues of interest are all simple but also to the case where there are some multiple eigenvalues among those of interest.
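For reference, both eigenvalue problems mentioned here can be solved numerically with standard routines; the small random matrices below are purely illustrative.

import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(5)
M = rng.standard_normal((5, 5)); A = (M + M.T) / 2             # real symmetric A
N = rng.standard_normal((5, 5)); B = N @ N.T + 5 * np.eye(5)   # symmetric positive definite B

lam, V = np.linalg.eigh(A)        # ordinary problem  (A - lambda_s I) v_s = 0
theta, U = eigh(A, B)             # generalized problem (A - theta_s B) u_s = 0
assert np.allclose(A @ U[:, 0], theta[0] * B @ U[:, 0])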

11.
The nonparametric component in a partially linear model is estimated by a linear combination of fixed-knot cubic B-splines with a second-order difference penalty on adjacent B-spline coefficients. The resulting penalized least-squares estimator is used to construct two Wald-type spline-based test statistics for the null hypothesis of the linearity of the nonparametric function. When the number of knots is fixed, the first test statistic asymptotically has the distribution of a linear combination of independent chi-squared random variables, each with one degree of freedom, under the null hypothesis. The smoothing parameter is determined by specifying a value for the asymptotically expected value of the test statistic under the null hypothesis. When the number of knots is fixed and under the null hypothesis, the second test statistic asymptotically has a chi-squared distribution with K = q + 2 degrees of freedom, where q is the number of knots used for estimation. The power performances of the two proposed tests are investigated via simulation experiments, and the practicality of the proposed methodology is illustrated using a real-life data set.

12.
Negative-binomial (NB) regression models have been widely used for the analysis of count data displaying substantial overdispersion (extra-Poisson variation). However, no formal lack-of-fit tests for a postulated parametric model of a covariate effect have been proposed. Therefore, a flexible parametric procedure is used to model the covariate effect as a linear combination of fixed-knot cubic basis splines or B-splines. Within the proposed modeling framework, a log-likelihood ratio test is constructed to evaluate the adequacy of a postulated parametric form of the covariate effect. Simulation experiments are conducted to study the power performance of the proposed test.
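As a rough illustration of such a lack-of-fit comparison (not the paper's procedure), the sketch below fits NB generalized linear models with a linear and a fixed-knot B-spline covariate effect and forms a log-likelihood ratio; the simulated data, the fixed dispersion parameter alpha, the spline df, and the reference chi-squared distribution are all assumptions.

import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from scipy.stats import chi2

rng = np.random.default_rng(6)
x = rng.uniform(0, 1, 500)
mu = np.exp(1.0 + np.sin(2 * np.pi * x))                  # nonlinear covariate effect
y = rng.negative_binomial(n=2, p=2 / (2 + mu))            # overdispersed counts with mean mu
df = pd.DataFrame({'y': y, 'x': x})

fam = sm.families.NegativeBinomial(alpha=0.5)             # dispersion treated as known here
fit_lin = smf.glm('y ~ x', data=df, family=fam).fit()     # postulated linear predictor
fit_spl = smf.glm('y ~ bs(x, df=6)', data=df, family=fam).fit()   # flexible B-spline effect

lr = 2 * (fit_spl.llf - fit_lin.llf)                      # log-likelihood ratio statistic
p_value = chi2.sf(lr, df=fit_spl.df_model - fit_lin.df_model)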

13.
We develop functional data analysis techniques using the differential geometry of a manifold of smooth elastic functions on an interval, in which the functions are represented by a log-speed function and an angle function. The manifold's geometry provides a method for computing a sample mean function and principal components on tangent spaces. Using tangent principal component analysis, we estimate probability models for functional data and apply them to functional analysis of variance, discriminant analysis, and clustering. We demonstrate these tasks using a collection of growth curves of children aged 1–18.

14.
We study the Bayesian solution of a linear inverse problem in a separable Hilbert space setting with Gaussian prior and noise distribution. Our contribution is to propose a new Bayes estimator which is a linear and continuous estimator on the whole space and is stronger than the mean of the exact Gaussian posterior distribution, which is only defined as a measurable linear transformation. Our estimator is the mean of a slightly modified posterior distribution called the regularized posterior distribution. Frequentist consistency of our estimator and of the regularized posterior distribution is proved. A Monte Carlo study and an application to real data confirm good small-sample properties of our procedure.
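The paper's construction is infinite-dimensional; purely as a finite-dimensional analogue, the sketch below computes the conjugate-Gaussian posterior mean for a discretized linear inverse problem with an added ridge term playing the role of the extra regularization. The forward operator, prior covariance, noise level, and regularization weight are all assumptions.

import numpy as np

rng = np.random.default_rng(7)
m, n = 30, 50
K = rng.standard_normal((m, n)) / np.sqrt(n)         # discretized forward operator
x_true = np.sin(np.linspace(0, np.pi, n))
sigma = 0.05
y = K @ x_true + sigma * rng.standard_normal(m)

Sigma0 = np.eye(n)                                    # Gaussian prior covariance (assumed)
alpha = 1e-2                                          # extra regularization weight (assumed)

# mean of the (regularized) Gaussian posterior in the discretized setting
A = K.T @ K / sigma**2 + np.linalg.inv(Sigma0) + alpha * np.eye(n)
x_hat = np.linalg.solve(A, K.T @ y / sigma**2)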

15.
When functional data are not homogeneous, for example, when there are multiple classes of functional curves in the dataset, traditional estimation methods may fail. In this article, we propose a new estimation procedure for the mixture of Gaussian processes, to incorporate both the functional and the inhomogeneous properties of the data. Our method can be viewed as a natural extension of high-dimensional normal mixtures. However, the key difference is that smoothed structures are imposed on both the mean and covariance functions. The model is shown to be identifiable, and can be estimated efficiently by a combination of ideas from the expectation-maximization (EM) algorithm, kernel regression, and functional principal component analysis. Our methodology is empirically justified by Monte Carlo simulations and illustrated by an analysis of a supermarket dataset.

16.
This paper focuses on the analysis of spatially correlated functional data. We propose a parametric model for spatial correlation, in which the between-curve correlation is modeled by correlating functional principal component scores of the functional data. Additionally, in the sparse observation framework, we propose a novel approach of spatial principal analysis by conditional expectation to explicitly estimate spatial correlations and reconstruct individual curves. Assuming spatial stationarity, empirical spatial correlations are calculated as the ratio of the eigenvalues of the smoothed covariance surface Cov\((X_i(s),X_i(t))\) and cross-covariance surface Cov\((X_i(s), X_j(t))\) at locations indexed by i and j. An anisotropic Matérn spatial correlation model is then fitted to the empirical correlations. Finally, principal component scores are estimated to reconstruct the sparsely observed curves. This framework can naturally accommodate arbitrary covariance structures, but there is an enormous reduction in computation if one can assume the separability of temporal and spatial components. We demonstrate the consistency of our estimates and propose hypothesis tests to examine the separability as well as the isotropy of the spatial correlation. Using simulation studies, we show that these methods have some clear advantages over existing methods of curve reconstruction and estimation of model parameters.
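As a small illustration of the correlation-model fitting step only, the sketch below fits an isotropic Matérn correlation with smoothness 3/2 to hypothetical empirical spatial correlations; the distances, correlation values, and the isotropic simplification (the paper fits an anisotropic Matérn model) are assumptions.

import numpy as np
from scipy.optimize import curve_fit

def matern32(d, rho):
    # isotropic Matern correlation with smoothness 3/2 and range parameter rho
    a = np.sqrt(3.0) * d / rho
    return (1.0 + a) * np.exp(-a)

# hypothetical empirical correlations between pairs of locations at given distances
dist = np.array([0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
emp_corr = np.array([0.85, 0.68, 0.55, 0.41, 0.26, 0.15, 0.05])

popt, _ = curve_fit(matern32, dist, emp_corr, p0=[1.0], bounds=(1e-6, np.inf))
rho_hat = popt[0]                                 # fitted range parameter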

17.
In this article, we propose a penalized local log-likelihood method to locally select the number of components in nonparametric finite mixture of regression models via a proportion shrinkage method. Mean functions and variance functions are estimated simultaneously. We show that the number of components can be estimated consistently, and further establish asymptotic normality of the functional estimates. We use a modified EM algorithm to estimate the unknown functions. Simulations are conducted to demonstrate the performance of the proposed method. We illustrate our method via an empirical analysis of the housing price index data of the United States.

18.
Motivated by problems that arise in dose-response curve estimation, we developed a new method to estimate a monotone curve. The resulting monotone estimator is obtained by combining techniques from smoothing splines with nonnegativity properties of cubic B-splines. Numerical experiments are given to exemplify the method.
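A minimal sketch of one way to obtain such a monotone fit, using the fact that a cubic B-spline with nondecreasing coefficients is nondecreasing and imposing nonnegativity on the coefficient increments via bounded least squares; the simulated dose-response data, knots, and the absence of an explicit smoothing penalty are assumptions, and this is not necessarily the authors' exact construction.

import numpy as np
from scipy.interpolate import BSpline
from scipy.optimize import lsq_linear

rng = np.random.default_rng(8)
x = np.sort(rng.uniform(0, 10, 80))               # e.g. dose levels
y = 1 / (1 + np.exp(-(x - 5))) + 0.05 * rng.standard_normal(x.size)   # noisy monotone response

k = 3
knots = np.r_[[0.0] * k, np.linspace(0, 10, 8), [10.0] * k]
p = len(knots) - k - 1
B = np.column_stack([BSpline(knots, np.eye(p)[j], k)(x) for j in range(p)])

# A cubic B-spline with nondecreasing coefficients is nondecreasing, so write
# beta = C @ theta with theta_1 free and the remaining increments theta_j >= 0.
C = np.tril(np.ones((p, p)))
lower = np.r_[-np.inf, np.zeros(p - 1)]
res = lsq_linear(B @ C, y, bounds=(lower, np.full(p, np.inf)))
beta_hat = C @ res.x                               # coefficients of the monotone spline
fit = B @ beta_hat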

19.
We review and extend some statistical tools that have proved useful for analysing functional data. Functional data analysis is primarily designed for the analysis of random trajectories and infinite-dimensional data, and there exists a need for the development of adequate statistical estimation and inference techniques. While this field is in flux, some methods have proven useful. These include warping methods, functional principal component analysis, and conditioning under Gaussian assumptions for the case of sparse data. The latter is a recent development that may provide a bridge between functional and more classical longitudinal data analysis. Besides presenting a brief review of functional principal components and functional regression, we develop some concepts for estimating functional principal component scores in the sparse situation. An extension of the so-called generalized functional linear model to the case of sparse longitudinal predictors is proposed. This extension includes functional binary regression models for longitudinal data and is illustrated with data on primary biliary cirrhosis.
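The conditioning idea for sparse data mentioned here amounts to a best linear predictor of the principal component scores under Gaussian assumptions; the sketch below evaluates that formula for a single curve, with the mean, eigenfunctions, eigenvalues, and error variance taken as given (in practice they are estimated from the pooled data), and the numbers are purely illustrative.

import numpy as np

def conditional_scores(y_i, mu_i, Phi_i, lam, sigma2):
    # Best linear predictor of the FPC scores of one sparsely observed curve:
    #   y_i    observations at the curve's own time points (length n_i)
    #   mu_i   mean function at those time points
    #   Phi_i  n_i x K matrix of eigenfunctions at those time points
    #   lam    K eigenvalues, sigma2 measurement-error variance
    Sigma_yi = Phi_i @ np.diag(lam) @ Phi_i.T + sigma2 * np.eye(len(y_i))
    return np.diag(lam) @ Phi_i.T @ np.linalg.solve(Sigma_yi, y_i - mu_i)

# illustrative call with made-up components
Phi_i = np.array([[1.0, 0.5], [0.9, -0.2], [0.7, -0.8]])
scores = conditional_scores(np.array([1.2, 0.8, 0.1]), np.zeros(3),
                            Phi_i, np.array([2.0, 0.5]), 0.1)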

20.
The paper considers the problem of testing the equality of two covariance operators. Using functional principal component analysis, a method is suggested for testing the equality of the K largest eigenvalues and the corresponding eigenfunctions, together with its generalization to a corresponding change-point problem. Asymptotic distributions of the test statistics are presented.
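For orientation, the sketch below computes the K largest eigenvalues of the two sample covariance operators on a common grid, which are the ingredients such a test compares; the simulated curves and K are assumptions, and the actual test statistics and their asymptotic distributions are given in the paper.

import numpy as np

def top_eigen(curves, K):
    # leading eigenvalues/eigenvectors of the sample covariance of curves observed
    # on a common grid (rows = curves, columns = grid points), up to grid spacing
    centred = curves - curves.mean(axis=0)
    cov = centred.T @ centred / curves.shape[0]
    vals, vecs = np.linalg.eigh(cov)
    return vals[::-1][:K], vecs[:, ::-1][:, :K]

rng = np.random.default_rng(9)
grid = np.linspace(0, 1, 100)
sample1 = np.array([a * np.sin(np.pi * grid) + b * np.cos(np.pi * grid)
                    for a, b in rng.standard_normal((50, 2)) * [2.0, 1.0]])
sample2 = np.array([a * np.sin(np.pi * grid) + b * np.cos(np.pi * grid)
                    for a, b in rng.standard_normal((50, 2)) * [2.0, 1.0]])

vals1, _ = top_eigen(sample1, K=2)
vals2, _ = top_eigen(sample2, K=2)
diff = vals1 - vals2          # compared by the test for equality of the K largest eigenvalues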

