Similar Documents
20 similar documents found (search time: 62 ms).
1.
This article introduces principal component analysis for multidimensional sparse functional data, utilizing Gaussian basis functions. Our multidimensional model is estimated by maximizing a penalized log-likelihood function, while previous mixed-type models were estimated by maximum likelihood methods for one-dimensional data. The penalized estimation performs well for our multidimensional model, whereas maximum likelihood methods yield unstable parameter estimates, some of which are infinite. Numerical experiments are conducted to investigate the effectiveness of our method for several types of missing data. The proposed method is applied to handwriting data, which consist of the XY coordinate values recorded during handwriting.
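The abstract gives no implementation details; purely as a rough illustration of the kind of basis representation involved, the sketch below fits one sparsely observed curve with Gaussian basis functions and a simple ridge-type penalty. The centers, width, penalty weight, and data are arbitrary assumptions, not the authors' penalized maximum-likelihood procedure.

import numpy as np

rng = np.random.default_rng(0)

def gaussian_basis(t, centers, width):
    # phi_k(t) = exp(-(t - c_k)^2 / (2 * width^2)), evaluated for all k at once
    return np.exp(-(t[:, None] - centers[None, :]) ** 2 / (2.0 * width ** 2))

# hypothetical sparse observations of a single curve
t_obs = np.array([0.05, 0.20, 0.45, 0.70, 0.90])
y_obs = np.sin(2 * np.pi * t_obs) + 0.05 * rng.standard_normal(t_obs.size)

centers = np.linspace(0, 1, 10)          # assumed basis centers
width, lam = 0.15, 1e-2                  # assumed width and penalty weight
Phi = gaussian_basis(t_obs, centers, width)
coef = np.linalg.solve(Phi.T @ Phi + lam * np.eye(centers.size), Phi.T @ y_obs)

t_grid = np.linspace(0, 1, 200)
curve_hat = gaussian_basis(t_grid, centers, width) @ coef   # reconstructed curve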

2.
Functional principal component analysis (FPCA), as a data reduction technique for a finite number T of functions, can be used to identify the dominant modes of variation in numeric three-way data.

We carry out FPCA on multidimensional probability density functions, relate this method to other standard methods, and define its centered and standardized versions. Building on the relationships between FPCA of densities, FPCA of their corresponding characteristic functions, PCA of the Maclaurin expansions of these characteristic functions, and the dual STATIS method applied to their variance matrices, we propose a method for interpreting the results of the FPCA of densities. This method is based on investigating the relationships between the FPCA scores and the moments associated with the densities.

The method is illustrated using known Gaussian densities. In practice, FPCA of densities deals with observations of multidimensional variables on T occasions. These observations can be used to estimate the T associated densities (i) by estimating the parameters of these densities, assuming that they are Gaussian, or (ii) by using the Gaussian kernel method and choosing the bandwidth matrix by the normal reference rule. Thereafter, the FPCA estimate is derived from these density estimates, and the interpretation method is applied to explore the dominant modes of variation in the types of three-way data encountered in sensory analysis and archaeology.
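As a minimal sketch of step (ii) followed by a discretized FPCA, the code below estimates T one-dimensional densities with Gaussian kernels on a common grid and diagonalizes their empirical covariance. The simulated samples, the grid, and the use of scipy's default bandwidth rule instead of the normal reference rule are simplifying assumptions.

import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
# samples[i] is the data observed on occasion i (T occasions, one-dimensional here)
samples = [rng.normal(loc=m, scale=1.0, size=200) for m in (0.0, 0.5, 1.0, 1.5)]

grid = np.linspace(-4, 6, 400)
dens = np.array([gaussian_kde(x)(grid) for x in samples])   # T x len(grid) density estimates

# centre the densities and diagonalize the empirical covariance (discretized FPCA)
mean_dens = dens.mean(axis=0)
centred = dens - mean_dens
dx = grid[1] - grid[0]
cov = centred.T @ centred * dx / len(samples)
eigval, eigvec = np.linalg.eigh(cov)
eigval, eigvec = eigval[::-1], eigvec[:, ::-1]               # sort in decreasing order

scores = centred @ eigvec[:, :2] * dx                        # first two FPC scores per occasion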

3.
Through the use of a matrix representation for B-splines presented by Qin (Vis. Comput. 16:177–186, 2000), we are able to reexamine calculus operations on B-spline basis functions. In this matrix framework, the problem of generating orthogonal splines is revisited, and we show that this approach can reduce the operations involved to linear matrix operations. We apply these results to a recent paper (Zhou et al. in Biometrika 95:601–619, 2008) on hierarchical functional data analysis using a principal components approach, where a numerical integration scheme was used to orthogonalize a set of B-spline basis functions. These orthogonalized basis functions, along with their estimated derivatives, are then used to construct estimates of mean functions and functional principal components. By applying the methods presented here, such algorithms can benefit from increased speed and precision. An R package is available for the computations.
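The exact matrix formulas are in the cited papers; the sketch below only illustrates the underlying idea of orthogonalizing a cubic B-spline basis through its Gram matrix and a Cholesky factor. The knot sequence and the midpoint-rule quadrature used to form the Gram matrix are assumptions, not the paper's closed-form matrix computation.

import numpy as np
from scipy.interpolate import BSpline

k = 3                                            # cubic B-splines
knots = np.r_[[0.0] * k, np.linspace(0, 1, 8), [1.0] * k]
n_basis = len(knots) - k - 1

ngrid = 2000
grid = (np.arange(ngrid) + 0.5) / ngrid          # midpoint-rule nodes on (0, 1)
B = np.column_stack([BSpline(knots, np.eye(n_basis)[j], k)(grid) for j in range(n_basis)])

G = B.T @ B / ngrid                              # Gram matrix, approx. int B_i(t) B_j(t) dt
L = np.linalg.cholesky(G)
B_orth = B @ np.linalg.inv(L).T                  # columns approximately orthonormal in L2
# check: B_orth.T @ B_orth / ngrid is approximately the identity matrix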

4.
In this paper we introduce the continuous tree mixture model, a mixture of undirected graphical models with tree-structured graphs, viewed as a nonparametric approach to multivariate analysis. We estimate its parameters, the component edge sets, and the mixture proportions through a regularized maximum likelihood procedure. Our new algorithm, which uses the expectation-maximization algorithm and a modified version of Kruskal's algorithm, simultaneously estimates and prunes the mixture component trees. Simulation studies indicate that this method performs better than the alternative Gaussian graphical mixture model. The proposed method is also applied to a water-level data set and compared with the results of a Gaussian mixture model.

5.
The nonparametric component in a partially linear model is approximated via cubic B-splines with a second-order difference penalty on adjacent B-spline coefficients to avoid undersmoothing. A Wald-type spline-based test statistic is constructed for the null hypothesis of no effect of a continuous covariate. When the number of knots is fixed, the limiting null distribution of the test statistic is the distribution of a linear combination of independent chi-squared random variables, each with one degree of freedom. A real-life dataset is provided to illustrate the practical use of the test statistic.
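A minimal sketch of the penalized B-spline fit described here (fixed-knot cubic B-splines with a second-order difference penalty on the coefficients); the simulated data, knot choice, and smoothing parameter are assumptions, and the Wald-type test statistic itself is not reproduced.

import numpy as np
from scipy.interpolate import BSpline

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 1, 120))
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(x.size)

k = 3
knots = np.r_[[0.0] * k, np.linspace(0, 1, 12), [1.0] * k]
n_basis = len(knots) - k - 1
B = np.column_stack([BSpline(knots, np.eye(n_basis)[j], k)(x) for j in range(n_basis)])

D = np.diff(np.eye(n_basis), n=2, axis=0)        # second-order difference operator
lam = 1.0                                        # smoothing parameter (assumed value)
coef = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
fitted = B @ coef                                # penalized spline estimate of the component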

6.
Count data often contain many zeros. In parametric regression analysis of zero-inflated count data, the effect of a covariate of interest is typically modelled via a linear predictor. This approach imposes a restrictive, and potentially questionable, functional form on the relation between the independent and dependent variables. To address the noted restrictions, a flexible parametric procedure is employed to model the covariate effect as a linear combination of fixed-knot cubic basis splines or B-splines. The semiparametric zero-inflated Poisson regression model is fitted by maximizing the likelihood function through an expectation–maximization algorithm. The smooth estimate of the functional form of the covariate effect can enhance modelling flexibility. Within this modelling framework, a log-likelihood ratio test is used to assess the adequacy of the covariate function. Simulation results show that the proposed test has excellent power in detecting the lack of fit of a linear predictor. A real-life data set is used to illustrate the practicality of the methodology.
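For orientation only, the sketch below fits a zero-inflated Poisson model with a B-spline covariate effect by direct numerical maximization of the likelihood rather than the EM algorithm used in the paper; the simulated data, knots, constant zero-inflation probability, and optimizer are all assumptions.

import numpy as np
from scipy.interpolate import BSpline
from scipy.optimize import minimize
from scipy.special import expit, gammaln

rng = np.random.default_rng(3)
n = 400
x = rng.uniform(0, 1, n)
mu_true = np.exp(np.sin(2 * np.pi * x))          # nonlinear covariate effect
z = rng.uniform(size=n) < 0.3                    # structural zeros with probability 0.3
y = np.where(z, 0, rng.poisson(mu_true))

k = 3
knots = np.r_[[0.0] * k, np.linspace(0, 1, 6), [1.0] * k]
p = len(knots) - k - 1
B = np.column_stack([BSpline(knots, np.eye(p)[j], k)(x) for j in range(p)])

def negloglik(theta):
    beta, gamma = theta[:p], theta[p]            # spline coefficients, zero-inflation logit
    mu = np.exp(B @ beta)
    pi = expit(gamma)
    ll_zero = np.log(pi + (1 - pi) * np.exp(-mu))            # observed zeros
    ll_pos = np.log(1 - pi) - mu + y * np.log(mu) - gammaln(y + 1)   # positive counts
    return -np.sum(np.where(y == 0, ll_zero, ll_pos))

res = minimize(negloglik, np.zeros(p + 1), method="BFGS")
beta_hat = res.x[:p]                             # smooth estimate of the covariate effect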

7.
In order to explore and compare a finite number T of data sets by applying functional principal component analysis (FPCA) to the T associated probability density functions, we estimate these density functions using the multivariate kernel method. With the data set sizes fixed, we study the behaviour of this FPCA under the assumption that all the bandwidth matrices used in the density estimation are proportional to a common parameter h and proportional to either the variance matrices or the identity matrix. In this context, we propose a selection criterion for the parameter h that depends only on the data and the FPCA method. Then, on simulated examples, we compare the quality of approximation of the FPCA when the bandwidth matrices are selected using either this criterion or two other classical bandwidth selection methods, namely a plug-in method and a cross-validation method.

8.
Dynamic principal component analysis (DPCA), also known as frequency domain principal component analysis, was developed by Brillinger [Time Series: Data Analysis and Theory, Vol. 36, SIAM, 1981] to decompose multivariate time-series data into a few principal component series. A primary advantage of DPCA is its capability of extracting essential components from the data by reflecting their serial dependence. It is also used to estimate the common component in a dynamic factor model, which is frequently used in econometrics. However, this beneficial property cannot be exploited when missing values are present, and such values should not simply be ignored when estimating the spectral density matrix in the DPCA procedure. Based on a novel combination of conventional DPCA and the self-consistency concept, we propose a DPCA method for the case in which missing values are present. We demonstrate the advantage of the proposed method over some existing imputation methods through Monte Carlo experiments and a real data analysis.
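The sketch below covers only the complete-data ingredient of conventional DPCA, namely a smoothed-periodogram estimate of the spectral density matrix and its leading eigenvalue at each frequency; the simulated series, smoothing span, and moving-average smoother are assumptions, and the paper's self-consistency treatment of missing values is not shown.

import numpy as np

def spectral_density(X, span=5):
    # Smoothed-periodogram estimate of the spectral density matrix of a complete
    # (n, p) multivariate series; returns an array of shape (n_freq, p, p).
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    d = np.fft.rfft(Xc, axis=0)
    I = np.einsum('fi,fj->fij', d, d.conj()) / (2 * np.pi * n)   # periodogram matrices
    kernel = np.ones(2 * span + 1) / (2 * span + 1)              # moving-average smoother
    f = np.empty_like(I)
    for i in range(p):
        for j in range(p):
            f[:, i, j] = np.convolve(I[:, i, j], kernel, mode='same')
    return f

rng = np.random.default_rng(4)
t = np.arange(500)
common = np.sin(2 * np.pi * t / 50)                              # a shared dynamic component
X = np.outer(common, rng.uniform(0.5, 1.5, 4)) + 0.5 * rng.standard_normal((500, 4))

f = spectral_density(X)
# leading dynamic eigenvalue at each Fourier frequency
lead_eig = np.array([np.linalg.eigvalsh(f[k])[-1] for k in range(f.shape[0])])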

9.
In practice, it is not uncommon to encounter the situation in which a discrete response is related to both a functional random variable and multiple real-valued random variables whose impact on the response is nonlinear. In this paper, we consider the generalized partial functional linear additive models (GPFLAM) and present the estimation procedure. In GPFLAM, the nonparametric functions are approximated by polynomial splines and the infinite-dimensional slope function is estimated based on principal component basis function approximations. We obtain the estimator by maximizing the quasi-likelihood function. We investigate the finite sample properties of the estimation procedure via Monte Carlo simulation studies and illustrate our proposed model by a real data analysis.

10.
Tanaka (1988) derived two influence functions related to an ordinary eigenvalue problem \((A - \lambda_s I)v_s = 0\) of a real symmetric matrix A and used them for sensitivity analysis in principal component analysis. One of these influence functions was used to develop sensitivity analysis in factor analysis (see e.g. Tanaka and Odaka, 1988a). The present paper derives some additional influence functions related to the ordinary eigenvalue problem and also several influence functions related to a generalized eigenvalue problem \((A - \theta_s B)u_s = 0\), where A and B are real symmetric and real symmetric positive definite matrices, respectively. These influence functions are applicable not only to the case where the eigenvalues of interest are all simple but also to the case where there are some multiple eigenvalues among those of interest.
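For reference, both eigenvalue problems mentioned here can be solved numerically with standard routines; the small random matrices below are purely illustrative.

import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(5)
M = rng.standard_normal((5, 5)); A = (M + M.T) / 2             # real symmetric A
N = rng.standard_normal((5, 5)); B = N @ N.T + 5 * np.eye(5)   # symmetric positive definite B

lam, V = np.linalg.eigh(A)        # ordinary problem  (A - lambda_s I) v_s = 0
theta, U = eigh(A, B)             # generalized problem (A - theta_s B) u_s = 0
assert np.allclose(A @ U[:, 0], theta[0] * B @ U[:, 0])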

11.
The nonparametric component in a partially linear model is estimated by a linear combination of fixed-knot cubic B-splines with a second-order difference penalty on adjacent B-spline coefficients. The resulting penalized least-squares estimator is used to construct two Wald-type spline-based test statistics for the null hypothesis of the linearity of the nonparametric function. When the number of knots is fixed, the first test statistic asymptotically has the distribution of a linear combination of independent chi-squared random variables, each with one degree of freedom, under the null hypothesis. The smoothing parameter is determined by specifying a value for the asymptotically expected value of the test statistic under the null hypothesis. When the number of knots is fixed and under the null hypothesis, the second test statistic asymptotically has a chi-squared distribution with K = q + 2 degrees of freedom, where q is the number of knots used for estimation. The power performances of the two proposed tests are investigated via simulation experiments, and the practicality of the proposed methodology is illustrated using a real-life data set.

12.
Negative-binomial (NB) regression models have been widely used for the analysis of count data displaying substantial overdispersion (extra-Poisson variation). However, no formal lack-of-fit tests for a postulated parametric model of a covariate effect have been proposed. Therefore, a flexible parametric procedure is used to model the covariate effect as a linear combination of fixed-knot cubic basis splines or B-splines. Within the proposed modeling framework, a log-likelihood ratio test is constructed to evaluate the adequacy of a postulated parametric form of the covariate effect. Simulation experiments are conducted to study the power performance of the proposed test.
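As a rough illustration of such a lack-of-fit comparison (not the paper's procedure), the sketch below fits NB generalized linear models with a linear and a fixed-knot B-spline covariate effect and forms a log-likelihood ratio; the simulated data, the fixed dispersion parameter alpha, the spline df, and the reference chi-squared distribution are all assumptions.

import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from scipy.stats import chi2

rng = np.random.default_rng(6)
x = rng.uniform(0, 1, 500)
mu = np.exp(1.0 + np.sin(2 * np.pi * x))                  # nonlinear covariate effect
y = rng.negative_binomial(n=2, p=2 / (2 + mu))            # overdispersed counts with mean mu
df = pd.DataFrame({'y': y, 'x': x})

fam = sm.families.NegativeBinomial(alpha=0.5)             # dispersion treated as known here
fit_lin = smf.glm('y ~ x', data=df, family=fam).fit()     # postulated linear predictor
fit_spl = smf.glm('y ~ bs(x, df=6)', data=df, family=fam).fit()   # flexible B-spline effect

lr = 2 * (fit_spl.llf - fit_lin.llf)                      # log-likelihood ratio statistic
p_value = chi2.sf(lr, df=fit_spl.df_model - fit_lin.df_model)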

13.
We develop functional data analysis techniques using the differential geometry of a manifold of smooth elastic functions on an interval, in which the functions are represented by a log-speed function and an angle function. The manifold's geometry provides a method for computing a sample mean function and principal components on tangent spaces. Using tangent principal component analysis, we estimate probability models for functional data and apply them to functional analysis of variance, discriminant analysis, and clustering. We demonstrate these tasks using a collection of growth curves of children aged 1–18.

14.
We study the Bayesian solution of a linear inverse problem in a separable Hilbert space setting with Gaussian prior and noise distribution. Our contribution is to propose a new Bayes estimator which is a linear and continuous estimator on the whole space and is stronger than the mean of the exact Gaussian posterior distribution, which is only defined as a measurable linear transformation. Our estimator is the mean of a slightly modified posterior distribution called the regularized posterior distribution. Frequentist consistency of our estimator and of the regularized posterior distribution is proved. A Monte Carlo study and an application to real data confirm good small-sample properties of our procedure.
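The paper's construction is infinite-dimensional; purely as a finite-dimensional analogue, the sketch below computes the conjugate-Gaussian posterior mean for a discretized linear inverse problem with an added ridge term playing the role of the extra regularization. The forward operator, prior covariance, noise level, and regularization weight are all assumptions.

import numpy as np

rng = np.random.default_rng(7)
m, n = 30, 50
K = rng.standard_normal((m, n)) / np.sqrt(n)         # discretized forward operator
x_true = np.sin(np.linspace(0, np.pi, n))
sigma = 0.05
y = K @ x_true + sigma * rng.standard_normal(m)

Sigma0 = np.eye(n)                                    # Gaussian prior covariance (assumed)
alpha = 1e-2                                          # extra regularization weight (assumed)

# mean of the (regularized) Gaussian posterior in the discretized setting
A = K.T @ K / sigma**2 + np.linalg.inv(Sigma0) + alpha * np.eye(n)
x_hat = np.linalg.solve(A, K.T @ y / sigma**2)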

15.
When functional data are not homogeneous, for example, when there are multiple classes of functional curves in the dataset, traditional estimation methods may fail. In this article, we propose a new estimation procedure for the mixture of Gaussian processes, to incorporate both the functional and the inhomogeneous properties of the data. Our method can be viewed as a natural extension of high-dimensional normal mixtures. However, the key difference is that smoothed structures are imposed on both the mean and covariance functions. The model is shown to be identifiable, and can be estimated efficiently by a combination of ideas from the expectation-maximization (EM) algorithm, kernel regression, and functional principal component analysis. Our methodology is empirically justified by Monte Carlo simulations and illustrated by an analysis of a supermarket dataset.

16.
This paper focuses on the analysis of spatially correlated functional data. We propose a parametric model for spatial correlation, in which the between-curve correlation is modeled by correlating functional principal component scores of the functional data. Additionally, in the sparse observation framework, we propose a novel approach of spatial principal analysis by conditional expectation to explicitly estimate spatial correlations and reconstruct individual curves. Assuming spatial stationarity, empirical spatial correlations are calculated as the ratio of the eigenvalues of the smoothed covariance surface Cov\((X_i(s),X_i(t))\) and cross-covariance surface Cov\((X_i(s), X_j(t))\) at locations indexed by i and j. An anisotropic Matérn spatial correlation model is then fitted to the empirical correlations. Finally, principal component scores are estimated to reconstruct the sparsely observed curves. This framework can naturally accommodate arbitrary covariance structures, but there is an enormous reduction in computation if one can assume the separability of temporal and spatial components. We demonstrate the consistency of our estimates and propose hypothesis tests to examine the separability as well as the isotropy of the spatial correlation. Using simulation studies, we show that these methods have some clear advantages over existing methods of curve reconstruction and estimation of model parameters.
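As a small illustration of the correlation-model fitting step only, the sketch below fits an isotropic Matérn correlation with smoothness 3/2 to hypothetical empirical spatial correlations; the distances, correlation values, and the isotropic simplification (the paper fits an anisotropic Matérn model) are assumptions.

import numpy as np
from scipy.optimize import curve_fit

def matern32(d, rho):
    # isotropic Matern correlation with smoothness 3/2 and range parameter rho
    a = np.sqrt(3.0) * d / rho
    return (1.0 + a) * np.exp(-a)

# hypothetical empirical correlations between pairs of locations at given distances
dist = np.array([0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
emp_corr = np.array([0.85, 0.68, 0.55, 0.41, 0.26, 0.15, 0.05])

popt, _ = curve_fit(matern32, dist, emp_corr, p0=[1.0], bounds=(1e-6, np.inf))
rho_hat = popt[0]                                 # fitted range parameter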

17.
In this article, we propose a penalized local log-likelihood method to locally select the number of components in nonparametric finite mixture of regression models via a proportion shrinkage method. Mean functions and variance functions are estimated simultaneously. We show that the number of components can be estimated consistently, and further establish asymptotic normality of the functional estimates. We use a modified EM algorithm to estimate the unknown functions. Simulations are conducted to demonstrate the performance of the proposed method. We illustrate our method via an empirical analysis of the housing price index data of the United States.

18.
Motivated by problems that arise in dose-response curve estimation, we developed a new method to estimate a monotone curve. The resulting monotone estimator is obtained by combining techniques from smoothing splines with nonnegativity properties of cubic B-splines. Numerical experiments are given to exemplify the method.
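A minimal sketch of one way to obtain such a monotone fit, using the fact that a cubic B-spline with nondecreasing coefficients is nondecreasing and imposing nonnegativity on the coefficient increments via bounded least squares; the simulated dose-response data, knots, and the absence of an explicit smoothing penalty are assumptions, and this is not necessarily the authors' exact construction.

import numpy as np
from scipy.interpolate import BSpline
from scipy.optimize import lsq_linear

rng = np.random.default_rng(8)
x = np.sort(rng.uniform(0, 10, 80))               # e.g. dose levels
y = 1 / (1 + np.exp(-(x - 5))) + 0.05 * rng.standard_normal(x.size)   # noisy monotone response

k = 3
knots = np.r_[[0.0] * k, np.linspace(0, 10, 8), [10.0] * k]
p = len(knots) - k - 1
B = np.column_stack([BSpline(knots, np.eye(p)[j], k)(x) for j in range(p)])

# A cubic B-spline with nondecreasing coefficients is nondecreasing, so write
# beta = C @ theta with theta_1 free and the remaining increments theta_j >= 0.
C = np.tril(np.ones((p, p)))
lower = np.r_[-np.inf, np.zeros(p - 1)]
res = lsq_linear(B @ C, y, bounds=(lower, np.full(p, np.inf)))
beta_hat = C @ res.x                               # coefficients of the monotone spline
fit = B @ beta_hat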

19.
We review and extend some statistical tools that have proved useful for analysing functional data. Functional data analysis is primarily designed for the analysis of random trajectories and infinite-dimensional data, and there exists a need for the development of adequate statistical estimation and inference techniques. While this field is in flux, some methods have proven useful. These include warping methods, functional principal component analysis, and conditioning under Gaussian assumptions for the case of sparse data. The latter is a recent development that may provide a bridge between functional and more classical longitudinal data analysis. Besides presenting a brief review of functional principal components and functional regression, we develop some concepts for estimating functional principal component scores in the sparse situation. An extension of the so-called generalized functional linear model to the case of sparse longitudinal predictors is proposed. This extension includes functional binary regression models for longitudinal data and is illustrated with data on primary biliary cirrhosis.
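The conditioning idea for sparse data mentioned here amounts to a best linear predictor of the principal component scores under Gaussian assumptions; the sketch below evaluates that formula for a single curve, with the mean, eigenfunctions, eigenvalues, and error variance taken as given (in practice they are estimated from the pooled data), and the numbers are purely illustrative.

import numpy as np

def conditional_scores(y_i, mu_i, Phi_i, lam, sigma2):
    # Best linear predictor of the FPC scores of one sparsely observed curve:
    #   y_i    observations at the curve's own time points (length n_i)
    #   mu_i   mean function at those time points
    #   Phi_i  n_i x K matrix of eigenfunctions at those time points
    #   lam    K eigenvalues, sigma2 measurement-error variance
    Sigma_yi = Phi_i @ np.diag(lam) @ Phi_i.T + sigma2 * np.eye(len(y_i))
    return np.diag(lam) @ Phi_i.T @ np.linalg.solve(Sigma_yi, y_i - mu_i)

# illustrative call with made-up components
Phi_i = np.array([[1.0, 0.5], [0.9, -0.2], [0.7, -0.8]])
scores = conditional_scores(np.array([1.2, 0.8, 0.1]), np.zeros(3),
                            Phi_i, np.array([2.0, 0.5]), 0.1)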

20.
The paper considers the problem of testing the equality of two covariance operators. Using functional principal component analysis, a method is suggested for testing the equality of the K largest eigenvalues and the corresponding eigenfunctions, together with its generalization to a corresponding change-point problem. Asymptotic distributions of the test statistics are presented.
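For orientation, the sketch below computes the K largest eigenvalues of the two sample covariance operators on a common grid, which are the ingredients such a test compares; the simulated curves and K are assumptions, and the actual test statistics and their asymptotic distributions are given in the paper.

import numpy as np

def top_eigen(curves, K):
    # leading eigenvalues/eigenvectors of the sample covariance of curves observed
    # on a common grid (rows = curves, columns = grid points), up to grid spacing
    centred = curves - curves.mean(axis=0)
    cov = centred.T @ centred / curves.shape[0]
    vals, vecs = np.linalg.eigh(cov)
    return vals[::-1][:K], vecs[:, ::-1][:, :K]

rng = np.random.default_rng(9)
grid = np.linspace(0, 1, 100)
sample1 = np.array([a * np.sin(np.pi * grid) + b * np.cos(np.pi * grid)
                    for a, b in rng.standard_normal((50, 2)) * [2.0, 1.0]])
sample2 = np.array([a * np.sin(np.pi * grid) + b * np.cos(np.pi * grid)
                    for a, b in rng.standard_normal((50, 2)) * [2.0, 1.0]])

vals1, _ = top_eigen(sample1, K=2)
vals2, _ = top_eigen(sample2, K=2)
diff = vals1 - vals2          # compared by the test for equality of the K largest eigenvalues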

