首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
In this paper, a new estimation procedure based on composite quantile regression and functional principal component analysis (PCA) method is proposed for the partially functional linear regression models (PFLRMs). The proposed estimation method can simultaneously estimate both the parametric regression coefficients and functional coefficient components without specification of the error distributions. The proposed estimation method is shown to be more efficient empirically for non-normal random error, especially for Cauchy error, and almost as efficient for normal random errors. Furthermore, based on the proposed estimation procedure, we use the penalized composite quantile regression method to study variable selection for parametric part in the PFLRMs. Under certain regularity conditions, consistency, asymptotic normality, and Oracle property of the resulting estimators are derived. Simulation studies and a real data analysis are conducted to assess the finite sample performance of the proposed methods.  相似文献   

2.
We consider the problem of constructing confidence intervals for nonparametric functional data analysis using empirical likelihood. In this doubly infinite-dimensional context, we demonstrate the Wilk's phenomenon and propose a bias-corrected construction that requires neither undersmoothing nor direct bias estimation. We also extend our results to partially linear regression models involving functional data. Our numerical results demonstrate improved performance of the empirical likelihood methods over normal approximation-based methods.  相似文献   

3.
函数数据聚类分析方法探析   总被引:3,自引:0,他引:3  
函数数据是目前数据分析中新出现的一种数据类型,它同时具有时间序列和横截面数据的特征,通常可以描述为关于某一变量的函数图像,在实际应用中具有很强的实用性。首先简要分析函数数据的一些基本特征和目前提出的一些函数数据聚类方法,如均匀修正的函数数据K均值聚类方法、函数数据层次聚类方法等,并在此基础上,从函数特征分析的角度探讨了函数数据聚类方法,提出了一种基于导数分析的函数数据区间聚类分析方法,并利用中国中部六省的就业人口数据对该方法进行实证分析,取得了聚类结果。  相似文献   

4.
Cross-validation has been widely used in the context of statistical linear models and multivariate data analysis. Recently, technological advancements give possibility of collecting new types of data that are in the form of curves. Statistical procedures for analysing these data, which are of infinite dimension, have been provided by functional data analysis. In functional linear regression, using statistical smoothing, estimation of slope and intercept parameters is generally based on functional principal components analysis (FPCA), that allows for finite-dimensional analysis of the problem. The estimators of the slope and intercept parameters in this context, proposed by Hall and Hosseini-Nasab [On properties of functional principal components analysis, J. R. Stat. Soc. Ser. B: Stat. Methodol. 68 (2006), pp. 109–126], are based on FPCA, and depend on a smoothing parameter that can be chosen by cross-validation. The cross-validation criterion, given there, is time-consuming and hard to compute. In this work, we approximate this cross-validation criterion by such another criterion so that we can turn to a multivariate data analysis tool in some sense. Then, we evaluate its performance numerically. We also treat a real dataset, consisting of two variables; temperature and the amount of precipitation, and estimate the regression coefficients for the former variable in a model predicting the latter one.  相似文献   

5.
Abstract. We review and extend some statistical tools that have proved useful for analysing functional data. Functional data analysis primarily is designed for the analysis of random trajectories and infinite‐dimensional data, and there exists a need for the development of adequate statistical estimation and inference techniques. While this field is in flux, some methods have proven useful. These include warping methods, functional principal component analysis, and conditioning under Gaussian assumptions for the case of sparse data. The latter is a recent development that may provide a bridge between functional and more classical longitudinal data analysis. Besides presenting a brief review of functional principal components and functional regression, we develop some concepts for estimating functional principal component scores in the sparse situation. An extension of the so‐called generalized functional linear model to the case of sparse longitudinal predictors is proposed. This extension includes functional binary regression models for longitudinal data and is illustrated with data on primary biliary cirrhosis.  相似文献   

6.
ABSTRACT

In this paper, we develop an efficient wavelet-based regularized linear quantile regression framework for coefficient estimations, where the responses are scalars and the predictors include both scalars and function. The framework consists of two important parts: wavelet transformation and regularized linear quantile regression. Wavelet transform can be used to approximate functional data through representing it by finite wavelet coefficients and effectively capturing its local features. Quantile regression is robust for response outliers and heavy-tailed errors. In addition, comparing with other methods it provides a more complete picture of how responses change conditional on covariates. Meanwhile, regularization can remove small wavelet coefficients to achieve sparsity and efficiency. A novel algorithm, Alternating Direction Method of Multipliers (ADMM) is derived to solve the optimization problems. We conduct numerical studies to investigate the finite sample performance of our method and applied it on real data from ADHD studies.  相似文献   

7.
We extend four tests common in classical regression – Wald, score, likelihood ratio and F tests – to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications.  相似文献   

8.
We consider a process that is observed as a mixture of two random distributions, where the mixing probability is an unknown function of time. The setup is built upon a wavelet‐based mixture regression. Two linear wavelet estimators are proposed. Furthermore, we consider three regularizing procedures for each of the two wavelet methods. We also discuss regularity conditions under which the consistency of the wavelet methods is attained and derive rates of convergence for the proposed estimators. A Monte Carlo simulation study is conducted to illustrate the performance of the estimators. Various scenarios for the mixing probability function are used in the simulations, in addition to a range of sample sizes and resolution levels. We apply the proposed methods to a data set consisting of array Comparative Genomic Hybridization from glioblastoma cancer studies.  相似文献   

9.
The use of parametric linear mixed models and generalized linear mixed models to analyze longitudinal data collected during randomized control trials (RCT) is conventional. The application of these methods, however, is restricted due to various assumptions required by these models. When the number of observations per subject is sufficiently large, and individual trajectories are noisy, functional data analysis (FDA) methods serve as an alternative to parametric longitudinal data analysis techniques. However, the use of FDA in RCTs is rare. In this paper, the effectiveness of FDA and linear mixed models (LMMs) was compared by analyzing data from rural persons living with HIV and comorbid depression enrolled in a depression treatment randomized clinical trial. Interactive voice response systems were used for weekly administrations of the 10-item Self-Administered Depression Scale (SADS) over 41 weeks. Functional principal component analysis and functional regression analysis methods detected a statistically significant difference in SADS between telphone-administered interpersonal psychotherapy (tele-IPT) and controls but linear mixed effects model results did not. Additional simulation studies were conducted to compare FDA and LMMs under a different nonlinear trajectory assumption. In this clinical trial with sufficient per subject measured outcomes and individual trajectories that are noisy and nonlinear, we found FDA methods to be a better alternative to LMMs.  相似文献   

10.
This article is concerned with the effect of the methods for handling missing values in multivariate control charts. We discuss the complete case, mean substitution, regression, stochastic regression, and the expectation–maximization algorithm methods for handling missing values. Estimates of mean vector and variance–covariance matrix from the treated data set are used to build the multivariate exponentially weighted moving average (MEWMA) control chart. Based on a Monte Carlo simulation study, the performance of each of the five methods is investigated in terms of its ability to obtain the nominal in-control and out-of-control average run length (ARL). We consider three sample sizes, five levels of the percentage of missing values, and three types of variable numbers. Our simulation results show that imputation methods produce better performance than case deletion methods. The regression-based imputation methods have the best overall performance among all the competing methods.  相似文献   

11.
Abstract.  Functional data analysis is a growing research field as more and more practical applications involve functional data. In this paper, we focus on the problem of regression and classification with functional predictors: the model suggested combines an efficient dimension reduction procedure [functional sliced inverse regression, first introduced by Ferré & Yao ( Statistics , 37, 2003 , 475)], for which we give a regularized version, with the accuracy of a neural network. Some consistency results are given and the method is successfully confronted to real-life data.  相似文献   

12.
Recently, several methodologies to perform geostatistical analysis of functional data have been proposed. All of them assume that the spatial functional process considered is stationary. However, in practice, we often have nonstationary functional data because there exists an explicit spatial trend in the mean. Here, we propose a methodology to extend kriging predictors for functional data to the case where the mean function is not constant through the region of interest. We consider an approach based on the classical residual kriging method used in univariate geostatistics. We propose a three steps procedure. Initially, a functional regression model is used to detrend the mean. Then we apply kriging methods for functional data to the regression residuals to predict a residual curve at a non-data location. Finally, the prediction curve is obtained as the sum of the trend and the residual prediction. We apply the methodology to salinity data corresponding to 21 salinity curves recorded at the Ciénaga Grande de Santa Marta estuary, located in the Caribbean coast of Colombia. A cross-validation analysis was carried out to track the performance of the proposed methodology.  相似文献   

13.
Functional data are being observed frequently in many scientific fields, and therefore most of the standard statistical methods are being adapted for functional data. The multivariate analysis of variance problem for functional data is considered. It seems to be of practical interest similarly as the one-way analysis of variance for such data. For the MANOVA problem for multivariate functional data, we propose permutation tests based on a basis function representation and tests based on random projections. Their performance is examined in comprehensive simulation studies, which provide an idea of the size control and power of the tests and identify differences between them. The simulation experiments are based on artificial data and real labeled multivariate time series data found in the literature. The results suggest that the studied testing procedures can detect small differences between vectors of curves even with small sample sizes. Illustrative real data examples of the use of the proposed testing procedures in practice are also presented.  相似文献   

14.
This study considers the binary classification of functional data collected in the form of curves. In particular, we assume a situation in which the curves are highly mixed over the entire domain, so that the global discriminant analysis based on the entire domain is not effective. This study proposes an interval-based classification method for functional data: the informative intervals for classification are selected and used for separating the curves into two classes. The proposed method, called functional logistic regression with fused lasso penalty, combines the functional logistic regression as a classifier and the fused lasso for selecting discriminant segments. The proposed method automatically selects the most informative segments of functional data for classification by employing the fused lasso penalty and simultaneously classifies the data based on the selected segments using the functional logistic regression. The effectiveness of the proposed method is demonstrated with simulated and real data examples.  相似文献   

15.
Shi, Wang, Murray-Smith and Titterington (Biometrics 63:714–723, 2007) proposed a Gaussian process functional regression (GPFR) model to model functional response curves with a set of functional covariates. Two main problems are addressed by their method: modelling nonlinear and nonparametric regression relationship and modelling covariance structure and mean structure simultaneously. The method gives very good results for curve fitting and prediction but side-steps the problem of heterogeneity. In this paper we present a new method for modelling functional data with ‘spatially’ indexed data, i.e., the heterogeneity is dependent on factors such as region and individual patient’s information. For data collected from different sources, we assume that the data corresponding to each curve (or batch) follows a Gaussian process functional regression model as a lower-level model, and introduce an allocation model for the latent indicator variables as a higher-level model. This higher-level model is dependent on the information related to each batch. This method takes advantage of both GPFR and mixture models and therefore improves the accuracy of predictions. The mixture model has also been used for curve clustering, but focusing on the problem of clustering functional relationships between response curve and covariates, i.e. the clustering is based on the surface shape of the functional response against the set of functional covariates. The model is examined on simulated data and real data.  相似文献   

16.
Shi  Yushu  Laud  Purushottam  Neuner  Joan 《Lifetime data analysis》2021,27(1):156-176

In this paper, we first propose a dependent Dirichlet process (DDP) model using a mixture of Weibull models with each mixture component resembling a Cox model for survival data. We then build a Dirichlet process mixture model for competing risks data without regression covariates. Next we extend this model to a DDP model for competing risks regression data by using a multiplicative covariate effect on subdistribution hazards in the mixture components. Though built on proportional hazards (or subdistribution hazards) models, the proposed nonparametric Bayesian regression models do not require the assumption of constant hazard (or subdistribution hazard) ratio. An external time-dependent covariate is also considered in the survival model. After describing the model, we discuss how both cause-specific and subdistribution hazard ratios can be estimated from the same nonparametric Bayesian model for competing risks regression. For use with the regression models proposed, we introduce an omnibus prior that is suitable when little external information is available about covariate effects. Finally we compare the models’ performance with existing methods through simulations. We also illustrate the proposed competing risks regression model with data from a breast cancer study. An R package “DPWeibull” implementing all of the proposed methods is available at CRAN.

  相似文献   

17.
In this article, we discuss the estimation of the parameter function for a functional logistic regression model in the presence of outliers. We consider ways that allow for the parameter estimator to be resistant to outliers, in addition to minimizing multicollinearity and reducing the high dimensionality, which is inherent with functional data. To achieve this, the functional covariates and functional parameter of the model are approximated in a finite-dimensional space generated by an appropriate basis. This approach reduces the functional model to a standard multiple logistic model with highly collinear covariates and potential high-dimensionality issues. The proposed estimator tackles these issues and also minimizes the effect of functional outliers. Results from a simulation study and a real world example are also presented to illustrate the performance of the proposed estimator.  相似文献   

18.
This work focuses on the linear regression model with functional covariate and scalar response. We compare the performance of two (parametric) linear regression estimators and a nonparametric (kernel) estimator via a Monte Carlo simulation study and the analysis of two real data sets. The first linear estimator expands the predictor and the regression weight function in terms of the trigonometric basis, while the second one uses functional principal components. The choice of the regularization degree in the linear estimators is addressed.  相似文献   

19.
In this paper, we give an extension of the functional regression concurrent model to the case of spatially correlated errors. We propose estimating the spatial correlation structure by using functional geostatistics. The estimation of the regression parameters is carried out by feasible generalized least squares. This modeling approach is motivated by the problem of validating rainfall data retrieved from satellite sensors. In this sense, we use the methodology to study the relationship between satellite and ground rainfall time series recorded in 82 weather stations from Department of Valle del Cauca, Colombia. The model obtained allows predicting pentadal rainfall curves in many sites of the region of interest by using as input the satellite information. A residual analysis shows a good performance of the methodology proposed.  相似文献   

20.
This paper demonstrates the usefulness of nonparametric regression analysis for functional specfication of houshold Engel curves.

After a brief review in section 2 of the literature on demand functions and equivalence scales and the functional specifications used, we first discuss in section 3 the issues of using income versus total expenditure, the origin and nature of the error terms in the light of utility theroy, and the interpretation of empirical demand functions. we shall reach the unorthodox view that household demand functions should be interpreted as conditional expectations relative to prices, household composition and either income or the conditional expectation of total expenditure (rather that total expenditure itself), where the latter conditional expectation is taken relative to income, prices and household composition. these two forms appear to be equivalent. this result also solves the simultaneity problem: the error variance matrix is no longer singular. Moreover, the errors are in general heteroskedastic.

In section 4 we discuss the model and the data, and in section 5 we review the nonparametric kernal regression approach.

In section 6 we derive the functional form of our household engel curves from nonparametric regression results, using the 1980 budget survey for the netherlands, in order to avoid model misspecification. thus the modl is derived directly from the data, without restricting its functional form. the nonparametric regression results are then translated to suitable parametric functional specifications, i.e., we choose parametric functional forms in accordance with the nanparametric regression results. these parametric specification are estimated by least squares, and various parameter restrictions are tested in order to simplify the models. this yields very simple final specifications of the household engel curves involved, namely linear functions of income and the number of children in two age groups.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号