We develop a hierarchical Gaussian process model for forecasting and inference of functional time series data. Unlike existing methods, our approach is especially suited for sparsely or irregularly sampled curves and for curves sampled with nonnegligible measurement error. The latent process is dynamically modeled as a functional autoregression (FAR) with Gaussian process innovations. We propose a fully nonparametric dynamic functional factor model for the dynamic innovation process, with broader applicability and improved computational efficiency over standard Gaussian process models. We prove finite-sample forecasting and interpolation optimality properties of the proposed model, which remain valid with the Gaussian assumption relaxed. An efficient Gibbs sampling algorithm is developed for estimation, inference, and forecasting, with extensions for FAR(p) models with model averaging over the lag p. Extensive simulations demonstrate substantial improvements in forecasting performance and recovery of the autoregressive surface over competing methods, especially under sparse designs. We apply the proposed methods to forecast nominal and real yield curves using daily U.S. data. Real yields are observed more sparsely than nominal yields, yet the proposed methods are highly competitive in both settings. Supplementary materials, including R code and the yield curve data, are available online.
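
For orientation, a generic FAR(1) model observed with measurement error can be written as below. This is a sketch of the standard form only; the symbols ($y_{t,j}$, $\tau_{t,j}$, $\mu$, $\psi$, $\omega_t$, $K_\omega$, $\sigma_\epsilon^2$) are notation assumed here and are not taken from the abstract.

```latex
\begin{aligned}
y_{t,j} &= Y_t(\tau_{t,j}) + \epsilon_{t,j},
  \qquad \epsilon_{t,j} \sim N(0, \sigma_\epsilon^2), \\
Y_t(\tau) - \mu(\tau) &= \int \psi(\tau, u)\,\bigl\{Y_{t-1}(u) - \mu(u)\bigr\}\,du + \omega_t(\tau),
  \qquad \omega_t \sim \mathcal{GP}(0, K_\omega).
\end{aligned}
```

Here $\psi$ plays the role of the autoregressive surface, $\omega_t$ is the Gaussian process innovation, and the observation points $\tau_{t,j}$ may differ from curve to curve, which is what allows sparse or irregular sampling.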

In this article, we introduce a new method for modelling curves with dynamic structures, using a non-parametric approach formulated as a state space model. The non-parametric approach is based on the use of penalised splines, represented as a dynamic mixed model. This formulation can capture the dynamic evolution of curves using a limited number of latent factors, allowing an accurate fit with a small number of parameters. We also present a new method to determine the optimal smoothing parameter through an adaptive procedure, using a formulation analogous to a model of stochastic volatility (SV). The non-parametric state space model allows unifying different methods applied to data with a functional structure in finance. We present the advantages and limitations of this method through simulation studies and also by comparing its predictive performance with other parametric and non-parametric methods used in financial applications using data on the term structure of interest rates.

We consider recent history functional linear models, which relate a longitudinal response to a longitudinal predictor such that only the predictor process within a sliding window into the recent past affects the response value at the current time. We propose an estimation procedure for recent history functional linear models that is geared towards sparse longitudinal data, where the observation times across subjects are irregular and the total number of measurements per subject is small. The proposed estimation procedure builds upon recent developments in the literature on estimation of functional linear models with sparse data and utilizes connections between recent history functional linear models and varying coefficient models. We establish uniform consistency of the proposed estimators, propose prediction of the response trajectories and derive their asymptotic distribution, leading to asymptotic point-wise confidence bands. We include a real data application and simulation studies to demonstrate the efficacy of the proposed methodology.

Measurement error is an important problem that has not been studied very well in the context of functional data analysis. To the best of our knowledge, there are no existing methods that address the presence of functional measurement errors in generalized functional linear models. In this article, a novel approach is proposed to estimate the slope function in the presence of measurement error in the generalized functional linear model with a scalar response. This work significantly advances the existing conditional score method to accommodate the case where both the measurement error and independent variables lie in infinite dimensional spaces. Asymptotic results are established for the proposed estimate, and its behaviour is studied via simulations, where the response is continuous or binary. Analysis of Canadian Weather data highlights the practical utility of our method. The Canadian Journal of Statistics 48: 238–258; 2020 © 2020 Statistical Society of Canada

As the discipline of functional neuroimaging grows, there is an increasing interest in meta analysis of brain imaging studies. A typical neuroimaging meta analysis collects peak activation coordinates (foci) from several studies and identifies areas of consistent activation. Most imaging meta analysis methods only produce null hypothesis inferences and do not provide an interpretable fitted model. To overcome these limitations, we propose a Bayesian spatial hierarchical model using a marked independent cluster process. We model the foci as offspring of a latent study center process, and the study centers are in turn offspring of a latent population center process. The posterior intensity function of the population center process provides inference on the location of population centers, as well as the inter-study variability of foci about the population centers. We illustrate our model with a meta analysis consisting of 437 studies from 164 publications, show how two subpopulations of studies can be compared and assess our model via sensitivity analyses and simulation studies. Supplemental materials are available online.

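Tian Maozai (田茂再), Mei Bo (梅波). Statistical Research (《统计研究》), 2019, 36(8): 114-128
Taking the structural features of functional data into account, this paper considers two classes of functional quantile regression models (a functional response on scalar covariates, and a functional response on functional covariates) and constructs a new functional tilted (倾斜) quantile regression model based on the definition of functional tilted quantile curves. For the second class of models, we consider both a spline basis expansion of the model coefficients and a functional principal component basis expansion of the functional covariates, and obtain the basic form of the tilted quantile regression model. Parameters are estimated with a componentwise gradient boosting algorithm that minimizes a weighted asymmetric loss function, which improves computational efficiency. We prove theoretically that the coefficient estimators of the tilted quantile regression model are asymptotically normal. Simulation and empirical results show that the tilted quantile regression model fits better than the existing pointwise quantile regression model. Under the integrated mean squared prediction error criterion, the proposed model has uniformly better predictive ability.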

The aim of this article is to improve the quality of cookie production by classifying cookies as good or bad from the curves of resistance of dough observed during the kneading process. As the predictor variable is functional, functional classification methodologies such as functional logit regression and functional discriminant analysis are considered. A P-spline approximation of the sample curves is proposed to improve the classification ability of these models and to suitably estimate the relationship between the quality of cookies and the resistance of dough. Inference results on the functional parameters and related odds ratios are obtained using the asymptotic normality of the maximum likelihood estimators under the classical regularity conditions. Finally, the classification results are compared with alternative functional data analysis approaches such as componentwise classification on the logit regression model.

Summary. We propose a class of semiparametric functional regression models to describe the influence of vector-valued covariates on a sample of response curves. Each observed curve is viewed as the realization of a random process, composed of an overall mean function and random components. The finite dimensional covariates influence the random components of the eigenfunction expansion through single-index models that include unknown smooth link and variance functions. The parametric components of the single-index models are estimated via quasi-score estimating equations with link and variance functions being estimated nonparametrically. We obtain several basic asymptotic results. The functional regression models proposed are illustrated with the analysis of a data set consisting of egg laying curves for 1000 female Mediterranean fruit-flies (medflies).

Nonparametric estimators of the upper boundary of the support of a multivariate distribution are very appealing because they rely on very few assumptions. But in productivity and efficiency analysis, this upper boundary is a production (or a cost) frontier and a parametric form for it allows for a richer economic interpretation of the production process under analysis. On the other hand, most of the parametric approaches rely on often too restrictive assumptions on the stochastic part of the model and are based on standard regression techniques fitting the shape of the center of the cloud of points rather than its boundary. To overcome these limitations, Florens and Simar [2005. Parametric approximations of nonparametric frontiers. J. Econometrics 124 (1), 91–116] propose a two-stage approach which tries to capture the shape of the cloud of points near its frontier by providing parametric approximations of a nonparametric frontier. In this paper we propose an alternative method using the nonparametric quantile-type frontiers introduced in Aragon, Daouia and Thomas-Agnan [2005. Nonparametric frontier estimation: a conditional quantile-based approach. Econometric Theory 21, 358–389] for the nonparametric part of our model. These quantile-type frontiers have the advantage of being more robust to extremes. Our main result concerns the functional convergence of the quantile-type frontier process. Then we provide convergence and asymptotic normality of the resulting estimators of the parametric approximation. The approach is illustrated through simulated and real data sets.

It is known that functional single-index regression models can achieve better prediction accuracy than functional linear models or fully nonparametric models, when the target is to predict a scalar response using a function-valued covariate. However, the performance of these models may be adversely affected by extremely large values or skewness in the response. In addition, they are not able to offer a full picture of the conditional distribution of the response. Motivated by using trajectories of $\mathrm{PM}_{10}$ concentrations from the previous day to predict the maximum $\mathrm{PM}_{10}$ concentration of the current day, a functional single-index quantile regression model is proposed to address those issues. A generalized profiling method is employed to estimate the model. Simulation studies are conducted to investigate the finite sample performance of the proposed estimator. We apply the proposed framework to predict the maximal value of $\mathrm{PM}_{10}$ concentrations based on the intraday $\mathrm{PM}_{10}$ concentrations of the previous day.
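
The robustness to extreme responses mentioned above comes from the check (pinball) loss that quantile regression minimizes in general. The following is a minimal sketch of that loss only, not the generalized profiling estimator of the abstract; the function names and the toy data are illustrative assumptions.

```python
def check_loss(residual, tau):
    """Check (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


def empirical_risk(q, ys, tau):
    """Average check loss when the constant q is used to predict every y."""
    return sum(check_loss(y - q, tau) for y in ys) / len(ys)


def best_constant(ys, tau):
    """Constant minimizing the empirical check loss.

    The risk is piecewise linear in q with kinks at the data points,
    so it is enough to search over the observed values.
    """
    return min(sorted(ys), key=lambda q: empirical_risk(q, ys, tau))


# Toy response with one extreme value (hypothetical data).
ys = [1, 2, 3, 4, 100]
median_fit = best_constant(ys, 0.5)   # the sample median
upper_fit = best_constant(ys, 0.9)    # an upper quantile
mean_fit = sum(ys) / len(ys)          # the mean, for comparison
```

On this toy sample the median fit stays at a typical value while the mean is pulled towards the extreme observation, and varying `tau` traces out different parts of the distribution of the response.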

In this article, we discuss the estimation of the parameter function for a functional logistic regression model in the presence of outliers. We consider ways that allow for the parameter estimator to be resistant to outliers, in addition to minimizing multicollinearity and reducing the high dimensionality, which is inherent in functional data. To achieve this, the functional covariates and functional parameter of the model are approximated in a finite-dimensional space generated by an appropriate basis. This approach reduces the functional model to a standard multiple logistic model with highly collinear covariates and potential high-dimensionality issues. The proposed estimator tackles these issues and also minimizes the effect of functional outliers. Results from a simulation study and a real world example are also presented to illustrate the performance of the proposed estimator.

To analyse the risk factors of coronary heart disease (CHD), we apply the Bayesian model averaging approach that formalizes the model selection process and deals with model uncertainty in a discrete-time survival model to the data from the Framingham Heart Study. We also use the Alternating Conditional Expectation algorithm to transform the risk factors, such that their relationships with CHD are best described, overcoming the problem of coding such variables subjectively. For the Framingham Study, the Bayesian model averaging approach, which makes inferences about the effects of covariates on CHD based on an average of the posterior distributions of the set of identified models, outperforms the stepwise method in predictive performance. We also show that age, cholesterol, and smoking are nonlinearly associated with the occurrence of CHD and that P-values from models selected from stepwise methods tend to overestimate the evidence for the predictive value of a risk factor and ignore model uncertainty.

In this article, we propose a class of logarithmic autoregressive conditional duration (ACD)-type models that accommodates overdispersion, intermittent dynamics, multiple regimes, and asymmetries in financial durations. In particular, our functional coefficient logarithmic autoregressive conditional duration (FC-LACD) model relies on a smooth-transition autoregressive specification. The motivation lies in the fact that the latter yields a universal approximation if one lets the number of regimes grow without bound. After establishing sufficient conditions for strict stationarity, we address model identifiability as well as the asymptotic properties of the quasi-maximum likelihood (QML) estimator for the FC-LACD model with a fixed number of regimes. In addition, we also discuss how to consistently estimate a semiparametric variant of the FC-LACD model that takes the number of regimes to infinity. An empirical illustration indicates that our functional coefficient model is flexible enough to model IBM price durations.

Intraclass correlation coefficients (ICC) are employed in a wide range of behavioral, biomedical, psychosocial, and health care related research for assessing reliability of continuous outcomes. The linear mixed-effects model (LMM) is the most popular approach for inference about the ICC. However, since LMM is a normal distribution-based model and non-normal data are the norm rather than the exception in most studies, its applications to real study data always beg the question of inference validity. In this paper, we propose a distribution-free alternative to provide robust inference based on the functional response models. We illustrate the performance of the new approach using both real and simulated data.
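
To make the quantity concrete, the sketch below computes the classical one-way random-effects ICC from ANOVA mean squares for a balanced design. It is the textbook moment estimator, not the distribution-free method proposed in the abstract; the function name and the layout of the input (one list of ratings per subject) are assumptions made for illustration.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC for a balanced design.

    ratings: list of n subjects, each a list of k repeated measurements.
    Returns (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB and MSW are the
    between-subject and within-subject mean squares.
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a balanced design with n >= 2 and k >= 2")

    subject_means = [sum(row) / k for row in ratings]
    grand_mean = sum(subject_means) / n

    # Between-subject mean square, n - 1 degrees of freedom.
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square, n * (k - 1) degrees of freedom.
    msw = sum(
        (x - m) ** 2 for row, m in zip(ratings, subject_means) for x in row
    ) / (n * (k - 1))

    return (msb - msw) / (msb + (k - 1) * msw)


# Perfect agreement within each subject gives an ICC of 1.
perfect = icc_oneway([[1, 1], [2, 2], [3, 3]])
```

The estimator uses only first and second moments, but the usual confidence intervals built around it rely on normality, which is the concern the abstract raises.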

A novel class of hierarchical nonparametric Bayesian survival regression models for time-to-event data with uninformative right censoring is introduced. The survival curve is modeled as a random function whose prior distribution is defined using the beta-Stacy (BS) process. The prior mean of each survival probability and its prior variance are linked to a standard parametric survival regression model. This nonparametric survival regression can thus be anchored to any reference parametric form, such as a proportional hazards or an accelerated failure time model, allowing substantial departures of the predictive survival probabilities when the reference model is not supported by the data. Also, under this formulation the predictive survival probabilities will be close to the empirical survival distribution near the mode of the reference model and they will be shrunken towards its probability density in the tails of the empirical distribution.

Generalized linear mixed models are widely used for describing overdispersed and correlated data. Such data arise frequently in studies involving clustered and hierarchical designs. A more flexible class of models has been developed here through the Dirichlet process mixture. An additional advantage of using such mixture models is that the observations can be grouped together on the basis of the overdispersion present in the data. This paper proposes a partial empirical Bayes method for estimating all the model parameters by adopting a version of the EM algorithm. An augmented model that helps to implement an efficient Gibbs sampling scheme, under the non‐conjugate Dirichlet process generalized linear model, generates observations from the conditional predictive distribution of unobserved random effects and provides an estimate of the average number of mixing components in the Dirichlet process mixture. A simulation study has been carried out to demonstrate the consistency of the proposed method. The approach is also applied to a study on outdoor bacteria concentration in the air and to data from 14 retrospective lung‐cancer studies.

Abstract. This work proposes an extension of the functional principal components analysis (FPCA) or Karhunen–Loève expansion, which can take into account non-parametrically the effects of an additional covariate. Such models can also be interpreted as non-parametric mixed effect models for functional data. We propose estimators based on kernel smoothers and a data-driven selection procedure of the smoothing parameters based on a two-step cross-validation criterion. The conditional FPCA is illustrated with the analysis of a data set consisting of egg laying curves for female fruit flies. Convergence rates are given for estimators of the conditional mean function and the conditional covariance operator when the entire curves are collected. Almost sure convergence is also proven when one observes discretized noisy sample paths only. A simulation study allows us to check the good behaviour of the estimators.

This article attempts to predict home run hitting performance of Major League Baseball players using a Bayesian semiparametric model. Following Berry, Reese and Larkey, we include in the model effects for era of birth, season of play, and home ball park. We estimate performance curves for each player using orthonormal quartic polynomials. We use a Dirichlet process prior on the unknown distribution for the coefficients of the polynomials, and parametric priors for the other effects. Dirichlet process priors are useful in prediction for two reasons: (1) an increased probability of obtaining more precise prediction comes with the increased flexibility of the prior specification, and (2) the clustering inherent in the Dirichlet process provides the means to share information across players. Data from 1871 to 2008 were used to fit the model. Data from 2009 to 2016 were used to test the predictive ability of the model. A parametric model was also fit to compare the predictive performance of the models. We used what we called “pure performance” curves to predict future performance for 22 players. The nonparametric method provided superior predictive performance.
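
The clustering property in point (2) can be illustrated with the Chinese restaurant process, the partition induced by a Dirichlet process with concentration parameter alpha. The sketch below simulates only that prior partition; it is not the authors' model or sampler, and the function name, arguments and seed are assumptions made for illustration.

```python
import random


def chinese_restaurant_process(n, alpha, seed=0):
    """Simulate cluster labels for n items under a CRP with concentration alpha.

    Item i (0-based) joins existing cluster k with probability
    size_k / (i + alpha) and opens a new cluster with probability
    alpha / (i + alpha).
    """
    rng = random.Random(seed)
    sizes = []        # sizes[k] = number of items currently in cluster k
    assignments = []  # assignments[i] = cluster label of item i

    for i in range(n):
        u = rng.random() * (i + alpha)
        chosen = len(sizes)  # default: open a new cluster
        cumulative = 0.0
        for k, size in enumerate(sizes):
            cumulative += size
            if u < cumulative:
                chosen = k
                break
        if chosen == len(sizes):
            sizes.append(1)
        else:
            sizes[chosen] += 1
        assignments.append(chosen)

    return assignments, sizes


# Hypothetical usage: 50 players sharing a small number of coefficient clusters.
labels, cluster_sizes = chinese_restaurant_process(50, alpha=1.0, seed=1)
```

Items that land in the same cluster would share one set of curve coefficients, which is how information is pooled across players; larger values of `alpha` tend to produce more clusters.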

We propose a flexible functional approach for modelling generalized longitudinal data and survival time using principal components. In the proposed model the longitudinal observations can be continuous or categorical data, such as Gaussian, binomial or Poisson outcomes. We generalize the traditional joint models that treat categorical data as continuous data by using some transformations, such as CD4 counts. The proposed model is data-adaptive, which does not require pre-specified functional forms for longitudinal trajectories and automatically detects characteristic patterns. The longitudinal trajectories observed with measurement error or random error are represented by flexible basis functions through a possibly nonlinear link function, combining dimension reduction techniques resulting from functional principal component (FPC) analysis. The relationship between the longitudinal process and event history is assessed using a Cox regression model. Although the proposed model inherits the flexibility of non-parametric methods, the estimation procedure based on the EM algorithm is still parametric in computation, and thus simple and easy to implement. The computation is simplified by dimension reduction for random coefficients or FPC scores. An iterative selection procedure based on the Akaike information criterion (AIC) is proposed to choose the tuning parameters, such as the knots of the spline basis and the number of FPCs, so that an appropriate degree of smoothness and fluctuation can be addressed. The effectiveness of the proposed approach is illustrated through a simulation study, followed by an application to longitudinal CD4 counts and survival data which were collected in a recent clinical trial to compare the efficiency and safety of two antiretroviral drugs.

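Tang Xiaobin (唐晓彬) et al. Statistical Research (《统计研究》), 2018, 35(11): 71-81
The traditional SVR model can predict trends in house prices, but inappropriate parameter settings impair forecasting accuracy. Targeting the nonlinear behaviour of the year-on-year price index of second-hand housing in Beijing, this paper introduces the Bat Algorithm (BA) into the SVR model to optimize the settings of the model's three parameters and, combined with Web Search Data (WSD), constructs a hybrid BA-SVR&WSD model. The forecasting procedure of the model algorithm is given, and a comparative study is carried out using several benchmark forecasting models and forecasting performance measures. The results show that the SVR model based on the bat algorithm generalizes well and produces more accurate forecasts, and that the forecasting method provides a valuable reference for monitoring and regulating second-hand housing prices in Beijing.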
