Similar literature
 Found 20 similar articles; search took 892 ms
1.
In this article, we propose a novel approach to fit a functional linear regression in which both the response and the predictor are functions. We consider the case where the response and the predictor processes are both sparsely sampled at random time points and are contaminated with random errors. In addition, the random times are allowed to be different for the measurements of the predictor and the response functions. The aforementioned situation often occurs in longitudinal data settings. To estimate the covariance and the cross-covariance functions, we use a regularization method over a reproducing kernel Hilbert space. The estimate of the cross-covariance function is used to obtain estimates of the regression coefficient function and of the functional singular components. We derive the convergence rates of the proposed cross-covariance, the regression coefficient, and the singular component function estimators. Furthermore, we show that, under some regularity conditions, the estimator of the coefficient function has a minimax optimal rate. We conduct a simulation study and demonstrate merits of the proposed method by comparing it to some other existing methods in the literature. We illustrate the method by an example of an application to a real-world air quality dataset. The Canadian Journal of Statistics 47: 524–559; 2019 © 2019 Statistical Society of Canada

2.
There has been increasing interest in the assessment of surgeon effects for survival data of post-operative cancer patients. In particular, the measurement of a surgeon's surgical performance after eliminating significant risk variables is considered. The generalized linear mixed model approach, which assumes log-normal-distributed surgeon effects in the hazard function, is adopted to assess the random surgeon effects in data on post-operative colorectal cancer patients. The method extends the traditional Cox's proportional hazards regression model by including a random component in the linear predictor. Estimation is accomplished by constructing an appropriate log-likelihood function in the spirit of the best linear unbiased predictor method and is extended to obtain residual maximum likelihood estimates. As a result of the non-proportionality of the hazard of colon and rectal cancer, the data are analyzed separately according to these two kinds of cancer. Significant risk variables are identified. The 'predictions' of random surgeon effects are obtained and their association with the rank of surgeon is examined.

3.
In functional linear regression, one conventional approach is to first perform functional principal component analysis (FPCA) on the functional predictor and then use the first few leading functional principal component (FPC) scores to predict the response variable. The leading FPCs estimated by the conventional FPCA stand for the major source of variation of the functional predictor, but these leading FPCs may not be highly correlated with the response variable, so the prediction accuracy of the functional linear regression model may not be optimal. In this paper, we propose a supervised version of FPCA by considering the correlation of the functional predictor and response variable. It can automatically estimate leading FPCs, which represent the major source of variation of the functional predictor and are simultaneously correlated with the response variable. Our supervised FPCA method is demonstrated to have a better prediction accuracy than the conventional FPCA method by using one real application on electroencephalography (EEG) data and three carefully designed simulation studies.

4.
We propose a flexible functional approach for modelling generalized longitudinal data and survival time using principal components. In the proposed model the longitudinal observations can be continuous or categorical data, such as Gaussian, binomial or Poisson outcomes. We generalize the traditional joint models that treat categorical data as continuous data by using some transformations, such as CD4 counts. The proposed model is data-adaptive: it does not require pre-specified functional forms for longitudinal trajectories and automatically detects characteristic patterns. The longitudinal trajectories observed with measurement error or random error are represented by flexible basis functions through a possibly nonlinear link function, combining dimension reduction techniques resulting from functional principal component (FPC) analysis. The relationship between the longitudinal process and event history is assessed using a Cox regression model. Although the proposed model inherits the flexibility of non-parametric methods, the estimation procedure based on the EM algorithm is still parametric in computation, and thus simple and easy to implement. The computation is simplified by dimension reduction for random coefficients or FPC scores. An iterative selection procedure based on the Akaike information criterion (AIC) is proposed to choose the tuning parameters, such as the knots of the spline basis and the number of FPCs, so that an appropriate degree of smoothness and fluctuation can be addressed. The effectiveness of the proposed approach is illustrated through a simulation study, followed by an application to longitudinal CD4 counts and survival data which were collected in a recent clinical trial to compare the efficiency and safety of two antiretroviral drugs.

5.
This article is concerned with the estimation problem in the semiparametric isotonic regression model when the covariates are measured with additive errors and the response is missing at random. An inverse marginal probability weighted imputation approach is developed to estimate the regression parameters and a least-squares approach under a monotone constraint is employed to estimate the functional component. We show that the proposed estimator of the regression parameter is root-n consistent and asymptotically normal and the isotonic estimator of the functional component, at a fixed point, is cubic root-n consistent. A simulation study is conducted to examine the finite-sample properties of the proposed estimators. A data set is used to demonstrate the proposed approach.
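
As a rough illustration of the "least-squares approach under a monotone constraint" mentioned above, the following sketch uses the pool-adjacent-violators algorithm, the standard way to compute a non-decreasing least-squares fit. It is not the estimator of the article: it ignores the measurement error in the covariates and the missing responses, and the function name isotonic_fit and the optional weights argument are assumptions made for this example.

```python
def isotonic_fit(y, weights=None):
    """Non-decreasing least-squares fit to y via pool-adjacent-violators.

    Hypothetical helper for illustration; y is assumed to be ordered by
    the covariate whose effect is constrained to be monotone.
    """
    if weights is None:
        weights = [1.0] * len(y)
    # Each block holds [weighted mean, total weight, number of points].
    blocks = []
    for value, w in zip(y, weights):
        blocks.append([float(value), float(w), 1])
        # Merge neighbouring blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    # Expand the blocks back to one fitted value per observation.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


if __name__ == "__main__":
    print(isotonic_fit([1.0, 3.0, 2.0, 4.0, 3.5, 5.0]))
```

Each violating pair of neighbours is replaced by its weighted mean, so the output is a step function that is constant on the pooled blocks.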

6.
Nonparametric seemingly unrelated regression provides a powerful alternative to parametric seemingly unrelated regression for relaxing the linearity assumption. The existing methods are limited, particularly with sharp changes in the relationship between the predictor variables and the corresponding response variable. We propose a new nonparametric method for seemingly unrelated regression, which adopts a tree-structured regression framework, has satisfactory prediction accuracy and interpretability, places no restriction on the inclusion of categorical variables, and is less vulnerable to the curse of dimensionality. Moreover, an important feature is constructing a unified tree-structured model for multivariate data, even though the predictor variables corresponding to the response variable are entirely different. This unified model can offer revelatory insights such as underlying economic meaning. We propose the key factors of tree-structured regression, which are an impurity function detecting complex nonlinear relationships between the predictor variables and the response variable, split rule selection with negligible selection bias, and tree size determination solving underfitting and overfitting problems. We demonstrate our proposed method using simulated data and illustrate it using data from the Korea stock exchange sector indices.

7.
When a generalized linear mixed model with multiple (two or more) sources of random effects is considered, the inferences may vary depending on the nature of the random effects. In this paper, we consider a familial Poisson mixed model where each of the count responses of a family is influenced by two independent unobservable familial random effects with two distinct components of dispersion. A generalized quasilikelihood (GQL) approach is discussed for the estimation of the dispersion components as well as the regression effects of the model. A simulation study is conducted to examine the relative performance of the GQL approach as opposed to a simpler method of moments. Furthermore, the GQL estimation methodology is illustrated by using health care utilization data that follow a Poisson mixed model with one component of dispersion and by using simulated asthma data that follow a Poisson mixed model with two sources of random effects with two distinct components of dispersion.

8.
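Wang Zhihao (王芝皓) et al., 《统计研究》 (Statistical Research), 2021, 38(7): 127-139
In practical data analysis, one often encounters zero-inflated count data as the response variable, associated with a functional random variable and a random vector as predictors. This paper considers the functional partially varying-coefficient zero-inflated model (FPVCZIM), in which the infinite-dimensional slope function is approximated by a functional principal component basis and the coefficient functions are fitted by B-splines. The estimators are obtained via the EM algorithm and their theoretical properties are discussed; under some regularity conditions, convergence rates are obtained for the estimators of the slope function and the coefficient functions. Finite-sample Monte Carlo simulation studies and a real data analysis are used to illustrate the proposed method.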

9.
Clustered multinomial data with random cluster sizes commonly appear in health, environmental and ecological studies. Traditional approaches for analyzing clustered multinomial data rely on two assumptions. One of these assumptions is that cluster sizes are fixed, whereas the other requires cluster sizes to be positive. Randomness of the cluster sizes may be the determinant of the within-cluster correlation and between-cluster variation. We propose a baseline-category mixed model for clustered multinomial data with random cluster sizes based on Poisson mixed models. Our orthodox best linear unbiased predictor approach to this model depends only on the moment structure of unobserved distribution-free random effects. Our approach also consolidates the marginal and conditional modeling interpretations. Unlike the traditional methods, our approach can accommodate both random and zero cluster sizes. Two real-life multinomial data examples, crime data and food contamination data, are used to illustrate our proposed methodology.

10.
Summary. We introduce a flexible marginal modelling approach for statistical inference for clustered and longitudinal data under minimal assumptions. This estimated estimating equations approach is semiparametric and the proposed models are fitted by quasi-likelihood regression, where the unknown marginal means are a function of the fixed effects linear predictor with unknown smooth link, and variance–covariance is an unknown smooth function of the marginal means. We propose to estimate the nonparametric link and variance–covariance functions via smoothing methods, whereas the regression parameters are obtained via the estimated estimating equations. These are score equations that contain nonparametric function estimates. The proposed estimated estimating equations approach is motivated by its flexibility and easy implementation. Moreover, if data follow a generalized linear mixed model, with either a specified or an unspecified distribution of random effects and link function, the model proposed emerges as the corresponding marginal (population-average) version and can be used to obtain inference for the fixed effects in the underlying generalized linear mixed model, without the need to specify any other components of this generalized linear mixed model. Among marginal models, the estimated estimating equations approach provides a flexible alternative to modelling with generalized estimating equations. Applications of estimated estimating equations include diagnostics and link selection. The asymptotic distribution of the proposed estimators for the model parameters is derived, enabling statistical inference. Practical illustrations include Poisson modelling of repeated epileptic seizure counts and simulations for clustered binomial responses.

11.
Mixed effects models and Berkson measurement error models are widely used. They share features which the author uses to develop a unified estimation framework. He deals with models in which the random effects (or measurement errors) have a general parametric distribution, whereas the random regression coefficients (or unobserved predictor variables) and error terms have nonparametric distributions. He proposes a second-order least squares estimator and a simulation-based estimator based on the first two moments of the conditional response variable given the observed covariates. He shows that both estimators are consistent and asymptotically normally distributed under fairly general conditions. The author also reports Monte Carlo simulation studies showing that the proposed estimators perform satisfactorily for relatively small sample sizes. Compared to the likelihood approach, the proposed methods are computationally feasible and do not rely on the normality assumption for random effects or other variables in the model.

12.
Abstract. Mixed model based approaches for semiparametric regression have gained much interest in recent years, both in theory and application. They provide a unified and modular framework for penalized likelihood and closely related empirical Bayes inference. In this article, we develop mixed model methodology for a broad class of Cox-type hazard regression models where the usual linear predictor is generalized to a geoadditive predictor incorporating non-parametric terms for the (log-)baseline hazard rate, time-varying coefficients and non-linear effects of continuous covariates, a spatial component, and additional cluster-specific frailties. Non-linear and time-varying effects are modelled through penalized splines, while spatial components are treated as correlated random effects following either a Markov random field or a stationary Gaussian random field prior. Generalizing existing mixed model methodology, inference is derived using penalized likelihood for regression coefficients and (approximate) marginal likelihood for smoothing parameters. In a simulation we study the performance of the proposed method, in particular comparing it with its fully Bayesian counterpart using Markov chain Monte Carlo methodology, and complement the results by some asymptotic considerations. As an application, we analyse leukaemia survival data from northwest England.

13.
Most regression problems in practice require flexible semiparametric forms of the predictor for modelling the dependence of responses on covariates. Moreover, it is often necessary to add random effects accounting for overdispersion caused by unobserved heterogeneity or for correlation in longitudinal or spatial data. We present a unified approach for Bayesian inference via Markov chain Monte Carlo simulation in generalized additive and semiparametric mixed models. Different types of covariates, such as the usual covariates with fixed effects, metrical covariates with non-linear effects, unstructured random effects, trend and seasonal components in longitudinal data and spatial covariates, are all treated within the same general framework by assigning appropriate Markov random field priors with different forms and degrees of smoothness. We applied the approach in several case-studies and consulting cases, showing that the methods are also computationally feasible in problems with many covariates and large data sets. In this paper, we choose two typical applications.

14.
15.
This work focuses on the linear regression model with functional covariate and scalar response. We compare the performance of two (parametric) linear regression estimators and a nonparametric (kernel) estimator via a Monte Carlo simulation study and the analysis of two real data sets. The first linear estimator expands the predictor and the regression weight function in terms of the trigonometric basis, while the second one uses functional principal components. The choice of the regularization degree in the linear estimators is addressed.

16.
While most regression models focus on explaining distributional aspects of one single response variable alone, interest in modern statistical applications has recently shifted towards simultaneously studying multiple response variables as well as their dependence structure. Particularly useful tools for pursuing such an analysis are copula-based regression models, since they enable the separation of the marginal response distributions and the dependence structure summarised in a specific copula model. However, so far copula-based regression models have mostly been relying on two-step approaches where the marginal distributions are determined first whereas the copula structure is studied in a second step after plugging in the estimated marginal distributions. Moreover, the parameters of the copula are mostly treated as a constant not related to covariates and most regression specifications for the marginals are restricted to purely linear predictors. We therefore propose simultaneous Bayesian inference for both the marginal distributions and the copula using computationally efficient Markov chain Monte Carlo simulation techniques. In addition, we replace the commonly used linear predictor by a generic structured additive predictor comprising for example nonlinear effects of continuous covariates, spatial effects or random effects, and furthermore allow the copula parameters to be covariate-dependent. To facilitate Bayesian inference, we construct proposal densities for a Metropolis–Hastings algorithm relying on quadratic approximations to the full conditionals of regression coefficients, avoiding manual tuning. The performance of the resulting Bayesian estimates is evaluated in simulations comparing our approach with penalised likelihood inference, studying the choice of a specific copula model based on the deviance information criterion, and comparing a simultaneous approach with a two-step procedure. Furthermore, the flexibility of Bayesian conditional copula regression models is illustrated in two applications on childhood undernutrition and macroecology.

17.
We consider Markov-switching regression models, i.e. models for time series regression analyses where the functional relationship between covariates and response is subject to regime switching controlled by an unobservable Markov chain. Building on the powerful hidden Markov model machinery and the methods for penalized B-splines routinely used in regression analyses, we develop a framework for nonparametrically estimating the functional form of the effect of the covariates in such a regression model, assuming an additive structure of the predictor. The resulting class of Markov-switching generalized additive models is immensely flexible, and contains as special cases the common parametric Markov-switching regression models and also generalized additive and generalized linear models. The feasibility of the suggested maximum penalized likelihood approach is demonstrated by simulation. We further illustrate the approach using two real data applications, modelling (i) how sales data depend on advertising spending and (ii) how energy price in Spain depends on the Euro/Dollar exchange rate.
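
The "hidden Markov model machinery" referred to above is, at its core, the forward recursion for the likelihood. The sketch below evaluates the log-likelihood of the simplest parametric special case, a Markov-switching linear regression with Gaussian errors, and is only meant to show that recursion. The penalized B-spline part of the article is not included, and all names (ms_regression_loglik, intercepts, slopes, sds, trans, init) are assumptions made for this example.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def ms_regression_loglik(y, x, intercepts, slopes, sds, trans, init):
    """Log-likelihood of y_t = a_s + b_s * x_t + N(0, sd_s^2), where the
    state s follows a Markov chain with transition matrix trans and
    initial distribution init (scaled forward algorithm)."""
    n_states = len(init)
    alpha = list(init)
    loglik = 0.0
    for t, (yt, xt) in enumerate(zip(y, x)):
        if t > 0:
            # Propagate the filtered state probabilities one step ahead.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states))
                for j in range(n_states)
            ]
        # Weight each state by its state-dependent density of y_t.
        alpha = [
            alpha[j] * normal_pdf(yt, intercepts[j] + slopes[j] * xt, sds[j])
            for j in range(n_states)
        ]
        # Rescale to avoid numerical underflow and accumulate the log-scale.
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik
```

In a maximum (penalized) likelihood approach this function would be handed to a numerical optimizer; the rescaling step keeps the recursion stable for long series.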

18.
In the calibration of a near-infrared (NIR) instrument, we regress some chemical compositions of interest as a function of their NIR spectra. In this process, we have two immediate challenges: first, the number of variables exceeds the number of observations and, second, the multicollinearity between variables is extremely high. To deal with the challenges, prediction models that produce sparse solutions have recently been proposed. The term ‘sparse’ means that some model parameters are estimated to be zero and the other parameters are estimated naturally away from zero. In effect, a variable selection is embedded in the model to potentially achieve a better prediction. Many studies have investigated sparse solutions for latent variable models, such as partial least squares and principal component regression, and for direct regression models such as ridge regression (RR). However, for the latter, sparsity mainly involves adding an L1 norm penalty to the objective function, as in lasso regression. In this study, we investigate new sparse alternative models for RR within a random effects model framework, where we consider Cauchy and mixture-of-normals distributions on the random effects. The results indicate that the mixture-of-normals model produces a sparse solution with good prediction and better interpretation. We illustrate the methods using NIR spectra datasets from milk and corn specimens.
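
To make the meaning of ‘sparse’ concrete, the sketch below contrasts ridge and lasso shrinkage in the textbook special case of an orthonormal design, where both have closed forms in terms of the least-squares coefficients. It does not implement the Cauchy or mixture-of-normals random effects models of the study. The function names and the penalty scaling (objective 0.5 * RSS + 0.5 * lam * sum(b^2) for ridge and 0.5 * RSS + lam * sum(|b|) for lasso) are assumptions made for this example.

```python
import math


def ridge_shrink(beta_ols, lam):
    """Ridge solution under an orthonormal design: every coefficient is
    scaled by 1 / (1 + lam), so none becomes exactly zero."""
    return [b / (1.0 + lam) for b in beta_ols]


def lasso_soft_threshold(beta_ols, lam):
    """Lasso solution under an orthonormal design: soft-thresholding,
    which sets coefficients with |b| <= lam exactly to zero."""
    shrunk = []
    for b in beta_ols:
        magnitude = abs(b) - lam
        if magnitude <= 0.0:
            shrunk.append(0.0)
        else:
            shrunk.append(math.copysign(magnitude, b))
    return shrunk


if __name__ == "__main__":
    beta = [3.0, -0.5, 0.2, -2.0]
    print("ridge:", ridge_shrink(beta, 1.0))
    print("lasso:", lasso_soft_threshold(beta, 1.0))
```

With these inputs the two small coefficients are set exactly to zero by the lasso rule, whereas the ridge rule only halves them; this is the embedded variable selection that the abstract describes.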

19.
A simulation study of the binomial-logit model with correlated random effects is carried out based on the generalized linear mixed model (GLMM) methodology. Simulated data with various numbers of regression parameters and different values of the variance component are considered. The performance of approximate maximum likelihood (ML) and residual maximum likelihood (REML) estimators is evaluated. For a range of true parameter values, we report the average biases of estimators, the standard error of the average bias and the standard error of estimates over the simulations. In general, in terms of bias, the two methods do not show significant differences in estimating regression parameters. The REML estimation method is slightly better in reducing the bias of variance component estimates.
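
The summary statistics named above can be computed from the replicate estimates as in the sketch below. It assumes that the "standard error of estimates over the simulations" is the empirical standard deviation across replicates and that the standard error of the average bias is that standard deviation divided by the square root of the number of replicates; the abstract does not spell out either definition, and the function name summarize_simulation is hypothetical.

```python
import math
import statistics


def summarize_simulation(estimates, true_value):
    """Summarize replicate estimates of one parameter (needs >= 2 replicates)."""
    n = len(estimates)
    biases = [e - true_value for e in estimates]
    # Empirical standard deviation of the estimates across replicates.
    sd_estimates = statistics.stdev(estimates)
    return {
        "average_bias": statistics.mean(biases),
        # Monte Carlo standard error of the average bias.
        "se_average_bias": sd_estimates / math.sqrt(n),
        "sd_estimates": sd_estimates,
    }


if __name__ == "__main__":
    print(summarize_simulation([1.0, 2.0, 3.0], true_value=1.5))
```

Comparing the average bias with its Monte Carlo standard error indicates whether an apparent difference between two estimation methods exceeds simulation noise.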

20.
A semiparametric approach to model skewed/heteroscedastic regression data is discussed. We work with a semiparametric transform-both-sides regression model, which contains a parametric regression function and a nonparametric transformation. This model is adequate when the relationship between the median response and the explanatory variable has been specified by a theoretical result or a previous empirical study. The transform-both-sides model with a parametric transformation has been studied extensively and applied successfully to a number of data sets. Allowing a nonparametric transformation function increases the flexibility of the model. In this article, we estimate the nonparametric transformation function by the conditional kernel density approach developed by Wang and Ruppert (1995), and then use a pseudo-maximum likelihood estimator to estimate the regression parameters. This estimate of the regression parameters has not been studied previously. In this article, the asymptotic distribution of this pseudo-MLE is derived. We also show that when σ, the standard deviation of the error, goes to zero (small σ asymptotics), this estimator is adaptive. Adaptive means that the regression parameters are estimated as precisely as when the transformation is known exactly. A similar result holds in the parametric approaches of Carroll and Ruppert (1984) and Ruppert and Aldershof (1989). Simulated and real examples are provided to illustrate the performance of the proposed estimator for finite sample size.
