Similar articles
 20 similar articles found (search time: 31 ms)
1.
We develop in this paper a new procedure to construct simultaneous confidence bands for derivatives of mean curves in functional data analysis. The technique involves polynomial splines that provide an approximation to the derivatives of the mean functions, the covariance functions and the associated eigenfunctions. We show that the proposed procedure has desirable statistical properties. In particular, we first show that the proposed estimators of derivatives of the mean curves are semiparametrically efficient. Second, we establish consistency results for derivatives of covariance functions and their eigenfunctions. Most importantly, we show that the proposed spline confidence bands are asymptotically efficient as if all random trajectories were observed with no error. Finally, the confidence band procedure is illustrated through numerical simulation studies and a real life example.

2.
Based on sero-prevalence data of rubella, mumps in the UK and varicella in Belgium, we show how the force of infection, the age-specific rate at which susceptible individuals contract infection, can be estimated using generalized linear mixed models (McCulloch & Searle, 2001). Modelling the dependency of the force of infection on age by penalized splines, which involve fixed and random effects, allows us to use generalized linear mixed models techniques to estimate both the cumulative probability of being infected before a given age and the force of infection. Moreover, these models permit an automatic selection of the smoothing parameter. The smoothness of the estimated force of infection can be influenced by the number of knots and the degree of the penalized spline used. To assess this sensitivity, models with different numbers of knots and different degrees are fitted and the results are compared. Simulations with different numbers of knots and polynomial spline bases of different degrees suggest, for estimating the force of infection from serological data, the use of a quadratic penalized spline based on about 10 knots.

3.
Mixed model based approaches for semiparametric regression have gained much interest in recent years, both in theory and application. They provide a unified and modular framework for penalized likelihood and closely related empirical Bayes inference. In this article, we develop mixed model methodology for a broad class of Cox-type hazard regression models where the usual linear predictor is generalized to a geoadditive predictor incorporating non-parametric terms for the (log-)baseline hazard rate, time-varying coefficients and non-linear effects of continuous covariates, a spatial component, and additional cluster-specific frailties. Non-linear and time-varying effects are modelled through penalized splines, while spatial components are treated as correlated random effects following either a Markov random field or a stationary Gaussian random field prior. Generalizing existing mixed model methodology, inference is derived using penalized likelihood for regression coefficients and (approximate) marginal likelihood for smoothing parameters. In a simulation we study the performance of the proposed method, in particular comparing it with its fully Bayesian counterpart using Markov chain Monte Carlo methodology, and complement the results by some asymptotic considerations. As an application, we analyse leukaemia survival data from northwest England.

4.
Expectile regression has become popular in recent years. It includes ordinary mean regression as a special case but is more general, as it also makes it possible to model non-central parts of a distribution. Semi-parametric expectile models have recently been developed, and it is easy to perform flexible expectile estimation with modern software such as R. We extend the model class by allowing for panel observations, i.e. clustered data with repeated measurements taken on the same individual. A random (individual) effect is incorporated in the model to account for the dependence structure in the data. We fit expectile sheets, meaning that a whole range of expectiles is estimated simultaneously rather than a single expectile. The presented model allows for multiple covariates, where a semi-parametric approach with penalized splines is pursued to fit smooth expectile curves. We apply our methods to panel data from the German Socio-Economic Panel.
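
To make the notion of an expectile concrete, here is a minimal sketch, not the authors' panel model: it computes the expectile of a plain sample by iteratively reweighted means, assuming the standard asymmetric least squares definition. The function name `expectile` and the sample values are illustrative assumptions.

```python
def expectile(values, tau=0.5, tol=1e-10, max_iter=1000):
    """Return the tau-expectile of a sample via iteratively reweighted means."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    m = sum(values) / len(values)  # start at the mean, which is the 0.5-expectile
    for _ in range(max_iter):
        # Asymmetric weights: tau above the current estimate, 1 - tau at or below it.
        weights = [tau if y > m else 1.0 - tau for y in values]
        new_m = sum(w * y for w, y in zip(weights, values)) / sum(weights)
        if abs(new_m - m) < tol:
            return new_m
        m = new_m
    return m


# Hypothetical sample, used only for illustration.
sample = [1.0, 2.0, 3.0, 4.0, 10.0]
low, mid, high = (expectile(sample, t) for t in (0.1, 0.5, 0.9))
```

With `tau = 0.5` the weights are symmetric and the result is the ordinary mean, which is the sense in which mean regression is a special case.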

5.
We propose a general family of nonparametric mixed effects models. Smoothing splines are used to model the fixed effects and are estimated by maximizing the penalized likelihood function. The random effects are generic and are modelled parametrically by assuming that the covariance function depends on a parsimonious set of parameters. These parameters and the smoothing parameter are estimated simultaneously by the generalized maximum likelihood method. We derive a connection between a nonparametric mixed effects model and a linear mixed effects model. This connection suggests a way of fitting a nonparametric mixed effects model by using existing programs. The classical two-way mixed models and growth curve models are used as examples to demonstrate how to use smoothing spline analysis-of-variance decompositions to build nonparametric mixed effects models. Similarly to the classical analysis of variance, components of these nonparametric mixed effects models can be interpreted as main effects and interactions. The penalized likelihood estimates of the fixed effects in a two-way mixed model are extensions of James–Stein shrinkage estimates to correlated observations. In an example three nested nonparametric mixed effects models are fitted to a longitudinal data set.

6.
We consider additive mixed models for longitudinal data with a nonlinear time trend. As the random effects distribution, we propose an approximate Dirichlet process mixture that is based on the truncated version of the stick-breaking representation of the Dirichlet process and provides a Gaussian mixture with a data-driven choice of the number of mixture components. The main advantage of the specification is its ability to identify clusters of subjects with a similar random effects structure. For the estimation of the trend curve the mixed model representation of penalized splines is used. An Expectation-Maximization algorithm is given that solves the estimation problem and that exhibits advantages over Markov chain Monte Carlo approaches, which are typically used when modeling with Dirichlet processes. The method is evaluated in a simulation study and applied to theophylline data and to body mass index profiles of children.
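
The truncated stick-breaking construction mentioned in this abstract can be sketched as follows. This is an illustration of the general construction only, not the authors' estimation algorithm; the function name `truncated_stick_breaking`, the concentration value and the truncation level are assumptions chosen for the example.

```python
import random


def truncated_stick_breaking(alpha, n_components, rng):
    """Draw mixture weights from a stick-breaking construction truncated at n_components."""
    remaining = 1.0  # length of the stick that has not been assigned yet
    weights = []
    for k in range(n_components):
        # Setting the last break fraction to 1 assigns all leftover mass to the
        # final component, so the truncated weights sum to one.
        v = 1.0 if k == n_components - 1 else rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights


# Hypothetical settings: concentration alpha = 2 and at most 10 components.
weights = truncated_stick_breaking(alpha=2.0, n_components=10, rng=random.Random(0))
```

Smaller values of `alpha` tend to put most of the mass on the first few components, which is what allows the number of effective mixture components to be driven by the data.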

7.
In many areas of application, especially life testing and reliability, it is often of interest to estimate an unknown cumulative distribution function (cdf). A simultaneous confidence band (SCB) of the cdf can be used to assess the statistical uncertainty of the estimated cdf over the entire range of the distribution. Cheng and Iles [1983. Confidence bands for cumulative distribution functions of continuous random variables. Technometrics 25 (1), 77–86] presented an approach to construct an SCB for the cdf of a continuous random variable. For the log-location-scale family of distributions, they gave explicit forms for the upper and lower boundaries of the SCB based on expected information. In this article, we extend the work of Cheng and Iles (1983) in several directions. We study the SCBs based on local information, expected information, and estimated expected information for both the “cdf method” and the “quantile method.” We also study the effects of exceptional cases where a simple SCB does not exist. We describe calibration of the bands to provide exact coverage for complete data and type II censoring and better approximate coverage for other kinds of censoring. We also discuss how to extend these procedures to regression analysis.

8.
Let (X, Y) be a random vector, where Y denotes the variable of interest, possibly subject to random right censoring, and X is a covariate. We construct confidence intervals and bands for the conditional survival and quantile function of Y given X using a non-parametric likelihood ratio approach. This approach was introduced by Thomas & Grunkemeier (1975), who estimated confidence intervals of survival probabilities based on right censored data. The method is appealing for several reasons: it always produces intervals inside [0, 1], it does not involve variance estimation, and it can produce asymmetric intervals. Asymptotic results for the confidence intervals and bands are obtained, as well as simulation results, in which the performance of the likelihood ratio intervals and bands is compared with that of the normal approximation method. We also propose a bandwidth selection procedure based on the bootstrap and apply the technique to a real data set.

9.
Book Reviews     
The diagnostic tools examined in this article are applicable to regressions estimated with panel data or cross-sectional data drawn from a population with grouped structure. The diagnostic tools considered include (a) tests for the existence of group effects under both fixed and random effects models, (b) checks for outlying groups, and (c) specification tests for comparing the fixed and random effects models. A group-specific counterpart to the studentized residual is introduced. The methods are illustrated using a hedonic housing price regression.

10.
We propose a Bayesian nonparametric instrumental variable approach under additive separability that allows us to correct for endogeneity bias in regression models where the covariate effects enter with unknown functional form. Bias correction relies on a simultaneous equations specification with flexible modeling of the joint error distribution implemented via a Dirichlet process mixture prior. Both the structural and instrumental variable equation are specified in terms of additive predictors comprising penalized splines for nonlinear effects of continuous covariates. Inference is fully Bayesian, employing efficient Markov chain Monte Carlo simulation techniques. The resulting posterior samples do not only provide us with point estimates, but allow us to construct simultaneous credible bands for the nonparametric effects, including data-driven smoothing parameter selection. In addition, improved robustness properties are achieved due to the flexible error distribution specification. Both these features are challenging in the classical framework, making the Bayesian one advantageous. In simulations, we investigate small sample properties and an investigation of the effect of class size on student performance in Israel provides an illustration of the proposed approach which is implemented in an R package bayesIV. Supplementary materials for this article are available online.

11.
Existing literature on quantile regression for panel data models with individual effects advocates the application of penalization to reduce the dynamic panel bias and increase the efficiency of the estimators. In this paper, we consider penalized quantile regression for dynamic panel data with random effects from a Bayesian perspective, where the penalty involves an adaptive Lasso shrinkage of the random effects. We also address the role of initial conditions in dynamic panel data models, emphasizing joint modeling of start-up and subsequent responses. For posterior inference, an efficient Gibbs sampler is developed to simulate the parameters from the posterior distributions. Through simulation studies and analysis of a real data set, we assess the performance of the proposed Bayesian method.

12.
We construct bootstrap confidence intervals for smoothing spline estimates based on Gaussian data, and penalized likelihood smoothing spline estimates based on data from exponential families. Several variations of bootstrap confidence intervals are considered and compared. We find that the commonly used bootstrap percentile intervals are inferior to the T intervals and to intervals based on bootstrap estimation of mean squared errors. The best variations of the bootstrap confidence intervals behave similarly to the well known Bayesian confidence intervals. These bootstrap confidence intervals have an average coverage probability across the function being estimated, as opposed to a pointwise property.

13.
Simultaneous confidence bands have been shown in the statistical literature to be powerful inferential tools in univariate linear regression. While the methodology of simultaneous confidence bands for univariate linear regression has been extensively researched and well developed, no published work seems available for multivariate linear regression. This paper fills this gap by studying one particular simultaneous confidence band for multivariate linear regression. Because of the shape of the band, the word ‘tube’ is more pertinent and so will be used to replace the word ‘band’. It is shown that the construction of the tube is related to the distribution of the largest eigenvalue. A simulation‐based method is proposed to compute the 1 − α quantile of this eigenvalue. With the computation power of modern computers, the simultaneous confidence tube can be computed fast and accurately. A real‐data example is used to illustrate the method, and many potential research problems have been pointed out.

14.
The widely used linear model of the unexpected earnings/returns relationship has been challenged. In this article we propose a flexible nonparametric approach to study this relationship in which splines are used to approximate the unknown regression function. Spline confidence bands are constructed based on wild bootstrap to examine the adequacy of certain linear/nonlinear specifications. Monte Carlo results show that the proposed bands have excellent coverage of the true regression function with little computing load. These properties make the procedure highly recommended for extracting information from large and complicated datasets. The proposed approach has also been applied to the real world financial data from the unexpected earnings/returns study, and we find significant evidence of nonlinearity. The nonlinearity persists when we control the measurement errors of the earning surprises and firm size.

15.
In biomedical studies, it is of substantial interest to develop risk prediction scores using high-dimensional data such as gene expression data for clinical endpoints that are subject to censoring. In the presence of well-established clinical risk factors, investigators often prefer a procedure that also adjusts for these clinical variables. While accelerated failure time (AFT) models are a useful tool for the analysis of censored outcome data, they assume that covariate effects on the logarithm of time-to-event are linear, which is often unrealistic in practice. We propose to build risk prediction scores through regularized rank estimation in partly linear AFT models, where high-dimensional data such as gene expression data are modeled linearly and important clinical variables are modeled nonlinearly using penalized regression splines. We show through simulation studies that our model has better operating characteristics compared to several existing models. In particular, we show that there is a non-negligible effect on prediction as well as feature selection when nonlinear clinical effects are misspecified as linear. This work is motivated by a recent prostate cancer study, where investigators collected gene expression data along with established prognostic clinical variables and the primary endpoint is time to prostate cancer recurrence.
We analyzed the prostate cancer data and evaluated prediction performance of several models based on the extended c statistic for censored data, showing that 1) the relationship between the clinical variable, prostate specific antigen (PSA), and the prostate cancer recurrence is likely nonlinear, i.e., the time to recurrence decreases as PSA increases and it starts to level off when PSA becomes greater than 11; 2) correct specification of this nonlinear effect improves performance in prediction and feature selection; and 3) addition of gene expression data does not seem to further improve the performance of the resultant risk prediction scores.

16.
Partial linear varying coefficient models (PLVCM) are often considered for analysing longitudinal data because they strike a good balance between flexibility and parsimony. The existing estimation and variable selection methods for this model mainly assume that the subset of variables having a linear or a varying effect on the response is known in advance, that is, that the model structure is already determined. In applications, however, this assumption is unreasonable. In this work, we propose a simultaneous structure estimation and variable selection method, which performs coefficient estimation together with three types of selection: varying effect selection, constant effect selection, and relevant variable selection. It can be easily implemented in one step by employing a penalized M-type regression, which uses a general loss function to treat mean, median, quantile and robust mean regressions in a unified framework. Consistency in the three types of selection and the oracle property in estimation are established as well. Simulation studies and real data analysis also support our method.

17.
Longitudinal data often require a combination of flexible time trends and individual-specific random effects. For example, our methodological developments are motivated by a study on longitudinal body mass index profiles of children collected with the aim to gain a better understanding of factors driving childhood obesity. The high amount of nonlinearity and heterogeneity in these data and the complexity of the data set with a large number of observations, long longitudinal profiles and clusters of observations with specific deviations from the population model make the application challenging and prevent the application of standard growth curve models. We propose a fully Bayesian approach based on Markov chain Monte Carlo simulation techniques that allows for the semiparametric specification of both the trend function and the random effects distribution. Bayesian penalized splines are considered for the former, while a Dirichlet process mixture (DPM) specification allows for an adaptive amount of deviations from normality for the latter. The advantages of such DPM prior structures for random effects are investigated in terms of a simulation study to improve the understanding of the model specification before analyzing the childhood obesity data.

18.
During recent years, analysts have been relying on approximate methods of inference to estimate multilevel models for binary or count data. In an earlier study of random-intercept models for binary outcomes we used simulated data to demonstrate that one such approximation, known as marginal quasi-likelihood, leads to a substantial attenuation bias in the estimates of both fixed and random effects whenever the random effects are non-trivial. In this paper, we fit three-level random-intercept models to actual data for two binary outcomes, to assess whether refined approximation procedures, namely penalized quasi-likelihood and second-order improvements to marginal and penalized quasi-likelihood, also underestimate the underlying parameters. The extent of the bias is assessed by two standards of comparison: exact maximum likelihood estimates, based on a Gauss–Hermite numerical quadrature procedure, and a set of Bayesian estimates, obtained from Gibbs sampling with diffuse priors. We also examine the effectiveness of a parametric bootstrap procedure for reducing the bias. The results indicate that second-order penalized quasi-likelihood estimates provide a considerable improvement over the other approximations, but all the methods of approximate inference result in a substantial underestimation of the fixed and random effects when the random effects are sizable. We also find that the parametric bootstrap method can eliminate the bias but is computationally very intensive.

19.
The focus of this article is on simultaneous confidence bands over a rectangular covariate region for a linear regression model with k > 1 covariates, for which only conservative or approximate confidence bands are available in the statistical literature stretching back to Working & Hotelling (J. Amer. Statist. Assoc. 24, 1929; 73–85). Formulas of simultaneous confidence levels of the hyperbolic and constant width bands are provided. These involve only a k‐dimensional integral; it is unlikely that the simultaneous confidence levels can be expressed as an integral of less than k‐dimension. These formulas allow the construction for the first time of exact hyperbolic and constant width confidence bands for at least a small k (> 1) by using numerical quadrature. Comparison between the hyperbolic and constant width bands is then addressed under both the average width and minimum volume confidence set criteria. It is observed that the constant width band can be drastically less efficient than the hyperbolic band when k > 1. Finally it is pointed out how the methods given in this article can be applied to more general regression models such as fixed‐effect or random‐effect generalized linear regression models.

20.
Exact simultaneous confidence bands (SCBs) for a polynomial regression model are available only in some special situations. In this paper, simultaneous confidence levels for both hyperbolic and constant width bands for a polynomial function over a given interval are expressed as multidimensional integrals. The dimension of these integrals is equal to the degree of the polynomial. Hence the values can be calculated quickly and accurately via numerical quadrature provided that the degree of the polynomial is small (e.g. 2 or 3). This allows the construction of exact SCBs for quadratic and cubic regression functions over any given interval and for any given design matrix. Quadratic and cubic regressions are frequently used to characterise dose response relationships in addition to many other applications. Comparison between the hyperbolic and constant width bands under both the average width and minimum volume confidence set criteria shows that the constant width band can be much less efficient than the hyperbolic band. For hyperbolic bands, comparison between the exact critical constant and conservative or approximate critical constants indicates that the exact critical constant can be substantially smaller than the conservative or approximate critical constants. Numerical examples from a dose response study are used to illustrate the methods.
