Found 20 similar documents (search time: 109 ms)

In this article, we propose a penalized local log-likelihood method that locally selects the number of components in nonparametric finite mixtures of regression models via a proportion shrinkage method. Mean functions and variance functions are estimated simultaneously. We show that the number of components can be estimated consistently, and further establish asymptotic normality of the functional estimates. We use a modified EM algorithm to estimate the unknown functions. Simulations are conducted to demonstrate the performance of the proposed method. We illustrate our method via an empirical analysis of housing price index data for the United States.

We develop functional data analysis techniques using the differential geometry of a manifold of smooth elastic functions on an interval in which the functions are represented by a log-speed function and an angle function. The manifold's geometry provides a method for computing a sample mean function and principal components on tangent spaces. Using tangent principal component analysis, we estimate probability models for functional data and apply them to functional analysis of variance, discriminant analysis, and clustering. We demonstrate these tasks using a collection of growth curves of children aged 1–18.

Abstract. A flexible semi-parametric regression model is proposed for modelling the relationship between a response and multivariate predictor variables. The proposed multiple-index model includes smooth unknown link and variance functions that are estimated non-parametrically. Data-adaptive methods for automatic smoothing parameter selection and for the choice of the number of indices M are considered. This model adapts to complex data structures and provides efficient adaptive estimation through the variance function component in the sense that the asymptotic distribution is the same as if the non-parametric components were known. We develop iterative estimation schemes, which include a constrained projection method for the case where the regression parameter vectors are mutually orthogonal. The proposed methods are illustrated with the analysis of data from a growth bioassay and a reproduction experiment with medflies. Asymptotic properties of the estimated model components are also obtained.

Generalized additive mixed models are proposed for overdispersed and correlated data, which arise frequently in studies involving clustered, hierarchical and spatial designs. This class of models allows flexible functional dependence of an outcome variable on covariates by using nonparametric regression, while accounting for correlation between observations by using random effects. We estimate nonparametric functions by using smoothing splines and jointly estimate smoothing parameters and variance components by using marginal quasi-likelihood. Because maximizing the objective functions often requires numerical integration, double penalized quasi-likelihood is proposed to make approximate inference. Frequentist and Bayesian inferences are compared. A key feature of the proposed method is that it allows us to make systematic inference on all model components within a unified parametric mixed model framework and can be easily implemented by fitting a working generalized linear mixed model by using existing statistical software. A bias correction procedure is also proposed to improve the performance of double penalized quasi-likelihood for sparse data. We illustrate the method with an application to infectious disease data and we evaluate its performance through simulation.

Abstract. Imagine we have two different samples and are interested in doing semi- or non-parametric regression analysis in each of them, possibly on the same model. In this paper, we consider the problem of testing whether a specific covariate has different impacts on the regression curve in these two samples. We compare the regression curves of different samples but are interested in specific differences instead of testing for equality of the whole regression function. Our procedure allows for random designs, different sample sizes, different variance functions, different sets of regressors with different impact functions, etc. As we use the marginal integration approach, this method can be applied to any strong, weak or latent separable model as well as to additive interaction models to compare the lower dimensional separable components between the different samples. Thus, for separable models, our procedure includes the possibility of comparing the whole regression curves, thereby avoiding the curse of dimensionality. It is shown that the bootstrap fails in theory and practice. Therefore, we propose a subsampling procedure with automatic choice of subsample size. We present a complete asymptotic theory and an extensive simulation study.

In this article, we consider Bayesian inference for heteroscedastic nonparametric regression models when both the mean function and the variance function are unknown. We demonstrate consistency of the posterior distributions for this model using priors induced by B-spline expansions, treating both random and deterministic covariates in a uniform manner.

Summary. We introduce a flexible marginal modelling approach for statistical inference for clustered and longitudinal data under minimal assumptions. This estimated estimating equations approach is semiparametric and the proposed models are fitted by quasi-likelihood regression, where the unknown marginal means are a function of the fixed effects linear predictor with unknown smooth link, and variance–covariance is an unknown smooth function of the marginal means. We propose to estimate the nonparametric link and variance–covariance functions via smoothing methods, whereas the regression parameters are obtained via the estimated estimating equations. These are score equations that contain nonparametric function estimates. The proposed estimated estimating equations approach is motivated by its flexibility and easy implementation. Moreover, if data follow a generalized linear mixed model, with either a specified or an unspecified distribution of random effects and link function, the model proposed emerges as the corresponding marginal (population-average) version and can be used to obtain inference for the fixed effects in the underlying generalized linear mixed model, without the need to specify any other components of this generalized linear mixed model. Among marginal models, the estimated estimating equations approach provides a flexible alternative to modelling with generalized estimating equations. Applications of estimated estimating equations include diagnostics and link selection. The asymptotic distribution of the proposed estimators for the model parameters is derived, enabling statistical inference. Practical illustrations include Poisson modelling of repeated epileptic seizure counts and simulations for clustered binomial responses.

Summary. We present a technique for extending generalized linear models to the situation where some of the predictor variables are observations from a curve or function. The technique is particularly useful when only fragments of each curve have been observed. We demonstrate, on both simulated and real data sets, how this approach can be used to perform linear, logistic and censored regression with functional predictors. In addition, we show how functional principal components can be used to gain insight into the relationship between the response and functional predictors. Finally, we extend the methodology to apply generalized linear models and principal components to standard missing data problems.

Univariate time series often take the form of a collection of curves observed sequentially over time. Examples of these include hourly ground-level ozone concentration curves. These curves can be viewed as a time series of functions observed at equally spaced intervals over a dense grid. Since functional time series may contain various types of outliers, we introduce a robust functional time series forecasting method to down-weight the influence of outliers in forecasting. Through a robust principal component analysis based on projection pursuit, a time series of functions can be decomposed into a set of robust dynamic functional principal components and their associated scores. Conditioning on the estimated functional principal components, the crux of the curve-forecasting problem lies in modelling and forecasting principal component scores, through a robust vector autoregressive forecasting method. In a simulation study and an empirical study on forecasting ground-level ozone concentration, the robust method achieves superior forecast accuracy within dynamic functional principal component regression. The robust method also estimates the parameters of the vector autoregressive models for the principal component scores more accurately, and thus improves curve forecast accuracy.

We propose a flexible functional approach for modelling generalized longitudinal data and survival time using principal components. In the proposed model the longitudinal observations can be continuous or categorical data, such as Gaussian, binomial or Poisson outcomes. We generalize traditional joint models, which treat categorical data, such as CD4 counts, as continuous data after some transformation. The proposed model is data-adaptive: it does not require pre-specified functional forms for longitudinal trajectories and automatically detects characteristic patterns. The longitudinal trajectories observed with measurement error or random error are represented by flexible basis functions through a possibly nonlinear link function, combining dimension reduction techniques resulting from functional principal component (FPC) analysis. The relationship between the longitudinal process and event history is assessed using a Cox regression model. Although the proposed model inherits the flexibility of non-parametric methods, the estimation procedure based on the EM algorithm is still parametric in computation, and thus simple and easy to implement. The computation is simplified by dimension reduction for random coefficients or FPC scores. An iterative selection procedure based on the Akaike information criterion (AIC) is proposed to choose the tuning parameters, such as the knots of the spline basis and the number of FPCs, so that an appropriate degree of smoothness and fluctuation can be addressed. The effectiveness of the proposed approach is illustrated through a simulation study, followed by an application to longitudinal CD4 counts and survival data which were collected in a recent clinical trial to compare the efficiency and safety of two antiretroviral drugs.

We consider semiparametric additive regression models with a linear parametric part and a nonparametric part, both involving multivariate covariates. For the nonparametric part we assume two models. In the first, the regression function is unspecified and smooth; in the second, the regression function is additive with smooth components. Depending on the model, the regression curve is estimated by suitable least squares methods. The resulting residual-based empirical distribution function is shown to differ from the error-based empirical distribution function by an additive expression, up to a uniformly negligible remainder term. This result implies a functional central limit theorem for the residual-based empirical distribution function. It is used to test for normal errors.

This paper describes inference methods for functional data under the assumption that the functional data of interest are smooth latent functions, characterized by a Gaussian process, which have been observed with noise over a finite set of time points. The methods we propose are completely specified in a Bayesian environment that allows for all inferences to be performed through a simple Gibbs sampler. Our main focus is on estimating and describing uncertainty in the covariance function. However, these models also encompass functional data estimation, functional regression where the predictors are latent functions, and an automatic approach to smoothing parameter selection. Furthermore, these models require minimal assumptions on the data structure as the time points for observations do not need to be equally spaced, the number and placement of observations are allowed to vary among functions, and special treatment is not required when the number of functional observations is less than the dimensionality of those observations. We illustrate the effectiveness of these models in estimating latent functional data, capturing variation in the functional covariance estimate, and selecting appropriate smoothing parameters in both a simulation study and a regression analysis of medfly fertility data.

In this article, we discuss the estimation of the parameter function for a functional logistic regression model in the presence of outliers. We consider ways of making the parameter estimator resistant to outliers, in addition to minimizing multicollinearity and reducing the high dimensionality that is inherent in functional data. To achieve this, the functional covariates and functional parameter of the model are approximated in a finite-dimensional space generated by an appropriate basis. This approach reduces the functional model to a standard multiple logistic model with highly collinear covariates and potential high-dimensionality issues. The proposed estimator tackles these issues and also minimizes the effect of functional outliers. Results from a simulation study and a real-world example are also presented to illustrate the performance of the proposed estimator.

For nonparametric regression models with fixed and random design, two classes of estimators for the error variance have been introduced: second sample moments based on residuals from a nonparametric fit, and difference-based estimators. The former are asymptotically optimal but require estimating the regression function; the latter are simple but have larger asymptotic variance. For nonparametric regression models with random covariates, we introduce a class of estimators for the error variance that are related to difference-based estimators: covariate-matched U-statistics. We give conditions on the random weights involved that lead to asymptotically optimal estimators of the error variance. Our explicit construction of the weights uses a kernel estimator for the covariate density.
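To make the term "difference-based estimator" concrete, the following is a minimal sketch of the classical first-order difference estimator of the error variance (often attributed to Rice), which averages squared differences of responses at neighbouring covariate values and divides by two. It is not the covariate-matched U-statistic introduced in the abstract above. The function names `rice_variance` and `simulate`, the sine mean function, and all parameter values are assumptions chosen purely for illustration.

```python
import math
import random


def rice_variance(xs, ys):
    """First-order difference-based estimate of the error variance.

    Illustrative helper (hypothetical name): sorts the responses by
    covariate, then returns sum((y[i+1] - y[i])**2) / (2 * (n - 1)).
    """
    # Sort by covariate so that neighbouring responses share nearly the
    # same mean value and their difference is dominated by noise.
    ordered = [y for _, y in sorted(zip(xs, ys))]
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    diffs = (ordered[i + 1] - ordered[i] for i in range(n - 1))
    return sum(d * d for d in diffs) / (2 * (n - 1))


def simulate(n=2000, sigma=0.5, seed=0):
    """Draw (x, y) with y = sin(2*pi*x) + Gaussian noise (assumed model)."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


xs, ys = simulate()
estimate = rice_variance(xs, ys)  # should be close to sigma**2 = 0.25
```

No regression function is estimated here, which is the simplicity the abstract refers to; the price, as the abstract notes, is a larger asymptotic variance than that of residual-based estimators.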

Shi, Wang, Murray-Smith and Titterington (Biometrics 63:714–723, 2007) proposed a Gaussian process functional regression (GPFR) model to model functional response curves with a set of functional covariates. Two main problems are addressed by their method: modelling nonlinear and nonparametric regression relationships, and modelling the covariance structure and mean structure simultaneously. The method gives very good results for curve fitting and prediction but side-steps the problem of heterogeneity. In this paper we present a new method for modelling functional data with ‘spatially’ indexed data, i.e., the heterogeneity is dependent on factors such as region and individual patient’s information. For data collected from different sources, we assume that the data corresponding to each curve (or batch) follow a Gaussian process functional regression model as a lower-level model, and introduce an allocation model for the latent indicator variables as a higher-level model. This higher-level model is dependent on the information related to each batch. This method takes advantage of both GPFR and mixture models and therefore improves the accuracy of predictions. The mixture model is also used for curve clustering, with the focus on clustering functional relationships between the response curve and covariates, i.e., the clustering is based on the surface shape of the functional response against the set of functional covariates. The model is examined on simulated data and real data.

When analysing a contingency table, it is often worth relating the probabilities that a given individual falls into different cells to a set of predictors. These conditional probabilities are usually estimated using appropriate regression techniques. In this paper, a semiparametric model is developed. Essentially, it is only assumed that the effect of the vector of covariates on the probabilities can be captured entirely by a single index, which is a linear combination of the initial covariates. The estimation is then twofold: the coefficients of the linear combination and the functions linking this index to the related conditional probabilities have to be estimated. Inspired by the estimation procedures already proposed in the literature for single-index regression models, four estimators of the index coefficients are proposed and compared, both theoretically and in practice, with the aid of simulations. Estimation of the link functions is also addressed.

The aim of this article is to improve the quality of cookie production by classifying cookies as good or bad from the resistance curves of the dough observed during the kneading process. As the predictor variable is functional, functional classification methodologies such as functional logit regression and functional discriminant analysis are considered. A P-spline approximation of the sample curves is proposed to improve the classification ability of these models and to suitably estimate the relationship between the quality of the cookies and the resistance of the dough. Inference results on the functional parameters and related odds ratios are obtained using the asymptotic normality of the maximum likelihood estimators under the classical regularity conditions. Finally, the classification results are compared with alternative functional data analysis approaches such as componentwise classification on the logit regression model.

Nonparametric regression methods have been widely studied in functional regression analysis for functional covariates with a univariate response, but this is not the case for functional covariates with a multivariate response. In this paper, we present two new solutions for the latter problem: the first directly extends the nonparametric method for a univariate response to a multivariate response, and the second incorporates the correlation among the different responses into the model. The asymptotic properties of the estimators are studied, and the effectiveness of the proposed methods is demonstrated through several simulation studies and a real data example.

Flexible regression is a traditional motivation for the development of non-parametric Bayesian models. A popular approach for this involves a joint model for responses and covariates, from which the desired result arises by conditioning on the covariates. Many such models involve the convolution of a continuous kernel with some discrete random probability measure defined as an infinite mixture of i.i.d. atoms. Following this strategy, we propose a flexible model that involves the concept of repulsion between atoms. We show that this results in a more parsimonious representation of the regression than the i.i.d. counterpart. The key aspect is that repulsion discourages mixture components that are near each other, thus favouring parsimony. We show that the conditional model retains the repulsive features, thus facilitating interpretation of the resulting flexible regression, and with little or no sacrifice of model fit compared to the infinite mixture case. We show the utility of the methodology by way of a small simulation study and an application to a well-known data set.

Fong, Daniel Y.T.; Lam, K.F.; Lawless, J.F.; Lee, Y.W. Lifetime Data Analysis, 2001, 7(4): 345-362
We consider recurrent event data when the duration or gap times between successive event occurrences are of intrinsic interest. Subject heterogeneity not attributed to observed covariates is usually handled by random effects which result in an exchangeable correlation structure for the gap times of a subject. Recently, efforts have been put into relaxing this restriction to allow non-exchangeable correlation. Here we consider dynamic models where random effects can vary stochastically over the gap times. We extend the traditional Gaussian variance components models and evaluate a previously proposed proportional hazards model through a simulation study and some examples. In addition, semiparametric estimation of the proportional hazards models is considered. Both models are easy to use. The Gaussian models are easily interpreted in terms of the variance structure. On the other hand, the proportional hazards models would be more appropriate in the context of survival analysis, particularly in the interpretation of the regression parameters. They can be sensitive to the choice of model for random effects but not to the choice of the baseline hazard function.
