Similar literature
 Found 20 similar articles (search time: 46 ms)
1.
In this article, we consider the problem of selecting functional variables using L1 regularization in a functional linear regression model with a scalar response and functional predictors, in the presence of outliers. Since the LASSO is a special case of penalized least-squares regression with an L1 penalty function, it suffers from heavy-tailed errors and/or outliers in the data. Recently, the Least Absolute Deviation (LAD) and LASSO methods have been combined (the LAD-LASSO regression method) to carry out robust parameter estimation and variable selection simultaneously for a multiple linear regression model. However, variable selection of the functional predictors based on the LASSO fails since multiple parameters exist for each functional predictor. Therefore, the group LASSO is used for selecting functional predictors, since it selects grouped variables rather than individual variables. In this study, we propose a robust functional predictor selection method, the LAD-group LASSO, for a functional linear regression model with a scalar response and functional predictors. We illustrate the performance of the LAD-group LASSO on both simulated and real data.
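
As a rough sketch of the kind of objective this abstract describes (not necessarily the authors' exact formulation), suppose each functional predictor is expanded in a basis, so that predictor j contributes a coefficient vector b_j of length m_j and a score vector z_ij for observation i. These symbols, and the group weights sqrt(m_j), are assumptions introduced here for illustration. An LAD-group LASSO criterion then combines an absolute-deviation loss with a group penalty:

```latex
\[
\min_{\beta_0,\, b_1,\dots,b_p}\;
\sum_{i=1}^{n} \Bigl| y_i - \beta_0 - \sum_{j=1}^{p} z_{ij}^{\top} b_j \Bigr|
\;+\; \lambda \sum_{j=1}^{p} \sqrt{m_j}\, \lVert b_j \rVert_2
\]
```

Because the penalty acts on the Euclidean norm of each b_j, a whole functional predictor is dropped when its coefficient vector is shrunk to zero, while the absolute-value loss limits the influence of outlying responses.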

2.
We propose a general framework for regression models with functional response containing a potentially large number of flexible effects of functional and scalar covariates. Special emphasis is put on historical functional effects, where functional response and functional covariate are observed over the same interval and the response is only influenced by covariate values up to the current grid point. Historical functional effects are mostly used when functional response and covariate are observed on a common time interval, as they account for chronology. Our formulation allows for flexible integration limits including, e.g., lead or lag times. The functional responses can be observed on irregular curve-specific grids. Additionally, we introduce different parameterizations for historical effects and discuss identifiability issues. The models are estimated by a component-wise gradient boosting algorithm which is suitable for models with a potentially high number of covariate effects, even more than observations, and inherently does model selection. By minimizing corresponding loss functions, different features of the conditional response distribution can be modeled, including generalized and quantile regression models as special cases. The methods are implemented in the open-source R package FDboost. The methodological developments are motivated by biotechnological data on Escherichia coli fermentations, but cover a much broader model class.

3.
The functional linear model is of great practical importance, as exemplified by applications in high-throughput studies such as meteorological and biomedical research. In this paper, we propose a new functional variable selection procedure, called functional variable selection via Gram–Schmidt (FGS) orthogonalization, for a functional linear model with a scalar response and multiple functional predictors. Instead of using regularization methods, FGS takes into account the similarity between the functional predictors in a data-driven way and uses Gram–Schmidt orthogonalization to remove the irrelevant predictors. FGS can successfully discriminate between the relevant and the irrelevant functional predictors to achieve a high true positive ratio without including many irrelevant predictors, and it yields explainable models, which offers a new perspective on variable selection in the functional linear model. Simulation studies are carried out to evaluate the finite sample performance of the proposed method, and a weather data set is also analysed.

4.
This paper considers nonlinear regression analysis with a scalar response and multiple predictors. An unknown regression function is approximated by radial basis function models. The coefficients are estimated in the context of M-estimation. It is known that ordinary M-estimation leads to overfitting in nonlinear regression. The purpose of this paper is to construct a smooth estimator. The proposed method is a two-step procedure. First, sufficient dimension reduction methods are applied to the response and radial basis functions to transform the large number of radial bases into a small number of linear combinations of the radial bases without loss of information. In the second step, a multiple linear regression model between the response and the transformed radial bases is assumed and ordinary M-estimation is applied. Thus, the final estimator is also obtained as a linear combination of radial bases. The validity and an asymptotic study of the proposed method are explored. A simulation and a data example are used to confirm the behavior of the proposed method.

5.
In this paper, we investigate the effect of pre-smoothing on model selection. Christóbal et al. (Christóbal, J. A., Faraldo Roca, P. and González Manteiga, W. 1987. A class of linear regression parameter estimators constructed by nonparametric estimation. Ann. Statist., 15: 603–609) showed the beneficial effect of pre-smoothing on estimating the parameters in a linear regression model. Here, in a regression setting, we show that smoothing the response data prior to model selection by Akaike's information criterion can lead to an improved selection procedure. The bootstrap is used to control the magnitude of the random error structure in the smoothed data. The effect of pre-smoothing on model selection is shown in simulations. The method is illustrated in a variety of settings, including the selection of the best fractional polynomial in a generalized linear model.

6.

This paper is motivated by our collaborative research, and the aim is to model clinical assessments of upper limb function after stroke using 3D-position and 4D-orientation movement data. We present a new nonlinear mixed-effects scalar-on-function regression model with a Gaussian process prior, focusing on variable selection from a large number of candidates including both scalar and functional variables. A novel variable selection algorithm has been developed, namely functional least angle regression. As it is essential for this algorithm, we studied the representation of functional variables with different methods and the correlation between a scalar and a group of mixed scalar and functional variables. We also propose a new stopping rule for practical use. This algorithm is efficient and accurate for both variable selection and parameter estimation even when the number of functional variables is very large and the variables are correlated, and thus the prediction provided by the algorithm is accurate. Our comprehensive simulation study showed that the method is superior to other existing variable selection methods. When the algorithm was applied to the analysis of the movement data, the use of the nonlinear random-effect model and the functional variables significantly improved the prediction accuracy for the clinical assessment.


7.
In this paper, we focus on the variable selection problem in normal regression models using the expected-posterior prior methodology. We provide a straightforward MCMC scheme for the derivation of the posterior distribution, as well as Monte Carlo estimates for the computation of the marginal likelihood and posterior model probabilities. Additionally, for large model spaces, a model search algorithm based on $\mathit{MC}^{3}$ is constructed. The proposed methodology is applied in two real-life examples already used in the relevant literature on objective variable selection. In both examples, uncertainty over different training samples is taken into consideration.

8.
9.
In this paper, we investigate nonparametric estimation of the conditional density of a scalar response variable given a random variable taking values in a separable Hilbert space. Under general conditions, we establish the uniform almost complete convergence rates and the asymptotic normality of the conditional density kernel estimator, based on the single-index structure, when the variables satisfy the strong mixing dependency. The asymptotic \((1-\zeta )\) confidence intervals of the conditional density function are given, for \(0 < \zeta < 1\). We further demonstrate the impact of this functional parameter on the conditional mode estimate. A simulation study is also presented. Finally, the estimation of the functional index via the pseudo-maximum likelihood method is discussed, but not tackled.

10.
This article addresses the problem of confidence band construction for a standard multiple linear regression model. An “independence point” method of construction is developed which generalizes the method of Gafarian (1964) for a simple linear regression model to a multiple linear regression model. Wynn (Wynn, H. P. (1984). An exact confidence band for one-dimensional polynomial regression. Biometrika 71: 375–9) pioneered the approach of basing confidence bands for a polynomial regression on a set of nodes where the function estimates are independent, and this approach is exploited in this article. This method requires only critical points from t-distributions so that the confidence bands are easy to construct. Both one-sided and two-sided confidence bands can be constructed using this method. An illustration of the new method is provided, and comparisons are made with other procedures.

11.
Research on multiple comparisons during the past 60 years or so has focused mainly on the comparison of several population means. Spurrier (J Am Stat Assoc 94:483–488, 1999) and Liu et al. (J Am Stat Assoc 99:395–403, 2004) considered the multiple comparison of several linear regression lines. They assumed that there was no functional relationship between the predictor variables. In the polynomial regression model, however, a functional relationship between the predictor variables does exist, and failing to fully exploit it may have some undesirable consequences. In this article we introduce an exact method for the multiple comparison of several polynomial regression models. This method takes full advantage of the structure of the polynomial regression model, and therefore it can quickly and accurately compute the critical constant. The proposed method allows various types of comparisons, including pairwise, many-to-one and successive, and it also allows the predictor variable to be either unconstrained or constrained to a finite interval. Examples from a dose-response study are used to illustrate the method. MATLAB programs have been written for easy implementation of this method.

12.
For the problem of variable selection for the normal linear model, fixed penalty selection criteria such as AIC, $C_p$, BIC and RIC correspond to the posterior modes of a hierarchical Bayes model for various fixed hyperparameter settings. Adaptive selection criteria obtained by empirical Bayes estimation of the hyperparameters have been shown by George and Foster [2000. Calibration and Empirical Bayes variable selection. Biometrika 87(4), 731–747] to improve on these fixed selection criteria. In this paper, we study the potential of alternative fully Bayes methods, which instead margin out the hyperparameters with respect to prior distributions. Several structured prior formulations are considered for which fully Bayes selection and estimation methods are obtained. Analytical and simulation comparisons with empirical Bayes counterparts are studied.
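
For orientation, the fixed-penalty criteria named in this abstract are commonly written in a single penalized form. The notation below (a candidate model γ with q_γ predictors, residual sum of squares RSS_γ, an error-variance estimate, n observations, p candidate predictors, and a penalty constant F) is introduced here for illustration; it is a sketch of the standard presentation, not a quotation from the paper:

```latex
\[
\hat{\gamma} \;=\; \arg\min_{\gamma}
\left\{ \frac{\mathrm{RSS}_{\gamma}}{\hat{\sigma}^{2}} + F\, q_{\gamma} \right\},
\qquad
F = 2 \;\;(\text{AIC},\, C_p), \quad
F = \log n \;\;(\text{BIC}), \quad
F = 2\log p \;\;(\text{RIC}).
\]
```

Each criterion fixes F in advance; the empirical Bayes and fully Bayes approaches discussed in the abstract instead let the data inform the effective penalty.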

13.
One of the most important issues in using neural networks for the analysis of real-world problems is the input variable selection problem. This article connects input variable selection with multiple testing in neural network regression models. In the proposed procedure, the number and the type of input neurons are selected by means of a testing scheme, based on appropriate measures of relevance of a given input variable to the model. In order to avoid the data snooping problem, the family-wise error rate is controlled by using the StepM method proposed by Romano and Wolf (Romano, J. P., Wolf, M. (2005). Exact and approximate stepdown methods for multiple hypothesis testing. J. Amer. Statist. Assoc. 100: 94–108). The testing procedure is calibrated by using subsampling, which is shown to deliver consistent results under weak assumptions on the data generating process and on the structure of the neural network model.

14.
Linear regression models with autoregressive moving average (ARMA) errors (REGARMA models) are often considered in order to reflect serial correlation among observations. In this article, we focus on an adaptive least absolute shrinkage and selection operator (LASSO) method (ALASSO) for variable selection in REGARMA models and extend it to linear regression models with ARMA-generalized autoregressive conditional heteroskedasticity (ARMA-GARCH) errors (REGARMA-GARCH models). This is an extension of the existing ALASSO method for linear regression models with AR errors (REGAR models) proposed by Wang et al. in 2007 (Wang, H., Li, G., Tsai, C. (2007). Regression coefficient and autoregressive order shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B 69: 63–78). New ALASSO algorithms are proposed to determine important predictors for the REGARMA and REGARMA-GARCH models. Finally, we provide simulation results and a real data analysis to illustrate our findings.

15.
This paper considers the application of a Stein-type estimation procedure for the coefficients in a linear regression model when data are available from a replicated experiment. Two families of estimators characterized by a single scalar are proposed and their large sample asymptotic properties are derived. These are used to compare the performance of the two estimators with the conventional estimator, and conditions for the superiority of one estimator over the other are deduced.

16.
In this paper, under a nonparametric regression model, we introduce two families of robust procedures to estimate the regression function when missing data occur in the response. The first proposal is based on a local MM-functional applied to the conditional distribution function estimate adapted to the presence of missing data. The second proposal imputes the missing responses using the local MM-smoother based on the observed sample and then estimates the regression function with the completed sample. We show that the robust procedures considered are consistent and asymptotically normally distributed. A robust procedure to select the smoothing parameter is also discussed.

17.
In the logistic regression model, the variance of the maximum likelihood estimator is inflated and unstable when multicollinearity exists in the data. Several methods are available in the literature to overcome this problem. We propose a new stochastic restricted biased estimator. We study the statistical properties of the proposed estimator and compare its performance with some existing estimators in the sense of the scalar mean squared error criterion. An example and a simulation study are provided to illustrate the performance of the proposed estimator.
Keywords: Logistic regression, maximum likelihood estimator, mean squared error matrix, ridge regression, simulation study, stochastic restricted estimator
Mathematics Subject Classifications: Primary 62J05, Secondary 62J07

18.
One of the standard problems in statistics consists of determining the relationship between a response variable and a single predictor variable through a regression function. Background scientific knowledge is often available that suggests that the regression function should have a certain shape (e.g. monotonically increasing or concave) but not necessarily a specific parametric form. Bernstein polynomials have been used to impose certain shape restrictions on regression functions. The Bernstein polynomials are known to provide a smooth estimate over equidistant knots. Bernstein polynomials are used in this paper due to their ease of implementation, continuous differentiability, and theoretical properties. In this work, we demonstrate a connection between the monotonic regression problem and the variable selection problem in the linear model. We develop a Bayesian procedure for fitting the monotonic regression model by adapting currently available variable selection procedures. We demonstrate the effectiveness of our method through simulations and the analysis of real data.
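
The shape restriction mentioned in this abstract can be illustrated with a minimal Python sketch, assuming a predictor rescaled to [0, 1]. A Bernstein polynomial whose coefficients are non-decreasing is itself non-decreasing, so monotonicity can be imposed by building the coefficients from non-negative increments. The function names below are hypothetical and the code is not the paper's Bayesian procedure; it only shows the constraint.

```python
from math import comb


def bernstein_basis(k, n, x):
    """k-th Bernstein basis polynomial of degree n, evaluated at x in [0, 1]."""
    return comb(n, k) * x**k * (1 - x) ** (n - k)


def bernstein_curve(coefs, x):
    """Evaluate sum_k coefs[k] * B_{k,n}(x), where n = len(coefs) - 1."""
    n = len(coefs) - 1
    return sum(c * bernstein_basis(k, n, x) for k, c in enumerate(coefs))


def monotone_coefs(increments, intercept=0.0):
    """Build non-decreasing coefficients from non-negative increments.

    A zero increment leaves the coefficient unchanged; only positive
    increments make the curve rise (an illustrative reading of the link
    to variable selection, not the paper's algorithm).
    """
    coefs = [intercept]
    for d in increments:
        if d < 0:
            raise ValueError("increments must be non-negative")
        coefs.append(coefs[-1] + d)
    return coefs


if __name__ == "__main__":
    coefs = monotone_coefs([0.5, 0.0, 1.0, 0.0, 2.0])
    grid = [i / 10 for i in range(11)]
    for x in grid:
        print(f"{x:.1f}  {bernstein_curve(coefs, x):.4f}")
```

The derivative of such a curve is a non-negative combination of the increments, which is why the constraint on the coefficients is sufficient for monotonicity.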

19.
Non-parametric regression models are developed when the predictor is a function-valued random variable \(X=\{X_t\}_{t\in T}\). Based on a representation of the regression function \(f(X)\) in a reproducing kernel Hilbert space, such models generalize the classical setting used in statistical learning theory. Two applications, corresponding to scalar and categorical response random variables, are performed on stock-exchange and medical data. The results of different regression models are compared.

20.
Frequently, the main objective of statistically designed simulation experiments is to estimate and validate regression metamodels, where the regressors are functions of the design variables and the dependent variable is the system response. In this article, a weighted least squares procedure for estimating the unknown parameters of a nonlinear regression metamodel is formulated and evaluated. Since the validity of a fitted regression model must be tested, a method for validating nonlinear regression simulation metamodels is presented. This method is a generalization of the cross-validation test proposed by Kleijnen (Kleijnen, J. P. C. (1983). Cross-validation using the t statistic. European Journal of Operational Research 13: 133–141) in the context of linear regression metamodels. One drawback of the cross-validation strategy is the need to perform a large number of nonlinear regressions if the number of experimental points is large. In this article, cross-validation is implemented using only one nonlinear regression. The proposed statistical analysis allows us to obtain Scheffé-type simultaneous confidence intervals for linear combinations of the metamodel's unknown parameters. Using the well-known M/M/1 example, a metamodel is built and validated with the aid of the proposed procedure.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.), ICP registration no. 京ICP备09084417号