1.

Modeling association between multivariate correlated outcomes and high-dimensional sparse covariates: the adaptive SVS method

J. Pecanka A. W. van der Vaart 《Journal of applied statistics》2019,46(5):893-913

The problem of modeling the relationship between a set of covariates and a multivariate response with correlated components arises in many areas of research, such as genetics, psychometrics, and signal processing. In the linear regression framework, such a task can be addressed using a number of existing methods. In the high-dimensional sparse setting, most of these methods rely on the idea of penalization in order to efficiently estimate the regression matrix. Examples of such methods include the lasso, the group lasso, the adaptive group lasso and the simultaneous variable selection (SVS) method. Crucially, a suitably chosen penalty also allows for an efficient exploitation of the correlation structure within the multivariate response. In this paper we introduce a novel variant of such a method called the adaptive SVS, which is closely linked with the adaptive group lasso. Via a simulation study we investigate its performance in the high-dimensional sparse regression setting. We provide a comparison with a number of other popular methods under different scenarios and show that the adaptive SVS is a powerful tool for efficient recovery of signal in such a setting. The methods are applied to genetic data.

2.

Adaptive Lasso Variable Selection for the Accelerated Failure Models

Xiaoguang Wang Lixin Song 《Communications in Statistics - Theory and Methods》2013,42(24):4372-4386

This article considers the adaptive lasso procedure for the accelerated failure time model with multiple covariates based on the weighted least squares method, which uses Kaplan-Meier weights to account for censoring. The adaptive lasso method can complete the variable selection and model estimation simultaneously. Under some mild conditions, the estimator is shown to have sparse and oracle properties. We use the Bayesian Information Criterion (BIC) for tuning parameter selection, and a bootstrap variance approach for standard errors. Simulation studies and two real data examples are carried out to investigate the performance of the proposed method.

3.

Variable Selection for Semiparametric Varying Coefficient Partially Linear Errors-in-Variables (EV) Model with Missing Response

Hu Yang 《Communications in Statistics - Theory and Methods》2013,42(21):4521-4539

This paper focuses on variable selection for the semiparametric varying coefficient partially linear model when the covariates are measured with additive errors and the response is missing. An adaptive lasso estimator for the parameters is proposed, with the smoothly clipped absolute deviation estimator as a comparison. With a proper selection of the regularization parameter, the sampling properties including the consistency of the two procedures and the oracle properties are established. Furthermore, the algorithms and corresponding standard error formulas are discussed. A simulation study is carried out to assess the finite sample performance of the proposed methods.

4.

Prediction Error Property of the Lasso Estimator and its Generalization

Fuchun Huang 《Australian & New Zealand Journal of Statistics》2003,45(2):217-228

The lasso procedure is an estimator‐shrinkage and variable selection method. This paper shows that there always exists an interval of tuning parameter values such that the corresponding mean squared prediction error for the lasso estimator is smaller than for the ordinary least squares estimator. For an estimator satisfying some condition such as unbiasedness, the paper defines the corresponding generalized lasso estimator. Its mean squared prediction error is shown to be smaller than that of the estimator for values of the tuning parameter in some interval. This implies that all unbiased estimators are not admissible. Simulation results for five models support the theoretical results.
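
As a minimal sketch of the shrinkage described in this abstract, assume an orthonormal design and the objective 0.5 * ||y - Xb||^2 + lam * ||b||_1, in which case the lasso reduces to soft-thresholding each ordinary least squares coefficient. The function name, the coefficient values, and the tuning value 0.5 are illustrative assumptions and do not come from the paper.

```python
def soft_threshold(z, lam):
    """Lasso estimate of one coefficient under an orthonormal design.

    z   : ordinary least squares estimate of the coefficient
    lam : nonnegative tuning parameter (assumed scaling: 0.5 * RSS + lam * |b|)
    """
    if z > lam:
        return z - lam   # shrink positive estimates toward zero
    if z < -lam:
        return z + lam   # shrink negative estimates toward zero
    return 0.0           # small estimates are set exactly to zero


# Hypothetical OLS estimates, used only for illustration.
ols_estimates = [3.0, -0.4, 1.2, 0.1]
lasso_estimates = [soft_threshold(b, 0.5) for b in ols_estimates]
print(lasso_estimates)
```

Large coefficients are shrunk by the tuning value and small ones are set exactly to zero, which is why the lasso performs shrinkage and variable selection at the same time.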

5.

Lasso-type estimation for covariate-adjusted linear model

Feng Li Yiqiang Lu 《Journal of applied statistics》2018,45(1):26-42

The lasso has been widely used for variable selection in recent years. In this paper, lasso-type penalty functions including lasso and adaptive lasso are employed for simultaneous variable selection and parameter estimation in the covariate-adjusted linear model, where the predictors and response cannot be observed directly and are distorted by some observable covariate through some unknown multiplicative smooth functions. Estimation procedures are proposed and some asymptotic properties are obtained under some mild conditions. It is worth noting that under appropriate conditions, the adaptive lasso estimator correctly selects covariates with nonzero coefficients with probability converging to one and that the estimators of nonzero coefficients have the same asymptotic distribution that they would have if the zero coefficients were known in advance, i.e. the adaptive lasso estimator has the oracle property in the sense of Fan and Li [6]. Simulation studies are carried out to examine its performance in finite sample situations and the Boston Housing data is analyzed for illustration.
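
The adaptive weighting behind the adaptive lasso can be sketched as follows, assuming an orthonormal design, the objective 0.5 * ||y - Xb||^2 + lam * sum_j w_j * |b_j|, and weights w_j = 1 / |b_init_j|^gamma built from an initial estimate. This is not the covariate-adjusted procedure of the paper; the function name and the numbers are hypothetical.

```python
def adaptive_lasso_orthonormal(b_init, lam, gamma=1.0):
    """Adaptive lasso estimates under an orthonormal design (illustrative).

    b_init : initial (e.g. least squares) coefficient estimates
    lam    : nonnegative tuning parameter
    gamma  : positive exponent controlling how strongly weights adapt
    """
    estimates = []
    for b in b_init:
        if b == 0.0:
            estimates.append(0.0)      # infinite weight: coefficient stays zero
            continue
        weight = 1.0 / abs(b) ** gamma  # small initial estimates get large penalties
        shrunk = max(abs(b) - lam * weight, 0.0)
        if shrunk == 0.0:
            estimates.append(0.0)
        else:
            estimates.append(shrunk if b > 0 else -shrunk)
    return estimates


# Hypothetical initial estimates, used only for illustration.
print(adaptive_lasso_orthonormal([4.0, -0.5, 2.0], lam=1.0))
```

Compared with a single common threshold, the large coefficient 4.0 is barely shrunk while the small coefficient -0.5 is removed, which is the intuition behind the oracle property mentioned in the abstract.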

6.

Multivariate bandwidth selection for local linear regression

L. Yang & R. Tschernig 《Journal of the Royal Statistical Society. Series B, Statistical methodology》1999,61(4):793-815

The existence and properties of optimal bandwidths for multivariate local linear regression are established, using either a scalar bandwidth for all regressors or a diagonal bandwidth vector that has a different bandwidth for each regressor. Both involve functionals of the derivatives of the unknown multivariate regression function. Estimating these functionals is difficult primarily because they contain multivariate derivatives. In this paper, an estimator of the multivariate second derivative is obtained via local cubic regression with most cross-terms left out. This estimator has the optimal rate of convergence but is simpler and uses much less computing time than the full local estimator. Using this as a pilot estimator, we obtain plug-in formulae for the optimal bandwidth, both scalar and diagonal, for multivariate local linear regression. As a simpler alternative, we also provide rule-of-thumb bandwidth selectors. All these bandwidths have satisfactory performance in our simulation study.

7.

The influence function of penalized regression estimators

Viktoria Öllerer Christophe Croux Andreas Alfons 《Statistics》2015,49(4):741-765

To perform regression analysis in high dimensions, lasso and ridge estimation are common choices. However, it has been shown that these methods are not robust to outliers. Therefore, alternatives such as penalized M-estimation or the sparse least trimmed squares (LTS) estimator have been proposed. The robustness of these regression methods can be measured with the influence function. It quantifies the effect of infinitesimal perturbations in the data. Furthermore, it can be used to compute the asymptotic variance and the mean-squared error (MSE). In this paper we compute the influence function, the asymptotic variance and the MSE for penalized M-estimators and the sparse LTS estimator. The asymptotic biasedness of the estimators makes the calculations non-standard. We show that only M-estimators with a loss function with a bounded derivative are robust against regression outliers. In particular, the lasso has an unbounded influence function.

8.

Adaptive Elastic Net GMM Estimation With Many Invalid Moment Conditions: Simultaneous Model and Moment Selection

Mehmet Caner Xu Han Yoonseok Lee 《Journal of Business & Economic Statistics》2018,36(1):24-46

This article develops the adaptive elastic net generalized method of moments (GMM) estimator in large-dimensional models with potentially (locally) invalid moment conditions, where both the number of structural parameters and the number of moment conditions may increase with the sample size. The basic idea is to conduct the standard GMM estimation combined with two penalty terms: the adaptively weighted lasso shrinkage and the quadratic regularization. It is a one-step procedure of valid moment condition selection, nonzero structural parameter selection (i.e., model selection), and consistent estimation of the nonzero parameters. The procedure achieves the standard GMM efficiency bound as if we know the valid moment conditions ex ante, for which the quadratic regularization is important. We also study the tuning parameter choice, with which we show that selection consistency still holds without assuming Gaussianity. We apply the new estimation procedure to dynamic panel data models, where both the time and cross-section dimensions are large. The new estimator is robust to possible serial correlations in the regression error terms.

9.

Shape bias of robust covariance estimators: an empirical study

M. Hubert P. Rousseeuw K. Vakili 《Statistical Papers》2014,55(1):15-28

Detecting outliers in a multivariate point cloud is not trivial, especially when dealing with a sizable fraction of contamination. Over time, it has increasingly been recognized that the safest and most feasible approach to exposing outliers starts by computing a highly robust estimator of location and scatter that can withstand a large proportion of contamination. Many such estimators have been proposed in recent years. We will compare the worst-case bias of several prominent robust multivariate estimators by means of simulation. We also propose a new tool to compare robust estimators on real data sets, and illustrate it.

10.

Multivariate generalized Birnbaum–Saunders kernel density estimators

N. Zougab Y. Ziane S. Adjabi 《Communications in Statistics - Theory and Methods》2018,47(18):4534-4555

In this article, we first propose the classical multivariate generalized Birnbaum–Saunders kernel estimator for probability density function estimation in the context of multivariate nonnegative data. Then, we apply two multiplicative bias correction (MBC) techniques to the multivariate kernel density estimator. Some properties (bias, variance, and mean integrated squared error) of the corresponding estimators are also investigated. Finally, the performances of the classical and MBC estimators based on the family of generalized Birnbaum–Saunders kernels are illustrated by a simulation study.

11.

Construction of disease risk scoring systems using logistic group lasso: application to porcine reproductive and respiratory syndrome survey data

Hui Lin Peng Liu Derald J. Holtkamp 《Journal of applied statistics》2013,40(4):736-746

We propose to utilize the group lasso algorithm for logistic regression to construct a risk scoring system for predicting disease in swine. This work is motivated by the need to develop a risk scoring system from survey data on risk factors for porcine reproductive and respiratory syndrome (PRRS), which is a major health, production and financial problem for swine producers in nearly every country. Group lasso provides an attractive solution to this research question because of its ability to achieve group variable selection and stabilize parameter estimates at the same time. We propose to choose the penalty parameter for group lasso through leave-one-out cross-validation, using the criterion of the area under the receiver operating characteristic curve. Survey data for 896 swine breeding herd sites in the USA and Canada completed between March 2005 and March 2009 are used to construct the risk scoring system for predicting PRRS outbreaks in swine. We show that our scoring system for PRRS significantly improves on the current scoring system that is based on an expert opinion. We also show that our proposed scoring system is superior in terms of area under the curve to that developed using a multiple logistic regression model selected based on variable significance.
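
Group variable selection can be illustrated with the block soft-thresholding step, which shrinks a whole group of coefficients together and sets the entire group to zero when its Euclidean norm is small. The sketch assumes a least squares loss with an orthonormal design within the group, so it is only an analogue of, and not, the logistic group lasso algorithm used in the paper; the function name and the numbers are hypothetical.

```python
import math


def group_soft_threshold(z_group, lam):
    """Shrink one group of coefficients jointly (illustrative sketch).

    z_group : unpenalized estimates for the coefficients in one group
    lam     : nonnegative tuning parameter for this group
    """
    norm = math.sqrt(sum(z * z for z in z_group))
    if norm <= lam:
        # The whole group is dropped, e.g. all dummy variables of one survey item.
        return [0.0 for _ in z_group]
    scale = 1.0 - lam / norm  # common shrinkage factor for the group
    return [scale * z for z in z_group]


# Hypothetical groups, used only for illustration.
print(group_soft_threshold([3.0, 4.0], lam=2.5))   # kept, shrunk jointly
print(group_soft_threshold([0.3, 0.4], lam=2.5))   # removed as a group
```

Because the decision is made on the group norm, the levels of a categorical risk factor enter or leave the model together.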

12.

A generalization of a Gaussian semiparametric estimator on multivariate long-range dependent processes

《Journal of Statistical Computation and Simulation》2012,82(9):1832-1856

In this paper, we propose and study a general class of Gaussian semiparametric estimators (GSE) of the fractional differencing parameter in the context of long-range dependent multivariate time series. We establish large sample properties of the estimator without assuming Gaussianity. The class of models considered here satisfies simple conditions on the spectral density function, restricted to a small neighbourhood of the zero frequency, and includes the important class of vector autoregressive fractionally integrated moving average processes. We also present a simulation study to assess the finite sample properties of the proposed estimator based on a smoothed version of the GSE, which supports its competitiveness.

13.

Kernel estimation of regression function gradient

《Communications in Statistics - Theory and Methods》2012,41(1):135-151

This paper is focused on kernel estimation of the gradient of a multivariate regression function. Despite the importance of this topic, the progress in this area is rather slow. Our aim is to construct a gradient estimator using the idea of the local linear estimator for a regression function. The quality of this estimator is expressed in terms of the Mean Integrated Square Error. We focus on the choice of the bandwidth matrix. Further, we present some data-driven methods for its choice and develop a new approach. The performance of the presented methods is illustrated using a simulation study and a real data example.

14.

Sparse subspace linear discriminant analysis

Yanfang Li Jing Lei 《Statistics》2018,52(4):782-800

We study high dimensional multigroup classification from a sparse subspace estimation perspective, unifying the linear discriminant analysis (LDA) with other recent developments in high dimensional multivariate analysis using similar tools, such as penalization methods. We develop two two-stage sparse LDA models, where in the first stage, convex relaxation is used to convert two classical formulations of LDA to semidefinite programs, and furthermore the subspace perspective allows for straightforward regularization and estimation. After the initial convex relaxation, we use a refinement stage to improve the accuracy. For the first model, a penalized quadratic program with group lasso penalty is used for refinement, whereas a sparse version of the power method is used for the second model. We carefully examine the theoretical properties of both methods, alongside simulations and real data analysis.

15.

Estimation of covariance matrix via the sparse Cholesky factor with lasso

Changgee Chang Ruey S. Tsay 《Journal of statistical planning and inference》2010

In this paper, we discuss a parsimonious approach to estimation of high-dimensional covariance matrices via the modified Cholesky decomposition with lasso. Two different methods are proposed. They are the equi-angular and equi-sparse methods. We use simulation to compare the performance of the proposed methods with others available in the literature, including the sample covariance matrix, the banding method, and the L₁-penalized normal loglikelihood method. We then apply the proposed methods to a portfolio selection problem using 80 series of daily stock returns. To facilitate the use of lasso in high-dimensional time series analysis, we develop the dynamic weighted lasso (DWL) algorithm that extends the LARS-lasso algorithm. In particular, the proposed algorithm can efficiently update the lasso solution as new data become available. It can also add or remove explanatory variables. The entire solution path of the L₁-penalized normal loglikelihood method is also constructed.

16.

Marginal semiparametric multivariate accelerated failure time model with generalized estimating equations

Sy Han Chiou Sangwook Kang Junghi Kim Jun Yan 《Lifetime data analysis》2014,20(4):599-618

The semiparametric accelerated failure time (AFT) model is not as widely used as the Cox relative risk model due to computational difficulties. Recent developments in least squares estimation and induced smoothing estimating equations for censored data provide promising tools to make the AFT models more attractive in practice. For multivariate AFT models, we propose a generalized estimating equations (GEE) approach, extending the GEE to censored data. The consistency of the regression coefficient estimator is robust to misspecification of working covariance, and the efficiency is higher when the working covariance structure is closer to the truth. The marginal error distributions and regression coefficients are allowed to be unique for each margin or partially shared across margins as needed. The initial estimator is a rank-based estimator with Gehan’s weight, but obtained from an induced smoothing approach with computational ease. The resulting estimator is consistent and asymptotically normal, with variance estimated through a multiplier resampling method. In a large scale simulation study, our estimator was up to three times as efficient as the estimator that ignores the within-cluster dependence, especially when the within-cluster dependence was strong. The methods were applied to the bivariate failure times data from a diabetic retinopathy study.

17.

Robust Estimation of Multivariate Linear Model Based on Depth Weighted Mean and Scatter

Weihua Zhou 《Communications in Statistics - Simulation and Computation》2013,42(6):1292-1307

Based on the projection depth weighted mean and scatter estimation of the joint distribution of (x, y), we introduce a robust estimator of the regression coefficients for the multivariate linear model. The new estimator possesses desirable properties including affine invariance, Fisher consistency, and asymptotic normality. Also, we study the robustness of the estimator in terms of breakdown point and influence function. Extensive simulation studies are performed to investigate the finite sample behavior of robustness and efficiency. The methodology is illustrated with a real data example.

18.

On the ridge regression estimator with sub-space restriction

R. Fallah S. M. M. Tabatabaey 《Communications in Statistics - Theory and Methods》2017,46(23):11854-11865

In the linear regression model with elliptical errors, a shrinkage ridge estimator is proposed. In this regard, the restricted ridge regression estimator under sub-space restriction is improved by incorporating a general function which satisfies Taylor’s series expansion. The approximate quadratic risk function of the proposed shrinkage ridge estimator is evaluated in the elliptical regression model. A Monte Carlo simulation study and an analysis based on a real data example are considered for performance analysis. It is evident from the numerical results that the shrinkage ridge estimator performs better than both unrestricted and restricted estimators in the multivariate t-regression model, for some specific cases.

19.

Regularization and variable selection via the elastic net

Hui Zou Trevor Hastie 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2005,67(2):301-320

Summary. We propose the elastic net, a new regularization and variable selection method. Real world data and a simulation study show that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation. In addition, the elastic net encourages a grouping effect, where strongly correlated predictors tend to be in or out of the model together. The elastic net is particularly useful when the number of predictors (p) is much bigger than the number of observations (n). By contrast, the lasso is not a very satisfactory variable selection method in the p ≫ n case. An algorithm called LARS-EN is proposed for computing elastic net regularization paths efficiently, much like algorithm LARS does for the lasso.
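
How the two penalties of the elastic net combine can be sketched for a single coefficient, assuming an orthonormal design and the objective 0.5 * (z - b)^2 + lam1 * |b| + 0.5 * lam2 * b^2. This is the uncorrected ("naive") form under an assumed scaling of the penalties; it is not the LARS-EN algorithm, and the rescaling step discussed in the paper is omitted. The function name and the numbers are hypothetical.

```python
def naive_elastic_net_orthonormal(z, lam1, lam2):
    """Naive elastic net estimate of one coefficient (illustrative sketch).

    z    : ordinary least squares estimate of the coefficient
    lam1 : nonnegative lasso (L1) tuning parameter, gives sparsity
    lam2 : nonnegative ridge (L2) tuning parameter, gives extra shrinkage
    """
    shrunk = max(abs(z) - lam1, 0.0)  # lasso part: soft-thresholding
    if shrunk == 0.0:
        return 0.0
    signed = shrunk if z > 0 else -shrunk
    return signed / (1.0 + lam2)      # ridge part: proportional shrinkage


# Hypothetical values, used only for illustration.
print(naive_elastic_net_orthonormal(3.0, lam1=1.0, lam2=1.0))
print(naive_elastic_net_orthonormal(3.0, lam1=1.0, lam2=0.0))  # lasso special case
```

Setting lam2 to zero recovers the lasso soft-thresholding rule, and setting lam1 to zero recovers ridge-type proportional shrinkage.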

20.

Generalised Rank Regression Estimator with Standard Error Adjusted Lasso

A.S. Turkmen O. Ozturk 《Australian & New Zealand Journal of Statistics》2016,58(1):121-135

One of the standard variable selection procedures in multiple linear regression is to use a penalisation technique in least‐squares (LS) analysis. In this setting, many different types of penalties have been introduced to achieve variable selection. It is well known that LS analysis is sensitive to outliers, and consequently outliers can present serious problems for the classical variable selection procedures. Since rank‐based procedures have desirable robustness properties compared to LS procedures, we propose a rank‐based adaptive lasso‐type penalised regression estimator and a corresponding variable selection procedure for linear regression models. The proposed estimator and variable selection procedure are robust against outliers in both response and predictor space. Furthermore, since rank regression can yield unstable estimators in the presence of multicollinearity, in order to provide inference that is robust against multicollinearity, we adjust the penalty term in the adaptive lasso function by incorporating the standard errors of the rank estimator. The theoretical properties of the proposed procedures are established and their performances are investigated by means of simulations. Finally, the estimator and variable selection procedure are applied to the Plasma Beta‐Carotene Level data set.