Similar Articles
20 similar articles found
1.
The lasso has proved to be an extremely successful technique for simultaneous estimation and variable selection. However, it has two major drawbacks: first, it does not enforce any grouping effect, and second, in some situations lasso solutions are inconsistent for variable selection. To overcome this inconsistency, the adaptive lasso was proposed, in which adaptive weights are used to penalize different coefficients. More recently, a doubly regularized technique, the elastic net, was proposed; it encourages a grouping effect, i.e. the joint selection or omission of correlated variables. However, the elastic net is also inconsistent. In this paper we study the adaptive elastic net, which does not have this drawback. We focus in particular on the grouped selection property of the adaptive elastic net and on its model selection complexity, and we also shed some light on the bias-variance tradeoff of different regularization methods, including the adaptive elastic net. An efficient algorithm is proposed along the lines of LARS-EN and is illustrated with simulated as well as real-life data examples.
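As a rough illustration of the two-stage idea, the sketch below (not the authors' code) approximates an adaptive elastic net by rescaling columns with pilot-based weights; under this trick the L1 penalty becomes exactly adaptive while the ridge part is only approximately so, and all data and tuning values are illustrative assumptions.

```python
# A minimal sketch of the adaptive elastic net via feature rescaling.
import numpy as np
from sklearn.linear_model import ElasticNet, Ridge

rng = np.random.default_rng(0)
n, p = 100, 10
X = rng.standard_normal((n, p))
beta = np.array([3.0, 1.5, 0, 0, 2.0, 0, 0, 0, 0, 0])
y = X @ beta + rng.standard_normal(n)

# Stage 1: a ridge pilot fit gives adaptive weights w_j = 1/|beta_j|^gamma.
pilot = Ridge(alpha=1.0).fit(X, y)
gamma = 1.0
w = 1.0 / (np.abs(pilot.coef_) ** gamma + 1e-8)

# Stage 2: rescale columns so a plain elastic net applies weighted L1 penalties.
Xw = X / w
fit = ElasticNet(alpha=0.1, l1_ratio=0.7).fit(Xw, y)
beta_hat = fit.coef_ / w   # map coefficients back to the original scale
print(np.round(beta_hat, 2))
```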

2.
Jing Yang, Fang Lu & Hu Yang, Statistics, 2017, 51(6):1179-1199
In this paper, we develop a new estimation procedure based on quantile regression for semiparametric partially linear varying-coefficient models. The proposed approach is empirically shown to be much more efficient than the popular least squares estimation method for non-normal error distributions, while losing almost no efficiency for normal errors. Asymptotic normality of the proposed estimators is established for both the parametric and nonparametric parts. To achieve sparsity when irrelevant variables are present in the model, two variable selection procedures based on adaptive penalties are developed to select important parametric covariates as well as significant nonparametric functions. Both variable selection procedures are shown to enjoy the oracle property under some regularity conditions. Monte Carlo simulations are conducted to assess the finite-sample performance of the proposed estimators, and a real-data example illustrates the application of the proposed methods.
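A minimal sketch of the efficiency point only, assuming heavy-tailed errors and using statsmodels' QuantReg for median regression; the paper's varying-coefficient procedure is considerably more involved than this toy comparison.

```python
# Median (quantile) regression vs. least squares under heavy-tailed errors.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
X = sm.add_constant(rng.standard_normal((n, 2)))
beta = np.array([1.0, 2.0, -1.0])
y = X @ beta + rng.standard_t(df=2, size=n)   # heavy-tailed noise

ls = sm.OLS(y, X).fit()
qr = sm.QuantReg(y, X).fit(q=0.5)             # median regression
print("OLS:     ", np.round(ls.params, 2))
print("QuantReg:", np.round(qr.params, 2))
```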

3.
In this paper, we translate variable selection for linear regression into a multiple testing problem and select significant variables according to the testing results. New variable selection procedures are proposed based on the optimal discovery procedure (ODP) from multiple testing. Owing to the ODP's optimality, for a guaranteed number of significant variables included, it includes fewer non-significant variables than marginal p-value based methods. Consistency of our procedures is established in theory and confirmed in simulation. Simulation results suggest that procedures based on multiple testing improve on procedures based on selection criteria, and that our new procedures outperform marginal p-value based procedures.
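For context, a sketch of the marginal p-value baseline that the ODP-based procedures are compared against, using a Benjamini-Hochberg cut on coefficient t-tests; the ODP itself is not implemented here.

```python
# Marginal p-value variable selection with an FDR (Benjamini-Hochberg) cut.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(2)
n, p = 120, 8
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] + 1.5 * X[:, 1] + rng.standard_normal(n)

fit = sm.OLS(y, sm.add_constant(X)).fit()
pvals = np.asarray(fit.pvalues)[1:]            # coefficient p-values, intercept dropped
reject, _, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
print("selected variables:", np.flatnonzero(reject))
```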

4.
In the framework of cluster analysis based on Gaussian mixture models, it is usually assumed that all the variables provide information about the clustering of the sample units. Several variable selection procedures are available to detect the structure of interest for the clustering when this structure is contained in a sub-vector of variables. Currently, these procedures assume that a variable plays one of (up to) three roles: (1) informative, (2) uninformative and correlated with some informative variables, or (3) uninformative and uncorrelated with any informative variable. A more general approach for modelling the role of a variable is proposed that allows for the possibility that the variable vector provides information about more than one structure of interest for the clustering. This approach assumes that such information is given by non-overlapping and possibly correlated sub-vectors of variables, and that the model for the variable vector is a product of conditionally independent Gaussian mixture models (one for each variable sub-vector). Details about model identifiability, parameter estimation and model selection are provided. The usefulness and effectiveness of the described methodology are illustrated using simulated and real datasets.

5.
One of the standard variable selection procedures in multiple linear regression is to use a penalisation technique in least-squares (LS) analysis. In this setting, many different types of penalties have been introduced to achieve variable selection. It is well known that LS analysis is sensitive to outliers, and consequently outliers can present serious problems for the classical variable selection procedures. Since rank-based procedures have desirable robustness properties compared to LS procedures, we propose a rank-based adaptive lasso-type penalised regression estimator and a corresponding variable selection procedure for linear regression models. The proposed estimator and variable selection procedure are robust against outliers in both the response and the predictor space. Furthermore, since rank regression can yield unstable estimators in the presence of multicollinearity, we adjust the penalty term in the adaptive lasso function by incorporating the standard errors of the rank estimator, in order to provide inference that is robust against multicollinearity. The theoretical properties of the proposed procedures are established and their performance is investigated by means of simulations. Finally, the estimator and variable selection procedure are applied to the Plasma Beta-Carotene Level data set.
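A hedged sketch of the rank-regression building block only: Jaeckel's Wilcoxon-score dispersion is minimised numerically and the resulting estimates feed adaptive penalty weights. The optimiser choice and weight formula are illustrative assumptions; the paper's full procedure additionally adjusts the penalty by the rank estimator's standard errors.

```python
# Rank regression via Jaeckel's dispersion, then adaptive-lasso weights.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import rankdata

rng = np.random.default_rng(3)
n, p = 150, 5
X = rng.standard_normal((n, p))
y = X @ np.array([2.0, 0.0, 1.0, 0.0, 0.0]) + rng.standard_t(df=3, size=n)

def wilcoxon_dispersion(b):
    # Jaeckel's dispersion with Wilcoxon scores a(i) = sqrt(12)(i/(n+1) - 1/2)
    e = y - X @ b
    scores = np.sqrt(12.0) * (rankdata(e) / (n + 1) - 0.5)
    return np.sum(scores * e)

rank_fit = minimize(wilcoxon_dispersion, x0=np.zeros(p),
                    method="Nelder-Mead", options={"maxiter": 5000})
w = 1.0 / (np.abs(rank_fit.x) + 1e-8)          # adaptive penalty weights
print(np.round(rank_fit.x, 2))
```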

6.
This note discusses a problem that can occur when forward stepwise regression is used for variable selection and the candidate variables include a categorical variable with more than two categories. Most software packages (such as SAS, SPSSx and BMDP) include special programs for performing stepwise regression, and their users must code categorical variables with dummy variables. In this case, forward selection may wrongly indicate that a categorical variable with more than two categories is non-significant. This is a disadvantage of forward selection compared with the backward elimination method. A way to avoid the problem is to test all dummy variables corresponding to the same categorical variable in a single step, rather than one dummy variable at a time, as in the analysis of covariance. This option, however, is not available in forward stepwise procedures, except for stepwise logistic regression in BMDP. A practical possibility is to repeat the forward stepwise regression, changing the reference category each time.
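A minimal sketch of the single-step joint test suggested above, assuming a statsmodels nested-model F-test stands in for the stepwise software's entry test; the data and category effects are invented for illustration.

```python
# Joint F-test of all dummies for one categorical variable at once.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 300
df = pd.DataFrame({
    "x": rng.standard_normal(n),
    "g": rng.choice(["a", "b", "c", "d"], size=n),
})
effect = df["g"].map({"a": 0.0, "b": 0.4, "c": -0.4, "d": 0.8})
df["y"] = 1.0 + 0.5 * df["x"] + effect + rng.standard_normal(n)

full = smf.ols("y ~ x + C(g)", data=df).fit()
reduced = smf.ols("y ~ x", data=df).fit()
f_stat, p_val, _ = full.compare_f_test(reduced)   # all dummies tested together
print(f"F = {f_stat:.2f}, p = {p_val:.4f}")
```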

7.
This paper considers a linear regression model with regression parameter vector β. The parameter of interest is θ = a^T β, where a is specified. When, as a first step, a data-based variable selection procedure (e.g. minimum Akaike information criterion) is used to select a model, it is common statistical practice to then carry out inference about θ, using the same data, based on the (false) assumption that the selected model had been provided a priori. The paper considers a confidence interval for θ with nominal coverage 1 − α constructed on this (false) assumption, and calls this the naive 1 − α confidence interval. The minimum coverage probability of this confidence interval can be calculated for simple variable selection procedures involving only a single variable. However, the kinds of variable selection procedures used in practice are typically much more complicated. For the real-life data presented in this paper, there are 20 variables, each of which is to be either included or not, leading to 2^20 different models. The coverage probability at any given value of the parameters provides an upper bound on the minimum coverage probability of the naive confidence interval. This paper derives a new Monte Carlo simulation estimator of the coverage probability, which uses conditioning for variance reduction. For these real-life data, the gain in efficiency of this Monte Carlo simulation due to conditioning ranged from 2 to 6. The paper also presents a simple one-dimensional search strategy for parameter values at which the coverage probability is relatively small. For these real-life data, this search leads to parameter values for which the coverage probability of the naive 0.95 confidence interval is 0.79 for variable selection using the Akaike information criterion and 0.70 for variable selection using the Bayes information criterion, showing that these confidence intervals are completely inadequate.
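A plain Monte Carlo sketch of estimating the naive interval's coverage after AIC selection between two candidate models; the paper's conditioning-based estimator is a variance-reduced refinement of this brute-force version, and the two-variable toy design below is an assumption.

```python
# Coverage of the naive 95% CI for beta_1 after AIC-based model selection.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n, reps, cover = 50, 2000, 0
beta = np.array([1.0, 0.3])          # a weak second effect invites bad selection

for _ in range(reps):
    X = rng.standard_normal((n, 2))
    y = X @ beta + rng.standard_normal(n)
    m_full = sm.OLS(y, X).fit()
    m_sub = sm.OLS(y, X[:, :1]).fit()
    chosen = m_full if m_full.aic < m_sub.aic else m_sub   # AIC selection
    ci = chosen.conf_int(alpha=0.05)[0]                    # naive CI for beta_1
    cover += ci[0] <= beta[0] <= ci[1]

print("estimated coverage:", cover / reps)    # typically below the nominal 0.95
```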

8.
Multicollinearity, or near-exact linear dependence among the vectors of regressor variables in a multiple linear regression analysis, can have important effects on the quality of least squares parameter estimates. One frequently suggested remedy for these problems is principal components regression. This paper investigates alternative variable selection procedures and their implications for such an analysis.
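A minimal principal components regression sketch under near-collinear regressors, assuming the standard scale-project-regress pipeline; the component count and data are illustrative only.

```python
# Principal components regression: scale, project onto leading PCs, regress.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(6)
n = 100
z = rng.standard_normal(n)
X = np.column_stack([z + 0.01 * rng.standard_normal(n) for _ in range(5)])  # near-collinear
y = X[:, 0] + rng.standard_normal(n)

pcr = make_pipeline(StandardScaler(), PCA(n_components=2), LinearRegression())
pcr.fit(X, y)
print("R^2 on training data:", round(pcr.score(X, y), 3))
```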

9.
Selecting the important variables is one of the central model selection problems in statistical applications. In this article, we address variable selection in finite mixtures of generalized semiparametric models. To overcome the computational burden, we introduce a class of penalized variable selection procedures for these models. The nonparametric component is estimated via multivariate kernel regression. The new method is shown to be consistent for variable selection, and its performance is assessed via simulation.

10.
In the past decades, the number of variables used to explain observations in practical applications has increased steadily. This has led to heavy computational burdens, despite the widespread use of provisional variable selection methods in data processing. Consequently, more methodological techniques have appeared that reduce the number of explanatory variables without losing much information. Among these techniques, two distinct approaches are apparent: 'shrinkage regression' and 'sufficient dimension reduction'. Surprisingly, there has been little communication or comparison between these two methodological categories, and it is not clear when each approach is appropriate. In this paper, we fill some of this gap by first reviewing each category briefly, paying special attention to its most commonly used methods. We then compare commonly used methods from both categories based on their accuracy, computation time, and ability to select effective variables. A simulation study on the performance of the methods in each category is conducted as well. The selected methods are also tested on two real data sets, which allows us to recommend conditions under which each approach is more appropriate for high-dimensional data.

11.
One of the standard problems in statistics consists of determining the relationship between a response variable and a single predictor variable through a regression function. Background scientific knowledge is often available suggesting that the regression function should have a certain shape (e.g. monotonically increasing or concave) but not necessarily a specific parametric form. Bernstein polynomials have been used to impose such shape restrictions on regression functions; they provide a smooth estimate over equidistant knots and are used in this paper because of their ease of implementation, continuous differentiability, and theoretical properties. In this work, we demonstrate a connection between the monotonic regression problem and the variable selection problem in the linear model, and we develop a Bayesian procedure for fitting the monotonic regression model by adapting currently available variable selection procedures. We demonstrate the effectiveness of our method through simulations and the analysis of real data.
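A sketch of the shape-restriction mechanics only (not the paper's Bayesian procedure): an increasing fit follows when the Bernstein coefficients are non-decreasing, imposed below by writing them as cumulative sums of non-negative steps and solving a bounded least-squares problem.

```python
# Monotone regression with Bernstein polynomials via bounded least squares.
import numpy as np
from scipy.optimize import lsq_linear
from scipy.stats import binom

rng = np.random.default_rng(7)
n, m = 100, 8                             # sample size, polynomial degree
x = rng.uniform(size=n)
y = np.log1p(5 * x) + 0.1 * rng.standard_normal(n)   # increasing truth

# Bernstein basis B[i, k] = C(m, k) x_i^k (1 - x_i)^(m - k)
B = np.stack([binom.pmf(k, m, x) for k in range(m + 1)], axis=1)
# Reparametrize coefficients: c_k = c_0 + cumsum(d), with steps d >= 0.
G = np.column_stack([np.ones(n)] + [B[:, j:].sum(axis=1) for j in range(1, m + 1)])
lo = np.r_[-np.inf, np.zeros(m)]          # c_0 free, steps non-negative
fit = lsq_linear(G, y, bounds=(lo, np.inf))
coefs = np.r_[fit.x[0], fit.x[0] + np.cumsum(fit.x[1:])]
print(np.round(coefs, 2))                 # non-decreasing by construction
```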

12.
Clustering algorithms are important methods widely used in mining data streams because of their ability to deal with infinite data flows. Although these algorithms perform well at mining latent relationships in data streams, most of them suffer from a loss of cluster purity and become unstable when the input data streams contain many noisy variables. In this article, we propose a clustering algorithm for data streams with noisy variables. Simulation results show that our proposed method improves on previous studies by adding a variable selection step as a component of the clustering algorithm. The results of two experiments indicate that clustering data streams with a variable selection step is more stable and achieves better purity than clustering without it. A further experiment on the KDD-CUP99 dataset also shows that our algorithm generates more stable results.

13.
Penalized regression methods have recently gained enormous attention in statistics and machine learning owing to their ability to reduce prediction error and identify important variables at the same time. Numerous studies of penalized regression have been conducted, but most are limited to the case where the data are independently observed. In this paper, we study variable selection in penalized regression models with autoregressive (AR) error terms. We consider three estimators, the adaptive least absolute shrinkage and selection operator (lasso), bridge, and smoothly clipped absolute deviation, and propose a computational algorithm that selects a relevant set of variables and the order of the AR error terms simultaneously. In addition, we establish asymptotic properties such as consistency, selection consistency, and asymptotic normality. The performance of the three estimators is compared using simulated and real examples.
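A hedged sketch of one simple alternative to the paper's joint algorithm: alternating a lasso fit with an AR(1) estimate from the residuals, Cochrane-Orcutt style. The AR order is fixed here rather than selected, and the penalty level is an assumption.

```python
# Lasso with AR(1) errors via alternating quasi-differencing.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(8)
n, p, rho = 200, 6, 0.6
X = rng.standard_normal((n, p))
e = np.zeros(n)
for t in range(1, n):                       # simulate AR(1) errors
    e[t] = rho * e[t - 1] + rng.standard_normal()
y = X @ np.array([2.0, 0, 1.0, 0, 0, 0]) + e

beta_hat = Lasso(alpha=0.1).fit(X, y).coef_
for _ in range(5):                          # alternate: AR estimate <-> lasso fit
    r = y - X @ beta_hat
    rho_hat = (r[1:] @ r[:-1]) / (r[:-1] @ r[:-1])
    ys, Xs = y[1:] - rho_hat * y[:-1], X[1:] - rho_hat * X[:-1]
    beta_hat = Lasso(alpha=0.1).fit(Xs, ys).coef_
print(round(rho_hat, 2), np.round(beta_hat, 2))
```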

14.
Summary. When a number of distinct models contend for use in prediction, the choice of a single model can offer rather unstable predictions. In regression, stochastic search variable selection with Bayesian model averaging offers a cure for this robustness issue, but at the expense of requiring very many predictors. Here we look at Bayesian model averaging incorporating variable selection for prediction. This offers similar mean-square errors of prediction but with a vastly reduced predictor space, which can greatly aid the interpretation of the model and reduces cost when the measured variables are costly. The development here uses decision theory in the context of the multivariate general linear model. In passing, this reduced predictor space Bayesian model averaging is contrasted with single-model approximations. A fast algorithm for updating regressions in the Markov chain Monte Carlo search for posterior inference is developed, allowing many more variables than observations to be contemplated. We discuss the merits of absolute rather than proportionate shrinkage in regression, especially when there are more variables than observations. The methodology is illustrated on a set of spectroscopic data used for measuring the amounts of different sugars in an aqueous solution.

15.
Fan J & Lv J, Statistica Sinica, 2010, 20(1):101-148
High dimensional statistical problems arise from diverse fields of scientific research and technological development. Variable selection plays a pivotal role in contemporary statistical learning and scientific discoveries. The traditional idea of best subset selection methods, which can be regarded as a specific form of penalized likelihood, is computationally too expensive for many modern statistical applications. Other forms of penalized likelihood methods have been successfully developed over the last decade to cope with high dimensionality, and have been widely applied to simultaneously selecting important variables and estimating their effects in high dimensional statistical inference. In this article, we present a brief account of recent developments in theory, methods, and implementations for high dimensional variable selection. Questions of what limits of dimensionality such methods can handle, what the role of penalty functions is, and what the statistical properties are rapidly drive advances in the field. The properties of non-concave penalized likelihood and its roles in high dimensional statistical modeling are emphasized. We also review some recent advances in ultra-high dimensional variable selection, with emphasis on independence screening and two-scale methods.
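A minimal sketch of independence screening in the p >> n regime: rank predictors by a marginal statistic, keep the top d, then apply a penalised fit to the survivors. The screening size d = n/log(n) is a common rule of thumb, used here as an assumption rather than the article's prescription.

```python
# Sure independence screening (SIS) followed by a lasso on the survivors.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(9)
n, p = 100, 1000                            # far more predictors than observations
X = rng.standard_normal((n, p))
y = 3 * X[:, 0] - 2 * X[:, 5] + rng.standard_normal(n)

corr = np.abs(X.T @ (y - y.mean())) / n     # marginal screening statistic
d = int(n / np.log(n))                      # rule-of-thumb screening size
survivors = np.argsort(corr)[-d:]
fit = LassoCV(cv=5).fit(X[:, survivors], y)
print("selected:", survivors[fit.coef_ != 0])
```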

16.
In many studies a large number of variables is measured, and identifying the variables that influence an outcome is an important task. Several procedures are available for variable selection. However, focusing on a single model neglects the fact that other, equally appropriate models usually exist. Bayesian and frequentist model averaging approaches have been proposed to improve the development of a predictor. With a larger number of variables (say, more than ten) the resulting class of models can be very large. For Bayesian model averaging, Occam's window is a popular approach to reducing the model space; since this approach may not eliminate any variables, a variable screening step was proposed for a frequentist model averaging procedure. Based on the results of models selected in bootstrap samples, variables are eliminated before deriving a model averaging predictor. As a simple alternative screening procedure, backward elimination can be used. Through two examples and by means of simulation, we investigate some properties of the screening step. In the simulation study we consider situations with 15 and 25 variables, respectively, of which seven influence the outcome. The screening step eliminates most of the uninfluential variables, but also some variables with a weak effect. Variable screening leads to more applicable models without eliminating models that are more strongly supported by the data. Furthermore, we give recommendations for the important parameters of the screening step.
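A sketch of the screening step's logic, with a lasso standing in (as an assumption) for the bootstrap model selection: variables are kept only if their bootstrap inclusion frequency clears a cutoff, here 50%.

```python
# Bootstrap inclusion-frequency screening before model averaging.
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.utils import resample

rng = np.random.default_rng(10)
n, p = 150, 15
X = rng.standard_normal((n, p))
y = X[:, :7] @ np.linspace(2.0, 0.5, 7) + rng.standard_normal(n)

B, counts = 50, np.zeros(p)
for b in range(B):
    Xb, yb = resample(X, y, random_state=b)
    counts += LassoCV(cv=5).fit(Xb, yb).coef_ != 0
keep = np.flatnonzero(counts / B >= 0.5)    # inclusion-frequency cutoff
print("screened-in variables:", keep)
```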

17.
The use of large-dimensional factor models in forecasting has received much attention in the literature, the consensus being that improvements over standard models can be achieved. However, recent contributions have demonstrated that care is needed when choosing which variables to include in the model. A number of approaches to determining these variables have been put forward, but they are often based on ad hoc procedures or abandon the underlying theoretical factor model. In this article, we take a different approach by using the least absolute shrinkage and selection operator (LASSO) as a variable selection method to choose between the possible variables and thus obtain sparse loadings from which factors or diffusion indexes can be formed. This allows us to build a more parsimonious factor model that is better suited to forecasting than the traditional principal components (PC) approach. We provide an asymptotic analysis of the estimator and illustrate its merits empirically in a forecasting experiment based on U.S. macroeconomic data. Overall, we find that the method improves forecasting accuracy compared with PC and is thus an important alternative to it. Supplementary materials for this article are available online.
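A hedged sketch in which sparse PCA stands in for the LASSO-based loading selection described above; it shows sparse loadings producing diffusion indexes that then enter a one-step-ahead forecasting regression on simulated panel data.

```python
# Sparse loadings -> diffusion indexes -> one-step-ahead forecasting regression.
import numpy as np
from sklearn.decomposition import SparsePCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(11)
T, N = 200, 40
f = rng.standard_normal((T, 2))                       # latent factors
load = np.where(rng.random((2, N)) < 0.4, rng.standard_normal((2, N)), 0.0)
Xpanel = f @ load + 0.5 * rng.standard_normal((T, N))
target = f[:, 0] + 0.3 * rng.standard_normal(T)

spca = SparsePCA(n_components=2, alpha=1.0, random_state=0)
idx = spca.fit_transform(Xpanel)                      # sparse diffusion indexes
fc = LinearRegression().fit(idx[:-1], target[1:])     # one-step-ahead regression
print("nonzero loadings per factor:", (spca.components_ != 0).sum(axis=1))
print("in-sample R^2:", round(fc.score(idx[:-1], target[1:]), 2))
```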

18.
In econometrics and finance, variables are collected at different frequencies. One straightforward regression model aggregates the higher-frequency variable to match the lower frequency using a fixed weight function. However, aggregation with fixed weight functions may overlook useful information in the higher-frequency variable; on the other hand, keeping all higher-frequency observations may result in overly complicated models. Mixed data sampling (MIDAS) regression models have been proposed in the literature to balance the two. In this article, a new model specification test is proposed that helps decide between simple aggregation and the MIDAS model.
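A small sketch contrasting fixed equal-weight aggregation with an exponential Almon weighting of the kind commonly used in MIDAS regressions; the weight parameters are fixed for illustration rather than estimated, and the monthly/quarterly setup is an assumption.

```python
# Fixed-weight aggregation vs. an exponential Almon (MIDAS-style) aggregate.
import numpy as np

def exp_almon(theta1, theta2, K):
    # Exponential Almon lag weights, normalised to sum to one.
    j = np.arange(1, K + 1)
    w = np.exp(theta1 * j + theta2 * j ** 2)
    return w / w.sum()

rng = np.random.default_rng(12)
Q, K = 40, 3                                 # 40 quarters, 3 months per quarter
x_monthly = rng.standard_normal((Q, K))

x_flat = x_monthly.mean(axis=1)              # fixed equal weights
x_midas = x_monthly @ exp_almon(0.5, -0.3, K)   # declining Almon weights
print("Almon weights:", np.round(exp_almon(0.5, -0.3, K), 3))
```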

19.
In successive sampling, some recent works have used super-population models in which information on an auxiliary variable that is stable over occasions is utilized. The stability of an auxiliary variable may not hold, however, if the duration between occasions is large. To cope with such situations, the present work develops estimation procedures that utilize information on two independent auxiliary variables through a linear super-population model. Estimators are proposed for the current population mean in two-occasion successive (rotation) sampling. Optimum replacement strategies are formulated, and the performance of the proposed estimators is discussed. Results are interpreted through empirical studies.

20.
Summary. Many contemporary classifiers are constructed to provide good performance for very high dimensional data. However, an issue that is at least as important as good classification is determining which of the many potential variables provide key information for good decisions. Responding to this issue can help us to determine which aspects of the data-generating mechanism (e.g. which genes in a genomic study) are of greatest importance in terms of distinguishing between populations. We introduce tilting methods for addressing this problem. We apply weights to the components of data vectors, rather than to the data vectors themselves (as is commonly the case in related work). In addition, we tilt in a way that is governed by the L2-distance between weight vectors, rather than by the more commonly used Kullback–Leibler distance. It is shown that this approach, together with the added constraint that the weights should be non-negative, produces an algorithm which eliminates vector components that have little influence on the classification decision. In particular, use of the L2-distance in this problem produces properties that are reminiscent of those that arise when L1-penalties are employed to eliminate explanatory variables in very high dimensional prediction problems, e.g. those involving the lasso. We introduce techniques that can be implemented very rapidly, and we show how to use bootstrap methods to assess the accuracy of our variable ranking and variable elimination procedures.
