Similar Articles (20 results)
1.
Graphs and networks are common ways of depicting information. In biology, many different biological processes are represented by graphs, such as regulatory networks, metabolic pathways and protein-protein interaction networks. This kind of a priori use of graphs is a useful supplement to standard numerical data such as microarray gene expression data. In this paper, we consider the problem of regression analysis and variable selection when the covariates are linked on a graph. We study a graph-constrained regularization procedure and its theoretical properties for regression analysis that takes into account the neighborhood information of the variables measured on a graph, where a smoothness penalty on the coefficients is defined as a quadratic form of the Laplacian matrix associated with the graph. We establish estimation and model selection consistency results and provide estimation bounds for both fixed and diverging numbers of parameters in regression models. We demonstrate by simulations and a real dataset that the proposed procedure can lead to better variable selection and prediction than existing methods that ignore the graph information associated with the covariates.
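A rough sketch of this objective (our notation, not necessarily the authors' exact formulation): with adjacency matrix $A$, degree matrix $D$ and graph Laplacian $L = D - A$, a graph-constrained estimator of this type solves

$\hat{\beta} = \arg\min_{\beta} \; \|y - X\beta\|_2^2 + \lambda_1 \|\beta\|_1 + \lambda_2\, \beta^{\top} L \beta,$

where the $\ell_1$ term performs variable selection and the quadratic form $\beta^{\top} L \beta = \sum_{u \sim v} (\beta_u - \beta_v)^2$ shrinks the coefficients of covariates that are neighbors on the graph toward each other.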

2.
Variable selection in cluster analysis is important yet challenging. It can be achieved by regularization methods, which realize a trade-off between clustering accuracy and the number of selected variables by using a lasso-type penalty. However, the calibration of the penalty term is open to criticism. Model selection methods are an efficient alternative, yet they require a difficult optimization of an information criterion that involves combinatorial problems. First, most of these optimization algorithms are based on a suboptimal procedure (e.g. a stepwise method). Second, the algorithms are often computationally expensive because they need multiple calls to EM algorithms. Here we propose to use a new information criterion based on the integrated complete-data likelihood. It does not require the maximum likelihood estimate, and its maximization turns out to be simple and computationally efficient. The original contribution of our approach is to perform the model selection without requiring any parameter estimation; parameter inference is then needed only for the single selected model. This approach is used for variable selection in a Gaussian mixture model with conditional independence assumed. Numerical experiments on simulated and benchmark datasets show that the proposed method often outperforms two classical approaches to variable selection. The proposed approach is implemented in the R package VarSelLCM, available on CRAN.
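As a hedged sketch of the kind of criterion involved: for a candidate model $m$ (here, a split of the variables into clustering-relevant and irrelevant ones), an integrated complete-data likelihood criterion takes roughly the form

$\mathrm{MICL}(m) = \max_{z} \log p(\mathbf{x}, z \mid m),$

where $z$ is a cluster assignment and the parameters have been integrated out of the complete-data likelihood (available in closed form under conjugate priors and conditional independence), so the criterion can be optimized over $(m, z)$ without running EM.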

3.
Variable selection for nonlinear regression is a complex problem, made even more difficult when there are a large number of potential covariates and a limited number of data points. We propose herein a multi-stage method that combines state-of-the-art techniques at each stage to best discover the relevant variables. At the first stage, an extension of Bayesian additive regression trees is adopted to reduce the total number of variables to around 30. At the second stage, sensitivity analysis in the treed Gaussian process is adopted to further reduce the total number of variables. Two stopping rules are designed, and sequential design is adopted to make the best use of previous information. We demonstrate our approach on two simulated examples and one real data set.

4.
Calibration techniques in survey sampling, such as generalized regression estimation (GREG), were formalized in the 1990s to produce efficient estimators of linear combinations of study variables, such as totals or means. They implicitly rely on the assumption of a linear regression model between the variable of interest and some auxiliary variables, yielding estimates with lower variance if the model is true while remaining approximately design-unbiased even if the model does not hold. We propose a new class of model-assisted estimators obtained by releasing a few calibration constraints and replacing them with a penalty term added to the distance criterion to be minimized. By introducing the concept of penalized calibration, which combines usual calibration and this ‘relaxed’ calibration, we are able to adjust the weight given to the available auxiliary information. We obtain a more flexible estimation procedure that gives better estimates, particularly when the auxiliary information is overly abundant or not fully appropriate to be used in its entirety. Such an approach can also be seen as a design-based alternative to estimation procedures based on the more general class of mixed models, opening new prospects in some areas of application such as inference on small domains.
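One plausible form of the penalized criterion (our notation): with design weights $d_k$ and relaxed constraints indexed by $J_r$, the calibrated weights solve

$\min_{w} \; \sum_{k \in s} \frac{(w_k - d_k)^2}{d_k q_k} + \lambda \sum_{j \in J_r} \Big( \sum_{k \in s} w_k x_{jk} - t_{x_j} \Big)^2,$

subject to exact calibration on the remaining constraints; $\lambda \to \infty$ recovers full calibration on the relaxed totals, while $\lambda = 0$ drops them entirely.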

5.
We propose two new procedures based on multiple hypothesis testing for correct support estimation in high‐dimensional sparse linear models. We prove that both procedures are powerful and do not require the sample size to be large. The first procedure tackles the atypical setting of ordered variable selection through an extension of a testing procedure previously developed in the context of a linear hypothesis. The second procedure, the main contribution of this paper, enables data analysts to perform support estimation in the general high‐dimensional framework of non‐ordered variable selection. A thorough simulation study and applications to real datasets using the R package mht show that our non‐ordered variable selection procedure produces excellent results in terms of correct support estimation as well as in terms of mean square error and false discovery rate, when compared to common methods such as the Lasso, the SCAD penalty, forward regression and the false discovery rate (FDR) procedure.

6.
In this article, we generalize partially linear single-index models to the scenario with some endogenous covariates. It is well known that estimators based on existing methods are often inconsistent because of the endogeneity of the covariates. To deal with the endogenous variables, we introduce auxiliary instrumental variables. A three-stage estimation procedure is proposed for partially linear single-index instrumental-variables models: the first stage obtains a linear projection of the endogenous variables on a set of instrumental variables, the second stage estimates the link function using a local linear smoother for given constant parameters, and the last stage obtains the estimators of the constant parameters from the estimating equation. Asymptotic normality is established for the proposed estimators. Simulation studies are undertaken to assess the finite-sample performance of the proposed estimation procedure.
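For concreteness, such a model can be written in generic notation as

$Y = Z^{\top}\beta + g(X^{\top}\alpha) + \varepsilon,$

where some components of $Z$ are endogenous, i.e. $E[\varepsilon \mid Z] \neq 0$; the first stage replaces those components by their linear projection $\hat{Z} = W(W^{\top}W)^{-1}W^{\top}Z$ on the instruments $W$, restoring the exogeneity needed for the later stages.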

7.
8.
Mixed model selection is quite important in the statistical literature. To assist with it, we employ an adaptive LASSO penalty to propose a two-stage procedure for choosing both the random and the fixed effects. In the first stage, we use the penalized restricted profile log-likelihood to choose the random effects; in the second stage, after the random effects are determined, we apply the penalized profile log-likelihood to select the fixed effects. In each stage, the Newton–Raphson algorithm is used to carry out the parameter estimation. We prove that the proposed procedure is consistent and possesses the oracle properties. Simulations and a real data application demonstrate the effectiveness of the proposed selection procedure.
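Schematically (our notation), each stage maximizes a criterion of the form

$\ell(\psi) - n\lambda \sum_{j} \frac{|\psi_j|}{|\tilde{\psi}_j|},$

where $\ell$ is the restricted profile log-likelihood in the first stage (with $\psi$ the random-effects parameters) and the profile log-likelihood in the second stage (with $\psi$ the fixed effects), and $\tilde{\psi}$ is a consistent initial estimate; these data-dependent weights are what give the adaptive LASSO its oracle behavior.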

9.
The Lasso has sparked interest in the use of penalization of the log‐likelihood for variable selection, as well as for shrinkage. We are particularly interested in the more‐variables‐than‐observations case of characteristic importance for modern data. The Bayesian interpretation of the Lasso as the maximum a posteriori estimate of the regression coefficients, which have been given independent, double exponential prior distributions, is adopted. Generalizing this prior provides a family of hyper‐Lasso penalty functions, which includes the quasi‐Cauchy distribution of Johnstone and Silverman as a special case. The properties of this approach, including the oracle property, are explored, and an EM algorithm for inference in regression problems is described. The posterior is multi‐modal, and we suggest a strategy of using a set of perfectly fitting random starting values to explore modes in different regions of the parameter space. Simulations show that our procedure provides significant improvements on a range of established procedures, and we provide an example from chemometrics.
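The Bayesian interpretation referred to here is standard: with independent double exponential priors $\pi(\beta_j) = \frac{\lambda}{2} e^{-\lambda |\beta_j|}$, the posterior mode solves

$\hat{\beta} = \arg\max_{\beta} \; \log L(\beta) - \lambda \sum_{j} |\beta_j|,$

which is exactly the Lasso objective; replacing the double exponential by heavier-tailed generalizations yields the hyper-Lasso family of penalties.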

10.
We study estimation and inference when there are multiple values (“matches”) for the explanatory variables and only one of the matches is the correct one. This problem arises often when two datasets are linked together on the basis of information that does not uniquely identify regressor values. We offer a set of two intuitive conditions that ensure consistent inference using the average of the possible matches in a linear framework. The first condition is the exogeneity of the false match with respect to the regression error. The second condition is a notion of exchangeability between the true and false matches. Conditioning on the observed data, the probability that each match is correct is completely unrestricted. We perform a Monte Carlo study to investigate the estimator’s finite-sample performance relative to others proposed in the literature. Finally, we provide an empirical example revisiting a main area of application: the measurement of intergenerational elasticities in income. Supplementary materials for this article are available online.
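A minimal simulation sketch of the averaged-matches idea (hypothetical data and variable names, not the authors' code):

    import numpy as np

    rng = np.random.default_rng(0)
    n, m, beta = 500, 3, 2.0
    x_true = rng.normal(size=n)              # the correct match
    x_false = rng.normal(size=(n, m - 1))    # false matches: exogenous and drawn
                                             # from the same distribution (exchangeable)
    y = beta * x_true + rng.normal(size=n)
    x_bar = np.column_stack([x_true, x_false]).mean(axis=1)   # average over matches
    X = np.column_stack([np.ones(n), x_bar])
    print(np.linalg.lstsq(X, y, rcond=None)[0])   # slope estimate is close to beta

In this stylized setup the attenuation from averaging cancels and ordinary least squares on the match average recovers the slope; the paper's two conditions formalize when this happens.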

11.
The different parts (variables) of a compositional data set cannot be considered independent from each other, since only the ratios between the parts constitute the relevant information to be analysed. Practically, this information can be included in a system of orthonormal coordinates. For the task of regression of one part on other parts, a specific choice of orthonormal coordinates is proposed which allows for an interpretation of the regression parameters in terms of the original parts. In this context, orthogonal regression is appropriate since all compositional parts – also the explanatory variables – are measured with errors. Besides classical (least-squares based) parameter estimation, also robust estimation based on robust principal component analysis is employed. Statistical inference for the regression parameters is obtained by bootstrap; in the robust version the fast and robust bootstrap procedure is used. The methodology is illustrated with a data set from macroeconomics.
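One standard choice of such coordinates (pivot coordinates; the paper's specific choice may differ in detail) maps a composition $x = (x_1, \dots, x_D)$ to

$z_i = \sqrt{\frac{D-i}{D-i+1}}\, \ln \frac{x_i}{\big(\prod_{j=i+1}^{D} x_j\big)^{1/(D-i)}}, \quad i = 1, \dots, D-1,$

so that the first coordinate $z_1$ carries all the relative information about the part $x_1$, which is what allows the regression parameters to be interpreted in terms of the original parts.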

12.
We study estimation and variable selection for a partial linear single-index model (PLSIM) when some linear covariates are not observed but their ancillary variables are available. We use a semiparametric profile least-squares estimation procedure to estimate the parameters in the PLSIM after the calibrated error-prone covariates are obtained. Asymptotic normality of the estimators is established. We also employ the smoothly clipped absolute deviation (SCAD) penalty to select the relevant variables in the PLSIM. The resulting SCAD estimators are shown to be asymptotically normal and to have the oracle property. The performance of our estimation procedure is illustrated through numerous simulations. The approach is further applied to a real data example.
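For reference, the SCAD penalty used here is defined through its derivative

$p'_{\lambda}(\theta) = \lambda \Big\{ I(\theta \le \lambda) + \frac{(a\lambda - \theta)_+}{(a-1)\lambda} I(\theta > \lambda) \Big\}, \quad \theta > 0,$

with $a > 2$ (conventionally $a = 3.7$): it penalizes like the Lasso near zero but flattens out for large coefficients, which is what yields the oracle property.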

13.
Often the variables in a regression model are difficult or expensive to obtain, so auxiliary variables are collected in a preliminary step of a study and the model variables are measured at later stages on only a subsample of the study participants, called the validation sample. We consider a study in which at the first stage some variables, called auxiliaries throughout, are collected; at the second stage the true outcome is measured on a subsample of the first-stage sample; and at the third stage the true covariates are collected on a subset of the second-stage sample. In order to increase efficiency, the probabilities of selection into the second- and third-stage samples are allowed to depend on the data observed at the previous stages. In this paper we describe a class of inverse-probability-of-selection-weighted semiparametric estimators for the parameters of the model for the conditional mean of the outcomes given the covariates. We assume that a subject's probability of being sampled at subsequent stages is bounded away from zero and depends only on the subject's data collected at the previous sampling stages. We show that the asymptotic variance of the optimal estimator in our class is equal to the semiparametric variance bound for the model. Since the optimal estimator depends on unknown population parameters, it is not available for data analysis. We therefore propose an adaptive estimation procedure for locally efficient inference. A simulation study is carried out to examine the finite-sample properties of the proposed estimators.
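The weighting can be sketched generically: if $R_i$ indicates that subject $i$ is fully observed and $\pi_i$ is its cumulative selection probability given the earlier-stage data, an inverse-probability-of-selection-weighted estimating equation has the form

$\sum_{i=1}^{n} \frac{R_i}{\pi_i}\, U(Y_i, X_i; \beta) = 0,$

where $U$ is the complete-data estimating function for the conditional mean model; requiring $\pi_i$ to be bounded away from zero keeps the weights, and hence the variance, finite.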

14.
We propose a new class of two-stage parameter estimation methods for semiparametric ordinary differential equation (ODE) models. In the first stage, state variables are estimated using a penalized spline approach; in the second stage, the form of a numerical discretization algorithm for an ODE solver is used to formulate estimating equations. The state variables estimated in the first stage are used to obtain more data points for the second stage. Asymptotic properties of the proposed estimators are established. Simulation studies show that the method performs well, especially for small samples. A real-life use of the method is illustrated with an influenza-specific cell-trafficking study.
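A hedged sketch of the two stages for an ODE model $x'(t) = f(x(t), \theta)$: stage one smooths the data to obtain $\hat{x}(t)$; stage two, using for instance a one-step Euler discretization on a grid $t_0 < t_1 < \cdots$, minimizes

$\sum_{i} \big\| \hat{x}(t_{i+1}) - \hat{x}(t_i) - (t_{i+1} - t_i)\, f(\hat{x}(t_i), \theta) \big\|^2$

over $\theta$, so no numerical ODE solver has to be run inside the optimization; higher-order discretizations (e.g. Runge–Kutta) fit the same template.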

15.
This paper considers variable and factor selection in factor analysis. We treat the factor loadings for each observable variable as a group, and introduce a weighted sparse group lasso penalty to the complete log-likelihood. The proposal simultaneously selects observable variables and latent factors of a factor analysis model in a data-driven fashion; it produces a more flexible and sparse factor loading structure than existing methods. For parameter estimation, we derive an expectation-maximization algorithm that optimizes the penalized log-likelihood. The tuning parameters of the procedure are selected by a likelihood cross-validation criterion that yields satisfactory results in various simulation settings. Simulation results reveal that the proposed method can better identify the possibly sparse structure of the true factor loading matrix with higher estimation accuracy than existing methods. A real data example is also presented to demonstrate its performance in practice.
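One plausible form of the penalized criterion (our notation, not necessarily the authors' exact formulation): with $\Lambda_j$ the vector of loadings of observable variable $j$ on the latent factors, the penalized complete log-likelihood is roughly

$\ell(\Lambda, \Psi) - \lambda_1 \sum_{j=1}^{p} w_j \|\Lambda_j\|_2 - \lambda_2 \sum_{j,k} |\lambda_{jk}|,$

where the weighted $\ell_2$ group term can zero out an entire row of the loading matrix (removing variable $j$) while the $\ell_1$ term zeroes individual loadings (removing single variable-factor links).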

16.
The author introduces robust techniques for estimation, inference and variable selection in the analysis of longitudinal data. She first addresses the problem of the robust estimation of the regression and nuisance parameters, for which she derives the asymptotic distribution. She uses weighted estimating equations to build robust quasi‐likelihood functions. These functions are then used to construct a class of test statistics for variable selection. She derives the limiting distribution of these tests and shows its robustness properties in terms of stability of the asymptotic level and power under contamination. An application to a real data set allows her to illustrate the benefits of a robust analysis.
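Schematically (one common construction of such equations, not necessarily the author's exact form), a robust weighted estimating equation looks like

$\sum_{i=1}^{n} \frac{\partial \mu_i^{\top}}{\partial \beta}\, V_i^{-1}\, \psi_c(r_i)\, w(x_i) = 0,$

where $r_i$ are standardized residuals, $\psi_c$ is a bounded score function (e.g. Huber's) that caps the influence of outlying responses, and $w(x_i)$ downweights high-leverage covariates; taking $\psi_c(r) = r$ and $w \equiv 1$ recovers ordinary generalized estimating equations.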

17.
Regularization methods for simultaneous variable selection and coefficient estimation have been shown to be effective in quantile regression in improving prediction accuracy. In this article, we propose the Bayesian bridge for variable selection and coefficient estimation in quantile regression. A simple and efficient Gibbs sampling algorithm is developed for posterior inference using a scale mixture of uniforms representation of the Bayesian bridge prior. This is the first work to discuss regularized quantile regression with the bridge penalty. Both simulated and real data examples show that the proposed method often outperforms quantile regression without regularization, lasso quantile regression, and Bayesian lasso quantile regression.
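To fix notation (a hedged sketch): with check loss $\rho_{\tau}(u) = u(\tau - I(u < 0))$ and a working asymmetric Laplace likelihood, the Bayesian bridge places the prior

$\pi(\beta_j) \propto \exp(-\lambda |\beta_j|^{\alpha})$

on each coefficient (with $0 < \alpha < 1$ the genuinely non-convex bridge case), so the posterior mode corresponds to $\min_{\beta} \sum_i \rho_{\tau}(y_i - x_i^{\top}\beta) + \lambda \sum_j |\beta_j|^{\alpha}$; the scale mixture of uniforms representation is what makes the Gibbs sampler tractable.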

18.
A nonparametric inference algorithm developed by Davis and Geman (1983) is extended and applied to a medical prediction problem. The algorithm employs an estimation procedure for acquiring pairwise statistics among variables of a binary data set, allows for the data-driven creation of interaction terms among the variables, and employs a decision rule which asymptotically gives the minimum expected error. The inference procedure was designed for large data sets but has been extended via the method of cross-validation to encompass smaller data sets.

19.
In many regression problems, predictors are naturally grouped. For example, when a set of dummy variables is used to represent a categorical variable, or a set of basis functions of a continuous variable is included in the predictor set, it is important to carry out feature selection at the group level and at the individual-variable level within groups simultaneously. To incorporate the group and within-group information into regularized model fitting, several regularization methods have been developed, including for Cox regression and conditional mean regression. Complementary to these earlier works, we examine simultaneous group and within-group variable selection in quantile regression. We propose a hierarchically penalized quantile regression and show that the hierarchical penalty possesses the oracle property in quantile regression, as well as in Cox regression. The proposed method is evaluated through simulation studies and a real data application.
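One common construction of such a hierarchical penalty (our notation, not necessarily the authors' exact form): decompose each coefficient as $\beta_{jk} = \gamma_j \theta_{jk}$ with a shared group-level factor $\gamma_j \ge 0$, and solve

$\min_{\gamma, \theta} \; \sum_{i} \rho_{\tau}(y_i - x_i^{\top}\beta) + \lambda_{\gamma} \sum_{j} \gamma_j + \lambda_{\theta} \sum_{j,k} |\theta_{jk}|,$

so that $\gamma_j = 0$ removes group $j$ entirely while $\theta_{jk} = 0$ removes a single variable within a retained group.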

20.
The rapid development of computing has greatly eased data acquisition and storage. Many enterprises have accumulated large amounts of data, and at the same time the dimensionality of the data keeps growing and noise variables become ever more numerous, so one of the key problems in modeling and analysis is to screen a small number of important variables out of a high-dimensional set. For proportion data taking values in the interval (0,1), we propose regularized Beta regression and study penalized maximum likelihood estimation and its asymptotic properties under three penalties: LASSO, SCAD and MCP. Statistical simulations show that MCP outperforms SCAD and LASSO, and that as the sample size grows, SCAD in turn outperforms LASSO. Finally, the method is applied to a study of the factors influencing the dividend yields of Chinese listed companies.
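For reference, the MCP penalty found to perform best here is

$p_{\lambda}(\theta) = \begin{cases} \lambda\theta - \theta^2/(2\gamma), & 0 \le \theta \le \gamma\lambda, \\ \gamma\lambda^2/2, & \theta > \gamma\lambda, \end{cases}$

with $\gamma > 1$: it applies the Lasso rate $\lambda$ near zero and no penalization beyond $\gamma\lambda$, which reduces the bias that the LASSO incurs on large coefficients.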
