期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

A tutorial on rank-based coefficient estimation for censored data in small- and large-scale problems

Matthias Chung Qi Long Brent A. Johnson 《Statistics and Computing》2013,23(5):601-614

The analysis of survival endpoints subject to right-censoring is an important research area in statistics, particularly among econometricians and biostatisticians. The two most popular semiparametric models are the proportional hazards model and the accelerated failure time (AFT) model. Rank-based estimation in the AFT model is computationally challenging due to optimization of a non-smooth loss function. Previous work has shown that rank-based estimators may be written as solutions to linear programming (LP) problems. However, the size of the LP problem is O(n ²+p) subject to n ² linear constraints, where n denotes sample size and p denotes the dimension of parameters. As n and/or p increases, the feasibility of such solution in practice becomes questionable. Among data mining and statistical learning enthusiasts, there is interest in extending ordinary regression coefficient estimators for low-dimensions into high-dimensional data mining tools through regularization. Applying this recipe to rank-based coefficient estimators leads to formidable optimization problems which may be avoided through smooth approximations to non-smooth functions. We review smooth approximations and quasi-Newton methods for rank-based estimation in AFT models. The computational cost of our method is substantially smaller than the corresponding LP problem and can be applied to small- or large-scale problems similarly. The algorithm described here allows one to couple rank-based estimation for censored data with virtually any regularization and is exemplified through four case studies. 相似文献

2.

On empirical Bayes simultaneous selection procedures for comparing normal populations with a standard

《Journal of statistical planning and inference》1999,77(1):73-88

In this paper, we derive statistical selection procedures to partition k normal populations into ‘good’ or ‘bad’ ones, respectively, using the nonparametric empirical Bayes approach. The relative regret risk of a selection procedure is used as a measure of its performance. We establish the asymptotic optimality of the proposed empirical Bayes selection procedures and investigate the associated rates of convergence. Under a very mild condition, the proposed empirical Bayes selection procedures are shown to have rates of convergence of order close to O(k^−1/2) where k is the number of populations involved in the selection problem. With further strong assumptions, the empirical Bayes selection procedures have rates of convergence of order O(k^{−α(r−1)/(2r+1)}), where 1<α<2 and r is an integer greater than 2. 相似文献

3.

The predictive Lasso

Minh-Ngoc Tran David J. Nott Chenlei Leng 《Statistics and Computing》2012,22(5):1069-1084

We propose a shrinkage procedure for simultaneous variable selection and estimation in generalized linear models (GLMs) with an explicit predictive motivation. The procedure estimates the coefficients by minimizing the Kullback-Leibler divergence of a set of predictive distributions to the corresponding predictive distributions for the full model, subject to an l ₁ constraint on the coefficient vector. This results in selection of a parsimonious model with similar predictive performance to the full model. Thanks to its similar form to the original Lasso problem for GLMs, our procedure can benefit from available l ₁-regularization path algorithms. Simulation studies and real data examples confirm the efficiency of our method in terms of predictive performance on future observations. 相似文献

4.

A MORE GENERAL CRITERION FOR SUBSET SELECTION IN MULTIPLE LINEAR REGRESSION

《统计学通讯:理论与方法》2013,42(5):795-811

ABSTRACT

In this article, we propose a more general criterion called S_p -criterion, for subset selection in the multiple linear regression Model. Many subset selection methods are based on the Least Squares (LS) estimator of β, but whenever the data contain an influential observation or the distribution of the error variable deviates from normality, the LS estimator performs ‘poorly’ and hence a method based on this estimator (for example, Mallows’ C_p -criterion) tends to select a ‘wrong’ subset. The proposed method overcomes this drawback and its main feature is that it can be used with any type of estimator (either the LS estimator or any robust estimator) of β without any need for modification of the proposed criterion. Moreover, this technique is operationally simple to implement as compared to other existing criteria. The method is illustrated with examples. 相似文献

5.

Joint covariate selection and joint subspace selection for multiple classification problems

Guillaume Obozinski Ben Taskar Michael I. Jordan 《Statistics and Computing》2010,20(2):231-252

We address the problem of recovering a common set of covariates that are relevant simultaneously to several classification problems. By penalizing the sum of ℓ ₂ norms of the blocks of coefficients associated with each covariate across different classification problems, similar sparsity patterns in all models are encouraged. To take computational advantage of the sparsity of solutions at high regularization levels, we propose a blockwise path-following scheme that approximately traces the regularization path. As the regularization coefficient decreases, the algorithm maintains and updates concurrently a growing set of covariates that are simultaneously active for all problems. We also show how to use random projections to extend this approach to the problem of joint subspace selection, where multiple predictors are found in a common low-dimensional subspace. We present theoretical results showing that this random projection approach converges to the solution yielded by trace-norm regularization. Finally, we present a variety of experimental results exploring joint covariate selection and joint subspace selection, comparing the path-following approach to competing algorithms in terms of prediction accuracy and running time. 相似文献

6.

Selection and screening procedures to determine optimal product designs

《Journal of statistical planning and inference》1998,67(2):311-330

To compare several promising product designs, manufacturers must measure their performance under multiple environmental conditions. In many applications, a product design is considered to be seriously flawed if its performance is poor for any level of the environmental factor. For example, if a particular automobile battery design does not function well under temperature extremes, then a manufacturer may not want to put this design into production. Thus, this paper considers the measure of a product's quality to be its worst performance over the levels of the environmental factor. We develop statistical procedures to identify (a near) optimal product design among a given set of product designs, i.e., the manufacturing design that maximizes the worst product performance over the levels of the environmental variable. We accomplish this by intuitive procedures based on the split-plot experimental design (and the randomized complete block design as a special case); split-plot designs have the essential structure of a product array and the practical convenience of local randomization. Two classes of statistical procedures are provided. In the first, the δ-best formulation of selection problems, we determine the number of replications of the basic split-plot design that are needed to guarantee, with a given confidence level, the selection of a product design whose minimum performance is within a specified amount, δ, of the performance of the optimal product design. In particular, if the difference between the quality of the best and second best manufacturing designs is δ or more, then the procedure guarantees that the best design will be selected with specified probability. For applications where a split-plot experiment that involves several product designs has been completed without the planning required of the δ-best formulation, we provide procedures to construct a ‘confidence subset’ of the manufacturing designs; the selected subset contains the optimal product design with a prespecified confidence level. The latter is called the subset selection formulation of selection problems. Examples are provided to illustrate the procedures. 相似文献

7.

Robust Coordinate Descent Algorithm Robust Solution Path for High-dimensional Sparse Regression Modeling

H. Park S. Konishi 《统计学通讯:模拟与计算》2016,45(1):115-129

The L₁-type regularization provides a useful tool for variable selection in high-dimensional regression modeling. Various algorithms have been proposed to solve optimization problems for L₁-type regularization. Especially the coordinate descent algorithm has been shown to be effective in sparse regression modeling. Although the algorithm shows a remarkable performance to solve optimization problems for L₁-type regularization, it suffers from outliers, since the procedure is based on the inner product of predictor variables and partial residuals obtained from a non-robust manner. To overcome this drawback, we propose a robust coordinate descent algorithm, especially focusing on the high-dimensional regression modeling based on the principal components space. We show that the proposed robust algorithm converges to the minimum value of its objective function. Monte Carlo experiments and real data analysis are conducted to examine the efficiency of the proposed robust algorithm. We observe that our robust coordinate descent algorithm effectively performs for the high-dimensional regression modeling even in the presence of outliers. 相似文献

8.

Different definitions of Δ-correct selection for the indifference zone formulation

《Journal of statistical planning and inference》1996,54(2):175-199

The goal of the indifference zone formulation of selection (Bechhofer, 1954) consists of selecting the t best variants out of k variants with a probability of at least 1 − β if the parameter difference between the t ‘good’ variants and the k − t ‘bad’ variants is not less than Δ. A review of generalized selection goals not using this difference condition is presented. Within some general classes of distributions, the suitable experimental designs for all these selection goals are identical. Similar results are described for the problem of selecting the best variant in comparison with a control, or standard. 相似文献

9.

Robust estimation and wavelet thresholding in partially linear models

Irène Gannaz 《Statistics and Computing》2007,17(4):293-310

This paper is concerned with a semiparametric partially linear regression model with unknown regression coefficients, an unknown nonparametric function for the non-linear component, and unobservable Gaussian distributed random errors. We present a wavelet thresholding based estimation procedure to estimate the components of the partial linear model by establishing a connection between an l ₁-penalty based wavelet estimator of the nonparametric component and Huber’s M-estimation of a standard linear model with outliers. Some general results on the large sample properties of the estimates of both the parametric and the nonparametric part of the model are established. Simulations are used to illustrate the general results and to compare the proposed methodology with other methods available in the recent literature. 相似文献

10.

Variance Reduction Using Nonlinear Controls and Transformations

Peter A. W. Lewis Richard L. Ressler R. Kevin Wood 《统计学通讯:模拟与计算》2013,42(2):655-672

Nonlinear regression-adjusted control variables are investigated for improving variance reduction in statistical and system simulations. To this end, simple control variables are piecewise sectioned and then transformed using linear and nonlinear transformations. Optimal parameters of these transformations are selected using linear or nonlinear least-squares regression algorithms. As an example, piecewise power-transformed variables are used in the estimation of the mean for the twovariable Anderson-Darling goodness-of-fit statistic W ₂ ². Substantial variance reduction over straightforward controls is obtained. These parametric transformations are compared against optimal, additive nonparametric transformations obtained by using the ACE algorithm and are shown, in comparison to the results from ACE, to be nearly optimal. 相似文献

11.

Asymptotics of Selective Inference

下载免费PDF全文

Xiaoying Tian Jonathan Taylor 《Scandinavian Journal of Statistics》2017,44(2):480-499

In this paper, we seek to establish asymptotic results for selective inference procedures removing the assumption of Gaussianity. The class of selection procedures we consider are determined by affine inequalities, which we refer to as affine selection procedures. Examples of affine selection procedures include selective inference along the solution path of the least absolute shrinkage and selection operator (LASSO), as well as selective inference after fitting the least absolute shrinkage and selection operator at a fixed value of the regularization parameter. We also consider some tests in penalized generalized linear models. Our result proves asymptotic convergence in the high‐dimensional setting where n<p, and n can be of a logarithmic factor of the dimension p for some procedures. 相似文献

12.

Optimality of blue's in a general linear model with incorrect design matrix

Thomas Mathew P. Bhimasankaram 《Journal of statistical planning and inference》1983,8(3):315-329

We consider the Gauss-Markoff model (Y,X₀β,σ²V) and provide solutions to the following problem: What is the class of all models (Y,Xβ,σ²V) such that a specific linear representation/some linear representation/every linear representation of the BLUE of every estimable parametric functional p'β under (Y,X₀β,σ²V) is (a) an unbiased estimator, (b) a BLUE, (c) a linear minimum bias estimator and (d) best linear minimum bias estimator of p'β under (Y,Xβ,σ²V)? We also analyse the above problems, when attention is restricted to a subclass of estimable parametric functionals. 相似文献

13.

Penalized least-squares estimation for regression coefficients in high-dimensional partially linear models

Huey-Fan Ni 《Journal of statistical planning and inference》2012,142(2):379-389

We consider a partially linear model with diverging number of groups of parameters in the parametric component. The variable selection and estimation of regression coefficients are achieved simultaneously by using the suitable penalty function for covariates in the parametric component. An MM-type algorithm for estimating parameters without inverting a high-dimensional matrix is proposed. The consistency and sparsity of penalized least-squares estimators of regression coefficients are discussed under the setting of some nonzero regression coefficients with very small values. It is found that the root p_n/n-consistency and sparsity of the penalized least-squares estimators of regression coefficients cannot be given consideration simultaneously when the number of nonzero regression coefficients with very small values is unknown, where p_n and n, respectively, denote the number of regression coefficients and sample size. The finite sample behaviors of penalized least-squares estimators of regression coefficients and the performance of the proposed algorithm are studied by simulation studies and a real data example. 相似文献

14.

Information in experiments and sufficiency

K. Ferentinos T. Papaioannou 《Journal of statistical planning and inference》1982,6(4):309-317

We define measures of information contained in an experiment which are by-products of the parametric measures of Fisher, Vajda, Mathai and Boekee and the non-parametric measures of Bhattacharyya, Rényi, Matusita, Kagan and Csiszár. We use these measures to compare sufficient experiments according to Blackwell's definition. In particular, we prove that if δ_X and δ_Y are two experiments and δ_X≥δ_Y then l_X≥l_y for all of the above measures. 相似文献

15.

Statistical inferences for partially linear single-index models with error-prone linear covariates

Zhensheng Huang 《Journal of statistical planning and inference》2011,141(2):899-909

We consider statistical inference for partially linear single-index models (PLSIM) when some linear covariates are not observed, but ancillary variables are available. Based on the profile least-squared estimators of the unknowns, we study the testing problems for parametric components in the proposed models. It is to see whether the generalized likelihood ratio (GLR) tests proposed by Fan et al. (2001) are applicable to testing for the parametric components. We show that under the null hypothesis the proposed GLR statistics follow asymptotically the χ²-distributions with the scale constants and the degrees of freedom being independent of the nuisance parameters or functions, which is called the Wilks phenomenon. Simulated experiments are conducted to illustrate our proposed methodology. 相似文献

16.

Corrected proof of the result of 'A prediction error property of the Lasso estimator and its generalization' by Huang (2003)

Saharon Rosset Ji Zhu 《Australian & New Zealand Journal of Statistics》2004,46(3):505-510

The Lasso achieves variance reduction and variable selection by solving an ?₁‐regularized least squares problem. Huang (2003) claims that ‘there always exists an interval of regularization parameter values such that the corresponding mean squared prediction error for the Lasso estimator is smaller than for the ordinary least square estimator’. This result is correct. However, its proof in Huang (2003) is not. This paper presents a corrected proof of the claim, which exposes and uses some interesting fundamental properties of the Lasso. 相似文献

17.

The structured elastic net for quantile regression and?support vector classification

Martin Slawski 《Statistics and Computing》2012,22(1):153-168

In view of its ongoing importance for a variety of practical applications, feature selection via ℓ ₁-regularization methods like the lasso has been subject to extensive theoretical as well empirical investigations. Despite its popularity, mere ℓ ₁-regularization has been criticized for being inadequate or ineffective, notably in situations in which additional structural knowledge about the predictors should be taken into account. This has stimulated the development of either systematically different regularization methods or double regularization approaches which combine ℓ ₁-regularization with a second kind of regularization designed to capture additional problem-specific structure. One instance thereof is the ‘structured elastic net’, a generalization of the proposal in Zou and Hastie (J. R. Stat. Soc. Ser. B 67:301–320, 2005), studied in Slawski et al. (Ann. Appl. Stat. 4(2):1056–1080, 2010) for the class of generalized linear models. 相似文献

18.

TESTS FOR DETECTING MORE NBU-NESS AT A SPECIFIC AGE

Jae-Hak Lim Dae-Kyung Kim Dong Ho Park 《Australian & New Zealand Journal of Statistics》2005,47(3):329-337

This paper proposes a class of non‐parametric test procedures for testing the null hypothesis that two distributions, F and G, are equal versus the alternative hypothesis that F is ‘more NBU (new better than used) at specified age t₀’ than G. Using Hoeffding's two‐sample U‐statistic theorem, it establishes the asymptotic normality of the test statistics and produces a class of asymptotically distribution‐free tests. Pitman asymptotic efficacies of the proposed tests are calculated with respect to the location and shape parameters. A numerical example is provided for illustrative purposes. 相似文献

19.

A coordinate majorization descent algorithm for ℓ1 penalized learning

《Journal of Statistical Computation and Simulation》2012,82(1):84-95

The glmnet package by Friedman et al. [Regularization paths for generalized linear models via coordinate descent, J. Statist. Softw. 33 (2010), pp. 1–22] is an extremely fast implementation of the standard coordinate descent algorithm for solving ?₁ penalized learning problems. In this paper, we consider a family of coordinate majorization descent algorithms for solving the ?₁ penalized learning problems by replacing each coordinate descent step with a coordinate-wise majorization descent operation. Numerical experiments show that this simple modification can lead to substantial improvement in speed when the predictors have moderate or high correlations. 相似文献

20.

A new enumerative property of the Narayana numbers

《Journal of statistical planning and inference》1986,14(1):63-67

Defining a k-bridge as a word of 2k letters taken from {a,b} (k times either) so that the i-th b never appears before the i-th a, the Narayana numbers are known to enumerate the k- bridges according to the number l+1 of ‘jumps’ (sequences of a's) or equally of ‘landings’ (sequences of b's). The paper shows that the same numbers also enumerate the k-bridges according to the total number l of non-final sequences (i.e. jumps and landings together) with lengths ≥2. The proof leans on two lemmas and uses induction with respect to k. 相似文献