1.
Transductive methods are useful in prediction problems when the training dataset is composed of a large number of unlabeled observations and a smaller number of labeled observations. In this paper, we propose an approach for developing transductive prediction procedures that are able to take advantage of sparsity in high-dimensional linear regression. More precisely, we define transductive versions of the LASSO (Tibshirani, 1996) and the Dantzig Selector (Candès and Tao, 2007). These procedures combine labeled and unlabeled observations of the training dataset to produce a prediction for the unlabeled observations. We propose an experimental study of the transductive estimators that shows that they improve on the LASSO and Dantzig Selector in many situations, and particularly in high-dimensional problems when the predictors are correlated. We then provide non-asymptotic theoretical guarantees for these estimation methods. Interestingly, our theoretical results show that the Transductive LASSO and Dantzig Selector satisfy sparsity inequalities under weaker assumptions than those required for the “original” LASSO.
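For orientation, the two baseline estimators named in this abstract can be written as follows. This is the standard textbook form, not the transductive variants proposed in the paper, whose definitions the abstract does not give. The symbols $Y$ (response vector), $X$ (the $n \times p$ design matrix), $\lambda$ (tuning parameter) and the $1/n$ normalization are assumptions of this sketch; conventions differ between papers.

```latex
% LASSO (Tibshirani, 1996): l1-penalized least squares
\hat{\beta}^{\mathrm{L}} \in \arg\min_{\beta \in \mathbb{R}^p}
  \left\{ \frac{1}{n} \lVert Y - X\beta \rVert_2^2 + 2\lambda \lVert \beta \rVert_1 \right\}

% Dantzig Selector (Candes and Tao, 2007): l1 minimization under a
% bound on the correlation between the residuals and each predictor
\hat{\beta}^{\mathrm{DS}} \in \arg\min_{\beta \in \mathbb{R}^p} \lVert \beta \rVert_1
  \quad \text{subject to} \quad
  \left\lVert \frac{1}{n} X^{\top} (Y - X\beta) \right\rVert_{\infty} \le \lambda
```

Both estimators depend on the design only through $X^{\top}X$ and $X^{\top}Y$.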
2.
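Zhang Bo, Liu Xiaoqian. 《统计研究》 (Statistical Research), 2019, 36(4): 119-128
This paper studies a sparse principal component analysis method based on the fused penalty, suited to data in which adjacent variables are highly correlated or even exactly equal. First, starting from a regression perspective, it proposes a simple approach to computing sparse principal components, presents a generalized sparse principal component model (the GSPCA model) together with an algorithm for solving it, and proves that when the penalty function is the 1-norm, the model yields the same solution as the existing sparse principal component model, the SPC model. Second, the paper combines the fused penalty with principal component analysis to obtain a fused sparse principal component analysis method, and gives two model formulations, one from the perspective of penalized matrix decomposition and one from regression analysis. The solutions of the two formulations are proved to coincide in theory, so they are jointly referred to as the FSPCA model. Simulation experiments show that the FSPCA model performs well on datasets in which adjacent variables are highly correlated or even exactly equal. Finally, the FSPCA model is applied to handwritten digit recognition; compared with the SPC model, the principal components extracted by the FSPCA model are more interpretable, which makes the model of greater practical value.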
3.
This article considers in-sample prediction and out-of-sample forecasting in regressions with many exogenous predictors. We consider four dimension-reduction devices: principal components, ridge, Landweber-Fridman, and partial least squares. We derive rates of convergence for two representative models: an ill-posed model and an approximate factor model. The theory is developed for a large cross-section and a large time-series. As all these methods depend on a tuning parameter to be selected, we also propose data-driven selection methods based on cross-validation and establish their optimality. Monte Carlo simulations and an empirical application to forecasting inflation and output growth in the U.S. show that data-reduction methods outperform conventional methods in several relevant settings, and might effectively guard against instabilities in predictors’ forecasting ability.
4.
In high-dimensional settings, componentwise L2Boosting has been used to construct sparse models that perform well, but it tends to select many ineffective variables. Several sparse boosting methods, such as SparseL2Boosting and Twin Boosting, have been proposed to improve the variable selection of the L2Boosting algorithm. In this article, we propose a new general sparse boosting method (GSBoosting). Relations are established between GSBoosting and other well-known regularized variable selection methods in the orthogonal linear model, such as the adaptive Lasso and hard thresholding. Simulation results show that GSBoosting performs well in both prediction and variable selection.
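As background for this abstract, the baseline componentwise L2Boosting procedure can be sketched as below. This is a minimal illustration of the generic algorithm, not the GSBoosting method proposed in the article. The function name `componentwise_l2boost`, the step count `n_steps` and the shrinkage factor `nu` are hypothetical choices made for the example.

```python
def componentwise_l2boost(X, y, n_steps=200, nu=0.1):
    """Componentwise L2Boosting for a linear model (illustrative sketch).

    X is a list of rows, y a list of responses. At each step the single
    predictor that best fits the current residuals is selected and its
    coefficient is moved a fraction nu towards the least-squares fit.
    Returns (intercept, coefs).
    """
    n, p = len(X), len(X[0])
    # Center the predictors and the response so no intercept is needed in the loop.
    means = [sum(row[j] for row in X) / n for j in range(p)]
    cols = [[row[j] - means[j] for row in X] for j in range(p)]
    y_mean = sum(y) / n
    resid = [yi - y_mean for yi in y]
    coefs = [0.0] * p
    norms = [sum(v * v for v in col) for col in cols]

    for _ in range(n_steps):
        best_j, best_gain, best_b = None, 0.0, 0.0
        for j in range(p):
            if norms[j] == 0.0:
                continue  # constant column carries no information
            dot = sum(c * r for c, r in zip(cols[j], resid))
            gain = dot * dot / norms[j]  # reduction in residual sum of squares
            if gain > best_gain:
                best_j, best_gain, best_b = j, gain, dot / norms[j]
        if best_j is None:
            break  # residuals are orthogonal to every predictor
        coefs[best_j] += nu * best_b
        resid = [r - nu * best_b * c for r, c in zip(resid, cols[best_j])]

    intercept = y_mean - sum(b * m for b, m in zip(coefs, means))
    return intercept, coefs
```

Predictors that are never selected keep a coefficient of exactly zero, which is the source of sparsity when boosting is stopped early; the tendency to keep adding weak variables over many steps is the behaviour the sparse boosting variants aim to correct.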
5.
Summary.  We present a new class of methods for high dimensional non-parametric regression and classification called sparse additive models. Our methods combine ideas from sparse linear modelling and additive non-parametric regression. We derive an algorithm for fitting the models that is practical and effective even when the number of covariates is larger than the sample size. Sparse additive models are essentially a functional version of the grouped lasso of Yuan and Lin. They are also closely related to the COSSO model of Lin and Zhang but decouple smoothing and sparsity, enabling the use of arbitrary non-parametric smoothers. We give an analysis of the theoretical properties of sparse additive models and present empirical results on synthetic and real data, showing that they can be effective in fitting sparse non-parametric models in high dimensional data.
6.
Statistical inference of genetic regulatory networks is essential for understanding temporal interactions of regulatory elements inside the cells. In this work, we propose to infer the parameters of the ordinary differential equations using techniques from functional data analysis (FDA) by regarding the observed time course expression data as continuous-time curves. For networks with a large number of genes, we take advantage of the sparsity of the networks by penalizing the linear coefficients with an L1 norm. The ability of the algorithm to infer network structure is demonstrated using the cell-cycle time course data for Saccharomyces cerevisiae.
7.
Summary.  Partial least squares regression has been an alternative to ordinary least squares for handling multicollinearity in several areas of scientific research since the 1960s. It has recently gained much attention in the analysis of high dimensional genomic data. We show that known asymptotic consistency of the partial least squares estimator for a univariate response does not hold with the very large p and small n paradigm. We derive a similar result for a multivariate response regression with partial least squares. We then propose a sparse partial least squares formulation which aims simultaneously to achieve good predictive performance and variable selection by producing sparse linear combinations of the original predictors. We provide an efficient implementation of sparse partial least squares regression and compare it with well-known variable selection and dimension reduction approaches via simulation experiments. We illustrate the practical utility of sparse partial least squares regression in a joint analysis of gene expression and genomewide binding data.
8.
9.
The paper gives a highly personal sketch of some current trends in statistical inference. After an account of the challenges that new forms of data bring, there is a brief overview of some topics in stochastic modelling. The paper then turns to sparsity, illustrated using Bayesian wavelet analysis based on a mixture model and metabolite profiling. Modern likelihood methods including higher order approximation and composite likelihood inference are then discussed, followed by some thoughts on statistical education.
10.
Order selection is an important step in the application of finite mixture models. Classical methods such as AIC and BIC discourage complex models with a penalty directly proportional to the number of mixing components. In contrast, Chen and Khalili propose to link the penalty to two types of overfitting. In particular, they introduce a regularization penalty to merge similar subpopulations in a mixture model, where the shrinkage idea of regularized regression is seamlessly employed. However, the new method requires an effective and efficient algorithm. When the popular expectation-maximization (EM) algorithm is used, we need to maximize a nonsmooth and nonconcave objective function in the M-step, which is computationally challenging. In this article, we show that such an objective function can be transformed into a sum of univariate auxiliary functions. We then design an iterative thresholding descent algorithm (ITD) to efficiently solve the associated optimization problem. Unlike many existing numerical approaches, the new algorithm leads to sparse solutions and thereby avoids undesirable ad hoc steps. We establish the convergence of the ITD and further assess its empirical performance using both simulations and real data examples.
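To illustrate why thresholding rules applied to univariate subproblems give exactly sparse solutions, the sketch below shows the classical soft-thresholding operator, which minimizes 0.5 * (z - t)^2 + lam * |t| over t. This is a generic building block of iterative thresholding schemes and is offered only as an assumption-laden illustration; the abstract does not specify the ITD update, and the article's auxiliary functions may lead to a different rule. The function name `soft_threshold` is hypothetical.

```python
def soft_threshold(z, lam):
    """Closed-form minimizer of 0.5 * (z - t)**2 + lam * abs(t) over t.

    Inputs with |z| <= lam are mapped to exactly zero; larger inputs
    are shrunk towards zero by lam.
    """
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0
```

Because small inputs are set to exactly zero rather than merely made small, no separate rounding or pruning step is needed to read off which components have vanished.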