
1.

A novel regularization method for estimation and variable selection in multi-index models

Peng Zeng Yu Zhu 《Communications in Statistics - Theory and Methods》2019,48(12):3055-3067

Multi-index models have attracted much attention recently as an approach to circumvent the curse of dimensionality when modeling high-dimensional data. This paper proposes a novel regularization method, called MAVE-glasso, for simultaneous parameter estimation and variable selection in multi-index models. The advantages of the proposed method include transformation invariance, automatic variable selection, automatic removal of noninformative observations, and row-wise shrinkage. An efficient row-wise coordinate descent algorithm is proposed to calculate the estimates. Simulation and real examples are used to demonstrate the excellent performance of MAVE-glasso.
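As a rough illustration of what row-wise shrinkage means, the sketch below applies the generic group soft-thresholding operator to one row of a coefficient matrix. This is the standard proximal step for a group-lasso penalty and is not necessarily the exact update used by MAVE-glasso; the function name `group_soft_threshold` and the penalty level `lam` are assumptions made for the example.

```python
import math


def group_soft_threshold(row, lam):
    """Shrink a whole row toward zero; return all zeros if its norm is at most lam."""
    norm = math.sqrt(sum(v * v for v in row))
    if norm <= lam:
        # The entire row is removed at once, which is how a variable gets dropped.
        return [0.0] * len(row)
    scale = 1.0 - lam / norm
    return [scale * v for v in row]


# A row with a large norm is shrunk but kept.
shrunk = group_soft_threshold([3.0, 4.0], 1.0)
# A row with a small norm is set to zero as a block.
zeroed = group_soft_threshold([0.3, 0.4], 1.0)
```

Because the threshold acts on the norm of the whole row, all entries belonging to one predictor are kept or dropped together.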

2.

Mixtures of Gaussian copula factor analyzers for clustering high dimensional data

《Journal of the Korean Statistical Society》2019,48(3):480-492

Mixtures of factor analyzers is a useful model-based clustering method which can avoid the curse of dimensionality in high-dimensional clustering. However, this approach is sensitive to both diverse non-normalities of marginal variables and outliers, which are commonly observed in multivariate experiments. We propose mixtures of Gaussian copula factor analyzers (MGCFA) for clustering high-dimensional data. This model has two advantages: (1) it allows different marginal distributions to facilitate fitting flexibility of the mixture model, and (2) it can avoid the curse of dimensionality by embedding the factor-analytic structure in the component-correlation matrices of the mixture distribution. An EM algorithm is developed for the fitting of MGCFA. The proposed method is free of the curse of dimensionality and allows any parametric marginal distribution which fits best to the data. It is applied to both synthetic data and a microarray gene expression data set for clustering and shows better performance than several existing methods.

3.

Adaptive wavelet series estimation in separable nonparametric regression models

Umberto Amato Anestis Antoniadis 《Statistics and Computing》2001,11(4):373-394

It is well known that multivariate curve estimation suffers from the curse of dimensionality. However, reasonable estimators are possible, even in several dimensions, under appropriate restrictions on the complexity of the curve. In the present paper we explore how much appropriate wavelet estimators can exploit a typical restriction on the curve such as additivity. We first propose an adaptive and simultaneous estimation procedure for all additive components in additive regression models and discuss rate of convergence results and data-dependent truncation rules for wavelet series estimators. To speed up computation we then introduce a wavelet version of the functional ANOVA algorithm for additive regression models and propose a regularization algorithm which guarantees an adaptive solution to the multivariate estimation problem. Some simulations indicate that wavelet methods nicely complement the existing methodology for nonparametric multivariate curve estimation.

4.

Penalised spline estimation for generalised partially linear single-index models

Yan Yu Chaojiang Wu Yuankun Zhang 《Statistics and Computing》2017,27(2):571-582

Generalised linear models are frequently used in modeling the relationship of the response variable from the general exponential family with a set of predictor variables, where a linear combination of predictors is linked to the mean of the response variable. We propose a penalised spline (P-spline) estimation for generalised partially linear single-index models, which extend the generalised linear models to include nonlinear effects for some predictors. The proposed models can allow flexible dependence on some predictors while overcoming the “curse of dimensionality”. We investigate the P-spline profile likelihood estimation using the readily available R package mgcv, leading to straightforward computation. Simulation studies are considered under various link functions. In addition, we examine different choices of smoothing parameters. Simulation results and real data applications show the effectiveness of the proposed approach. Finally, some large sample properties are established.

5.

Non-linear regression models for Approximate Bayesian Computation

Michael G. B. Blum Olivier François 《Statistics and Computing》2010,20(1):63-73

Approximate Bayesian inference on the basis of summary statistics is well-suited to complex problems for which the likelihood is either mathematically or computationally intractable. However, the methods that use rejection suffer from the curse of dimensionality when the number of summary statistics is increased. Here we propose a machine-learning approach to the estimation of the posterior density by introducing two innovations. The new method fits a nonlinear conditional heteroscedastic regression of the parameter on the summary statistics, and then adaptively improves estimation using importance sampling. The new algorithm is compared to the state-of-the-art approximate Bayesian methods, and achieves considerable reduction of the computational burden in two examples of inference in statistical genetics and in a queueing model.

6.

Dimension reduction in estimating equations with covariates missing at random

Ying Zhang 《Journal of nonparametric statistics》2018,30(2):491-504

To estimate parameters defined by estimating equations with covariates missing at random, we consider three bias-corrected nonparametric approaches based on inverse probability weighting, regression and augmented inverse probability weighting. However, when the dimension of covariates is not low, the estimation efficiency will be affected due to the curse of dimensionality. To address this issue, we propose a two-stage estimation procedure by using the dimension-reduced kernel estimation in conjunction with bias-corrected estimating equations. We show that the resulting three estimators are asymptotically equivalent and achieve the desirable properties. The impact of dimension reduction in nonparametric estimation of parameters is also investigated. The finite-sample performance of the proposed estimators is studied through simulation, and an application to an automobile data set is also presented.

7.

Mean response estimation with missing response in the presence of high-dimensional covariates

Yongjin Li Qihua Wang Liping Zhu 《Communications in Statistics - Theory and Methods》2017,46(2):628-643

This paper studies the problem of mean response estimation where missingness occurs to the response but multiple-dimensional covariates are observable. Two main challenges occur in this situation: the curse of dimensionality and model specification. The nonparametric imputation method relieves model specification but suffers from the curse of dimensionality, while some model-based methods such as inverse probability weighting (IPW) and augmented inverse probability weighting (AIPW) methods are the opposite. We propose a unified nonparametric method to overcome the two challenges with the aid of sufficient dimension reduction. It imposes no parametric structure on the propensity score or the conditional mean response, and thus retains the nonparametric flavor. Moreover, the estimator achieves the optimal efficiency that a doubly robust estimator can attain. Simulations demonstrate the excellent performance of our method in various situations.
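For readers unfamiliar with the two model-based estimators named in this abstract, the following minimal sketch computes the textbook IPW and AIPW estimates of a mean response. It is not the paper's proposed method: the propensity scores and outcome-model predictions are simply passed in as given, and the function names and toy numbers are assumptions made for the example.

```python
def ipw_mean(ys, observed, propensities):
    """Inverse probability weighting: reweight observed responses by 1 / propensity."""
    n = len(ys)
    total = 0.0
    for y, seen, p in zip(ys, observed, propensities):
        if seen:
            total += y / p
    return total / n


def aipw_mean(ys, observed, propensities, predictions):
    """Augmented IPW: outcome-model prediction plus a weighted residual correction."""
    n = len(ys)
    total = 0.0
    for y, seen, p, m in zip(ys, observed, propensities, predictions):
        total += m
        if seen:
            total += (y - m) / p
    return total / n


# Toy data (hypothetical): None marks a missing response.
ys = [2.0, None, 4.0, None]
observed = [True, False, True, False]
propensities = [0.5, 0.5, 0.5, 0.5]   # assumed known probabilities of being observed
predictions = [2.0, 2.0, 4.0, 4.0]    # assumed output of some outcome model

ipw_estimate = ipw_mean(ys, observed, propensities)
aipw_estimate = aipw_mean(ys, observed, propensities, predictions)
```

In practice both the propensity scores and the predictions have to be estimated, which is where the model specification and dimensionality issues described in the abstract arise.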

8.

A new local estimation method for single index models for longitudinal data

Hongmei Lin Jianhong Shi Jicai Liu Yanghui Liu 《Journal of nonparametric statistics》2016,28(3):644-658

Single index models are natural extensions of linear models and overcome the so-called curse of dimensionality. They are very useful for longitudinal data analysis. In this paper, we develop a new efficient estimation procedure for single index models with longitudinal data, based on the Cholesky decomposition and the local linear smoothing method. Asymptotic normality for the proposed estimators of both the parametric and nonparametric parts is established. Monte Carlo simulation studies show excellent finite sample performance. Furthermore, we illustrate our methods with a real data example.

9.

Adaptive density estimation: A curse of support?

Patricia Reynaud-Bouret Vincent Rivoirard Christine Tuleau-Malot 《Journal of statistical planning and inference》2011,141(1):115-139

This paper deals with the classical problem of density estimation on the real line. Most of the existing papers devoted to minimax properties assume that the support of the underlying density is bounded and known. But this assumption may be very difficult to handle in practice. In this work, we show that, exactly as a curse of dimensionality exists when the data lie in R^d, there exists a curse of support as well when the support of the density is infinite. As for the dimensionality problem where the rates of convergence deteriorate when the dimension grows, the minimax rates of convergence may deteriorate as well when the support becomes infinite. This problem is not purely theoretical since the simulations show that the support-dependent methods are really affected in practice by the size of the density support, or by the weight of the density tail. We propose a method based on a biorthogonal wavelet thresholding rule that is adaptive with respect to the nature of the support and the regularity of the signal, but that is also robust in practice to this curse of support. The threshold proposed here is very accurately calibrated so that the gap between optimal theoretical and practical tuning parameters is almost filled.

10.

Dimension reduced kernel estimation for distribution function with incomplete data

Hu Z Follmann DA Qin J 《Journal of statistical planning and inference》2011,141(9):3084-3093

This work focuses on the estimation of distribution functions with incomplete data, where the variable of interest Y has ignorable missingness but the covariate X is always observed. When X is high dimensional, parametric approaches to incorporate X-information are encumbered by the risk of model misspecification and nonparametric approaches by the curse of dimensionality. We propose a semiparametric approach, which is developed under a nonparametric kernel regression framework, but with a parametric working index to condense the high dimensional X-information for reduced dimension. This kernel dimension reduction estimator has double robustness to model misspecification and is most efficient if the working index adequately conveys the X-information about the distribution of Y. Numerical studies indicate better performance of the semiparametric estimator over its parametric and nonparametric counterparts. We apply the kernel dimension reduction estimation to an HIV study for the effect of antiretroviral therapy on HIV virologic suppression.

11.

Flexible Copula Density Estimation with Penalized Hierarchical B‐splines

Göran Kauermann Christian Schellhase David Ruppert 《Scandinavian Journal of Statistics》2013,40(4):685-705

The paper introduces a new method for flexible spline fitting for copula density estimation. Spline coefficients are penalized to achieve a smooth fit. To weaken the curse of dimensionality, instead of a full tensor spline basis, a reduced tensor product based on so-called sparse grids (Notes Numer. Fluid Mech. Multidiscip. Des., 31, 1991, 241‐251) is used. To achieve uniform margins of the copula density, linear constraints are placed on the spline coefficients, and quadratic programming is used to fit the model. Simulations and practical examples accompany the presentation.

12.

An oracle property of the Nadaraya–Watson kernel estimator for high‐dimensional nonparametric regression

Daniel Conn Gang Li 《Scandinavian Journal of Statistics》2019,46(3):735-764

The Nadaraya–Watson estimator is among the most studied nonparametric regression methods. A classical result is that its convergence rate depends on the number of covariates and deteriorates quickly as the dimension grows. This underscores the “curse of dimensionality” and has limited its use in high‐dimensional settings. In this paper, however, we show that the Nadaraya–Watson estimator has an oracle property such that when the true regression function is single‐ or multi‐index, it discovers the low‐rank dependence structure between the response and the covariates, mitigating the curse of dimensionality. Specifically, we prove that, using K‐fold cross‐validation and a positive‐semidefinite bandwidth matrix, the Nadaraya–Watson estimator has a convergence rate that depends on the number of indices rather than on the number of covariates. This result follows by allowing the bandwidths to diverge to infinity rather than restricting them all to converge to zero at certain rates, as in previous theoretical studies.
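The role of a diverging bandwidth can be seen in a small sketch of the Nadaraya–Watson estimator with a Gaussian product kernel. As a simplifying assumption, the code uses one bandwidth per covariate (a diagonal bandwidth matrix) rather than the general positive‐semidefinite matrix of the paper, and the bandwidths are fixed by hand rather than chosen by cross‐validation; the data are hypothetical, with the response depending on the first covariate only.

```python
import math


def nadaraya_watson(x_query, xs, ys, bandwidths):
    """Kernel-weighted average of ys, using a Gaussian kernel with one bandwidth per covariate."""
    weights = []
    for x in xs:
        scaled_sq_dist = sum(
            ((q - xi) / h) ** 2 for q, xi, h in zip(x_query, x, bandwidths)
        )
        weights.append(math.exp(-0.5 * scaled_sq_dist))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase the bandwidths")
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical data: y equals the square of the first covariate; the second is noise.
xs = [(0.0, 5.0), (1.0, -3.0), (2.0, 8.0), (3.0, 0.0)]
ys = [0.0, 1.0, 4.0, 9.0]
query = (2.0, 0.0)

# A very large bandwidth on the second covariate effectively ignores it.
estimate_wide = nadaraya_watson(query, xs, ys, bandwidths=(0.1, 1e6))
# A small bandwidth on the irrelevant covariate lets it drive the weights instead.
estimate_narrow = nadaraya_watson(query, xs, ys, bandwidths=(0.1, 0.1))
```

With the wide bandwidth the estimate at the query point is essentially the response of the observation whose first covariate equals 2, whereas the narrow bandwidth makes the estimate follow the observation that happens to be closest in the noise covariate.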

13.

Non‐parametric Regression Tests Using Dimension Reduction Techniques

BERTHOLD R. HAAG 《Scandinavian Journal of Statistics》2008,35(4):719-738

Testing for parametric structure is an important issue in non‐parametric regression analysis. A standard approach is to measure the distance between a parametric and a non‐parametric fit with a squared deviation measure. These tests inherit the curse of dimensionality from the non‐parametric estimator. This results in a loss of power in finite samples and against local alternatives. This article proposes to circumvent the curse of dimensionality by projecting the residuals under the null hypothesis onto the space of additive functions. To estimate this projection, the smooth backfitting estimator is used. The asymptotic behaviour of the test statistic is derived and the consistency of a wild bootstrap procedure is established. The finite sample properties are investigated in a simulation study.

14.

Semiparametric inference for the recurrent events process by means of a single-index model

Olivier Bouaziz Ségolen Geffray Olivier Lopez 《Statistics》2015,49(2):361-385

In this paper, we introduce new parametric and semiparametric regression techniques for a recurrent event process subject to random right censoring. We develop models for the cumulative mean function and provide asymptotically normal estimators. Our semiparametric model, which relies on a single-index assumption, can be seen as a dimension reduction technique that, contrary to a fully nonparametric approach, is not struck by the curse of dimensionality when the number of covariates is high. We discuss data-driven techniques to choose the parameters involved in the estimation procedures and provide a simulation study to support our theoretical results.

15.

Books on probability ideas are reviewed

《Journal of Statistical Computation and Simulation》2012,82(6):707-711

In this paper, semiparametric methods are applied to estimate multivariate volatility functions, using a residual approach as in [J. Fan and Q. Yao, Efficient estimation of conditional variance functions in stochastic regression, Biometrika 85 (1998), pp. 645–660; F.A. Ziegelmann, Nonparametric estimation of volatility functions: The local exponential estimator, Econometric Theory 18 (2002), pp. 985–991; F.A. Ziegelmann, A local linear least-absolute-deviations estimator of volatility, Comm. Statist. Simulation Comput. 37 (2008), pp. 1543–1564], among others. Our main goal here is two-fold: (1) describe and implement a number of semiparametric models, such as additive, single-index and (adaptive) functional-coefficient, in volatility estimation, all motivated as alternatives to deal with the curse of dimensionality present in fully nonparametric models; and (2) propose the use of a variation of the traditional cross-validation method to deal with model choice in the class of adaptive functional-coefficient models, choosing simultaneously the bandwidth, the number of covariates in the model and also the single-index smoothing variable. The modified cross-validation algorithm is able to tackle the computational burden caused by the model complexity, providing an important tool in semiparametric volatility estimation. We briefly discuss model identifiability when estimating volatility as well as nonnegativity of the resulting estimators. Furthermore, Monte Carlo simulations for several underlying generating models are implemented and applications to real data are provided.

16.

Efficient Bayesian Multivariate Surface Regression

Feng Li Mattias Villani 《Scandinavian Journal of Statistics》2013,40(4):706-723

Methods for choosing a fixed set of knot locations in additive spline models are fairly well established in the statistical literature. The curse of dimensionality makes it nontrivial to extend these methods to nonadditive surface models, especially when there are more than a couple of covariates. We propose a multivariate Gaussian surface regression model that combines both additive splines and interactive splines, and a highly efficient Markov chain Monte Carlo algorithm that updates all the knot locations jointly. We use a shrinkage prior to avoid overfitting, with different estimated shrinkage factors for the additive and surface parts of the model, and also different shrinkage parameters for the different response variables. Simulated data and an application to firm leverage data show that the approach is computationally efficient, and that allowing for freely estimated knot locations can offer a substantial improvement in out‐of‐sample predictive performance.

17.

Quantile regression in varying coefficient models

《Journal of statistical planning and inference》2004,121(1):113-125

This paper deals with the estimation of conditional quantiles in varying coefficient models by estimating the coefficients. Varying coefficient models are among the popular models that have been proposed to alleviate the curse of dimensionality. Previous works on varying coefficient models deal with conditional means directly or indirectly. However, quantiles themselves can be defined without moment conditions, and plotting several conditional quantiles would give us more understanding of the data than plotting just the conditional mean. In particular, we estimate the conditional median by estimating varying coefficients by local L₁ regression.
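The link between quantiles and L₁-type criteria can be sketched without any regression structure: a sample quantile minimizes the total check (pinball) loss, and for the median this reduces to least absolute deviations. The code below is an unconditional toy version of that idea, not the local L₁ regression of the paper; the function names and data are assumptions made for the example.

```python
def check_loss(residual, tau):
    """Check (pinball) loss: tau * r for r >= 0 and (tau - 1) * r for r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def quantile_by_check_loss(ys, tau):
    """Return a data point minimizing the total check loss (ties go to the smallest)."""
    candidates = sorted(ys)
    return min(candidates, key=lambda c: sum(check_loss(y - c, tau) for y in ys))


# Hypothetical sample with one extreme value.
ys = [1.0, 2.0, 3.0, 4.0, 100.0]
median_estimate = quantile_by_check_loss(ys, 0.5)
mean_estimate = sum(ys) / len(ys)
```

The median stays at 3 despite the extreme value while the mean is pulled to 22, which illustrates why quantiles remain informative when moments are unstable.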

18.

Additive Nonparametric Regression in the Presence of Endogenous Regressors

Deniz Ozabaci Daniel J. Henderson Liangjun Su 《Journal of Business & Economic Statistics》2014,32(4):555-575

In this article we consider nonparametric estimation of a structural equation model under a full additivity constraint. We propose estimators for both the conditional mean and gradient which are consistent, asymptotically normal, oracle efficient, and free from the curse of dimensionality. Monte Carlo simulations support the asymptotic developments. We employ a partially linear extension of our model to study the relationship between child care and cognitive outcomes. Some of our (average) results are consistent with the literature (e.g., negative returns to child care when mothers have higher levels of education). However, as our estimators allow for heterogeneity both across and within groups, we are able to contradict many findings in the literature (e.g., we do not find any significant differences in returns between boys and girls or for formal versus informal child care). Supplementary materials for this article are available online.

19.

A misspecification test for the higher order co-moments of the factor model

Wanbo Lu Dong Yang Kris Boudt 《Statistics》2019,53(3):471-488

The traditional estimation of higher order co-moments of non-normal random variables by the sample analog of the expectation faces a curse of dimensionality, as the number of parameters increases steeply when the dimension increases. Imposing a factor structure on the process solves this problem; however, it leads to the challenging task of selecting an appropriate factor model. This paper contributes by proposing a test that exploits the following feature: when the factor model is correctly specified, the higher order co-moments of the unexplained return variation are sparse. It recommends a general-to-specific approach for selecting the factor model by choosing the most parsimonious specification for which the sparsity assumption is satisfied. This approach uses a Wald or Gumbel test statistic for testing the joint statistical significance of the co-moments that are zero when the factor model is correctly specified. The asymptotic distribution of the test is derived. An extensive simulation study confirms the good finite sample properties of the approach. This paper illustrates the practical usefulness of factor selection on daily returns of random subsets of S&P 100 constituents.

20.

An Extended Single‐index Model with Missing Response at Random

Qihua Wang Tao Zhang Wolfgang Karl Härdle 《Scandinavian Journal of Statistics》2016,43(4):1140-1152

An extended single‐index model is considered when responses are missing at random. A three‐step estimation procedure is developed to define an estimator for the single‐index parameter vector by a joint estimating equation. The proposed estimator is shown to be asymptotically normal. An algorithm for computing this estimator is proposed. This algorithm only involves one‐dimensional nonparametric smoothers, thereby avoiding the data sparsity problem caused by high model dimensionality. Some simulation studies are conducted to investigate the finite sample performances of the proposed estimators.