1.
2.
3.
In the era of Big Data, extracting the most important exploratory variables from ultrahigh-dimensional data plays a key role in scientific research. Existing research has mainly focused on using the extracted exploratory variables to describe the central tendency of the related response variables. For a response variable, however, variability is as important as central tendency in statistical inference. This paper focuses on variability and proposes a new model-free feature screening approach: sure explained variability and independence screening (SEVIS). The core of SEVIS is to take advantage of recently proposed asymmetric and nonlinear generalised measures of correlation in the screening. Under some mild conditions, the paper shows that SEVIS not only possesses the desired sure screening and ranking consistency properties, but is also a computationally convenient variable selection method for ultrahigh-dimensional data sets with more features than observations. The superior performance of SEVIS compared with existing model-free methods is illustrated in extensive simulations. A real example in ultrahigh-dimensional variable selection demonstrates that the variables selected by SEVIS better explain not only the response variables, but also the variables selected by other methods.
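The abstract does not spell out the screening statistic, so the following is only a minimal sketch of marginal screening by explained variability. It assumes the utility for feature j is the explained-variance ratio Var(E[Y|X_j]) / Var(Y) and estimates it with equal-count bins; the binned estimator, the function names and the `n_bins` and `keep` parameters are illustrative assumptions, not the estimator or notation of the paper.

```python
import numpy as np

def explained_variance_ratio(x, y, n_bins=10):
    """Binned estimate of Var(E[Y|X]) / Var(Y) (an assumed stand-in utility)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    total = y.var()
    if total == 0:
        return 0.0  # a constant response has no variability to explain
    n_bins = max(1, min(n_bins, len(y)))  # avoid empty bins
    order = np.argsort(x, kind="stable")
    groups = np.array_split(y[order], n_bins)  # equal-count bins along x
    grand = y.mean()
    between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups) / len(y)
    return between / total

def screen_features(X, y, keep, n_bins=10):
    """Rank columns of X by the marginal utility and keep the top `keep`."""
    X = np.asarray(X, dtype=float)
    scores = np.array(
        [explained_variance_ratio(X[:, j], y, n_bins) for j in range(X.shape[1])]
    )
    ranked = np.argsort(-scores, kind="stable")
    return ranked[:keep], scores

# Hypothetical data: 200 observations, 500 features, a nonlinear signal in column 3.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((200, 500))
y_demo = X_demo[:, 3] ** 2 + 0.1 * rng.standard_normal(200)
selected, scores = screen_features(X_demo, y_demo, keep=10)
```

Because the utility is computed one feature at a time, the cost grows linearly in the number of features, which is what makes marginal screening practical when features outnumber observations. The quadratic signal in the demo has near-zero linear correlation with the response, so a measure that captures nonlinear dependence is needed to detect it.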
4.
In recent years, numerous feature screening schemes have been developed for ultra-high dimensional standard survival data with only one failure event. Nevertheless, the existing literature pays little attention to competing risks data, in which subjects may experience one of several mutually exclusive failure types. In this article, we develop a new marginal feature screening procedure for ultra-high dimensional time-to-event data that allows for competing risks. The proposed procedure is model-free and robust against heavy-tailed distributions and potential outliers in the time to the failure type of interest. In addition, it is invariant to any monotone transformation of the event time of interest. Under rather mild assumptions, it is shown that the newly suggested approach possesses the ranking consistency and sure independence screening properties. Numerical studies are conducted to evaluate the finite-sample performance of our method and to compare it with its competitor, and an application to a real data set is provided as an illustration.
5.
Xiaolin Chen 《Journal of Statistical Computation and Simulation》2018,88(12):2425-2446
This paper is concerned with conditional feature screening for ultra-high dimensional right censored data when some important predictors have been identified previously. A new model-free conditional feature screening approach, conditional correlation rank sure independence screening, is proposed and investigated theoretically. The suggested conditional screening procedure has several desirable merits. First, it is model-free, and thus robust to model misspecification. Second, it is robust to heavy-tailed distributions of the response and to potential outliers in the response. Third, it is naturally applicable to complete data when there is no censoring. Through simulation studies, we demonstrate that the proposed approach outperforms the CoxCS of Hong et al. under some circumstances. A real dataset is used to illustrate the usefulness of the proposed conditional screening method.
6.
This paper demonstrates how certain statistics, computed from a sample of size n (from almost any distribution), may be simulated by using a sequence of substantially fewer than n random normal variates. Many statistics, θ, including almost all maximum likelihood estimates, can be expressed in terms of the sample trigonometric moments (STM). The STM are asymptotically multivariate normal, with a mean vector and variance-covariance matrix easily expressible in terms of equally spaced characteristic function evaluations. Thus one only needs to know the Fourier transform, or equivalently the characteristic function, associated with the elements of any moderate to large i.i.d. sample, and have access to a normal random number generator, to generate a sequence of STM with distributional properties almost identical to those of STM computed from that sample. These STM can in turn be used to compute the desired statistic θ.
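The idea can be sketched as follows, assuming the STM are the sample means of cos(tX) and sin(tX) at a set of frequencies t. Their means are the real and imaginary parts of the characteristic function φ(t), and their covariances follow from the product-to-sum identities, which need only φ at sums and differences of the frequencies. The function names, the example frequencies and the normal characteristic function used here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def stm_normal_params(cf, freqs, n):
    """Mean and covariance of (C_1..C_m, S_1..S_m), where C_k and S_k are the
    sample means of cos(t_k X) and sin(t_k X) over n i.i.d. draws."""
    t = np.asarray(freqs, dtype=float)
    phi = np.asarray(cf(t))
    plus = np.asarray(cf(t[:, None] + t[None, :]))   # phi(t_j + t_k)
    minus = np.asarray(cf(t[:, None] - t[None, :]))  # phi(t_j - t_k)
    re, im = phi.real, phi.imag
    # Product-to-sum identities give the second moments of cos and sin terms.
    cc = 0.5 * (plus.real + minus.real) - np.outer(re, re)  # Cov(cos, cos)
    ss = 0.5 * (minus.real - plus.real) - np.outer(im, im)  # Cov(sin, sin)
    cs = 0.5 * (plus.imag - minus.imag) - np.outer(re, im)  # Cov(cos_j, sin_k)
    mean = np.concatenate([re, im])
    cov = np.block([[cc, cs], [cs.T, ss]]) / n
    return mean, cov

def simulate_stm(cf, freqs, n, reps, seed=0):
    """Draw `reps` approximate STM vectors without generating any size-n sample."""
    mean, cov = stm_normal_params(cf, freqs, n)
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mean, cov, size=reps)

def normal_cf(t, mu=0.0, sigma=1.0):
    """Characteristic function of N(mu, sigma^2), used only as an example."""
    t = np.asarray(t, dtype=float)
    return np.exp(1j * mu * t - 0.5 * (sigma * t) ** 2)

# Each row stands in for the STM of one sample of size 10_000,
# but costs only 2 * len(freqs) normal variates to generate.
draws = simulate_stm(normal_cf, freqs=[0.5, 1.0], n=10_000, reps=1_000)
```

With m frequencies, each replicate needs 2m normal variates instead of n draws from the original distribution, which is the saving the abstract describes when m is much smaller than n. When the frequencies are equally spaced, every sum and difference is again a multiple of the spacing, so φ only has to be evaluated on one equally spaced grid.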