In high-dimensional linear regression, the number of variables exceeds the sample size. In this setting, the traditional variance estimator based on ordinary least squares tends to exhibit a high bias, even under a sparsity assumption. One major reason is the high spurious correlation between the unobserved realized noise and some of the predictors. To alleviate this problem, a refitted cross-validation (RCV) method has been proposed in the literature. However, for a complicated model with finite samples, the RCV has a lower probability that the selected model includes the true model, which can easily lead to a large bias in variance estimation. This study therefore proposes a model selection method based on the ranks of the frequency of occurrences in six votes from a blocked 3×2 cross-validation. In practice, the proposed method has a considerably larger probability of including the true model than the RCV method. The variance estimate obtained using the model selected by the proposed method also shows a lower bias and a smaller variance. Furthermore, theoretical analysis establishes the asymptotic normality of the proposed variance estimator.
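The refitted cross-validation idea referred to in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the selection step here is a simple marginal-correlation screen, and the function names (`screen_top_k`, `refit_variance`, `rcv_variance`) and the model size `k` are assumptions made for illustration. The data are split into two halves, variables are selected on one half, the variance is estimated by an ordinary least squares refit on the other half, and the two resulting estimates are averaged.

```python
import numpy as np

def screen_top_k(X, y, k):
    """Return indices of the k columns most correlated (in absolute value) with y.
    Assumption: marginal screening stands in for whatever selector is actually used."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    scores = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) + 1e-12)
    return np.argsort(scores)[-k:]

def refit_variance(X, y, selected):
    """Residual variance of an OLS fit of y on the selected columns plus an intercept."""
    n = X.shape[0]
    design = np.column_stack([np.ones(n), X[:, selected]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return resid @ resid / (n - design.shape[1])

def rcv_variance(X, y, k, rng):
    """Select on one half, refit on the other, swap roles, and average."""
    n = X.shape[0]
    perm = rng.permutation(n)
    half1, half2 = perm[: n // 2], perm[n // 2:]
    sel1 = screen_top_k(X[half1], y[half1], k)
    sel2 = screen_top_k(X[half2], y[half2], k)
    v1 = refit_variance(X[half2], y[half2], sel1)  # model from half 1, refit on half 2
    v2 = refit_variance(X[half1], y[half1], sel2)  # model from half 2, refit on half 1
    return 0.5 * (v1 + v2)

# Hypothetical simulated data: 3 true predictors out of 500, noise variance 1.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 500))
beta = np.zeros(500)
beta[:3] = 3.0
y = X @ beta + rng.standard_normal(200)
print(rcv_variance(X, y, k=5, rng=rng))
```

Because the refit uses data that played no part in the selection, predictors that were chosen only through spurious correlation with the noise in one half have no systematic advantage in the other half, which is the source of the bias reduction.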
This paper studies outlier detection and robust variable selection in the linear regression model. The penalized weighted least absolute deviation (PWLAD) regression estimation method and the adaptive least absolute shrinkage and selection operator (LASSO) are combined to achieve outlier detection and robust variable selection simultaneously. An iterative algorithm is proposed to solve the resulting optimization problem. Monte Carlo studies evaluate the finite-sample performance of the proposed methods. The results indicate that the proposed methods perform better than existing methods when there are leverage points or outliers in the response variable or the explanatory variables. Finally, we apply the proposed methodology to analyze two real datasets.
Let $X = \{X_1, X_2, \ldots\}$ be a sequence of independent but not necessarily identically distributed random variables, and let $\eta$ be a counting random variable independent of $X$. Consider the randomly stopped sum $S_\eta = \sum_{k=1}^{\eta} X_k$ and the random maximum $S_{(\eta)} := \max\{S_0, \ldots, S_\eta\}$. Assuming that each $X_k$ belongs to the class of consistently varying distributions, on the basis of the well-known precise large deviation principles, we prove that the distributions of $S_\eta$ and $S_{(\eta)}$ belong to the same class under some mild conditions. Our approach is new, and the obtained results extend the studies of Kizinevič, Sprindys, and Šiaulys (2016) and Andrulytė, Manstavičius, and Šiaulys (2017).
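For readers unfamiliar with the class named in the abstract above, the standard definition of consistent variation is recalled here; the symbols $F$, $\overline{F}$, and $\mathcal{C}$ are notation introduced for this note and do not appear in the abstract. A distribution function $F$ with tail $\overline{F} = 1 - F$ is consistently varying if its tail changes little under small rescalings of the argument:

```latex
F \in \mathcal{C}
\quad\Longleftrightarrow\quad
\lim_{y \uparrow 1} \, \limsup_{x \to \infty}
\frac{\overline{F}(xy)}{\overline{F}(x)} = 1 .
```

Every regularly varying tail, such as a Pareto tail, satisfies this condition.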
China implemented the two-child policy in 2016; however, the potential impacts of this new policy on its population have not been adequately understood. Using population census data and 1% population sampling data for the period 1982–2015, this study develops a fertility simulation model to explore the effects of the two-child policy on women’s total fertility rate, and employs the Cohort Component Method in population projections to examine China’s demographic future under different fertility regimes. The fertility simulation results reveal that the two-child policy will have a significantly positive effect on China’s total fertility rate by increasing second births, leading to a sharp but temporary increase in the first 5 years after the implementation of the new policy. In addition, population projections using simulated total fertility rates show that the Chinese population would reach its peak around the mid-2020s and face a reduction in labor force supply and rapid aging, marked by remarkable increases in both the size and the share of the elderly population. The findings suggest that the two-child policy would undoubtedly affect China’s fertility rates and demographic future; however, the effects are mild and temporary.
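The Cohort Component Method named in the abstract above advances a population one age interval at a time: each age group is aged forward using survival ratios, and a new youngest group is created from births implied by age-specific fertility rates. The sketch below shows one such step for a female-only population. It is a simplified illustration under stated assumptions, not the study's model: migration is ignored, the oldest group is closed rather than open-ended, and all names, default values, and numbers are hypothetical.

```python
import numpy as np

def total_fertility_rate(asfr, step_years=5):
    """TFR implied by annual age-specific fertility rates for age groups of equal width."""
    return step_years * float(np.sum(asfr))

def project_one_step(pop, survival, asfr, step_years=5,
                     female_birth_share=0.5, infant_survival=1.0):
    """Advance a female population by one age-group width.

    pop      : women in each age group at the start of the step
    survival : probability of surviving from group i to group i + 1 (length len(pop) - 1)
    asfr     : annual births per woman in each age group
    Assumptions: no migration; women in the oldest group leave the population.
    """
    pop = np.asarray(pop, dtype=float)
    asfr = np.asarray(asfr, dtype=float)
    survival = np.asarray(survival, dtype=float)
    births = step_years * np.sum(asfr * pop)             # all births during the step
    new_pop = np.empty_like(pop)
    new_pop[0] = births * female_birth_share * infant_survival  # surviving girls
    new_pop[1:] = pop[:-1] * survival                    # existing cohorts age forward
    return new_pop

# Hypothetical three-group example.
pop = [100.0, 100.0, 100.0]
print(project_one_step(pop, survival=[1.0, 0.5], asfr=[0.0, 0.1, 0.0]))
print(total_fertility_rate([0.0, 0.1, 0.0]))
```

Repeating the step with different fertility schedules is the mechanism by which alternative fertility regimes translate into different projected population sizes and age structures.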