Similar Articles
20 similar articles found.
1.
To address the simultaneous selection of fixed and random effects in mixed-effects models, this paper proposes a regression procedure that imposes multiple penalty terms, gives an alternating iterative algorithm for parameter estimation, and proves the algorithm's convergence. Two special cases of the multi-penalty regression procedure are compared on simulated data; the results show that the new method performs well under a wide range of conditions and, in particular, can handle high-dimensional sparse mixed-effects models. A real data set is used to illustrate the application of the new method.
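As a rough illustration of the alternating scheme described above (a sketch of the general idea, not the paper's algorithm), the snippet below alternates a Lasso update of the fixed effects with a ridge-shrunken update of the random effects on synthetic data; all dimensions and tuning values are hypothetical.

```python
# Alternating penalized estimation for y = X*beta + Z*u + noise:
# Lasso-penalize the fixed effects, ridge-shrink the random effects.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
n, p, q = 200, 50, 5
X = rng.normal(size=(n, p))              # fixed-effect design (sparse truth)
Z = rng.normal(size=(n, q))              # random-effect design
beta = np.zeros(p); beta[:3] = [2.0, -1.5, 1.0]
y = X @ beta + Z @ rng.normal(scale=0.5, size=q) + rng.normal(scale=0.3, size=n)

b_hat, u_hat = np.zeros(p), np.zeros(q)
for _ in range(20):                      # alternate until the estimates stabilize
    b_new = Lasso(alpha=0.05).fit(X, y - Z @ u_hat).coef_
    u_hat = Ridge(alpha=1.0, fit_intercept=False).fit(Z, y - X @ b_new).coef_
    if np.max(np.abs(b_new - b_hat)) < 1e-6:
        break
    b_hat = b_new

print("selected fixed effects:", np.flatnonzero(b_hat != 0))
```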

2.
To cope with the high-dimensional complexity of clustering panel data, this paper uses a linear projection to convert the problem into a linear clustering problem on the projected feature vectors, so that high-dimensional samples can be clustered in a low-dimensional space. An empirical analysis verifies the feasibility and effectiveness of the resulting projection pursuit model for panel-data clustering.
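A minimal sketch of the project-then-cluster idea on synthetic panel data, with PCA standing in for the paper's projection pursuit direction search; sizes and names are illustrative.

```python
# Flatten each individual's panel block into one feature vector, project it
# to a low-dimensional space, then cluster the projected points.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
n_units, n_periods, n_vars = 60, 10, 8
panel = rng.normal(size=(n_units, n_periods, n_vars))
panel[:30] += 1.5                                   # two latent groups

flat = panel.reshape(n_units, -1)                   # units x (periods * vars)
proj = PCA(n_components=2).fit_transform(flat)      # low-dimensional projection
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(proj)
print(labels)
```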

3.
In applications of mixed-effects models to longitudinal data with latent variables, the data usually contain a large number of censored observations. If a standard Bayesian Tobit quantile regression model is applied directly, the Markov chain Monte Carlo sampling algorithm for parameter estimation becomes extremely complicated, leading to low computational efficiency and badly biased estimates. Moreover, in high-dimensional settings, interference from a large number of unknown random effects makes it harder to select key variables among the fixed effects and to estimate their coefficients. To resolve these problems, this paper proposes a new doubly adaptive-Lasso-penalized Bayesian Tobit quantile regression method for variable selection and parameter estimation with high-dimensional longitudinal data whose response is left-censored. By introducing adaptive Lasso penalties into the prior distributions of both the fixed and the random effects, a Gibbs sampling algorithm for parameter estimation is constructed. Monte Carlo simulations show that the new method outperforms both the unpenalized and the Lasso-penalized alternatives in selecting important variables and estimating coefficients.
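For reference, one standard way of writing such a left-censored Tobit quantile model is sketched below; the notation (censoring point c, quantile level τ, random effects u_i) is assumed rather than taken from the paper.

```latex
y_{ij}^{*} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta}
           + \mathbf{z}_{ij}^{\top}\mathbf{u}_{i} + \varepsilon_{ij},
\qquad Q_{\tau}\!\left(\varepsilon_{ij} \mid \mathbf{x}_{ij}, \mathbf{u}_{i}\right) = 0,
\qquad y_{ij} = \max\!\left(y_{ij}^{*},\, c\right),
```

with adaptive Lasso priors placed on both the fixed effects β and the random effects u_i, and Gibbs sampling used to draw from the resulting posteriors.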

4.
Specification Issues in Spatial Panel Data Models
Spatial panel data models combine spatial econometrics with panel data methods: they capture temporal and spatial features simultaneously and bring spatial effects into the analysis, which has made them an active research area in econometrics. Their specification, parameter estimation, and testing are correspondingly more complex, however, and specification errors are common in empirical work. Building on recent theory for spatial panel data models, this paper focuses on the common specification issues, including the choice between the spatial lag model and the spatial error model, the choice between random and fixed effects, and the selection and comparison of goodness-of-fit measures, providing a theoretical basis and a reference for applying these models and extending them to new ones.
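The two competing specifications mentioned take the following standard forms (spatial weights w_ij, individual effects μ_i; notation assumed, not taken from the paper):

```latex
\text{SAR:}\quad y_{it} = \rho \sum_{j} w_{ij}\, y_{jt}
  + \mathbf{x}_{it}^{\top}\boldsymbol{\beta} + \mu_i + \varepsilon_{it},
\qquad
\text{SEM:}\quad y_{it} = \mathbf{x}_{it}^{\top}\boldsymbol{\beta} + \mu_i + u_{it},
\quad u_{it} = \lambda \sum_{j} w_{ij}\, u_{jt} + \varepsilon_{it}.
```

The choice between them hinges on whether spatial dependence enters through the outcome itself (SAR) or through the disturbances (SEM).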

5.
Predicting medical costs is the premise and basis of health insurance ratemaking. Multi-year medical cost data are usually fitted with a linear mixed-effects model, but such a model is limited when modeling longitudinal data with nonlinear relationships. This paper extends the linear mixed-effects model to a polynomial mixed-effects model that reflects the nonlinear relationships among the variables in medical cost data, and applies it to a set of medical cost data in an empirical study. The results show that the polynomial mixed-effects model fits inpatient medical costs significantly better than the commonly used linear mixed model, giving it real value in medical cost management and health insurance ratemaking.
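A minimal sketch of such a polynomial mixed-effects fit using statsmodels on synthetic data; the variable names ('cost', 'age'), the quadratic form, and all sizes are illustrative assumptions, not the paper's specification.

```python
# Quadratic fixed effects in age, random intercept per patient.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n_patients, n_years = 100, 4
df = pd.DataFrame({
    "patient": np.repeat(np.arange(n_patients), n_years),
    "age": np.tile(np.arange(n_years), n_patients)
           + rng.integers(30, 60, n_patients).repeat(n_years),
})
u = rng.normal(scale=2.0, size=n_patients)        # patient random intercepts
df["cost"] = (5 + 0.8 * df["age"] - 0.005 * df["age"] ** 2
              + u[df["patient"]] + rng.normal(scale=1.0, size=len(df)))

model = smf.mixedlm("cost ~ age + I(age**2)", df, groups=df["patient"])
print(model.fit().summary())
```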

6.
于力超, 金勇进. 《统计研究》, 2016, 33(1): 95-102
Sample surveys often track multiple respondents over time, yielding panel data from which population characteristics are inferred. Panel data frequently contain missing values, and most software for handling them simply drops respondents with missing values to obtain a complete data set; when the missingness mechanism is not missing at random, this biases the estimates of population parameters. This paper discusses how to analyze panel data when the missingness is nonignorable, mainly by model-based likelihood inference: the joint distribution of the target variable, the missingness indicator, and a random-effects vector is modeled. Building on existing selection models and pattern-mixture models, random effects are introduced, the computation of the expectation of the target variable is studied, and parameter estimation under the hybrid random-effects model is developed. For relatively simple variable distributions, steps for maximum likelihood inference on the population parameters are given, and a simulation study compares the merits of the methods.
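The two factorizations that the approach builds on can be written side by side, with y the target variable, r the missingness indicator, and b the added random effect (notation assumed):

```latex
f(y, r \mid x, b)
  = \underbrace{f(y \mid x, b)\, f(r \mid y, x, b)}_{\text{selection model}}
  = \underbrace{f(y \mid r, x, b)\, f(r \mid x, b)}_{\text{pattern mixture}}.
```

Under nonignorable missingness the second factor depends on unobserved values of y, which is why the likelihood must model both parts jointly.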

7.
Specification, Statistical Tests, and New Developments of Panel Data Models
After introducing panel data and their advantages and limitations, this paper first discusses the specification of static panel data models, estimation methods for dynamic panel data models, and Granger causality tests from the viewpoints of heterogeneity, time variation, and cross-sectional dependence. It then classifies panel unit root tests and cointegration tests by their null hypotheses and reviews the theory systematically. Finally, it surveys recent developments in panel data econometrics.
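As one concrete example of a panel unit root test classified by its null, the sketch below runs a Fisher-type (Maddala-Wu) combination of per-unit ADF tests, where the null is a unit root in every series; this is a standard test of the kind surveyed, not code from the paper.

```python
# Fisher-type panel unit root test: combine per-unit ADF p-values;
# -2 * sum(log p_i) ~ chi2(2N) under the joint unit-root null.
import numpy as np
from scipy import stats
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(3)
N, T = 10, 200
panel = np.cumsum(rng.normal(size=(N, T)), axis=1)   # N random walks (null true)

pvals = [adfuller(series)[1] for series in panel]    # ADF p-value per unit
fisher = -2 * np.sum(np.log(pvals))
p_combined = stats.chi2.sf(fisher, df=2 * N)
print(f"Fisher statistic {fisher:.2f}, combined p-value {p_combined:.3f}")
```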

8.
The air quality index is an indicator closely related to people's daily activities. Using air pollution count data for 18 Chinese cities over the 52 weeks of 2014, this paper carries out negative binomial regression analyses, comparing a generalized linear mixed-effects model with generalized estimating equations both theoretically and in application. The results show that the two methods differ little when analyzing this air pollution problem, and that population factors, urban greening, meteorological factors, city-cluster effects, and seasonal effects significantly influence both the occurrence and the severity of air pollution in the cities studied.
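A minimal sketch of the GEE side of such a comparison: a negative binomial marginal model with an exchangeable working correlation across repeated weekly observations per city, fitted with statsmodels. The synthetic data and variable names are assumptions, not the study's data.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(4)
n_cities, n_weeks = 18, 52
df = pd.DataFrame({
    "city": np.repeat(np.arange(n_cities), n_weeks),
    "temp": rng.normal(size=n_cities * n_weeks),     # stand-in covariate
})
df["polluted_days"] = rng.poisson(lam=np.exp(0.3 + 0.5 * df["temp"]))

X = sm.add_constant(df[["temp"]])
model = sm.GEE(df["polluted_days"], X, groups=df["city"],
               family=sm.families.NegativeBinomial(alpha=1.0),
               cov_struct=sm.cov_struct.Exchangeable())
print(model.fit().summary())
```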

9.
Given the selection bias, nonresponse, and attrition that affect ordinary panel data, and the high cost of rotating panel data, the two kinds of samples are sometimes combined in practice according to the needs of the study and the characteristics of each, yielding a mixed sample of ordinary and rotating panel data. This paper proposes an iterative maximum likelihood estimation method for the two-way error components panel regression model under such mixed samples and derives iterative formulas for the unknown parameters. Monte Carlo simulations of the mean absolute bias and mean squared error of the parameter estimates under panel data and under mixed samples show that, compared with the maximum likelihood estimator based on panel data alone, the iterative method based on the mixed sample lowers both criteria overall and is therefore superior.
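For reference, the two-way error components regression in question has the standard form (notation assumed):

```latex
y_{it} = \mathbf{x}_{it}^{\top}\boldsymbol{\beta} + \mu_i + \lambda_t + \nu_{it},
\qquad
\mu_i \sim (0, \sigma_{\mu}^{2}), \quad
\lambda_t \sim (0, \sigma_{\lambda}^{2}), \quad
\nu_{it} \sim (0, \sigma_{\nu}^{2}),
```

where μ_i and λ_t are the individual and time effects whose variance components must be estimated alongside β.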

10.
Panel data in which the set of individuals is not exactly the same in every period are called unbalanced panel data. This paper reviews the principles and ideas behind estimation methods for unbalanced panel data, and builds a model on provincial unbalanced panel data for central and western China over 2004-2011 to study empirically the main factors influencing the inflow of domestic investment. The results show that agglomeration effects and regional innovation capacity are significantly and positively related to the scale of domestic investment attracted by central and western provinces.
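A minimal sketch of the within (fixed-effects) estimator, which needs no balance in the panel: demean y and x within each entity, then run OLS on the demeaned data. Entities below are observed for varying numbers of periods; all names and sizes are illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
rows = []
for i in range(30):                        # entities with unequal time spans
    for t in range(rng.integers(4, 9)):
        rows.append({"entity": i, "x": rng.normal(), "alpha": i * 0.1})
df = pd.DataFrame(rows)
df["y"] = 2.0 * df["x"] + df["alpha"] + rng.normal(scale=0.5, size=len(df))

# within transformation: subtract each entity's own means
demeaned = df[["y", "x"]] - df.groupby("entity")[["y", "x"]].transform("mean")
beta_hat = np.linalg.lstsq(demeaned[["x"]], demeaned["y"], rcond=None)[0]
print("within estimate of beta:", beta_hat)
```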

11.
Simplifying Regression Models Using Dimensional Analysis
Dimensional analysis can make a contribution to model formulation when some of the measurements in the problem are of physical factors. The analysis constructs a set of independent dimensionless factors that should be used as the variables of the regression in place of the original measurements. There are fewer of these than the originals and they often have a more appropriate interpretation. The technique is described briefly and its proposed role in regression discussed and illustrated with examples. We conclude that dimensional analysis can be effective in the preliminary stages of regression analysis when developing formulations involving continuous variables with several dimensions.
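A brief sketch of the construction: write each variable's exponents in the base dimensions as a column of a matrix; the nullspace of that matrix gives the exponents of the independent dimensionless products (Buckingham π). The example variables are assumed, not taken from the paper.

```python
import sympy as sp

# columns: force F, density rho, velocity v, length L
# rows: exponents in the base dimensions mass M, length L, time T
D = sp.Matrix([[ 1,  1,  0, 0],    # M
               [ 1, -3,  1, 1],    # L
               [-2,  0, -1, 0]])   # T
for vec in D.nullspace():
    print(vec.T)                   # exponent vector of one dimensionless group
```

Here the single nullspace vector corresponds to the product F^a * rho^b * v^c * L^d that is dimensionless; such products become the regression variables.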

12.
High-dimensional data pose a serious challenge to traditional covariance matrix estimation: dimensionality and noise make the CCC-GARCH model difficult to estimate. This paper combines principal components with thresholding in the estimation of the CCC-GARCH model, proposing a CCC-GARCH model based on principal orthogonal complement thresholding (PTCCC-GARCH). The PTCCC model captures the information in the large covariance matrix through the first K optimal principal components and removes the influence of noise through a threshold function. Simulation and empirical studies show that, compared with the CCC-GARCH model, the PTCCC-GARCH model clearly improves the estimation and forecasting of the high-dimensional covariance matrix, and that investors applying it to portfolio selection obtain higher returns and greater economic welfare.
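A sketch of the principal-orthogonal-complement thresholding step in isolation (the GARCH dynamics are omitted): keep the top-K principal components of the sample covariance matrix and soft-threshold the residual part. The function name and tuning values are hypothetical.

```python
import numpy as np

def pc_threshold_cov(X, K=3, tau=0.1):
    S = np.cov(X, rowvar=False)
    vals, vecs = np.linalg.eigh(S)
    idx = np.argsort(vals)[::-1][:K]               # top-K eigenpairs
    low_rank = (vecs[:, idx] * vals[idx]) @ vecs[:, idx].T
    resid = S - low_rank                           # orthogonal complement
    off = np.sign(resid) * np.maximum(np.abs(resid) - tau, 0.0)
    np.fill_diagonal(off, np.diag(resid))          # keep residual variances
    return low_rank + off

X = np.random.default_rng(6).normal(size=(500, 50))
Sigma_hat = pc_threshold_cov(X)
print(Sigma_hat.shape)
```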

13.
Many studies demonstrate that inference for the parameters arising in portfolio optimization often fails. The recent literature shows that this phenomenon is mainly due to a high-dimensional asset universe. Typically, such a universe refers to the asymptotics in which the sample size n + 1 and the sample dimension d both go to infinity while d/n → c ∈ (0,1). In this paper, we analyze the estimators for the excess returns' mean and variance, the weights and the Sharpe ratio of the global minimum variance portfolio under these asymptotics concerning consistency and asymptotic distribution. Problems of stating hypotheses in high dimension are also discussed. The applicability of the results is demonstrated by an empirical study.
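The plug-in estimators under study take the familiar form below: sample GMV weights and the resulting Sharpe ratio computed from the sample mean and covariance of excess returns, quantities whose naive use becomes unreliable when d/n → c > 0. The data here are synthetic.

```python
import numpy as np

rng = np.random.default_rng(7)
n, d = 250, 50
R = rng.normal(loc=0.001, scale=0.02, size=(n, d))    # excess returns

mu_hat = R.mean(axis=0)
S = np.cov(R, rowvar=False)
S_inv = np.linalg.inv(S)
ones = np.ones(d)
w_gmv = S_inv @ ones / (ones @ S_inv @ ones)          # GMV plug-in weights
sharpe = (w_gmv @ mu_hat) / np.sqrt(w_gmv @ S @ w_gmv)
print(f"sample GMV Sharpe ratio: {sharpe:.3f}")
```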

14.
Clustering by partitioning on data distribution density is one of the main approaches to clustering in data mining. To overcome the computational complexity and low efficiency of traditional density-partition clustering algorithms, this paper designs a multi-partition clustering algorithm based on stepwise high-dimensional projection: guided by the projected distribution density in high dimensions, the data set is partitioned repeatedly to produce a space of sub-clusters, which are then merged to form the final clustering. Experiments with the algorithm show that it is computationally simple and runs efficiently.
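A deliberately simplified, single-projection sketch of the idea: estimate the projected density with a histogram and cut the sample at a low-density valley. The paper's algorithm iterates this over several projections and then merges sub-clusters; only one pass is shown, and the projection direction is an arbitrary illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(8)
X = np.vstack([rng.normal(-4, 1, size=(100, 10)),
               rng.normal(4, 1, size=(100, 10))])

direction = np.ones(10) / np.sqrt(10)            # illustrative projection
proj = X @ direction
hist, edges = np.histogram(proj, bins=30)        # crude projected density
empty = np.flatnonzero(hist == 0)                # low-density valley bins
mid = empty[len(empty) // 2]                     # cut in the middle of the gap
cut = (edges[mid] + edges[mid + 1]) / 2
labels = (proj > cut).astype(int)
print(np.bincount(labels))                       # two recovered sub-clusters
```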

15.
We propose an ℓ1-penalized estimation procedure for high-dimensional linear mixed-effects models. The models are useful whenever there is a grouping structure among high-dimensional observations, that is, for clustered data. We prove a consistency and an oracle optimality result and we develop an algorithm with provable numerical convergence. Furthermore, we demonstrate the performance of the method on simulated and a real high-dimensional data set.
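The penalized criterion has the familiar form below, where ℓ is the log-likelihood of the linear mixed model and θ collects the covariance parameters (notation assumed):

```latex
\bigl(\hat{\boldsymbol{\beta}}, \hat{\boldsymbol{\theta}}\bigr)
  = \arg\min_{\boldsymbol{\beta},\, \boldsymbol{\theta}}
    \Bigl\{ -\ell\bigl(\boldsymbol{\beta}, \boldsymbol{\theta} \mid
      \mathbf{y}, \mathbf{X}, \mathbf{Z}\bigr)
    + \lambda \lVert \boldsymbol{\beta} \rVert_{1} \Bigr\}.
```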

16.
In their recent work, Jiang and Yang studied six classical likelihood ratio test statistics under a high-dimensional setting. Assuming that a random sample of size n is observed from a p-dimensional normal population, they derive the central limit theorems (CLTs) when p and n are proportional to each other; these differ from the classical chi-square limits obtained as n goes to infinity with p fixed. In this paper, by developing a new tool, we prove that the mentioned six CLTs hold in a more applicable setting: p goes to infinity, and p can be very close to n. This is an almost sufficient and necessary condition for the CLTs. Simulated histograms, comparisons of sizes and powers with the classical chi-square approximations, and discussions are presented afterwards.
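A short simulation illustrating the phenomenon for one such statistic, the LRT of H0: Σ = I with known mean, whose classical limit is χ² with p(p+1)/2 degrees of freedom; the over-rejection as p grows with n is what motivates the corrected CLTs. This is a generic illustration, not the papers' experiments.

```python
# LRT statistic for H0: Sigma = I (mean known): n*(tr(S) - log det(S) - p).
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
n, reps = 200, 500
for p in (5, 50, 100):
    rej = 0
    for _ in range(reps):
        X = rng.normal(size=(n, p))               # H0 true
        S = X.T @ X / n                           # MLE of Sigma
        _, logdet = np.linalg.slogdet(S)
        lrt = n * (np.trace(S) - logdet - p)
        rej += lrt > stats.chi2.ppf(0.95, p * (p + 1) / 2)
    print(f"p={p:3d}: empirical size {rej / reps:.3f} (nominal 0.05)")
```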

17.
We consider hypothesis testing problems for low-dimensional coefficients in a high-dimensional additive hazard model. A variance reduced partial profiling estimator (VRPPE) is proposed and its asymptotic normality is established, which enables us to test the significance of each single coefficient when the data dimension is much larger than the sample size. Based on the p-values obtained from the proposed test statistics, we then apply a multiple testing procedure to identify significant coefficients and show that the false discovery rate can be controlled at the desired level. The proposed method is also extended to testing a low-dimensional sub-vector of coefficients. The finite sample performance of the proposed testing procedure is evaluated by simulation studies. We also apply it to two real data sets, with one focusing on testing low-dimensional coefficients and the other focusing on identifying significant coefficients through the proposed multiple testing procedure.
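For reference, the additive hazard model takes the standard (Aalen / Lin-Ying) form below, with the tests targeting individual coordinates of β; the notation is assumed, not taken from the paper.

```latex
\lambda\bigl(t \mid \mathbf{X}\bigr) = \lambda_0(t) + \boldsymbol{\beta}^{\top}\mathbf{X}(t),
\qquad H_0 : \beta_j = 0 \ \text{for a given low-dimensional coordinate } j.
```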

18.
Yanfang Li & Jing Lei. Statistics, 2018, 52(4): 782-800
We study high-dimensional multigroup classification from a sparse subspace estimation perspective, unifying linear discriminant analysis (LDA) with other recent developments in high-dimensional multivariate analysis that use similar tools, such as penalization. We develop two two-stage sparse LDA models: in the first stage, convex relaxation converts two classical formulations of LDA into semidefinite programs, and the subspace perspective allows for straightforward regularization and estimation. After the initial convex relaxation, a refinement stage improves the accuracy. For the first model, a penalized quadratic program with a group lasso penalty is used for refinement, whereas a sparse version of the power method is used for the second model. We carefully examine the theoretical properties of both methods, alongside simulations and real data analysis.
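A sketch of a sparse power method of the kind used in the second model's refinement stage: power iteration on a symmetric matrix with soft-thresholding at each step, driving the leading direction toward sparsity. The matrix and threshold below are illustrative, not the authors' code.

```python
import numpy as np

def sparse_power(M, tau=0.05, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    v = rng.normal(size=M.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = M @ v
        w = np.sign(w) * np.maximum(np.abs(w) - tau, 0.0)  # soft-threshold
        nrm = np.linalg.norm(w)
        if nrm == 0:
            break
        v = w / nrm
    return v

# leading sparse direction of a spiked symmetric matrix
d = 50
u = np.zeros(d); u[:5] = 1 / np.sqrt(5)
M = 4 * np.outer(u, u) + np.eye(d)
print(np.flatnonzero(np.abs(sparse_power(M)) > 1e-8))   # recovers support {0..4}
```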

19.
Fan J & Lv J. Statistica Sinica, 2010, 20(1): 101-148
High dimensional statistical problems arise from diverse fields of scientific research and technological development. Variable selection plays a pivotal role in contemporary statistical learning and scientific discoveries. The traditional idea of best subset selection methods, which can be regarded as a specific form of penalized likelihood, is computationally too expensive for many modern statistical applications. Other forms of penalized likelihood methods have been successfully developed over the last decade to cope with high dimensionality. They have been widely applied for simultaneously selecting important variables and estimating their effects in high dimensional statistical inference. In this article, we present a brief account of the recent developments of theory, methods, and implementations for high dimensional variable selection. Questions of what limits of dimensionality such methods can handle, what role penalty functions play, and what their statistical properties are rapidly drive the advances of the field. The properties of non-concave penalized likelihood and its roles in high dimensional statistical modeling are emphasized. We also review some recent advances in ultra-high dimensional variable selection, with emphasis on independence screening and two-scale methods.
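A minimal sketch of the independence screening idea emphasized here (SIS followed by a penalized fit): rank predictors by marginal correlation with the response, keep roughly n/log n of them, then run the Lasso on the survivors. Sizes and thresholds are illustrative.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(10)
n, p = 200, 5000
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:4] = [3, -2, 2, 1.5]
y = X @ beta + rng.normal(size=n)

# marginal Pearson correlation of each predictor with y
corr = np.abs((X - X.mean(0)).T @ (y - y.mean())) / (X.std(0) * y.std() * n)
keep = np.argsort(corr)[::-1][: int(n / np.log(n))]   # screening step
lasso = LassoCV(cv=5).fit(X[:, keep], y)              # penalized refit
print("selected after SIS + Lasso:", np.sort(keep[lasso.coef_ != 0]))
```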

20.
Technical advances in many areas have produced more complicated high-dimensional data sets than the usual high-dimensional data matrix, such as the fMRI data collected in a period for independent trials, or expression levels of genes measured in different tissues. Multiple measurements exist for each variable in each sample unit of these data. Regarding the multiple measurements as an element in a Hilbert space, we propose Principal Component Analysis (PCA) in Hilbert space. The principal components (PCs) thus defined carry information about not only the patterns of variations in individual variables but also the relationships between variables. To extract the features with greatest contributions to the explained variations in PCs for high-dimensional data, we also propose sparse PCA in Hilbert space by imposing a generalized elastic-net constraint. Efficient algorithms to solve the optimization problems in our methods are provided. We also propose a criterion for selecting the tuning parameter.
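A rough sketch of the Hilbert-space view, without the sparse elastic-net step: when each variable carries several measurements, the "covariance" between two variables can be taken as the inner product of their centered measurement vectors, and PCA is run on the resulting matrix. Shapes and names are assumptions, not the paper's notation.

```python
import numpy as np

rng = np.random.default_rng(11)
n_samples, n_vars, n_meas = 40, 100, 8
data = rng.normal(size=(n_samples, n_vars, n_meas))   # multiple measurements

centered = data - data.mean(axis=0)
# entry (j, k): averaged inner product of variables j and k over samples
C = np.einsum("ijm,ikm->jk", centered, centered) / (n_samples - 1)
vals, vecs = np.linalg.eigh(C)
pcs = vecs[:, ::-1][:, :5]            # top-5 PC loadings across variables
print(pcs.shape)
```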
