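For high-dimensional mixed effects models, this paper proposes a doubly regularized quantile regression method. By imposing L1 penalties on the random-effect and fixed-effect coefficients simultaneously, the method selects important explanatory variables while removing the bias caused by individual random fluctuations. An alternating iterative algorithm for the parameter estimates resolves the difficulty of determining two tuning parameters at once, and it is computationally fast. Simulations show that the method is robust to the type of error distribution and performs well across different sparsity levels, particularly in the high-dimensional case where explanatory variables outnumber observations. To help choose the optimal regularization parameters in practice, the paper also compares two parameter selection criteria. Finally, the method is applied to an education data set to identify the factors that matter for student achievement at each quantile.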
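The doubly penalized objective described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the names `check_loss`, `double_l1_objective`, `lam_beta`, and `lam_alpha` are introduced here for illustration, and the sketch only evaluates the objective rather than performing the alternating iteration.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def double_l1_objective(y, X, Z, beta, alpha, tau, lam_beta, lam_alpha):
    """Check loss plus separate L1 penalties on fixed and random effects.

    X, beta: fixed-effect design matrix and coefficients (assumed layout).
    Z, alpha: random-effect design matrix and coefficients (assumed layout).
    lam_beta, lam_alpha: the two tuning parameters (hypothetical names).
    """
    resid = y - X @ beta - Z @ alpha
    return (check_loss(resid, tau).sum()
            + lam_beta * np.abs(beta).sum()
            + lam_alpha * np.abs(alpha).sum())
```

An alternating scheme would minimize this objective over `beta` with `alpha` held fixed, then over `alpha` with `beta` held fixed, and repeat until the value stops decreasing.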

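Although Bayesian quantile regression copes well with the sharp peaks, heavy tails, and structural breaks of economic and financial data, and makes full use of information from existing research, it does not resolve the curse of dimensionality in models with many variables. The article therefore combines Bayesian quantile regression with the adaptive Lasso penalty to build an adaptive Lasso penalized Bayesian quantile regression model based on MH (Metropolis-Hastings) sampling. Simulation experiments and MCMC chain diagnostics show that the model fits well, especially in small samples.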

A financial early-warning model based on SCAD-penalized logistic regression
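As a supervised learning algorithm, logistic regression (LR) is widely used in financial distress modeling, but it is prone to overfitting. This paper therefore proposes a financial early-warning model based on logistic regression with the smoothly clipped absolute deviation (SCAD) penalty. The model mitigates overfitting, performs variable selection and coefficient estimation simultaneously, and is easier to interpret. An empirical study uses financial data on A-share manufacturing companies listed in Shanghai and Shenzhen and compares the model's early-warning performance with that of ordinary L1- and L2-regularized logistic regression. The results show that the SCAD-penalized logistic regression model classifies well and has strong economic interpretability.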
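For reference, the SCAD penalty named in this abstract has a simple piecewise form. The sketch below uses the standard definition with the conventional default a = 3.7; the function name `scad_penalty` is an assumption made here, and the abstract does not say which value of a the authors used.

```python
import numpy as np

def scad_penalty(t, lam, a=3.7):
    """SCAD penalty evaluated elementwise at coefficient values t.

    lam > 0 is the tuning parameter and a > 2 controls where the
    penalty flattens out (a = 3.7 is the conventional default).
    """
    t = np.abs(np.asarray(t, dtype=float))
    small = lam * t                                    # |t| <= lam: same as L1
    mid = (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1))  # quadratic blend
    large = np.full_like(t, lam**2 * (a + 1) / 2)      # |t| > a*lam: constant
    return np.where(t <= lam, small, np.where(t <= a * lam, mid, large))
```

Because the penalty is constant beyond `a * lam`, large coefficients are not shrunk further, which is the property that distinguishes SCAD from the L1 penalty it is compared against.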

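This article considers panel quantile models whose coefficients have a two-dimensional heterogeneity structure. Based on the SCAD and MCP penalty functions, it proposes a doubly penalized least weighted absolute deviation objective function that estimates the parameters and identifies the two-dimensional heterogeneity structure simultaneously. The objective is solved with the ADMM algorithm, and the optimal tuning parameters are chosen by a grid search under the BIC information criterion. Monte Carlo simulations verify the finite-sample properties of the method, and real data are used to test its performance in application. The results show that the method accurately identifies the two-dimensional heterogeneity structure and that the estimation accuracy of the Post estimator is close to that of the Oracle estimator.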
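The MCP penalty mentioned in this abstract can be written in a few lines. This is a sketch of the standard minimax concave penalty only; the name `mcp_penalty` and the default `gamma=3.0` are assumptions chosen for illustration, and the sketch does not reproduce the article's weighted objective or its ADMM solver.

```python
import numpy as np

def mcp_penalty(t, lam, gamma=3.0):
    """Minimax concave penalty (MCP) evaluated elementwise.

    lam > 0 is the tuning parameter; gamma > 1 sets how quickly the
    penalty levels off (the default here is an illustrative choice).
    """
    t = np.abs(np.asarray(t, dtype=float))
    inner = lam * t - t**2 / (2 * gamma)   # region |t| <= gamma * lam
    outer = gamma * lam**2 / 2             # constant beyond gamma * lam
    return np.where(t <= gamma * lam, inner, outer)
```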

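Using a nonlinear time-varying factor model, this article studies convergence in China's rural financial development. The results show no overall convergence; among the traditional eastern, central, and western economic zones, only the western region shows convergence. A club convergence algorithm endogenously identifies four convergence clubs in China's rural financial development, and the article further analyzes the distribution of the four clubs and the mechanisms that shape their relative transition paths.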

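Building on theoretical research on cloud models, quantum algorithms, and neural network algorithms, this paper designs a multilayer quantum neural network whose information-processing units are qubit neurons: a cloud-model-based hybrid quantum neural network algorithm. Simulation experiments on standard data sets show that the hybrid quantum algorithm retains the advantage of quantum algorithms in trajectory behavior, and that it can feed the extracted features into the quantum neural network to classify the data sets. The algorithm improves the loss function of the quantum neural network and its error analysis performance. Finally, simulation experiments verify that the hybrid quantum algorithm outperforms the quantum neural network algorithm in convergence speed and robustness.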

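This paper proposes a nonparametric Bayesian quantile regression method for mixed effects models. By introducing a new hierarchical finite mixture of normal distributions, it relaxes the assumption on the random error term in quantile regression modeling to a quantile constraint only. A flexible Stick-Breaking prior on the mixing proportions strengthens the model's ability to capture complex distributional features of the data. Latent variables are introduced into the nonparametric Bayesian hierarchical quantile regression model so that the Gibbs sampling algorithm for the model parameters evaluates only N function terms per iteration instead of the original (2M)^N. Monte Carlo simulations show that, as the error distribution becomes more complex, the nonparametric Bayesian quantile regression method has a growing advantage in estimation performance over parametric methods.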
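The Stick-Breaking prior on mixing proportions can be illustrated with a truncated construction. This is a minimal sketch assuming Beta(1, concentration) stick fractions and a finite number of components; the function name and both parameter names are hypothetical, and the abstract does not specify the authors' exact prior settings.

```python
import numpy as np

def stick_breaking_weights(n_components, concentration, rng):
    """Draw mixing proportions from a truncated stick-breaking prior.

    Each v_k ~ Beta(1, concentration) is the fraction broken off the
    remaining stick; the last fraction is set to 1 so the weights of
    the finite mixture sum to one.
    """
    v = rng.beta(1.0, concentration, size=n_components)
    v[-1] = 1.0  # truncation: the final component takes what is left
    # Length of stick remaining before each break: prod_{j<k} (1 - v_j).
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

weights = stick_breaking_weights(5, 2.0, np.random.default_rng(0))
```

Smaller values of `concentration` put most of the mass on the first few components, while larger values spread it more evenly.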

罗军 (Luo Jun), 《统计与决策》 (Statistics & Decision), 2020, (8): 170-173
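Based on a Stackelberg game model, this article sets up a penalty mechanism for supply disruption and analyzes the game under disruption risk when the supplier operates in make-to-order (MTO) mode. The outcome is that the buyer's order quantity rises with the selling price in the end market and with the supplier's disruption penalty cost, and falls as the supplier's stability level and pricing level increase; this order quantity decision is the buyer's optimal solution in the Stackelberg game. The supplier's stability level is the most important external environment variable of the decision model: when the probability of supply disruption increases, the supplier instinctively lowers its quoted price to hedge the disruption risk it cannot avoid, which encourages the buyer to order more.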

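When regression is used for data prediction and the dependent and independent variables are approximately linearly related, it is not necessary to find the exact nonlinear function linking them; a linear regression model can still be used after the data are corrected. This article discusses a dynamic regression model with a penalty factor. Building on the traditional multiple linear regression model, the method performs stepwise regression while repeatedly adjusting the dependent variable, updating its trend in real time to obtain the best prediction. The method gave satisfactory results when used to analyze and forecast the annual number of foreign tourists visiting Shanghai.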

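Introducing random effects offers an important way to handle sample correlation and heteroscedasticity in panel data modeling, but too many random effects greatly increase model complexity and bias the estimates of the fixed-effect coefficients. Treating the random effects of each cross-sectional individual as a group, this article shrinks them as a whole. By placing different forms of conditional Laplace priors on the fixed and random effects, it constructs a Bayesian doubly penalized quantile regression estimator equivalent to a Group Lasso-Lasso penalty. A slice Gibbs sampling algorithm solves the parameter estimation problem quickly and effectively. Computer simulations show that the method estimates the fixed- and random-effect parameters accurately and automatically selects the fixed and random effects that truly belong in the model.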

Statistical diagnostics for mixed geographically weighted regression models
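The mixed geographically weighted regression model, a simple and effective data analysis method for spatial nonstationarity, is widely used. When the model is applied to real data, one or more unusual observations can change the estimates substantially. To detect outliers effectively, this paper systematically studies statistical diagnostics and influence analysis for this class of semiparametric models. First, it defines the Cook statistic for the parametric component based on the case-deletion model; second, it discusses outlier testing based on the mean-shift model and constructs the corresponding test statistic.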

The implications of parameter orthogonality for the robustness of survival regression models are considered. The question of which of the proportional hazards or the accelerated life families of models would be more appropriate for analysis is usually ignored, and the proportional hazards family is applied, particularly in medicine, for convenience. Accelerated life models have conventionally been used in reliability applications. We propose a one-parameter family mixture survival model which includes both the accelerated life and the proportional hazards models. By orthogonalizing relative to the mixture parameter, we can show that, for small effects of the covariates, the regression parameters under the alternative families agree to within a constant. This recovers a known misspecification result. We use notions of parameter orthogonality to explore robustness to other types of misspecification including misspecified base-line hazards. The results hold in the presence of censoring. We also study the important question of when proportionality matters.


In the stepwise procedure of selection of a fixed or a random explanatory variable in a mixed quantitative linear model with errors following a Gaussian stationary autocorrelated process, we have studied the efficiency of five estimators relative to Generalized Least Squares (GLS): Ordinary Least Squares (OLS), Maximum Likelihood (ML), Restricted Maximum Likelihood (REML), First Differences (FD), and First-Difference Ratios (FDR). We have also studied the validity and power of seven derived testing procedures, to assess the significance of the slope of the candidate explanatory variable x2 to enter the model in which there is already one regressor x1. In addition to five testing procedures of the literature, we considered the FDR t-test with n − 3 df and the modified t-test for partial correlations, whose df are Dutilleul's effective sample size minus 3. Efficiency, validity, and power were analyzed by Monte Carlo simulations, as functions of the nature, fixed vs. random (purely random or autocorrelated), of x1 and x2, the sample size, and the autocorrelation of random terms in the regression model. We report extensive results for the autocorrelation structure of first-order autoregressive [AR(1)] type, and discuss results that we obtained for other autocorrelation structures, such as spherical semivariogram, first-order moving average [MA(1)], and ARMA(1,1), but could not present because of space constraints. Overall, we found that:
  1. the efficiency of slope estimators and the validity of testing procedures depend primarily on the nature of x2, but not on that of x1;

  2. FDR is the most inefficient slope estimator, regardless of the nature of x1 and x2;

  3. REML is the most efficient of the slope estimators compared relative to GLS, provided the specified autocorrelation structure is correct and the sample size is large enough to ensure the convergence of its optimization algorithm;

  4. the FDR t-test, the modified t-test and the REML t-test are the most valid of the testing procedures compared, despite the inefficiency of the FDR and OLS slope estimators for the former two;

  5. the FDR t-test, however, suffers from a lack of power that varies with the nature of x1 and x2; and

  6. the modified t-test for partial correlations, which does not require the specification of an autocorrelation structure, can be recommended when x1 is fixed or random and x2 is random, whether purely random or autocorrelated.

Our results are illustrated by the environmental data that motivated our work.


Bayesian nonlinear mixed effects models and their applications
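Because the commonly used linear mixed effects model has limitations for longitudinal data with nonlinear relationships, this paper extends it: it builds different nonlinear mixed effects models according to the nonlinear relationships among the variables and builds mixture distribution models according to the distributional features of the dependent variable. Using a real insurance loss data set, it fits a polynomial mixed effects model, a truncated polynomial mixed effects model, and a B-spline mixed effects model. The results show that nonlinear mixed effects models significantly improve the modeling of insurance loss data and are a useful reference for non-life insurance ratemaking.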

In this article, we use two efficient approaches to deal with the difficulty in computing the intractable integrals when implementing Gibbs sampling in the nonlinear mixed effects model (NLMM) based on Dirichlet processes (DP). In the first approach, we compute the Laplace approximation to the integral for its high accuracy, low cost, and ease of implementation. The second approach uses the no-gaps algorithm of MacEachern and Müller (1998) to perform Gibbs sampling without evaluating the difficult integral. We apply both approaches to real problems and simulations. Results show that both approaches perform well in density estimation and prediction and are superior to the parametric analysis in that they can detect important model features, such as skewness, long tails, and multimodality, whereas the parametric analysis cannot.
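To make the first approach concrete, the following is a one-dimensional sketch of the Laplace approximation to an integral of the form ∫ exp(h(x)) dx. It is an illustration only, not the article's implementation: the function name `laplace_approx`, the finite-difference derivatives, and the Newton search for the mode are all assumptions chosen to keep the example self-contained.

```python
import math

def laplace_approx(h, x0, eps=1e-4, max_steps=50):
    """Approximate the integral of exp(h(x)) over the real line.

    Finds the mode of h by Newton's method with central finite
    differences, then returns exp(h(mode)) * sqrt(2*pi / -h''(mode)).
    """
    def first(x):
        return (h(x + eps) - h(x - eps)) / (2 * eps)

    def second(x):
        return (h(x + eps) - 2 * h(x) + h(x - eps)) / eps**2

    x = x0
    for _ in range(max_steps):
        curvature = second(x)
        if curvature >= 0:
            raise ValueError("h must be concave near the mode")
        step = first(x) / curvature
        x -= step
        if abs(step) < 1e-10:
            break
    return math.exp(h(x)) * math.sqrt(2 * math.pi / -second(x))

# For a Gaussian kernel the approximation is exact up to rounding:
# the integral of exp(-(x - 1)^2 / 8) is sqrt(2 * pi * 4).
value = laplace_approx(lambda x: -((x - 1.0) ** 2) / 8.0, x0=0.0)
```

For non-Gaussian integrands the result is only approximate, and its accuracy depends on how closely the integrand resembles a Gaussian around its mode.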

In this article, the zero-one inflated binomial mixed regression is proposed to model proportional data with large frequencies of both zeros and binomial denominators. Score tests for assessing both extra zeros and extra binomial denominators in proportional data are developed. The empirical levels and empirical powers of the score test statistics are evaluated using a simulation study. Finally, the application of the proposed model is illustrated on the whitefly data.
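A zero-one inflated binomial distribution of the kind this abstract describes can be sketched as a three-part mixture. The parameterization below (extra mass `p_zero` at 0 and `p_one` at the binomial denominator n) is an assumed, commonly used form; the article's exact parameterization and its random effects are not reproduced, and all names are hypothetical.

```python
from math import comb

def zoib_pmf(y, n, pi, p_zero, p_one):
    """P(Y = y) for a zero-one inflated binomial with n >= 1 trials.

    With probability p_zero the count is 0, with probability p_one it
    equals n, and otherwise it follows Binomial(n, pi).
    """
    if not 0 <= y <= n:
        return 0.0
    binom = comb(n, y) * pi**y * (1 - pi) ** (n - y)
    mass = (1 - p_zero - p_one) * binom
    if y == 0:
        mass += p_zero   # extra zeros
    if y == n:
        mass += p_one    # extra counts equal to the denominator
    return mass
```

Setting `p_zero = p_one = 0` recovers the ordinary binomial, which is the null model that score tests for extra zeros and extra denominators are measured against.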

A nonparametric functional model with functional responses has been proposed within the functional reproducing kernel Hilbert spaces (fRKHS) framework. Motivated by its superior performance and also its limitations, we propose a Gaussian process model whose posterior mode coincides with the fRKHS estimator. The Bayesian approach has several advantages compared to its predecessor. We also use the predictive process models adapted from the spatial statistics literature to overcome the computational limitations. Modifications of predictive process models are nevertheless critical in our context to obtain valid inferences. The numerical results presented demonstrate the effectiveness of the modifications.


Errors-in-variables (EIV) regression is often used to gauge the linear relationship between two variables that both suffer from measurement and other errors, as in the comparison of two measurement platforms (e.g., RNA sequencing vs. microarray). Scientists are often at a loss as to which EIV regression model to use, for there are infinitely many choices. We provide sound guidelines toward viable solutions to this dilemma by introducing two general nonparametric EIV regression frameworks: the compound regression and the constrained regression. It is shown that these approaches are equivalent to each other and to the general parametric structural modeling approach. The advantages of these methods lie in their intuitive geometric representations, their distribution-free nature, and their ability to offer candidate solutions with various optimal properties when the ratio of the error variances is unknown. Each includes the classic nonparametric regression methods of ordinary least squares, geometric mean regression (GMR), and orthogonal regression as special cases. Under these general frameworks, one can readily uncover some surprising optimal properties of the GMR, and truly comprehend the benefit of data normalization. Supplementary materials for this article are available online.
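The three classic special cases named in this abstract have closed-form slopes that are easy to compare side by side. The sketch below uses the standard textbook formulas and assumes the sample covariance of x and y is nonzero; the function name `eiv_slopes` is hypothetical, and the compound and constrained regression frameworks themselves are not implemented here.

```python
import numpy as np

def eiv_slopes(x, y):
    """Slopes of y on x under OLS, geometric mean, and orthogonal regression."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sxx = np.var(x, ddof=1)
    syy = np.var(y, ddof=1)
    sxy = np.cov(x, y, ddof=1)[0, 1]   # assumed nonzero
    ols = sxy / sxx                                   # errors in y only
    gmr = np.sign(sxy) * np.sqrt(syy / sxx)           # geometric mean regression
    orth = (syy - sxx + np.sqrt((syy - sxx) ** 2 + 4 * sxy**2)) / (2 * sxy)
    return {"ols": ols, "gmr": gmr, "orthogonal": orth}
```

On noiseless data the three slopes coincide; with noise in both variables they generally differ, which is the ambiguity the abstract refers to when the ratio of the error variances is unknown.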

