
1.

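胡亚南 田茂再《统计研究》2019,36(1):104-114

Zero-inflated count data violate the variance-mean relationship of the Poisson distribution. They can be explained by a mixture distribution in which a proportion of the observations follow a Poisson distribution and the remainder are identically zero (a degenerate distribution). Based on the adaptive elastic net, this paper studies joint modeling and variable selection for zero-inflated count data. For the zero-inflated Poisson distribution, a latent variable is introduced to construct the complete likelihood of the zero-inflated Poisson model, which consists of two terms: a zero-inflation part and a Poisson part. Because the covariates may be collinear and sparse, an adaptive elastic net penalty is added to the likelihood function to obtain the objective function. The EM algorithm is then used to obtain sparse estimators of the regression coefficients, and the Bayesian information criterion (BIC) is used to choose the optimal tuning parameters. The paper also gives theoretical proofs of the large-sample properties of the estimators and a simulation study, and finally applies the proposed method to a real problem.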
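The latent-variable EM idea can be illustrated on the simplest possible case. The sketch below is an assumption-laden simplification, not the paper's method: it has no covariates and no adaptive elastic net penalty, and the names `fit_zip_em`, `sample_poisson`, `pi`, and `lam` are hypothetical. It only shows how treating "is this zero structural?" as a latent indicator gives closed-form E- and M-steps.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def fit_zip_em(counts, n_iter=200):
    """EM for an intercept-only zero-inflated Poisson model (illustrative).

    pi  : probability of a structural (degenerate) zero
    lam : mean of the Poisson component
    """
    n = len(counts)
    total = sum(counts)
    n_zero = sum(1 for y in counts if y == 0)
    pi, lam = 0.5, max(total / n, 1e-8)  # crude starting values
    for _ in range(n_iter):
        # E-step: posterior probability that an observed zero is structural.
        # Positive counts can only come from the Poisson part (weight 0).
        w_zero = pi / (pi + (1.0 - pi) * math.exp(-lam))
        expected_structural = n_zero * w_zero
        # M-step: closed-form maximisers of the expected complete log-likelihood.
        pi = expected_structural / n
        lam = total / (n - expected_structural)
    return pi, lam


# Simulated data with hypothetical true values pi = 0.3 and lam = 2.0.
rng = random.Random(0)
true_pi, true_lam = 0.3, 2.0
data = [0 if rng.random() < true_pi else sample_poisson(true_lam, rng)
        for _ in range(5000)]
pi_hat, lam_hat = fit_zip_em(data)
print(f"pi_hat={pi_hat:.3f}, lam_hat={lam_hat:.3f}")
```

With covariates, the two M-step updates would become a weighted logistic regression and a weighted Poisson regression, and that is where a penalty such as the adaptive elastic net would enter.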

2.

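Application of an Improved Zero-Inflated Poisson Model to Modeling Zero Claims

郭念国《统计与信息论坛》2010,25(7):22-25

Zero inflation is a common phenomenon in non-life actuarial science and has been studied by many researchers in China and abroad. The most influential methods are the zero-inflated Poisson model and the Hurdle model, but both fall short in distinguishing between different kinds of zeros. In practice, policyholders with zero claims are not all homogeneous, and extracting the information contained in the zeros matters to insurers. Building on the ideas of the zero-inflated Poisson model and the Hurdle model, the paper therefore proposes a modified zero-inflated Poisson model and fits it to real non-life actuarial data. A comparison with the fit of the zero-inflated Poisson model shows that the modified zero-inflated model handles zeros in a way that better matches reality and better reflects the information they contain.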

3.

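Tilted Quantile (倾斜分位) Regression Models for Functional Variables and Their Applications

田茂再 梅波《统计研究》2019,36(8):114-128

Taking into account the structural features of functional data, this paper considers two classes of quantile regression models for functional variables (a functional response on scalar covariates, and a functional response on functional covariates) and constructs a new functional tilted quantile regression model based on the definition of the functional tilted quantile curve. For the second class of models, the paper expands the model coefficients in a spline basis and the functional covariates in a functional principal component basis, which yields the basic form of the tilted quantile regression model. Parameters are estimated by a component-wise gradient boosting algorithm that minimizes a weighted asymmetric loss function, which improves computational efficiency. The coefficient estimators of the tilted quantile regression model are proved to be asymptotically normal. Simulation and empirical results show that the tilted quantile regression model fits better than existing pointwise quantile regression models. Under the integrated mean squared prediction error criterion, the proposed model shows uniformly good predictive performance.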

4.

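Random-Effects Zero-Inflated Regression Models for Claim Counts

孟生旺 杨亮《统计研究》2015,32(11):97-103

Predicting claim frequency is an important part of non-life insurance ratemaking. The most commonly used claim frequency models are Poisson regression and negative binomial regression, together with their zero-inflated counterparts. However, when the observed claim counts are both zero-inflated and dependent within groups, none of these models fits real data well. The paper therefore builds random-effects zero-inflated regression models for loss counts under the Poisson, negative binomial, generalized Poisson, and type-P negative binomial distributions. To improve predictive performance, quadratic smoothing terms are introduced for continuous explanatory variables, and a regression relationship is established between the proportion of structural zeros and the explanatory variables. An empirical analysis of a set of real claim count data shows that the model significantly improves on the fit of existing models.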

5.

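Application of Zero-Inflated Models in Non-Life Insurance (total citations: 1; self-citations: 0; citations by others: 1)

徐昕 尹占华 郭念国《统计教育》2009,(4):31-33,42

The Poisson regression model is one of the most commonly used models in classification ratemaking, but when loss count data are zero-inflated, a zero-inflated model is usually adopted instead. This paper discusses the application of several zero-inflated models in non-life insurance and, by fitting a set of automobile insurance loss data, finds that zero-inflated models can effectively improve the fit to real loss data.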
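Why zero inflation undermines a plain Poisson fit can be made explicit in the simplest zero-inflated Poisson case. The symbols π (probability of a structural zero) and λ (Poisson mean) are introduced here for illustration and do not come from the abstract.

```latex
\begin{align*}
P(Y=0) &= \pi + (1-\pi)\,e^{-\lambda}, \\
P(Y=k) &= (1-\pi)\,\frac{\lambda^{k}e^{-\lambda}}{k!}, \qquad k = 1, 2, \dots \\
E(Y) &= (1-\pi)\,\lambda, \\
E(Y^{2}) &= (1-\pi)\,(\lambda + \lambda^{2}), \\
\operatorname{Var}(Y) &= E(Y^{2}) - E(Y)^{2} = (1-\pi)\,\lambda\,(1 + \pi\lambda) \;\ge\; E(Y).
\end{align*}
```

Equality holds only when π = 0, so any positive share of structural zeros makes the variance exceed the mean, which a Poisson model cannot reproduce.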

6.

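Model-Calibration Sampling Estimation with Mixed-Type Auxiliary Variables

毕画 伍业锋《统计研究》2017,(9):120-128

In superpopulation models, the auxiliary variables used to build the model are usually continuous, and models with mixed-type auxiliary variables have received little attention. To use the information in both continuous and discrete auxiliary variables related to the study variable, this paper proposes, within the model-calibration framework, a model-calibration estimator for mixed-type auxiliary variables based on nonparametric kernel regression. The estimator is proved to be asymptotically design-unbiased, design-consistent, and asymptotically normal, and its variance and a variance estimator are given. Numerical simulations show that the proposed estimator improves estimation whether the population regression function is linear or nonlinear. Validation on CLHLS data also shows that the estimator outperforms one that uses only continuous auxiliary variables.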

7.

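Application of the Hurdle Model to Classification Ratemaking in Non-Life Insurance

徐昕 郭念国《统计与决策》2012,(9):28-31

The Poisson regression model is a commonly used model for predicting claim counts. In practice, however, claim counts are often zero-inflated. Continuing to use the Poisson model then underestimates the standard errors of the parameters and overstates their significance, so that redundant explanatory variables are retained in the model and the resulting rates are inaccurate. The Hurdle model is a two-stage model that handles claim counts in two parts. Using this property to handle zero-inflated claim data in ratemaking can effectively improve the fit.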
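The two-part structure can be sketched with a Poisson hurdle distribution. This is a minimal illustration under assumptions not stated in the abstract: the positive part is taken to be a zero-truncated Poisson, there are no covariates, and `hurdle_poisson_pmf`, `p_zero`, and `lam` are hypothetical names and values.

```python
import math


def poisson_pmf(k, lam):
    """Ordinary Poisson probability P(Y = k)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def hurdle_poisson_pmf(k, p_zero, lam):
    """Two-part pmf: a binary stage decides zero vs. positive, then a
    zero-truncated Poisson gives the size of a positive count."""
    if k == 0:
        return p_zero
    truncated = poisson_pmf(k, lam) / (1.0 - math.exp(-lam))
    return (1.0 - p_zero) * truncated


p_zero, lam = 0.7, 1.5  # hypothetical values for illustration
probs = [hurdle_poisson_pmf(k, p_zero, lam) for k in range(60)]
total = sum(probs)
mean = sum(k * p for k, p in enumerate(probs))
print(f"total probability={total:.6f}, mean={mean:.4f}")
```

Unlike a zero-inflated Poisson model, every zero here comes from the first stage, so the zero probability and the distribution of positive counts can be fitted separately.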

8.

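An Improvement Based on the Partial Functional Linear Regression Model

程丽娟《统计与决策》2017,(11):70-72

Trading in financial markets is continuous and prices are updated at high frequency, so functional data are often encountered in the study of financial data. The article builds a partial functional linear regression model and analyzes the application of functional data to forecasting the Shanghai Composite Index. Following the principles of functional data analysis and its principal component analysis method, Matlab is used to forecast the index.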

9.

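A Semiparametric Estimation Method for the Type 3 Tobit Model

周先波 潘哲文《统计研究》2015,32(5):97-105

This paper presents a new semiparametric estimation method for the Type 3 Tobit model. Under an independence assumption, a relationship satisfied by the conditional survival functions of the observable censored dependent variables in the main equation and the selection equation is used to construct a one-step joint estimator of the parameters of the Type 3 Tobit model. Given a consistent estimator of the parameters in the selection equation, the method can also be used to construct a two-step estimator of the parameters of the main equation. The paper proves consistency and asymptotic normality of both the one-step joint estimator and the two-step estimator. Simulation experiments show that the proposed estimators perform well in finite samples and that the finite-sample performance of the one-step joint estimator is better than or close to that of the estimator of Chen (1997).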

10.

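Testing for Serial Correlation in Partial Functional Linear Varying-Coefficient Models

谭祥勇 et al.《统计研究》2021,38(2):135-145

The partial functional linear varying-coefficient model (PFLVCM) is a flexible and widely applicable model that has emerged in recent years. In practice, economic and financial data often exhibit serial correlation. Modeling such data directly without accounting for the correlation harms the accuracy and efficiency of the parameter estimates. This paper studies testing for serial correlation in the errors of the PFLVCM. Based on empirical likelihood, it extends methods for testing correlation in scalar time series to functional data, proposes an empirical log-likelihood ratio test statistic, and obtains the approximate distribution of the statistic under the null hypothesis. Monte Carlo simulations show that the statistic has good size and power in finite samples. Finally, the method is used to test whether US commercial electricity consumption data are serially correlated, demonstrating the validity and practicality of the statistic.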

11.

Semi Varying Coefficient Zero-Inflated Generalized Poisson Regression Model

Weihua Zhao Jicai Liu Yazhao Lv 《Communications in Statistics - Theory and Methods》2013,42(1):171-185

In this paper, the semi varying coefficient zero-inflated generalized Poisson model is discussed based on the penalized log-likelihood. All the coefficient functions are fitted by penalized splines (P-splines), and the expectation-maximization (EM) algorithm is used to derive the estimators. The estimation approach is rapid and computationally stable. Under some mild conditions, the consistency and asymptotic normality of the resulting estimators are given. A score test statistic for the dispersion parameter is discussed based on the P-spline estimation. Both simulated and real data examples are used to illustrate the proposed methods.

12.

Functional Form for the Zero-Inflated Generalized Poisson Regression Model

Hossein Zamani 《Communications in Statistics - Theory and Methods》2014,43(3):515-529

The generalized Poisson (GP) regression is an increasingly popular approach for modeling overdispersed as well as underdispersed count data. Several parameterizations have been proposed for the GP regression, and the two well-known models, the GP-1 and the GP-2, have been applied. The GP-P regression, which has been recently proposed, has the advantage of nesting the GP-1 and the GP-2 parametrically, besides allowing statistical tests of the GP-1 and the GP-2 against a more general alternative. In many cases, count data have more zero outcomes than are expected under the Poisson. This zero-inflation phenomenon is a specific cause of overdispersion, and the zero-inflated Poisson (ZIP) regression model has been proposed to handle it. However, if the data continue to suggest additional overdispersion, the zero-inflated negative binomial (ZINB-1 and ZINB-2) and the zero-inflated generalized Poisson (ZIGP-1 and ZIGP-2) regression models have been considered as alternatives. This article proposes a functional form of the ZIGP which mixes a distribution degenerate at zero with a GP-P distribution. The suggested model has the advantage of nesting the ZIP and the two well-known ZIGP (ZIGP-1 and ZIGP-2) regression models, besides allowing statistical tests of the ZIGP-1 and the ZIGP-2 against a more general alternative. The ZIP and the functional form of the ZIGP regression models are fitted, compared and tested on two sets of count data: the Malaysian insurance claim data and the German healthcare data.
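The mixing construction described above can be checked numerically. The sketch below assumes a GP-P parameterisation with mean μ and variance μ(1 + aμ^(P-1))², obtained by reparameterising Consul's generalized Poisson distribution; the abstract does not state the formula, so the pmf, the restriction a ≥ 0, and the names `gp_p_log_pmf`, `zigp_pmf`, `pi`, `mu`, `a`, and `p` are assumptions of this illustration.

```python
import math


def gp_p_log_pmf(y, mu, a, p):
    """Log-pmf of an assumed GP-P form with mean mu, dispersion a >= 0 and
    exponent p. p = 1 and p = 2 correspond to GP-1 and GP-2; a = 0 reduces
    to the ordinary Poisson."""
    c = a * mu ** (p - 1)
    phi = 1.0 + c
    return (math.log(mu) + (y - 1) * math.log(mu + c * y)
            - y * math.log(phi) - math.lgamma(y + 1) - (mu + c * y) / phi)


def zigp_pmf(y, pi, mu, a, p):
    """Mixture of a point mass at zero (weight pi) and the GP-P distribution."""
    base = math.exp(gp_p_log_pmf(y, mu, a, p))
    return pi + (1.0 - pi) * base if y == 0 else (1.0 - pi) * base


pi, mu, a, p = 0.25, 2.0, 0.1, 2.0  # hypothetical values for illustration
probs = [zigp_pmf(y, pi, mu, a, p) for y in range(400)]
total = sum(probs)
mean = sum(y * q for y, q in enumerate(probs))
print(f"total probability={total:.6f}, mean={mean:.4f}")
```

Setting `a = 0` recovers the ZIP model, which is the nesting property the abstract refers to; the sum is truncated at 400 terms because the tail is negligible for these values.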

13.

Estimation on semi-functional linear errors-in-variables models

Hanbing Zhu Huiying Li 《Communications in Statistics - Theory and Methods》2013,42(17):4380-4393

Semi-functional linear regression models are important in practice. In this paper, their estimation is discussed when function-valued and real-valued random variables are all measured with additive error. By means of functional principal component analysis and kernel smoothing techniques, the estimators of the slope function and the nonparametric component are obtained. To account for errors in variables, deconvolution is involved in the construction of a new class of kernel estimators. The convergence rates of the estimators of the unknown slope function and nonparametric component are established under a suitable norm and suitable conditions. Simulation studies are conducted to illustrate the finite sample performance of the method.

14.

Two-step estimation of functional linear models with applications to longitudinal data

J. Fan & J.-T. Zhang 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2000,62(2):303-322

Functional linear models are useful in longitudinal data analysis. They include many classical and recently proposed statistical models for longitudinal data and other functional data. Recently, smoothing spline and kernel methods have been proposed for estimating their coefficient functions nonparametrically, but these methods are either intensive in computation or inefficient in performance. To overcome these drawbacks, in this paper, a simple and powerful two-step alternative is proposed. In particular, the implementation of the proposed approach via local polynomial smoothing is discussed. Methods for estimating standard deviations of estimated coefficient functions are also proposed. Some asymptotic results for the local polynomial estimators are established. Two longitudinal data sets, one of which involves time-dependent covariates, are used to demonstrate the approach proposed. Simulation studies show that our two-step approach improves on the kernel method proposed by Hoover and co-workers in several aspects such as accuracy, computational time and visual appeal of the estimators.
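A two-step scheme of this kind can be sketched on simulated data. Everything below is an assumption for illustration: one scalar covariate, a common time grid, independent errors across time points, a hypothetical coefficient function sin(2πt), and a local-constant Gaussian kernel smoother standing in for the local polynomial smoothing the abstract mentions.

```python
import math
import random


def ols_slope(x, y):
    """Slope of a simple least-squares fit of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def kernel_smooth(t_grid, values, bandwidth):
    """Local-constant (Nadaraya-Watson) smoother with a Gaussian kernel."""
    smoothed = []
    for t0 in t_grid:
        w = [math.exp(-0.5 * ((t - t0) / bandwidth) ** 2) for t in t_grid]
        smoothed.append(sum(wi * v for wi, v in zip(w, values)) / sum(w))
    return smoothed


rng = random.Random(1)
n_subjects, n_times = 200, 50
t_grid = [j / (n_times - 1) for j in range(n_times)]
beta = [math.sin(2 * math.pi * t) for t in t_grid]  # hypothetical true beta(t)
x = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]

# Step 1: raw estimate of beta(t_j) from a separate regression at each time.
raw = []
for j in range(n_times):
    y_j = [1.0 + beta[j] * xi + rng.gauss(0.0, 0.5) for xi in x]
    raw.append(ols_slope(x, y_j))

# Step 2: smooth the raw estimates over time to get a coefficient function.
smooth = kernel_smooth(t_grid, raw, bandwidth=0.03)

mae_smooth = sum(abs(s - b) for s, b in zip(smooth, beta)) / n_times
print(f"mean absolute error of smoothed estimate: {mae_smooth:.3f}")
```

The first step needs only ordinary regressions, which is what keeps the computation light; the smoother in the second step can be swapped for a local linear or higher-order fit to reduce boundary bias.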

15.

A semi-parametric Cox’s regression model for zero-inflated left-censored time to event data

Roel Braekers Yves Grouwels 《Communications in Statistics - Theory and Methods》2013,42(7):1969-1988

In some clinical, environmental, or economic studies, researchers are interested in a semi-continuous outcome variable which takes the value zero with a discrete probability and has a continuous distribution for the non-zero values. Due to the measuring mechanism, it is not always possible to fully observe some outcomes, and only an upper bound is recorded. We call this left-censored data and observe only the maximum of the outcome and an independent censoring variable, together with an indicator. In this article, we introduce a mixture semi-parametric regression model. We consider a parametric model to investigate the influence of covariates on the discrete probability of the value zero. For the non-zero part of the outcome, a semi-parametric Cox’s regression model is used to study the conditional hazard function. The different parameters in this mixture model are estimated using a likelihood method, in which the infinite-dimensional baseline hazard function is estimated by a step function. We show the identifiability and the consistency of the estimators for the different parameters in the model. We study the finite sample behaviour of the estimators through a simulation study and illustrate this model on a practical data example.

16.

Small Area Estimation for Zero-Inflated Data

Hukum Chandra U. C. Sud 《Communications in Statistics - Simulation and Computation》2013,42(5):632-643

The commonly used method of small area estimation (SAE) under a linear mixed model may not be efficient if the data contain a substantially larger proportion of zeros than would be expected under standard model assumptions (hereafter zero-inflated data). The authors discuss SAE for zero-inflated data under a two-part random effects model that accounts for excess zeros in the data. Empirical results show that the proposed method for SAE works well and produces an efficient set of small area estimates. An application to real survey data from the National Sample Survey Office of India demonstrates the satisfactory performance of the method. The authors describe a parametric bootstrap method to estimate the mean squared error (MSE) of the proposed small area estimator. The bootstrap estimates of the MSE are compared to the true MSE in a simulation study.

17.

A Comparative Study of Observation- and Parameter-driven Zero-inflated Poisson Models for Longitudinal Count Data

M. Tariqul Hasan Gary Sneddon 《Communications in Statistics - Simulation and Computation》2016,45(10):3643-3659

Longitudinal count data with excessive zeros frequently occur in social, biological, medical, and health research. To model such data, zero-inflated Poisson (ZIP) models are commonly used, after separating zero and positive responses. As longitudinal count responses are likely to be serially correlated, such separation may destroy the underlying serial correlation structure. To overcome this problem, observation- and parameter-driven modelling approaches have recently been proposed. In the observation-driven model, the response at a specific time point is modelled through the responses at previous time points after incorporating serial correlation. One limitation of the observation-driven model is that it fails to accommodate any over-dispersion, which frequently occurs in count responses. This limitation is overcome in a parameter-driven model, where the serial correlation is captured through a latent process using random effects. We compare the results obtained by the two models. A quasi-likelihood approach has been developed to estimate the model parameters. The methodology is illustrated with an analysis of two real-life datasets. To examine model performance, the models are also compared through a simulation study.

18.

Quantile regression in functional linear semiparametric model

Tang Qingguo Linglong Kong 《Statistics》2017,51(6):1342-1358

This paper proposes nonparametric estimation methods for functional linear semiparametric quantile regression, where the conditional quantile of the scalar responses is modelled by both scalar and functional covariates and an additional unknown nonparametric function term. The slope function is estimated using the functional principal component basis and the nonparametric function is approximated by a piecewise polynomial function. The asymptotic distribution of the estimators of the slope parameters is derived, and the global convergence rate of the quantile estimator of the unknown slope function is established under a suitable norm. The asymptotic distribution of the estimator of the unknown nonparametric function is also established. Simulation studies are conducted to investigate the finite-sample performance of the proposed estimators. The proposed methodology is demonstrated by analysing real data from the ADHD-200 sample.
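Quantile regression estimators of this type minimise the check (pinball) loss. The sketch below shows only the unconditional, constant-only case, not the functional semiparametric model of the abstract; the sample values and the names `check_loss` and `empirical_risk` are hypothetical.

```python
def check_loss(u, tau):
    """Quantile (pinball) loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def empirical_risk(q, sample, tau):
    """Average check loss of the constant predictor q on the sample."""
    return sum(check_loss(y - q, tau) for y in sample) / len(sample)


sample = [3.0, 1.0, 4.0, 1.5, 9.0, 2.6, 5.0, 3.5, 8.0, 7.0]
tau = 0.25
# The risk is piecewise linear in q, so a minimiser is attained at a sample
# point and a search over the observed values is enough for this illustration.
best_q = min(sample, key=lambda q: empirical_risk(q, sample, tau))
print(best_q)
```

Here the minimiser is the third smallest observation, a sample 0.25-quantile. Replacing the constant `q` by a function of scalar and functional covariates gives the conditional version that the abstract studies.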