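Adaptive Lasso Quantile Regression for Panel Data   Total citations: 2, self-citations: 4, citations by others: 2
How to select important explanatory variables automatically while estimating parameters has long been a central question in quantile regression models for panel data. By constructing a Bayesian hierarchical quantile regression model with multiple random effects, and assuming that the prior for the fixed-effect coefficients follows a new conditional Laplace distribution, the paper derives a Gibbs sampling algorithm for estimating the model parameters. Because the coefficients of explanatory variables of differing importance should be shrunk to differing degrees, the constructed prior is adaptive and can accurately select the important explanatory variables in the model automatically, and the slice Gibbs sampling algorithm designed for it solves the posterior-mean estimation of each model parameter quickly and effectively. Simulation results show that the new method outperforms the methods commonly used in the existing literature in both estimation accuracy and variable-selection accuracy. Modeling panel data on several macroeconomic indicators for the regions of China demonstrates the new method's ability to estimate parameters and select variables.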

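A Financial Early-Warning Model Based on SCAD-Penalized Logistic Regression   Total citations: 2, self-citations: 1, citations by others: 2
As a supervised learning algorithm, logistic regression (LR) is widely used to model financial distress, but it is prone to overfitting. This paper therefore proposes a financial early-warning model based on logistic regression with the smoothly clipped absolute deviation (SCAD) penalty. The model addresses overfitting, performs variable selection and coefficient estimation simultaneously, and improves interpretability. An empirical study uses financial data from A-share manufacturing companies listed on the Shanghai and Shenzhen stock exchanges and compares the early-warning performance against ordinary L1-regularized and L2-regularized logistic regression. The results show that the SCAD-penalized logistic regression model classifies well and has strong economic interpretability.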
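
As a rough illustration of the penalty named in this abstract, the following sketch evaluates the standard SCAD penalty for a single coefficient. The function name `scad_penalty` and the default `a = 3.7` (the value commonly suggested in the SCAD literature) are assumptions for illustration; the abstract does not state the paper's tuning choices.

```python
def scad_penalty(beta, lam, a=3.7):
    """Standard SCAD penalty for one coefficient (illustrative sketch).

    lam is the regularization level; a > 2 controls where the penalty flattens.
    Both defaults are assumptions, not values taken from the paper.
    """
    b = abs(beta)
    if b <= lam:
        # Small coefficients are penalized exactly like the L1 (lasso) penalty.
        return lam * b
    if b <= a * lam:
        # Quadratic transition region: the penalty grows more and more slowly.
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    # Large coefficients receive a constant penalty, so they are not shrunk further.
    return lam * lam * (a + 1) / 2
```

Unlike the L1 penalty, which keeps growing with the coefficient, this penalty levels off for large coefficients, which is why SCAD is usually described as shrinking small effects to zero while leaving large effects nearly unbiased.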

Identifying the genes and clinical covariates that influence patient survival is crucial because it can lead to a better understanding of the underlying mechanisms of disease and to better prediction models. Most variable selection methods for penalized Cox models cannot deal properly with categorical variables such as gender and family history. The group lasso penalty can combine clinical and genomic covariates effectively. In this article, we introduce an optimization algorithm for Cox regression with the group lasso penalty. We compare our method with other methods on simulated and real microarray data sets.
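
To make the group lasso idea concrete, here is a minimal sketch of the block soft-thresholding step that many group lasso solvers apply to one group of coefficients (for example, all dummy variables of one categorical covariate). It is not the optimization algorithm of the article, which the abstract does not describe; the name `group_soft_threshold` and the threshold `t` are illustrative assumptions.

```python
import math


def group_soft_threshold(beta_g, t):
    """Shrink a whole coefficient group toward zero (illustrative sketch).

    beta_g: list of coefficients that belong to the same group.
    t: non-negative threshold (hypothetical tuning value).
    """
    norm = math.sqrt(sum(b * b for b in beta_g))
    if norm <= t:
        # The entire group is removed from the model at once.
        return [0.0] * len(beta_g)
    # Otherwise every coefficient in the group is scaled by the same factor.
    scale = 1.0 - t / norm
    return [scale * b for b in beta_g]


print(group_soft_threshold([3.0, 4.0], 2.5))
print(group_soft_threshold([0.3, 0.4], 1.0))
```

Because the group is kept or dropped as a unit, a categorical variable is never left half-selected, which is the difficulty the abstract attributes to ordinary penalized Cox models.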

Orthogonal regression is an appropriate tool for analyzing relations between two variables when three-part compositional data, i.e., three-part observations carrying relative information (such as proportions or percentages), are under examination. When linear statistical models with type-II constraints (constraints involving other parameters besides those of the unknown model) are employed to estimate the parameters of the regression line, approximate variances and covariances of the estimated line coefficients can be determined. Moreover, the additional assumption of normality makes it possible to construct confidence domains and perform hypothesis tests. The theoretical results are applied to a real-world example.

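Statistical Inference for Linear Regression Models with Missing Skewed Data   Total citations: 1, self-citations: 2, citations by others: 1
This paper studies parameter estimation for linear regression models when skewed data contain missing values. To counter the distortion of the sample distribution caused by missing skewed data and to improve the estimates of the regression coefficients, the scale parameter, and the skewness parameter, it proposes a modified regression imputation method suited to missing data in linear regression models for skewed data. Simulation and a real-data example, with comparisons against mean imputation, regression imputation, and stochastic regression imputation, show that the proposed modified regression imputation method is effective and feasible.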
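
For orientation, the sketch below shows two of the baseline methods this abstract compares against: mean imputation and plain regression imputation with one predictor. It does not implement the paper's modified method, which the abstract does not specify. The helper names and the use of `None` for a missing response are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean_impute(ys):
    """Replace each missing response (None) with the mean of the observed ones."""
    observed = [y for y in ys if y is not None]
    mean_y = sum(observed) / len(observed)
    return [mean_y if y is None else y for y in ys]


def regression_impute(xs, ys):
    """Replace each missing response with its prediction from the complete cases."""
    complete = [(x, y) for x, y in zip(xs, ys) if y is not None]
    intercept, slope = fit_line([x for x, _ in complete], [y for _, y in complete])
    return [intercept + slope * x if y is None else y for x, y in zip(xs, ys)]


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, None, 8.0]
print(mean_impute(ys))
print(regression_impute(xs, ys))
```

Mean imputation ignores the covariate entirely, while regression imputation places the filled-in value on the fitted line; neither accounts for skewness in the errors, which is the gap the abstract says the modified method targets.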

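Using quantile regression, this paper discusses estimation and testing for nonparametric fixed-effects panel data models and obtains the asymptotic normality and convergence rate of the parameter estimators. It also constructs a rank score statistic to test the model's fixed effects and proves that this statistic is asymptotically standard normal.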
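
Quantile regression replaces the squared-error loss with the check (pinball) loss. The sketch below, with illustrative function names, evaluates that loss and shows that the constant minimizing it over a sample is a sample quantile. It covers only this basic building block, not the nonparametric panel estimator or the rank score test of the paper.

```python
def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - 1{u < 0}) for a residual u."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def best_constant(ys, tau):
    """Sample point that minimizes the total check loss at quantile level tau.

    A minimizer always exists among the sample points, so searching over
    the observations themselves is enough for this illustration.
    """
    return min(ys, key=lambda q: sum(check_loss(y - q, tau) for y in ys))


sample = [1.0, 2.0, 3.0, 4.0, 100.0]
print(best_constant(sample, 0.5))
print(best_constant(sample, 0.9))
```

At level 0.5 the minimizer is the sample median, which is unaffected by the outlying value in this sample; raising the level moves the fit toward the upper tail.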

Recently, a least absolute deviations (LAD) estimator for median regression models with doubly censored data was proposed, and the asymptotic normality of the estimator was established. However, inference on the regression parameter vectors is impractical, because the asymptotic covariance matrices are difficult to estimate reliably: they involve the conditional densities of the error terms. In this article, three methods, based on the bootstrap, random weighting, and empirical likelihood, respectively, none of which requires density estimation, are proposed for making inference in doubly censored median regression models. Simulations are also carried out to assess the performance of the proposed methods.

Abstract.  In a case–cohort design, a random sample from the study cohort, referred to as a subcohort, and all the cases outside the subcohort are selected for collecting extra covariate data. The union of the selected subcohort and all cases is referred to as the case–cohort set. Such a design is generally employed when collecting information on an extra covariate for the whole study cohort is expensive. An advantage of the case–cohort design over the more traditional case–control and nested case–control designs is that it provides a set of controls that can be used for multiple end-points, in which case there is information on some covariates and event follow-up for the whole study cohort. Here, we propose a Bayesian approach to analyse such a case–cohort design as a cohort design with incomplete data on the extra covariate. We construct likelihood expressions for the case in which multiple end-points are of interest simultaneously and propose a Bayesian data augmentation method to estimate the model parameters. A simulation study is carried out to illustrate the method, and the results are compared with the complete cohort analysis.

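To explore Bayesian statistical inference for quantile regression with proportion data, this paper first gives a quantile regression modeling approach based on the Tobit model, then obtains a Bayesian hierarchical model by choosing suitable prior distributions, and from it derives the posterior distribution of each parameter for use in Gibbs sampling. Numerical simulations confirm that the proposed Bayesian inference method is effective for analyzing proportion data. Finally, the Bayesian method is applied to heroin-use data from California, USA, revealing the factors that influence the frequency of drug use at different quantile levels.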

Clustering due to unobserved heterogeneity may seriously affect inference from binary regression models. We examined the performance of the logistic and logistic-normal models for data with such clustering. For the logistic model, the total variance of the unobserved heterogeneity, rather than the level of clustering, determines the size of the bias of the maximum likelihood (ML) estimator. Incorrectly specifying the clustering as level 2 in the logistic-normal model gives biased estimates of the structural and random parameters, whereas specifying level 1 gives unbiased estimates of the former and adequate estimates of the latter. The proposed procedure is relevant to many research areas.

Abstract.  Multivariate failure time data arise when each study subject can potentially experience several types of failures or recurrences of a certain phenomenon, or when failure times are sampled in clusters. We formulate the marginal distributions of such multivariate data with semiparametric accelerated failure time models (i.e. linear regression models for log-transformed failure times with arbitrary error distributions) while leaving the dependence structures for related failure times completely unspecified. We develop rank-based monotone estimating functions for the regression parameters of these marginal models based on right-censored observations. The estimating equations can be easily solved via linear programming. The resultant estimators are consistent and asymptotically normal. The limiting covariance matrices can be readily estimated by a novel resampling approach, which does not involve non-parametric density estimation or evaluation of numerical derivatives. The proposed estimators represent consistent roots to the potentially non-monotone estimating equations based on weighted log-rank statistics. Simulation studies show that the new inference procedures perform well in small samples. Illustrations with real medical data are provided.

We are concerned with local weighted average estimation of the regression operator when the responses are real-valued random variables, the explanatory data are of functional fixed-design type, and the errors are independent and identically distributed random variables. In this article, our main contributions on local linear functional estimation concern, on the one hand, the situation in which the data are of functional fixed-design type and, on the other hand, the derivation of uniform asymptotic results on the behavior of this estimator with respect to the topological properties of the data space (normed or semi-metric).

We consider bivariate current status data with death, which often occur in animal tumorigenicity experiments. Instead of the exact tumor onset time being observed, only the presence of a tumor at the time of death or sacrifice is known. Such an incomplete data structure makes it difficult to investigate the effect of treatment on tumor onset times. Furthermore, when tumor onsets occur at two sites, the order of their onsets is unknown. A multistate model is applied to incorporate the sequential occurrence of events. For inference on the parameters, an EM algorithm is applied, and a real NTP (National Toxicology Program) dataset is analyzed as an illustrative example.

Recently, there has been great interest in the analysis of longitudinal data in which the observation process is related to the longitudinal process. In the literature, the observation process has commonly been regarded as a recurrent event process. Sometimes an observation has a duration, in which case the process is referred to as a recurrent episode process. The medical cost related to hospitalization is an example. We propose a conditional modeling approach that takes into account both the informative observation process and the observation duration. We conducted simulation studies to assess the performance of the method and applied it to a dataset of medical costs.

This article is concerned with statistical inference for the partial linear isotonic regression model with missing responses and measurement errors in covariates. We propose an empirical likelihood ratio test statistic and show that it has a limiting weighted chi-square distribution. An adjusted empirical likelihood ratio statistic, which is shown to have a limiting standard central chi-square distribution, is then proposed. A maximum empirical likelihood estimator is also developed. A simulation study is conducted to examine the finite-sample properties of the proposed procedure.

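Structural Analysis of Regression Models for Fuzzy Data   Total citations: 4, self-citations: 1, citations by others: 3
Li Zhuyu (李竹渝), Zhang Cheng (张成). Statistical Research (《统计研究》), 2008, 25(8): 74-78
Starting from samples of symmetric triangular fuzzy numbers, this paper proposes a general structure for regression models for fuzzy data. For the case in which the fuzzy regression coefficients are estimated by linear programming (LP), it gives a criterion, based on the nearness-selection principle of fuzzy sets, that evaluates model fit by the average closeness degree of the sample. A worked example compares the FLP and FLS methods for estimating the unknown parameters of the fuzzy-sample regression model.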

When data from several independent Markov chains are aggregated at each time point, least squares estimation of the transition probabilities faces the problem of multicollinearity. We propose an estimation procedure that applies ridge regression to the ordinary least squares estimators. The performance of this estimator is then compared with that of ordinary least squares.
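
The abstract does not give the exact form of its estimator, so the following is only a sketch of the generic ridge estimator, which solves (X'X + kI) b = X'y. In the aggregated setting one would plausibly regress each state's share at time t on all state shares at time t-1; that setup, the function names, and the ridge constant `k` are assumptions for illustration.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ridge_fit(X, y, k):
    """Generic ridge estimate: solve (X'X + k I) b = X'y; k = 0 gives OLS."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) + (k if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * y_i for row, y_i in zip(X, y)) for i in range(p)]
    return solve_linear(XtX, Xty)


X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [2.0, 3.0, 5.0]
print(ridge_fit(X, y, 0.0))  # ordinary least squares
print(ridge_fit(X, y, 1.0))  # ridge: coefficients shrunk toward zero
```

Adding `k` to the diagonal keeps the system well conditioned even when the columns of X are nearly collinear, at the cost of some bias in the coefficients. Ridge estimates are not automatically valid transition probabilities, so a real application would need an additional step to enforce the non-negativity and sum-to-one constraints.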

Abstract.  Functional magnetic resonance imaging (fMRI) is a technique for studying the active human brain. During an fMRI experiment, a sequence of MR images is obtained in which the brain is represented as a set of voxels. The data obtained are a realization of a complex spatio-temporal process with many sources of variation, both biological and technical. We present a spatio-temporal point process model approach for fMRI data in which the temporal and spatial activation are modelled simultaneously. This makes it possible to analyse characteristics of the data other than the locations of active brain regions, such as the interaction between the active regions. We discuss both classical statistical inference and Bayesian inference in the model. We analyse simulated data without repeated stimuli, both for the location of the activated regions and for interactions between the activated regions. An example of the analysis of fMRI data using this approach is presented.

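Missing data in the response variable arise widely in social and economic research. For regression models with missing responses, this paper proposes a single semiparametric estimator within the method-of-moments framework. The estimator retains the properties of both the parametric regression estimator and the nonparametric matching estimator, so that it fits well in the subsample where the response is observed while reducing the estimation error in the subsample where the response is not observed. The estimator is proved to be consistent and asymptotically normal.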

We propose a fully Bayesian model with a non-informative prior for analyzing misclassified binary data with a validation substudy. In addition, we derive a closed-form algorithm for drawing all parameters from the posterior distribution and making statistical inference on odds ratios. Our algorithm draws each parameter from a beta distribution, avoids the specification of initial values, and does not have convergence issues. We apply the algorithm to a data set and compare the results with those obtained by other methods. Finally, the performance of our algorithm is assessed using simulation studies.
