Similar literature
 20 similar documents found (search time: 187 ms)
1.
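This article studies estimation for the semiparametric logistic regression model with longitudinal data. It gives methods for estimating the unknown parameters and the unknown function in the model, examines variable selection for the parametric part, and compares different variable selection methods. The simulation results show that the proposed methods estimate well.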

2.
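A comparison of the similarities and differences between principal component analysis and factor analysis, with applications   Total citations: 6, self-citations: 0, citations by others: 6
In empirical research on economic problems, many influencing factors are usually considered for a given phenomenon, because researchers want to collect as much data about the object of analysis as possible in order to understand it fairly fully and completely. The description of a given object of analysis therefore involves many variables (or indicators). In most cases these economic variables are not independent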

3.
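In real economic life, correlations between economic variables are widespread. Input and output, income and expenditure, production and consumption, and investment and economic growth are all examples of interrelated economic variables. The analysis of such interdependence between economic variables belongs to correlation analysis. When carrying out a correlation analysis of economic variables, one should not only form a qualitative understanding based on economic theory and economic meaning but, more importantly, use mathematical statistics to measure the degree of correlation quantitatively.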
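As a hedged illustration of the quantitative measurement described above, the sketch below computes the Pearson correlation coefficient, one common measure of linear association. The abstract does not say which measure the article uses, and the function name `pearson_r` and the income and spending figures are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical income and expenditure figures, for illustration only.
income = [10.0, 12.0, 15.0, 19.0, 24.0]
spending = [8.0, 9.5, 11.0, 14.5, 18.0]
r = pearson_r(income, spending)
print(f"Pearson r = {r:.4f}")
```

A value of r close to 1 or -1 indicates a strong linear relationship, while a value near 0 indicates a weak one; it says nothing about causation, which is where the qualitative economic reasoning comes in.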

4.
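He Qiang, Dong Zhiyong 《统计研究》 (Statistical Research) 2020, 37(12): 91-104
Big data offers an important breakthrough for innovative research on forecasting quarterly GDP trends. Using internet big data from Baidu and other websites, and based on representative high-dimensional machine learning (and deep learning) models, this paper carries out an in-depth predictive analysis of China's quarterly GDP growth rate for 2011-2018. The study finds that making a distributional assumption about the random disturbances in the model helps reduce forecast error, and that letting the model learn and refine itself mechanically from large amounts of data does not always improve its predictive ability; that adding a penalty constraint on the set of explanatory variables effectively handles the thorny problem of the high dimensionality of internet big data; and that the optimal set of big-data explanatory variables for forecasting quarterly GDP growth is fairly stable.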

5.
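Wu Mengyun et al. 《统计研究》 (Statistical Research) 2021, 38(8): 132-145
Multi-class data analysis is of great importance in empirical research. However, because of high dimensionality, small samples, and low signal-to-noise ratios, existing multi-class methods still perform poorly owing to insufficient information. Researchers have therefore collected data from more information sources to characterize practical problems more fully. Unlike collecting samples from different sources on the same covariates, the currently popular multi-source data collect covariates from different sources on the same samples, and their independence and correlation pose new challenges for statistical modeling. This paper proposes a multi-class vertical integrative analysis method based on canonical variate regression, which uses penalization to perform variable selection, distinctively accounts for the association structure among data from different sources, and uses an efficient ADMM algorithm for model optimization. Numerical simulations show that the method is superior in both variable selection and classification prediction. Using multi-source stock data for China's SSE 50 (上证50), the method is applied to an empirical study of the factors influencing daily stock returns in 2019. The study shows that the proposed multi-class integrative analysis selects interpretable variables while achieving better predictive performance.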

6.
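I. Correspondence analysis. Correspondence analysis is a multivariate statistical technique for interdependent variables developed in recent years; it reveals the relationships between variables by analyzing cross-tabulations of qualitative data. When the relationships between variables are described with plots of a series of categories of the variables, the technique can reveal both the differences among the categories of one variable and the correspondence between the categories of different variables. It can analyze not only qualitative data but also nonlinear relationships. When the variables under analysis are qualitative and the relationships among them are nonlinear, correspondence analysis can be used to reveal the relationships between variables. The basic form of correspondence analysis is the analysis of a cross-tabulation of two qualitative variables. It converts the qualitative data into measurable scores, reducing…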

7.
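A multiple graphical model represents the dependence among the same set of random variables drawn from different classes: nodes represent random variables, edges represent direct links between variables, and each class's graphical model reflects both its own dependence structure and the information shared across classes. In the joint estimation method for multiple graphical models, data from different individuals are classified by their characteristics, the dependence structure among the variables within each class is assumed to follow a common Gaussian graphical model, and the group Lasso and graphical Lasso methods are applied to jointly estimate the graph structure of each class. Numerical simulations verify the effectiveness of the joint estimation method. Multiple graphical models and joint estimation are then used to analyze the dependence structure of 13 macroeconomic indicators for 15 Chinese provinces. The results show that the macroeconomic variables of provinces at different levels of economic development share common dependencies, reflecting the characteristics of China's economic development at the present stage, while the dependence structure of each class reflects the features unique to the economic development of the provinces in that class.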

8.
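A copula function contains information on both the marginal distributions of the variables and the dependence structure among them. Copulas make it possible to construct, very flexibly, joint distribution functions with different dependence structures and marginal distributions. Archimedean copulas are very useful in financial market analysis. Parameter estimation is an important step in modeling with copula theory. This article estimates the parameter of an Archimedean copula using a nonparametric kernel density method that makes no specific assumption about the marginal distributions, and demonstrates the effectiveness of the method with an empirical example.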
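For readers unfamiliar with the term, the standard definition of a bivariate Archimedean copula is sketched below. The Clayton family is shown only as an illustrative one-parameter example; the abstract does not state which Archimedean family the article fits, so that choice is an assumption, and the symbols (generator φ, parameter θ, Kendall's τ) are standard notation rather than the article's own.

```latex
% Bivariate Archimedean copula with generator \varphi
C(u, v) = \varphi^{-1}\bigl(\varphi(u) + \varphi(v)\bigr), \qquad u, v \in [0, 1]

% Illustrative one-parameter family (Clayton), \theta > 0
\varphi_\theta(t) = \frac{t^{-\theta} - 1}{\theta}, \qquad
C_\theta(u, v) = \bigl(u^{-\theta} + v^{-\theta} - 1\bigr)^{-1/\theta}

% Kendall's tau for the Clayton family
\tau = \frac{\theta}{\theta + 2}
```

The single parameter θ controls the strength of dependence, which is why the parameter estimation step carries so much weight in copula modeling.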

9.
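Building on a rough set description of multidimensional cross-classified data, this article proposes measuring the association among multidimensional qualitative variables with a matrix of association information coefficients. The study shows that the association information coefficient matrix uncovers the association structure among multidimensional variables more effectively.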

10.
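Time series analysis: historical review and future prospects   Total citations: 5, self-citations: 1, citations by others: 4
I. Historical review. In business and economics, time series analysis is commonly used to study the dynamic structure of a process, to analyze dynamic relationships between variables, to seasonally adjust economic data such as GDP and the unemployment rate, to improve regression analysis when the disturbance term is autocorrelated, and so on.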

11.
Longitudinal studies occur frequently in many different disciplines. To fully utilize the potential value of the information contained in longitudinal data, various multivariate linear models have been proposed. The methodology and analysis are somewhat unique in their own ways and their relationships are not well understood and presented. This article describes a general multivariate linear model for longitudinal data and attempts to provide a constructive formulation of the components in the mean response profile. The objective is to point out the extension and connections of some well-known models that have been obscured by different areas of application. More importantly, the model is expressed in a unified regression form from the subject matter considerations. Such an approach is simpler and more intuitive than other ways of modeling and parameter estimation. As a consequence, the analyses of the general class of models for longitudinal data can be easily implemented with standard software.

12.
The mixed random-effects model is commonly used in longitudinal data analysis within either a frequentist or a Bayesian framework. Here we consider a case in which we have prior knowledge on some of the parameters but no such information on the rest. We therefore use a hybrid approach on the random-effects model: the parameters with prior information are estimated via a Bayesian procedure, and the remaining parameters by frequentist maximum likelihood estimation (MLE), simultaneously on the same model. In practice, partial prior information is often available, for example on covariates such as age and gender. This information can be used to obtain accurate estimates in the mixed random-effects model. A series of simulation studies was performed to compare the results with the commonly used random-effects model with and without partial prior information. The results of hybrid estimation (HYB) and MLE were very close to each other. The θ values estimated with partial prior information (HYB) were closer to the true θ values and showed smaller variances than those estimated without partial prior information (MLE). Relative to the true θ values, the mean squared errors are much smaller for HYB than for MLE. This advantage of HYB is especially clear in longitudinal data with a small sample size. The HYB and MLE methods are applied to a real longitudinal data set for illustration.

13.
Yu Tingting, Wu Lang, Gilbert Peter 《Lifetime Data Analysis》 2019, 25(2): 229-258

In HIV vaccine studies, longitudinal immune response biomarker data are often left-censored due to lower limits of quantification of the employed immunological assays. The censoring information is important for predicting HIV infection, the failure event of interest. We propose two approaches to addressing left censoring in longitudinal data: one that makes no distributional assumptions for the censored data (treating left-censored values as a "point mass" subgroup) and another that makes a distributional assumption for a subset of the censored data but not for the remaining subset. We develop these two approaches to handling censoring for joint modelling of longitudinal and survival data via a Cox proportional hazards model fit by h-likelihood. We evaluate the new methods via simulation and analyze an HIV vaccine trial data set, finding that longitudinal characteristics of the immune response biomarkers are highly associated with the risk of HIV infection.


14.
We propose a flexible functional approach for modelling generalized longitudinal data and survival time using principal components. In the proposed model the longitudinal observations can be continuous or categorical data, such as Gaussian, binomial or Poisson outcomes. We generalize the traditional joint models that treat categorical data as continuous data by using some transformations, such as CD4 counts. The proposed model is data-adaptive, which does not require pre-specified functional forms for longitudinal trajectories and automatically detects characteristic patterns. The longitudinal trajectories observed with measurement error or random error are represented by flexible basis functions through a possibly nonlinear link function, combining dimension reduction techniques resulting from functional principal component (FPC) analysis. The relationship between the longitudinal process and event history is assessed using a Cox regression model. Although the proposed model inherits the flexibility of non-parametric methods, the estimation procedure based on the EM algorithm is still parametric in computation, and thus simple and easy to implement. The computation is simplified by dimension reduction for random coefficients or FPC scores. An iterative selection procedure based on the Akaike information criterion (AIC) is proposed to choose the tuning parameters, such as the knots of the spline basis and the number of FPCs, so that an appropriate degree of smoothness and fluctuation can be addressed. The effectiveness of the proposed approach is illustrated through a simulation study, followed by an application to longitudinal CD4 counts and survival data which were collected in a recent clinical trial to compare the efficiency and safety of two antiretroviral drugs.

15.
Abstract

Missing data arise frequently in clinical and epidemiological fields, in particular in longitudinal studies. This paper describes the core features of an R package wgeesel, which implements marginal model fitting (i.e., weighted generalized estimating equations, WGEE; doubly robust GEE) for longitudinal data with dropouts under the assumption of missing at random. More importantly, this package comprehensively provides existing information criteria for WGEE model selection on marginal mean or correlation structures. Also, it can serve as a valuable tool for simulating longitudinal data with missing outcomes. Lastly, a real data example and simulations are presented to illustrate and validate our package.

16.
Paired sequencing data are commonly collected in genomic studies to control biological variation. However, existing data processing strategies suffer at low coverage regions, which are unavoidable due to the limitation of current sequencing technology. Furthermore, information contained in the absolute values of the read counts is commonly ignored. We propose a read count ratio processing/modification method, to not only incorporate information contained in the absolute values of paired counts into one variable, but also mitigate the discrete artifact, especially when both counts are small. Simulation shows that the processed variable fits well with a Beta distribution, thus providing an easy tool for down-stream inference analysis.

17.
Concomitant medications are medications used by patients in a clinical trial, other than the investigational drug. These data are routinely collected in clinical trials. The data are usually collected in a longitudinal manner, for the duration of patients' participation in the trial. The routine summaries of these data are incidence‐type, describing whether or not a medication was ever administered during the study. The longitudinal aspect of the data is essentially ignored. The aim of this article is to suggest exploratory methods for graphically displaying the longitudinal features of the data using a well‐established estimator called the ‘mean cumulative function’. This estimator permits summary and a graphical display of the data, and preparation of some statistical tests to compare between groups. This estimator may also incorporate information on censoring of patient data. Copyright © 2009 John Wiley & Sons, Ltd.
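To make the 'mean cumulative function' concrete, here is a minimal sketch of the usual nonparametric estimator for recurrent events: at each distinct event time, the estimate increases by the number of events at that time divided by the number of patients still under observation. This is an illustration under assumptions, not the article's implementation; the function name, the data layout (a list of event times plus an end-of-follow-up time per patient), and the example values are all hypothetical.

```python
def mean_cumulative_function(patients):
    """Estimate the mean cumulative function (MCF) for recurrent events.

    patients: list of (event_times, end_of_follow_up) pairs, where event_times
    is a list of times at which a medication was started. A patient counts as
    "at risk" at time t if their follow-up ends at or after t, which is how
    censoring enters the estimate.
    Returns a list of (time, mcf) steps in increasing time order.
    """
    # Distinct event times that fall within the patient's own follow-up.
    times = sorted({t for events, end in patients for t in events if t <= end})
    mcf = 0.0
    curve = []
    for t in times:
        at_risk = sum(1 for _, end in patients if end >= t)
        n_events = sum(events.count(t) for events, end in patients if end >= t)
        mcf += n_events / at_risk
        curve.append((t, mcf))
    return curve


# Hypothetical data: (medication start days, last day of follow-up).
patients = [([2, 5], 10), ([3], 6), ([], 4), ([1, 7, 8], 9)]
curve = mean_cumulative_function(patients)
for day, value in curve:
    print(f"day {day}: MCF = {value:.3f}")
```

The resulting step function can be read as the average number of medications started per patient by each day, and plotting one curve per treatment group gives the kind of longitudinal display the abstract describes.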

18.
Regression models are often used to make predictions. All the information needed is contained in the predictive distribution. However, this cannot be evaluated explicitly for most generalized linear models. We construct two approximations to this distribution and demonstrate their use on two sets of survival data, corresponding to the outcome of patients admitted to intensive care units and the survival times of leukaemia patients.

19.
In many prospective clinical and biomedical studies, longitudinal biomarkers are repeatedly measured as health indicators to evaluate disease progression when patients are followed up over a period of time. Patient visiting times can be referred to as informative observation times if they are assumed to carry information in addition to that of the longitudinal biomarker measures alone. Irregular visiting times may reflect compliance with physician instruction, disease progression and symptom severity. When the follow-up time may be stopped by competing terminal events, it is possible that patient observation times may correlate with the competing terminal events themselves, thus making the observation times difficult to assess. To explicitly account for the impact of competing terminal events and dependent observation times on the longitudinal data analysis in the context of such complex data, we propose a joint model using latent random effects to describe the association among them. A likelihood-based approach is derived for statistical inference. Extensive simulation studies reveal that the proposed approach performs well for practical situations, and an analysis of patients with chronic kidney disease in a cohort study is presented to illustrate the proposed method.

20.
This article considers the estimation and testing of a within-group two-stage least squares (TSLS) estimator for instruments with varying degrees of weakness in a longitudinal (panel) data model. We show that adding the repeated cross-sectional information into a regression model can improve estimation under weak instruments. Moreover, the consistency and limiting distribution of the TSLS estimator are established when both N and T tend to infinity. Some asymptotically pivotal tests are extended to a longitudinal data model and their asymptotic properties are examined. A Monte Carlo experiment is conducted to evaluate the finite sample performance of the proposed estimators.
