Similar literature
 20 similar documents found (search time: 62 ms)
1.
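A research focus in data mining (machine learning) is building models for concept-drift data, and the core problem there is the drift-detection algorithm. This paper proposes a detection algorithm based on two windows. Its advantages are that a rigorous theoretical foundation is given for the algorithm, and that it improves mining efficiency and overcomes interference from virtual drift. Experiments on artificial and real data show that it also outperforms other algorithms.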

2.
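Data streams are continuous, real-time, ordered, and unbounded, so classifying them with traditional data-mining techniques faces serious challenges, and concept drift in the stream is hard to handle. Building on existing decision-tree classification algorithms, this paper proposes an adaptive ensemble classifier method and constructs an adaptive ensemble classification model for concept drift in data streams. By continually updating the weights and attribute classes of the training examples, the method separates training examples from the existing data set and identifies them as training examples for new class attributes, thereby detecting concept drift in the data stream effectively. Simulation results also demonstrate the adaptability and reliability of the method.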

3.
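Data stream mining is one of the new research directions in data mining. This paper introduces the characteristics of data streams and of data stream mining, summarizes and analyzes existing data stream mining algorithms, and outlines research directions and application prospects for the field.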

4.
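Economic data often exhibit spatial correlation; ignoring it causes endogeneity problems and makes the corresponding estimators biased and inconsistent. The spatial stochastic frontier model extends the stochastic frontier model by accounting for spatial correlation among production units, which benefits efficiency measurement. However, existing spatial stochastic frontier models use a single form of production function, so their applicability is poor and empirical analysis is limited. This paper introduces a smooth transition effect into the spatial stochastic frontier model and constructs a smooth transition spatial stochastic frontier model, which accounts for both spatial correlation and individual heterogeneity and is therefore more widely applicable. To enrich the estimation methods, the model is estimated by both maximum likelihood and Bayesian methods. The core of the maximum likelihood estimation lies in deriving the log-likelihood function, optimizing it, and estimating technical efficiency with the JLMS method; the core of the Bayesian estimation lies in deriving the posterior distributions of the unknown parameters and carrying out MCMC sampling. Numerical simulation results show that: (1) both maximum likelihood and Bayesian estimation are highly accurate, with Bayesian estimation slightly more accurate than maximum likelihood, and increasing the sample size improves the accuracy of both; (2) ignoring the spatial effect or the smooth transition effect lowers estimation accuracy.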

5.
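For time-series data streams in finance and other fields, this paper proposes a histogram construction method that can process high-frequency time-series data streams online, with accuracy close to that of optimal histogram construction methods.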

6.
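Based on theoretical research and practical experience abroad, continuous sampling estimation methods are expected to have very broad application prospects in China. Taking these methods as its subject, this paper gives a theoretical and systematic review of existing research results from abroad, compares and analyzes the various continuous sampling estimation methods, and summarizes open problems and future research trends. It provides a reference for subsequent research and lays a theoretical foundation for applying the method smoothly in actual survey work in China, thereby further promoting the reform and development of China's system of statistical survey methods.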

7.
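He Jianfeng (贺建风), 《统计研究》 (Statistical Research), 2012, 29(10): 105-112
Multiple sampling frames solve the problem that a single frame can hardly cover a mobile target population completely, while continuous sample surveys yield time-series observations of variables and allow the population to be tracked over time. This paper studies the two methods together and analyzes in depth continuous sampling estimation based on multiple frames. It first designs a table of population structure changes in the continuous-survey setting. Then, for rotating-sample surveys under the assumption of simple random sampling, it designs 14 parameter reduction methods for estimating and solving the constructed likelihood function and gives the iterative computation of the estimators. Finally, it summarizes the study and discusses future work.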

8.
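The Polya posterior is a noninformative Bayesian estimation method. In finite population sampling, it uses the observed sample to construct a series of simulated populations, on which statistical inference is then based. This paper studies some properties of Polya posterior estimation through statistical simulation and compares it with the bootstrap. The simulation results show that the Polya posterior estimates the population mean well, and that the gap between the estimate and the true value shrinks as the sample size grows. Confidence intervals constructed with the Polya posterior are relatively short and cover the true value well.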
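The construction of simulated populations can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the usual Polya urn scheme (draw a value from the urn, return it together with one copy, repeat until the urn holds the whole population), and the names `polya_population` and `polya_posterior_mean`, the 95% level, and the example data are all hypothetical.

```python
import random
from statistics import mean


def polya_population(sample, population_size, rng):
    """Complete an observed sample to one simulated population of given size."""
    urn = list(sample)
    while len(urn) < population_size:
        # Draw a value uniformly from the urn and add a copy of it,
        # so values drawn often become more likely to be drawn again.
        urn.append(rng.choice(urn))
    return urn


def polya_posterior_mean(sample, population_size, n_sim=1000, seed=0):
    """Point estimate and 95% interval for the population mean (assumed level)."""
    rng = random.Random(seed)
    sims = sorted(
        mean(polya_population(sample, population_size, rng)) for _ in range(n_sim)
    )
    point = mean(sims)
    lower = sims[int(0.025 * n_sim)]
    upper = sims[int(0.975 * n_sim) - 1]
    return point, (lower, upper)


if __name__ == "__main__":
    observed = [2.0, 4.0, 6.0, 8.0]  # hypothetical sample
    print(polya_posterior_mean(observed, population_size=40, n_sim=200, seed=1))
```

Each simulated population keeps the observed units and fills in the unobserved ones from the urn, so the spread of the simulated means reflects only the uncertainty about the unsampled part of the population.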

9.
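Since Moody's established the first credit rating agency in 1909, the number of rating agencies worldwide has grown to about 150; the best known include Moody's and Standard & Poor's. These agencies rate publicly traded bonds to assess the issuer's ability to meet its debt obligations on time. To track the performance and stability of their ratings, rating agencies all construct so-called transition matrices. The purpose of these matrices is to give risk managers or investors information on expected future changes in credit ratings. The elements of a transition matrix reflect the probability of moving from one rating grade to another. For investors in fixed-income bonds, institutions, credit risk managers, and regulators, credit rating transitions are very much worth…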
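As a rough illustration of how the elements of a transition matrix arise, the sketch below estimates them as observed transition frequencies between consecutive periods. The abstract does not say which estimator the agencies use; the frequency (cohort-style) estimator, the function name `estimate_transition_matrix`, and the rating histories are assumptions made for the example.

```python
from collections import Counter, defaultdict


def estimate_transition_matrix(histories, grades):
    """Estimate P[g][h] = share of one-period moves from grade g to grade h."""
    counts = defaultdict(Counter)
    for history in histories:
        # Pair each rating with the rating observed one period later.
        for current, following in zip(history, history[1:]):
            counts[current][following] += 1

    matrix = {}
    for g in grades:
        total = sum(counts[g].values())
        # Grades that were never left (for example default) get a zero row here.
        matrix[g] = {h: (counts[g][h] / total if total else 0.0) for h in grades}
    return matrix


if __name__ == "__main__":
    # Hypothetical rating histories of three issuers over three periods.
    histories = [["A", "A", "B"], ["A", "B", "B"], ["B", "B", "D"]]
    for grade, row in estimate_transition_matrix(histories, ["A", "B", "D"]).items():
        print(grade, row)
```

Every row that has at least one observed transition sums to one, which is what allows the entries to be read as probabilities.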

10.
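Considering the characteristics of project risk probability estimation, this paper uses set-valued statistics to treat the data as fuzzy intervals and builds a set-valued statistical model for project risk probability estimation. The model statistically processes the experts' probability estimates so that the resulting risk probability estimates are scientifically sound. Finally, the method is applied to risk probability estimation for a software project, which demonstrates its applicability.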

11.
Many directional data, such as wind directions, can be collected extremely easily, so experiments typically yield a huge number of sequentially collected data points. With such big data, traditional nonparametric techniques quickly become too time-consuming to compute and are therefore useless in practice when real-time or online forecasts are expected. In this paper, we propose a recursive kernel density estimator for directional data which (i) can be updated extremely easily when a new set of observations is available and (ii) asymptotically keeps the nice features of the traditional kernel density estimator. Our methodology is based on Robbins–Monro stochastic approximation ideas. We show that our estimator outperforms the traditional techniques in terms of computational time while being extremely competitive in terms of efficiency with respect to its competitors in the sequential context considered here. We obtain expressions for its asymptotic bias and variance together with an almost sure convergence rate and an asymptotic normality result. Our technique is illustrated on a wind dataset collected in Spain. A Monte‐Carlo study confirms the nice properties of our recursive estimator with respect to its non‐recursive counterpart.
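The recursive update idea can be sketched for circular data (angles in radians). This is not the estimator of the paper: it assumes a von Mises kernel with a fixed concentration `kappa` and the step size 1/n, whereas the paper works with Robbins–Monro step sizes and a bandwidth sequence that the abstract does not specify. The class name `RecursiveCircularKDE` and all parameters are hypothetical.

```python
import math


def bessel_i0(kappa, terms=40):
    """Modified Bessel function I0, computed from its power series."""
    return sum((kappa / 2.0) ** (2 * k) / math.factorial(k) ** 2 for k in range(terms))


def von_mises_kernel(theta, center, kappa):
    """Von Mises density on the circle, centred at `center`."""
    return math.exp(kappa * math.cos(theta - center)) / (2.0 * math.pi * bessel_i0(kappa))


class RecursiveCircularKDE:
    """Density estimate on a fixed grid, updated one observation at a time."""

    def __init__(self, grid, kappa=4.0):
        self.grid = list(grid)
        self.kappa = kappa
        self.n = 0
        self.density = [0.0] * len(self.grid)

    def update(self, angle):
        self.n += 1
        gamma = 1.0 / self.n  # assumed step size
        # f_n = (1 - gamma_n) * f_{n-1} + gamma_n * K(. , X_n): only the newest
        # observation is touched, earlier observations need not be stored.
        self.density = [
            (1.0 - gamma) * f + gamma * von_mises_kernel(t, angle, self.kappa)
            for f, t in zip(self.density, self.grid)
        ]


if __name__ == "__main__":
    grid = [2.0 * math.pi * i / 8 for i in range(8)]
    kde = RecursiveCircularKDE(grid)
    for wind_direction in [0.1, 0.3, 6.0, 3.1]:  # hypothetical observations
        kde.update(wind_direction)
    print(kde.density)
```

With a fixed `kappa` and step size 1/n the recursion reproduces the ordinary kernel average exactly; the computational gain is that each new observation costs one pass over the grid instead of a recomputation over all past data.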

12.
Most statistical and data-mining algorithms assume that data come from a stationary distribution. However, in many real-world classification tasks, data arrive over time and the target concept to be learned from the data stream may change accordingly. Many algorithms have been proposed for learning drifting concepts. To deal with the problem of learning when the distribution generating the data changes over time, dynamic weighted majority was proposed as an ensemble method for concept drift. Unfortunately, this technique considers neither the age of the classifiers in the ensemble nor their past correct classifications. In this paper, we propose a method that takes into account each expert's age as well as its contribution to the global algorithm's accuracy. We evaluate the effectiveness of our proposed method by using m classifiers and training on a collection of n-fold partitions of the data. Experimental results on a benchmark data set show that our method outperforms existing ones.

13.
In this paper, maximum likelihood estimators (MLE) for both step and linear drift changes in the regression parameters of multivariate linear profiles are developed. The performance of the proposed estimators is compared under linear drift changes in the regression parameters when a combined MEWMA and chi-square control chart method signals an out-of-control condition. The effect of the smoothing parameter of the MEWMA control chart, missing data, and multiple drift changes on the performance of both estimators is also evaluated. The application of the proposed estimators is also investigated through a numerical example drawn from a real case.

14.
The analysis of nonstationary data streams requires continuous adaptation of the model to the most recent relevant data. This requires that changes in the data stream be distinguished from noise. Many approaches are based on heuristic adaptation schemes. We analyze simple regression models to understand the joint effects of noise and concept drift and derive the optimal sliding window size for the regression models. Our theoretical analysis and simulations show that a near-optimal window size can be crucial. Our models can be used as benchmarks for other models to see how they cope with noise and drift.
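The trade-off between noise and drift can be made concrete with a toy model that is much simpler than the regression models of the paper and is assumed here only for illustration: observations y_t = mu_t + noise with noise standard deviation `sigma`, a mean that drifts linearly by `delta` per step, and the current mean estimated by the average of the last `w` observations. The estimator then has variance sigma^2 / w and bias -delta (w - 1) / 2, so its mean squared error is sigma^2 / w + delta^2 (w - 1)^2 / 4.

```python
def window_mse(w, sigma, delta):
    """MSE of a length-w moving average as an estimate of the current mean."""
    variance = sigma ** 2 / w                 # shrinks as the window grows
    bias = delta * (w - 1) / 2.0              # grows as older data are included
    return variance + bias ** 2


def best_window(sigma, delta, max_window=1000):
    """Window length in 1..max_window with the smallest MSE under the toy model."""
    return min(range(1, max_window + 1), key=lambda w: window_mse(w, sigma, delta))


if __name__ == "__main__":
    for delta in (0.0, 0.01, 0.1, 1.0):       # hypothetical drift rates
        print(delta, best_window(sigma=1.0, delta=delta, max_window=200))
```

Without drift the longest available window is best, with pure drift and no noise a single observation is best, and in between the optimum balances the two error terms.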

15.
Phillips and Sweeting [J. R. Statist. Soc. B 58 (1996) 775–783] considered estimation of the parameter of the exponential distribution with censored failure time data when there is incomplete knowledge of the censoring times. It was shown that, under particular models for the censoring mechanism and censoring errors, it will usually be safe to ignore such errors provided they are not expected to be too large. A flexible model is introduced which includes the extreme cases of no censoring errors and no information on the censoring values. The effect of alternative assumptions about knowledge of the censoring values on the estimation of failure rate is investigated.
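For orientation, the baseline case with no censoring errors has a closed-form maximum likelihood estimate: the number of observed failures divided by the total time on test. The sketch below covers only this standard case with fully known right-censoring times, not the flexible model of the paper; the function name `exponential_rate_mle` and the data are hypothetical.

```python
def exponential_rate_mle(times, observed):
    """MLE of the exponential failure rate under right censoring.

    times[i] is the failure time if observed[i] is True, otherwise the
    (assumed exactly known) censoring time of unit i.
    """
    if len(times) != len(observed):
        raise ValueError("times and observed must have the same length")
    exposure = sum(times)  # total time on test, failures and censored units alike
    if exposure <= 0:
        raise ValueError("total time on test must be positive")
    failures = sum(1 for flag in observed if flag)
    return failures / exposure


if __name__ == "__main__":
    times = [2.0, 3.0, 5.0, 10.0]          # hypothetical data
    observed = [True, True, False, True]   # third unit was censored at time 5
    print(exponential_rate_mle(times, observed))
    # Treating the censored unit as a failure changes the estimate:
    print(exponential_rate_mle(times, [True] * 4))
```

Comparing the two calls gives a feel for how sensitive the estimated failure rate is to what is assumed about the censoring information.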

16.
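Missing data in the response variable arise widely in socioeconomic research. For regression models with missing responses, this paper proposes a single semiparametric estimator within the method-of-moments framework. The estimator retains the properties of both the parametric regression estimator and the nonparametric matching estimator, so it fits well in the subsample where the response is observed while reducing the estimation error in the subsample where the response is unobserved. The estimator is proved to be consistent and asymptotically normal.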

17.
The commonly used method of small area estimation (SAE) under a linear mixed model may not be efficient if the data contain a substantially larger proportion of zeros than would be expected under standard model assumptions (hereafter zero-inflated data). The authors discuss SAE for zero-inflated data under a two-part random effects model that accounts for excess zeros in the data. Empirical results show that the proposed method for SAE works well and produces an efficient set of small area estimates. An application to real survey data from the National Sample Survey Office of India demonstrates the satisfactory performance of the method. The authors describe a parametric bootstrap method to estimate the mean squared error (MSE) of the proposed small area estimator. The bootstrap estimates of the MSE are compared to the true MSE in a simulation study.

18.
This article investigates nonparametric estimation of variance functions for functional data when the mean function is unknown. We obtain asymptotic results for the kernel estimator based on squared residuals. Similar to the finite-dimensional case, our asymptotic result shows that the smoothness of the unknown mean function has an effect on the rate of convergence. Our simulation studies demonstrate that the estimator based on residuals performs much better than that based on the conditional second moment of the responses.

19.
When data from several independent Markov chains are aggregated over each time point, least squares estimation of transition probabilities faces the problem of multicollinearity. We propose here an estimation procedure which involves the use of ridge regression for the ordinary least squares estimators. The performance of this estimator is then compared with that of ordinary least squares.

20.
We consider a repairable system with general repairs introduced by Last & Szekli (1998a). Apart from simple special cases, this model leads to a strong dependency among the observed failure times. Our aim is to estimate the underlying failure time distribution and its cumulative hazard given that the failure process has been observed up to the nth failure. We use non-parametric estimators of Kaplan–Meier and Nelson–Aalen type. We prove strong uniform consistency of the estimators as n tends to infinity. Further results on weak convergence are derived. Neither stationarity nor mixing conditions are required.
