1.

Bai Zhonglin, Sui Wenxia, Liu Chuanwen 《Statistics & Information Forum》2013,28(4):3-9

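To reveal more accurately the true data-generating process of financial asset returns, this paper proposes a stochastic volatility model based on a mixture of beta distributions, discusses Bayesian estimation of the model, and gives a Gibbs sampling algorithm. Taking simple returns on the Shanghai Stock Exchange A-share composite index as an example, stochastic volatility models based on the normal distribution and on the mixture of beta distributions are fitted. The results show that the mixture-beta model describes the true data-generating process of the sample more accurately, whereas the normal model attributes features such as a high peak and fat tails to volatility shocks, and therefore underestimates the average level of return volatility and overestimates both the persistence of volatility and the size of the volatility shocks.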
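For orientation, a common textbook form of the normal stochastic volatility model is sketched below. The notation ($y_t$, $h_t$, $\mu$, $\phi$, $\sigma_\eta$) is an assumption and is not taken from the paper. Here $\mu$ is the average level of log-volatility, $\phi$ its persistence and $\sigma_\eta$ the size of the volatility shock, which are the three quantities the abstract reports as mis-estimated under normality. In the normal model $\varepsilon_t \sim N(0,1)$; the proposed variant would presumably replace that return-shock distribution with the beta mixture.

```latex
\begin{aligned}
y_t &= \exp(h_t/2)\,\varepsilon_t, \\
h_t &= \mu + \phi\,(h_{t-1} - \mu) + \sigma_\eta\,\eta_t, \qquad \eta_t \sim N(0,1).
\end{aligned}
```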

2.

Concurrent generation of multivariate mixed data with variables of dissimilar types

《Journal of Statistical Computation and Simulation》2012,82(18):3595-3607

ABSTRACT

Data sets originating from a wide range of research studies are composed of multiple variables that are correlated and of dissimilar types, primarily of count, binary/ordinal and continuous attributes. The present paper builds on previous work on multivariate data generation and develops a framework for generating multivariate mixed data with a pre-specified correlation matrix. The generated data consist of components that are marginally count, binary, ordinal and continuous, where the count and continuous variables follow the generalized Poisson and normal distributions, respectively. The use of the generalized Poisson distribution provides a flexible mechanism that allows for the under- and over-dispersed count variables generally encountered in practice. A step-by-step algorithm is provided and its performance is evaluated using simulated and real-data scenarios.
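The dispersion behaviour mentioned in this abstract can be illustrated with a minimal sketch of Consul's generalized Poisson probability mass function. The function names and the parameter names `theta` and `lam` are assumptions, not the paper's notation. The sketch covers only `0 <= lam < 1` (equi- and over-dispersion); the under-dispersed case `lam < 0` needs a truncated support and is left out.

```python
import math


def generalized_poisson_pmf(x: int, theta: float, lam: float) -> float:
    """Generalized Poisson pmf, restricted here to theta > 0 and 0 <= lam < 1."""
    if theta <= 0 or not (0 <= lam < 1):
        raise ValueError("this sketch requires theta > 0 and 0 <= lam < 1")
    if x < 0:
        return 0.0
    # Work on the log scale to avoid overflow for large x.
    log_p = (math.log(theta) + (x - 1) * math.log(theta + x * lam)
             - theta - x * lam - math.lgamma(x + 1))
    return math.exp(log_p)


def numeric_mean_and_variance(theta: float, lam: float, terms: int = 400):
    """Mean and variance by direct summation over the first `terms` support points."""
    probs = [generalized_poisson_pmf(x, theta, lam) for x in range(terms)]
    mean = sum(x * p for x, p in enumerate(probs))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
    return mean, var


mean, var = numeric_mean_and_variance(theta=2.0, lam=0.3)
print(mean, var)  # variance exceeds the mean when lam > 0
```

The numerical moments can be compared with the closed forms theta / (1 - lam) and theta / (1 - lam)^3; setting `lam = 0` recovers the ordinary Poisson distribution.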

3.

A mixture latent variable model for modeling mixed data in heterogeneous populations and its applications

Leila Amiri Mojtaba Khazaei Mojtaba Ganjali 《AStA Advances in Statistical Analysis》2018,102(1):95-115

Latent variable models are widely used for joint modeling of mixed data including nominal, ordinal, count and continuous data. In this paper, we consider a latent variable model for jointly modeling relationships between mixed binary, count and continuous variables with some observed covariates. We assume that, given a latent variable, the mixed variables of interest are independent and that the count and continuous variables follow Poisson and normal distributions, respectively. As such data may be drawn from different subpopulations, unobserved heterogeneity has to be taken into account. A mixture distribution is considered for the latent variable, which accounts for the heterogeneity. The generalized EM algorithm, which uses the Newton–Raphson algorithm inside the EM algorithm, is used to compute the maximum likelihood estimates of the parameters. The standard errors of the maximum likelihood estimates are computed using the supplemented EM algorithm. An analysis of the primary biliary cirrhosis data is presented as an application of the proposed model.

4.

Analysis of a longitudinal ordinal response clinical trial using dynamic models

P. J. Lindsey J. Kaufmann 《Journal of the Royal Statistical Society. Series C, Applied statistics》2004,53(3):523-537

Summary. In many areas of pharmaceutical research, there has been increasing use of categorical data and more specifically ordinal responses. In many cases, complex models are required to account for different types of dependences among the responses. The clinical trial that is considered here involved patients who were required to remain in a particular state to enable the doctors to examine their heart. The aim of this trial was to study the relationship between the dose of the drug administered and the time that was spent by the patient in the state permitting examination. The patient's state was measured every second by a continuous Doppler signal which was categorized by the doctors into one of four ordered categories. Hence, the response consisted of repeated ordinal series. These series were of different lengths because the drug effect wore off faster (or slower) on certain patients depending on the drug dose administered and the infusion rate, and therefore the length of drug administration. A general method for generating new ordinal distributions is presented which is sufficiently flexible to handle unbalanced ordinal repeated measurements. It consists of obtaining a cumulative mixture distribution from a Laplace transform and introducing into it the integrated intensity of a binary logistic, continuation ratio or proportional odds model. Then, a multivariate distribution is constructed by a procedure that is similar to the updating process of the Kalman filter. Several types of history dependences are proposed.

5.

A generalized estimating equation method for fitting autocorrelated ordinal score data with an application in horticultural research

N. R. Parsons R. N. Edmondson S. G. Gilmour 《Journal of the Royal Statistical Society. Series C, Applied statistics》2006,55(4):507-524

Summary. Generalized estimating equations for correlated repeated ordinal score data are developed assuming a proportional odds model and a working correlation structure based on a first-order autoregressive process. Repeated ordinal scores on the same experimental units, not necessarily with equally spaced time intervals, are assumed and a new algorithm for the joint estimation of the model regression parameters and the correlation coefficient is developed. Approximate standard errors for the estimated correlation coefficient are developed and a simulation study is used to compare the new methodology with existing methodology. The work was part of a project on post-harvest quality of pot-plants and the generalized estimating equation model is used to analyse data on poinsettia and begonia pot-plant quality deterioration over time. The relationship between the key attributes of plant quality and the quality and longevity of ornamental pot-plants during shelf and after-sales life is explored.
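A first-order autoregressive working correlation for unequally spaced observation times is often written as rho raised to the absolute time gap. The sketch below builds that matrix; it assumes this standard form and is not taken from the paper, and the function name `ar1_working_correlation` is hypothetical.

```python
def ar1_working_correlation(times, rho):
    """AR(1)-type working correlation: corr(i, j) = rho ** |t_i - t_j|.

    Assumes 0 <= rho < 1 so that non-integer time gaps are well defined.
    """
    if not 0 <= rho < 1:
        raise ValueError("this sketch requires 0 <= rho < 1")
    return [[rho ** abs(ti - tj) for tj in times] for ti in times]


# Three scores taken at days 0, 1 and 3 (unequal spacing).
for row in ar1_working_correlation([0, 1, 3], rho=0.5):
    print(row)
```

With equally spaced times this reduces to the familiar banded pattern 1, rho, rho^2, and so on.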

6.

An Interest-rate Model Analysis Based on Data Augmentation Bayesian Forecasting

Eiji Minemura 《Journal of applied statistics》2006,33(10):1085-1104

In this paper, the author presents an efficient method of analyzing an interest-rate model using a new approach called 'data augmentation Bayesian forecasting.' First, a dynamic linear model estimation was constructed with a hierarchically-incorporated model. Next, an observational replication was generated based on the one-step forecast distribution derived from the model. Markov chain Monte Carlo sampling was then carried out, treating this replication as a new observation, and the unknown parameters were estimated. At that stage, the EM algorithm was applied to establish initial values of the unknown parameters, while the 'quasi Bayes factor' was used to assess parameter candidates. 'Data augmentation Bayesian forecasting' is a method of evaluating the transition and history of the 'future,' 'present' and 'past' of an arbitrary stochastic process, in which an appropriate evaluation is conducted based on a probability measure that has been sequentially modified with additional information. It would be possible to use future prediction results for modifying the model to grasp the present state or re-evaluate the past state. It would also be possible to raise the degree of precision in predicting the future through the modification of the present and the past. Thus, 'data augmentation Bayesian forecasting' is applicable not only in the field of financial data analysis but also in forecasting and controlling stochastic processes.

7.

Modelling Uncertainty and Overdispersion in Ordinal Data

Maria Iannario 《Communications in Statistics - Theory and Methods》2014,43(4):771-786

In this article we introduce a probability distribution generated by a mixture of discrete random variables to capture uncertainty, feeling, and overdispersion, possibly present in ordinal data surveys. The choice of the components of the new model is motivated by a study on the data generating process. Inferential issues concerning the maximum likelihood estimates and the validation steps are presented; then, some empirical analyses are given to support the usefulness of the approach. Discussion on further extensions of the model ends the article.

8.

Consumer finance risk control based on a sparse structured continuation ratio model

Zhang Jing et al. 《Statistical Research》2020,37(11):57-67

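In recent years consumer finance in China has developed rapidly, but it also faces increasingly complex fraud and credit risks. To better monitor the credit risk of borrowers in consumer finance, this paper proposes a risk-control method based on a sparse structured continuation ratio model. Compared with traditional binary classification models, this model can handle ordinal data in which borrowers are divided into three or more classes; while estimating the coefficients it automatically selects important variables from large and complex data, and the selection takes into account the structural features of the coefficients across the different sub-models. Monte Carlo simulation shows that the proposed model performs well in terms of both classification generalization error and variable selection. Finally, the model is applied to a real consumer-finance credit-risk analysis: for borrowers with insufficient traditional credit-bureau information, introducing high-frequency e-commerce consumption behaviour data and applying the proposed high-dimensional ordinal multi-class model effectively identifies borrowers' credit risk and can compensate for the shortcomings of traditional credit-reporting methods.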
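A continuation ratio model describes an ordinal outcome through a sequence of binary sub-models, one per category except the last. The sketch below turns the sub-model linear predictors into category probabilities. It assumes the common logistic form P(Y = j | Y >= j) with category-specific intercepts and coefficients, leaves out the sparse penalty entirely, and uses hypothetical names throughout.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def continuation_ratio_probs(x, intercepts, coefs):
    """Category probabilities for J ordered classes from J - 1 binary sub-models.

    Sub-model j gives h_j = P(Y = j | Y >= j); then
    P(Y = j) = h_j * prod_{k < j} (1 - h_k), and the last class takes the remainder.
    """
    probs = []
    still_at_risk = 1.0  # P(Y >= j) for the current j
    for a, b in zip(intercepts, coefs):
        h = sigmoid(a + sum(bi * xi for bi, xi in zip(b, x)))
        probs.append(still_at_risk * h)
        still_at_risk *= 1.0 - h
    probs.append(still_at_risk)
    return probs


# Hypothetical borrower with two features and three risk classes.
print(continuation_ratio_probs(x=[1.0, -0.5],
                               intercepts=[0.2, -0.3],
                               coefs=[[0.8, 0.0], [0.0, 1.1]]))
```

Zeros in a coefficient vector mimic what variable selection would produce: the corresponding feature plays no role in that sub-model.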

9.

A Framework for Time Varying Parameter Regression Modeling

Joseph A. Machak W. Allen Spivey William J. Wrobleski 《Journal of Business & Economic Statistics》2013,31(2):104-111

A framework for time varying parameter regression models is developed and employed in modeling and forecasting price expectations, using the Livingston data. Alternative model formulations, which include various choices for both the stochastic processes generating the varying parameters and the sets of explanatory variables, are examined and compared by using this framework. These models, some of which have appeared elsewhere and some of which are new, are estimated and used to assess the expectations formation process.

10.

Efficient initial designs for binary response data

Juha Karvanen 《Statistical Methodology》2008,5(5):462-473

In this paper we introduce a binary search algorithm that efficiently finds initial maximum likelihood estimates for sequential experiments where a binary response is modeled by a continuous factor. The problem is motivated by switching measurements on superconducting Josephson junctions. In this quantum mechanical experiment, the current is the factor controlled by the experimenter and a binary response indicating the presence or the absence of a voltage response is measured. The prior knowledge on the model parameters is typically poor, which may cause the common approaches of initial estimation to fail. The binary search algorithm is designed to work reliably even when the prior information is very poor. The properties of the algorithm are studied in simulations and an advantage over the initial estimation with equally spaced factor levels is demonstrated. We also study the cost-efficiency of the binary search algorithm and find the approximately optimal number of measurements per stage when there is a cost related to the number of stages in the experiment.
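The general idea of locating the transition region of a binary response by bisection can be sketched as follows. This is only an illustration of staged bisection on the factor level, not the algorithm of the paper: it assumes the response probability increases with the factor, and the names `binary_search_design`, `measure`, `stages` and `n_per_stage` are hypothetical.

```python
from typing import Callable, List, Tuple


def binary_search_design(measure: Callable[[float], int],
                         low: float, high: float,
                         stages: int, n_per_stage: int = 1
                         ) -> Tuple[List[Tuple[float, int, int]], Tuple[float, float]]:
    """Bisect the factor range towards the level where the response switches 0 -> 1.

    Returns the measurement history (level, successes, trials) and the final bracket.
    """
    history = []
    for _ in range(stages):
        mid = (low + high) / 2.0
        successes = sum(measure(mid) for _ in range(n_per_stage))
        history.append((mid, successes, n_per_stage))
        if 2 * successes >= n_per_stage:
            high = mid  # response mostly present: the transition is at or below mid
        else:
            low = mid   # response mostly absent: the transition is above mid
    return history, (low, high)


# Deterministic toy response that switches on at level 0.3.
history, bracket = binary_search_design(lambda level: int(level >= 0.3),
                                        low=0.0, high=1.0, stages=20)
print(bracket)
```

The recorded history gives levels concentrated near the transition, which is where a logistic or probit fit needs data in order to produce finite initial estimates.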

11.

Bayesian variable selection with strong heredity constraints

Joungyoun Kim Johan Lim Yongdai Kim Woncheol Jang 《Journal of the Korean Statistical Society》2018,47(3):314-329

In this paper, we propose a Bayesian variable selection method for linear regression models with high-order interactions. Our method automatically enforces the heredity constraint, that is, a higher order interaction term can exist in the model only if both of its parent terms are in the model. Based on the stochastic search variable selection of George and McCulloch (1993), we propose a novel hierarchical prior that fully considers the heredity constraint and controls the degree of sparsity simultaneously. We develop a Markov chain Monte Carlo (MCMC) algorithm to explore the model space efficiently while accounting for the heredity constraint by modifying the shotgun stochastic search algorithm of Hans et al. (2007). The performance of the new model is demonstrated through comparisons with other methods. Numerical studies on both real data analysis and simulations show that our new method tends to find relevant variables more effectively when higher order interaction terms are considered.
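The strong heredity constraint described in this abstract can be made concrete for two-way interactions with a short check. This is a minimal sketch of the constraint itself, not of the prior or the search algorithm; the function and variable names are hypothetical.

```python
from itertools import combinations


def satisfies_strong_heredity(main_effects: set, interactions: set) -> bool:
    """True if every two-way interaction has both parent main effects in the model."""
    return all(a in main_effects and b in main_effects for a, b in interactions)


def admissible_interactions(main_effects: set) -> set:
    """All two-way interactions that strong heredity allows for the given main effects."""
    return set(combinations(sorted(main_effects), 2))


print(satisfies_strong_heredity({"x1", "x2"}, {("x1", "x2")}))  # both parents present
print(satisfies_strong_heredity({"x1"}, {("x1", "x2")}))        # parent x2 missing
print(admissible_interactions({"x1", "x2", "x3"}))
```

A search over models that respects the constraint only ever proposes interactions from `admissible_interactions`, which shrinks the model space considerably.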

12.

Stochastic EM algorithm of a finite mixture model from hurdle Poisson distribution with missing responses

Ying-zi Fu 《Communications in Statistics - Theory and Methods》2013,42(20):5918-5932

ABSTRACT

In this article, a finite mixture model of hurdle Poisson distribution with missing outcomes is proposed, and a stochastic EM algorithm is developed for obtaining the maximum likelihood estimates of model parameters and mixing proportions. Specifically, missing data are assumed to be missing not at random (MNAR)/non-ignorable missing (NINR) and the corresponding missingness mechanism is modeled through probit regression. To improve the efficiency of the algorithm, a stochastic step is incorporated into the E-step based on data augmentation, whereas the M-step is solved by the method of conditional maximization. A variation on the Bayesian information criterion (BIC) is also proposed to compare models with different numbers of components in the presence of missing values. The considered model is a general framework that captures important characteristics of count data such as zero inflation/deflation, heterogeneity and missingness, providing more insight into the features of the data and allowing dispersion to be investigated more fully and correctly. Since the stochastic step only involves simulating samples from some standard distributions, the computational burden is alleviated. Once missing responses and latent variables are imputed to replace the conditional expectation, our approach works as part of a multiple imputation procedure. A simulation study and a real example illustrate the usefulness and effectiveness of our methodology.

13.

Model selection for mixture‐based clustering for ordinal data

D. Fernández R. Arnold 《Australian & New Zealand Journal of Statistics》2016,58(4):437-472

One of the key questions in the use of mixture models concerns the choice of the number of components most suitable for a given data set. In this paper we investigate answers to this problem in the context of likelihood‐based clustering of the rows of a matrix of ordinal data modelled by the ordered stereotype model. Two methodologies for selecting the best model are demonstrated and compared. The first approach fits a separate model to the data for each possible number of clusters, and then uses an information criterion to select the best model. The second approach uses a Bayesian construction in which the parameters and the number of clusters are estimated simultaneously from their joint posterior distribution. Simulation studies are presented which include a variety of scenarios in order to test the reliability of both approaches. Finally, the results of the application of model selection to two real data sets are shown.
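The first approach, fitting one model per candidate number of clusters and comparing an information criterion, can be sketched as below. BIC is used as one possible criterion; the abstract does not say which criteria the paper compares. The log-likelihoods and parameter counts are made-up numbers for illustration only, and the function names are hypothetical.

```python
import math


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: smaller is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def select_by_bic(fits: dict, n_obs: int):
    """`fits` maps number of clusters -> (maximized log-likelihood, free parameters)."""
    scores = {g: bic(ll, k, n_obs) for g, (ll, k) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fitted values, invented purely to show the mechanics.
fits = {1: (-520.0, 5), 2: (-480.0, 9), 3: (-478.0, 13)}
best, scores = select_by_bic(fits, n_obs=200)
print(best, scores)
```

The penalty term grows with the number of free parameters, so a small gain in log-likelihood from an extra cluster is not enough to justify it.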

14.

Maximum likelihood estimation of polychoric correlations in r×s×t contingency tables

《Journal of Statistical Computation and Simulation》2012,82(1-2):53-67

This paper discusses the maximum likelihood estimation of the polychoric correlation coefficient based on observed frequencies of three polytomous ordinal variables. The underlying latent variables are assumed to have a standardized trivariate normal distribution. The thresholds and correlations are estimated simultaneously via the scoring algorithm. Some practical applications of the method are discussed. An example is reported to illustrate the theory and some technical details are presented in the Appendix.

15.

An extended Gompertz-Makeham distribution with application to lifetime data

Ahmed M. T. Abd El-Bar 《Communications in Statistics - Simulation and Computation》2018,47(8):2454-2475

In this paper, we propose an extension of the Gompertz-Makeham distribution. This distribution is called the transmuted Gompertz-Makeham (TGM). The new model can handle bathtub-shaped, increasing, increasing-constant and constant hazard rate functions. This property makes the TGM useful in survival analysis. Various statistical and reliability measures of the model are obtained, including the hazard rate function, moments, moment generating function (mgf), quantile function, random number generation, skewness, kurtosis, conditional moments, mean deviations, Bonferroni curve, Lorenz curve, Gini index, mean inactivity time, mean residual lifetime and stochastic ordering; we also obtain the density of the ith order statistic. The model parameters are estimated by the method of maximum likelihood. An application to real data demonstrates that the TGM distribution can provide a better fit than some other very well known distributions.
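Transmuted families are usually built with the quadratic rank transmutation map F(x) = (1 + t) G(x) - t G(x)^2 with |t| <= 1. The sketch below applies that map to a Gompertz-Makeham baseline with hazard a * exp(b * x) + c. It assumes the paper's TGM follows this standard construction; the parameter names `a`, `b`, `c`, `t` are hypothetical and need not match the paper's parameterization.

```python
import math


def gm_cdf(x: float, a: float, b: float, c: float) -> float:
    """Gompertz-Makeham cdf for hazard a*exp(b*x) + c, with a, b > 0 and c >= 0."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-c * x - (a / b) * math.expm1(b * x))


def tgm_cdf(x: float, a: float, b: float, c: float, t: float) -> float:
    """Quadratic rank transmutation of the baseline cdf, |t| <= 1."""
    g = gm_cdf(x, a, b, c)
    return (1.0 + t) * g - t * g * g


def tgm_quantile(p: float, a: float, b: float, c: float, t: float,
                 tol: float = 1e-12) -> float:
    """Quantile for 0 < p < 1: undo the transmutation, then invert G by bisection."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # Root of t*u^2 - (1+t)*u + p = 0 in [0, 1], in a form that is stable near t = 0.
    u = 2.0 * p / ((1.0 + t) + math.sqrt((1.0 + t) ** 2 - 4.0 * t * p))
    lo, hi = 0.0, 1.0
    while gm_cdf(hi, a, b, c) < u:
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if gm_cdf(mid, a, b, c) < u:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


p = tgm_cdf(1.7, a=0.1, b=0.5, c=0.2, t=0.4)
print(p, tgm_quantile(p, a=0.1, b=0.5, c=0.2, t=0.4))
```

Feeding uniform random numbers through `tgm_quantile` gives a simple inverse-transform generator, which is one way to realize the random number generation mentioned in the abstract.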

16.

Bayesian quantile regression for ordinal longitudinal data

Rahim Alhamzawi Haithem Taha Mohammad Ali 《Journal of applied statistics》2018,45(5):815-828

Since the pioneering work by Koenker and Bassett [27], quantile regression models and their applications have become increasingly popular and important for research in many areas. In this paper, a random effects ordinal quantile regression model is proposed for the analysis of longitudinal data with an ordinal outcome of interest. An efficient Gibbs sampling algorithm was derived for fitting the model to the data based on a location-scale mixture representation of the skewed double-exponential distribution. The proposed approach is illustrated using simulated data and a real data example. This is the first work to discuss quantile regression for the analysis of longitudinal data with an ordinal outcome.

17.

A Monte Carlo comparison of alternative methods of maximum likelihood ranking in racing sports

Aaron Anderson 《Journal of applied statistics》2015,42(8):1740-1756

Applications of maximum likelihood techniques to rank competitors in sports are commonly based on the assumption that each competitor's performance is a function of a deterministic component that represents inherent ability and a stochastic component that the competitor has limited control over. Perhaps based on an appeal to the central limit theorem, the stochastic component of performance has often been assumed to be a normal random variable. However, in the context of a racing sport, this assumption is problematic because the resulting model is the computationally difficult rank-ordered probit. Although a rank-ordered logit is a viable alternative, a Thurstonian paired-comparison model could also be applied. The purpose of this analysis was to compare the performance of the rank-ordered logit and Thurstonian paired-comparison models given the objective of ranking competitors based on ability. Monte Carlo simulations were used to generate race results based on a known ranking of competitors, assign rankings from the results of the two models, and judge performance based on Spearman's rank correlation coefficient. Results suggest that in many applications, a Thurstonian model can outperform a rank-ordered logit if each competitor's performance is normally distributed.
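Two building blocks of such a simulation study are easy to sketch: generating a race result as ability plus normal noise, and scoring an estimated ranking against the known one with Spearman's rank correlation. The sketch below covers only these two pieces and fits neither the rank-ordered logit nor the Thurstonian model; the function names are hypothetical, and the Spearman formula used assumes no ties.

```python
import random


def simulate_race(abilities, sigma, rng):
    """Finishing order (best first): performance = ability + N(0, sigma^2) noise."""
    performance = [a + rng.gauss(0.0, sigma) for a in abilities]
    return sorted(range(len(abilities)), key=lambda i: -performance[i])


def spearman_rho(ranks_a, ranks_b):
    """Spearman's rank correlation for two tie-free rankings of the same competitors."""
    n = len(ranks_a)
    d2 = sum((x - y) ** 2 for x, y in zip(ranks_a, ranks_b))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


rng = random.Random(1)
abilities = [3.0, 1.0, 2.0, 0.5]
order = simulate_race(abilities, sigma=1.0, rng=rng)
print(order)
print(spearman_rho([1, 2, 3, 4], [2, 1, 3, 4]))
```

In a full study the finishing orders from many simulated races would be passed to each model, and the resulting ability rankings compared with the true ranking via `spearman_rho`.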

18.

Modeling a mixture of ordinal and continuous repeated measures

《Journal of Statistical Computation and Simulation》2012,82(10):873-886

We study the correlation structure for a mixture of ordinal and continuous repeated measures using a Bayesian approach. We assume a multivariate probit model for the ordinal variables and a normal linear regression for the continuous variables, where latent normal variables underlying the ordinal data are correlated with continuous variables in the model. Due to the probit model assumption, we are required to sample a covariance matrix with some of the diagonal elements equal to one. The key computational idea is to use parameter-extended data augmentation, which involves applying the Metropolis-Hastings algorithm to get a sample from the posterior distribution of the covariance matrix incorporating the relevant restrictions. The methodology is illustrated through a simulated example and through an application to data from the UCLA Brain Injury Research Center.

19.

Modeling of semi-competing risks by means of first passage times of a stochastic process

Beate Sildnes Bo Henry Lindqvist 《Lifetime data analysis》2018,24(1):153-175

In semi-competing risks one considers a terminal event, such as death of a person, and a non-terminal event, such as disease recurrence. We present a model where the time to the terminal event is the first passage time to a fixed level c in a stochastic process, while the time to the non-terminal event is represented by the first passage time of the same process to a stochastic threshold S, assumed to be independent of the stochastic process. In order to be explicit, we let the stochastic process be a gamma process, but other processes with independent increments may alternatively be used. For semi-competing risks this appears to be a new modeling approach, being an alternative to traditional approaches based on illness-death models and copula models. In this paper we consider a fully parametric approach. The likelihood function is derived and statistical inference in the model is illustrated on both simulated and real data.

20.

Alignment and Sub-pixel Interpolation of Images using Fourier Methods

C. A. Glasbey G. W. A. M. Van Der Heijden 《Journal of applied statistics》2007,34(2):217-230

A method is proposed for both estimating and correcting a translational mis-alignment between digital images, taking account of aliasing of high-frequency information. A parametric model is proposed for the power- and cross-spectra of the multivariate stochastic process that is assumed to have generated a continuous-space version of the images. Parameters, including those that specify misalignment, are estimated by numerical maximum likelihood. The effectiveness of the interpolant is confirmed by simulation and illustrated using multi-band Landsat images.
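The reason Fourier methods suit translational misalignment is the shift theorem, stated below as background. The symbols $f$, $g$, $F$, $G$ and the shift vector $\mathbf{d}$ are notation chosen for this note, not the paper's. A translation leaves the power spectrum unchanged and appears only as a phase in the cross-spectrum that is linear in frequency, so the shift can be recovered from that phase wherever $F(\boldsymbol{\omega}) \neq 0$. The paper's parametric spectral model and its treatment of aliasing go beyond this identity.

```latex
g(\mathbf{x}) = f(\mathbf{x} - \mathbf{d})
\;\Longrightarrow\;
G(\boldsymbol{\omega}) = e^{-i\,\boldsymbol{\omega}^{\top}\mathbf{d}}\,F(\boldsymbol{\omega}),
\qquad
\arg\!\bigl(G(\boldsymbol{\omega})\,\overline{F(\boldsymbol{\omega})}\bigr)
  = -\,\boldsymbol{\omega}^{\top}\mathbf{d} \pmod{2\pi}.
```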