Similar literature
 Found 20 similar documents (search time: 265 ms)
1.
We study asymptotic properties of an extension of the Nelder-Wedderburn generalized linear models (GLMs) and apply the results to model choice in Mandel's analysis of variance (ANOVA) models. The study concerns the behaviour of the maximum likelihood estimators (MLEs) when the scale parameter of the GLM tends to zero. Finally, to make the results usable in practice, we specify this limit in three cases of the GLM.

2.
A pivotal characteristic of credit defaults that most credit scoring models ignore is the rarity of the event. The most widely used model for estimating the probability of default is logistic regression. Because the dependent variable represents a rare event, logistic regression has notable drawbacks, for example underestimation of the default probability, which could be very risky for banks. To overcome these drawbacks, we propose the generalized extreme value (GEV) regression model. In particular, in a generalized linear model (GLM) with a binary dependent variable, we suggest the quantile function of the GEV distribution as the link function, so that attention is focused on the tail of the response curve for values close to one. Estimation is by maximum likelihood. The model accommodates skewness and generalizes GLMs with the complementary log–log link function. We analyse its performance in simulation studies. Finally, we apply the proposed model to empirical data on Italian small and medium enterprises.
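
The abstract above does not state the link formula, so the following is only a minimal sketch of what using the GEV quantile function as a link could look like. It assumes the standard GEV parameterization with location 0 and scale 1; the names `gev_link`, `gev_inverse_link` and the shape parameter `tau` are illustrative and are not taken from the paper.

```python
import math


def gev_inverse_link(eta, tau):
    """Map a linear predictor eta to a probability with the standard GEV cdf.

    Assumption: location 0, scale 1, shape tau (hypothetical name).
    """
    if abs(tau) < 1e-12:
        # Limit tau -> 0: the Gumbel cdf.
        return math.exp(-math.exp(-eta))
    t = 1.0 + tau * eta
    if t <= 0.0:
        # Outside the GEV support the cdf is 0 (tau > 0) or 1 (tau < 0).
        return 0.0 if tau > 0 else 1.0
    return math.exp(-t ** (-1.0 / tau))


def gev_link(p, tau):
    """GEV quantile function, used as the link g(p) for 0 < p < 1."""
    if abs(tau) < 1e-12:
        return -math.log(-math.log(p))
    return ((-math.log(p)) ** (-tau) - 1.0) / tau


if __name__ == "__main__":
    # Unlike the logit, the response curve is asymmetric around eta = 0.
    for eta in (-1.0, 0.0, 1.0):
        print(eta, round(gev_inverse_link(eta, 0.25), 4))
```

Because the curve is not symmetric about probability 0.5, the shape parameter controls how quickly the fitted probability approaches one, which is the skewness the abstract refers to.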

3.
The Tweedie compound Poisson distribution is a subclass of the exponential dispersion family with a power variance function, in which the value of the power index lies in the interval (1,2). It is well known that the Tweedie compound Poisson density function is not analytically tractable, and numerical procedures that evaluate the density quickly and accurately did not appear until fairly recently. Unsurprisingly, little statistical literature has been devoted to full maximum likelihood inference for Tweedie compound Poisson mixed models. To date, the focus has been on estimation methods in the quasi-likelihood framework. Further, Tweedie compound Poisson mixed models involve an unknown variance function, which has a significant impact on hypothesis tests and predictive uncertainty measures. The estimation of the unknown variance function is thus of independent interest in many applications. However, quasi-likelihood-based methods are not well suited to this task. This paper presents several likelihood-based inferential methods for the Tweedie compound Poisson mixed model that enable estimation of the variance function from the data. These algorithms include the likelihood approximation method, in which both the integral over the random effects and the compound Poisson density function are evaluated numerically, and the latent variable approach, in which maximum likelihood estimation is carried out via the Monte Carlo EM algorithm, without the need to approximate the density function. In addition, we derive the corresponding Markov chain Monte Carlo algorithm for a Bayesian formulation of the mixed model. We demonstrate the use of the various methods through a numerical example, and conduct an array of simulation studies to evaluate the statistical properties of the proposed estimators.
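
As a rough illustration of the distribution discussed above (not of the paper's estimation methods), a Tweedie variable with power index in (1,2) can be simulated as a Poisson number of gamma-distributed summands. The sketch below uses the usual mapping from mean `mu`, dispersion `phi` and power index `p` to the Poisson rate and gamma parameters; the function names are illustrative.

```python
import math
import random


def compound_poisson_params(mu, phi, p):
    """Map Tweedie (mu, phi, p), 1 < p < 2, to (Poisson rate, gamma shape, gamma scale)."""
    if not 1.0 < p < 2.0:
        raise ValueError("power index p must lie in (1, 2)")
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))
    shape = (2.0 - p) / (p - 1.0)
    scale = phi * (p - 1.0) * mu ** (p - 1.0)
    return lam, shape, scale


def sample_tweedie(mu, phi, p, rng):
    """Draw one value: the sum of N gamma variables with N ~ Poisson(lam)."""
    lam, shape, scale = compound_poisson_params(mu, phi, p)
    # Poisson draw by multiplying uniforms (Knuth's method); adequate for
    # moderate lam, since exp(-lam) underflows for very large rates.
    threshold = math.exp(-lam)
    n = 0
    prod = rng.random()
    while prod > threshold:
        n += 1
        prod *= rng.random()
    # An empty sum gives exactly 0.0, the point mass at zero.
    return sum(rng.gammavariate(shape, scale) for _ in range(n))


if __name__ == "__main__":
    rng = random.Random(1)
    draws = [sample_tweedie(2.0, 1.5, 1.5, rng) for _ in range(10)]
    print(draws)
```

Under this mapping the mean is `lam * shape * scale = mu` and the variance is `phi * mu ** p`, which is the power variance function mentioned in the abstract.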

4.
A new numerical method for solving the downdating problem (and variants thereof), namely removing the effect of some observations from the generalized least squares (GLS) estimator of the general linear model (GLM) after it has been estimated, is extensively investigated. It is verified that the solution of the downdated least squares problem can be obtained from the estimation of an equivalent GLM, in which the original model is updated with the imaginary deleted observations. This updated GLM has a non-positive-definite dispersion matrix that comprises complex covariance values, and it is proved herein to yield the same normal equations as the downdated model. Additionally, the problem of deleting observations from the seemingly unrelated regressions model is addressed, demonstrating the direct applicability of this method to other multivariate linear models. The algorithms that implement the novel downdating method efficiently reuse the computations from the estimation of the original model. As a result, the computational cost is significantly reduced, which shows the potential of the downdating method in computationally intensive problems. The downdating algorithms have been applied to real and synthetic data to illustrate their efficiency.

5.
This paper is concerned with the selection of explanatory variables in generalized linear models (GLMs). The class of GLMs is quite large and contains, for example, ordinary linear regression, binary logistic regression, the probit model and Poisson regression with linear or log-linear parameter structure. We show that, through an approximation of the log likelihood and a certain data transformation, the variable selection problem in a GLM can be converted into variable selection in an ordinary (unweighted) linear regression model. As a consequence, no specific computer software for variable selection in GLMs is needed. Instead, any suitable variable selection program for linear regression can be used. We also present a simulation study which shows that the log likelihood approximation is very good in many practical situations. Finally, we briefly mention possible extensions to regression models outside the class of GLMs.

6.
Model-based clustering is a method that clusters data under the assumption of a statistical model structure. In this paper, we propose a novel model-based hierarchical clustering method for a finite statistical mixture model based on the Fisher distribution. The main aims of the proposed method are to: (a) provide an efficient way to estimate the parameters of a Fisher mixture model (FMM); (b) generate a hierarchy of FMMs; and (c) select the optimal model. To this end, we develop a Bregman soft clustering method for FMMs. Our model estimation strategy exploits Bregman divergence and hierarchical agglomerative clustering, whereas our model selection strategy comprises a parsimony-based approach and an evaluation graph-based approach. We empirically validate the proposed method by applying it to simulated data. Next, we apply the method to real data to perform depth image analysis. We demonstrate that the proposed clustering method can be used as a potential tool for unsupervised depth image analysis.

7.
Tweedie regression models (TRMs) provide a flexible family of distributions for non-negative right-skewed data and can handle continuous data with probability mass at zero. Estimation and inference for TRMs based on the maximum likelihood (ML) method are challenged by the presence of an infinite sum in the probability function and by non-trivial restrictions on the power parameter space. In this paper, we propose two approaches for fitting TRMs, namely quasi-likelihood (QML) and pseudo-likelihood (PML). We discuss their asymptotic properties and perform simulation studies to compare our methods with the ML method. We show that the QML method provides asymptotically efficient estimation of the regression parameters. Simulation studies show that the QML and PML approaches give estimates, standard errors and coverage rates similar to those of the ML method. Furthermore, the second-moment assumptions required by the QML and PML methods enable us to extend the TRMs to the class of quasi-TRMs in Wedderburn's style. This eliminates the non-trivial restriction on the power parameter space and thus provides a flexible regression model for continuous data. We provide an R implementation and illustrate the application of TRMs using three data sets.

8.
The generalized linear model (GLM) is a class of regression models in which the means of the response variables and the linear predictors are joined through a link function. The standard GLM assumes that the link function is fixed; a more flexible GLM can be formed either by estimating the link function within a parametric family of link functions or by estimating it nonparametrically. In this paper, we propose a new algorithm that uses P-splines to estimate the link function nonparametrically while guaranteeing that it is monotone. This is equivalent to fitting the generalized single index model with a monotonicity constraint. We also conduct extensive simulation studies to compare our nonparametric approach to estimating the link function with various parametric approaches, including the traditional logit, probit and robit link functions, and two recently developed link functions, the generalized extreme value link and the symmetric power logit link. The simulation study shows that the link function estimated nonparametrically by our proposed algorithm performs well under a wide range of true link functions and outperforms parametric approaches when they are misspecified. A real data example is used to illustrate the results.

9.
Modeling the relationship between multiple financial markets has received a great deal of attention in both the literature and real-life applications. One state-of-the-art technique models each individual financial market by a generalized autoregressive conditional heteroskedasticity (GARCH) process, while market dependence is modeled by a copula, e.g. the dynamic asymmetric copula-GARCH. As an extension, we propose a dynamic double asymmetric copula (DDAC)-GARCH model to allow for the joint asymmetry caused by negative shocks as well as by the copula model. Furthermore, our model adopts a more intuitive way of constructing the sample correlation matrix. The new model still satisfies the positive-definite condition found in the dynamic conditional correlation-GARCH and constant conditional correlation-GARCH models. A simulation study shows the performance of the maximum likelihood estimate for the DDAC-GARCH model. As a case study, we apply this model to examine the dependence between the Chinese and US stock markets since the 1990s. We conduct a series of likelihood ratio tests that demonstrate that our extension (dynamic double joint asymmetry) is adequate for dynamic dependence modeling. We also propose a simulation method involving the DDAC-GARCH model to estimate the value at risk (VaR) of a portfolio. Our study shows that the proposed method depicts VaR much better than the well-established variance–covariance method.

10.
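We propose a pricing model for bank deposit insurance under Knightian uncertainty, in which the deposit insurance premium rate is no longer a fixed value but an interval. Using this model, we compute the deposit insurance premium-rate intervals of 16 A-share listed banks in China and use numerical analysis to study the effect of the uncertainty parameter on these intervals. The results show that Knightian uncertainty has a significant effect on premium setting for Chinese banks: as the uncertainty parameter increases, the length of every bank's premium-rate interval tends to increase, but by a different amount for each bank. Premiums should therefore not be set uniformly but on a bank-by-bank basis.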

11.
Recursive partitioning algorithms separate a feature space into a set of disjoint rectangles. Then, usually, a constant is fitted in every partition. While this is a simple and intuitive approach, it may still lack interpretability as to how a specific relationship between dependent and independent variables may look. Alternatively, a certain model may be assumed or of interest, and there may be a number of candidate variables that non-linearly give rise to different model parameter values. We present an approach that combines generalized linear models (GLMs) with recursive partitioning; it offers enhanced interpretability compared with classical trees and provides an exploratory way to assess a candidate variable's influence on a parametric model. The method conducts recursive partitioning of a GLM by (1) fitting the model to the data set, (2) testing for parameter instability over a set of partitioning variables, and (3) splitting the data set with respect to the variable associated with the highest instability. The outcome is a tree in which each terminal node is associated with a GLM. We show the method's versatility and suitability for gaining additional insight into the relationship between dependent and independent variables through two examples, modelling voting behaviour and a failure model for debt amortization, and compare it with alternative approaches.

12.
We introduce a new family of integer-valued distributions by considering a tempered version of the Discrete Linnik law. The proposal generalizes the well-known Poisson–Tweedie law. The family is extremely flexible, since it contains a wide spectrum of distributions ranging from light-tailed laws (such as the Binomial) to heavy-tailed laws (such as the Discrete Linnik). The main theoretical features of the Tempered Discrete Linnik distribution are explored through a series of identities in law, which describe its genesis in terms of a mixture Poisson distribution and a compound Negative Binomial distribution, as well as in terms of a mixture Poisson–Tweedie distribution. Moreover, we give a manageable expression and a suitable recursive relationship for the corresponding probability function. Finally, we consider an application to scientometric data on the scientific output of researchers at the University of Siena.

13.
One approach to robustness in linear models is to assume that the data are a mixture of standard and outlier observations, with a different error variance for each class. For generalised linear models (GLMs) the mixture model approach is more difficult, because for many distributions the error variance has a fixed relationship to the mean. The model is extended to GLMs by redefining the classes: the standard class is a standard GLM, and the outlier class is an overdispersed GLM obtained by including a random effect term in the linear predictor. The advantages of this method are that it can be extended to any model with a linear predictor and that outlier observations can be easily identified. In simulations the model is compared with an M-estimator and found to have improved bias and coverage. The method is demonstrated on three examples.

14.
We develop an alternative information-theoretic method of inference for problems in which all of the observed information is in terms of intervals. We focus on the unconditional case, in which the observed information consists of the minimal and maximal values in each period. Given interval data, we infer the joint and marginal distributions of the interval variable and its range. Our inferential procedure is based on entropy maximization subject to multidimensional moment conditions and normalization, with the entropy defined over discretized intervals. The discretization is based on theory or on empirically observed quantities. The number of estimated parameters is independent of the discretization, so the level of discretization does not change the fundamental level of complexity of our model. As an example, we apply our method to study the weather patterns of Los Angeles and New York City over the last century.

15.
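Risk premium prediction is an important part of non-life insurance ratemaking. Traditional quantile-regression approaches to setting risk premiums usually assume that the quantile level is given in advance, which lacks objectivity. We therefore propose a new method for setting risk premiums with quantile regression. The total risk premium of the policy portfolio is determined from the ruin probability; a quantile regression model is built for the individual policies and equated to the total risk premium; the quantile level is then solved for numerically, which yields predictions of the risk premiums of the individual policies. An analysis of a real data set shows that the method predicts well.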

16.
Sequential analyses in clinical trials have ethical and economic advantages over fixed sample size methods. The sequential probability ratio test (SPRT) is a hypothesis testing procedure that evaluates data as they are collected. The original SPRT was developed by Wald for one-parameter families of distributions and later extended by Bartlett to handle nuisance parameters. However, Bartlett's SPRT requires independent and identically distributed observations. In this paper we show that Bartlett's SPRT can be applied in generalized linear model (GLM) contexts. We then propose an SPRT analysis methodology for a Poisson generalized linear mixed model (GLMM) that is suitable for our application: the design of a multicenter randomized clinical trial comparing two preventive treatments for surgical site infections. We validate the methodology with a simulation study that includes a comparison with Neyman–Pearson and Bayesian fixed sample size test designs and with the Wald SPRT.
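
For orientation, here is a minimal sketch of the original one-parameter Wald SPRT mentioned above, for independent Poisson counts with two simple hypotheses. It is not the Bartlett or GLMM procedure the paper proposes; the function name, the default error rates and the use of Wald's approximate boundaries are assumptions made for illustration.

```python
import math


def poisson_sprt(counts, rate0, rate1, alpha=0.05, beta=0.20):
    """Wald SPRT of H0: rate = rate0 against H1: rate = rate1 for Poisson counts.

    Returns (decision, number of observations used).
    """
    # Wald's approximate stopping boundaries on the log-likelihood-ratio scale.
    upper = math.log((1.0 - beta) / alpha)
    lower = math.log(beta / (1.0 - alpha))
    llr = 0.0
    for n, x in enumerate(counts, start=1):
        # Log-likelihood ratio contribution of one Poisson observation
        # (the x! terms cancel between the two hypotheses).
        llr += x * math.log(rate1 / rate0) - (rate1 - rate0)
        if llr >= upper:
            return "reject H0", n
        if llr <= lower:
            return "accept H0", n
    # Data exhausted before either boundary was crossed.
    return "continue", len(counts)


if __name__ == "__main__":
    print(poisson_sprt([5, 5, 5], rate0=1.0, rate1=2.0))
    print(poisson_sprt([0, 0, 0], rate0=1.0, rate1=2.0))
```

The test stops as soon as the cumulative log-likelihood ratio leaves the interval between the two boundaries, which is what allows a sequential design to end earlier than a fixed sample size design.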

17.
We consider a two-component mixture model in which one component distribution is known while the mixing proportion and the other component distribution are unknown. Models of this kind were first introduced in biology to study differences in expression between genes. The estimation methods proposed so far have all assumed that the unknown distribution belongs to a parametric family. In this paper, we show how this assumption can be relaxed. First, we note that the above model is generally not identifiable, but we show that under moment and symmetry conditions some 'almost everywhere' identifiability results can be obtained. Where such identifiability conditions are fulfilled, we propose an estimation method for the unknown parameters that is shown to be strongly consistent under mild conditions. We discuss applications of our method to microarray data analysis and to the training data problem. We compare our method with the parametric approach using simulated data and, finally, apply it to real data from microarray experiments.

18.
In this paper, we present a statistical inference procedure for the step-stress accelerated life testing (SSALT) model with a Weibull failure time distribution and interval censoring, via a generalized linear model (GLM) formulation. The likelihood function of an interval-censored SSALT is in general too complicated to yield analytical results. However, by transforming the failure time to an exponential distribution and using a binomial random variable for the failure counts in the inspection intervals, a GLM formulation with a complementary log-log link function can be constructed. Estimates of the regression coefficients for the Weibull scale parameter are obtained through the iterative weighted least squares (IWLS) method, and the shape parameter is updated by direct maximum likelihood (ML) estimation. Confidence intervals for these parameters are estimated by bootstrapping. The application of the proposed GLM approach is demonstrated with an industrial example.
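
The step from an exponential failure time to a complementary log-log link can be sketched as follows. This is a simplified derivation for a single inspection interval, assuming a constant rate within the interval; the symbols $a$, $b$, $\lambda$, $x$ and $\beta$ are introduced here for illustration and are not taken from the paper.

```latex
% Assumption: on the transformed scale the failure time T is exponential with
% rate \lambda = \exp(x^\top \beta) throughout the inspection interval (a, b].
\begin{align*}
p &= \Pr(a < T \le b \mid T > a) = 1 - \exp\{-\lambda\,(b - a)\}, \\
\log\{-\log(1 - p)\} &= \log(b - a) + x^\top \beta .
\end{align*}
```

Under this assumption, the number of failures among the $n$ units still at risk at time $a$ is Binomial$(n, p)$, and $\log(b - a)$ enters the linear predictor as a known offset.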

19.
In real-data analysis, deciding on the best subset of variables in a regression model is an important problem. Akaike's information criterion (AIC) is often used to select variables in many fields. When the sample size is not large, the AIC has a non-negligible bias that detrimentally affects variable selection. The present paper considers a bias correction of the AIC for selecting variables in the generalized linear model (GLM). By changing the distribution and the link function, the GLM can express a number of statistical models, such as the normal linear regression model, the logistic regression model and the probit model, which are commonly used in many applied fields. In the present study, we obtain a simple expression for a bias-corrected AIC (corrected AIC, or CAIC) in GLMs. Furthermore, we provide R code based on our formula. A numerical study reveals that the CAIC performs better than the AIC for variable selection.

20.
In this paper we first develop a Sarmanov–Lee bivariate family of distributions with beta and gamma marginal distributions. We obtain the linear correlation coefficient and show that, although the family does not allow strong correlation, the coefficient can be greater than in the Farlie–Gumbel–Morgenstern family. We also determine other measures for this family, the coefficient of median concordance and the relative entropy, which are analyzed by comparison with the case of independence. Second, we consider the problem of premium calculation in a Poisson–Lindley and exponential collective risk model, where the Sarmanov–Lee family is used as a structure function. We determine the collective and Bayes premiums and analyze their values under independence and dependence between the risk profiles, finding notable variations in premium values even at low levels of correlation.
