Similar Documents
20 similar documents found.
1.
Insurance and economic data are often positive, and this peculiarity must be taken into account when choosing a statistical model for their distribution. An example is the inverse Gaussian (IG), one of the best-known distributions with positive support. With the aim of increasing the use of the IG distribution on insurance and economic data, we propose a convenient mode-based parameterization yielding the reparametrized IG (rIG) distribution; it simplifies the use of the IG distribution in various branches of statistics, and we give some examples. In nonparametric statistics, we define a smoother based on rIG kernels. By construction, the estimator is well defined and does not allocate probability mass to unrealistic negative values. We adopt likelihood cross-validation to select the smoothing parameter. In robust statistics, we propose the contaminated IG distribution, a heavy-tailed generalization of the rIG distribution that accommodates mild outliers. Finally, for model-based clustering and semiparametric density estimation, we present finite mixtures of rIG distributions. We use the EM algorithm to obtain maximum likelihood estimates of the parameters of the mixture and contaminated models. Insurance data on bodily injury claims and economic data on the incomes of Italian households illustrate the models.
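
A minimal sketch of the likelihood cross-validation step for an IG-kernel smoother, assuming scipy's mean/shape parameterization of the inverse Gaussian (each kernel has mean at the data point and variance growing with the smoothing parameter h); the authors' mode-based rIG parameterization is specific to the paper and not reproduced here:

```python
import numpy as np
from scipy.stats import invgauss

def lcv_score(h, data):
    """Leave-one-out log-likelihood of an inverse Gaussian kernel smoother.

    invgauss(mu=x_i*h, scale=1/h) has mean x_i and variance x_i**3 * h,
    so larger h means more smoothing and no mass falls below zero.
    """
    n = len(data)
    dens = invgauss.pdf(data[:, None], data[None, :] * h, scale=1.0 / h)
    np.fill_diagonal(dens, 0.0)                       # delete-one estimates
    loo = dens.sum(axis=1) / (n - 1)
    return np.sum(np.log(np.clip(loo, 1e-300, None)))

rng = np.random.default_rng(0)
data = rng.gamma(2.0, 1.5, size=200)                  # positive toy data
grid = np.linspace(0.01, 1.0, 50)
h_star = grid[int(np.argmax([lcv_score(h, data) for h in grid]))]
print("LCV-selected smoothing parameter:", h_star)
```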

2.
Intensity functions, which describe the spatial distribution of the occurrences of a point process, are useful for risk assessment. This paper deals with the robust nonparametric estimation of the intensity function of space–time data from events such as earthquakes. The basic approach consists of smoothing the frequency histograms with the local polynomial regression (LPR) estimator. This method allows for automatic boundary corrections, and its jump-preserving ability can be improved through robustness. We derive a robust local smoother from the weighted-average approach to M-estimation, and we select its bandwidths with robust cross-validation (RCV). Further, we develop a robust recursive algorithm for sequential processing of the data binned in time. An extensive application to the Northern California earthquake catalog in the San Francisco, CA, area illustrates the method and demonstrates its validity.
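
A generic sketch of an M-type local linear smoother with Huber weights and an MAD residual scale; this captures the flavor of a robust, jump-tolerant local fit, though the paper's weighted-average M-estimator and robust cross-validation are not reproduced:

```python
import numpy as np

def huber_weights(r, c=1.345):
    """Huber psi(r)/r weights; residuals r are assumed pre-scaled."""
    a = np.abs(r)
    w = np.ones_like(a)
    w[a > c] = c / a[a > c]
    return w

def robust_local_linear(x0, x, y, h, n_iter=5):
    """M-type local linear fit at x0 with a Gaussian kernel of bandwidth h."""
    k = np.exp(-0.5 * ((x - x0) / h) ** 2)            # kernel weights
    X = np.column_stack([np.ones_like(x), x - x0])
    w = np.ones_like(y)                               # robustness weights
    for _ in range(n_iter):
        W = k * w
        beta, *_ = np.linalg.lstsq(X * np.sqrt(W)[:, None],
                                   y * np.sqrt(W), rcond=None)
        r = y - X @ beta
        scale = np.median(np.abs(r)) / 0.6745 + 1e-12  # robust MAD scale
        w = huber_weights(r / scale)
    return beta[0]                                    # fitted value at x0

# toy data with a jump and sprinkled outliers
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 1, 300))
y = np.where(x < 0.5, 1.0, 3.0) + 0.1 * rng.standard_normal(300)
y[::40] += 5.0
print([robust_local_linear(x0, x, y, h=0.05) for x0 in (0.25, 0.75)])
```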

3.
In a single index Poisson regression model with unknown link function, the index parameter can be root-n consistently estimated by the method of pseudo maximum likelihood. In this paper, we study, by simulation, the practical validity of the asymptotic behaviour of the pseudo maximum likelihood index estimator and of some associated cross-validation bandwidths. A robust practical rule for implementing the pseudo maximum likelihood estimation method is suggested, which uses the bootstrap for estimating the variance of the index estimator and a variant of bagging for numerically stabilizing that variance. Our method gives reasonable results even for moderate-sized samples; thus, it can be used for statistical inference in practical situations. The procedure is illustrated through a real data example.
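
The bootstrap variance estimate follows the usual resampling recipe; the sketch below uses the sample mean as a stand-in for the pseudo maximum likelihood index estimator, which would be plugged in in practice:

```python
import numpy as np

def bootstrap_se(estimator, data, n_boot=500, seed=0):
    """Bootstrap standard error of a statistic: resample rows with
    replacement, re-estimate, and take the standard deviation."""
    rng = np.random.default_rng(seed)
    n = len(data)
    reps = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)       # resample with replacement
        reps[b] = estimator(data[idx])
    return reps.std(ddof=1)

data = np.random.default_rng(2).exponential(size=100)
print("bootstrap SE of the mean:", bootstrap_se(np.mean, data))
```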

4.
This paper focuses on inference for the normal mixture model with unequal variances. A feature of the model is its flexible density shape, but this flexibility causes unboundedness of the likelihood function and excessive sensitivity of the maximum likelihood estimator to outliers. A modified likelihood approach suggested in Basu et al. [1998, Biometrika 85, 549–559] can overcome these drawbacks. It is shown that the modified likelihood function is bounded above under a mild condition on the mixing proportions and that the resultant estimator is robust to outliers. The relationship between robustness and efficiency is investigated, and an adaptive method for selecting the tuning parameter of the modified likelihood is suggested, based on a robust model selection criterion and cross-validation. An EM-like algorithm is also constructed. Numerical studies are presented to evaluate the performance. The robust method is applied to single nucleotide polymorphism typing for the purpose of outlier detection and clustering.
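
A minimal sketch of the modified likelihood of Basu et al. (the density power divergence) for a single normal component; the tuning parameter alpha trades efficiency for robustness, and the paper embeds this idea in an EM-like algorithm for the full mixture, which is not reproduced here:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def dpd_objective(theta, x, alpha):
    """Density power divergence objective of Basu et al. (1998) for N(mu, sigma).

    Minimises  int f^(1+alpha) dx - (1 + 1/alpha) * mean(f(x_i)^alpha);
    alpha -> 0 recovers maximum likelihood, larger alpha is more robust.
    For the normal density, int f^(1+alpha) has the closed form below.
    """
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)                     # keep sigma positive
    integral = (2 * np.pi * sigma**2) ** (-alpha / 2) / np.sqrt(1 + alpha)
    emp = np.mean(norm.pdf(x, mu, sigma) ** alpha)
    return integral - (1 + 1 / alpha) * emp

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(0, 1, 190), rng.normal(8, 1, 10)])  # 5% outliers
res = minimize(dpd_objective, x0=[np.median(x), 0.0], args=(x, 0.5))
print("robust (mu, sigma):", res.x[0], np.exp(res.x[1]))
```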

5.
Generalized linear mixed models (GLMMs) are widely used to analyse non-normal response data with extra-variation, but non-robust estimators are still routinely used. We propose robust methods for maximum quasi-likelihood and residual maximum quasi-likelihood estimation to limit the influence of outlying observations in GLMMs. The estimation procedure parallels the development of robust estimation methods in linear mixed models, but with adjustments in the dependent variable and the variance component. The methods proposed are applied to three data sets and a comparison is made with the nonparametric maximum likelihood approach. When applied to a set of epileptic seizure data, the methods proposed have the desired effect of limiting the influence of outlying observations on the parameter estimates. Simulation shows that one of the residual maximum quasi-likelihood proposals has a smaller bias than those of the other estimation methods. We further discuss the equivalence of two GLMM formulations when the response variable follows an exponential family. Their extensions to robust GLMMs and their comparative advantages in modelling are described. Some possible modifications of the robust GLMM estimation methods are given to provide further flexibility for applying the method.

6.
This paper demonstrates that well-known parameter estimation methods for Gaussian fields place different emphasis on the high and low frequency components of the data. As a consequence, the relative importance of the frequencies under the objective of the analysis should be taken into account when selecting an estimation method, in addition to other considerations such as statistical and computational efficiency. The paper also shows that when noise is added to the Gaussian field, maximum pseudolikelihood automatically sets the smoothing parameter of the model equal to one. A simulation study then indicates that generalised cross-validation is more robust than maximum likelihood under model misspecification in smoothing and image restoration problems. This has implications for Bayesian procedures, since these use the same weightings of the frequencies as the likelihood.
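
For illustration, a small sketch of generalised cross-validation for a linear smoother, using a Whittaker-type second-difference penalty as a simple stand-in for the smoothing problems discussed here:

```python
import numpy as np

def gcv_smoother(y, lambdas):
    """Second-difference penalty smoother with GCV selection.

    Hat matrix S = (I + lam * D'D)^(-1); GCV(lam) is the residual sum of
    squares scaled by the squared effective degrees of freedom.
    """
    n = len(y)
    D = np.diff(np.eye(n), n=2, axis=0)      # second-difference operator
    P = D.T @ D
    best = (np.inf, None, None)
    for lam in lambdas:
        S = np.linalg.inv(np.eye(n) + lam * P)
        resid = y - S @ y
        gcv = (n * resid @ resid) / (n - np.trace(S)) ** 2
        if gcv < best[0]:
            best = (gcv, lam, S @ y)
    return best

rng = np.random.default_rng(4)
t = np.linspace(0, 1, 100)
y = np.sin(2 * np.pi * t) + 0.3 * rng.standard_normal(100)
gcv, lam, yhat = gcv_smoother(y, np.logspace(-2, 4, 25))
print("GCV-selected lambda:", lam)
```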

7.
Parameter estimation of the generalized Pareto distribution—Part II
This is the second part of a paper reviewing methods for estimating the parameters of the generalized Pareto distribution (GPD). The GPD is a very important distribution in the extreme value context, commonly used for modeling observations that exceed very high thresholds. The ultimate success of the GPD in applications evidently depends on the parameter estimation process. Quite a few methods exist in the literature for estimating the GPD parameters. Estimation procedures such as maximum likelihood (ML), the method of moments (MOM) and the probability weighted moments (PWM) method were described in Part I of the paper. Here we continue to review methods for estimating the GPD parameters, in particular methods that are robust and procedures that use the Bayesian methodology. As in Part I, we focus on those that are relatively simple and straightforward to apply to real-world data.
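
A short sketch of the maximum likelihood procedure from Part I applied to threshold exceedances, using scipy's generalized Pareto implementation (the robust and Bayesian methods reviewed in this part are not shown; the data and threshold are illustrative):

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(5)
losses = rng.pareto(2.5, 5000) * 10           # heavy-tailed toy data

u = np.quantile(losses, 0.95)                 # high threshold
exceedances = losses[losses > u] - u

# Maximum likelihood fit of the GPD to the exceedances;
# floc=0 fixes the location so only shape (xi) and scale are estimated.
xi, loc, scale = genpareto.fit(exceedances, floc=0)
print(f"shape xi = {xi:.3f}, scale = {scale:.3f}")

# tail quantile implied by the fitted model and the exceedance rate
p, n, n_u = 0.999, len(losses), len(exceedances)
q = u + genpareto.ppf(1 - (1 - p) * n / n_u, xi, 0, scale)
print("estimated 99.9% quantile:", q)
```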

8.
In this paper, a robust estimator is proposed for partially linear regression models. We first estimate the nonparametric component using a penalized regression spline, and then construct an estimator of the parametric component using a robust S-estimator. We propose an iterative algorithm to solve the resulting optimization problem, and introduce a robust generalized cross-validation criterion to select the penalty parameter. Simulation studies and a real data analysis illustrate that the proposed method is robust against outliers in the dataset and against heavy-tailed errors.

9.
We propose a new modified (biased) cross-validation method for adaptively determining the bandwidth in a nonparametric density estimation setup. It is shown that the method provides consistent minimizers. Simulation results comparing the small-sample behavior of the new and the classical cross-validation selectors are reported.

10.
The EM algorithm is often used for finding the maximum likelihood estimates in generalized linear models with incomplete data. In this article, the author presents a robust method in the framework of maximum likelihood estimation for fitting generalized linear models when nonignorable covariates are missing. His robust approach is useful for downweighting any influential observations when estimating the model parameters. To avoid computational problems involving irreducibly high-dimensional integrals, he adopts a Metropolis-Hastings algorithm based on a Markov chain sampling method. He carries out simulations to investigate the behaviour of the robust estimates in the presence of outliers and missing covariates; furthermore, he compares these estimates to the classical maximum likelihood estimates. Finally, he illustrates his approach using data on the occurrence of delirium in patients operated on for abdominal aortic aneurysm.

11.
In this paper, we consider the problem of robust estimation of the fractional parameter, d, in long memory autoregressive fractionally integrated moving average processes when two types of outliers, additive and innovation, are present without knowledge of their number, position or intensity. The proposed method is a weighted likelihood estimation (WLE) approach, for which the necessary definitions and algorithm are given. In an extensive Monte Carlo simulation study, we compare the performance of the WLE method with that of both approximated maximum likelihood estimation (MLE) and the robust M-estimator proposed by Beran (Statistics for Long-Memory Processes, Chapman & Hall, London, 1994). We find that robustness against the two types of outliers considered can be achieved without loss of efficiency. Moreover, as a byproduct of the procedure, we can classify suspicious observations into different kinds of outliers. Finally, we apply the proposed methodology to the Nile River annual minima time series.

12.
A class of predictive densities is derived by weighting the observed samples when maximizing the log-likelihood function. This approach is effective in cases such as sample surveys or designed experiments, where the observed covariate follows a different distribution than in the whole population. Under misspecification of the parametric model, the optimal choice of the weight function is shown asymptotically to be the ratio of the density function of the covariate in the population to that in the observations; this coincides with pseudo-maximum likelihood estimation in sample surveys. Optimality is defined by the expected Kullback–Leibler loss, and the optimal weight is obtained via the importance sampling identity. Under correct specification of the model, however, the ordinary maximum likelihood estimate (i.e. the uniform weight) is shown to be asymptotically optimal. For moderate sample sizes, the situation is in between these two extremes, and the weight function is selected by minimizing a variant of the information criterion derived as an estimate of the expected loss. The method is also applied to a weighted version of the Bayesian predictive density. Numerical examples as well as Monte Carlo simulations are shown for polynomial regression. A connection with robust parametric estimation is discussed.
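
A toy sketch of the weighted log-likelihood with density-ratio weights: the sample is drawn from a shifted distribution, and weighting each log-likelihood term by p_population/p_observed recovers the population parameters (both densities below are illustrative choices; in practice the ratio must itself be estimated):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def weighted_nll(theta, x, w):
    """Weighted negative log-likelihood: each observation's contribution is
    multiplied by w(x_i) = p_population(x_i) / p_observed(x_i)."""
    mu, log_sigma = theta
    return -np.sum(w * norm.logpdf(x, mu, np.exp(log_sigma)))

# covariate-shift setup: the sample comes from N(1, 1), but the target
# population is N(0, 1.5); the optimal weight is the density ratio
rng = np.random.default_rng(6)
x = rng.normal(1.0, 1.0, 500)
w = norm.pdf(x, 0.0, 1.5) / norm.pdf(x, 1.0, 1.0)

res = minimize(weighted_nll, x0=[0.0, 0.0], args=(x, w))
print("weighted MLE (mu, sigma):", res.x[0], np.exp(res.x[1]))
```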

13.
We propose a vector generalized additive modeling framework for taking into account the effect of covariates on angular density functions in a multivariate extreme value context. The proposed methods are tailored for settings where the dependence between extreme values may change according to covariates. We devise a maximum penalized log-likelihood estimator, discuss details of the estimation procedure, and derive its consistency and asymptotic normality. The simulation study suggests that the proposed methods perform well in a wealth of simulation scenarios by accurately recovering the true covariate-adjusted angular density. Our empirical analysis reveals relevant dynamics of the dependence between extreme air temperatures in two alpine resorts during the winter season.

14.
In this paper we present the construction of robust designs for a possibly misspecified generalized linear regression model when the data are censored. Minimax designs and unbiased designs are found for maximum likelihood estimation in the context of both prediction and extrapolation problems. The paper extends previous work on robust designs for complete data by incorporating censoring and maximum likelihood estimation. It also broadens earlier work by others on robust designs for censored data by considering both nonlinearity and much more arbitrary uncertainty in the fitted regression response, and by dropping all restrictions on the structure of the regressors. Solutions are derived analytically by a nonsmooth optimization technique and given in full generality. A typical example in accelerated life testing is also demonstrated. We also investigate implementation schemes used to approximate a robust design having a density, and obtain some exact designs using an optimal implementation scheme.

15.
We propose a new nonparametric estimator for the density function of multivariate bounded data. As frequently observed in practice, the variables may be partially bounded (e.g. nonnegative) or completely bounded (e.g. in the unit interval). In addition, the variables may have a point mass. We reduce the conditions on the underlying density to a minimum by proposing a nonparametric approach. By using a gamma, a beta, or a local linear kernel (also called boundary kernels) in a product kernel, the suggested estimator becomes simple to implement and robust to the well-known boundary bias problem. We investigate the mean integrated squared error properties, including the rate of convergence, uniform strong consistency and asymptotic normality. We establish consistency of the least squares cross-validation method for selecting optimal bandwidth parameters. A detailed simulation study investigates the performance of the estimators. Applications using lottery and corporate finance data are provided.
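
A minimal sketch of a gamma-kernel density estimator of the kind described here for nonnegative data; the shape/scale construction below is one common variant, and the paper's multivariate product-kernel estimator applies one such factor per coordinate:

```python
import numpy as np
from scipy.stats import gamma

def gamma_kde(x_eval, data, b):
    """Gamma-kernel density estimate for nonnegative data.

    The kernel at evaluation point x has shape x/b + 1 and scale b, so no
    probability mass leaks below zero (avoiding boundary bias).
    """
    x_eval = np.atleast_1d(x_eval)
    shape = x_eval / b + 1.0                                  # (m,)
    dens = gamma.pdf(data[None, :], shape[:, None], scale=b)  # (m, n)
    return dens.mean(axis=1)

rng = np.random.default_rng(7)
data = rng.exponential(scale=1.0, size=500)   # mode at the boundary 0
print(gamma_kde(np.linspace(0.0, 5.0, 6), data, b=0.1))
```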

16.
Maximum penalized likelihood estimation is applied in non- and semiparametric regression problems, and enables exploratory identification and diagnostics of nonlinear regression relationships. The smoothing parameter λ controls the trade-off between the smoothness and the goodness-of-fit of a function. Cross-validation is used for selecting λ, but generalized cross-validation, which is based on the squared-error criterion, behaves badly for non-normal distributions and often cannot select a reasonable λ. The purpose of this study is to propose a method that gives a more suitable λ and to evaluate its performance.

A method of simple calculation for the delete-one estimates in the likelihood-based cross-validation (LCV) score is described. A score of similar form to the Akaike information criterion (AIC) is also derived. The proposed scores are compared with those of standard procedures using data sets from the literature. Simulations are performed to compare the patterns of selecting λ and overall goodness-of-fit, and to evaluate the effects of some factors. The LCV scores obtained by the simple calculation provide a good approximation to the exact ones if λ is not extremely small. Furthermore, the LCV scores by the simple calculation make it possible to select λ adaptively. They have the effect of reducing the bias of estimates and provide better performance in the sense of overall goodness-of-fit. These scores are useful especially in the case of small sample sizes and in the case of binary logistic regression.
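
A sketch of the exact delete-one LCV score that the simple calculation is designed to approximate, for a ridge-penalized binary logistic regression (the penalty grid and toy data are illustrative):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def fit_penalized_logistic(X, y, lam):
    """Penalized logistic fit: log-likelihood minus (lam/2) * ||beta||^2."""
    def nll(beta):
        eta = X @ beta
        return -np.sum(y * eta - np.logaddexp(0, eta)) + 0.5 * lam * beta @ beta
    return minimize(nll, np.zeros(X.shape[1])).x

def lcv_score(X, y, lam):
    """Exact delete-one likelihood cross-validation score."""
    n = len(y)
    score = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        beta = fit_penalized_logistic(X[keep], y[keep], lam)
        p = np.clip(expit(X[i] @ beta), 1e-12, 1 - 1e-12)
        score += y[i] * np.log(p) + (1 - y[i]) * np.log(1 - p)
    return score

rng = np.random.default_rng(8)
X = np.column_stack([np.ones(60), rng.standard_normal(60)])
y = rng.binomial(1, expit(1.5 * X[:, 1] - 0.5))
grid = np.logspace(-2, 2, 9)
lam_star = grid[int(np.argmax([lcv_score(X, y, lam) for lam in grid]))]
print("LCV-selected lambda:", lam_star)
```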

17.
This work treats non-parametric estimation of multivariate probability mass functions using multivariate discrete associated kernels. We propose a Bayesian local approach to selecting the matrix of bandwidths, considering the multivariate Dirac discrete uniform and product-of-binomial kernels, and treating the bandwidths as a diagonal matrix of parameters with some prior distribution. The performance of this approach and of the cross-validation method are compared using simulations and real count data sets. The results show that the local Bayesian method performs better than cross-validation in terms of integrated squared error.
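
A sketch of a univariate binomial associated-kernel pmf estimator of the type used in this literature, assuming the kernel Binomial(x + 1, (x + h)/(x + 1)) at target point x; the paper's multivariate product kernel and Bayesian local bandwidth selection are not reproduced:

```python
import numpy as np
from scipy.stats import binom

def binomial_kernel_pmf(x, data, h):
    """Discrete associated binomial kernel estimate of a pmf at integer x.

    The kernel is Binomial(x + 1, (x + h)/(x + 1)) evaluated at the
    observations; the bandwidth h lies in (0, 1].
    """
    p = (x + h) / (x + 1.0)
    return binom.pmf(data, x + 1, p).mean()

rng = np.random.default_rng(9)
data = rng.poisson(3.0, 400)                  # toy count data
est = [binomial_kernel_pmf(x, data, h=0.3) for x in range(10)]
print(np.round(est, 3))
```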

18.
Estimation in the multivariate context when the number of observations available is less than the number of variables is a classical theoretical problem. In order to ensure estimability, one has to assume certain constraints on the parameters. A method for maximum likelihood estimation under constraints is proposed to solve this problem. Even in the extreme case where only a single multivariate observation is available, this may provide a feasible solution. It simultaneously provides a simple, straightforward methodology to allow for specific structures within and between covariance matrices of several populations. This methodology yields exact maximum likelihood estimates.

19.
We study the one-dimensional Ornstein–Uhlenbeck (OU) processes with marginal law given by tempered stable and tempered infinitely divisible distributions. We investigate the transition law between consecutive observations of these processes and evaluate the characteristic function of integrated tempered OU processes with a view toward practical applications. We then analyze how to draw a random sample from this class of processes by considering both the classical inverse transform algorithm and an acceptance–rejection method based on simulating a stable random sample. Using a maximum likelihood estimation method based on the fast Fourier transform, we empirically assess the simulation algorithm performance.

20.
A new procedure is proposed for deriving variable bandwidths in univariate kernel density estimation, based upon likelihood cross-validation and an analysis of a Bayesian graphical model. The procedure admits bandwidth selection which is flexible in terms of the amount of smoothing required. In addition, the basic model can be extended to incorporate local smoothing of the density estimate. The method is shown to perform well in both theoretical and practical situations, and we compare our method with those of Abramson (1982, The Annals of Statistics 10: 1217–1223) and Sain and Scott (1996, Journal of the American Statistical Association 91: 1525–1534). In particular, we note that in certain cases the Sain and Scott method performs poorly even with relatively large sample sizes. We compare various bandwidth selection methods using standard mean integrated squared error criteria to assess the quality of the density estimates. We study situations where the underlying density is assumed both known and unknown, and note that in practice our method performs well when sample sizes are small. Finally, we apply the methods to real data, and again our methods perform at least as well as existing methods.
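
For reference, a minimal sketch of the Abramson square-root-law estimator that the paper compares against: a fixed-bandwidth pilot estimate sets per-point bandwidths proportional to the inverse square root of the pilot density:

```python
import numpy as np
from scipy.stats import norm

def abramson_kde(x_eval, data, h0):
    """Variable-bandwidth KDE with Abramson's square-root law.

    A fixed-bandwidth pilot estimate f~ gives local bandwidths
    h_i = h0 * sqrt(g / f~(X_i)), g = geometric mean of the f~(X_i).
    """
    pilot = norm.pdf(data[:, None], data[None, :], h0).mean(axis=1)
    g = np.exp(np.mean(np.log(pilot)))
    h_i = h0 * np.sqrt(g / pilot)                  # per-point bandwidths
    dens = norm.pdf(np.atleast_1d(x_eval)[:, None],
                    data[None, :], h_i[None, :])
    return dens.mean(axis=1)

rng = np.random.default_rng(10)
data = np.concatenate([rng.normal(0, 1, 300), rng.normal(5, 0.3, 100)])
print(abramson_kde(np.array([0.0, 5.0]), data, h0=0.4))
```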
