首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
We investigate the Bayes estimation of the means in Poisson decomposable graphical models. Some classes of Bayes estimators are provided which improve on the maximum likelihood estimator under the normalized squared error loss. Both proper and improper priors are included in the proposed classes of priors. Concerning the generalized Bayes estimators with respect to the improper priors, we address their admissibility.  相似文献   

2.
Abstract

Covariance estimation and selection for multivariate datasets in a high-dimensional regime is a fundamental problem in modern statistics. Gaussian graphical models are a popular class of models used for this purpose. Current Bayesian methods for inverse covariance matrix estimation under Gaussian graphical models require the underlying graph and hence the ordering of variables to be known. However, in practice, such information on the true underlying model is often unavailable. We therefore propose a novel permutation-based Bayesian approach to tackle the unknown variable ordering issue. In particular, we utilize multiple maximum a posteriori estimates under the DAG-Wishart prior for each permutation, and subsequently construct the final estimate of the inverse covariance matrix. The proposed estimator has smaller variability and yields order-invariant property. We establish posterior convergence rates under mild assumptions and illustrate that our method outperforms existing approaches in estimating the inverse covariance matrices via simulation studies.  相似文献   

3.
We investigate three interval estimators for binomial misclassification rates in a complementary Poisson model where the data are possibly misclassified: a Wald-based interval, a score-based interval, and an interval based on the profile log-likelihood statistic. We investigate the coverage and average width properties of these intervals via a simulation study. For small Poisson counts and small misclassification rates, the intervals can perform poorly in terms of coverage. The profile log-likelihood confidence interval (CI) is often proved to outperform the other intervals with good coverage and width properties. Lastly, we apply the CIs to a real data set involving traffic accident data that contain misclassified counts.  相似文献   

4.
In many financial applications, Poisson mixture regression models are commonly used to analyze heterogeneous count data. When fitting these models, the observed counts are supposed to come from two or more subpopulations and parameter estimation is typically performed by means of maximum likelihood via the Expectation–Maximization algorithm. In this study, we discuss briefly the procedure for fitting Poisson mixture regression models by means of maximum likelihood, the model selection and goodness-of-fit tests. These models are applied to a real data set for credit-scoring purposes. We aim to reveal the impact of demographic and financial variables in creating different groups of clients and to predict the group to which each client belongs, as well as his expected number of defaulted payments. The model's conclusions are very interesting, revealing that the population consists of three groups, contrasting with the traditional good versus bad categorization approach of the credit-scoring systems.  相似文献   

5.
The analysis of traffic accident data is crucial to address numerous concerns, such as understanding contributing factors in an accident''s chain-of-events, identifying hotspots, and informing policy decisions about road safety management. The majority of statistical models employed for analyzing traffic accident data are logically count regression models (commonly Poisson regression) since a count – like the number of accidents – is used as the response. However, features of the observed data frequently do not make the Poisson distribution a tenable assumption. For example, observed data rarely demonstrate an equal mean and variance and often times possess excess zeros. Sometimes, data may have heterogeneous structure consisting of a mixture of populations, rather than a single population. In such data analyses, mixtures-of-Poisson-regression models can be used. In this study, the number of injuries resulting from casualties of traffic accidents registered by the General Directorate of Security (Turkey, 2005–2014) are modeled using a novel mixture distribution with two components: a Poisson and zero-truncated-Poisson distribution. Such a model differs from existing mixture models in literature where the components are either all Poisson distributions or all zero-truncated Poisson distributions. The proposed model is compared with the Poisson regression model via simulation and in the analysis of the traffic data.  相似文献   

6.
The zero-inflated regression models such as zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB) or zero-inflated generalized Poisson (ZIGP) regression models can model the count data with excess zeros. The ZINB model can handle over-dispersed and the ZIGP model can handle the over or under-dispersed count data with excess zeros as well. Moreover, the count data may be correlated because of data collection procedure or special study design. The clustered sampling approach is one of the examples in which the correlation among subjects could be defined. In such situations, a marginal model using generalized estimating equation (GEE) approach can incorporate these correlations and lead up to the relationships at the population level. In this study, the GEE-based zero-inflated generalized Poisson regression model was proposed to fit over and under-dispersed clustered count data with excess zeros.  相似文献   

7.
Count data often contain many zeros. In parametric regression analysis of zero-inflated count data, the effect of a covariate of interest is typically modelled via a linear predictor. This approach imposes a restrictive, and potentially questionable, functional form on the relation between the independent and dependent variables. To address the noted restrictions, a flexible parametric procedure is employed to model the covariate effect as a linear combination of fixed-knot cubic basis splines or B-splines. The semiparametric zero-inflated Poisson regression model is fitted by maximizing the likelihood function through an expectation–maximization algorithm. The smooth estimate of the functional form of the covariate effect can enhance modelling flexibility. Within this modelling framework, a log-likelihood ratio test is used to assess the adequacy of the covariate function. Simulation results show that the proposed test has excellent power in detecting the lack of fit of a linear predictor. A real-life data set is used to illustrate the practicality of the methodology.  相似文献   

8.
Multivariate zero-inflated Poisson (ZIP) distributions are important tools for modelling and analysing correlated count data with extra zeros. Unfortunately, existing multivariate ZIP distributions consider only the overall zero-inflation while the component zero-inflation is not well addressed. This paper proposes a flexible multivariate ZIP distribution, called the multivariate component ZIP distribution, in which both the overall and component zero-inflations are taken into account. Likelihood-based inference procedures including the calculation of maximum likelihood estimates of parameters in the model without and with covariates are provided. Simulation studies indicate that the performance of the proposed methods on the multivariate component ZIP model is satisfactory. The Australia health care utilisation data set is analysed to demonstrate that the new distribution is more appropriate than the existing multivariate ZIP distributions.  相似文献   

9.
We propose a model for count data from two-stage cluster sampling, where observations within each cluster are subjected simultaneously to internal influences and external factors at the cluster level. This model can be seen as a two-stage hierarchical model with local and global predictors. This parameter-driven model causes the counts within a cluster to share a common latent factor and to be correlated. Maximum likelihood (ml) estimation based on an EM algorithm for the model is discussed. Simulation study is carried out to assess the benefit of using ml estimates compared to a standard Poisson regression analysis that ignores the within cluster correlation.  相似文献   

10.
In manpower planning it is cornmoniy tue case tnat employees withuraw from active service for a period of time before returning to take up post at a later date. Such periods of absence are frequently of major concern to employers who are anxious to ensure that employees return as soon as possible. The distribution of duration of such periods of absence are therefore of considerable interest as is the probability that such employees will ever return to active service. In this paper we derive a nonparametric estimator for such a lifetime distribution based on renewal data which are subject to various forms of incompleteness, namely right censoring, left and right truncation, and forward recurrence. Artificial truncation is used to ensure that the data are time homogeneous. A nonparametric maximum likelihood estimator for the lifetime.  相似文献   

11.
ABSTRACT

In this article, a finite mixture model of hurdle Poisson distribution with missing outcomes is proposed, and a stochastic EM algorithm is developed for obtaining the maximum likelihood estimates of model parameters and mixing proportions. Specifically, missing data is assumed to be missing not at random (MNAR)/non ignorable missing (NINR) and the corresponding missingness mechanism is modeled through probit regression. To improve the algorithm efficiency, a stochastic step is incorporated into the E-step based on data augmentation, whereas the M-step is solved by the method of conditional maximization. A variation on Bayesian information criterion (BIC) is also proposed to compare models with different number of components with missing values. The considered model is a general model framework and it captures the important characteristics of count data analysis such as zero inflation/deflation, heterogeneity as well as missingness, providing us with more insight into the data feature and allowing for dispersion to be investigated more fully and correctly. Since the stochastic step only involves simulating samples from some standard distributions, the computational burden is alleviated. Once missing responses and latent variables are imputed to replace the conditional expectation, our approach works as part of a multiple imputation procedure. A simulation study and a real example illustrate the usefulness and effectiveness of our methodology.  相似文献   

12.
We present a model for data in the form of matched pairs of counts. Our work is motivated by a problem in fission-track analysis, where the determination of a crystal's age is based on the ratio of counts of spontaneous and induced tracks. It is often reasonable to assume that the counts follow a Poisson distribution, but typically they are overdispersed and there exists a positive correlation between the numbers of spontaneous and induced tracks in the same crystal. We propose a model that allows for both overdispersion and correlation by assuming that the mean densities follow a bivariate Wishart distribution. Our model is quite general, having the usual negative-binomial and Poisson models as special cases. We propose a maximum-likelihood estimation method based on a stochastic implementation of the EM algorithm, and we derive the asymptotic standard errors of the parameter estimates. We illustrate the method with a data set of fission-track counts in matched areas of zircon crystals.  相似文献   

13.
The barely known continuous reciprocal inverse Gaussian distribution is used in this paper to introduce the Poisson-reciprocal inverse Gaussian discrete distribution. Several of its most relevant statistical properties are examined, some of them directly inherited from the reciprocal of the inverse Gaussian distribution. Furthermore, a mixed Poisson regression model that uses the reciprocal inverse Gaussian as mixing distribution is presented. Parameters estimation in this regression model is performed via an EM type algorithm. In light of the numerical results displayed in the paper, the distributions introduced in this work are competitive with the classical negative binomial and Poisson-inverse Gaussian distributions.  相似文献   

14.
We review the Fisher scoring and EM algorithms for incomplete multivariate data from an estimating function point of view, and examine the corresponding quasi-score functions under second-moment assumptions. A bias-corrected REML-type estimator for the covariance matrix is derived, and the Fisher, Godambe and empirical sandwich information matrices are compared. We make a numerical investigation of the two algorithms, and compare with a hybrid algorithm, where Fisher scoring is used for the mean vector and the EM algorithm for the covariance matrix.  相似文献   

15.
We introduce in this paper, the shrinkage estimation method in the lognormal regression model for censored data involving many predictors, some of which may not have any influence on the response of interest. We develop the asymptotic properties of the shrinkage estimators (SEs) using the notion of asymptotic distributional biases and risks. We show that if the shrinkage dimension exceeds two, the asymptotic risk of the SEs is strictly less than the corresponding classical estimators. Furthermore, we study the penalty (LASSO and adaptive LASSO) estimation methods and compare their relative performance with the SEs. A simulation study for various combinations of the inactive predictors and censoring percentages shows that the SEs perform better than the penalty estimators in certain parts of the parameter space, especially when there are many inactive predictors in the model. It also shows that the shrinkage and penalty estimators outperform the classical estimators. A real-life data example using Worcester heart attack study is used to illustrate the performance of the suggested estimators.  相似文献   

16.
17.
We propose a method for estimating parameters in generalized linear models when the outcome variable is missing for some subjects and the missing data mechanism is non-ignorable. We assume throughout that the covariates are fully observed. One possible method for estimating the parameters is maximum likelihood with a non-ignorable missing data model. However, caution must be used when fitting non-ignorable missing data models because certain parameters may be inestimable for some models. Instead of fitting a non-ignorable model, we propose the use of auxiliary information in a likelihood approach to reduce the bias, without having to specify a non-ignorable model. The method is applied to a mental health study.  相似文献   

18.
The Zero-inflated Poisson distribution has been used in the modeling of count data in different contexts. This model tends to be influenced by outliers because of the excessive occurrence of zeroes, thus outlier identification and robust parameter estimation are important for such distribution. Some outlier identification methods are studied in this paper, and their applications and results are also presented with an example. To eliminate the effect of outliers, two robust parameter estimates are proposed based on the trimmed mean and the Winsorized mean. Simulation results show the robustness of our proposed parameter estimates.  相似文献   

19.
In this paper, we assume the number of competing causes to follow an exponentially weighted Poisson distribution. By assuming the initial number of competing causes can undergo destruction and that the population of interest has a cure fraction, we develop the EM algorithm for the determination of the MLEs of the model parameters of such a general cure model. This model is more flexible than the promotion time cure model and also provides an interesting and realistic interpretation of the biological mechanism of the occurrence of an event of interest. Instead of assuming a particular parametric distribution for the lifetime, we assume the lifetime to belong to the wider class of generalized gamma distribution. This allows us to carry out a model discrimination to select a parsimonious lifetime distribution that provides the best fit to the data. Within the EM framework, a two-way profile likelihood approach is proposed to estimate the shape parameters. An extensive Monte Carlo simulation study is carried out to demonstrate the performance of the proposed estimation method. Model discrimination is carried out by means of the likelihood ratio test and information-based methods. Finally, a data on melanoma is analyzed for illustrative purpose.  相似文献   

20.
For longitudinal data, the within-subject dependence structure and covariance parameters may be of practical and theoretical interests. The estimation of covariance parameters has received much attention and been studied mainly in the framework of generalized estimating equations (GEEs). The GEEs method, however, is sensitive to outliers. In this paper, an alternative set of robust generalized estimating equations for both the mean and covariance parameters are proposed in the partial linear model for longitudinal data. The asymptotic properties of the proposed estimators of regression parameters, non-parametric function and covariance parameters are obtained. Simulation studies are conducted to evaluate the performance of the proposed estimators under different contaminations. The proposed method is illustrated with a real data analysis.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号