Similar articles
 Found 20 similar articles; search took 31 ms
1.
ABSTRACT

We introduce a new parsimonious bimodal distribution, referred to as the bimodal skew-symmetric Normal (BSSN) distribution, which is potentially effective in capturing bimodality, excess kurtosis, and skewness. Explicit expressions for the moment-generating function, mean, variance, skewness, and excess kurtosis were derived. The shape properties of the proposed distribution were investigated in regard to skewness, kurtosis, and bimodality. Maximum likelihood estimation was considered and an expression for the observed information matrix was provided. Illustrative examples using medical and financial data as well as simulated data from a mixture of normal distributions were worked.

2.
Bimodal truncated count distributions are frequently observed in aggregate survey data and in user ratings when respondents are mixed in their opinion. They also arise in censored count data, where the highest category might create an additional mode. Modeling bimodal behavior in discrete data is useful for various purposes, from comparing shapes of different samples (or survey questions) to predicting future ratings by new raters. The Poisson distribution is the most common distribution for fitting count data and can be modified to achieve mixtures of truncated Poisson distributions. However, it is suitable only for modeling equidispersed distributions and is limited in its ability to capture bimodality. The Conway–Maxwell–Poisson (CMP) distribution is a two-parameter generalization of the Poisson distribution that allows for over- and underdispersion. In this work, we propose a mixture of CMPs for capturing a wide range of truncated discrete data, which can exhibit unimodal and bimodal behavior. We present methods for estimating the parameters of a mixture of two CMP distributions using an EM approach. Our approach introduces a special two-step optimization within the M step to estimate multiple parameters. We examine computational and theoretical issues. The methods are illustrated for modeling ordered rating data as well as truncated count data, using simulated and real examples.
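As a rough illustration of the building block in the abstract above, the sketch below evaluates a truncated CMP probability mass function and a two-component mixture of such distributions. It assumes the usual CMP weight, proportional to lam^y / (y!)^nu, renormalized over a finite support. The function names, the parameter names, and the choice of support are hypothetical, and the EM estimation step described in the abstract is not reproduced here.

```python
import math


def truncated_cmp_pmf(lam, nu, max_count, min_count=0):
    """PMF of a CMP(lam, nu) distribution restricted to min_count..max_count.

    Assumes the standard CMP weight lam**y / (y!)**nu; nu = 1 gives a
    truncated Poisson, nu < 1 overdispersion, nu > 1 underdispersion.
    """
    support = range(min_count, max_count + 1)
    # Work on the log scale to avoid overflow for large lam or counts.
    log_w = [y * math.log(lam) - nu * math.lgamma(y + 1) for y in support]
    shift = max(log_w)
    weights = [math.exp(v - shift) for v in log_w]
    total = sum(weights)
    return {y: w / total for y, w in zip(support, weights)}


def cmp_mixture_pmf(p, lam1, nu1, lam2, nu2, max_count, min_count=0):
    """Two-component mixture: p * CMP(lam1, nu1) + (1 - p) * CMP(lam2, nu2)."""
    first = truncated_cmp_pmf(lam1, nu1, max_count, min_count)
    second = truncated_cmp_pmf(lam2, nu2, max_count, min_count)
    return {y: p * first[y] + (1 - p) * second[y] for y in first}


if __name__ == "__main__":
    # Hypothetical 1-10 rating scale with one low and one high cluster.
    ratings = cmp_mixture_pmf(0.5, 0.5, 1.0, 30.0, 1.0, max_count=10, min_count=1)
    for rating, prob in ratings.items():
        print(rating, round(prob, 4))
```

With one component concentrated near the bottom of the scale and the other pushed against the upper truncation point, the mixture shows the two modes the abstract describes for polarized ratings.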

3.
Summary.  Advances in understanding the biological underpinnings of many cancers have led increasingly to the use of molecularly targeted anticancer therapies. Because the platelet-derived growth factor receptor (PDGFR) has been implicated in the progression of prostate cancer bone metastases, it is of great interest to examine possible relationships between PDGFR inhibition and therapeutic outcomes. We analyse the association between change in activated PDGFR (phosphorylated PDGFR) and progression-free survival time based on large within-patient samples of cell-specific phosphorylated PDGFR values taken before and after treatment from each of 88 prostate cancer patients. To utilize these paired samples as covariate data in a regression model for progression-free survival time, and because the phosphorylated PDGFR distributions are bimodal, we first employ a Bayesian hierarchical mixture model to obtain a deconvolution of the pretreatment and post-treatment within-patient phosphorylated PDGFR distributions. We evaluate fits of the mixture model and a non-mixture model that ignores the bimodality by using a supnorm metric to compare the empirical distribution of each phosphorylated PDGFR data set with the corresponding fitted distribution under each model. Our results show that first using the mixture model to account for the bimodality of the within-patient phosphorylated PDGFR distributions, and then using the posterior within-patient component mean changes in phosphorylated PDGFR so obtained as covariates in the regression model for progression-free survival time, provides an improved estimation.
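The supnorm comparison mentioned in the abstract above can be read as the largest vertical gap between the empirical distribution function of a sample and a fitted distribution function. The sketch below is one minimal way to compute that gap, assuming the fitted CDF is continuous and supplied as a callable; the function name and arguments are hypothetical and not taken from the paper.

```python
def sup_norm_distance(sample, fitted_cdf):
    """Largest absolute gap between the empirical CDF and fitted_cdf.

    For a continuous fitted CDF the supremum is attained at a data point,
    just before or just after the empirical CDF jumps.
    """
    xs = sorted(sample)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs, start=1):
        f = fitted_cdf(x)
        # Empirical CDF equals i/n at x and (i - 1)/n just below x.
        gap = max(gap, i / n - f, f - (i - 1) / n)
    return gap


if __name__ == "__main__":
    # Hypothetical check against a Uniform(0, 1) CDF.
    def uniform_cdf(x):
        return min(max(x, 0.0), 1.0)

    print(sup_norm_distance([0.1, 0.4, 0.45, 0.9], uniform_cdf))
```

A smaller distance for the mixture fit than for the single-component fit would be the kind of evidence the abstract refers to when it says the mixture model better accounts for bimodality.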

4.
Over the years, many papers have used parametric distributions to model crop yields, such as the normal (N), Beta, Log-normal and Skew-normal (SN) distributions. These models are well defined, both mathematically and computationally, but they do not incorporate bimodality. It is therefore necessary to study distributions that are more flexible, since most crop yield data in Brazil show evidence of asymmetry or bimodality. Thus, the aim of this study was to model and forecast soybean yields for municipalities in the State of Paraná over the period from 1980 to 2014, using the Odd log normal logistic (OLLN) distribution for the bimodal data and the Beta, SN and Skew-t distributions for the symmetrical and asymmetrical series. The OLLN model was the one that best fit the data. The results were discussed in the context of crop insurance pricing.

5.
Approximating the distribution of mobile communications expenditures (MCE) is complicated by zero observations in the sample. To deal with the zero observations by allowing a point mass at zero, a mixture model of MCE distributions is proposed and applied. The MCE distribution is specified as a mixture of two distributions, one with a point mass at zero and the other with full support on the positive half of the real line. The model is empirically verified for individual MCE survey data collected in Seoul, Korea. The mixture model can easily capture the common bimodality feature of the MCE distribution. In addition, when covariates were added to the model, it was found that the probability that an individual has zero expenditure varies significantly with some of the variables. Finally, the goodness-of-fit test suggests that the data are well represented by the mixture model.
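One way to write down the point-mass mixture described in the abstract above is sketched here. The symbols are assumptions introduced for illustration and do not come from the paper: \pi is the probability of zero expenditure, G and g are the distribution function and density of the positive component, and n_0 is the number of zeros among n observations.

```latex
% Distribution function: a jump of size \pi at zero plus a continuous part.
F(y) = \pi \,\mathbf{1}\{y \ge 0\} + (1 - \pi)\, G(y), \qquad G(y) = 0 \ \text{for } y \le 0 .

% Likelihood for observations y_1, \dots, y_n with n_0 zeros:
L(\pi, g) = \pi^{\,n_0} (1 - \pi)^{\,n - n_0} \prod_{i:\, y_i > 0} g(y_i).
```

Because the likelihood factorizes, the zero probability and the positive part can be estimated separately when there are no covariates, giving \hat{\pi} = n_0 / n. Letting \pi depend on covariates, for example through a logit link (again an assumption), is one way to obtain the covariate effects on zero expenditure that the abstract reports.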

6.
Maclean et al. (1976) applied a specific Box-Cox transformation to test for mixtures of distributions against a single distribution. Their null hypothesis is that a sample of n observations is from a normal distribution with unknown mean and variance after a restricted Box-Cox transformation. The alternative is that the sample is from a mixture of two normal distributions, each with unknown mean and unknown, but equal, variance after another restricted Box-Cox transformation. We developed a computer program that calculated the maximum likelihood estimates (MLEs) and likelihood ratio test (LRT) statistic for the above. Our algorithm for the calculation of the MLEs of the unknown parameters used multiple starting points to protect against convergence to a local rather than global maximum. We then simulated the distribution of the LRT for samples drawn from a normal distribution and five Box-Cox transformations of a normal distribution. The null distribution appeared to be the same for the Box-Cox transformations studied and appeared to be distributed as a chi-square random variable for samples of 25 or more. The degrees of freedom parameter appeared to be a monotonically decreasing function of the sample size. The null distribution of this LRT appeared to converge to a chi-square distribution with 2.5 degrees of freedom. We estimated the critical values for the 0.10, 0.05, and 0.01 levels of significance.
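The sketch below illustrates the general shape of the computation in the abstract above: a likelihood ratio statistic comparing a single normal with a two-component, equal-variance normal mixture, where the mixture fit is restarted from several starting points. It makes several assumptions that are not from the paper: the Box-Cox step is omitted (the data are taken as already transformed), EM is used as a stand-in optimizer because the abstract does not name the algorithm, and the starting points come from splitting the sorted data at a few fractions. All function names are hypothetical.

```python
import math


def normal_logpdf(x, mu, sigma):
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def null_loglik(xs):
    """Maximized log-likelihood under a single normal distribution."""
    n = len(xs)
    mu = sum(xs) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / n)
    return sum(normal_logpdf(x, mu, sigma) for x in xs)


def log_terms(x, p, mu1, mu2, sigma):
    a = math.log(p) + normal_logpdf(x, mu1, sigma)
    b = math.log(1 - p) + normal_logpdf(x, mu2, sigma)
    return a, b


def mixture_loglik(xs, p, mu1, mu2, sigma):
    total = 0.0
    for x in xs:
        a, b = log_terms(x, p, mu1, mu2, sigma)
        m = max(a, b)
        total += m + math.log(math.exp(a - m) + math.exp(b - m))
    return total


def em_fit(xs, p, mu1, mu2, sigma, iters=200):
    """EM for a two-component normal mixture with a common variance."""
    n = len(xs)
    for _ in range(iters):
        # E step: responsibility of component 1 for each observation.
        resp = []
        for x in xs:
            a, b = log_terms(x, p, mu1, mu2, sigma)
            m = max(a, b)
            ea, eb = math.exp(a - m), math.exp(b - m)
            resp.append(ea / (ea + eb))
        # M step: weighted means and a pooled variance; clamp to keep p in (0, 1).
        n1 = min(max(sum(resp), 1e-8), n - 1e-8)
        p = n1 / n
        mu1 = sum(r * x for r, x in zip(resp, xs)) / n1
        mu2 = sum((1 - r) * x for r, x in zip(resp, xs)) / (n - n1)
        var = sum(r * (x - mu1) ** 2 + (1 - r) * (x - mu2) ** 2
                  for r, x in zip(resp, xs)) / n
        sigma = math.sqrt(max(var, 1e-12))
    return mixture_loglik(xs, p, mu1, mu2, sigma)


def lrt_statistic(xs, split_fractions=(0.25, 0.5, 0.75)):
    """2 * (best mixture log-likelihood - null log-likelihood) over several starts."""
    ordered = sorted(xs)
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    best = -math.inf
    for frac in split_fractions:
        k = min(max(int(frac * n), 1), n - 1)
        low, high = ordered[:k], ordered[k:]
        best = max(best, em_fit(xs, k / n, sum(low) / k, sum(high) / (n - k), sd))
    return 2 * (best - null_loglik(xs))
```

Keeping the best of several runs mirrors the abstract's concern about local maxima. Critical values would still have to come from simulation under the null, as the authors describe, rather than from a standard chi-square table.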

7.
One of the fundamental issues in analyzing microarray data is to determine which genes are expressed and which ones are not for a given group of subjects. In datasets where many genes are expressed and many are not expressed (i.e., underexpressed), a bimodal distribution for the gene expression levels often results, where one mode of the distribution represents the expressed genes and the other mode represents the underexpressed genes. To model this bimodality, we propose a new class of mixture models that utilize a random threshold value for accommodating bimodality in the gene expression distribution. Theoretical properties of the proposed model are carefully examined. We use this new model to examine the problem of differential gene expression between two groups of subjects, develop prior distributions, and derive a new criterion for determining which genes are differentially expressed between the two groups. Prior elicitation is carried out using empirical Bayes methodology in order to estimate the threshold value as well as elicit the hyperparameters for the two component mixture model. The new gene selection criterion is demonstrated via several simulations to have excellent false positive rate and false negative rate properties. A gastric cancer dataset is used to motivate and illustrate the proposed methodology.

8.
In this paper, we discuss the class of generalized Birnbaum–Saunders distributions, which is a very flexible family suitable for modeling lifetime data as it allows for different degrees of kurtosis and asymmetry and unimodality as well as bimodality. We describe the theoretical developments on this model including properties, transformations and related distributions, lifetime analysis, and shape analysis. We also discuss methods of inference based on uncensored and censored data, diagnostics methods, goodness-of-fit tests, and random number generation algorithms for the generalized Birnbaum–Saunders model. Finally, we present some illustrative examples and show that this distribution fits the data better than the classical Birnbaum–Saunders model.

9.
We introduce a multivariate heteroscedastic measurement error model for replications under scale mixtures of normal distribution. The model can provide a robust analysis and can be viewed as a generalization of multiple linear regression from both model structure and distribution assumption. An efficient method based on Markov Chain Monte Carlo is developed for parameter estimation. The deviance information criterion and the conditional predictive ordinates are used as model selection criteria. Simulation studies show robust inference behaviours of the model against both misspecification of distributions and outliers. We work out an illustrative example with a real data set on measurements of plant root decomposition.

10.
We propose a prior probability model for two distributions that are ordered according to a stochastic precedence constraint, a weaker restriction than the more commonly utilized stochastic order constraint. The modeling approach is based on structured Dirichlet process mixtures of normal distributions. Full inference for functionals of the stochastic precedence constrained mixture distributions is obtained through a Markov chain Monte Carlo posterior simulation method. A motivating application involves study of the discriminatory ability of continuous diagnostic tests in epidemiologic research. Here, stochastic precedence provides a natural restriction for the distributions of test scores corresponding to the non-infected and infected groups. Inference under the model is illustrated with data from a diagnostic test for Johne’s disease in dairy cattle. We also apply the methodology to the comparison of survival distributions associated with two distinct conditions, and illustrate with analysis of data on survival time after bone marrow transplantation for treatment of leukemia.

11.
We introduce two new families of univariate distributions that we call hyperminimal and hypermaximal distributions. These families have interesting applications in the context of reliability theory in that they contain that of coherent system lifetime distributions. For these families, we obtain distributions, bounds, and moments. We also define the minimal and maximal signatures of a coherent system with exchangeable components which allow us to represent the system distribution as generalized mixtures (i.e., mixtures with possibly negative weights) of series and parallel systems. These results can also be applied to order statistics (k-out-of-n systems). Finally, we give some applications studying coherent systems with different multivariate exponential joint distributions.
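A small worked case may make the idea of a generalized mixture with negative weights concrete. The notation is an assumption introduced here, not taken from the paper: \bar{F}_T is the survival function of the system lifetime, \bar{F}_{1:i} is the survival function of a series system of i components, and (a_1, \dots, a_n) is the minimal signature. The example is a 2-out-of-3 system with exchangeable components, which works whenever at least two components work.

```latex
% General form: a generalized mixture of series-system survival functions.
\bar{F}_T(t) = \sum_{i=1}^{n} a_i \, \bar{F}_{1:i}(t), \qquad \sum_{i=1}^{n} a_i = 1 .

% 2-out-of-3 system: inclusion-exclusion over the three pairs of components.
\bar{F}_T(t) = 3\,\bar{F}_{1:2}(t) - 2\,\bar{F}_{1:3}(t),
\qquad (a_1, a_2, a_3) = (0,\, 3,\, -2).

% Check in the i.i.d. case with component reliability p = \bar{F}(t):
3p^2(1 - p) + p^3 = 3p^2 - 2p^3 .
```

The weights sum to one, but one of them is negative, which is why the representation is a generalized mixture rather than an ordinary one.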

12.
In this article, we propose mixtures of skew Laplace normal (SLN) distributions to model both skewness and heavy-tailedness in heterogeneous data sets as an alternative to mixtures of skew Student-t-normal (STN) distributions. We give the expectation–maximization (EM) algorithm to obtain the maximum likelihood (ML) estimators for the parameters of interest. We also analyze the mixture regression model based on the SLN distribution and provide the ML estimators of the parameters using the EM algorithm. The performance of the proposed mixture model is illustrated by a simulation study and two real data examples.

13.
We consider here a generalization of the skew-normal distribution, GSN(λ1, λ2, ρ), defined through a standard bivariate normal distribution with correlation ρ, which is a special case of the unified multivariate skew-normal distribution studied recently by Arellano-Valle and Azzalini [2006. On the unification of families of skew-normal distributions. Scand. J. Statist. 33, 561–574]. We then present some simple and useful properties of this distribution and also derive its moment generating function in an explicit form. Next, we show that distributions of order statistics from the trivariate normal distribution are mixtures of these generalized skew-normal distributions; thence, using the established properties of the generalized skew-normal distribution, we derive the moment generating functions of order statistics, and also present expressions for means and variances of these order statistics. Next, we introduce a generalized skew-tν distribution, which is a special case of the unified multivariate skew-elliptical distribution presented by Arellano-Valle and Azzalini [2006. On the unification of families of skew-normal distributions. Scand. J. Statist. 33, 561–574] and is in fact a three-parameter generalization of Azzalini and Capitanio's [2003. Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t distribution. J. Roy. Statist. Soc. Ser. B 65, 367–389] univariate skew-tν form. We then use the relationship between the generalized skew-normal and skew-tν distributions to discuss some properties of generalized skew-tν as well as distributions of order statistics from bivariate and trivariate tν distributions. We show that these distributions of order statistics are indeed mixtures of generalized skew-tν distributions, and then use this property to derive explicit expressions for means and variances of these order statistics.

14.
This paper introduces W-tests for assessing homogeneity in mixtures of discrete probability distributions. A W-test statistic depends on the data solely through parameter estimators and, if a penalized maximum likelihood estimation framework is used, has a tractable asymptotic distribution under the null hypothesis of homogeneity. The large-sample critical values are quantiles of a chi-square distribution multiplied by an estimable constant for which we provide an explicit formula. In particular, the estimation of large-sample critical values does not involve simulation experiments or random field theory. We demonstrate that W-tests are generally competitive with a benchmark test in terms of power to detect heterogeneity. Moreover, in many situations, the large-sample critical values can be used even with small to moderate sample sizes. The main implementation issue (selection of an underlying measure) is thoroughly addressed, and we explain why W-tests are well-suited to problems involving large and online data sets. Application of a W-test is illustrated with an epidemiological data set.

15.
A problem of estimating regression coefficients is considered when the distribution of error terms is unknown but symmetric. We propose the use of reference distributions having various kurtosis values. It is assumed that the true error distribution is one of the reference distributions, but the indicator variable for the true distribution is missing. The generalized expectation–maximization algorithm combined with a line search is developed for estimating regression coefficients. Simulation experiments are carried out to compare the performance of the proposed approach with some existing robust regression methods including least absolute deviation, Lp, Huber M regression and an approximation using normal mixtures under various error distributions. As the error distribution is far from a normal distribution, the proposed method is observed to show better performance than other methods.

16.
This paper focuses on the development of a new extension of the generalized skew-normal distribution introduced in Gómez et al. [Generalized skew-normal models: properties and inference. Statistics. 2006;40(6):495–505]. To produce the generalization a new parameter is introduced, the sign of which has the flexibility of yielding unimodal as well as bimodal distributions. We study its properties, derive a stochastic representation and state some expressions that facilitate moments derivation. Maximum likelihood is implemented via the EM algorithm which is based on the stochastic representation derived. We show that the Fisher information matrix is singular and discuss ways of getting round this problem. An illustration using real data reveals that the model can capture well special data features such as bimodality and asymmetry.

17.
We present a new class of models to fit longitudinal data, obtained with a suitable modification of the classical linear mixed-effects model. For each sample unit, the joint distribution of the random effect and the random error is a finite mixture of scale mixtures of multivariate skew-normal distributions. This extension allows us to model the data in a more flexible way, taking into account skewness, multimodality and discrepant observations at the same time. The scale mixtures of skew-normal form an attractive class of asymmetric heavy-tailed distributions that includes the skew-normal, skew-Student-t, skew-slash and the skew-contaminated normal distributions as special cases, being a flexible alternative to the use of the corresponding symmetric distributions in this type of models. A simple efficient MCMC Gibbs-type algorithm for posterior Bayesian inference is employed. In order to illustrate the usefulness of the proposed methodology, two artificial and two real data sets are analyzed.

18.
The testing problem for the order of finite mixture models has a long history and remains an active research topic. Since Ghosh & Sen (1985) revealed the hard-to-manage asymptotic properties of the likelihood ratio test, many successful alternative approaches have been developed. The most successful attempts include the modified likelihood ratio test and the EM-test, which lead to neat solutions for finite mixtures of univariate normal distributions, finite mixtures of single-parameter distributions, and several mixture-like models. The problem remains challenging, and there is still no generic solution for location-scale mixtures. In this article, we provide an EM-test solution for homogeneity for finite mixtures of location-scale family distributions. This EM-test has nonstandard limiting distributions, but we are able to find the critical values numerically. We use computer experiments to obtain appropriate values for the tuning parameters. A simulation study shows that the fine-tuned EM-test has close to nominal type I errors and very good power properties. Two application examples are included to demonstrate the performance of the EM-test.

19.
This paper describes a Bayesian approach to make inference for risk reserve processes with an unknown claim‐size distribution. A flexible model based on mixtures of Erlang distributions is proposed to approximate the special features frequently observed in insurance claim sizes, such as long tails and heterogeneity. A Bayesian density estimation approach for the claim sizes is implemented using reversible jump Markov chain Monte Carlo methods. An advantage of the considered mixture model is that it belongs to the class of phase‐type distributions, and thus explicit evaluations of the ruin probabilities are possible. Furthermore, from a statistical point of view, the parametric structure of the mixtures of the Erlang distribution offers some advantages compared with the whole over‐parametrized family of phase‐type distributions. Given the observed claim arrivals and claim sizes, we show how to estimate the ruin probabilities, as a function of the initial capital, and predictive intervals that give a measure of the uncertainty in the estimations.

20.
We consider ways to estimate the mixing proportions in a finite mixture distribution or to estimate the number of components of the mixture distribution without making parametric assumptions about the component distributions. We require a vector of observations on each subject. This vector is mapped into a vector of 0s and 1s and summed. The resulting distribution of sums can be modelled as a mixture of binomials. We then work with the binomial mixture. The efficiency and robustness of this method are compared with the strategy of assuming multivariate normal mixtures when, typically, the true underlying mixture distribution is different. It is shown that in many cases the approach based on simple binomial mixtures is superior.
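The reduction described in the abstract above (map each subject's vector to 0s and 1s, sum, then model the sums as a binomial mixture) can be sketched as follows. The abstract does not say how the 0/1 mapping is chosen, so the per-coordinate cutoffs used here are an assumption, as are the function names. Estimation of the mixing proportions is not shown.

```python
import math


def binarize_and_sum(observations, cutoffs):
    """Map each measurement to 1 if it exceeds its cutoff (assumed rule), then sum."""
    return sum(1 if x > c else 0 for x, c in zip(observations, cutoffs))


def binomial_pmf(s, m, theta):
    """Probability of s successes in m trials with success probability theta."""
    return math.comb(m, s) * theta ** s * (1 - theta) ** (m - s)


def binomial_mixture_pmf(s, m, weights, thetas):
    """Probability of sum s under a finite mixture of Binomial(m, theta_k)."""
    return sum(w * binomial_pmf(s, m, t) for w, t in zip(weights, thetas))


if __name__ == "__main__":
    # Hypothetical subject with three measurements and a common cutoff of 1.0.
    score = binarize_and_sum([1.2, 0.3, 2.5], [1.0, 1.0, 1.0])
    # Hypothetical two-group mixture: 70% low responders, 30% high responders.
    print(score, binomial_mixture_pmf(score, 3, [0.7, 0.3], [0.2, 0.8]))
```

Working with the sums replaces a multivariate, possibly misspecified component model with a one-dimensional discrete one, which is the trade-off the abstract evaluates.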
