Similar Literature
20 similar documents retrieved.
1.
The K-means algorithm and the normal mixture model method are two common clustering methods. The K-means algorithm is a popular heuristic approach which gives reasonable clustering results if the component clusters are ball-shaped. Currently, there are no analytical results for this algorithm if the component distributions deviate from the ball shape. This paper analytically studies how the K-means algorithm changes its classification rule as the normal component distributions become more elongated under the homoscedastic assumption, and compares this rule with the Bayes rule from the mixture model method. We show that the classification rules of both methods are linear, but the slopes of the two classification lines change in opposite directions as the component distributions become more elongated. The classification performance of the K-means algorithm is then compared to that of the mixture model method via simulation. The comparison, which is limited to two clusters, shows that the K-means algorithm consistently provides poor classification performance as the component distributions become more elongated, while the mixture model method can potentially, but not necessarily, take advantage of this change and provide much better classification performance.
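For readers who want to reproduce this kind of comparison informally, the following sketch (not the paper's simulation design; the cluster centres, covariance and sample sizes are assumed for illustration) contrasts K-means with a tied-covariance Gaussian mixture on two elongated normal components using scikit-learn.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Two homoscedastic normal components, elongated along the x-axis.
cov = np.array([[9.0, 0.0], [0.0, 1.0]])          # 3:1 elongation
means = np.array([[-3.0, 0.0], [3.0, 0.0]])
X = np.vstack([rng.multivariate_normal(m, cov, size=500) for m in means])
labels = np.repeat([0, 1], 500)

# K-means implicitly assumes ball-shaped clusters.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Gaussian mixture with a shared ("tied") covariance matrix,
# matching the homoscedastic assumption in the abstract.
gm = GaussianMixture(n_components=2, covariance_type="tied",
                     random_state=0).fit(X)

def error_rate(pred, truth):
    """Misclassification rate up to label switching."""
    return min(np.mean(pred != truth), np.mean((1 - pred) != truth))

print("K-means error:", error_rate(km.labels_, labels))
print("Mixture error:", error_rate(gm.predict(X), labels))
```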

2.
In this paper, we consider the maximum likelihood and Bayes estimation of the scale parameter of the half-logistic distribution based on a multiply Type II censored sample. However, the maximum likelihood estimator (MLE) and Bayes estimator do not exist in an explicit form for the scale parameter. We consider a simple method of deriving an explicit estimator by approximating the likelihood function and discuss the asymptotic variances of the MLE and approximate MLE. Also, an approximation based on the Laplace approximation (Tierney & Kadane, 1986) is used to obtain the Bayes estimator. In order to compare the MLE, approximate MLE and Bayes estimates of the scale parameter, Monte Carlo simulation is used.
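The censored-sample estimators discussed in the abstract have no closed form; as a loose illustration only, the sketch below fits the half-logistic scale parameter by numerical maximum likelihood from a simulated complete (uncensored) sample using SciPy, with the location fixed at zero. The multiply Type II censoring scheme of the paper is not handled.

```python
import numpy as np
from scipy.stats import halflogistic

rng = np.random.default_rng(1)
true_scale = 2.0

# Simulated complete sample; the paper works with multiply
# Type II censored data, which is not handled here.
data = halflogistic.rvs(scale=true_scale, size=200, random_state=rng)

# Numerical MLE of the scale parameter with location fixed at 0.
loc_hat, scale_hat = halflogistic.fit(data, floc=0)
print("MLE of scale:", scale_hat)
```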

3.
The present study deals with the estimation of the parameters of a k-component load-sharing parallel system model in which each component's failure time distribution is assumed to be geometric. The maximum likelihood estimates of the load-share parameters, together with their standard errors, are obtained. (1 − γ)100% joint, Bonferroni simultaneous and two bootstrap confidence intervals for the parameters have been constructed. Further, recognizing the fact that life testing experiments are time consuming, it seems realistic to treat the load-share parameters as random variables. Therefore, Bayes estimates of the parameters, along with their standard errors, are obtained by assuming Jeffreys' invariant and gamma priors for the unknown parameters. Since the Bayes estimators cannot be found in closed form, Tierney and Kadane's approximation method has been used to compute the Bayes estimates and standard errors of the parameters. A Markov chain Monte Carlo technique, the Gibbs sampler, is also used to obtain Bayes estimates and highest posterior density credible intervals of the load-share parameters, with the Metropolis–Hastings algorithm used to generate samples from the posterior distributions of the unknown parameters.

4.
In this article, two-stage hierarchical Bayesian models are used for the observed occurrences of events in a rectangular region. Two Bayesian variable window scan statistics are introduced to test the null hypothesis that the observed events follow a specified two-stage hierarchical model versus an alternative that indicates a local increase in the average number of observed events in a subregion (clustering). Both procedures are based on a sequence of Bayes factors and their p-values, which have been generated via simulation of posterior samples of the parameters under the null and alternative hypotheses. The posterior samples of the parameters have been generated by employing Gibbs sampling via the introduction of auxiliary variables. Numerical results are presented to evaluate the performance of these variable window scan statistics.

5.
In this paper, maximum likelihood and Bayes estimators of the parameters, reliability and hazard functions have been obtained for the two-parameter bathtub-shaped lifetime distribution when the sample is available under a progressive Type-II censoring scheme. The Markov chain Monte Carlo (MCMC) method is used to compute the Bayes estimates of the model parameters. It has been assumed that the parameters have gamma priors and are independently distributed. A Gibbs within Metropolis–Hastings algorithm has been applied to generate MCMC samples from the posterior density function. Based on the generated samples, the Bayes estimates and highest posterior density credible intervals of the unknown parameters, as well as of the reliability and hazard functions, have been computed. The Bayes estimators are obtained under both the balanced squared error loss and the balanced linear-exponential (BLINEX) loss. Moreover, based on the asymptotic normality of the maximum likelihood estimators, approximate confidence intervals (CIs) are obtained. In order to construct the asymptotic CIs of the reliability and hazard functions, their variances are needed; these are approximated by the delta and bootstrap methods. Two real data sets have been analyzed to demonstrate how the proposed methods can be used in practice.
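A minimal sketch of the Metropolis-within-Gibbs machinery described above follows. It assumes, for concreteness, the Chen (2000) bathtub-shaped form F(t) = 1 − exp{λ(1 − e^(t^β))} with independent gamma priors (the abstract does not name the exact two-parameter form, so this is an assumption), and it uses a complete simulated sample rather than the progressive Type-II censored data of the paper; all tuning constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed model: Chen (2000), F(t) = 1 - exp{lam * (1 - exp(t**beta))}.
def loglik(lam, beta, t):
    if lam <= 0 or beta <= 0:
        return -np.inf
    tb = t ** beta
    return np.sum(np.log(lam) + np.log(beta) + (beta - 1) * np.log(t)
                  + tb + lam * (1.0 - np.exp(tb)))

def log_gamma_prior(x, shape=1.0, rate=1.0):
    # Independent Gamma(shape, rate) priors, as assumed in the abstract.
    return (shape - 1) * np.log(x) - rate * x if x > 0 else -np.inf

# Simulate a complete sample by inverting the cdf (the paper uses a
# progressive Type-II censored sample, not reproduced here).
lam_true, beta_true, n = 0.5, 1.2, 100
u = rng.uniform(size=n)
t = (np.log(1.0 - np.log(1.0 - u) / lam_true)) ** (1.0 / beta_true)

# Random-walk Metropolis, updating (lam, beta) one at a time,
# i.e. a Metropolis-within-Gibbs scheme.
lam, beta = 1.0, 1.0
draws = []
for it in range(5000):
    for which in (0, 1):
        prop_lam = lam + 0.1 * rng.standard_normal() if which == 0 else lam
        prop_beta = beta + 0.1 * rng.standard_normal() if which == 1 else beta
        log_num = (loglik(prop_lam, prop_beta, t)
                   + log_gamma_prior(prop_lam) + log_gamma_prior(prop_beta))
        log_den = (loglik(lam, beta, t)
                   + log_gamma_prior(lam) + log_gamma_prior(beta))
        if np.log(rng.uniform()) < log_num - log_den:
            lam, beta = prop_lam, prop_beta
    draws.append((lam, beta))

draws = np.array(draws[1000:])          # discard burn-in
print("posterior means (lam, beta):", draws.mean(axis=0))
```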

6.
Based on progressive Type II censored samples, we have derived the maximum likelihood and Bayes estimators for the two shape parameters and the reliability function of the exponentiated Weibull lifetime model. We obtained Bayes estimators under both symmetric and asymmetric loss functions, namely the squared error and LINEX loss functions, with respect to conjugate priors for the two shape parameters. We used an approximation based on the Lindley (Trabajos de Estadistica 21, 223–237, 1980) method for obtaining Bayes estimates under these loss functions. We made comparisons between these estimators and the maximum likelihood estimators using a Monte Carlo simulation study.
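When posterior draws are available (for example from MCMC rather than the Lindley approximation used in the paper), the squared error and LINEX Bayes estimates take the simple forms sketched below; the posterior sample and the LINEX parameter a are stand-ins, not the exponentiated Weibull posterior.

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in posterior draws of a shape parameter; the paper's posterior
# for the exponentiated Weibull model is not reproduced here.
theta_draws = rng.gamma(shape=4.0, scale=0.5, size=10_000)

# Squared error loss -> posterior mean.
theta_se = theta_draws.mean()

# LINEX loss with parameter a -> -(1/a) * log E[exp(-a * theta)].
a = 1.5
theta_linex = -np.log(np.mean(np.exp(-a * theta_draws))) / a

print(f"squared-error estimate: {theta_se:.4f}")
print(f"LINEX (a={a}) estimate: {theta_linex:.4f}")
```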

7.
In the present paper we examine finite mixtures of multivariate Poisson distributions as an alternative class of models for multivariate count data. The proposed models allow for both overdispersion in the marginal distributions and negative correlation, while they are computationally tractable using standard ideas from finite mixture modelling. An EM-type algorithm for maximum likelihood (ML) estimation of the parameters is developed. The identifiability of this class of mixtures is proved. Properties of ML estimators are derived. A real data application concerning model-based clustering for multivariate count data related to different types of crime is presented to illustrate the practical potential of the proposed class of models.

8.
Starting with a decision-theoretic formulation of simultaneous testing of null hypotheses against two-sided alternatives, a procedure controlling the Bayesian directional false discovery rate (BDFDR) is developed through controlling the posterior directional false discovery rate (PDFDR). This is an alternative to Lewis and Thayer [2004. A loss function related to the FDR for random effects multiple comparison. J. Statist. Plann. Inference 125, 49–58] with better control of the BDFDR. Moreover, it is optimal in the sense of being the non-randomized part of the procedure maximizing the posterior expectation of the directional per-comparison power rate given the data, while controlling the PDFDR. A corresponding empirical Bayes method is proposed in the context of the one-way random effects model. A simulation study shows that the proposed Bayes and empirical Bayes methods perform much better from a Bayesian perspective than the procedures available in the literature.

9.
The EM algorithm is the standard method for estimating the parameters in finite mixture models. Yang and Pan [25] proposed a generalized classification maximum likelihood procedure, called the fuzzy c-directions (FCD) clustering algorithm, for estimating the parameters in mixtures of von Mises distributions. Two main drawbacks of the EM algorithm are its slow convergence and the dependence of the solution on the initial value used. The choice of initial values is of great importance in the algorithm-based literature, as it can heavily influence the speed of convergence of the algorithm and its ability to locate the global maximum. On the other hand, the algorithmic frameworks of EM and FCD are closely related, so the drawbacks of FCD are the same as those of the EM algorithm. To resolve these problems, this paper proposes another clustering algorithm, which can self-organize locally optimal cluster numbers without using cluster validity functions. The numerical results clearly indicate that the proposed algorithm outperforms the EM and FCD algorithms. Finally, we apply the proposed algorithm to two real data sets.

10.
We discuss the general form of a first-order correction to the maximum likelihood estimator that is expressed in terms of the gradient of a function, which could, for example, be the logarithm of a prior density function. In terms of Kullback–Leibler divergence, the correction gives an asymptotic improvement over maximum likelihood under rather general conditions. The theory is illustrated for Bayes estimators with conjugate priors, and the optimal choice of hyperparameter to improve the maximum likelihood estimator is discussed. The results based on Kullback–Leibler risk are extended to a wide class of risk functions.
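As a concrete special case of such a gradient-type correction (a standard illustrative derivation, not taken from the paper): for a normal mean with a conjugate normal prior, the posterior mean equals the MLE plus σ²/n times the gradient of the log-prior, up to O(n⁻²).

```latex
% Illustrative special case: X_1,\dots,X_n \sim N(\theta,\sigma^2),
% prior \theta \sim N(\mu,\tau^2); the MLE is \bar{x}.
\[
  E(\theta \mid x)
  = \frac{\tau^{2}\bar{x} + (\sigma^{2}/n)\,\mu}{\tau^{2} + \sigma^{2}/n}
  = \bar{x} + \frac{\sigma^{2}}{n}\,
      \frac{\mu - \bar{x}}{\tau^{2} + \sigma^{2}/n}
  = \bar{x} + \frac{\sigma^{2}}{n}\,
      \left.\frac{d}{d\theta}\log \pi(\theta)\right|_{\theta=\bar{x}}
      + O(n^{-2}),
\]
% since \frac{d}{d\theta}\log\pi(\theta) = -(\theta-\mu)/\tau^{2}.
```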

11.
This article deals with semisupervised learning based on the naive Bayes assumption. A univariate Gaussian mixture density is used for continuous input variables, whereas a histogram-type density is adopted for discrete input variables. The EM algorithm is used to compute maximum likelihood estimators of the parameters in the model when the number of mixing components for each continuous input variable is fixed. We carry out model selection to choose a parsimonious model among various fitted models based on an information criterion. A common density method is proposed for the selection of significant input variables. Simulated and real datasets are used to illustrate the performance of the proposed method.
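As a loosely related, off-the-shelf illustration of semisupervised classification under the naive Bayes assumption, the sketch below uses scikit-learn's self-training wrapper around Gaussian naive Bayes (a different mechanism from the EM-based mixture approach of the paper), with unlabeled cases marked by -1; the synthetic data are assumed for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB
from sklearn.semi_supervised import SelfTrainingClassifier

rng = np.random.default_rng(7)

# Synthetic data with most labels hidden (-1 marks "unlabeled").
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
y_partial = y.copy()
mask_unlabeled = rng.uniform(size=len(y)) < 0.8
y_partial[mask_unlabeled] = -1

# Self-training around Gaussian naive Bayes; not the paper's method.
clf = SelfTrainingClassifier(GaussianNB()).fit(X, y_partial)
print("accuracy on all points:", clf.score(X, y))
```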

12.
The mean vector associated with several independent variates from the exponential subclass of Hudson (1978) is estimated under weighted squared error loss. In particular, the formal Bayes and “Stein-like” estimators of the mean vector are given. Conditions are also given under which these estimators dominate any of the “natural estimators”. Our conditions for dominance are motivated by a result of Stein (1981), who treated the N_p(θ, I) case with p ≥ 3. Stein showed that formal Bayes estimators dominate the usual estimator if the marginal density of the data is superharmonic. Our present exponential-class generalization entails an elliptic differential inequality in some natural variables. Actually, we assume that each component of the data vector has a probability density function which satisfies a certain differential equation. While the densities of Hudson (1978) are particular solutions of this equation, other solutions are not of the exponential class if certain parameters are unknown. Our approach allows for the possibility of extending the parametric Stein theory to useful nonexponential cases, but the problem of nuisance parameters is not treated here.
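For reference, the classical normal-means result of Stein (1981) that motivates these dominance conditions can be stated as follows (a standard result; the exponential-class generalization of the paper is not reproduced here).

```latex
% Tweedie's formula for X \sim N_p(\theta, I_p) with marginal density m(x):
\[
  \delta_{\pi}(x) \;=\; E(\theta \mid x) \;=\; x + \nabla \log m(x),
\]
% and Stein (1981) shows that this formal Bayes estimator dominates the
% usual estimator \delta_0(x) = x under squared error loss when p \ge 3
% and the marginal m is superharmonic, i.e. \Delta m(x) \le 0.
```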

13.
The Gompertz distribution has been used as a growth model, especially in epidemiological and biomedical studies. Based on Type I and Type II censored samples from a heterogeneous population that can be represented by a two-component mixture of Gompertz lifetime models, the maximum likelihood and Bayes estimates of the parameters, reliability and hazard rate functions are obtained. An approximation due to Lindley (1980) is used in obtaining the corresponding Bayes estimates. The maximum likelihood and Bayes estimates are compared via a Monte Carlo simulation study.

14.
A new family of mixture models for the model-based clustering of longitudinal data is introduced. The covariance structures of eight members of this new family of models are given and the associated maximum likelihood estimates for the parameters are derived via expectation–maximization (EM) algorithms. The Bayesian information criterion is used for model selection and a convergence criterion based on the Aitken acceleration is used to determine the convergence of these EM algorithms. This new family of models is applied to yeast sporulation time course data, where the models give good clustering performance. Further constraints are then imposed on the decomposition to allow a deeper investigation of the correlation structure of the yeast data. These constraints greatly extend this new family of models, with the addition of many parsimonious models.
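The Aitken-acceleration stopping rule mentioned above is standard in the mixture-model literature; a minimal sketch, with a hypothetical log-likelihood trace and tolerance, is given below.

```python
import numpy as np

def aitken_converged(loglik, tol=1e-6):
    """Aitken-acceleration stopping rule for an EM log-likelihood sequence.

    Given successive log-likelihoods l[k-2], l[k-1], l[k], predict the
    asymptotic value l_inf and stop when it is within `tol` of l[k].
    """
    if len(loglik) < 3:
        return False
    l0, l1, l2 = loglik[-3], loglik[-2], loglik[-1]
    denom = l1 - l0
    if denom == 0:
        return True
    a = (l2 - l1) / denom                      # Aitken acceleration factor
    if a >= 1:                                  # no contraction yet
        return False
    l_inf = l1 + (l2 - l1) / (1.0 - a)          # predicted limit
    return abs(l_inf - l2) < tol

# Hypothetical log-likelihood trace from an EM run.
trace = [-520.0, -480.0, -460.0, -450.0, -445.0, -442.5, -441.3, -440.7]
print([aitken_converged(trace[:k]) for k in range(3, len(trace) + 1)])
```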

15.
A problem of selecting populations better than a control is considered. When the populations are uniformly distributed, empirical Bayes rules are derived for a linear loss function for both the known control parameter and the unknown control parameter cases. When the priors are assumed to have bounded supports, empirical Bayes rules for selecting good populations are derived for distributions with truncation parameters (i.e. the pdf has the form f(x|θ) = p_i(x) c_i(θ) I_(0,θ)(x)). Monte Carlo studies are carried out which determine the minimum sample sizes needed to make the relative errors less than ε for given ε-values.

16.
We present an algorithm for multivariate robust Bayesian linear regression with missing data. The iterative algorithm computes an approximate posterior for the model parameters based on the variational Bayes (VB) method. Compared to the EM algorithm, the VB method has the advantage that the variance of the model parameters is also computed directly by the algorithm. We consider three families of Gaussian scale mixture models for the measurements, which include as special cases the multivariate t distribution, the multivariate Laplace distribution, and the contaminated normal model. The observations can contain missing values, assuming that the missing data mechanism can be ignored. A Matlab/Octave implementation of the algorithm is presented and applied to solve three reference examples from the literature.

17.
The retrieval of wind vectors from satellite scatterometer observations is a non-linear inverse problem. A common approach to solving inverse problems is to adopt a Bayesian framework and to infer the posterior distribution of the parameters of interest given the observations by using a likelihood model relating the observations to the parameters, and a prior distribution over the parameters. We show how Gaussian process priors can be used efficiently with a variety of likelihood models, using local forward (observation) models and direct inverse models for the scatterometer. We present an enhanced Markov chain Monte Carlo method to sample from the resulting multimodal posterior distribution. We go on to show how the computational complexity of the inference can be controlled by using a sparse, sequential Bayes algorithm for estimation with Gaussian processes. This helps to overcome the most serious barrier to the use of probabilistic Gaussian process methods in remote sensing inverse problems, which is the prohibitively large size of the data sets. We contrast the sampling results with the approximations that are found by using the sparse, sequential Bayes algorithm.
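The scatterometer forward models, the enhanced MCMC sampler and the sparse sequential Bayes algorithm are specific to the paper and are not reproduced here; purely as an illustration of placing a Gaussian process prior over an unknown function and computing its posterior, a scikit-learn sketch on synthetic one-dimensional data might look like this.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(4)

# Synthetic 1-D data (stand-in only; the paper's scatterometer
# observations and forward model are far more involved).
X = rng.uniform(0, 10, size=40)[:, None]
y = np.sin(X).ravel() + 0.2 * rng.standard_normal(40)

# GP prior: RBF covariance plus a white-noise term for the likelihood.
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel, random_state=0).fit(X, y)

X_new = np.linspace(0, 10, 5)[:, None]
mean, std = gp.predict(X_new, return_std=True)
print(np.c_[X_new.ravel(), mean, std])
```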

18.
In this paper, the problem of estimating the unknown parameters of a two-parameter Kumaraswamy-Exponential (Kw-E) distribution is considered based on a progressively Type-II censored sample. The maximum likelihood (ML) estimators of the parameters are obtained. Bayes estimates are also obtained under different loss functions such as squared error, LINEX and general entropy. Lindley's approximation method is used to evaluate these Bayes estimates. Monte Carlo simulation is used for numerical comparison between the various estimates developed in this paper.

19.
The expectation maximization (EM) algorithm is a widely used approach for estimating the parameters of multivariate multinomial mixtures in a latent class model. However, this approach has unsatisfactory computing efficiency. This study proposes a fuzzy clustering algorithm (FCA) based on both the maximum penalized likelihood (MPL) for the latent class model and the modified penalty fuzzy c-means (PFCM) for normal mixtures. Numerical examples confirm that the FCA-MPL algorithm is more efficient (that is, requires fewer iterations) and more computationally effective (measured by the approximate relative ratio of accurate classification) than the EM algorithm.
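The FCA-MPL algorithm itself is not reproduced here; for orientation, the classical fuzzy c-means iteration (Bezdek) that the penalty fuzzy c-means modifies can be sketched as follows, on assumed synthetic continuous data.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Classical fuzzy c-means (Bezdek); not the FCA-MPL of the paper."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial membership matrix with rows summing to 1.
    U = rng.dirichlet(np.ones(c), size=n)            # shape (n, c)
    for _ in range(n_iter):
        Um = U ** m
        # Update cluster centres as membership-weighted means.
        centres = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Squared distances from every point to every centre.
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return centres, U

# Synthetic example with two well-separated groups.
rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0, 1, size=(100, 2)),
               rng.normal(5, 1, size=(100, 2))])
centres, U = fuzzy_c_means(X, c=2)
print("centres:\n", centres)
```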

20.
This article considers inference for the log-normal distribution based on progressive Type I interval censored data by both frequentist and Bayesian methods. First, the maximum likelihood estimates (MLEs) of the unknown model parameters are computed by the expectation-maximization (EM) algorithm. The asymptotic standard errors (ASEs) of the MLEs are obtained by applying the missing information principle. Next, the Bayes estimates of the model parameters are obtained by the Gibbs sampling method under both symmetric and asymmetric loss functions. The Gibbs sampling scheme is facilitated by adopting a data augmentation scheme similar to that in the EM algorithm. The performance of the MLEs and various Bayesian point estimates is judged via a simulation study. A real dataset is analyzed for the purpose of illustration.
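The EM algorithm for progressive Type I interval censored data is specific to the paper; for complete (uncensored) data the MLEs of the log-normal parameters reduce to the closed form sketched below (the mean and standard deviation of the log-observations), which the EM iterations generalize. The simulated sample is illustrative only.

```python
import numpy as np
from scipy.stats import lognorm

rng = np.random.default_rng(6)
mu_true, sigma_true = 1.0, 0.5

# Complete (uncensored) sample; the paper's data are progressively
# Type I interval censored, which requires the EM algorithm instead.
x = lognorm.rvs(s=sigma_true, scale=np.exp(mu_true), size=500,
                random_state=rng)

# Closed-form MLEs for complete data: moments of log(x).
logx = np.log(x)
mu_hat = logx.mean()
sigma_hat = logx.std(ddof=0)        # MLE uses the 1/n variance
print(f"mu_hat = {mu_hat:.3f}, sigma_hat = {sigma_hat:.3f}")
```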
