首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 618 毫秒
1.
The paper is focussing on some recent developments in nonparametric mixture distributions. It discusses nonparametric maximum likelihood estimation of the mixing distribution and will emphasize gradient type results, especially in terms of global results and global convergence of algorithms such as vertex direction or vertex exchange method. However, the NPMLE (or the algorithms constructing it) provides also an estimate of the number of components of the mixing distribution which might be not desirable for theoretical reasons or might be not allowed from the physical interpretation of the mixture model. When the number of components is fixed in advance, the before mentioned algorithms can not be used and globally convergent algorithms do not exist up to now. Instead, the EM algorithm is often used to find maximum likelihood estimates. However, in this case multiple maxima are often occuring. An example from a meta-analyis of vitamin A and childhood mortality is used to illustrate the considerable, inferential importance of identifying the correct global likelihood. To improve the behavior of the EM algorithm we suggest a combination of gradient function steps and EM steps to achieve global convergence leading to the EM algorithm with gradient function update (EMGFU). This algorithms retains the number of components to be exactly k and typically converges to the global maximum. The behavior of the algorithm is highlighted at hand of several examples.  相似文献   

2.
Abstract.  This paper studies the representation and large-sample consistency for non-parametric maximum likelihood estimators (NPMLEs) of an unknown baseline continuous cumulative-hazard-type function and parameter of group survival difference, based on right-censored two-sample survival data with marginal survival function assumed to follow a transformation model, a slight generalization of the class of frailty survival regression models. The paper's main theoretical results are existence and unique a.s. limit, characterized variationally, for large data samples of the NPMLE of baseline nuisance function in an appropriately defined neighbourhood of the true function when the group difference parameter is fixed, leading to consistency of the NPMLE when the difference parameter is fixed at a consistent estimator of its true value. The joint NPMLE is also shown to be consistent. An algorithm for computing it numerically, based directly on likelihood equations in place of the expectation-maximization (EM) algorithm, is illustrated with real data.  相似文献   

3.
Generalized Gibbs samplers simulate from any direction, not necessarily limited to the coordinate directions of the parameters of the objective function. We study how to optimally choose such directions in a random scan Gibbs sampler setting. We consider that optimal directions will be those that minimize the Kullback–Leibler divergence of two Markov chain Monte Carlo steps. Two distributions over direction are proposed for the multivariate Normal objective function. The resulting algorithms are used to simulate from a truncated multivariate Normal distribution, and the performance of our algorithms is compared with the performance of two algorithms based on the Gibbs sampler.  相似文献   

4.
The computation of rectangular probabilities of multivariate discrete integer distributions such as the multinomial, multivariate hypergeometric or multivariate Pólya distributions is of great interest both for statistical applications and for probabilistic modeling purpose. All these distributions are members of a broader family of multivariate discrete integer distributions for which computationaly efficient approximate methods have been proposed for the evaluation of such probabilities, but with no control over their accuracy. Recently, exact algorithms have been proposed for computing such probabilities, but they are either dedicated to a specific distribution or to very specific rectangular probabilities. We propose a new algorithm that allows to perform the computation of arbitrary rectangular probabilities in the most general case. Its accuracy matches or even outperforms the accuracy exact algorithms when the rounding errors are taken into account. In the worst case, its computational cost is the same as the most efficient exact method published so far, and is much lower in many situations of interest. It does not need any additional storage than the one for the parameters of the distribution, which allows to deal with large dimension/large counting parameter applications at no extra memory cost and with an acceptable computation time, which is a major difference with respect to the methods published so far.  相似文献   

5.
We propose a new stochastic approximation (SA) algorithm for maximum-likelihood estimation (MLE) in the incomplete-data setting. This algorithm is most useful for problems when the EM algorithm is not possible due to an intractable E-step or M-step. Compared to other algorithm that have been proposed for intractable EM problems, such as the MCEM algorithm of Wei and Tanner (1990), our proposed algorithm appears more generally applicable and efficient. The approach we adopt is inspired by the Robbins-Monro (1951) stochastic approximation procedure, and we show that the proposed algorithm can be used to solve some of the long-standing problems in computing an MLE with incomplete data. We prove that in general O(n) simulation steps are required in computing the MLE with the SA algorithm and O(n log n) simulation steps are required in computing the MLE using the MCEM and/or the MCNR algorithm, where n is the sample size of the observations. Examples include computing the MLE in the nonlinear error-in-variable model and nonlinear regression model with random effects.  相似文献   

6.
In this paper, we propose the MulticlusterKDE algorithm applied to classify elements of a database into categories based on their similarity. MulticlusterKDE is centered on the multiple optimization of the kernel density estimator function with multivariate Gaussian kernel. One of the main features of the proposed algorithm is that the number of clusters is an optional input parameter. Furthermore, it is very simple, easy to implement, well defined and stops at a finite number of steps and it always converges regardless of the data set. We illustrate our findings by implementing the algorithm in R software. The results indicate that the MulticlusterKDE algorithm is competitive when compared to K-means, K-medoids, CLARA, DBSCAN and PdfCluster algorithms. Features such as simplicity and efficiency make the proposed algorithm an attractive and promising research field that can be used as basis for its improvement and also for the development of new density-based clustering algorithms.  相似文献   

7.
A fast new algorithm is proposed for numerical computation of (approximate) D-optimal designs. This cocktail algorithm extends the well-known vertex direction method (VDM; Fedorov in Theory of Optimal Experiments, 1972) and the multiplicative algorithm (Silvey et al. in Commun. Stat. Theory Methods 14:1379–1389, 1978), and shares their simplicity and monotonic convergence properties. Numerical examples show that the cocktail algorithm can lead to dramatically improved speed, sometimes by orders of magnitude, relative to either the multiplicative algorithm or the vertex exchange method (a variant of VDM). Key to the improved speed is a new nearest neighbor exchange strategy, which acts locally and complements the global effect of the multiplicative algorithm. Possible extensions to related problems such as nonparametric maximum likelihood estimation are mentioned.  相似文献   

8.
Mixture model-based clustering is widely used in many applications. In certain real-time applications the rapid increase of data size with time makes classical clustering algorithms too slow. An online clustering algorithm based on mixture models is presented in the context of a real-time flaw-diagnosis application for pressurized containers which uses data from acoustic emission signals. The proposed algorithm is a stochastic gradient algorithm derived from the classification version of the EM algorithm (CEM). It provides a model-based generalization of the well-known online k-means algorithm, able to handle non-spherical clusters. Using synthetic and real data sets, the proposed algorithm is compared with the batch CEM algorithm and the online EM algorithm. The three approaches generate comparable solutions in terms of the resulting partition when clusters are relatively well separated, but online algorithms become faster as the size of the available observations increases.  相似文献   

9.
We present new algorithms for computing the exact distributions and p-values of quadratic t-sample distribution-free statistics of Kruskal–Wallis type. These algorithms are presented in terms of generating functions. We show that our algorithm also works for cases with ties and that it is much faster than existing algorithms. Moreover, we show how to use the results for the Kruskal–Wallis type statistics to compute the exact null distribution of the Chacko–Shorack statistic.  相似文献   

10.
Markov chain Monte Carlo (MCMC) is an important computational technique for generating samples from non-standard probability distributions. A major challenge in the design of practical MCMC samplers is to achieve efficient convergence and mixing properties. One way to accelerate convergence and mixing is to adapt the proposal distribution in light of previously sampled points, thus increasing the probability of acceptance. In this paper, we propose two new adaptive MCMC algorithms based on the Independent Metropolis–Hastings algorithm. In the first, we adjust the proposal to minimize an estimate of the cross-entropy between the target and proposal distributions, using the experience of pre-runs. This approach provides a general technique for deriving natural adaptive formulae. The second approach uses multiple parallel chains, and involves updating chains individually, then updating a proposal density by fitting a Bayesian model to the population. An important feature of this approach is that adapting the proposal does not change the limiting distributions of the chains. Consequently, the adaptive phase of the sampler can be continued indefinitely. We include results of numerical experiments indicating that the new algorithms compete well with traditional Metropolis–Hastings algorithms. We also demonstrate the method for a realistic problem arising in Comparative Genomics.  相似文献   

11.
Three general algorithms that use different strategies are proposed for computing the maximum likelihood estimate of a semiparametric mixture model. They seek to maximize the likelihood function by, respectively, alternating the parameters, profiling the likelihood and modifying the support set. All three algorithms make a direct use of the recently proposed fast and stable constrained Newton method for computing the nonparametric maximum likelihood of a mixing distribution and employ additionally an optimization algorithm for unconstrained problems. The performance of the algorithms is numerically investigated and compared for solving the Neyman-Scott problem, overcoming overdispersion in logistic regression models and fitting two-level mixed effects logistic regression models. Satisfactory results have been obtained.  相似文献   

12.
The noncentral beta and the related noncentral F distributions have received much attention during the last decade, as is evident from the works of Norton, Lenth, Frick, Lee, Posten, Chattamvelli, and Chattamvelli and Shanmugam. This article reviews the existing algorithms for computing the cumulative distribution function (cdf) of a noncentral beta random variable, and proposes a simple algorithm, based on a sharp error bound, for computing the cdf. A variation of the noncentral beta random variable when the noncentrality is associated only with the denominator χ2 and its computational details are also discussed.  相似文献   

13.
Parallel multivariate slice sampling   总被引:2,自引:0,他引:2  
Slice sampling provides an easily implemented method for constructing a Markov chain Monte Carlo (MCMC) algorithm. However, slice sampling has two major drawbacks: (i) it requires repeated evaluation of likelihoods for each update, which can make it impractical when evaluations are expensive or as the number of evaluations grows (geometrically) with the dimension of the slice sampler, and (ii) since it can be challenging to construct multivariate updates, the updates are typically univariate, which often results in slow mixing samplers. We propose an approach to multivariate slice sampling that naturally lends itself to a parallel implementation. Our approach takes advantage of recent advances in computer architectures, for instance, the newest generation of graphics cards can execute roughly 30,000 threads simultaneously. We demonstrate that it is possible to construct a multivariate slice sampler that has good mixing properties and is efficient in terms of computing time. The contributions of this article are therefore twofold. We study approaches for constructing a multivariate slice sampler, and we show how parallel computing can be useful for making MCMC algorithms computationally efficient. We study various implementations of our algorithm in the context of real and simulated data.  相似文献   

14.
Convergence of Heavy-tailed Monte Carlo Markov Chain Algorithms   总被引:1,自引:0,他引:1  
Abstract.  In this paper, we use recent results of Jarner & Roberts ( Ann. Appl. Probab., 12, 2002, 224) to show polynomial convergence rates of Monte Carlo Markov Chain algorithms with polynomial target distributions, in particular random-walk Metropolis algorithms, Langevin algorithms and independence samplers. We also use similar methodology to consider polynomial convergence of the Gibbs sampler on a constrained state space. The main result for the random-walk Metropolis algorithm is that heavy-tailed proposal distributions lead to higher rates of convergence and thus to qualitatively better algorithms as measured, for instance, by the existence of central limit theorems for higher moments. Thus, the paper gives for the first time a theoretical justification for the common belief that heavy-tailed proposal distributions improve convergence in the context of random-walk Metropolis algorithms. Similar results are shown to hold for Langevin algorithms and the independence sampler, while results for the mixing of Gibbs samplers on uniform distributions on constrained spaces are rather different in character.  相似文献   

15.
In this paper, we present an algorithm for clustering based on univariate kernel density estimation, named ClusterKDE. It consists of an iterative procedure that in each step a new cluster is obtained by minimizing a smooth kernel function. Although in our applications we have used the univariate Gaussian kernel, any smooth kernel function can be used. The proposed algorithm has the advantage of not requiring a priori the number of cluster. Furthermore, the ClusterKDE algorithm is very simple, easy to implement, well-defined and stops in a finite number of steps, namely, it always converges independently of the initial point. We also illustrate our findings by numerical experiments which are obtained when our algorithm is implemented in the software Matlab and applied to practical applications. The results indicate that the ClusterKDE algorithm is competitive and fast when compared with the well-known Clusterdata and K-means algorithms, used by Matlab to clustering data.  相似文献   

16.
A new acceleration scheme for optimization procedures is defined through geometric considerations and applied to the EM algorithm. In many cases it is able to circumvent the problem of stagnation. No modification of the original algorithm is required. It is simply used as a software component. Thus the new scheme can be easily implemented to accelerate a fixed point algorithm maximizing some objective function. Some practical examples and simulations are presented to show its ability to accelerate EM-type algorithms converging slowly.  相似文献   

17.
In this article, we discuss the maximum likelihood estimates (MLEs) for the exponential and Weibull distributions by considering progressive Type-I interval censored data. For exponential distribution, the explicit expression of MLE of failure rate cannot be obtained when the intervals are not equal in length. The direct application of some numerical algorithms, such as the Newton–Raphson algorithm, is non-ideal because of the cumbersome second derivative. We apply some equivalent quantities to obtain the MLE of failure rate of exponential distribution. Based on the equivalent quantities and the Weibull-to-exponential transformation technique, we propose a new algorithm to obtain the MLEs for the parameters of progressive Type-I interval Weibull data. An example reanalysis and some simulation studies are carried out to illustrate the performance of the estimations using the new algorithm.  相似文献   

18.
Exponential regression model is important in analyzing data from heterogeneous populations. In this paper we propose a simple method to estimate the regression parameters using binary data. Under certain design distributions, including ellipticaily symmetric distributions, for the explanatory variables, the estimators are shown to be consistent and asymptotically normal when sample size is large. For finite samples, the new estimates were shown to behave reasonably well. They are competitive with the maximum likelihood estimates and more importantly, according to our simulation results, the cost of CPU time for computing new estimates is only 1/7 of that required for computing the usual maximum likelihood estimates. We expect the savings in CPU time would be more dramatic with larger dimension of the regression parameter space.  相似文献   

19.
Independent component analysis (ICA) is a popular blind source separation technique used in many scientific disciplines. Current ICA approaches have focused on developing efficient algorithms under specific ICA models, such as instantaneous or convolutive mixing conditions, intrinsically assuming temporal independence or autocorrelation of the sources. In practice, the true model is not known and different ICA algorithms can produce very different results. Although it is critical to choose an ICA model, there has not been enough research done on evaluating mixing models and assumptions, and how the associated algorithms may perform under different scenarios. In this paper, we investigate the performance of multiple ICA algorithms under various mixing conditions. We also propose a convolutive ICA algorithm for echoic mixing cases. Our simulation studies show that the performance of ICA algorithms is highly dependent on mixing conditions and temporal independence of the sources. Most instantaneous ICA algorithms fail to separate autocorrelated sources, while convolutive ICA algorithms depend highly on the model specification and approximation accuracy of unmixing filters.  相似文献   

20.
A nonparametric mixture model specifies that observations arise from a mixture distribution, ∫ f(x, θ) dG(θ), where the mixing distribution G is completely unspecified. A number of algorithms have been developed to obtain unconstrained maximum-likelihood estimates of G, but none of these algorithms lead to estimates when functional constraints are present. In many cases, there is a natural interest in functional ?(G), such as the mean and variance, of the mixing distribution, and profile likelihoods and confidence intervals for ?(G) are desired. In this paper we develop a penalized generalization of the ISDM algorithm of Kalbfleisch and Lesperance (1992) that can be used to solve the problem of constrained estimation. We also discuss its use in various different applications. Convergence results and numerical examples are given for the generalized ISDM algorithm, and asymptotic results are developed for the likelihood-ratio test statistics in the multinomial case.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号