Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
The EM algorithm is the standard method for estimating the parameters in finite mixture models. Yang and Pan [25] proposed a generalized classification maximum likelihood procedure, called the fuzzy c-directions (FCD) clustering algorithm, for estimating the parameters in mixtures of von Mises distributions. Two main drawbacks of the EM algorithm are its slow convergence and the dependence of the solution on the initial value used. The choice of initial values is of great importance in the algorithm-based literature, as it can heavily influence the speed of convergence of the algorithm and its ability to locate the global maximum. The algorithmic frameworks of EM and FCD are closely related, so FCD shares the drawbacks of the EM algorithm. To resolve these problems, this paper proposes another clustering algorithm, which can self-organize locally optimal cluster numbers without using cluster validity functions. Numerical results clearly indicate that the proposed algorithm outperforms the EM and FCD algorithms. Finally, we apply the proposed algorithm to two real data sets.

2.
The semiparametric LABROC approach of fitting a binormal model for estimating the AUC as a global index of accuracy has been justified (except for bimodal forms), while for estimating a local index of accuracy such as the TPF, it may lead to bias when the data depart severely from binormality. We extended parametric ROC analysis for quantitative data to the case where one or both members of the pair are mixtures of Gaussians (MG), in particular for bimodal forms. We showed analytically that the AUC and TPF are mixtures, weighted by the mixing parameters, of the AUCs and TPFs of the different components of the underlying mixture distributions. In a simulation study of six configurations of MG distributions, with {bimodal, normal} and {bimodal, bimodal} pairs, the parameters of the MG distributions were estimated using the EM algorithm. The results showed that the estimated AUC from our proposed model was essentially unbiased, and that the bias in the estimated TPF at a clinically relevant range of FPF was roughly 0.01 for a sample size of n=100/100. In practice, with severe departures from binormality, we recommend an extension of LABROC and software development for future research to allow each member of the pair of distributions to be a mixture of Gaussians, which is a more flexible parametric form.
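As an illustration of the mixture-weighting statement in this abstract, the standard Gaussian identities can be written out as below. This is a sketch under assumed notation, not the paper's own derivation: X is the diseased score with mixing weights p_i, means mu_i and standard deviations sigma_i; Y is the non-diseased score with weights q_j, means nu_j and standard deviations tau_j; X and Y are independent, larger scores indicate disease, and c is a decision threshold.

```latex
\begin{align*}
X &\sim \sum_{i=1}^{m} p_i \, N(\mu_i, \sigma_i^2), \qquad
Y \sim \sum_{j=1}^{n} q_j \, N(\nu_j, \tau_j^2), \\
\mathrm{AUC} &= P(X > Y)
  = \sum_{i=1}^{m} \sum_{j=1}^{n} p_i \, q_j \,
    \Phi\!\left( \frac{\mu_i - \nu_j}{\sqrt{\sigma_i^2 + \tau_j^2}} \right), \\
\mathrm{TPF}(c) &= P(X > c)
  = \sum_{i=1}^{m} p_i \left[ 1 - \Phi\!\left( \frac{c - \mu_i}{\sigma_i} \right) \right], \\
\mathrm{FPF}(c) &= P(Y > c)
  = \sum_{j=1}^{n} q_j \left[ 1 - \Phi\!\left( \frac{c - \nu_j}{\tau_j} \right) \right].
\end{align*}
```

With m = n = 1 these expressions reduce to the usual binormal AUC and TPF, which is why the mixture form can be read as an extension of the binormal model.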

3.
Optimal designs for estimating the parameters and also the optimum factor combinations in multiresponse experiments have been considered by various authors. However, to date, optimum designs in mixture experiments have been studied only in the single-response case. In this article, an attempt is made to investigate optimum designs for estimating optimum mixing proportions in a multiresponse mixture experiment.

4.
Clustering in high-dimensional spaces is nowadays a recurrent problem in many scientific domains but remains a difficult task in terms of both clustering accuracy and interpretability of the results. This paper presents a discriminative latent mixture (DLM) model which fits the data in a latent orthonormal discriminative subspace with an intrinsic dimension lower than the dimension of the original space. By constraining model parameters within and between groups, a family of 12 parsimonious DLM models is obtained which can accommodate a variety of situations. An estimation algorithm, called the Fisher-EM algorithm, is also proposed for estimating both the mixture parameters and the discriminative subspace. Experiments on simulated and real datasets highlight the good performance of the proposed approach compared to existing clustering methods, while providing a useful representation of the clustered data. The method is also applied to the clustering of mass spectrometry data.

5.
A finite mixture model is considered in which the mixing probabilities vary from observation to observation. A parametric model is assumed for one mixture component distribution, while the others are nonparametric nuisance parameters. Generalized estimating equations (GEE) are proposed for the semi‐parametric estimation. Asymptotic normality of the GEE estimates is demonstrated and the lower bound for their dispersion (asymptotic covariance) matrix is derived. An adaptive technique is developed to derive estimates with nearly optimal small dispersion. An application to the sociological analysis of voting results is discussed. The Canadian Journal of Statistics 41: 217–236; 2013 © 2013 Statistical Society of Canada

6.
In this article, we consider the problem of estimating certain “parameters” in a mixture of probability measures. We show that a single sample is typically suitable for estimating the component measures, but not suitable for estimating the mixing measures, especially when consistency is required. To have consistent estimators of the mixing measure, several samples with increasing size are needed in general.

7.
A general model is proposed for flexibly estimating the density of a continuous response variable conditional on a possibly high-dimensional set of covariates. The model is a finite mixture of asymmetric Student t densities with covariate-dependent mixture weights. The four parameters of the components (the mean, degrees of freedom, scale and skewness) are all modeled as functions of the covariates. Inference is Bayesian and the computation is carried out using Markov chain Monte Carlo simulation. To enable model parsimony, a variable selection prior is used in each set of covariates and among the covariates in the mixing weights. The model is used to analyze the distribution of daily stock market returns, and is shown to forecast the distribution of returns more accurately than other widely used models for financial data.

8.
Provided certain parameters are known, almost any linear map from R^p to R^1 can be adjusted to yield a consistent and unbiased estimator in the context of estimating the mixing proportion θ on the basis of an unclassified sample of observations taken from a mixture of two p-dimensional distributions in proportions θ and 1-θ. Attention is focused on a recently proposed estimator, θ̂, which has minimum variance over all such linear maps. Unfortunately, the form of θ̂ depends on the means of the component distributions and the covariance matrix of the mixture distribution. The effect of using appropriate sample estimates for these unknown parameters in forming θ̂ is investigated by deriving the asymptotic mean and variance of the resulting estimator. The relative efficiency of this estimator under normality is derived. Also, a study is undertaken of the performance of a similar type of estimator appropriate in the context where an observed data vector is not an observation from either one or the other of the component distributions, but is recorded as an integrated measurement over a surface area which is a mixture of two categories whose characteristics have different statistical distributions. The asymptotic bias in this case is compared with some available practical results.

9.
For longitudinal data, the within-subject dependence structure and covariance parameters may be of practical and theoretical interest. The estimation of covariance parameters has received much attention and has been studied mainly in the framework of generalized estimating equations (GEEs). The GEE method, however, is sensitive to outliers. In this paper, an alternative set of robust generalized estimating equations for both the mean and covariance parameters is proposed in the partial linear model for longitudinal data. The asymptotic properties of the proposed estimators of the regression parameters, the non-parametric function and the covariance parameters are obtained. Simulation studies are conducted to evaluate the performance of the proposed estimators under different contaminations. The proposed method is illustrated with a real data analysis.

10.
In longitudinal studies, the mixture generalized estimating equation (mix-GEE) was proposed to improve the efficiency of the fixed-effects estimator when the working correlation structure is misspecified. When the subject-specific effect is of interest, mixed-effects models are widely used to analyze longitudinal data. However, most of the existing approaches assume a normal distribution for the random effects, and this could affect the efficiency of the fixed-effects estimator. In this article, a conditional mixture generalized estimating equation (cmix-GEE) approach is developed, based on the advantages of the mix-GEE and the conditional quadratic inference function (CQIF) method. The advantage of our new approach is that it does not require the normality assumption for random effects and can accommodate the serial correlation between observations within the same cluster. A feature of our proposed approach is that the estimators of the regression parameters are more efficient than those of CQIF even if the working correlation structure is not correctly specified. In addition, according to the estimates of some mixture proportions, the true working correlation matrix can be identified. We establish the asymptotic results for the fixed-effects parameter estimators. Simulation studies were conducted to evaluate our proposed method.

11.
A general framework is proposed for modelling clustered mixed outcomes. A mixture of generalized linear models is used to describe the joint distribution of a set of underlying variables, and an arbitrary function relates the underlying variables to the observed outcomes. The model accommodates multilevel data structures, general covariate effects and distinct link functions and error distributions for each underlying variable. Within the framework proposed, novel models are developed for clustered multiple binary, unordered categorical and joint discrete and continuous outcomes. A Markov chain Monte Carlo sampling algorithm is described for estimating the posterior distributions of the parameters and latent variables. Because of the flexibility of the modelling framework and estimation procedure, extensions to ordered categorical outcomes and more complex data structures are straightforward. The methods are illustrated by using data from a reproductive toxicity study.

12.
Summary: A simple procedure for numerical solution of the likelihood equations for estimating the regression parameters of a first-order response surface model for the treatment parameters of mixture paired comparison experiments is developed. It is demonstrated that, for defined rotatable designs, those regression parameters are simple functions of the main effect parameters of a corresponding factorial model with no interactions. The maximum likelihood estimators of those main effect parameters, and hence of their corresponding regression parameters, are obtained through using procedures of treatment contrasts, factorial and iterations. A numerical example is given to illustrate applications of the procedures developed in this paper.

13.
In recent years, regression models have been shown to be useful for predicting the long-term survival probabilities of patients in clinical trials. The importance of a regression model is that, once the regression parameters are estimated, information about the regressed quantity is immediate. A simple estimator is proposed for the regression parameters in a model for the long-term survival rate. The proposed estimator is seen to arise from an estimating function that has the missing information principle underlying its construction. When the covariate takes values in a finite set, the proposed estimating function is equivalent to an ad hoc estimating function proposed in the literature. However, in general, the two estimating functions lead to different estimators of the regression parameter. For discrete covariates, the asymptotic covariance matrix of the proposed estimator is simple to calculate using standard techniques involving the predictable covariation process of martingale transforms. An ad hoc extension to the case of a one-dimensional continuous covariate is proposed. Simplicity and generalizability are two attractive features of the proposed approach. The last-mentioned feature is not enjoyed by the other estimator.

14.
The property of identifiability is an important consideration in estimating the parameters in a mixture of distributions. Also, classification of a random variable based on a mixture can be meaningfully discussed only if the class of all finite mixtures is identifiable. The problem of identifiability of finite mixtures of Gompertz distributions is studied. A procedure is presented for finding maximum likelihood estimates of the parameters of a mixture of two Gompertz distributions, using classified and unclassified observations. Based on small sample sizes, estimation of a nonlinear discriminant function is considered. Through simulation experiments, the performance of the corresponding estimated nonlinear discriminant function is investigated.

15.
The estimation of the parameters of the log-normal distribution based on complete and censored samples has been considered in the literature. In this article, the problem of estimating the parameters of a log-normal mixture model is considered. The Expectation Maximization algorithm is used to obtain maximum likelihood estimators for the parameters, as the likelihood equations do not yield closed-form expressions. The standard errors of the estimates are obtained. The methodology developed here is then illustrated through simulation studies. A confidence interval based on large-sample theory is obtained.
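A minimal sketch of how EM can be applied to a log-normal mixture is given below. It is not the article's procedure: it assumes complete (uncensored) data, relies on the fact that the logarithm of a log-normal variable is normal, and runs the textbook Gaussian-mixture EM updates on the log-data. The function name `fit_lognormal_mixture`, the block-based initialisation and the variance floor are all illustrative assumptions.

```python
import math


def fit_lognormal_mixture(x, k=2, n_iter=200, tol=1e-8):
    """EM for a k-component log-normal mixture, run on y = log(x).

    Returns (weights, mu, sigma), where mu and sigma are the log-scale
    parameters of each component. Hypothetical helper for illustration.
    """
    y = sorted(math.log(v) for v in x)
    n = len(y)

    # Crude initialisation (an assumption): split the sorted log-data
    # into k equal-sized blocks and use each block's mean and variance.
    blocks = [y[i * n // k:(i + 1) * n // k] for i in range(k)]
    w = [1.0 / k] * k
    mu = [sum(b) / len(b) for b in blocks]
    var = [max(sum((v - m) ** 2 for v in b) / len(b), 1e-6)
           for b, m in zip(blocks, mu)]

    prev_loglik = -math.inf
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        loglik = 0.0
        for v in y:
            dens = [
                w[j] * math.exp(-(v - mu[j]) ** 2 / (2.0 * var[j]))
                / math.sqrt(2.0 * math.pi * var[j])
                for j in range(k)
            ]
            total = max(sum(dens), 1e-300)  # guard against log(0)
            loglik += math.log(total)
            resp.append([d / total for d in dens])

        # M-step: weighted updates of weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            w[j] = nj / n
            mu[j] = sum(r[j] * v for r, v in zip(resp, y)) / nj
            var[j] = max(
                sum(r[j] * (v - mu[j]) ** 2 for r, v in zip(resp, y)) / nj,
                1e-6,  # variance floor (an assumption) to avoid collapse
            )

        # Stop when the log-likelihood of the log-data stops improving.
        if abs(loglik - prev_loglik) < tol:
            break
        prev_loglik = loglik

    return w, mu, [math.sqrt(s) for s in var]
```

Calling `fit_lognormal_mixture(data, k=2)` on a list of positive observations returns the estimated mixing weights and the log-scale means and standard deviations. As other abstracts in this list note, EM results depend on the starting values, so in practice several initialisations would normally be compared.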

16.
Two families of closed-form estimators are proposed for estimating the single parameter of the log-series distribution (LSD) and for estimating the two parameters of a generalization of the LSD (GLSD) presented by Tripathi and Gupta (1985). These families are based on the recurrence relations obtained from these distributions, are of closed form, and have very high asymptotic relative efficiencies. Some two-stage procedures are suggested.

17.
In this paper, we present growth curve models with an auxiliary variable which contains an uncertain data distribution based on mixtures of standard components, such as normal distributions. The multimodality of the auxiliary random variable motivates and necessitates the use of mixtures of normal distributions in our model. We have observed that Dirichlet process priors, composed of discrete and continuous components, are appropriate for addressing the two problems of determining the number of components and estimating the parameters simultaneously, and are especially useful in the aforementioned multimodal scenario. A model for the application of the Dirichlet mixture of normals (DMN) in growth curve models under a Bayesian formulation is presented, and algorithms for computing the number of components and for estimating the parameters are also provided. The simulation results show that our model gives improved goodness-of-fit statistics over models without DMN, and that the estimates of the number of components and of the parameters are reasonably accurate.

18.
Conformance proportions are important numerical indices for quality assessments. When the population is characterized by a normal mixture model, estimating conformance proportions can be a practical issue. To account for the inherent structure of normal mixture models, universal and individual conformance proportions are first defined for the purpose of evaluating the overall population and specific subpopulations of interest, respectively. On the basis of generalized fiducial quantities, a systematic method is then proposed in this paper to obtain confidence limits for the two classes of conformance proportions. The simulation results demonstrate that the proposed method can maintain the empirical coverage rate sufficiently close to the nominal level. In addition, two examples are given to illustrate the proposed method.

19.
We investigate the problem of estimating the association between two related survival variables when they follow a copula model and bivariate left-truncated and right-censored data are available. By expressing the truncation probability as a functional of the marginal survival functions, we propose a two-stage estimation procedure for estimating the parameters of Archimedean copulas. The asymptotic properties of the proposed estimators are established. Simulation studies are conducted to investigate the finite sample properties of the proposed estimators. The proposed method is applied to bivariate RNA data.

20.
Generalized linear mixed models are widely used for describing overdispersed and correlated data. Such data arise frequently in studies involving clustered and hierarchical designs. A more flexible class of models has been developed here through the Dirichlet process mixture. An additional advantage of using such mixture models is that the observations can be grouped together on the basis of the overdispersion present in the data. This paper proposes a partial empirical Bayes method for estimating all the model parameters by adopting a version of the EM algorithm. An augmented model that helps to implement an efficient Gibbs sampling scheme, under the non‐conjugate Dirichlet process generalized linear model, generates observations from the conditional predictive distribution of unobserved random effects and provides an estimate of the average number of mixing components in the Dirichlet process mixture. A simulation study has been carried out to demonstrate the consistency of the proposed method. The approach is also applied to a study on outdoor bacteria concentration in the air and to data from 14 retrospective lung‐cancer studies.
