Found 20 similar documents (search time: 11 ms)
1.
2.
Jean-Luc Dortet-Bernadet 《Communications in Statistics - Theory and Methods》2017,46(21):10768-10787
The existence of a dimension reduction (DR) subspace is a common assumption in regression analysis when dealing with high-dimensional predictors. The estimation of such a DR subspace has received considerable attention in the past few years, the most popular method being undoubtedly sliced inverse regression. In this paper, we propose a new estimation procedure for the DR subspace by assuming that the joint distribution of the predictor and the response variables is a finite mixture of distributions. The new method is compared with some classical methods through a simulation study.
3.
A mixture model for random graphs (Cited by: 1; self-citations: 0; citations by others: 1)
The Erdős–Rényi model of a network is simple and possesses many explicit expressions for average and asymptotic properties, but it does not fit real-world networks well. The vertices of those networks are often structured in unknown classes (functionally related proteins or social communities) with different connectivity properties. The stochastic block structures model was proposed for this purpose in the context of social sciences, using a Bayesian approach. We consider the same model in a frequentist statistical framework. We give the degree distribution and the clustering coefficient associated with this model, a variational method to estimate its parameters and a model selection criterion to select the number of classes. This estimation procedure allows us to deal with large networks containing thousands of vertices. The method is used to uncover the modular structure of a network of enzymatic reactions.
4.
Angela D'Elia 《Statistical Methods and Applications》2001,10(1-3):157-174
A variance components model with a response variable depending on both fixed effects of explanatory variables and random components is specified to model longitudinal circular data, in order to study the directional behaviour of small animals such as insects, crustaceans, amphipods, etc. Unknown parameter estimators are obtained using a simulated maximum likelihood approach. Issues concerning log-likelihood variability and the related problems in the optimization algorithm are also addressed. The procedure is applied to the analysis of directional choices under full natural conditions of Talitrus saltator from Castiglione della Pescaia (Italy) beaches.
5.
G. K. Kanji 《Journal of applied statistics》1985,12(1):49-58
Wind shear is known to be an important factor affecting the safety of aircraft during the take-off and landing periods. Records of its measurement obtained from a civil airline are presented and discussed. A mixture model with a variable proportionality constant is used to describe these data. The method of minimum chi-squared is used to estimate the mixture proportionality constant, and the suitability of the model is considered.
6.
In this paper, we propose a model for image segmentation based on a finite mixture of Gaussian distributions. For each pixel of the image, prior probabilities of class memberships are specified through a Gibbs distribution, where association between labels of adjacent pixels is modeled by a class-specific term allowing for different interaction strengths across classes. We show how model parameters can be estimated in a maximum likelihood framework using Mean Field theory. Experimental performance on perturbed phantom and on real benchmark images shows that the proposed method performs well in a wide variety of empirical situations.
7.
B. J. T. Morgan M. S. Ridout 《Journal of the Royal Statistical Society. Series C, Applied statistics》2008,57(4):433-446
Summary. We propose a mixture of binomial and beta–binomial distributions for estimating the size of closed populations. The new mixture model is applied to several real capture–recapture data sets and is shown to provide a convenient, objective framework for model selection. The new model is compared with three alternative models in a simulation study, and the results shed light on the general performance of models in this area. The new model provides a robust, flexible analysis, which automatically deals with small capture probabilities.
8.
This paper introduces a significant extension of the flexible Dirichlet (FD) distribution, which is a quite tractable special mixture model for compositional data, i.e. data representing vectors of proportions of a whole. The FD model displays several theoretical properties which make it suitable for inference, and fairly easy to handle from a computational viewpoint. However, the rigid type of mixture structure implied by the FD makes it unsuitable to describe many compositional datasets. Furthermore, the FD only allows for negative correlations. The new extended model, by considerably relaxing the strict constraints among clusters entailed by the FD, allows for a more general dependence structure (including positive correlations) and greatly expands its applicative potential. At the same time, it retains, to a large extent, its good properties. EM-type estimation procedures can be developed for this more complex model, including ad hoc reliable initialization methods, which keep the computational issues at a fairly uncomplicated level. Accurate evaluation of standard error estimates can be provided as well.
9.
A Bayesian mixture model for differential gene expression (Cited by: 3; self-citations: 0; citations by others: 3)
Kim-Anh Do Peter Müller Feng Tang 《Journal of the Royal Statistical Society. Series C, Applied statistics》2005,54(3):627-644
Summary. We propose model-based inference for differential gene expression, using a nonparametric Bayesian probability model for the distribution of gene intensities under various conditions. The probability model is a mixture of normal distributions. The resulting inference is similar to a popular empirical Bayes approach that is used for the same inference problem. The use of fully model-based inference mitigates some of the necessary limitations of the empirical Bayes method. We argue that inference is no more difficult than posterior simulation in traditional nonparametric mixture-of-normal models. The approach proposed is motivated by a microarray experiment that was carried out to identify genes that are differentially expressed between normal tissue and colon cancer tissue samples. Additionally, we carried out a small simulation study to verify the methods proposed. In the motivating case-studies we show how the nonparametric Bayes approach facilitates the evaluation of posterior expected false discovery rates. We also show how inference can proceed even in the absence of a null sample of known non-differentially expressed scores. This highlights the difference from alternative empirical Bayes approaches that are based on plug-in estimates.
10.
Getachew A. Dagne 《Journal of applied statistics》2016,43(7):1174-1185
This paper presents an alternative approach to modeling longitudinal data in which a lower detection limit (LOD) and unobserved population heterogeneity exist. Longitudinal data on viral loads in HIV/AIDS studies, for instance, show strong positive skewness and left-censoring. Normalizing such data using a logarithmic transformation seems to be unsuccessful. An alternative to such a transformation is to use a finite mixture model, which is suitable for analyzing data that have skewed or multi-modal distributions. Little work has been done to take these features of longitudinal data into account simultaneously. This paper develops a growth mixture Tobit model that deals with an LOD and heterogeneity among growth trajectories. The proposed methods are illustrated using simulated and real data from an AIDS clinical study.
11.
Conglian Yu 《Communications in Statistics - Theory and Methods》2020,49(18):4347-4366
In this article, we propose a new penalized-likelihood method to conduct model selection for finite mixture of regression models. The penalties are imposed on mixing proportions and regression coefficients, and hence order selection of the mixture and variable selection in each component can be conducted simultaneously. The consistency of order selection and the consistency of variable selection are investigated. A modified EM algorithm is proposed to maximize the penalized log-likelihood function. Numerical simulations are conducted to demonstrate the finite sample performance of the estimation procedure. The proposed methodology is further illustrated via real data analysis.
12.
We describe a selection model for multivariate counts, where association between the primary outcomes and the endogenous selection source is modeled through outcome-specific latent effects which are assumed to be dependent across equations. Parametric specifications of this model already exist in the literature; in this paper, we show how model parameters can be estimated in a finite mixture context. This approach helps us to consider overdispersed counts, while allowing for multivariate association and endogeneity of the selection variable. In this context, attention is focused both on bias in estimated effects when exogeneity of the selection (treatment) variable is assumed, and on consistent estimation of the association between the random effects in the primary and in the treatment effect models, when the latter is assumed endogenous. The model behavior is investigated through a large-scale simulation experiment. An empirical example on health care utilization data is provided.
13.
Dongbing Lai Huiping Xu Daniel Koller Tatiana Foroud 《Journal of applied statistics》2016,43(14):2503-2523
Dementia patients exhibit considerable heterogeneity in individual trajectories of cognitive decline, with some patients showing rapid decline following diagnosis while others exhibit slower decline or remain stable for several years. Dementia studies often collect longitudinal measures of multiple neuropsychological tests aimed at measuring patients’ decline across a number of cognitive domains. We propose a multivariate finite mixture latent trajectory model to identify distinct longitudinal patterns of cognitive decline simultaneously in multiple cognitive domains, each of which is measured by multiple neuropsychological tests. The EM algorithm is used for parameter estimation and posterior probabilities are used to predict latent class membership. We present results of a simulation study demonstrating adequate performance of our proposed approach and apply our model to the Uniform Data Set from the National Alzheimer's Coordinating Center to identify cognitive decline patterns among dementia patients.
14.
Neosporosis is a bovine disease caused by the parasite Neospora caninum. It is not yet sufficiently studied, and it is supposed to cause an important number of abortions. Its clinical symptoms do not yet allow the reliable identification of infected animals. Its study and treatment would improve if a test based on antibody counts were available. Knowing the distribution functions of observed counts of uninfected and infected cows would allow the determination of a cutoff value. These distributions cannot be estimated directly. This paper deals with the indirect estimation of these distributions based on a data set consisting of the antibody counts for some 200 pairs of cows and their calves. The desired distributions are estimated through a mixture model based on simple assumptions that describe the relationship between each cow and its calf. The model then allows the estimation of the cutoff value and of the error probabilities.
15.
Social network data represent the interactions between a group of social actors. Interactions between colleagues and friendship networks are typical examples of such data. The latent space model for social network data locates each actor in a network in a latent (social) space and models the probability of an interaction between two actors as a function of their locations. The latent position cluster model extends the latent space model to deal with network data in which clusters of actors exist: actor locations are drawn from a finite mixture model, each component of which represents a cluster of actors. A mixture of experts model builds on the structure of a mixture model by taking account of both observations and associated covariates when modeling a heterogeneous population. Herein, a mixture of experts extension of the latent position cluster model is developed. The mixture of experts framework allows covariates to enter the latent position cluster model in a number of ways, yielding different model interpretations. Estimates of the model parameters are derived in a Bayesian framework using a Markov Chain Monte Carlo algorithm. The algorithm is generally computationally expensive; surrogate proposal distributions which shadow the target distributions are derived, reducing the computational burden. The methodology is demonstrated through an illustrative example detailing relationships between a group of lawyers in the USA.
16.
The Health and Retirement Study (HRS) is funded by the US National Institute on Aging with the aim of investigating the health, social and economic implications of the aging of the American population. The participants of the study receive a thorough in-home clinical and neuropsychological assessment leading to a diagnosis of normal, cognitive impairment but not demented, or dementia. Because the participants are heterogeneous across these three classes, we analyze some overall cognitive functioning responses through a factor mixture analysis model. The model extends recent proposals developed for binary and continuous data to general mixed data and to the situation of observed heterogeneity, typical of the HRS.
17.
In this note, we focus on estimating the false discovery rate (FDR) of a multiple testing method with a common, non-random rejection threshold under a mixture model. We develop a new class of estimates of the FDR and prove that it is less conservatively biased than what is traditionally used. Numerical evidence is presented to show that the mean squared error (MSE) is also often smaller for the present class of estimates, especially in small-scale multiple testing. A similar class of estimates of the positive false discovery rate (pFDR), less conservatively biased than what is usually used, is then proposed. When modified using our estimate of the pFDR and applied to a gene-expression data set, Storey's q-value method identifies a few more significant genes than his original q-value method at certain thresholds. The BH-like method developed by thresholding our estimate of the FDR is shown to control the FDR in situations where the p-values have the same dependence structure as required by the BH method and, for lack of information about the proportion π0 of true null hypotheses, it is reasonable to assume that π0 is uniformly distributed over (0,1).
18.
Application of the characterization theory to the mixture model (Cited by: 1; self-citations: 0; citations by others: 1)
This paper indicates the potential application of the characterization theory to the mixture model. It discusses a particular example of a mixture model and concludes that successful application of this theory can be made in order to choose the appropriate mixture model and the mixing parameter.
19.
Experiments that involve the blending of several components are known as mixture experiments. In some mixture experiments, the response depends not only on the proportion of the mixture components, but also on the processing conditions. A new combined model is proposed which is based on Taylor series approximation and is intended to be a compromise between standard mixture models and standard response surface models. Cost and/or time constraints often limit the size of industrial experiments. With this in mind, we present a new class of designs that will accommodate the fitting of the new combined model.
20.
A partially adaptive estimator for the censored regression model based on a mixture of normal distributions (Cited by: 1; self-citations: 0; citations by others: 1)
Steven B. Caudill 《Statistical Methods and Applications》2012,21(2):121-137
The goal of this paper is to introduce a partially adaptive estimator for the censored regression model based on an error structure described by a mixture of two normal distributions. The model we introduce is easily estimated by maximum likelihood using an EM algorithm adapted from the work of Bartolucci and Scaccia (Comput Stat Data Anal 48:821–834, 2005). A Monte Carlo study is conducted to compare the small sample properties of this estimator to the performance of some common alternative estimators of censored regression models including the usual tobit model, the CLAD estimator of Powell (J Econom 25:303–325, 1984), and the STLS estimator of Powell (Econometrica 54:1435–1460, 1986). In terms of RMSE, our partially adaptive estimator performed well. The partially adaptive estimator is applied to data on wife’s hours worked from Mroz (1987). In this application we find support for the partially adaptive estimator over the usual tobit model.