Similar articles
 20 similar articles found (search time: 562 ms)
1.
In this paper, we propose a model with a Dirichlet process mixture of gamma densities in the bulk part below the threshold and a generalized Pareto density in the tail for extreme value estimation. The proposed model is simple and flexible for posterior density estimation and posterior inference for high quantiles. The model works well even for small sample sizes and in the absence of prior information. We evaluate the performance of the proposed model through a simulation study. Finally, the proposed model is applied to a real environmental data set.

2.
We propose a semiparametric modeling approach for mixtures of symmetric distributions. The mixture model is built from a common symmetric density with different components arising through different location parameters. This structure ensures identifiability for mixture components, which is a key feature of the model as it allows applications to settings where primary interest is inference for the subpopulations comprising the mixture. We focus on the two-component mixture setting and develop a Bayesian model using parametric priors for the location parameters and for the mixture proportion, and a nonparametric prior probability model, based on Dirichlet process mixtures, for the random symmetric density. We present an approach to inference using Markov chain Monte Carlo posterior simulation. The performance of the model is studied with a simulation experiment and through analysis of a rainfall precipitation data set as well as with data on eruptions of the Old Faithful geyser.

3.
Summary. We propose a class of semiparametric functional regression models to describe the influence of vector-valued covariates on a sample of response curves. Each observed curve is viewed as the realization of a random process, composed of an overall mean function and random components. The finite dimensional covariates influence the random components of the eigenfunction expansion through single-index models that include unknown smooth link and variance functions. The parametric components of the single-index models are estimated via quasi-score estimating equations with link and variance functions being estimated nonparametrically. We obtain several basic asymptotic results. The functional regression models proposed are illustrated with the analysis of a data set consisting of egg laying curves for 1000 female Mediterranean fruit-flies (medflies).

4.
Summary. The evaluation of the performance of a continuous diagnostic measure is a commonly encountered task in medical research. We develop Bayesian non-parametric models that use Dirichlet process mixtures and mixtures of Polya trees for the analysis of continuous serologic data. The modelling approach differs from traditional approaches to the analysis of receiver operating characteristic curve data in that it incorporates a stochastic ordering constraint for the distributions of serologic values for the infected and non-infected populations. Biologically such a constraint is virtually always feasible because serologic values from infected individuals tend to be higher than those for non-infected individuals. The models proposed provide data-driven inferences for the infected and non-infected population distributions, and for the receiver operating characteristic curve and corresponding area under the curve. We illustrate and compare the predictive performance of the Dirichlet process mixture and mixture of Polya trees approaches by using serologic data for Johne's disease in dairy cattle.

5.
Abstract. We propose a Bayesian semiparametric methodology for quantile regression modelling. In particular, working with parametric quantile regression functions, we develop Dirichlet process mixture models for the error distribution in an additive quantile regression formulation. The proposed non‐parametric prior probability models allow the shape of the error density to adapt to the data and thus provide more reliable predictive inference than models based on parametric error distributions. We consider extensions to quantile regression for data sets that include censored observations. Moreover, we employ dependent Dirichlet processes to develop quantile regression models that allow the error distribution to change non‐parametrically with the covariates. Posterior inference is implemented using Markov chain Monte Carlo methods. We assess and compare the performance of our models using both simulated and real data sets.

6.
Shi, Wang, Murray-Smith and Titterington (Biometrics 63:714–723, 2007) proposed a Gaussian process functional regression (GPFR) model to model functional response curves with a set of functional covariates. Two main problems are addressed by their method: modelling nonlinear and nonparametric regression relationships, and modelling the covariance structure and mean structure simultaneously. The method gives very good results for curve fitting and prediction but side-steps the problem of heterogeneity. In this paper we present a new method for modelling functional data with ‘spatially’ indexed data, i.e., the heterogeneity is dependent on factors such as region and individual patient’s information. For data collected from different sources, we assume that the data corresponding to each curve (or batch) follow a Gaussian process functional regression model as a lower-level model, and introduce an allocation model for the latent indicator variables as a higher-level model. This higher-level model is dependent on the information related to each batch. This method takes advantage of both GPFR and mixture models and therefore improves the accuracy of predictions. The mixture model has also been used for curve clustering, but with a focus on the problem of clustering functional relationships between response curve and covariates, i.e. the clustering is based on the surface shape of the functional response against the set of functional covariates. The model is examined on simulated data and real data.

7.
Using the ‘grouping vector’ notion and employing a Dirichlet prior for the unknown mixing parameters, viz. the unknown mixing proportions, the Bayes estimates of the mixing proportions in finite mixtures of known distributions are obtained. These estimates are based on the optimal grouping of the sample data. An algorithm is proposed to obtain the optimal grouping of the sample observations when the component densities belong to the family of densities possessing the monotone likelihood ratio property. A numerical study is carried out for the case of mixtures of two normal densities.

8.
We consider Dirichlet process mixture models in which the observed clusters in any particular dataset are not viewed as belonging to a finite set of possible clusters but rather as representatives of a latent structure in which objects belong to one of a potentially infinite number of clusters. As more information is revealed the number of inferred clusters is allowed to grow. The precision parameter of the Dirichlet process is a crucial parameter that controls the number of clusters. We develop a framework for the specification of the hyperparameters associated with the prior for the precision parameter that can be used in either the presence or the absence of subjective prior information about the level of clustering. Our approach is illustrated in an analysis of clustering brands at the magazine Which?. The results are compared with the approach of Dorazio (2009) via a simulation study.
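
The role of the precision parameter described in this abstract can be illustrated with a small sketch. It is not the framework of the abstract's authors; it only uses the standard result that, under a Dirichlet process with precision alpha, the prior expected number of clusters among n observations is the sum of alpha / (alpha + i) for i = 0, ..., n - 1, together with a Chinese restaurant process simulation. The function names `expected_clusters` and `simulate_crp` are hypothetical and chosen here for illustration.

```python
import random


def expected_clusters(alpha: float, n: int) -> float:
    """Prior expected number of clusters among n draws from DP(alpha).

    Standard identity: E[K_n] = sum_{i=0}^{n-1} alpha / (alpha + i).
    """
    return sum(alpha / (alpha + i) for i in range(n))


def simulate_crp(alpha: float, n: int, rng: random.Random) -> list[int]:
    """Simulate cluster sizes from a Chinese restaurant process.

    Observation i (0-based) starts a new cluster with probability
    alpha / (alpha + i); otherwise it joins an existing cluster with
    probability proportional to that cluster's current size.
    """
    counts: list[int] = []
    for i in range(n):
        u = rng.random() * (alpha + i)
        if u < alpha:
            counts.append(1)  # open a new cluster
            continue
        u -= alpha
        for k, size in enumerate(counts):
            if u < size:
                counts[k] += 1
                break
            u -= size
        else:
            counts[-1] += 1  # guard against floating-point round-off
    return counts


if __name__ == "__main__":
    rng = random.Random(0)
    for alpha in (0.5, 1.0, 5.0):
        sizes = simulate_crp(alpha, 200, rng)
        print(alpha, round(expected_clusters(alpha, 200), 2), len(sizes))
```

Larger values of alpha give more clusters on average, which is why the choice of prior for the precision parameter matters when little is known about the level of clustering.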

9.
Although Bayesian nonparametric mixture models for continuous data are well developed, there is a limited literature on related approaches for count data. A common strategy is to use a mixture of Poissons, which unfortunately is quite restrictive in not accounting for distributions having variance less than the mean. Other approaches include mixing multinomials, which requires finite support, and using a Dirichlet process prior with a Poisson base measure, which does not allow smooth deviations from the Poisson. As a broad class of alternative models, we propose to use nonparametric mixtures of rounded continuous kernels. An efficient Gibbs sampler is developed for posterior computation, and a simulation study is performed to assess performance. Focusing on the rounded Gaussian case, we generalize the modeling framework to account for multivariate count data, joint modeling with continuous and categorical variables, and other complications. The methods are illustrated through applications to a developmental toxicity study and marketing data. This article has supplementary material online.

10.
In this paper, we propose a mixture of beta–Dirichlet processes as a nonparametric prior for the cumulative intensity functions of a Markov process. This family of priors is a natural extension of a mixture of Dirichlet processes or a mixture of beta processes, which are devised to combine the advantages of parametric and nonparametric approaches. They give most of their prior mass to a small neighborhood of a specific parametric model. We show that a mixture of beta–Dirichlet processes prior is conjugate with Markov processes. Formulas for computing the posterior distribution are derived. Finally, results of analyzing credit history data are given.

11.
We develop clustering procedures for longitudinal trajectories based on a continuous-time hidden Markov model (CTHMM) and a generalized linear observation model. Specifically, in this article we carry out finite and infinite mixture model-based clustering for a CTHMM and achieve inference using Markov chain Monte Carlo (MCMC). For a finite mixture model with a prior on the number of components, we implement reversible-jump MCMC to facilitate the trans-dimensional move between models with different numbers of clusters. For a Dirichlet process mixture model, we utilize restricted Gibbs sampling split–merge proposals to improve the performance of the MCMC algorithm. We apply our proposed algorithms to simulated data as well as a real-data example, and the results demonstrate the desired performance of the new sampler.

12.
We propose a method for the analysis of a spatial point pattern, which is assumed to arise as a set of observations from a spatial nonhomogeneous Poisson process. The spatial point pattern is observed in a bounded region, which, for most applications, is taken to be a rectangle in the space where the process is defined. The method is based on modeling a density function, defined on this bounded region, that is directly related to the intensity function of the Poisson process. We develop a flexible nonparametric mixture model for this density using a bivariate Beta distribution for the mixture kernel and a Dirichlet process prior for the mixing distribution. Using posterior simulation methods, we obtain full inference for the intensity function and any other functional of the process that might be of interest. We discuss applications to problems where inference for clustering in the spatial point pattern is of interest. Moreover, we consider applications of the methodology to extreme value analysis problems. We illustrate the modeling approach with three previously published data sets. Two of the data sets are from forestry and consist of locations of trees. The third data set consists of extremes from the Dow Jones index over a period of 1303 days.

13.
The purpose of this paper is to present a nonparametric Bayesian procedure for estimating a survival curve in a double censoring situation. Assuming a proportional hazard rates model, we propose a consistent estimator of lifetime, based on Dirichlet process prior knowledge on the observable random vector. Some large sample properties of this estimator are also derived. We prove strong consistency and asymptotic weak convergence to a Gaussian process. Finally, a simulation study is presented in order to analyze the behavior of the proposed estimator and to establish some comparisons with other estimators.

14.
Shi, Yushu; Laud, Purushottam; Neuner, Joan. Lifetime Data Analysis, 2021, 27(1): 156-176

In this paper, we first propose a dependent Dirichlet process (DDP) model using a mixture of Weibull models with each mixture component resembling a Cox model for survival data. We then build a Dirichlet process mixture model for competing risks data without regression covariates. Next we extend this model to a DDP model for competing risks regression data by using a multiplicative covariate effect on subdistribution hazards in the mixture components. Though built on proportional hazards (or subdistribution hazards) models, the proposed nonparametric Bayesian regression models do not require the assumption of constant hazard (or subdistribution hazard) ratio. An external time-dependent covariate is also considered in the survival model. After describing the model, we discuss how both cause-specific and subdistribution hazard ratios can be estimated from the same nonparametric Bayesian model for competing risks regression. For use with the regression models proposed, we introduce an omnibus prior that is suitable when little external information is available about covariate effects. Finally we compare the models’ performance with existing methods through simulations. We also illustrate the proposed competing risks regression model with data from a breast cancer study. An R package “DPWeibull” implementing all of the proposed methods is available at CRAN.


15.
In this paper, we propose a penalized likelihood method to simultaneously select covariates and mixing components and obtain parameter estimates in localized mixture of experts models. We develop an expectation maximization algorithm to solve the proposed penalized likelihood procedure, and introduce a data-driven procedure to select the tuning parameters. Extensive numerical studies are carried out to compare the finite sample performance of our proposed method with that of other existing methods. Finally, we apply the proposed methodology to analyze the Boston housing price data set and the baseball salaries data set.

16.
We propose data generating structures which can be represented as nonlinear autoregressive models with single and finite mixtures of scale mixtures of skew normal innovations. This class of models covers symmetric/asymmetric and light/heavy-tailed distributions, and so provides a useful generalization of the symmetrical nonlinear autoregressive models. As semiparametric and nonparametric curve estimation are the approaches for exploring the structure of a nonlinear time series data set, in this article the semiparametric estimator for the nonlinear function of the model is investigated based on the conditional least squares method and a nonparametric kernel approach. Also, an Expectation–Maximization-type algorithm to perform maximum likelihood (ML) inference for the unknown parameters of the model is proposed. Furthermore, some strong and weak consistency results for the semiparametric estimator in this class of models are presented. Finally, to illustrate the usefulness of the proposed model, some simulation studies and an application to a real data set are considered.

17.
Summary. A Bayesian non-parametric methodology has been recently proposed to deal with the issue of prediction within species sampling problems. Such problems concern the evaluation, conditional on a sample of size n, of the species variety featured by an additional sample of size m. Genomic applications pose the additional challenge of having to deal with large values of both n and m. In such a case the computation of the Bayesian non-parametric estimators is cumbersome and prevents their implementation. We focus on the two-parameter Poisson–Dirichlet model and provide completely explicit expressions for the corresponding estimators, which can be easily evaluated for any sizes of n and m. We also study the asymptotic behaviour of the number of new species conditionally on the observed sample: such an asymptotic result, combined with a suitable simulation scheme, allows us to derive asymptotic highest posterior density intervals for the estimates of interest. Finally, we illustrate the implementation of the proposed methodology by the analysis of five expressed sequence tags data sets.

18.
We will pursue a Bayesian nonparametric approach in the hierarchical mixture modelling of lifetime data in two situations: density estimation, when the distribution is a mixture of parametric densities with a nonparametric mixing measure, and accelerated failure time (AFT) regression modelling, when the same type of mixture is used for the distribution of the error term. The Dirichlet process is a popular choice for the mixing measure, yielding a Dirichlet process mixture model for the error; as an alternative, we also allow the mixing measure to be equal to a normalized inverse-Gaussian prior, built from normalized inverse-Gaussian finite dimensional distributions, as recently proposed in the literature. Markov chain Monte Carlo techniques will be used to estimate the predictive distribution of the survival time, along with the posterior distribution of the regression parameters. A comparison between the two models will be carried out on the grounds of their predictive power and their ability to identify the number of components in a given mixture density.

19.
Summary. The primary goal of multivariate statistical process performance monitoring is to identify deviations from normal operation within a manufacturing process. The basis of the monitoring schemes is historical data that have been collected when the process is running under normal operating conditions. These data are then used to establish confidence bounds to detect the onset of process deviations. In contrast with the traditional approaches that are based on the Gaussian assumption, this paper proposes the application of the infinite Gaussian mixture model (GMM) for the calculation of the confidence bounds, thereby relaxing the previous restrictive assumption. The infinite GMM is a special case of Dirichlet process mixtures and is introduced as the limit of the finite GMM, i.e. when the number of mixture components tends to ∞. On the basis of the estimation of the probability density function, via the infinite GMM, the confidence bounds are calculated by using the bootstrap algorithm. The methodology proposed is demonstrated through its application to a simulated continuous chemical process and a batch semiconductor manufacturing process.

20.
A common assumption in fitting panel data models is normality of stochastic subject effects. This can be extremely restrictive, obscuring most potential features of the true distributions. The objective of this article is to propose a modeling strategy, from a semi-parametric Bayesian perspective, to specify a flexible distribution for the random effects in dynamic panel data models. This is addressed here by assuming a Dirichlet process mixture model, which introduces a Dirichlet process prior for the random-effects distribution. We address the role of initial conditions in dynamic processes, emphasizing joint modeling of start-up and subsequent responses. We adopt Gibbs sampling techniques to approximate posterior estimates. These topics are illustrated by a simulation study and also by testing hypothetical models in two empirical contexts drawn from economic studies. We use modified versions of information criteria to compare the fitted models.
