Similar Articles (20 results)
1.
In this paper, we study the change-point inference problem motivated by genomic data collected to monitor DNA copy number changes. DNA copy number changes, or copy number variations (CNVs), correspond to chromosomal aberrations and signify abnormality of a cell. Cancer development and other related diseases are often associated with DNA copy number changes on the genome. Such data carry inherent random noise, so an appropriate statistical model is needed to identify statistically significant DNA copy number changes. This type of statistical inference is crucial in cancer research, clinical diagnostic applications, and related genomic studies. For the high-throughput genomic data resulting from DNA copy number experiments, a mean and variance change-point model (MVCM) is appropriate for detecting CNVs. We propose a Bayesian approach to the MVCM for the case of a single change and use a sliding window to search for all CNVs on a given chromosome. We carry out simulation studies to evaluate the estimate of the locus of the DNA copy number change using the derived posterior probability. These simulation results show that the approach is suitable for identifying copy number changes. The approach is also illustrated on several chromosomes from nine fibroblast cancer cell lines (array-based comparative genomic hybridization data). On these cell lines, our approach detects all DNA copy number aberrations that have been identified and verified by karyotyping.
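A minimal sketch of the single-change idea, in Python: this is not the paper's Bayesian derivation, but it scores each candidate split of a Gaussian sequence by its segment-wise profile log-likelihood (a flat-prior stand-in for the posterior over the change location) and normalizes the scores into weights, as a sliding-window scan would do within each window. All names are illustrative.

```python
# Score every candidate change location in a Gaussian mean-and-variance
# change model (MVCM) and normalize into pseudo-posterior weights.
import numpy as np

def changepoint_weights(x, min_seg=3):
    n = len(x)
    ks = np.arange(min_seg, n - min_seg + 1)   # candidate left-segment lengths
    ll = np.full(ks.shape, -np.inf)
    for i, k in enumerate(ks):
        left, right = x[:k], x[k:]
        # Gaussian profile log-likelihood with segment-wise MLE mean/variance
        ll[i] = -0.5 * (k * np.log(left.var()) + (n - k) * np.log(right.var()))
    w = np.exp(ll - ll.max())
    return ks, w / w.sum()

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 60), rng.normal(1.5, 2, 40)])
ks, w = changepoint_weights(x)
print("most probable change location:", ks[np.argmax(w)])
```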

2.
We propose a new iterative algorithm, called the model walking algorithm, for Bayesian model averaging over longitudinal regression models with AR(1) random errors within subjects. The Markov chain Monte Carlo method is employed together with the model walking algorithm. The proposed method is successfully applied to predicting progression rates in a myopia intervention trial in children.

3.
The generalized extreme value (GEV) distribution arises as the limiting distribution of block maxima of size n and is therefore used to model extreme events. However, extreme data may contain an excessive number of zeros, which makes such events difficult to analyze and estimate with the usual GEV distribution. The zero-inflated distribution (ZID) is widely used in the literature to model data with inflated zeros through an inflation parameter w. The present work develops a new approach for analyzing zero-inflated extreme values, applied to monthly maximum precipitation data in which months without precipitation are recorded as zeros. Inference is carried out under the Bayesian paradigm, with parameter estimation based on numerical approximation of the posterior distribution using Markov chain Monte Carlo (MCMC) methods. Time series from several cities in the northeastern region of Brazil were analyzed, some with a predominance of non-rainy months. The results show that this approach yields more accurate estimates, with better goodness-of-fit measures, than the standard extreme value distribution.
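As a hedged sketch of the mixture described above, the following Python evaluates a zero-inflated GEV log-likelihood: with probability w an observation is a structural zero, otherwise it follows GEV(mu, sigma, xi). An MCMC sampler like the one the paper uses would call such a function at every iteration; the parameter values and data below are made up.

```python
# Zero-inflated GEV log-likelihood: point mass at zero plus a GEV density.
import numpy as np
from scipy.stats import genextreme

def zigev_loglik(x, w, mu, sigma, xi):
    x = np.asarray(x, dtype=float)
    zero = (x == 0)
    # scipy's shape c corresponds to -xi in the usual GEV convention
    dens = genextreme.logpdf(x[~zero], c=-xi, loc=mu, scale=sigma)
    return zero.sum() * np.log(w) + (~zero).sum() * np.log1p(-w) + dens.sum()

x = [0.0, 0.0, 85.3, 120.1, 0.0, 64.7, 210.4]   # monthly maxima; zeros = dry months
print(zigev_loglik(x, w=0.4, mu=80, sigma=40, xi=0.1))
```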

4.
This paper presents a comprehensive review and comparison of five computational methods for Bayesian model selection, based on MCMC simulations from posterior model parameter distributions. We apply these methods to a well-known and important class of models in financial time series analysis, namely GARCH and GARCH-t models for conditional return distributions (assuming normal and t-distributions, respectively). We compare their performance with the more common maximum-likelihood-based model selection on simulated and real market data. All five MCMC methods proved reliable in the simulation study, although they differ in their computational demands. Results on simulated data also show that for large degrees of freedom (where the t-distribution approaches a normal one), Bayesian model selection leads to better decisions in favor of the true model than maximum likelihood. Results on market data show the instability of the harmonic mean estimator and the reliability of the more advanced model selection methods.
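For concreteness, here is the harmonic mean estimator of the marginal likelihood whose instability the study reports, computed in log space from log-likelihood values at posterior draws; the toy draws below merely stand in for an MCMC run of a hypothetical GARCH fit.

```python
# Harmonic mean estimator: p(y) ~ 1 / mean(1 / L(theta_i)) over posterior draws,
# evaluated in log space for numerical stability.
import numpy as np
from scipy.special import logsumexp

def log_marginal_harmonic(loglik_draws):
    loglik_draws = np.asarray(loglik_draws)
    n = len(loglik_draws)
    # log(1 / mean(exp(-loglik))) = log n - logsumexp(-loglik)
    return np.log(n) - logsumexp(-loglik_draws)

rng = np.random.default_rng(0)
ll = -250 + rng.normal(0, 2, size=5000)   # fake log-likelihoods at MCMC draws
print(log_marginal_harmonic(ll))
```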

5.
The N-mixture model proposed by Royle in 2004 may be used to estimate the abundance and detection probability of animal species in a given region. In 2006, Royle and Dorazio discussed the advantages of a Bayesian approach to modelling animal abundance and occurrence using a hierarchical N-mixture model. N-mixture models assume replication at sampling sites, an assumption that may be violated when a site is not closed to changes in abundance during the survey period or when nominal replicates are defined spatially. In this paper, we study the robustness of a Bayesian approach to fitting the N-mixture model to pseudo-replicated count data. Our simulation results show that the Bayesian estimates of abundance and detection probability are slightly biased when the actual detection probability is small and are sensitive to the presence of extra variability within local sites.
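A minimal sketch of the N-mixture likelihood for a single site, assuming T replicate counts that are Binomial(N, p) given a latent abundance N ~ Poisson(lambda), with N summed out up to a truncation point; this mirrors the model's structure rather than any published code.

```python
# N-mixture likelihood for one site: marginalize the latent abundance N.
import numpy as np
from scipy.stats import binom, poisson

def nmixture_loglik(y, lam, p, n_max=200):
    y = np.asarray(y)
    Ns = np.arange(y.max(), n_max + 1)           # N must be >= the largest count
    log_prior = poisson.logpmf(Ns, lam)
    # log P(y | N) summed over replicate surveys, for each candidate N
    log_obs = binom.logpmf(y[:, None], Ns[None, :], p).sum(axis=0)
    joint = log_prior + log_obs
    m = joint.max()
    return m + np.log(np.exp(joint - m).sum())   # log-sum-exp over N

print(nmixture_loglik(y=[12, 9, 15], lam=30.0, p=0.4))
```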

6.
In this paper, we discuss inference for the Box-Cox transformation model with left-truncated and right-censored data, which often arise, for example, in studies with a cross-sectional sampling scheme. It is well known that the Box-Cox transformation model includes many commonly used models as special cases, such as the proportional hazards model and the additive hazards model. For inference, a Bayesian estimation approach is proposed in which a piecewise function approximates the baseline hazard function. A conditional marginal prior, whose marginal part is free of any constraints, is employed to deal with the computational challenges caused by the constraints on the parameters, and an MCMC sampling procedure is developed. A simulation study assessing the finite-sample performance of the proposed method indicates that it works well in practical situations. We apply the approach to a data set arising from a retirement center.
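A hedged sketch of one building block, assuming the piecewise approximation is piecewise-constant (the abstract does not spell this out): under a piecewise-constant baseline hazard, a subject entering at truncation time a and observed to time t with event indicator d contributes d·log h(t) − (H(t) − H(a)) to the log-likelihood, where H is the cumulative hazard. The Box-Cox transformation layer and the constrained priors of the paper are omitted.

```python
# Piecewise-constant hazard likelihood for left-truncated, right-censored data.
import numpy as np

def cum_hazard(t, cuts, lam):
    # H(t) for hazard lam[j] on interval [cuts[j], cuts[j+1])
    edges = np.concatenate([cuts, [np.inf]])
    widths = np.clip(np.minimum(t, edges[1:]) - edges[:-1], 0, None)
    return np.sum(lam * widths)

def loglik(entry, time, event, cuts, lam):
    ll = 0.0
    for a, t, d in zip(entry, time, event):
        j = np.searchsorted(cuts, t, side="right") - 1   # interval containing t
        ll += d * np.log(lam[j]) - (cum_hazard(t, cuts, lam) - cum_hazard(a, cuts, lam))
    return ll

cuts = np.array([0.0, 1.0, 3.0])     # interval start points
lam = np.array([0.2, 0.5, 0.3])      # hazard on each interval
print(loglik(entry=[0.0, 0.5], time=[2.1, 4.0], event=[1, 0], cuts=cuts, lam=lam))
```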

7.
Motivated by the Singapore Longitudinal Aging Study (SLAS), we propose a Bayesian approach to estimating semiparametric varying-coefficient models for longitudinal continuous and cross-sectional binary responses. These models have proved more flexible than simple parametric regression models. Our development is a new contribution towards their Bayesian solution and eases computational complexity. We also adapt several familiar statistical strategies to address the missing-data issue in the SLAS. Our simulation results indicate that a Bayesian imputation (BI) approach performs better than complete-case (CC) and available-case (AC) approaches, especially in small-sample designs, and may provide more useful results in practice. In the real data analysis of the SLAS, the results for longitudinal outcomes from BI are similar to those from the AC analysis but differ from those of the CC analysis.

8.
In this paper we consider the impact of both missing data and measurement error on a longitudinal analysis of participation in higher education in Australia. We develop a general method for handling both discrete and continuous measurement errors that also allows the incorporation of missing values and random effects in both binary and continuous response multilevel models. Measurement errors are allowed to be mutually dependent, and their distribution may depend on further covariates. We show via two simple simulation studies that our methodology works, and we then consider the impact of our measurement error assumptions on the analysis of the real data set.

9.
This paper develops a new Bayesian approach to change-point modeling that allows the number of change-points in an observed autocorrelated time series to be unknown. The model assumes that the number of change-points has a truncated Poisson distribution. A genetic algorithm is used to estimate the change-point model, which allows for structural changes with autocorrelated errors. We pay considerable attention to the construction of the autocorrelation structure of each regime and of the parameters that characterize it. Our techniques are found to work well in simulations with a few change-points. An empirical analysis of the annual flow of the Nile River and the monthly total energy production in South Korea yields good estimates of the structural change-points.

10.
Discrete data are collected in many application areas and are often characterised by highly skewed distributions. One example, considered in this paper, is the number of visits to a specialist, often taken as a measure of demand in healthcare. A discrete Weibull regression model was recently proposed for regression problems with a discrete response and was shown to possess desirable properties. In this paper, we propose the first Bayesian implementation of this model. We consider a general parametrization in which both parameters of the discrete Weibull distribution can be conditioned on the predictors, and we show theoretically that, under a uniform non-informative prior, the posterior distribution is proper with finite moments. In addition, we consider closely the case of Laplace priors for parameter shrinkage and variable selection. Parameter estimates and their credible intervals can be readily calculated from the full posterior distribution. A simulation study and the analysis of four real datasets of medical records show promise for the wide applicability of this approach to the analysis of count data. The method is implemented in the R package BDWreg.
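A sketch of the model's structure (not the BDWreg internals): the discrete Weibull pmf P(Y = y) = q^(y^β) − q^((y+1)^β), with both parameters conditioned on predictors, q through a logit link and β through a log link; the link choices and coefficients below are illustrative.

```python
# Log-likelihood of a discrete Weibull regression with both parameters
# depending on covariates.
import numpy as np

def dweibull_logpmf(y, q, beta):
    y = np.asarray(y, dtype=float)
    return np.log(q ** (y ** beta) - q ** ((y + 1) ** beta))

def loglik(params, X, y):
    a, b = params[:X.shape[1]], params[X.shape[1]:]
    q = 1.0 / (1.0 + np.exp(-(X @ a)))    # logit link keeps 0 < q < 1
    beta = np.exp(X @ b)                  # log link keeps beta > 0
    return dweibull_logpmf(y, q, beta).sum()

X = np.column_stack([np.ones(4), [0.1, 0.5, 1.2, 2.0]])
y = np.array([0, 1, 3, 7])
print(loglik(np.array([0.2, 0.1, 0.0, 0.1]), X, y))
```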

11.
This work examines the problem of locating changes in the distribution of a compound Poisson process in which the summed variables are iid normal and their number follows a Poisson distribution. A Bayesian approach is developed to identify the location of significant changes in any of the parameters of the distribution, and a sliding-window algorithm is used to identify multiple change points. These results apply in any field where there is interest in locating changes not only in the parameters of normally distributed data but also in the rate at which observations occur. The method has direct application to the study of DNA copy number variations in cancer research, where it is known that the distances between genes can affect their intensity levels.
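A minimal sketch of the compound Poisson density that such a change-point scan must evaluate: given N ~ Poisson(lambda) iid Normal(mu, sigma^2) summands, X | N = n is Normal(n·mu, n·sigma^2). The function below computes only the continuous component (N ≥ 1); N = 0 contributes a point mass at zero, omitted here.

```python
# Continuous part of a compound Poisson density, marginalizing N >= 1.
import numpy as np
from scipy.stats import norm, poisson

def cpois_logpdf(x, lam, mu, sigma, n_max=100):
    ns = np.arange(1, n_max + 1)
    log_w = poisson.logpmf(ns, lam)
    comp = norm.logpdf(x, loc=ns * mu, scale=sigma * np.sqrt(ns))
    m = (log_w + comp).max()
    return m + np.log(np.exp(log_w + comp - m).sum())   # log-sum-exp over N

print(cpois_logpdf(x=3.2, lam=2.5, mu=1.0, sigma=0.8))
```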

12.
In this paper the issue of making inferences from misclassified data generated by a noisy multinomial process is addressed. A Bayesian model for making inferences about the proportions and the noise parameters is developed. The problem is reformulated in a more tractable form by introducing auxiliary (latent) random vectors, which allows an easy-to-implement Gibbs sampling-based algorithm to generate samples from the distributions of interest. An illustrative example related to elections is also presented.
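A hedged sketch of the latent-variable idea, with a known misclassification matrix M (the paper also treats the noise parameters as unknown): allocating each observed count to latent true categories makes both Gibbs conditionals conjugate, multinomial for the allocations and Dirichlet for the proportions.

```python
# Gibbs sampler for multinomial proportions under known misclassification:
# M[i, j] = P(observe category j | true category i).
import numpy as np

rng = np.random.default_rng(42)

def gibbs_misclassified(y, M, alpha=1.0, n_iter=2000):
    K = len(y)
    theta = np.full(K, 1.0 / K)
    draws = []
    for _ in range(n_iter):
        # allocate each observed category's count to latent true categories
        true_counts = np.zeros(K)
        for j, n_j in enumerate(y):
            p = theta * M[:, j]
            true_counts += rng.multinomial(n_j, p / p.sum())
        theta = rng.dirichlet(alpha + true_counts)    # conjugate Dirichlet update
        draws.append(theta)
    return np.array(draws)

M = np.array([[0.9, 0.1], [0.2, 0.8]])
draws = gibbs_misclassified(y=np.array([70, 30]), M=M)
print(draws[500:].mean(axis=0))   # posterior mean of proportions after burn-in
```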

13.
Very often in psychometric research, as in educational assessment, it is necessary to analyze item responses from clustered respondents. The multiple-group item response theory (IRT) model proposed by Bock and Zimowski [12] provides a useful framework for analyzing such data. In this model, the selected groups of respondents are of specific interest, so group-specific population distributions need to be defined. The usual assumption for parameter estimation, namely that the latent traits are random variables following symmetric normal distributions, has been questioned in many works in the IRT literature; when it does not hold, misleading inference can result. In this paper, we let the latent traits of each group follow a different skew-normal distribution under the centered parameterization, and we call the result the skew multiple-group IRT model. This modeling extends the works of Azevedo et al. [4], Bazán et al. [11], and Bock and Zimowski [12] with respect to the latent trait distribution, and our approach ensures that the model is identifiable. We propose and compare, with regard to convergence, two Markov chain Monte Carlo (MCMC) algorithms for parameter estimation. A simulation study evaluating parameter recovery shows that the selected algorithm properly recovers all model parameters. Furthermore, we analyzed a real data set that exhibits asymmetry in the latent trait distributions; the results obtained with our approach confirmed the presence of negative asymmetry for some of them.
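For reference, the skew-normal density underlying the latent-trait assumption, written here in the direct parameterization (the paper works with the centered parameterization, a reparameterization of the same family); the check below confirms the closed form against scipy.

```python
# Skew-normal density, f(x) = (2/omega) * phi((x-xi)/omega) * Phi(alpha*(x-xi)/omega).
import numpy as np
from scipy.stats import skewnorm, norm

x = np.linspace(-4, 4, 5)
alpha, xi, omega = -3.0, 0.0, 1.0        # negative alpha: left-skewed latent traits
z = (x - xi) / omega
direct = 2 / omega * norm.pdf(z) * norm.cdf(alpha * z)
print(np.allclose(direct, skewnorm.pdf(x, alpha, loc=xi, scale=omega)))  # True
```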

14.
In this paper, we consider a model for repeated count data with within-subject correlation and/or overdispersion. It extends both the generalized linear mixed model and the negative-binomial model. This model, proposed in a likelihood context [17,18], is placed in a Bayesian inferential framework. An important contribution takes the form of Bayesian model assessment based on pivotal quantities, rather than the often less adequate DIC. By means of a real biological data set, we also discuss some Bayesian model selection aspects, using a pivotal quantity proposed by Johnson [12].

15.
In this paper, we study a new Bayesian approach to the analysis of linearly mixed structures. In particular, we consider hyperspectral images, which have to be decomposed into a collection of distinct spectra, called endmembers, and a set of associated proportions for every pixel in the scene. This problem, often referred to as spectral unmixing, is usually treated on the basis of the linear mixing model (LMM). In unsupervised approaches, the endmember signatures are computed by an endmember extraction algorithm, which generally relies on the assumption that the image contains pure (unmixed) pixels. In practice this assumption may not hold for highly mixed data, and the extracted endmember spectra then differ from the true ones. A way out of this dilemma is to consider the problem under the normal compositional model (NCM). Contrary to the LMM, the NCM treats the endmembers as random Gaussian vectors rather than deterministic quantities. Existing Bayesian approaches to estimating the proportions under the NCM are restricted to the case where the covariance matrix of the Gaussian endmembers is a multiple of the identity matrix; such a model is not suitable when the variance differs from one spectral channel to another, a common phenomenon in practice. In this paper, we first propose a Bayesian strategy for estimating the mixing proportions under the assumption of varying variances across spectral bands. We then generalize this model to handle a completely unknown covariance structure. For both algorithms, we present Gibbs sampling strategies and compare their performance with other state-of-the-art unmixing routines on synthetic as well as real hyperspectral fluorescence spectroscopy data.

16.
The concordance correlation coefficient (CCC) is one of the most popular scaled indices for evaluating agreement. Most commonly it is used under the assumption that the data are normally distributed, an assumption that does not apply to skewed data sets. While methods for estimating the CCC of skewed data sets have been introduced and studied, a Bayesian approach and its comparison with the previous methods have been lacking. In this study, we propose a Bayesian method for estimating the CCC of skewed data sets and compare it with the best method previously investigated. The proposed method has certain advantages: it tends to outperform the best previous method when the variation in the data comes mainly from the random subject effect rather than from error, and it allows greater flexibility in application by accommodating missing data, confounding covariates, and replications, which were not considered previously. The superiority of this new approach is demonstrated using simulation as well as real-life biomarker data sets from an electroencephalography clinical study. The implementation of the Bayesian method is available through the Comprehensive R Archive Network.
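The quantity being estimated is Lin's CCC, shown below in its usual plug-in moment form; the Bayesian method above would replace these sample moments with draws from a posterior. The data values are made up.

```python
# Lin's concordance correlation coefficient:
#   CCC = 2*s_xy / (s_x^2 + s_y^2 + (x_bar - y_bar)^2)
import numpy as np

def ccc(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    sxy = np.mean((x - x.mean()) * (y - y.mean()))
    return 2 * sxy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

x = [10.1, 12.3, 9.8, 14.2, 11.0]
y = [10.4, 12.0, 10.1, 13.8, 11.5]
print(ccc(x, y))
```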

17.
In many experiments, several measurements of the same variable are taken over time, a geographic region, or some other index set, and it is often of interest to know whether the parameters of the variable's distribution have changed over the index set. Frequently the data consist of a sequence of correlated random variables, and there may be several experimental units under observation, each providing a sequence of data. A problem of ascertaining the boundaries between layers in geological sedimentary beds is used to introduce the model and then to illustrate the proposed methodology. It is assumed that, conditional on the change point, the data from each sequence arise from an autoregressive process that undergoes a change in one or more of its parameters; unconditionally, the model then becomes a mixture of nonstationary autoregressive processes. Maximum-likelihood methods are used, and results of simulations evaluating the performance of these estimators under practical conditions are given.
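A sketch of the kind of maximum-likelihood scan studied above, specialized to a single sequence and a single AR(1) coefficient shift, using conditional least squares within each candidate segment; the paper's full mixture formulation and multiple experimental units are omitted.

```python
# Change-point scan for an AR(1) coefficient shift via conditional least squares.
import numpy as np

def ar1_css(x):
    # conditional sum of squares after fitting phi by least squares
    phi = np.dot(x[1:], x[:-1]) / np.dot(x[:-1], x[:-1])
    resid = x[1:] - phi * x[:-1]
    return np.sum(resid ** 2)

def ar1_changepoint(x, min_seg=10):
    n = len(x)
    ks = range(min_seg, n - min_seg)
    css = [ar1_css(x[:k]) + ar1_css(x[k:]) for k in ks]
    return list(ks)[int(np.argmin(css))]

rng = np.random.default_rng(3)
x = np.empty(200)
x[0] = 0.0
for t in range(1, 200):                  # phi jumps from 0.2 to 0.9 at t = 120
    phi = 0.2 if t < 120 else 0.9
    x[t] = phi * x[t - 1] + rng.normal()
print("estimated change point:", ar1_changepoint(x))
```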

18.
Estimated associations between an outcome variable and misclassified covariates tend to be biased when estimation methods that ignore the classification error are applied. Available methods that account for misclassification often require a validation sample (i.e. a gold standard), which in practice may be unavailable or impractical. We propose a Bayesian approach to adjusting for misclassification of a binary covariate in the random-effect logistic model when a gold standard is not available. This Markov chain Monte Carlo (MCMC) approach uses two imperfect measures of a dichotomous exposure under the assumptions of conditional independence and non-differential misclassification. A simulated numerical example and a real clinical example illustrate the proposed approach. Our results suggest that the estimated log odds of inpatient care and the corresponding standard deviation are much larger under our proposed method than under models ignoring misclassification: ignoring misclassification produces downwardly biased estimates and underestimates uncertainty.
