Similar articles
 20 similar articles found (search time: 140 ms)
1.
Algebraic Markov Bases and MCMC for Two-Way Contingency Tables   Cited by: 3 (self-citations: 0, by others: 3)
ABSTRACT.  The Diaconis–Sturmfels algorithm is a method for sampling from conditional distributions, based on the algebraic theory of toric ideals. This algorithm is applied to categorical data analysis through the notion of Markov basis. An application of this algorithm is a non-parametric Monte Carlo approach to the goodness of fit tests for contingency tables. In this paper, we characterize or compute the Markov bases for some log-linear models for two-way contingency tables using techniques from Computational Commutative Algebra, namely Gröbner bases. This applies to a large set of cases including independence, quasi-independence, symmetry, and quasi-symmetry. Three examples of quasi-symmetry and quasi-independence from Fingleton (Models of category counts, Cambridge University Press, Cambridge, 1984) and Agresti (An Introduction to categorical data analysis, Wiley, New York, 1996) illustrate the practical applicability and the relevance of this algebraic methodology.
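For the independence model, the Markov basis consists of the basic moves that add 1 to two diagonal cells of a 2x2 sub-table and subtract 1 from the other two, which leaves all row and column sums unchanged. The following is a minimal sketch of such a walk, not the paper's implementation: the function name `diaconis_sturmfels_walk` is hypothetical, and the Metropolis step assumes the target is the conditional (hypergeometric) distribution proportional to 1 / ∏ n_ij!.

```python
import random


def diaconis_sturmfels_walk(table, steps, seed=0):
    """Random walk over non-negative integer tables with fixed margins.

    Assumes at least 2 rows and 2 columns. Each step proposes a basic move
    of the independence-model Markov basis and accepts it with the
    Metropolis ratio for a target proportional to 1 / prod(n_ij!).
    """
    rng = random.Random(seed)
    t = [row[:] for row in table]  # work on a copy
    n_rows, n_cols = len(t), len(t[0])
    for _ in range(steps):
        # Ordered pairs of distinct rows and columns, so both the move and
        # its reverse are proposed with equal probability.
        i, j = rng.sample(range(n_rows), 2)
        k, l = rng.sample(range(n_cols), 2)
        # Proposed move: +1 at (i,k) and (j,l), -1 at (i,l) and (j,k).
        up = (t[i][k] + 1) * (t[j][l] + 1)
        down = t[i][l] * t[j][k]
        # If a cell to be decremented is 0, the ratio is 0 and the move is
        # rejected, so the table never becomes negative.
        if rng.random() < min(1.0, down / up):
            t[i][k] += 1
            t[j][l] += 1
            t[i][l] -= 1
            t[j][k] -= 1
    return t


if __name__ == "__main__":
    observed = [[10, 2, 3], [1, 8, 4], [5, 0, 7]]
    print(diaconis_sturmfels_walk(observed, steps=5000, seed=1))
```

In a goodness-of-fit test one would compute a statistic such as chi-square on the visited tables and compare it with the observed value; that part is omitted here.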

2.
Cluster analysis is one of the most widely used methods in statistical analysis, in which homogeneous subgroups are identified in a heterogeneous population. Because mixed continuous and discrete data occur in many applications, some ordinary clustering methods, such as hierarchical methods, k-means and model-based methods, have been extended to the analysis of mixed data. However, in the available model-based clustering methods, the number of parameters increases with the number of continuous variables, and identifying as well as fitting an appropriate model may be difficult. In this paper, to reduce the number of parameters, a set of parsimonious models is introduced for model-based clustering of mixed continuous (normal) and nominal data. Models in this set are extended, using the general location model approach, for modeling the distribution of mixed variables and applying a factor analyzer structure for covariance matrices. The ECM algorithm is used for estimating the parameters of these models. In order to show the performance of the proposed models for clustering, results from some simulation studies and from the analysis of two real data sets are presented.

3.
Markov chain Monte Carlo (MCMC) algorithms have been shown to be useful for estimation of complex item response theory (IRT) models. Although an MCMC algorithm can be very useful, it also requires care in use and interpretation of results. In particular, MCMC algorithms generally make extensive use of priors on model parameters. In this paper, MCMC estimation is illustrated using a simple mixture IRT model, a mixture Rasch model (MRM), to demonstrate how the algorithm operates and how results may be affected by some commonly used priors. Priors on the probabilities of mixtures, label switching, model selection, metric anchoring, and implementation of the MCMC algorithm using WinBUGS are described, and their effects illustrated on parameter recovery in practical testing situations. In addition, an example is presented in which an MRM is fitted to a set of educational test data using the MCMC algorithm and a comparison is illustrated with results from three existing maximum likelihood estimation methods.

4.
Dealing with incomplete data is a pervasive problem in statistical surveys. Bayesian networks have recently been used in missing data imputation. In this research, we propose a new methodology for the multivariate imputation of missing data using discrete Bayesian networks and conditional Gaussian Bayesian networks. Results from imputing missing values in a coronary artery disease data set and a milk composition data set, as well as a simulation study from the cancer-neapolitan network, are presented to demonstrate and compare the performance of three Bayesian network-based imputation methods with those of multivariate imputation by chained equations (MICE) and the classical hot-deck imputation method. To assess the effect of the structure learning algorithm on the performance of the Bayesian network-based methods, two methods, the Peter-Clark algorithm and greedy search-and-score, have been applied. The Bayesian network-based methods are: first, the method introduced by Di Zio et al. [Bayesian networks for imputation, J. R. Stat. Soc. Ser. A 167 (2004), 309–322], in which each missing item of a variable is imputed using the information given in the parents of that variable; second, the method of Di Zio et al. [Multivariate techniques for imputation based on Bayesian networks, Neural Netw. World 15 (2005), 303–310], which uses the information in the Markov blanket set of the variable to be imputed; and finally, our new proposed method, which applies the whole available knowledge of all variables of interest, comprising the Markov blanket and hence the parent set, to impute a missing item. Results indicate the high quality of our new proposed method, especially in the presence of high missingness percentages and more connected networks. The new method has also been shown to be more efficient than the MICE method for small sample sizes with high missing rates.

5.
This paper sets out to implement the Bayesian paradigm for fractional polynomial models under the assumption of normally distributed error terms. Fractional polynomials widen the class of ordinary polynomials and offer an additive and transportable modelling approach. The methodology is based on a Bayesian linear model with a quasi-default hyper-g prior and combines variable selection with parametric modelling of additive effects. A Markov chain Monte Carlo algorithm for the exploration of the model space is presented. This theoretically well-founded stochastic search constitutes a substantial improvement to ad hoc stepwise procedures for the fitting of fractional polynomial models. The method is applied to a data set on the relationship between ozone levels and meteorological parameters, previously analysed in the literature.

6.
The magnitude-frequency distribution (MFD) of earthquakes is a fundamental statistic in seismology. The so-called b-value in the MFD is of particular interest in geophysics. A continuous-time hidden Markov model (HMM) is proposed for characterizing the variability of b-values. The HMM-based approach to modeling the MFD has some appealing properties over the widely used sliding-window approach. Often, large variability appears in the estimation of the b-value due to window size tuning, which may cause difficulties in the interpretation of b-value heterogeneities. Continuous-time hidden Markov models (CT-HMMs) are widely applied in various fields. They have some advantages over their discrete-time counterparts in that they can characterize heterogeneities appearing in time series on a finer time scale, particularly for highly irregularly spaced time series, such as earthquake occurrences. We demonstrate an expectation–maximization algorithm for the estimation of general exponential family CT-HMMs. In parallel with discrete-time hidden Markov models, we develop a continuous-time version of the Viterbi algorithm to retrieve the overall optimal path of the latent Markov chain. The methods are applied to New Zealand deep earthquakes. Before the analysis, we first assess the completeness of catalogue events to ensure that the analysis is not biased by missing data. The estimation of the b-value is stable over the selection of magnitude thresholds, which is ideal for the interpretation of b-value variability.
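For orientation, the b-value is the slope in the Gutenberg–Richter relation log10 N(M) = a − bM. The sketch below is not the hidden Markov approach of the abstract; it shows only the standard static maximum likelihood (Aki) estimator, b = log10(e) / (mean(M) − M_c), for continuous magnitudes above a completeness threshold M_c. The function names and the simulated catalogue are assumptions made for illustration.

```python
import math
import random


def aki_b_value(magnitudes, m_c):
    """Aki maximum likelihood b-value for continuous magnitudes >= m_c."""
    above = [m for m in magnitudes if m >= m_c]
    mean_excess = sum(above) / len(above) - m_c
    return math.log10(math.e) / mean_excess


def simulate_magnitudes(n, b, m_c, seed=0):
    """Draw n magnitudes whose excess over m_c is exponential with rate b*ln(10)."""
    rng = random.Random(seed)
    beta = b * math.log(10)
    return [m_c + rng.expovariate(beta) for _ in range(n)]


if __name__ == "__main__":
    catalogue = simulate_magnitudes(20000, b=1.0, m_c=4.0, seed=42)
    print(round(aki_b_value(catalogue, m_c=4.0), 3))
```

Applying this estimator in a sliding window is the approach whose sensitivity to window size the abstract describes; real catalogues with binned magnitudes usually need a half-bin correction to M_c, which is left out here.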

7.
In this paper, we study statistical inference based on the Bayesian approach for regression models with the assumption that independent additive errors follow a normal, Student-t, slash, contaminated normal, Laplace or symmetric hyperbolic distribution, where both location and dispersion parameters of the response variable distribution include nonparametric additive components approximated by B-splines. This class of models provides a rich set of symmetric distributions for the model error. Some of these distributions have heavier or lighter tails than the normal as well as different levels of kurtosis. In order to draw samples from the posterior distribution of the parameters of interest, we propose an efficient Markov chain Monte Carlo (MCMC) algorithm, which combines the Gibbs sampler and Metropolis–Hastings algorithms. The performance of the proposed MCMC algorithm is assessed through simulation experiments. We apply the proposed methodology to a real data set. The proposed methodology is implemented in the R package BayesGESM using the function gesm().

8.
A new threshold regression model for survival data with a cure fraction   Cited by: 1 (self-citations: 0, by others: 1)
Because a certain fraction of the population suffering from a particular type of disease is cured thanks to advanced medical treatment and health care systems, we develop a general class of models to incorporate a cure fraction by introducing the latent number N of metastatic-competent tumor cells or infected cells caused by bacterial or viral infection and the latent antibody level R of the immune system. Various properties of the proposed models are carefully examined, and a Markov chain Monte Carlo sampling algorithm is developed for carrying out Bayesian computation for model fitting and comparison. A real data set from a prostate cancer clinical trial is analyzed in detail to demonstrate the proposed methodology.

9.
The purpose of this paper is to develop a Bayesian analysis for right-censored survival data when immune or cured individuals may be present in the population from which the data is taken. In our approach the number of competing causes of the event of interest follows the Conway–Maxwell–Poisson distribution, which generalizes the Poisson distribution. Markov chain Monte Carlo (MCMC) methods are used to develop a Bayesian procedure for the proposed model. Also, some discussions on the model selection and an illustration with a real data set are considered.

10.
Skew scale mixtures of normal distributions are often used for statistical procedures involving asymmetric and heavy-tailed data. The main virtue of the members of this family of distributions is that they are easy to simulate from and they also supply genuine expectation-maximization (EM) algorithms for maximum likelihood estimation. In this paper, we extend the EM algorithm for linear regression models and we develop diagnostic analyses via local influence and generalized leverage, following Zhu and Lee's approach. This is because Cook's well-known approach cannot be used to obtain measures of local influence. The EM-type algorithm is discussed with an emphasis on the skew Student-t-normal, skew slash, skew-contaminated normal and skew power-exponential distributions. Finally, results obtained for a real data set are reported, illustrating the usefulness of the proposed method.

11.
Nonlinear mixed-effects (NLME) models are flexible enough to handle repeated-measures data from various disciplines. In this article, we propose both maximum-likelihood and restricted maximum-likelihood estimations of NLME models using first-order conditional expansion (FOCE) and the expectation–maximization (EM) algorithm. The FOCE-EM algorithm implemented in the ForStat procedure SNLME is compared with the Lindstrom and Bates (LB) algorithm implemented in both the SAS macro NLINMIX and the S-Plus/R function nlme in terms of computational efficiency and statistical properties. Two real-world data sets, an orange tree data set and a Chinese fir (Cunninghamia lanceolata) data set, and a simulated data set were used for evaluation. FOCE-EM converged for all mixed models derived from the base model in the two real-world cases, while LB did not, especially for the models in which random effects are simultaneously considered in several parameters to account for between-subject variation. However, both algorithms had identical estimated parameters and fit statistics for the converged models. We therefore recommend using FOCE-EM in NLME models, particularly when convergence is a concern in model selection.

12.
We propose a two-stage algorithm for computing maximum likelihood estimates for a class of spatial models. The algorithm combines Markov chain Monte Carlo methods such as the Metropolis–Hastings–Green algorithm and the Gibbs sampler, and stochastic approximation methods such as the off-line average and adaptive search direction. A new criterion is built into the algorithm so stopping is automatic once the desired precision has been set. Simulation studies and applications to some real data sets have been conducted with three spatial models. We compared the algorithm proposed with a direct application of the classical Robbins–Monro algorithm using Wiebe's wheat data and found that our procedure is at least 15 times faster.

13.
This paper develops an objective Bayesian analysis method for estimating unknown parameters of the half-logistic distribution when a sample is available from the progressively Type-II censoring scheme. Noninformative priors such as Jeffreys and reference priors are derived. In addition, derived priors are checked to determine whether they satisfy probability-matching criteria. The Metropolis–Hastings algorithm is applied to generate Markov chain Monte Carlo samples from these posterior density functions because marginal posterior density functions of each parameter cannot be expressed in an explicit form. Monte Carlo simulations are conducted to investigate frequentist properties of estimated models under noninformative priors. For illustration purposes, a real data set is presented, and the quality of models under noninformative priors is evaluated through posterior predictive checking.

14.
Population size estimation with discrete or nonparametric mixture models is considered, and reliable ways of constructing the nonparametric mixture model estimator are reviewed and set into perspective. Construction of the maximum likelihood estimator of the mixing distribution is done for any number of components up to the global nonparametric maximum likelihood bound using the EM algorithm. In addition, the estimators of Chao and Zelterman are considered with some generalisations of Zelterman’s estimator. All computations are done with CAMCR, a special software developed for population size estimation with mixture models. Several examples and data sets are discussed and the estimators illustrated. Problems using the mixture model-based estimators are highlighted.
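The two closed-form estimators named in the abstract use only the number of observed units n, the number seen exactly once (f1) and the number seen exactly twice (f2). The sketch below gives their standard textbook forms, Chao's lower bound n + f1²/(2·f2) and Zelterman's n / (1 − exp(−2·f2/f1)); it is not the CAMCR software, and the function names are hypothetical.

```python
import math


def chao_estimate(n_observed, f1, f2):
    """Chao's lower-bound estimator of population size (requires f2 > 0)."""
    return n_observed + f1 * f1 / (2.0 * f2)


def zelterman_estimate(n_observed, f1, f2):
    """Zelterman's estimator, with Poisson rate estimated as 2*f2/f1 (requires f1 > 0)."""
    rate = 2.0 * f2 / f1
    return n_observed / (1.0 - math.exp(-rate))


if __name__ == "__main__":
    # Illustrative counts only: 100 units observed, 40 seen once, 20 seen twice.
    print(chao_estimate(100, 40, 20))
    print(zelterman_estimate(100, 40, 20))
```

Both estimators rely only on the low frequency counts, which is why they are often regarded as robust to heterogeneity among frequently observed units.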

15.
ABSTRACT

Phased-mission systems (PMS) are found in many practical application areas. Reliability evaluation and analysis of such systems have become important issues. The reliability of a PMS is typically defined as the probability that the system successfully accomplishes the missions of all phases. However, the k-out-of-n system success criterion for PMS has not been investigated. In this paper, according to this criterion, we develop two new models, which are static and dynamic, respectively. The assumptions for these two models are described in detail as well. The system reliabilities for both models are presented for the first time by employing the finite Markov chain imbedding approach (FMCIA). In terms of FMCIA, we define different state spaces for the two models, and transition probability matrices are obtained. Then some numerical examples are given to illustrate the application of FMCIA. Finally, some discussions are made and conclusions are summarized.
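The imbedding idea can be illustrated on a much simpler case than the phased-mission models of the paper: a single-phase k-out-of-n:G system with independent components. The state of the chain is the number of working components found so far, capped at k (an absorbing state), and each component contributes one transition step. This is a sketch under those simplifying assumptions; the function name is hypothetical.

```python
def k_out_of_n_reliability(k, probs):
    """P(at least k of the independent components work).

    probs[i] is the reliability of component i. state[s] holds the
    probability that exactly s components work among those scanned so far,
    with s = k absorbing (k or more working components).
    """
    state = [1.0] + [0.0] * k  # start in state 0 with probability 1
    for p in probs:
        nxt = [0.0] * (k + 1)
        for s, mass in enumerate(state):
            if s == k:
                nxt[k] += mass  # absorbing: system success already reached
            else:
                nxt[s + 1] += mass * p       # component works
                nxt[s] += mass * (1.0 - p)   # component fails
        state = nxt
    return state[k]


if __name__ == "__main__":
    # 2-out-of-3 system with component reliability 0.9.
    print(k_out_of_n_reliability(2, [0.9, 0.9, 0.9]))
```

Each loop iteration is the product of the state vector with a (k+1)×(k+1) transition matrix whose entries depend on the current component, so unequal component reliabilities are handled without extra work.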

16.
Gu MG, Sun L, Zuo G. Lifetime Data Analysis, 2005, 11(4): 473-488
An important property of the Cox regression model is that the estimation of regression parameters using the partial likelihood procedure does not depend on its baseline survival function. We call such a procedure baseline-free. Using marginal likelihood, we show that a baseline-free procedure can be derived for a class of general transformation models under the interval censoring framework. The baseline-free procedure results in a simplified and stable computation algorithm for some complicated and important semiparametric models, such as frailty models and heteroscedastic hazard/rank regression models, where the estimation procedures so far available involve estimation of the infinite-dimensional baseline function. A detailed computational algorithm using Markov chain Monte Carlo stochastic approximation is presented. The proposed procedure is demonstrated through extensive simulation studies, showing the validity of asymptotic consistency and normality. We also illustrate the procedure with a real data set from a study of breast cancer. A heuristic argument showing that the score function is a mean zero martingale is provided.

17.
In this paper, we extend the censored linear regression model with normal errors to Student-t errors. A simple EM-type algorithm for iteratively computing maximum-likelihood estimates of the parameters is presented. To examine the performance of the proposed model, case-deletion and local influence techniques are developed to show its robust aspect against outlying and influential observations. This is done by the analysis of the sensitivity of the EM estimates under some usual perturbation schemes in the model or data and by inspecting some proposed diagnostic graphics. The efficacy of the method is verified through the analysis of simulated data sets and modelling a real data set first analysed under normal errors. The proposed algorithm and methods are implemented in the R package CensRegMod.

18.
ABSTRACT

In this paper, we propose modified spline estimators for nonparametric regression models with right-censored data, especially when the censored response observations are converted to synthetic data. Efficient implementation of these estimators depends on the set of knot points and an appropriate smoothing parameter. We use three algorithms, the default selection method (DSM), myopic algorithm (MA), and full search algorithm (FSA), to select the optimum set of knots in a penalized spline method based on a smoothing parameter, which is chosen based on different criteria, including the improved version of the Akaike information criterion (AICc), generalized cross validation (GCV), restricted maximum likelihood (REML), and Bayesian information criterion (BIC). We also consider the smoothing spline (SS), which uses all the data points as knots. The main goal of this study is to compare the performance of the algorithm and criteria combinations in the suggested penalized spline fits under censored data. A Monte Carlo simulation study is performed and a real data example is presented to illustrate the ideas in the paper. The results confirm that the FSA slightly outperforms the other methods, especially for high censoring levels.

19.
The purpose of this paper is to develop a Bayesian approach for log-Birnbaum–Saunders Student-t regression models under right-censored survival data. Markov chain Monte Carlo (MCMC) methods are used to develop a Bayesian procedure for the considered model. In order to attenuate the influence of the outlying observations on the parameter estimates, we present in this paper Birnbaum–Saunders models in which a Student-t distribution is assumed to explain the cumulative damage. Also, some discussions on the model selection to compare the fitted models are given and case deletion influence diagnostics are developed for the joint posterior distribution based on the Kullback–Leibler divergence. The developed procedures are illustrated with a real data set.

20.
A new optimization algorithm is presented to solve the stratification problem. Assuming the number L of strata and the total sample size n are fixed, we obtain strata boundaries by using an objective function associated with the variance. In this problem, strata boundaries must be determined so that the elements in each stratum are more homogeneous among themselves. To produce more homogeneous strata, this paper proposes a new algorithm that uses the Greedy Randomized Adaptive Search Procedure (GRASP) methodology. Computational results are presented for a set of problems, with the application of the new algorithm and some algorithms from the literature.
