Similar documents
20 similar documents found (search time: 46 ms)
1.
In binary regression, imbalanced data arise when the proportion of zeros (or ones) is significantly greater than the corresponding proportion of ones (or zeros). In this work, we evaluate two methods developed to deal with imbalanced data and compare them with the use of asymmetric links. The results of a simulation study show that the correction methods do not adequately correct bias in the estimation of regression coefficients, and that the models with power and reverse power links produce better results for certain types of imbalanced data. Additionally, we present an application to imbalanced data, identifying the best model among the various ones proposed. The parameters are estimated using a Bayesian approach via Hamiltonian Monte Carlo with the No-U-Turn Sampler algorithm, and the models are compared using different criteria for model comparison, predictive evaluation, and quantile residuals.
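As a rough illustration of the asymmetric links mentioned above: a common convention (assumed here, since the abstract does not give the exact parameterization) defines the power logit link as the logistic CDF raised to a power, and the reverse power link as its reflection. For λ ≠ 1 both links are asymmetric around zero:

```python
import math

def logistic_cdf(x):
    """Standard logistic CDF, the symmetric baseline link."""
    return 1.0 / (1.0 + math.exp(-x))

def power_logit(x, lam):
    """Power logit link: F(x) = Lambda(x)**lam (assumed form; asymmetric for lam != 1)."""
    return logistic_cdf(x) ** lam

def reverse_power_logit(x, lam):
    """Reverse power logit link: F(x) = 1 - Lambda(-x)**lam (assumed form)."""
    return 1.0 - logistic_cdf(-x) ** lam
```

At λ = 1 both reduce to the ordinary logit; λ < 1 and λ > 1 shift probability mass toward one of the two response values, which is what makes such links attractive for imbalanced data.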

2.
In the statistical literature, many discrete distributions have been developed. However, in this progressive technological era, data generated across different fields are becoming more complicated by the day, making it difficult to analyse real data with the discrete distributions available in the existing literature. In this context, we propose a new flexible family of discrete models named the discrete odd Weibull-G (DOW-G) family. Several of its distributional characteristics are derived. A key feature of the proposed family is its failure rate function, which can take a variety of shapes for distinct values of the unknown parameters: decreasing, increasing, constant, J-shaped, and bathtub-shaped. Furthermore, the presented family not only adequately captures skewed and symmetric data sets, but can also provide a better fit to equi-, over-, and under-dispersed data. After introducing the general class, two particular distributions of the DOW-G family are studied in detail. Parameter estimation for the proposed family is explored by the method of maximum likelihood and a Bayesian approach. A compact Monte Carlo simulation study is performed to assess the behaviour of the estimation methods. Finally, we illustrate the usefulness of the proposed family using two real data sets.
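A discrete family like DOW-G is typically obtained by discretizing a continuous survival function: P(X = k) = S(k) − S(k+1) for k = 0, 1, 2, …. The sketch below assumes a plausible odd Weibull-G survival form S(x) = exp(−α·(G(x)/(1−G(x)))^β) with an illustrative exponential baseline G; the paper's exact parameterization may differ:

```python
import math

def odd_weibull_survival(x, G, alpha, beta):
    """Assumed continuous odd Weibull-G survival: exp(-alpha*(G/(1-G))**beta)."""
    g = G(x)
    if g >= 1.0:  # baseline CDF saturated in floating point
        return 0.0
    return math.exp(-alpha * (g / (1.0 - g)) ** beta)

def discrete_pmf(k, S):
    """Discretization: P(X = k) = S(k) - S(k + 1)."""
    return S(k) - S(k + 1)

# Illustrative baseline: standard exponential CDF
G = lambda x: 1.0 - math.exp(-x)
S = lambda x: odd_weibull_survival(x, G, alpha=1.0, beta=0.5)
pmf = [discrete_pmf(k, S) for k in range(200)]
```

The probabilities telescope, so they sum to S(0) = 1 over the support; varying α and β changes the shape of the implied discrete failure rate.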

3.
A number of models have been proposed in the literature for data exhibiting bathtub-shaped hazard rate functions. Mixture distributions are the obvious choice for modelling such data sets, but they contain too many parameters and hamper the accuracy of inferential procedures, particularly when the data are meagre. Recently, a few distributions have been proposed that are simple generalizations of the two-parameter Weibull model and are capable of producing bathtub-shaped behaviour of the hazard rate function. The Weibull extension and the modified Weibull models are two such families. This study compares these two distributions for data sets exhibiting a bathtub-shaped hazard rate. Bayesian tools are preferred because of their wide applicability in various nested and non-nested model comparison problems. Real data illustrations are provided so that a particular model can be recommended based on the various model comparison tools discussed in the paper.

4.
Interval-censored data arise when a failure time, say T, cannot be observed directly but can only be determined to lie in an interval obtained from a series of inspection times. The frequentist approach to analysing interval-censored data has been developed for some time now. Owing to the unavailability of software, it is very common in biological, medical, and reliability studies to simplify the interval-censoring structure of the data into the more standard right-censoring situation by imputing the midpoints of the censoring intervals. In this paper, we apply the Bayesian approach, employing Lindley's (1980) and Tierney and Kadane's (1986) numerical approximation procedures, when the survival data under consideration are interval-censored. The Bayesian approach to interval-censored data has barely been discussed in the literature, and the essence of this study is to explore and promote Bayesian methods when the survival data being analysed are interval-censored. We consider only a parametric approach, assuming that the survival data follow a log-logistic distribution. We illustrate the proposed methods with two real data sets, and a simulation study is carried out to compare the performances of the methods.
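The midpoint-imputation simplification described above is easy to state concretely: each censoring interval (L, R] is replaced by its midpoint and treated as an exact event time, while observations with no finite right endpoint remain right-censored. A minimal sketch (the interval representation is an assumption for illustration):

```python
def midpoint_impute(intervals):
    """Convert interval-censored data (L, R] to right-censored format.

    Each pair is (left, right); right=None marks a right-censored observation.
    Returns parallel lists of imputed times and event indicators (1=event)."""
    times, events = [], []
    for left, right in intervals:
        if right is None:                 # right-censored: keep the last inspection time
            times.append(left)
            events.append(0)
        else:                             # interval-censored: impute the midpoint
            times.append((left + right) / 2.0)
            events.append(1)
    return times, events

t, d = midpoint_impute([(2, 4), (5, None), (0, 3)])
```

This is exactly the shortcut the paper argues against: it discards the within-interval uncertainty that the Bayesian treatment retains.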

5.
In this paper, we propose a new Bayesian inference approach for classification based on the traditional hinge loss used for classical support vector machines, which we call the Bayesian Additive Machine (BAM). Unlike existing approaches, the new model has a semiparametric discriminant function in which some feature effects are nonlinear and others are linear; this separation of features is achieved automatically during model fitting, without user pre-specification. Following the literature on sparse regression for high-dimensional models, we can also identify irrelevant features. By introducing spike-and-slab priors with two sets of indicator variables, these multiple goals are achieved simultaneously and automatically, without any parameter tuning such as cross-validation. An efficient partially collapsed Markov chain Monte Carlo algorithm is developed for posterior exploration, based on a data augmentation scheme for the hinge loss. Our simulations and three real data examples demonstrate that the new approach is a strong competitor to recently proposed methods for challenging high-dimensional classification problems.

6.
In this study, the components of extra-Poisson variability are estimated assuming random effect models under a Bayesian approach. A standard existing methodology for estimating extra-Poisson variability assumes a negative binomial distribution. The results show that the proposed random effect model yields more accurate estimates of the extra-Poisson variability components than the negative binomial distribution, under which only one component of extra-Poisson variability can be estimated. Illustrative examples based on real data sets are included.
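The abstract does not specify the random effect structure, but the mechanism behind extra-Poisson variability can be sketched with an assumed multiplicative log-normal random effect: conditionally Poisson counts whose rate varies across units have marginal variance exceeding the mean, Var(y) = μ + μ²(e^{σ²} − 1):

```python
import math, random

def poisson(lam, rng):
    """Poisson sampler (Knuth's multiplication method; adequate for small rates)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

def overdispersed_counts(n, mu, sigma, rng):
    """Counts with an assumed multiplicative log-normal random effect:
    y ~ Poisson(mu * exp(b)), b ~ N(-sigma^2/2, sigma^2), so E[exp(b)] = 1."""
    out = []
    for _ in range(n):
        b = rng.gauss(-sigma * sigma / 2.0, sigma)
        out.append(poisson(mu * math.exp(b), rng))
    return out

rng = random.Random(7)
y = overdispersed_counts(20000, mu=2.0, sigma=0.7, rng=rng)
```

With μ = 2 and σ = 0.7 the theoretical variance is about 4.5 against a mean of 2, i.e. clear overdispersion relative to the Poisson; a negative binomial fit would summarize all of this excess in a single dispersion parameter.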

7.
This paper implements the Bayesian paradigm for fractional polynomial models under the assumption of normally distributed error terms. Fractional polynomials widen the class of ordinary polynomials and offer an additive and transportable modelling approach. The methodology is based on a Bayesian linear model with a quasi-default hyper-g prior and combines variable selection with parametric modelling of additive effects. A Markov chain Monte Carlo algorithm for exploring the model space is presented. This theoretically well-founded stochastic search constitutes a substantial improvement over ad hoc stepwise procedures for fitting fractional polynomial models. The method is applied to a data set on the relationship between ozone levels and meteorological parameters, previously analysed in the literature.
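For readers unfamiliar with fractional polynomials: each covariate enters through one or two transforms x^p drawn from a small conventional power set, with p = 0 read as log(x) and a repeated power handled via an extra log factor (the Royston–Altman convention). A minimal sketch of the basis construction (the power set and conventions below are the standard ones, assumed to match the paper's):

```python
import math

FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)  # conventional FP power set

def fp_term(x, p):
    """Fractional polynomial transform of x > 0: x**p, with p = 0 meaning log(x)."""
    return math.log(x) if p == 0 else x ** p

def fp2_design(x, p1, p2):
    """Degree-2 FP terms; a repeated power contributes x**p and x**p * log(x)."""
    t1 = fp_term(x, p1)
    t2 = fp_term(x, p2) if p1 != p2 else fp_term(x, p1) * math.log(x)
    return t1, t2
```

The MCMC model-space search in the paper then amounts to sampling over which covariates enter and which (p1, p2) pairs they use, rather than stepping through them greedily.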

8.
Polytomous item response theory (IRT) models are used by specialists to score assessments and questionnaires whose items have multiple response categories. In this article, we study the performance of five model comparison criteria for comparing the fit of the graded response and generalized partial credit models on the same dataset when the choice between the two is unclear. A simulation study is conducted to analyse the sensitivity of the priors and to compare the performance of the criteria using the No-U-Turn Sampler algorithm under a Bayesian approach. The results were used to select a model for an application to mental health data.

9.
Recent approaches to the statistical analysis of adverse event (AE) data in clinical trials have proposed the use of groupings of related AEs, such as by system organ class (SOC). These methods make it possible to scan large numbers of AEs while controlling for multiple comparisons, making the comparative performance of the different methods, in terms of AE detection and error rates, of interest to investigators. We apply two Bayesian models and two procedures for controlling the false discovery rate (FDR), each using groupings of AEs, to real clinical trial safety data. We find that while the Bayesian models are appropriate for the full data set, the error-controlling methods give results similar to the Bayesian methods only when low-incidence AEs are removed. A simulation study is used to compare the relative performance of the methods, both over full trial data sets and over data sets with low-incidence AEs and SOCs removed. We find that while the removal of low-incidence AEs increases the power of the error-controlling procedures, the estimated power of the Bayesian methods remains relatively constant across data sizes. Automatic removal of low-incidence AEs does, however, affect the error rates of all the methods, and a clinically guided approach to their removal is needed. Overall, we find the Bayesian approaches particularly useful for scanning the large amounts of AE data gathered.
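The FDR procedures referenced above build on the Benjamini–Hochberg step-up rule. The grouped, SOC-aware variants used for AE data are more involved, but plain BH, shown here as a minimal illustration, conveys the mechanism: sort the p-values, find the largest rank k with p_(k) ≤ k·q/m, and reject everything up to that rank:

```python
def benjamini_hochberg(pvals, q=0.05):
    """BH step-up procedure: indices of hypotheses rejected at FDR level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p-value
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * q / m:                  # step-up comparison
            k = rank
    return sorted(order[:k])                          # reject the k smallest

rejected = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.3, 0.9])
```

Note how the low-incidence issue in the abstract enters here: rare AEs produce discrete, coarse p-values that can never pass the step-up thresholds, inflating m without ever contributing rejections.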

10.
Herein, we propose a fully Bayesian approach to the greenhouse gas emission problem. The goal of this work is to estimate the emission rate of polluting gases from the area flooded by hydroelectric reservoirs. We present models for gas concentration evolution in two ways: first, from ordinary differential equation solutions and, second, using stochastic differential equations with a discretization scheme. Finally, we present techniques to estimate the emission rate for the entire reservoir. Inference is carried out in the Bayesian framework using Markov chain Monte Carlo methods, with discretization schemes applied to the continuous differential equations where necessary. As far as we know, these models and the associated Bayesian inference are new in the statistical literature on greenhouse gas emission, and they contribute to estimating the amount of polluting gases released from hydroelectric reservoirs in Brazil. The proposed models are applied to a real data set and the results are presented.

11.
In this paper, we discuss a progressively censored inverted exponentiated Rayleigh distribution. Estimation of the unknown parameters is considered under progressive censoring using maximum likelihood and Bayesian approaches. Bayes estimators of the unknown parameters are derived under different symmetric and asymmetric loss functions using gamma prior distributions, with an importance sampling procedure used to compute these estimates. Further, highest posterior density intervals for the unknown parameters are constructed, and bootstrap intervals are obtained for comparison. Prediction of future observations is studied in one- and two-sample situations from classical and Bayesian viewpoints. We also establish optimum censoring schemes using the Bayesian approach. Finally, we conduct a simulation study to compare the performance of the proposed methods and analyse two real data sets for illustration.

12.
Time-varying coefficient models with autoregressive and moving-average–generalized autoregressive conditional heteroscedasticity (ARMA–GARCH) structure are proposed for examining the time-varying effects of risk factors in longitudinal studies. Compared with existing models in the literature, the proposed models give explicit patterns for the time-varying coefficients. Maximum likelihood and marginal likelihood (based on a Laplace approximation) are used to estimate the parameters. Simulation studies evaluate the performance of these two estimation methods in terms of the Kullback–Leibler divergence and the root mean square error; the marginal likelihood approach gives more accurate parameter estimates, although it is more computationally intensive. The proposed models are applied to the Framingham Heart Study to investigate the time-varying effects of covariates on coronary heart disease incidence. The Bayesian information criterion is used to specify the time series structures of the coefficients of the risk factors.

13.
In this paper we propose a class of skewed t link models for analysing binary response data with covariates. It is a class of asymmetric link models designed to improve the overall fit when commonly used symmetric links, such as the logit and probit, do not provide the best available fit for a given binary response dataset. We develop the class of models by introducing a skewed t distribution for the underlying latent variable. For the analysis of the models, Bayesian and non-Bayesian methods are pursued using a Markov chain Monte Carlo (MCMC) sampling-based approach. The necessary theory for modelling and computation is provided. Finally, a simulation study and a real data example illustrate the proposed methodology.

14.
Even though integer-valued time series are common in practice, methods for their analysis have been developed only in the recent past. Several models for stationary processes with discrete marginal distributions have been proposed in the literature. Such processes assume the parameters of the model to remain constant throughout the time period, but this need not hold in practice. In this paper, we introduce non-stationary integer-valued autoregressive (INAR) models with structural breaks for situations where the parameters of the INAR process do not remain constant over time. Such models are useful for modelling count data time series with structural breaks. Bayesian and Markov chain Monte Carlo (MCMC) procedures for estimating the parameters and break points of such models are discussed. We illustrate the model and estimation procedure with a simulation study, and apply the proposed model to two real biometrical data sets.
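An INAR(1) process is built from binomial thinning: X_t = α ∘ X_{t−1} + ε_t, where α ∘ x counts how many of the x previous individuals "survive" (each independently with probability α) and ε_t is a count innovation. A structural break simply switches (α, λ) at some time point. A simulation sketch under assumed Poisson innovations and one break:

```python
import math, random

def poisson(lam, rng):
    """Poisson sampler (Knuth's multiplication method)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

def thin(x, alpha, rng):
    """Binomial thinning alpha ∘ x: each of the x counts survives w.p. alpha."""
    return sum(rng.random() < alpha for _ in range(x))

def simulate_inar1(regimes, rng):
    """INAR(1) with structural breaks; regimes is a list of (length, alpha, lam)."""
    x, path = 0, []
    for length, alpha, lam in regimes:
        for _ in range(length):
            x = thin(x, alpha, rng) + poisson(lam, rng)
            path.append(x)
    return path

rng = random.Random(42)
path = simulate_inar1([(250, 0.3, 1.0), (250, 0.6, 3.0)], rng)
```

Within each regime the stationary mean is λ/(1 − α), so the break above shifts the level from about 1.4 to 7.5; detecting such a shift (and its location) is what the Bayesian break-point estimation in the paper targets.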

15.
The prediction problem in finite populations is considered under error-in-variables superpopulation models. The models considered are the usual regression models involving at most two variables, x and y, both of which may be measured with error. Properties of some classical predictors are investigated, and a Bayesian approach is proposed.

16.
A family of log-linear models is proposed to describe contingency tables in which one variable can be considered the response to the remaining variables. The proposed models take into account the ordinal nature of the response categories and have a structure similar to that employed in polynomial regression. Stochastic ordering of the response distributions under the proposed models is discussed, and model-reduction techniques are developed. The proposed models are applied to two data sets previously analysed in the literature.

17.
Process capability (PC) indices measure the ability of a process of interest to meet desired specifications under certain restrictions. A variety of capability indices are available in the literature for different variables of interest, such as weights, lengths, thicknesses, and lifetimes of items, among many others. The goal of this article is to study generalized capability indices from the Bayesian viewpoint under different symmetric and asymmetric loss functions for simple and mixture generalized lifetime models. For our purposes, we cover a simple and a two-component mixture of the Maxwell distribution as special cases of the generalized class of models. A comparative discussion of PC under the Laplace and inverse Rayleigh mixture models is also included, as is Bayesian point estimation of the maintenance performance of the system (considering the Maxwell failure lifetime model and the repair time model). A real-life example illustrates the procedural details of the proposed method.
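For orientation, the classical capability indices that the generalized Bayesian versions in the article build on are Cp (potential capability, spread only) and Cpk (actual capability, penalising off-centre processes); this sketch shows only these textbook definitions, not the article's generalized indices:

```python
def cp(usl, lsl, sigma):
    """Potential capability: Cp = (USL - LSL) / (6 * sigma)."""
    return (usl - lsl) / (6.0 * sigma)

def cpk(usl, lsl, mu, sigma):
    """Actual capability: distance from the mean to the nearer limit, in 3-sigma units."""
    return min(usl - mu, mu - lsl) / (3.0 * sigma)
```

A process centred between the limits has Cpk = Cp; any shift of the mean lowers Cpk while leaving Cp unchanged, which is why both are reported.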

18.
Missing data, a common but challenging issue in most studies, may lead to biased and inefficient inferences if handled inappropriately. As a natural and powerful way of dealing with missing data, the Bayesian approach has received much attention in the literature. This paper reviews recent developments and applications of Bayesian methods for handling ignorable and non-ignorable missing data. We first introduce missing data mechanisms and the Bayesian framework for dealing with missing data, and then introduce missing data models under ignorable and non-ignorable missingness based on the literature. After that, important issues of Bayesian inference, including prior construction, posterior computation, model comparison, and sensitivity analysis, are discussed. Finally, several issues that deserve further research are summarized.

19.
In this paper, we adopt the Bayesian approach to expectile regression, employing a likelihood function based on an asymmetric normal distribution. We demonstrate that improper uniform priors for the unknown model parameters yield a proper joint posterior. Three simulated data sets were generated to evaluate the proposed method; they show that Bayesian expectile regression performs well and has characteristics different from those of Bayesian quantile regression. We also apply this approach to two real data analyses.
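The asymmetric normal likelihood corresponds to an asymmetric squared loss: the τ-expectile of a distribution minimises E[|τ − 1(u < 0)|·u²] over location, just as the mean does for τ = 0.5. A sketch of the loss and a simple fixed-point computation of a sample expectile (the paper's regression version puts a linear predictor in place of the scalar location):

```python
def expectile_loss(u, tau):
    """Asymmetric squared loss: |tau - 1(u < 0)| * u**2."""
    w = tau if u >= 0 else 1.0 - tau
    return w * u * u

def expectile(data, tau, iters=100):
    """Sample tau-expectile via the weighted-mean fixed point m = sum(w*x)/sum(w)."""
    m = sum(data) / len(data)
    for _ in range(iters):
        w = [tau if x >= m else 1.0 - tau for x in data]
        m = sum(wi * xi for wi, xi in zip(w, data)) / sum(w)
    return m
```

At τ = 0.5 the weights are constant and the expectile is the mean; τ > 0.5 upweights observations above the current estimate, pulling the expectile toward the upper tail.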

20.
Copula, marginal distributions and model selection: a Bayesian note
Copula functions and marginal distributions are combined to produce multivariate distributions. We show the advantages of estimating all parameters of these models using the Bayesian approach, which can be done with standard Markov chain Monte Carlo algorithms. Deviance-based model selection criteria are also discussed for copula models, since they are invariant under monotone increasing transformations of the marginals; we focus on the deviance information criterion. Joint estimation takes into account the full dependence structure of the parameters' posterior distributions in our chosen model selection criteria. Two Monte Carlo studies show that model identification improves when the model parameters are jointly estimated. We study the Bayesian estimation of all unknown quantities at once, considering bivariate copula functions and three known marginal distributions.
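The constructive direction of the copula-plus-marginals idea can be sketched directly: draw correlated normals, push them through the normal CDF to get dependent uniforms, then apply whatever marginal quantile functions you like. The Gaussian copula and the particular marginals below are illustrative choices, not the ones estimated in the note:

```python
import math, random

def norm_cdf(z):
    """Standard normal CDF via erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def sample_gaussian_copula(n, rho, inv_f1, inv_f2, rng):
    """(X, Y) pairs with given marginal quantile functions joined by a
    Gaussian copula with correlation rho."""
    out = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        out.append((inv_f1(norm_cdf(z1)), inv_f2(norm_cdf(z2))))
    return out

def pearson(u, v):
    """Sample Pearson correlation."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (su * sv)

rng = random.Random(0)
inv_exp = lambda u: -math.log(1.0 - u)   # Exp(1) quantile function
inv_unif = lambda u: u                   # Uniform(0, 1) quantile function
xy = sample_gaussian_copula(5000, 0.8, inv_exp, inv_unif, rng)
r = pearson([a for a, _ in xy], [b for _, b in xy])
```

Estimation reverses this construction; the note's point is that estimating the copula parameter and the marginal parameters jointly, rather than in two stages, lets the model selection criteria account for their full posterior dependence.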


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号