Similar articles
 20 similar articles found (search time: 0 ms)
1.
Measurement error is a commonly addressed problem in psychometrics and the behavioral sciences, particularly where gold standard data either do not exist or are too expensive. The Bayesian approach can be utilized to adjust for the bias that results from measurement error in tests. Bayesian methods offer other practical advantages for the analysis of epidemiological data, including the possibility of incorporating relevant prior scientific information and the ability to make inferences that do not rely on large sample assumptions. In this paper we consider a logistic regression model where both the response and a binary covariate are subject to misclassification. We assume both a continuous measure and a binary diagnostic test are available for the response variable but no gold standard test is assumed available. We consider a fully Bayesian analysis that affords such adjustments, accounting for the sources of error and correcting estimates of the regression parameters. Based on the results from our example and simulations, the models that account for misclassification produce more statistically significant results than the models that ignore misclassification. A real data example on math disorders is considered.

2.
Feature selection arises in many areas of modern science. For example, in genomic research, we want to find the genes that can be used to separate tissues of different classes (e.g. cancer and normal). One approach is to fit regression/classification models with certain penalization. In the past decade, hyper-LASSO penalization (priors) has received increasing attention in the literature. However, fully Bayesian methods that use Markov chain Monte Carlo (MCMC) for regression/classification with hyper-LASSO priors are still underdeveloped. In this paper, we introduce an MCMC method for learning multinomial logistic regression with hyper-LASSO priors. Our MCMC algorithm uses Hamiltonian Monte Carlo in a restricted Gibbs sampling framework. We have used simulation studies and real data to demonstrate the superior performance of hyper-LASSO priors compared to LASSO, and to investigate the issues of choosing heaviness and scale of hyper-LASSO priors.

3.
This paper proposes a generalized logistic regression model that can account for the correlation among responses on subunits. The subunits may arise as data on multiple observations within an individual. This method generalizes earlier work by Rosner (1984a,b) and others. Methodological generalizations include: (1) the use of the more general Polya-Eggenberger distribution instead of the beta-binomial distribution to model the correlation structure, so that cases with negative, positive, or zero intraclass correlation can be handled; (2) a stepwise approach; (3) linear and non-linear regression; and (4) the inclusion of the case of a truncated distribution. The model can accommodate missing data and covariates on the unit and subunit level. The derivative-free simplex algorithm is used to estimate the parameters.

The model is applied to data describing the progression of obstruction in coronary disease where multiple arterial segments are studied for each patient. The correlation in response that may exist for these multiple segments is accounted for in the analyses while attempting to examine associations with individual-specific (e.g., history of diabetes) and segment-specific (e.g., initial percent stenosis) covariates. Analyses were performed on a data set describing 382 patients with unoperated coronary artery disease and two coronary angiograms separated by at least one month and on a data set describing 284 patients undergoing percutaneous transluminal coronary angioplasty and studied by coronary angiograms.

4.
This paper presents a Bayesian technique for the estimation of a logistic regression model including variable selection. As in Ou & Penman (1989), the model is used to predict the direction of company earnings, one year ahead, from a large set of accounting variables from financial statements. To estimate the model, the paper presents a Markov chain Monte Carlo sampling scheme that includes the variable selection technique of Smith & Kohn (1996) and the non-Gaussian estimation method of Mira & Tierney (2001). The technique is applied to data for companies in the United States and Australia. The results obtained compare favourably to the technique used by Ou & Penman (1989) for both regions.

5.
Generalized linear models now have many applications. Among those with the widest real-world use are models with random effects, in which some of the unknown parameters are treated as random variables. In this article, this situation is considered in logistic regression models with a random intercept having an exponential distribution. The aim is to obtain the Bayesian D-optimal design; thus, the method is to maximize the Bayesian D-optimal criterion. For the model considered here, this criterion is a function of the quasi-information matrix, which depends on the unknown parameters of the model. In the Bayesian D-optimal criterion, the expectation is taken with respect to the prior distributions assumed for the unknown parameters. Thus, the criterion is a function only of the experimental settings (support points) and their weights. Uniform and normal prior distributions are considered for the fixed parameters. The Bayesian D-optimal design is finally calculated numerically using R 3.1.1.

6.
We develop a Bayesian variable selection method for logistic regression models that can simultaneously accommodate qualitative covariates and interaction terms under various heredity constraints. We use expectation-maximization variable selection (EMVS) with a deterministic annealing variant as the platform for our method, due to its proven flexibility and efficiency. We propose a variance adjustment of the priors for the coefficients of qualitative covariates, which controls false-positive rates, and a flexible parameterization for interaction terms, which accommodates user-specified heredity constraints. This method can handle all pairwise interaction terms as well as a subset of specific interactions. Using simulation, we show that this method selects associated covariates better than the grouped LASSO and the LASSO with heredity constraints in various exploratory research scenarios encountered in epidemiological studies. We apply our method to identify genetic and non-genetic risk factors associated with smoking experimentation in a cohort of Mexican-heritage adolescents.

7.
Estimating the risk factors of a disease such as diabetic retinopathy (DR) is one of the important research problems among bio-medical and statistical practitioners as well as epidemiologists. Many studies have focused on building models with binary outcomes, which may not exploit all the available information. This article investigates the importance of retaining the ordinal nature of the response variable (e.g. severity level of a disease) while determining the risk factors associated with DR. A generalized linear model approach with appropriate link functions has been studied using both Classical and Bayesian frameworks. The results of this study suggest that ordinal logistic regression with a probit link function could be a more appropriate approach for determining the risk factors of DR. The study emphasizes ways to handle the ordinal nature of the response variable that give a better model fit than other link functions.

8.
We propose alternative approaches to analyze residuals in binary regression models based on random effect components. Our preferred model does not depend upon any tuning parameter, being completely automatic. Although the focus is mainly on accommodation of outliers, the proposed methodology is also able to detect them. Our approach consists of evaluating the posterior distribution of random effects included in the linear predictor. The evaluation of the posterior distributions of interest involves cumbersome integration, which is easily dealt with through stochastic simulation methods. We also discuss different specifications of prior distributions for the random effects. The potential of these strategies is compared in a real data set. The main finding is that the inclusion of extra variability accommodates the outliers, improving the fit of the model substantially, besides correctly indicating the possible outliers.

9.
This paper adopts a Bayesian strategy for generalized ridge estimation for high-dimensional regression. We also consider significance testing based on the proposed estimator, which is useful for selecting regressors. Both theoretical and simulation studies show that the proposed estimator can simultaneously outperform the ordinary ridge estimator and the LSE in terms of the mean square error (MSE) criterion. The simulation study also demonstrates the competitive MSE performance of our proposal with the Lasso under sparse models. We demonstrate the method using the lung cancer data involving high-dimensional microarrays.

10.
In this paper, we consider the full rank multivariate regression model with matrix elliptically contoured distributed errors. We formulate a conjugate prior distribution for matrix elliptical models and derive the posterior distributions of mean and scale matrices. In the sequel, some characteristics of regression matrix parameters are also proposed.

11.
Bayesian selection of variables is often difficult to carry out because of the challenges of specifying prior distributions for the regression parameters for all possible models, specifying a prior distribution on the model space, and performing the computations. We address these three issues for the logistic regression model. For the first, we propose an informative prior distribution for variable selection. Several theoretical and computational properties of the prior are derived and illustrated with several examples. For the second, we propose a method for specifying an informative prior on the model space, and for the third we propose novel methods for computing the marginal distribution of the data. The new computational algorithms only require Gibbs samples from the full model to facilitate the computation of the prior and posterior model probabilities for all possible models. Several properties of the algorithms are also derived. The prior specification for the first challenge focuses on the observables in that the elicitation is based on a prior prediction y_0 for the response vector and a quantity a_0 quantifying the uncertainty in y_0. Then, y_0 and a_0 are used to specify a prior for the regression coefficients semi-automatically. Examples using real data are given to demonstrate the methodology.

12.
The paper provides a Bayesian analysis for the zero-inflated regression models based on the generalized power series distribution. The approach is based on Markov chain Monte Carlo methods. The residual analysis is discussed and case-deletion influence diagnostics are developed for the joint posterior distribution, based on the ψ-divergence, which includes several divergence measures such as the Kullback–Leibler, J-distance, L1 norm, and χ-square in zero-inflated general power series models. The methodology is illustrated with a data set collected by wildlife biologists in a state park in California.

13.
In modeling defect counts collected from an established manufacturing process, there is usually a relatively large number of zeros (non-defects). Commonly used models such as the Poisson or Geometric distributions can underestimate the zero-defect probability and hence make it difficult to identify significant covariate effects to improve production quality. This article introduces a flexible class of zero-inflated models which includes other familiar models, such as the Zero Inflated Poisson (ZIP) models, as special cases. A Bayesian estimation method is developed as an alternative to traditionally used maximum likelihood based methods to analyze such data. Simulation studies show that the proposed method has better finite sample performance than the classical method, with tighter interval estimates and better coverage probabilities. A real-life data set is analyzed to illustrate the practicability of the proposed method, which is easily implemented using WinBUGS.
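
As a rough illustration of why a plain Poisson model can understate the zero-defect probability, the sketch below compares P(Y = 0) under a Poisson model and under the standard ZIP model, in which a count is a structural zero with probability pi and otherwise Poisson(lam). The function names and the values lam = 2.0 and pi = 0.3 are assumptions chosen for the example; they do not come from the article, and the article's more general class of models is not reproduced here.

```python
import math


def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson(lam) count."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """P(Y = k) under a zero-inflated Poisson model.

    With probability pi the count is a structural zero;
    otherwise it is drawn from Poisson(lam).
    """
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


# Hypothetical parameter values, for illustration only.
lam, pi = 2.0, 0.3
p0_poisson = poisson_pmf(0, lam)
p0_zip = zip_pmf(0, lam, pi)
print("P(Y=0), Poisson:", p0_poisson)
print("P(Y=0), ZIP    :", p0_zip)
```

For any pi > 0 the ZIP zero probability, pi + (1 - pi) * exp(-lam), exceeds the Poisson value exp(-lam), which is the excess of zeros the abstract refers to.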

14.
We study objective Bayesian inference for linear regression models with residual errors distributed according to the class of two-piece scale mixtures of normal distributions. These models allow for capturing departures from the usual assumption of normality of the errors in terms of heavy tails, asymmetry, and certain types of heteroscedasticity. We propose a general non-informative, scale-invariant prior structure and provide sufficient conditions for the propriety of the posterior distribution of the model parameters, which cover cases when the response variables are censored. These results allow us to apply the proposed models in the context of survival analysis. This paper represents an extension to the Bayesian framework of the models proposed in [16]. We present a simulation study that shows good frequentist properties of the posterior credible intervals as well as point estimators associated with the proposed priors. We illustrate the performance of these models with real data in the context of survival analysis of cancer patients.

15.
The randomized response technique (RRT) is an important tool that is commonly used to protect a respondent’s privacy and avoid biased answers in surveys on sensitive issues. In this work, we consider the joint use of the unrelated-question RRT of Greenberg et al. (J Am Stat Assoc 64:520–539, 1969) and the related-question RRT of Warner (J Am Stat Assoc 60:63–69, 1965) dealing with the issue of an innocuous question from the unrelated-question RRT. Unlike the existing unrelated-question RRT of Greenberg et al. (1969), the approach can provide more information on the innocuous question by using the related-question RRT of Warner (1965) to effectively improve the efficiency of the maximum likelihood estimator of Scheers and Dayton (J Am Stat Assoc 83:969–974, 1988). We can then estimate the prevalence of the sensitive characteristic by using logistic regression. In this new design, we propose the transformation method and provide large-sample properties. From the case of two survey studies, an extramarital relationship study and a cable TV study, we develop the joint conditional likelihood method. As part of this research, we conduct a simulation study of the relative efficiencies of the proposed methods. Furthermore, we use the two survey studies to compare the analysis results under different scenarios.
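
For orientation, the following minimal sketch shows the basic moment estimator for Warner's related-question design on its own, not the joint design proposed in the article. It assumes each respondent is handed the sensitive statement with known probability p and its complement otherwise, so that P(yes) = p * pi + (1 - p) * (1 - pi). The function name and the survey counts are hypothetical.

```python
def warner_estimate(n_yes, n, p):
    """Moment estimate of the sensitive prevalence pi under Warner's design.

    n_yes : number of "yes" answers observed
    n     : number of respondents
    p     : known probability that a respondent receives the sensitive
            statement (must differ from 0.5, or pi is not identifiable)
    Returns (pi_hat, var_hat), where var_hat is a plug-in variance estimate.
    """
    if p == 0.5:
        raise ValueError("p = 0.5 makes the prevalence unidentifiable")
    lam_hat = n_yes / n  # observed proportion of "yes" answers
    pi_hat = (lam_hat - (1.0 - p)) / (2.0 * p - 1.0)
    var_hat = lam_hat * (1.0 - lam_hat) / (n * (2.0 * p - 1.0) ** 2)
    return pi_hat, var_hat


# Hypothetical survey: 380 "yes" answers out of 1000, with p = 0.7.
pi_hat, var_hat = warner_estimate(380, 1000, 0.7)
print("estimated prevalence:", pi_hat)
print("plug-in variance    :", var_hat)
```

The factor (2p - 1)^2 in the denominator of the variance shows the cost of privacy protection: the closer p is to 0.5, the less information each answer carries.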

16.
The variance of the Maximum Likelihood Estimator (MLE) of the slope parameter in a logistic regression model becomes large as the degree of collinearity among the explanatory variables increases. In a Monte Carlo study, we observed that a ridge type estimator is at least as good as, and often much better than, the MLE in terms of Total and Prediction Mean Squared Error criteria. Using a set of medical data it is illustrated that the ridge trace of the estimator considered here is a useful diagnostic tool in logistic regression analysis.
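
The abstract does not define its ridge type estimator, so the sketch below substitutes a common stand-in: logistic regression fitted by Newton-Raphson with an L2 penalty (k / 2) * ||slopes||^2 on the slope coefficients. It is meant only to show how a penalty shrinks the slopes when predictors are nearly collinear. The function name, the simulated data, and the penalty values are all assumptions made for the example.

```python
import numpy as np


def fit_ridge_logistic(X, y, k=1.0, n_iter=100, tol=1e-10):
    """Newton-Raphson fit of logistic regression with an L2 penalty on the slopes.

    Maximizes loglik(beta) - (k / 2) * sum(beta[1:] ** 2); the intercept
    beta[0] is left unpenalized. Returns the coefficients, intercept first.
    """
    n, p = X.shape
    Xd = np.column_stack([np.ones(n), X])  # prepend intercept column
    penalty = k * np.eye(p + 1)
    penalty[0, 0] = 0.0                    # do not shrink the intercept
    beta = np.zeros(p + 1)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(Xd @ beta)))   # fitted probabilities
        w = mu * (1.0 - mu)                        # Bernoulli variances
        grad = Xd.T @ (y - mu) - penalty @ beta    # penalized score
        hess = Xd.T @ (Xd * w[:, None]) + penalty  # penalized information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Hypothetical demo data: two nearly collinear predictors.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.05 * rng.normal(size=200)
X = np.column_stack([x1, x2])
prob = 1.0 / (1.0 + np.exp(-(0.5 + x1 + x2)))
y = rng.binomial(1, prob)

beta_light = fit_ridge_logistic(X, y, k=0.1)
beta_heavy = fit_ridge_logistic(X, y, k=10.0)
print("slope norm, k = 0.1:", np.linalg.norm(beta_light[1:]))
print("slope norm, k = 10 :", np.linalg.norm(beta_heavy[1:]))
```

Refitting over a grid of k values and plotting each coefficient against k gives a ridge trace of the kind the abstract describes as a diagnostic tool.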

17.
Many statistical procedures are based on models that specify the conditions under which the data are generated. Many applications of linear regression, for example, assume that: (i) the observations are independent; (ii) the errors in the observations are identically distributed; (iii) each error has a normal distribution with mean zero and unknown variance σ² > 0. Previous works have examined individual departures from these assumptions. Here we examine composite departures. It is assumed that the error distribution in a linear model is power-exponential and that the observations are generated via a first order autoregressive model with the possibility of spurious observations. The consequences are illustrated via an example.

18.
Much research has been performed in the area of multiple linear regression, with the result that the field is well-developed. This is not true of logistic regression, however. The latter presents special problems because the response is not continuous. Some of these problems are: the difficulty of developing a suitable R² statistic, possibly poor results produced by the method of maximum likelihood, and the challenge of developing suitable graphical techniques. We describe recent work in some of these directions, and discuss the need for additional research.

19.
In this paper, we study statistical inference based on the Bayesian approach for regression models under the assumption that independent additive errors follow a normal, Student-t, slash, contaminated normal, Laplace or symmetric hyperbolic distribution, where both the location and dispersion parameters of the response variable distribution include nonparametric additive components approximated by B-splines. This class of models provides a rich set of symmetric distributions for the model error. Some of these distributions have heavier or lighter tails than the normal as well as different levels of kurtosis. In order to draw samples from the posterior distribution of the parameters of interest, we propose an efficient Markov Chain Monte Carlo (MCMC) algorithm, which combines the Gibbs sampler and Metropolis–Hastings algorithms. The performance of the proposed MCMC algorithm is assessed through simulation experiments. We apply the proposed methodology to a real data set. The proposed methodology is implemented in the R package BayesGESM using the function gesm().

20.
In this article, we develop a Bayesian variable selection method that concerns selection of covariates in the Poisson change-point regression model with both discrete and continuous candidate covariates. Ranging from a null model with no selected covariates to a full model including all covariates, the Bayesian variable selection method searches the entire model space, estimates posterior inclusion probabilities of covariates, and obtains model-averaged estimates of the covariate coefficients, while simultaneously estimating a time-varying baseline rate due to change-points. For posterior computation, a Metropolis-Hastings within partially collapsed Gibbs sampler is developed to efficiently fit the Poisson change-point regression model with variable selection. We illustrate the proposed method using simulated and real datasets.
