Similar Documents
20 similar documents found (search time: 31 ms)
1.
Beta Regression for Modelling Rates and Proportions   (Total citations: 9; self-citations: 0; citations by others: 9)
This paper proposes a regression model where the response is beta distributed using a parameterization of the beta law that is indexed by mean and dispersion parameters. The proposed model is useful for situations where the variable of interest is continuous and restricted to the interval (0, 1) and is related to other variables through a regression structure. The regression parameters of the beta regression model are interpretable in terms of the mean of the response and, when the logit link is used, of an odds ratio, unlike the parameters of a linear regression that employs a transformed response. Estimation is performed by maximum likelihood. We provide closed-form expressions for the score function, for Fisher's information matrix and its inverse. Hypothesis testing is performed using approximations obtained from the asymptotic normality of the maximum likelihood estimator. Some diagnostic measures are introduced. Finally, practical applications that employ real data are presented and discussed.
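The mean-precision parameterization described above (mean μ through a logit link, precision φ) is straightforward to fit by maximum likelihood. A minimal sketch on simulated data, with all coefficient values hypothetical (an illustration of the parameterization, not the paper's own code):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import beta as beta_dist

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(-1, 1, n)
X = np.column_stack([np.ones(n), x])
true_beta = np.array([0.2, 1.0])   # hypothetical regression coefficients
phi_true = 30.0                    # precision (inverse-dispersion) parameter

mu = 1 / (1 + np.exp(-X @ true_beta))            # logit link for the mean
y = rng.beta(mu * phi_true, (1 - mu) * phi_true)  # Beta(mu*phi, (1-mu)*phi)

def neg_loglik(theta):
    b, log_phi = theta[:2], theta[2]
    phi = np.exp(log_phi)                        # keep the precision positive
    m = 1 / (1 + np.exp(-X @ b))
    return -np.sum(beta_dist.logpdf(y, m * phi, (1 - m) * phi))

res = minimize(neg_loglik, x0=np.zeros(3), method="Nelder-Mead")
beta_hat, phi_hat = res.x[:2], np.exp(res.x[2])
```

Parameterizing log φ rather than φ is just a convenience to keep the precision positive during unconstrained optimization.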

2.
When using the co-twin control design for analysis of event times, one needs a model to address the possible within-pair association. One such model is the shared frailty model, in which the random frailty variable creates the desired within-pair association. Standard inference for this model requires independence between the random effect and the covariates. We study how violations of this assumption affect inference for the regression coefficients and conclude that substantial bias may occur. We propose an alternative way of making inference for the regression parameters by using a fixed-effects model for survival in matched pairs. Fitting this model to data generated from the frailty model provides consistent and asymptotically normal estimates of the regression coefficients, whether or not the independence assumption is met.
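The fixed-effects alternative amounts to a Cox model stratified on pairs, so the pair-specific (frailty) term cancels from every pair's partial-likelihood contribution. A toy sketch under an exponential-hazard simulation (gamma frailty independent of the covariate here; all values hypothetical):

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(5)
m = 2000                                    # number of twin pairs
frailty = rng.gamma(2.0, 0.5, size=m)       # shared within-pair frailty
x = rng.normal(size=(m, 2))                 # one covariate value per twin
beta_true = 0.7                             # hypothetical effect size
# exponential event times with hazard = frailty * exp(beta * x)
t = rng.exponential(1.0 / (frailty[:, None] * np.exp(beta_true * x)))

# Stratified Cox partial likelihood, each pair its own stratum: only the
# within-pair order of failure is informative, and the frailty cancels.
first = np.argmin(t, axis=1)                # index of the twin failing first
x_first = x[np.arange(m), first]

def neg_partial_loglik(b):
    return -np.sum(b * x_first - np.logaddexp(b * x[:, 0], b * x[:, 1]))

res = minimize_scalar(neg_partial_loglik, bounds=(-5.0, 5.0), method="bounded")
```

Each pair contributes exp(βx_first) / (exp(βx₁) + exp(βx₂)), which no longer involves the frailty, which is why this route sidesteps the independence assumption.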

3.
This paper surveys various shrinkage, smoothing and selection priors from a unifying perspective and shows how to combine them for Bayesian regularisation in the general class of structured additive regression models. As a common feature, all regularisation priors are conditionally Gaussian, given further parameters regularising model complexity. Hyperpriors for these parameters encourage shrinkage, smoothness or selection. It is shown that these regularisation (log-) priors can be interpreted as Bayesian analogues of several well-known frequentist penalty terms. Inference can be carried out with unified and computationally efficient MCMC schemes, estimating regularised regression coefficients and basis function coefficients simultaneously with complexity parameters and measuring uncertainty via corresponding marginal posteriors. For variable and function selection we discuss several variants of spike and slab priors which can also be cast into the framework of conditionally Gaussian priors. The performance of the Bayesian regularisation approaches is demonstrated in a hazard regression model and a high-dimensional geoadditive regression model.

4.
We consider a partially linear model with a diverging number of groups of parameters in the parametric component. Variable selection and estimation of the regression coefficients are achieved simultaneously by using a suitable penalty function for covariates in the parametric component. An MM-type algorithm for estimating parameters without inverting a high-dimensional matrix is proposed. The consistency and sparsity of the penalized least-squares estimators of the regression coefficients are discussed under the setting of some nonzero regression coefficients with very small values. It is found that the √(pn/n)-consistency and sparsity of the penalized least-squares estimators cannot both be attained when the number of nonzero regression coefficients with very small values is unknown, where pn and n denote the number of regression coefficients and the sample size, respectively. The finite-sample behaviour of the penalized least-squares estimators and the performance of the proposed algorithm are studied through simulation studies and a real data example.

5.
For a linear regression model over m populations with separate regression coefficients but a common error variance, a Bayesian model is employed to obtain regression coefficient estimates which are shrunk toward an overall value. The formulation uses Normal priors on the coefficients and diffuse priors on the grand mean vectors, the error variance, and the between-to-error variance ratios. The posterior density of the parameters which were given diffuse priors is obtained. From this the posterior means and variances of regression coefficients and the predictive mean and variance of a future observation are obtained directly by numerical integration in the balanced case, and with the aid of series expansions in the approximately balanced case. An example is presented and worked out for the case of one predictor variable. The method is an extension of Box & Tiao's Bayesian estimation of means in the balanced one-way random effects model.

6.
Quantile regression has gained increasing popularity as it provides richer information than regular mean regression, and variable selection plays an important role in the quantile regression model-building process, as it improves prediction accuracy by choosing an appropriate subset of regression predictors. Unlike traditional quantile regression, we consider the quantile as an unknown parameter and estimate it jointly with the other regression coefficients. In particular, we adopt the Bayesian adaptive Lasso for maximum entropy quantile regression. A flat prior is chosen for the quantile parameter due to the lack of information on it. The proposed method not only addresses the question of which quantile is the most probable among all candidates, but also reflects the inner relationship of the data through the estimated quantile. We develop an efficient Gibbs sampler algorithm and show that the performance of the proposed method is superior to the Bayesian adaptive Lasso and Bayesian Lasso through simulation studies and a real data analysis.
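For context, the likelihood behind Bayesian quantile regression is the asymmetric Laplace, whose maximization at a fixed quantile τ is equivalent to minimizing the classical check loss. A minimal frequentist sketch of that building block on simulated data (the full Bayesian adaptive Lasso machinery is not reproduced here):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n = 400
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(size=n)   # hypothetical simulated data

def check_loss(beta, tau):
    """Koenker-Bassett check loss; its negative is, up to constants,
    the asymmetric Laplace log-likelihood at quantile tau."""
    u = y - X @ beta
    return np.sum(u * (tau - (u < 0)))

tau = 0.5                                # median regression
res = minimize(lambda b: check_loss(b, tau), x0=np.zeros(2),
               method="Nelder-Mead")
```

Treating τ itself as unknown, as the abstract proposes, corresponds to placing a prior on τ and letting the data weigh the check-loss objectives across candidate quantiles.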

7.
For heteroscedastic simple linear regression when the variances are proportional to a power of the mean of the response variable, Miller (1986) recommends the following procedure: do a weighted least squares regression with the weights (empirical weights) estimated by the inverse of the appropriate power of the response variable. The practical appeal of this approach is its simplicity.

In this article some of the consequences of this simple procedure are considered. Specifically, the effect of this procedure on the bias of the point estimators of the regression coefficients and on the coverage probabilities of their corresponding confidence intervals is examined. It is found that the performance of weighted least squares regression with empirical weights depends on: (1) the particular regression parameter (slope or intercept) of interest, (2) the appropriate power of the mean of the response variable involved, and (3) the amount of variation in the data about the true regression line.
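A compact sketch of the empirical-weights procedure on simulated data with variance proportional to the squared mean (all numbers hypothetical). Because the weights 1/y^power are computed from the noisy responses rather than their means, they are correlated with the errors, which is the source of the bias the article examines:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x = rng.uniform(1, 10, n)
mu = 2.0 + 3.0 * x
y = mu + rng.normal(scale=0.2 * mu)    # std. deviation proportional to mean

power = 2                              # variance ∝ mean^2 in this example
w = 1.0 / y**power                     # empirical weights from the responses

X = np.column_stack([np.ones(n), x])
sw = np.sqrt(w)
# weighted least squares via OLS on square-root-weighted data
beta_hat, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
```

Replacing `w` with the ideal weights 1/mu**power (unavailable in practice) removes the weight-error correlation and with it the bias.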

8.
Bounds for the maximum deviation between parameters of a finite population and their corresponding sample estimates are found in the multiple regression model. The parameters considered are the vector of regression coefficients and the value of the regression function for given values of the independent variable (or variables). Applications to several widely employed sampling methods are considered.

9.
A general methodology is presented for finding suitable Poisson log-linear models with applications to multiway contingency tables. Mixtures of multivariate normal distributions are used to model prior opinion when a subset of the regression vector is believed to be nonzero. This prior distribution is studied for two- and three-way contingency tables, in which the regression coefficients are interpretable in terms of odds ratios in the table. Efficient and accurate schemes are proposed for calculating the posterior model probabilities. The methods are illustrated for a large number of two-way simulated tables and for two three-way tables. These methods appear to be useful in selecting the best log-linear model and in estimating parameters of interest that reflect uncertainty in the true model.

10.
The authors consider the problem of Bayesian variable selection for proportional hazards regression models with right censored data. They propose a semi-parametric approach in which a nonparametric prior is specified for the baseline hazard rate and a fully parametric prior is specified for the regression coefficients. For the baseline hazard, they use a discrete gamma process prior, and for the regression coefficients and the model space, they propose a semi-automatic parametric informative prior specification that focuses on the observables rather than the parameters. To implement the methodology, they propose a Markov chain Monte Carlo method to compute the posterior model probabilities. Examples using simulated and real data are given to demonstrate the methodology.

11.
In regression analysis, it is assumed that the response (or dependent variable) distribution is Normal and that the errors are homoscedastic and uncorrelated. However, in practice, these assumptions are rarely satisfied by a real data set. To stabilize a heteroscedastic response variance, a log-transformation is generally suggested; the response distribution then moves closer to the Normal distribution, and the model fit improves. In practice, however, a seemingly suitable transformation may neither stabilize the variance nor reduce the response distribution to Normal. The present article assumes that the response distribution is log-normal with compound autocorrelated errors. Under these conditions, estimation and testing of hypotheses regarding the regression parameters are derived. From a set of reduced data, we derive the best linear unbiased estimators of all the regression coefficients, except the intercept, which is often unimportant in practice. Unknown correlation parameters are estimated. In this connection, we derive a test rule for testing any set of linear hypotheses on the unknown regression coefficients, and we develop confidence ellipsoids for a set of estimable functions of the regression coefficients. For the fitted regression equation, an index of fit is proposed. A simulation study illustrates the derived results.

12.
In this paper we present a method for performing regression with stable disturbances. The method of maximum likelihood is used to estimate both distribution and regression parameters. Our approach utilises a numerical integration procedure to calculate the stable density, followed by sequential quadratic programming optimisation to obtain estimates and standard errors. A theoretical justification for the use of stable law regression is given, followed by two practical real-world examples of the method. First, we fit the stable law multiple regression model to housing price data and examine how the results differ from normal linear regression. Second, we calculate the beta coefficients for 26 companies from the Financial Times Ordinary Shares Index.

13.
The authors consider the problem of simultaneous transformation and variable selection for linear regression. They propose a fully Bayesian solution to the problem, which allows averaging over all models considered including transformations of the response and predictors. The authors use the Box‐Cox family of transformations to transform the response and each predictor. To deal with the change of scale induced by the transformations, the authors propose to focus on new quantities rather than the estimated regression coefficients. These quantities, referred to as generalized regression coefficients, have a similar interpretation to the usual regression coefficients on the original scale of the data, but do not depend on the transformations. This allows probabilistic statements about the size of the effect associated with each variable, on the original scale of the data. In addition to variable and transformation selection, there is also uncertainty involved in the identification of outliers in regression. Thus, the authors also propose a more robust model to account for such outliers based on a t‐distribution with unknown degrees of freedom. Parameter estimation is carried out using an efficient Markov chain Monte Carlo algorithm, which permits moves around the space of all possible models. Using three real data sets and a simulated study, the authors show that there is considerable uncertainty about variable selection, choice of transformation, and outlier identification, and that there is advantage in dealing with all three simultaneously. The Canadian Journal of Statistics 37: 361–380; 2009 © 2009 Statistical Society of Canada

14.
A Bayesian elastic net approach is presented for variable selection and coefficient estimation in linear regression models. A simple Gibbs sampling algorithm is developed for posterior inference using a location-scale mixture representation of the Bayesian elastic net prior for the regression coefficients. The penalty parameters are chosen through an empirical method that maximizes the data marginal likelihood. Both simulated and real data examples show that the proposed method performs well in comparison to other approaches.

15.
Quantile regression (QR) is a natural alternative for depicting the impact of covariates on the conditional distribution of an outcome variable, rather than on its mean. In this paper, we investigate Bayesian regularized QR for linear models with autoregressive errors. LASSO-type penalty priors are imposed on the regression coefficients and the autoregressive parameters of the model. A Gibbs sampler algorithm is employed to draw from the full posterior distributions of the unknown parameters. Finally, the proposed procedures are illustrated by simulation studies and applied to a real data analysis of electricity consumption.

16.
This paper is concerned with the selection of explanatory variables in generalized linear models (GLMs). The class of GLMs is quite large and includes, for example, ordinary linear regression, binary logistic regression, the probit model, and Poisson regression with a linear or log-linear parameter structure. We show that, through an approximation of the log-likelihood and a certain data transformation, the variable selection problem in a GLM can be converted into variable selection in an ordinary (unweighted) linear regression model. As a consequence, no specific computer software for variable selection in GLMs is needed; instead, any suitable variable selection program for linear regression can be used. We also present a simulation study which shows that the log-likelihood approximation is very good in many practical situations. Finally, we briefly mention possible extensions to regression models outside the class of GLMs.
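The data transformation in question is closely related to the IRLS working response and weights: at the full-model fit, premultiplying by the square-root weights turns the GLM fitting problem into an ordinary least-squares problem, on which standard linear-regression subset-selection routines can operate. A numpy sketch for logistic regression on simulated data (the paper's exact approximation may differ in detail):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 500, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
true_b = np.array([0.5, 1.5, 0.0, -1.0, 0.0])   # hypothetical; two null predictors
y = rng.binomial(1, 1 / (1 + np.exp(-X @ true_b)))

# Fit the full logistic model by IRLS (Fisher scoring)
b = np.zeros(p + 1)
for _ in range(25):
    mu = 1 / (1 + np.exp(-X @ b))
    w = mu * (1 - mu)                  # GLM working weights
    z = X @ b + (y - mu) / w           # working (adjusted) response
    b = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * z))

# The transformation: square-root weights turn the weighted problem into
# ordinary least squares on (X_t, z_t), ready for any standard
# linear-regression variable selection routine.
sw = np.sqrt(w)
X_t, z_t = X * sw[:, None], z * sw
ols_b, *_ = np.linalg.lstsq(X_t, z_t, rcond=None)
```

At convergence, the OLS fit on the transformed data reproduces the GLM maximum likelihood estimate exactly, which is what makes ordinary variable selection software applicable to the transformed problem.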

17.

In some clinical, environmental, or economic studies, researchers are interested in a semi-continuous outcome variable which takes the value zero with a discrete probability and has a continuous distribution for the non-zero values. Due to the measuring mechanism, it is not always possible to fully observe some outcomes, and only an upper bound is recorded. We call this left-censored data: we observe only the maximum of the outcome and an independent censoring variable, together with an indicator. In this article, we introduce a mixture semi-parametric regression model. We consider a parametric model to investigate the influence of covariates on the discrete probability of the value zero. For the non-zero part of the outcome, a semi-parametric Cox regression model is used to study the conditional hazard function. The different parameters in this mixture model are estimated using a likelihood method, with the infinite-dimensional baseline hazard function estimated by a step function. We establish the identifiability and consistency of the estimators for the different parameters in the model, study the finite-sample behaviour of the estimators through a simulation study, and illustrate the model on a practical data example.

18.

In this paper, we study robust variable selection and parametric-component identification simultaneously in varying coefficient models. The proposed estimator is based on spline approximation and two smoothly clipped absolute deviation (SCAD) penalties through rank regression, which is robust with respect to heavy-tailed errors or outliers in the response. Furthermore, when the tuning parameter is chosen by a modified BIC criterion, we show that the proposed procedure is consistent both in variable selection and in the separation of varying and constant coefficients. In addition, the estimators of the varying coefficients possess the optimal convergence rate under some assumptions, and the estimators of the constant coefficients have the same asymptotic distribution as their counterparts obtained when the true model is known. Simulation studies and a real data example are undertaken to assess the finite-sample performance of the proposed variable selection procedure.

19.
In this paper, we present a statistical inference procedure for the step-stress accelerated life testing (SSALT) model with a Weibull failure time distribution and interval censoring, via the formulation of a generalized linear model (GLM). The likelihood function of an interval-censored SSALT is in general too complicated to yield analytical results. However, by transforming the failure time to an exponential distribution and using a binomial random variable for the failure counts occurring in inspection intervals, a GLM formulation with a complementary log-log link function can be constructed. The estimates of the regression coefficients used for the Weibull scale parameter are obtained through the iterative weighted least squares (IWLS) method, and the shape parameter is updated by direct maximum likelihood (ML) estimation. Confidence intervals for these parameters are estimated through bootstrapping. The application of the proposed GLM approach is demonstrated by an industrial example.
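The cloglog/binomial device can be seen in stripped-down form: for exponential(-transformed) failure times, the probability of failing within an inspection interval is 1 − exp(−exp(η)), i.e. a binomial response with a complementary log-log link. A toy numpy sketch with a single inspection interval per unit and one binary stress covariate (all names and values hypothetical; the paper's IWLS additionally updates the Weibull shape parameter):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000
stress = rng.integers(0, 2, size=n).astype(float)   # hypothetical stress level
X = np.column_stack([np.ones(n), stress])
true_b = np.array([-1.0, 0.8])                      # hypothetical coefficients
p_fail = 1 - np.exp(-np.exp(X @ true_b))            # inverse cloglog link
y = rng.binomial(1, p_fail).astype(float)           # fail within interval?

# IRLS (Fisher scoring) for a binomial GLM with a cloglog link
b = np.zeros(2)
for _ in range(50):
    eta = X @ b
    mu = 1 - np.exp(-np.exp(eta))
    dmu = np.exp(eta - np.exp(eta))                 # d mu / d eta
    w = dmu**2 / (mu * (1 - mu))                    # working weights
    z = eta + (y - mu) / dmu                        # working response
    b = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * z))
```

In the exponential parameterization, η is the log cumulative hazard over the interval, so the fitted coefficients act on the scale of the (transformed) Weibull scale parameter.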

20.
In this paper, we propose a class of general partially linear varying-coefficient transformation models for ranking data. In these models, the functional coefficients are viewed as nuisance parameters and approximated by B-spline smoothing. The B-spline coefficients and regression parameters are estimated by a rank-based maximum marginal likelihood method. A three-stage Markov chain Monte Carlo stochastic approximation algorithm based on ranking data is used to compute the estimates and the corresponding variances for all the B-spline coefficients and regression parameters. Through three simulation studies and an application to Hong Kong horse racing data, the proposed procedure is shown to be accurate, stable and practical.
