首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
In contrast to the common belief that the logit model has no analytical presentation, it is possible to find such a solution in the case of categorical predictors. This paper shows that a binary logistic regression by categorical explanatory variables can be constructed in a closed-form solution. No special software and no iterative procedures of nonlinear estimation are needed to obtain a model with all its parameters and characteristics, including coefficients of regression, their standard errors and t-statistics, as well as the residual and null deviances. The derivation is performed for logistic models with one binary or categorical predictor, and several binary or categorical predictors. The analytical formulae can be used for arithmetical calculation of all the parameters of the logit regression. The explicit expressions for the characteristics of logit regression are convenient for the analysis and interpretation of the results of logistic modeling.  相似文献   

2.
Models that involve an outcome variable, covariates, and latent variables are frequently the target for estimation and inference. The presence of missing covariate or outcome data presents a challenge, particularly when missingness depends on the latent variables. This missingness mechanism is called latent ignorable or latent missing at random and is a generalisation of missing at random. Several authors have previously proposed approaches for handling latent ignorable missingness, but these methods rely on prior specification of the joint distribution for the complete data. In practice, specifying the joint distribution can be difficult and/or restrictive. We develop a novel sequential imputation procedure for imputing covariate and outcome data for models with latent variables under latent ignorable missingness. The proposed method does not require a joint model; rather, we use results under a joint model to inform imputation with less restrictive modelling assumptions. We discuss identifiability and convergence‐related issues, and simulation results are presented in several modelling settings. The method is motivated and illustrated by a study of head and neck cancer recurrence. Imputing missing data for models with latent variables under latent‐dependent missingness without specifying a full joint model.  相似文献   

3.
McCullagh (1980) presented a comprehensive review of regression models for ordinal response variables. In these models, the functional relationship between the covariates and the response categories is dependent on the link function. This paper shows that discrimination between links is feasible when the response variable is ordinal. Using the log-gamma distribution of Prentice (1974), a generalized link function is constructed which allows discrimination between the probit, log-log, and complementary log-log links. Sample-size considerations are noted, and examples are presented.  相似文献   

4.
Semiparametric models provide a more flexible form for modeling the relationship between the response and the explanatory variables. On the other hand in the literature of modeling for the missing variables, canonical form of the probability of the variable being missing (p) is modeled taking a fully parametric approach. Here we consider a regression spline based semiparametric approach to model the missingness mechanism of nonignorably missing covariates. In this model the relationship between the suitable canonical form of p (e.g. probit p) and the missing covariate is modeled through several splines. A Bayesian procedure is developed to efficiently estimate the parameters. A computationally advantageous prior construction is proposed for the parameters of the semiparametric part. A WinBUGS code is constructed to apply Gibbs sampling to obtain the posterior distributions. We show through an extensive Monte Carlo simulation experiment that response model coefficent estimators maintain better (when the true missingness mechanism is nonlinear) or equivalent (when the true missingness mechanism is linear) bias and efficiency properties with the use of proposed semiparametric missingness model compared to the conventional model.  相似文献   

5.
Variable selection over a potentially large set of covariates in a linear model is quite popular. In the Bayesian context, common prior choices can lead to a posterior expectation of the regression coefficients that is a sparse (or nearly sparse) vector with a few nonzero components, those covariates that are most important. This article extends the “global‐local” shrinkage idea to a scenario where one wishes to model multiple response variables simultaneously. Here, we have developed a variable selection method for a K‐outcome model (multivariate regression) that identifies the most important covariates across all outcomes. The prior for all regression coefficients is a mean zero normal with coefficient‐specific variance term that consists of a predictor‐specific factor (shared local shrinkage parameter) and a model‐specific factor (global shrinkage term) that differs in each model. The performance of our modeling approach is evaluated through simulation studies and a data example.  相似文献   

6.
With competing risks data, one often needs to assess the treatment and covariate effects on the cumulative incidence function. Fine and Gray proposed a proportional hazards regression model for the subdistribution of a competing risk with the assumption that the censoring distribution and the covariates are independent. Covariate‐dependent censoring sometimes occurs in medical studies. In this paper, we study the proportional hazards regression model for the subdistribution of a competing risk with proper adjustments for covariate‐dependent censoring. We consider a covariate‐adjusted weight function by fitting the Cox model for the censoring distribution and using the predictive probability for each individual. Our simulation study shows that the covariate‐adjusted weight estimator is basically unbiased when the censoring time depends on the covariates, and the covariate‐adjusted weight approach works well for the variance estimator as well. We illustrate our methods with bone marrow transplant data from the Center for International Blood and Marrow Transplant Research. Here, cancer relapse and death in complete remission are two competing risks.  相似文献   

7.
Abstract.  We study a binary regression model using the complementary log–log link, where the response variable Δ is the indicator of an event of interest (for example, the incidence of cancer, or the detection of a tumour) and the set of covariates can be partitioned as ( X ,  Z ) where Z (real valued) is the primary covariate and X (vector valued) denotes a set of control variables. The conditional probability of the event of interest is assumed to be monotonic in Z , for every fixed X . A finite-dimensional (regression) parameter β describes the effect of X . We show that the baseline conditional probability function (corresponding to X  =  0 ) can be estimated by isotonic regression procedures and develop an asymptotically pivotal likelihood-ratio-based method for constructing (asymptotic) confidence sets for the regression function. We also show how likelihood-ratio-based confidence intervals for the regression parameter can be constructed using the chi-square distribution. An interesting connection to the Cox proportional hazards model under current status censoring emerges. We present simulation results to illustrate the theory and apply our results to a data set involving lung tumour incidence in mice.  相似文献   

8.
The multinomial logit model (MNL) is one of the most frequently used statistical models in marketing applications. It allows one to relate an unordered categorical response variable, for example representing the choice of a brand, to a vector of covariates such as the price of the brand or variables characterising the consumer. In its classical form, all covariates enter in strictly parametric, linear form into the utility function of the MNL model. In this paper, we introduce semiparametric extensions, where smooth effects of continuous covariates are modelled by penalised splines. A mixed model representation of these penalised splines is employed to obtain estimates of the corresponding smoothing parameters, leading to a fully automated estimation procedure. To validate semiparametric models against parametric models, we utilise different scoring rules as well as predicted market share and compare parametric and semiparametric approaches for a number of brand choice data sets.  相似文献   

9.
We are concerned with cumulative regression models for an ordered categorical response variable Y. We propose two methods to build partial residuals from regression on a subset Z1 of covariates Z., which take into regard the ordinal character of the response. The first method makes use of a multivariate GLM-representation of the model and produces residual measures for diagnostic purposes. The second uses a latent continuous variable model and yields new (adjusted) ordinal data Y*. Both methods are illustrated by a data set from forestry.  相似文献   

10.
A general framework is proposed for modelling clustered mixed outcomes. A mixture of generalized linear models is used to describe the joint distribution of a set of underlying variables, and an arbitrary function relates the underlying variables to be observed outcomes. The model accommodates multilevel data structures, general covariate effects and distinct link functions and error distributions for each underlying variable. Within the framework proposed, novel models are developed for clustered multiple binary, unordered categorical and joint discrete and continuous outcomes. A Markov chain Monte Carlo sampling algorithm is described for estimating the posterior distributions of the parameters and latent variables. Because of the flexibility of the modelling framework and estimation procedure, extensions to ordered categorical outcomes and more complex data structures are straightforward. The methods are illustrated by using data from a reproductive toxicity study.  相似文献   

11.
Incomplete covariate data is a common occurrence in many studies in which the outcome is survival time. With generalized linear models, when the missing covariates are categorical, a useful technique for obtaining parameter estimates is the EM by the method of weights proposed in Ibrahim (1990). In this article, we extend the EM by the method of weights to survival outcomes whose distributions may not fall in the class of generalized linear models. This method requires the estimation of the parameters of the distribution of the covariates. We present a clinical trials example with five covariates, four of which have some missing values.  相似文献   

12.
We introduce multicovariate-adjusted regression (MCAR), an adjustment method for regression analysis, where both the response (Y) and predictors (X 1, …, X p ) are not directly observed. The available data have been contaminated by unknown functions of a set of observable distorting covariates, Z 1, …, Z s , in a multiplicative fashion. The proposed method substantially extends the current contaminated regression modelling capability, by allowing for multiple distorting covariate effects. MCAR is a flexible generalisation of the recently proposed covariate-adjusted regression method, an effective adjustment method in the presence of a single covariate, Z. For MCAR estimation, we establish a connection between the MCAR models and adaptive varying coefficient models. This connection leads to an adaptation of a hybrid backfitting estimation algorithm. Extensive simulations are used to study the performance and limitations of the proposed iterative estimation algorithm. In particular, the bias and mean square error of the proposed MCAR estimators are examined, relative to a baseline and a consistent benchmark estimator. The method is also illustrated with a Pima Indian diabetes data set, where the response and predictors are potentially contaminated by body mass index and triceps skin fold thickness. Both distorting covariates measure aspects of obesity, an important risk factor in type 2 diabetes.  相似文献   

13.
In regression analyses of spatially structured data, it is common practice to introduce spatially correlated random effects into the regression model to reduce or even avoid unobserved variable bias in the estimation of other covariate effects. If besides the response the covariates are also spatially correlated, the spatial effects may confound the effect of the covariates or vice versa. In this case, the model fails to identify the true covariate effect due to multicollinearity. For highly collinear continuous covariates, path analysis and structural equation modeling techniques prove to be helpful to disentangle direct covariate effects from indirect covariate effects arising from correlation with other variables. This work discusses the applicability of these techniques in regression setups, where spatial and covariate effects coincide at least partly and classical geoadditive models fail to separate these effects. Supplementary materials for this article are available online.  相似文献   

14.
The proportional hazards mixed-effects model (PHMM) was proposed to handle dependent survival data. Motivated by its application in genetic epidemiology, we study the interpretation of its parameter estimates under violations of the proportional hazards assumption. The estimated fixed effect turns out to be an averaged regression effect over time, while the estimated variance component could be unaffected, inflated or attenuated depending on whether the random effect is on the baseline hazard, and whether the non-proportional regression effect decreases or increases over time. Using the conditional distribution of the covariates we define the standardized covariate residuals, which can be used to check the proportional hazards assumption. The model checking technique is illustrated on a multi-center lung cancer trial.  相似文献   

15.
We propose a test for state dependence in binary panel data with individual covariates. For this aim, we rely on a quadratic exponential model in which the association between the response variables is accounted for in a different way with respect to more standard formulations. The level of association is measured by a single parameter that may be estimated by a Conditional Maximum Likelihood (CML) approach. Under the dynamic logit model, the conditional estimator of this parameter converges to zero when the hypothesis of absence of state dependence is true. Therefore, it is possible to implement a t-test for this hypothesis which may be very simply performed and attains the nominal significance level under several structures of the individual covariates. Through an extensive simulation study, we find that our test has good finite sample properties and it is more robust to the presence of (autocorrelated) covariates in the model specification in comparison with other existing testing procedures for state dependence. The proposed approach is illustrated by two empirical applications: the first is based on data coming from the Panel Study of Income Dynamics and concerns employment and fertility; the second is based on the Health and Retirement Study and concerns the self reported health status.  相似文献   

16.
Lasso is popularly used for variable selection in recent years. In this paper, lasso-type penalty functions including lasso and adaptive lasso are employed in simultaneously variable selection and parameter estimation for covariate-adjusted linear model, where the predictors and response cannot be observed directly and distorted by some observable covariate through some unknown multiplicative smooth functions. Estimation procedures are proposed and some asymptotic properties are obtained under some mild conditions. It deserves noting that under appropriate conditions, the adaptive lasso estimator correctly select covariates with nonzero coefficients with probability converging to one and that the estimators of nonzero coefficients have the same asymptotic distribution that they would have if the zero coefficients were known in advance, i.e. the adaptive lasso estimator has the oracle property in the sense of Fan and Li [6]. Simulation studies are carried out to examine its performance in finite sample situations and the Boston Housing data is analyzed for illustration.  相似文献   

17.
We propose a method for estimating parameters in generalized linear models with missing covariates and a non-ignorable missing data mechanism. We use a multinomial model for the missing data indicators and propose a joint distribution for them which can be written as a sequence of one-dimensional conditional distributions, with each one-dimensional conditional distribution consisting of a logistic regression. We allow the covariates to be either categorical or continuous. The joint covariate distribution is also modelled via a sequence of one-dimensional conditional distributions, and the response variable is assumed to be completely observed. We derive the E- and M-steps of the EM algorithm with non-ignorable missing covariate data. For categorical covariates, we derive a closed form expression for the E- and M-steps of the EM algorithm for obtaining the maximum likelihood estimates (MLEs). For continuous covariates, we use a Monte Carlo version of the EM algorithm to obtain the MLEs via the Gibbs sampler. Computational techniques for Gibbs sampling are proposed and implemented. The parametric form of the assumed missing data mechanism itself is not `testable' from the data, and thus the non-ignorable modelling considered here can be viewed as a sensitivity analysis concerning a more complicated model. Therefore, although a model may have `passed' the tests for a certain missing data mechanism, this does not mean that we have captured, even approximately, the correct missing data mechanism. Hence, model checking for the missing data mechanism and sensitivity analyses play an important role in this problem and are discussed in detail. Several simulations are given to demonstrate the methodology. In addition, a real data set from a melanoma cancer clinical trial is presented to illustrate the methods proposed.  相似文献   

18.
Let Y be a response variable, possibly multivariate, with a density function f (y|x, v; β) conditional on vectors x and v of covariates and a vector β of unknown parameters. The authors consider the problem of estimating β when the values taken by the covariate vector v are available for all observations while some of those taken by the covariate x are missing at random. They compare the profile estimator to several alternatives, both in terms of bias and standard deviation, when the response and covariates are discrete or continuous.  相似文献   

19.
Both continuous and categorical covariates are common in traditional Chinese medicine (TCM) research, especially in the clinical syndrome identification and in the risk prediction research. For groups of dummy variables which are generated by the same categorical covariate, it is important to penalize them group-wise rather than individually. In this paper, we discuss the group lasso method for a risk prediction analysis in TCM osteoporosis research. It is the first time to apply such a group-wise variable selection method in this field. It may lead to new insights of using the grouped penalization method to select appropriate covariates in the TCM research. The introduced methodology can select categorical and continuous variables, and estimate their parameters simultaneously. In our application of the osteoporosis data, four covariates (including both categorical and continuous covariates) are selected out of 52 covariates. The accuracy of the prediction model is excellent. Compared with the prediction model with different covariates, the group lasso risk prediction model can significantly decrease the error rate and help TCM doctors to identify patients with a high risk of osteoporosis in clinical practice. Simulation results show that the application of the group lasso method is reasonable for the categorical covariates selection model in this TCM osteoporosis research.  相似文献   

20.
This article considers a nonparametric varying coefficient regression model with longitudinal observations. The relationship between the dependent variable and the covariates is assumed to be linear at a specific time point, but the coefficients are allowed to change over time. A general formulation is used to treat mean regression, median regression, quantile regression, and robust mean regression in one setting. The local M-estimators of the unknown coefficient functions are obtained by local linear method. The asymptotic distributions of M-estimators of unknown coefficient functions at both interior and boundary points are established. Various applications of the main results, including estimating conditional quantile coefficient functions and robustifying the mean regression coefficient functions are derived. Finite sample properties of our procedures are studied through Monte Carlo simulations.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号