Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
In comparison to other experimental studies, multicollinearity appears frequently in mixture experiments, a special study area of response surface methodology, due to the constraints on the components composing the mixture. In the analysis of mixture experiments using a special generalized linear model, the logistic regression model, multicollinearity causes precision problems in the maximum-likelihood logistic regression estimate. These effects can be reduced to a certain extent by using alternative approaches. One of these approaches is to use biased estimators for the estimation of the coefficients. In this paper, we suggest the use of the logistic ridge regression (RR) estimator in cases where there is multicollinearity during the analysis of mixture experiments using logistic regression. Also, for the selection of the biasing parameter, we use fraction of design space plots for evaluating the effect of the logistic RR estimator with respect to the scaled mean squared error of prediction. The suggested graphical approaches are illustrated on the tumor incidence data set.

2.
H. Bunke 《Statistics》2013,47(1):7-11
Asymptotic distributions for parameter estimates in inadequate nonlinear regression models are considered. In comparison with earlier results, no fictive limit design appears in the formulation. In this form the results can be applied for approximation using only finite sample size data.

3.
The paper provides a novel application of the probabilistic reduction (PR) approach to the analysis of multi-categorical outcomes. The PR approach, which systematically takes account of heterogeneity and functional form concerns, can improve the specification of binary regression models. However, its utility for systematically enriching the specification of and inference from models of multi-categorical outcomes has not been examined, while multinomial logistic regression models are commonly used for inference and, increasingly, prediction. Following a theoretical derivation of the PR-based multinomial logistic model (MLM), we compare functional specification and marginal effects from a traditional specification and a PR-based specification in a model of post-stroke hospital discharge disposition and find that the traditional MLM is misspecified. Results suggest that the impact on the reliability of substantive inferences from a misspecified model may be significant, even when model fit statistics do not suggest a strong lack of fit compared with a properly specified model using the PR approach. We identify situations under which a PR-based MLM specification can be advantageous to the applied researcher.

4.
Standard methods for analyzing binomial regression data rely on asymptotic inferences. Bayesian methods can be performed using simple computations, and they apply for any sample size. We provide a relatively complete discussion of Bayesian inferences for binomial regression with emphasis on inferences for the probability of “success.” Furthermore, we illustrate diagnostic tools, perform model selection among nonnested models, and examine the sensitivity of the Bayesian methods.

5.
Using the marginal likelihood based on the signed ranks derived from matched pairs data, inferences are made for regression parameters. Both members of a given pair are subject to the same censoring time, while different pairs are subject to different censoring times. Censoring is independent of the response and on the right. Easily calculated logistic density scores are used to provide an approximate analysis so that inferences can be made about a regression parameter in the presence of a difference within the matched pairs. Inference for the survival times of matched skin grafts is considered.

6.
Clustering due to unobserved heterogeneity may seriously affect inference from binary regression models. We examined the performance of the logistic and the logistic-normal models for data with such clustering. For the logistic model, the total variance of unobserved heterogeneity, rather than the level of clustering, determines the size of the bias of the maximum likelihood (ML) estimator. Incorrectly specifying clustering as level 2 in the logistic-normal model provides biased estimates of the structural and random parameters, while specifying level 1 provides unbiased estimates of the former and adequately estimates the latter. The proposed procedure appeals to many research areas.

7.
Inverse Gaussian first hitting time regression models sometimes provide an attractive representation of lifetime data. Various authors comment that dependence of both parameters on the same covariate may imply multicollinearity. The frequent appearance of conflicting signs for the two coefficients of the same covariate may be related to this. We carry out simulation studies to examine the reality of this possible multicollinearity. Although there is some dependence between estimates, multicollinearity does not seem to be a major problem. Fitting this model to data generated by a Weibull regression suggests that conflicting signs of estimates may be due to model misspecification.

8.
Regression analyses are commonly performed with doubly limited continuous dependent variables; for instance, when modeling the behavior of rates, proportions and income concentration indices. Several models are available in the literature for use with such variables, one of them being the unit gamma regression model. In all such models, parameter estimation is typically performed using the maximum likelihood method, and testing inferences on the model's parameters are usually based on the likelihood ratio test. Such a test can, however, deliver quite imprecise inferences when the sample size is small. In this paper, we propose two modified likelihood ratio test statistics for use with unit gamma regressions that deliver much more accurate inferences when the number of data points is small. Numerical (i.e. simulation) evidence is presented for both fixed dispersion and varying dispersion models, and also for tests that involve nonnested models. We also present and discuss two empirical applications.

9.
Generalized linear models with random effects and/or serial dependence are commonly used to analyze longitudinal data. However, the computation and interpretation of marginal covariate effects can be difficult. This led Heagerty (1999, 2002) to propose models for longitudinal binary data in which a logistic regression is first used to explain the average marginal response. The model is then completed by introducing a conditional regression that allows for the longitudinal, within‐subject, dependence, either via random effects or regressing on previous responses. In this paper, the authors extend the work of Heagerty to handle multivariate longitudinal binary response data using a triple of regression models that directly model the marginal mean response while taking into account dependence across time and across responses. Markov Chain Monte Carlo methods are used for inference. Data from the Iowa Youth and Families Project are used to illustrate the methods.

10.
To estimate model parameters from complex sample data, we apply maximum likelihood techniques to the complex sample data from the finite population, which is treated as a sample from an infinite superpopulation. General asymptotic distribution theory is developed and then applied to both logistic regression and discrete proportional hazards models. Data from the Lipid Research Clinics Program are used to illustrate each model, demonstrating the effects on inference of neglecting the sampling design during parameter estimation. These empirical results also shed light on the issue of model-based vs. design-based inferences.

11.
In this paper, we propose a new semiparametric heteroscedastic regression model allowing for positive and negative skewness and bimodal shapes using the B-spline basis for nonlinear effects. The proposed distribution is based on the generalized additive models for location, scale and shape framework in order to model any or all parameters of the distribution using parametric linear and/or nonparametric smooth functions of explanatory variables. We motivate the new model by means of Monte Carlo simulations, thus ignoring the skewness and bimodality of the random errors in semiparametric regression models, which may introduce biases on the parameter estimates and/or on the estimation of the associated variability measures. An iterative estimation process and some diagnostic methods are investigated. Applications to two real data sets are presented and the method is compared to the usual regression methods.

12.
An exploratory model analysis device we call CDF knotting is introduced. It is a technique we have found useful for exploring relationships between points in the parameter space of a model and global properties of associated distribution functions. It can be used to alert the model builder to a condition we call lack of distinguishability, which is to nonlinear models what multicollinearity is to linear models. While there are simple remedial actions to deal with multicollinearity in linear models, techniques such as deleting redundant variables in those models do not have obvious parallels for nonlinear models. In some of these nonlinear situations, however, CDF knotting may lead to alternative models with fewer parameters whose distribution functions are very similar to those of the original overparameterized model. We also show how CDF knotting can be exploited as a mathematical tool for deriving limiting distributions and illustrate the technique for the 3-parameter Weibull family, obtaining limiting forms and moment ratios which correct and extend previously published results. Finally, geometric insights obtained by CDF knotting are verified relative to data fitting and estimation.

13.
Proportion differences are often used to estimate and test treatment effects in clinical trials with binary outcomes. In order to adjust for other covariates or intra-subject correlation among repeated measures, logistic regression or longitudinal data analysis models such as generalized estimating equation or generalized linear mixed models may be used for the analyses. However, these analysis models are often based on the logit link which results in parameter estimates and comparisons in the log-odds ratio scale rather than in the proportion difference scale. A two-step method is proposed in the literature to approximate the calculation of confidence intervals for the proportion difference using a concept of effective sample sizes. However, the performance of this two-step method has not been investigated in their paper. On this note, we examine the properties of the two-step method and propose an adjustment to the effective sample size formula based on Bayesian information theory. Simulations are conducted to evaluate the performance and to show that the modified effective sample size improves the coverage property of the confidence intervals.

14.
In this article, a robust multistage parameter estimator is proposed for nonlinear regression with heteroscedastic variance, where the residual variances are considered as a general parametric function of predictors. The motivation is based on considering the chi-square distribution for the calculated sample variance of the data. It is shown that outliers that are influential in nonlinear regression parameter estimates are not necessarily influential in calculating the sample variance. This persuades us not only to robustify the estimate of the parameters of the models for both the regression function and the variance, but also to replace the sample variance of the data by a robust scale estimate.

15.
If unit‐level data are available, small area estimation (SAE) is usually based on models formulated at the unit level, but they are ultimately used to produce estimates at the area level and thus involve area‐level inferences. This paper investigates the circumstances under which using an area‐level model may be more effective. Linear mixed models (LMMs) fitted using different levels of data are applied in SAE to calculate synthetic estimators and empirical best linear unbiased predictors (EBLUPs). The performance of area‐level models is compared with unit‐level models when both individual and aggregate data are available. A key factor is whether there are substantial contextual effects. Ignoring these effects in unit‐level working models can cause biased estimates of regression parameters. The contextual effects can be automatically accounted for in the area‐level models. Using synthetic and EBLUP techniques, small area estimates based on different levels of LMMs are investigated in this paper by means of a simulation study.

16.
We study methods to estimate regression and variance parameters for over-dispersed and correlated count data from highly stratified surveys. Our application involves counts of fish catches from stratified research surveys and we propose a novel model in fisheries science to address changes in survey protocols. A challenge with this model is the large number of nuisance parameters which leads to computational issues and biased statistical inferences. We use a computationally efficient profile generalized estimating equation method and compare it to marginal maximum likelihood (MLE) and restricted MLE (REML) methods. We use REML to address bias and inaccurate confidence intervals because of many nuisance parameters. The marginal MLE and REML approaches involve intractable integrals and we used a new R package that is designed for estimating complex nonlinear models that may include random effects. We conclude from simulation analyses that the REML method provides more reliable statistical inferences among the three methods we investigated.

17.
Ridge regression solves multicollinearity problems by introducing a biasing parameter, called the ridge parameter, which shrinks the estimates and their standard errors in order to reach acceptable results. Selection of the ridge parameter was done using several subjective and objective techniques that are concerned with certain criteria. In this study, selection of the ridge parameter depends on other important statistical measures to reach a better value of the ridge parameter. The proposed ridge parameter selection technique depends on a mathematical programming model and the results are evaluated using a simulation study. The performance of the proposed method is good when the error variance is greater than or equal to one; the sample consists of 20 observations, the number of explanatory variables in the model is 2, and there is a very strong correlation between the two explanatory variables.

18.
We propose a stochastic model to analyse risk factors for emesis in multi-cycle chemotherapies, which allows the effect of a potential risk factor to be described by a single parameter. This model is a hybrid between a random intercept model and a transition model and it is motivated by some medical background knowledge with respect to frequency and course of emesis in cancer patients. We consider maximum likelihood estimation of the parameters of the model and additionally efficient estimation of the marginal risk in the first cycle. Finite sample properties are investigated in a simulation study. The proposed model suffers from a slight overparametrization, such that ML estimates show some poor statistical properties, but estimates of the marginal risk behave quite well. An investigation of alternative, simpler regression models reveals that in this setting these models allow a time-constant regression coefficient to be defined only in a somewhat arbitrary manner. Hence we conclude that the proposed model is valuable in spite of the difficulties with respect to parameter estimation.

19.
Measurement error is a commonly addressed problem in psychometrics and the behavioral sciences, particularly where gold standard data either do not exist or are too expensive. The Bayesian approach can be utilized to adjust for the bias that results from measurement error in tests. Bayesian methods offer other practical advantages for the analysis of epidemiological data including the possibility of incorporating relevant prior scientific information and the ability to make inferences that do not rely on large sample assumptions. In this paper we consider a logistic regression model where both the response and a binary covariate are subject to misclassification. We assume both a continuous measure and a binary diagnostic test are available for the response variable but no gold standard test is assumed available. We consider a fully Bayesian analysis that affords such adjustments, accounting for the sources of error and correcting estimates of the regression parameters. Based on the results from our example and simulations, the models that account for misclassification produce more statistically significant results than the models that ignore misclassification. A real data example on math disorders is considered.

20.
We consider the use of minimax shrinkage estimators for the linear regression model under several loss functions when severe multicollinearity is present. The examples considered illustrate that little or no departure from the least squares estimates is permitted in many cases when the data are highly multicollinear and/or shrinkage is toward a point in the parameter space that does not closely agree with the sample data.
