Similar Articles
20 similar articles found.
1.
It is common practice to compare the fit of non-nested models using the Akaike (AIC) or Bayesian (BIC) information criteria. The basis of these criteria is the log-likelihood evaluated at the maximum likelihood estimates of the unknown parameters. For the general linear model (and the linear mixed model, which is a special case), estimation is usually carried out using residual or restricted maximum likelihood (REML). However, for models with different fixed effects, the residual likelihoods are not comparable and hence information criteria based on the residual likelihood cannot be used. For model selection, it is often suggested that the models are refitted using maximum likelihood to enable the criteria to be used. The first aim of this paper is to highlight that both the AIC and BIC can be used for the general linear model by using the full log-likelihood evaluated at the REML estimates. The second aim is to provide a derivation of the criteria under REML estimation. This aim is achieved by noting that the full likelihood can be decomposed into a marginal (residual) and conditional likelihood and this decomposition then incorporates aspects of both the fixed effects and variance parameters. Using this decomposition, the appropriate information criteria for model selection of models which differ in their fixed effects specification can be derived. An example is presented to illustrate the results and code is available for analyses using the ASReml-R package.
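As a rough illustration of the first aim, here is a minimal Python sketch (not the paper's ASReml-R code) that evaluates the full Gaussian log-likelihood of a linear model at the REML variance estimate and forms AIC and BIC from it; the design matrix and data are simulated purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(scale=2.0, size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
rss = np.sum((y - X @ beta_hat) ** 2)
sigma2_reml = rss / (n - p)          # REML estimate divides by n - p, not n

# full log-likelihood evaluated at the REML estimates
loglik = -0.5 * (n * np.log(2 * np.pi * sigma2_reml) + rss / sigma2_reml)
k = p + 1                            # p fixed effects + 1 variance parameter
print("AIC:", -2 * loglik + 2 * k)
print("BIC:", -2 * loglik + k * np.log(n))
```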

2.
The non-parametric generalized likelihood ratio test is a popular method of model checking for regressions. However, two issues may limit its power: an existing bias term and the curse of dimensionality. The purpose of this paper is thus twofold: a bias reduction is suggested, and a dimension reduction-based, adaptive-to-model enhancement is recommended to improve the power performance. The proposed test statistic still possesses the Wilks phenomenon and behaves like a test with only one covariate. Thus, it converges to its limit at a much faster rate and is much more sensitive to alternative models than the classical non-parametric generalized likelihood ratio test. As a by-product, we also prove that the bias-corrected test is more efficient than the one without bias reduction in the sense that its asymptotic variance is smaller. Simulation studies and a real data analysis are conducted to evaluate the proposed tests.
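For intuition, the sketch below computes a classical Fan-type generalized likelihood ratio statistic that compares a least-squares null fit against a Nadaraya-Watson kernel alternative; it is only the baseline test, not the paper's bias-reduced, dimension-adaptive version, and the bandwidth and simulated data are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0, 1, n)
y = 1 + 2 * x + rng.normal(scale=0.3, size=n)    # the linear null holds here

# null model: least-squares straight line
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
rss0 = np.sum((y - X @ beta) ** 2)

# alternative: Nadaraya-Watson kernel regression with a fixed bandwidth
h = 0.1
K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
rss1 = np.sum((y - (K @ y) / K.sum(axis=1)) ** 2)

glr = 0.5 * n * np.log(rss0 / rss1)              # large values reject linearity
print(glr)
```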

3.
A multi-level model allows the possibility of marginalization across levels in different ways, yielding more than one possible marginal likelihood. Since log-likelihoods are often used in classical model comparison, the question to ask is which likelihood should be chosen for a given model. The authors employ a Bayesian framework to shed some light on qualitative comparison of the likelihoods associated with a given model. They connect these results to related issues of the effective number of parameters, penalty function, and consistent definition of a likelihood-based model choice criterion. In particular, with a two-stage model they show that, very generally, regardless of hyperprior specification or how much data is collected or what the realized values are, a priori, the first-stage likelihood is expected to be smaller than the marginal likelihood. A posteriori, these expectations are reversed and the disparities worsen with increasing sample size and with increasing number of model levels.

4.
We introduce two types of graphical log-linear models: label- and level-invariant models for triangle-free graphs. These models generalise symmetry concepts in graphical log-linear models and provide a tool with which to model symmetry in the discrete case. A label-invariant model is category-invariant and is preserved after permuting some of the vertices according to transformations that maintain the graph, whereas a level-invariant model equates expected frequencies according to a given set of permutations. These new models can both be seen as instances of a new type of graphical log-linear model termed the restricted graphical log-linear model, or RGLL, in which equality restrictions on subsets of main effects and first-order interactions are imposed. Their likelihood equations and graphical representation can be obtained from those derived for the RGLL models.

5.
Focusing on model selection problems in the family of Poisson mixture models (including the Poisson mixture regression model with random effects and the zero-inflated Poisson regression model with random effects), the current paper derives two conditional Akaike information criteria. The criteria are unbiased estimators of the conditional Akaike information based on the conditional log-likelihood and on the joint log-likelihood, respectively. The derivation is free from specific parametric assumptions about the conditional mean of the true data-generating model and applies to different types of estimation methods; moreover, it does not rely on asymptotic arguments. Simulations show that the proposed criteria have promising estimation accuracy. In addition, the criterion based on the conditional log-likelihood demonstrates good model selection performance under different scenarios. Two sets of real data are used to illustrate the proposed method.

6.
Network meta-analysis can be implemented by using arm-based or contrast-based models. Here we focus on arm-based models and fit them using generalized linear mixed model procedures. Full maximum likelihood (ML) estimation leads to biased trial-by-treatment interaction variance estimates for heterogeneity. Thus, our objective is to investigate alternative approaches to variance estimation that reduce bias compared with full ML. Specifically, we use penalized quasi-likelihood/pseudo-likelihood and hierarchical (h) likelihood approaches. In addition, we consider a novel model modification that yields estimators akin to the residual maximum likelihood estimator for linear mixed models. The proposed methods are compared by simulation, and two real datasets are used for illustration. Simulations show that penalized quasi-likelihood/pseudo-likelihood and h-likelihood reduce bias and yield satisfactory coverage rates. Sum-to-zero restriction and baseline contrasts for random trial-by-treatment interaction effects, as well as a residual ML-like adjustment, also reduce bias compared with an unconstrained model when ML is used, but coverage rates are not quite as good. Penalized quasi-likelihood/pseudo-likelihood and h-likelihood are therefore recommended.

7.
In real-data analysis, deciding the best subset of variables in regression models is an important problem. Akaike's information criterion (AIC) is often used to select variables in many fields. When the sample size is not large, the AIC has a non-negligible bias that will detrimentally affect variable selection. The present paper considers a bias correction of the AIC for selecting variables in the generalized linear model (GLM). The GLM can express a number of statistical models by changing the distribution and the link function, such as the normal linear regression model, the logistic regression model, and the probit model, which are commonly used in a number of applied fields. In the present study, we obtain a simple expression for a bias-corrected AIC (corrected AIC, or CAIC) in GLMs. Furthermore, we provide R code based on our formula. A numerical study reveals that the CAIC has better performance than the AIC for variable selection.
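The abstract does not reproduce the CAIC formula, so for context the sketch below shows the classical small-sample correction of Hurvich and Tsai for a normal linear model (often written AICc); the paper's CAIC is the GLM analogue derived there, not this formula.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 30                              # small sample, where the correction matters
X = sm.add_constant(rng.normal(size=(n, 4)))
y = X @ np.array([1.0, 0.5, 0.0, 0.0, 0.0]) + rng.normal(size=n)

fit = sm.OLS(y, X).fit()
k = X.shape[1] + 1                  # regression coefficients + error variance
aic = -2 * fit.llf + 2 * k
aicc = aic + 2 * k * (k + 1) / (n - k - 1)   # Hurvich-Tsai bias correction
print(aic, aicc)
```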

8.
As a natural successor of the information criteria AIC and ABIC, information criteria for Bayes models were developed by evaluating the bias of the log-likelihood of the predictive distribution as an estimate of its expected log-likelihood. Considering two specific situations for the true distribution, two information criteria, PIC1 and PIC2, are derived. Linear Gaussian cases are considered in detail, and the evaluation of the maximum a posteriori estimator is also considered. A simple example of estimating the signal-to-noise ratio shows that PIC2 is a good approximation to the expected log-likelihood over the entire range of the signal-to-noise ratio, whereas PIC1 performs well only for smaller values of the variance ratio. For illustration, the problems of trend estimation and seasonal adjustment are considered. Examples show that the hyper-parameters estimated by the new criteria are usually closer to the best ones than those chosen by the ABIC.

9.
In this paper, we investigate Bayesian generalized nonlinear mixed-effects (NLME) regression models for zero-inflated longitudinal count data. The methodology is motivated by and applied to colony forming unit (CFU) counts in extended bactericidal activity tuberculosis (TB) trials. Furthermore, for model comparisons, we present a generalized method for calculating the marginal likelihoods required to determine Bayes factors. A simulation study shows that the proposed zero-inflated negative binomial regression model has good accuracy, precision, and credibility interval coverage. In contrast, conventional normal NLME regression models applied to log-transformed count data, which handle zero counts as left-censored values, may yield credibility intervals that undercover the true bactericidal activity of anti-TB drugs. We therefore recommend that zero-inflated NLME regression models be fitted to CFU counts on the original scale, as an alternative to conventional normal NLME regression models on the logarithmic scale.
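As a sketch of the model class itself (not the paper's Bayesian NLME machinery), the function below evaluates a zero-inflated negative binomial log-likelihood on the original count scale; the parameter values and toy counts are invented for illustration.

```python
import numpy as np
from scipy import stats

def zinb_loglik(y, mu, theta, pi):
    """iid ZINB log-likelihood: zeros arise with probability pi or from the NB."""
    nb = stats.nbinom(n=theta, p=theta / (theta + mu))
    ll = np.where(
        y == 0,
        np.log(pi + (1 - pi) * nb.pmf(0)),   # structural or sampling zero
        np.log(1 - pi) + nb.logpmf(y),       # positive counts from the NB part
    )
    return ll.sum()

y = np.array([0, 0, 0, 3, 12, 40, 0, 7])     # toy CFU-like counts
print(zinb_loglik(y, mu=10.0, theta=1.5, pi=0.3))
```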

10.
This study gives a generalization of Birch's log-linear model numerical invariance result. The generalization is given in the form of a sufficient condition for numerical invariance that is simple to verify in practice and is applicable for a much broader class of models than log-linear models. Unlike Birch's log-linear result, the generalization herein does not rely on any relationship between sufficient statistics and maximum likelihood estimates. Indeed, the generalization does not rely on the existence of a reduced set of sufficient statistics. Instead, the concept of homogeneity takes centre stage. Several examples illustrate the utility of non-log-linear models, the invariance (and non-invariance) of fitted values, and the invariance (and non-invariance) of certain approximating distributions.

11.
Effective implementation of likelihood inference in models for high-dimensional data often requires a simplified treatment of nuisance parameters, with these having to be replaced by handy estimates. In addition, the likelihood function may have been simplified by means of a partial specification of the model, as is the case when composite likelihood is used. In such circumstances, tests and confidence regions for the parameter of interest may be constructed using Wald-type and score-type statistics, defined so as to account for nuisance parameter estimation or partial specification of the likelihood. In this paper, a general analytical expression for the required asymptotic covariance matrices is derived, and suggestions for obtaining Monte Carlo approximations are presented. The same matrices are involved in a rescaling adjustment that we propose for the log-likelihood ratio type statistic. This adjustment restores the usual chi-squared asymptotic distribution, which is generally invalid after the simplifications considered. The practical implication is that, for a wide variety of likelihoods and nuisance parameter estimates, confidence regions for the parameters of interest are readily computable from the rescaled log-likelihood ratio type statistic as well as from the Wald-type and score-type statistics. Two examples, a measurement error model with full likelihood and a spatial correlation model with pairwise likelihood, illustrate and compare the procedures. Wald-type and score-type statistics may give rise to confidence regions with unsatisfactory shape in small and moderate samples. In addition to having satisfactory shape, regions based on the rescaled log-likelihood ratio type statistic show empirical coverage in reasonable agreement with nominal confidence levels.

12.
In this article the author investigates the application of empirical-likelihood-based inference for the parameters of the varying-coefficient single-index model (VCSIM). Unlike the usual cases, without bias correction the asymptotic distribution of the empirical likelihood ratio cannot achieve the standard chi-squared distribution. To this end, a bias-corrected empirical likelihood method is employed to construct confidence regions (intervals) for the regression parameters. Compared with regions based on normal approximation, these have two advantages: (1) they do not impose prior constraints on the shape of the regions; (2) they do not require the construction of a pivotal quantity, and the regions are range preserving and transformation respecting. A simulation study is undertaken to compare the empirical likelihood with the normal approximation in terms of coverage accuracy and the average areas/lengths of the confidence regions/intervals. A real data example is given to illustrate the proposed approach. The Canadian Journal of Statistics 38: 434–452; 2010 © 2010 Statistical Society of Canada
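To show the flavour of empirical likelihood calibration, here is a minimal sketch of Owen's empirical log-likelihood ratio for a simple scalar mean, with the chi-squared calibration used to form confidence intervals; the VCSIM version in the article adds the bias correction and is not reproduced here.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import chi2

def el_logratio(x, mu):
    """-2 log empirical likelihood ratio for the mean mu (mu must lie
    strictly between min(x) and max(x) for the weights to be valid)."""
    z = x - mu
    # the Lagrange multiplier solves sum(z / (1 + lam*z)) = 0
    lo = (1 / len(z) - 1) / z.max() + 1e-10
    hi = (1 / len(z) - 1) / z.min() - 1e-10
    lam = brentq(lambda l: np.sum(z / (1 + l * z)), lo, hi)
    return 2 * np.sum(np.log1p(lam * z))

x = np.random.default_rng(3).normal(loc=1.0, size=50)
# mu sits inside the 95% EL confidence interval iff the statistic is below
# the chi-squared quantile (Wilks-type calibration)
print(el_logratio(x, 1.0) < chi2.ppf(0.95, df=1))
```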

13.
A fast and accurate method of confidence interval construction for the smoothing parameter in penalised spline and partially linear models is proposed. The method is akin to a parametric percentile bootstrap in which Monte Carlo simulation is replaced by saddlepoint approximation, and it can therefore be viewed as an approximate bootstrap. It is applicable in a quite general setting, requiring only that the underlying estimator be the root of an estimating equation that is a quadratic form in normal random variables. This is the case under a variety of optimality criteria, such as those commonly denoted by maximum likelihood (ML), restricted ML (REML), generalized cross-validation (GCV) and Akaike's information criterion (AIC). Simulation studies reveal that under the ML and REML criteria, the method delivers near-exact performance with computational speeds that are an order of magnitude faster than existing exact methods, and two orders of magnitude faster than a classical bootstrap. Perhaps most importantly, the proposed method also offers a computationally feasible alternative when no exact or asymptotic methods are known, e.g. for GCV and AIC. The methodology is illustrated with an application to well-known fossil data. Giving a range of plausible smoothed values in this instance can help answer questions about the statistical significance of apparent features in the data.

14.
The authors provide a rigorous large sample theory for linear models whose response variable has been subjected to the Box-Cox transformation. They provide a continuous asymptotic approximation to the distribution of estimators of natural parameters of the model. They show, in particular, that the maximum likelihood estimator of the ratio of slope to residual standard deviation is consistent and relatively stable. The authors further show the importance for inference of normality of the errors and give tests for normality based on the estimated residuals. For non-normal errors, they give adjustments to the log-likelihood and to asymptotic standard errors.
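As a small computational companion (the article's contribution is the asymptotic theory, not this calculation), the sketch below profiles the Box-Cox log-likelihood over the transformation parameter for simulated lognormal data, where the log transform (lambda = 0) is the truth.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
y = np.exp(rng.normal(loc=1.0, scale=0.4, size=500))   # lognormal: true lambda = 0

lams = np.linspace(-1.0, 1.0, 201)
llf = [stats.boxcox_llf(lam, y) for lam in lams]       # profile log-likelihood
print("grid maximiser:", lams[int(np.argmax(llf))])    # should be near 0

transformed, lam_mle = stats.boxcox(y)                 # scipy's direct MLE
print("scipy MLE:     ", lam_mle)
```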

15.
Informative dropout is a vexing problem for any biomedical study. Most existing statistical methods attempt to correct estimation bias related to this phenomenon by specifying unverifiable assumptions about the dropout mechanism. We consider a cohort study in Africa that uses an outreach programme to ascertain the vital status of dropout subjects. These data can be used to identify a number of relevant distributions. However, as only a subset of dropout subjects were followed, vital status ascertainment was incomplete. We use semi-competing risk methods as our analysis framework to address this specific case where the terminal event is incompletely ascertained, and we consider various procedures for estimating the marginal distribution of dropout and the marginal and conditional distributions of survival. We also consider model selection and estimation efficiency in our setting. Performance of the proposed methods is demonstrated via simulations, asymptotic study, and analysis of the study data.

16.
Two different forms of Akaike's information criterion (AIC) are compared for selecting the smooth terms in penalized spline additive mixed models. The conditional AIC (cAIC) has been used traditionally as a criterion for both estimating penalty parameters and selecting covariates in smoothing, and is based on the conditional likelihood given the smooth mean and on the effective degrees of freedom for a model fit. By comparison, the marginal AIC (mAIC) is based on the marginal likelihood from the mixed-model formulation of penalized splines, which has recently become popular for estimating smoothing parameters. To the best of the authors' knowledge, the use of mAIC for selecting covariates for smoothing in additive models is new. In the competing models considered for selection, covariates may have a nonlinear effect on the response, with the possibility of group-specific curves. Simulations are used to compare the performance of cAIC and mAIC in model selection settings that have correlated and hierarchical smooth terms. In moderately large samples, both formulations of AIC perform extremely well at detecting the function that generated the data. The mAIC does better for simple functions, whereas the cAIC is more sensitive to detecting a true model that has complex and hierarchical terms.
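A minimal sketch of the cAIC side of this comparison, using a ridge-penalized truncated-line spline as an illustrative basis (not the authors' setup): the effective degrees of freedom are the trace of the smoother matrix, and the cAIC penalizes the conditional log-likelihood by them.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
x = np.sort(rng.uniform(0, 1, n))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=n)

# truncated-line spline basis with interior knots; penalize only knot terms
knots = np.linspace(0.1, 0.9, 9)
B = np.column_stack([np.ones(n), x] + [np.maximum(x - k, 0) for k in knots])
D = np.diag([0.0, 0.0] + [1.0] * len(knots))

def caic(lam):
    S = B @ np.linalg.solve(B.T @ B + lam * D, B.T)    # smoother ("hat") matrix
    resid = y - S @ y
    edf = np.trace(S)                                  # effective degrees of freedom
    sigma2 = resid @ resid / n
    cond_ll = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    return -2 * cond_ll + 2 * (edf + 1)                # +1 for the error variance

for lam in (0.01, 1.0, 100.0):
    print(lam, caic(lam))
```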

17.
We conducted confirmatory factor analysis (CFA) of responses (N=803) to a self-reported measure of optimism, using full-information estimation via adaptive quadrature (AQ), an alternative estimation method for ordinal data. We evaluated AQ results in terms of the number of iterations required to achieve convergence, model fit, parameter estimates, standard errors (SE), and statistical significance, across four link functions (logit, probit, log-log, complementary log-log) using 3–10 and 20 quadrature points. We compared AQ results with those obtained using maximum likelihood, robust maximum likelihood, and robust diagonally weighted least-squares estimation. Compared to the other two link functions, logit and probit not only produced fit statistics, parameter estimates, SEs, and levels of significance that varied less across numbers of quadrature points, but also fitted the data better and provided larger completely standardised loadings than did maximum likelihood and diagonally weighted least-squares. Our findings demonstrate the viability of using full-information AQ to estimate CFA models with real-world ordinal data.
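To show the mechanism behind full-information estimation with quadrature, the sketch below integrates a latent normal factor out of a two-item probit model with plain (non-adaptive) Gauss-Hermite quadrature; adaptive quadrature additionally recentres and rescales the nodes per observation, and the loadings and thresholds here are invented, not estimates from the optimism data.

```python
import numpy as np
from scipy.stats import norm

nodes, weights = np.polynomial.hermite_e.hermegauss(10)   # 10 quadrature points
w = weights / np.sqrt(2 * np.pi)       # reweight for a standard normal factor

loadings = np.array([0.8, 0.6])        # illustrative probit loadings
thresholds = np.array([-0.2, 0.4])     # illustrative thresholds

def pattern_loglik(pattern):
    """log P(pattern): integrate prod_j P(y_j | factor) against the
    standard normal factor density by quadrature."""
    p = norm.cdf(loadings[:, None] * nodes[None, :] - thresholds[:, None])
    lik = np.where(np.array(pattern)[:, None] == 1, p, 1 - p).prod(axis=0)
    return np.log(np.sum(w * lik))

print(pattern_loglik([1, 0]))
```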

18.
We propose an extension of graphical log-linear models to allow for symmetry constraints on some interaction parameters that represent homologous factors. The conditional independence structure of such quasi-symmetric (QS) graphical models is described by an undirected graph with coloured edges, in which a particular colour corresponds to a set of equality constraints on a set of parameters. Unlike standard QS models, the proposed models apply to contingency tables for which only some variables or sets of variables have the same categories. We study the graphical properties of such models, including conditions for decomposition of model parameters and of maximum likelihood estimates.

19.
In this article, we address the testing problem for additivity in nonparametric regression models. We develop a kernel-based consistent test of the hypothesis of additivity in nonparametric regression and establish its asymptotic distribution under a sequence of local alternatives. Compared to other existing kernel-based tests, the proposed test is shown to effectively ameliorate the influence of estimation bias of the additive components of the nonparametric regression, and hence to increase efficiency. Most importantly, it avoids tuning difficulties by using estimation-based optimal criteria, whereas there is no direct tuning strategy for other existing kernel-based testing methods. We discuss the usage of the new test and give numerical examples to demonstrate its practical performance. The Canadian Journal of Statistics 39: 632–655; 2011. © 2011 Statistical Society of Canada

20.
The cross-validation (CV) criterion is known to be a second-order unbiased estimator of the risk function measuring the discrepancy between the candidate model and the true model, as are the generalized information criterion (GIC) and the extended information criterion (EIC). In the present article, we show that a 2kth-order unbiased estimator can be obtained from a linear combination of the leave-one-out CV criterion through the leave-k-out CV criterion. The proposed scheme is unique in that a bias smaller than that of a jackknife method can be obtained without any analytic calculation; that is, it is not necessary to obtain the explicit form of several terms in an asymptotic expansion of the bias. Furthermore, the proposed criterion can be regarded as a finite correction of a bias-corrected CV criterion that uses the scalar coefficients of a bias-corrected EIC obtained by bootstrap iteration.
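For concreteness, the sketch below computes the leave-one-out and leave-two-out CV criteria for a toy linear model; the article's contribution is the choice of scalar weights whose linear combination of such criteria cancels higher-order bias, and those weights are not reproduced here.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(6)
n = 25
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

def leave_k_out_cv(k):
    """average squared prediction error over all size-k holdout sets."""
    losses = []
    for held in combinations(range(n), k):
        held = list(held)
        keep = np.setdiff1d(np.arange(n), held)
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        losses.append(np.mean((y[held] - X[held] @ beta) ** 2))
    return float(np.mean(losses))

print(leave_k_out_cv(1), leave_k_out_cv(2))
```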
