Similar Documents
1.
It is common practice to compare the fit of non-nested models using the Akaike (AIC) or Bayesian (BIC) information criteria. The basis of these criteria is the log-likelihood evaluated at the maximum likelihood estimates of the unknown parameters. For the general linear model (and the linear mixed model, which is a special case), estimation is usually carried out using residual or restricted maximum likelihood (REML). However, for models with different fixed effects, the residual likelihoods are not comparable and hence information criteria based on the residual likelihood cannot be used. For model selection, it is often suggested that the models are refitted using maximum likelihood to enable the criteria to be used. The first aim of this paper is to highlight that both the AIC and BIC can be used for the general linear model by using the full log-likelihood evaluated at the REML estimates. The second aim is to provide a derivation of the criteria under REML estimation. This aim is achieved by noting that the full likelihood can be decomposed into a marginal (residual) and conditional likelihood and this decomposition then incorporates aspects of both the fixed effects and variance parameters. Using this decomposition, the appropriate information criteria for model selection of models which differ in their fixed effects specification can be derived. An example is presented to illustrate the results and code is available for analyses using the ASReml-R package.
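For orientation, the standard definitions of the two criteria, written with the full log-likelihood evaluated at the REML estimates as the paper advocates, together with the decomposition it exploits (the notation here is ours; a sketch rather than the paper's derivation):

```latex
\mathrm{AIC} = -2\,\ell_{\mathrm{full}}(\hat\theta_{\mathrm{REML}}) + 2p,
\qquad
\mathrm{BIC} = -2\,\ell_{\mathrm{full}}(\hat\theta_{\mathrm{REML}}) + p\log n,
\qquad
\ell_{\mathrm{full}}(\beta,\sigma) = \ell_{R}(\sigma) + \ell_{C}(\beta \mid \sigma),
```

where $p$ counts both the fixed effects and the variance parameters, $\ell_{R}$ is the marginal (residual) log-likelihood, and $\ell_{C}$ is the conditional log-likelihood.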

2.
This paper aims to attain multiple objectives by proposing two compound optimality criteria built from the A-optimality criterion. The first, ADP-optimality, seeks a design that minimizes the average variance of the parameter estimates, yields efficient parameter estimates, and maximizes the probability of a particular event; the second, AKL-optimality, provides a specified balance between model discrimination and minimizing the average variance of the parameter estimates. The corresponding equivalence theorems are stated and proved. Finally, a numerical example on probit GLMs illustrates the results for both compound criteria.
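Compound criteria of this kind are usually assembled as weighted combinations of logged efficiencies; a hedged sketch of the general shape (the exact construction and weighting in the paper may differ):

```latex
\Phi(\xi) = \kappa_{A}\,\log \mathrm{eff}_{A}(\xi)
          + \kappa_{D}\,\log \mathrm{eff}_{D}(\xi)
          + \kappa_{P}\,\log \mathrm{eff}_{P}(\xi),
\qquad \kappa_{A}+\kappa_{D}+\kappa_{P}=1,
```

where the three efficiency terms correspond to average variance (A), parameter estimation (D), and the probability of the event of interest (P), and the weights set the desired balance between the objectives.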

3.
This paper investigates, by means of Monte Carlo simulation, the effects of different choices of order for the autoregressive approximation on the fully efficient parameter estimates of autoregressive moving average models. Four order selection criteria, AIC, BIC, HQ and PKK, were compared, and different model structures with varying sample sizes were used to contrast the performance of the criteria. Some asymptotic results which provide a useful guide for assessing the performance of these criteria are presented. The comparison shows that there are marked differences in the accuracy achieved using these alternative criteria in small-sample situations, and that in such cases it is preferable to apply the BIC criterion, which leads to greater precision of the Gaussian likelihood estimates. Implications of the findings for the estimation of time series models are highlighted.
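For reference, the first three criteria have standard closed forms when an AR($p$) approximation with innovation variance estimate $\hat\sigma^2_p$ is fitted to $n$ observations (these are the usual textbook definitions, not reproduced from the paper; the PKK criterion has a less standard form and is omitted here):

```latex
\mathrm{AIC}(p) = \log\hat\sigma^2_p + \frac{2p}{n}, \qquad
\mathrm{BIC}(p) = \log\hat\sigma^2_p + \frac{p\log n}{n}, \qquad
\mathrm{HQ}(p)  = \log\hat\sigma^2_p + \frac{2p\log\log n}{n}.
```

The heavier penalty of BIC for $n > 7$ is what drives its preference for shorter autoregressions in small-sample comparisons.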

4.
The estimation of the variance function of a linear regression model used in the asymptotic quasi-likelihood approach is considered. It is shown that the variance function used in the determination of the asymptotic quasi-likelihood estimates encompasses the variance functions commonly found in the literature. Criteria for selecting the most appropriate estimate of the variance function for given data are established. These criteria are based on a graphical technique and a chi-squared test.

5.
The population growth rate of the European dipper has been shown to decrease with winter temperature and population size. We examine here the demographic mechanism for this effect by analysing how these factors affect the survival rate. Using more than 20 years of capture-mark-recapture data (1974-1997) based on more than 4000 marked individuals, we perform analyses using open capture-mark-recapture models. This allowed us to estimate the annual apparent survival rates (the probability of surviving and staying on the study site from one year to the next) and the recapture probabilities. We partitioned the variance of the apparent survival rates into sampling variance and process variance using random effects models, and investigated which variables best accounted for temporal process variation. Adult males and females had similar apparent survival rates, with an average of 0.52 and a coefficient of variation of 40%. Chick apparent survival was lower, averaging 0.06 with a coefficient of variation of 42%. Eighty percent of the variance in apparent survival rates was explained by winter temperature and population size for adults, and 48% by winter temperature for chicks. The process variance outweighed the sampling variance for both chick and adult survival rates, which explains why the shrunken estimates obtained under random effects models were close to the maximum likelihood estimates. A large proportion of the annual variation in the apparent survival rate of chicks appears to be explained by inter-year differences in dispersal rates.
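The variance partition and the resulting shrinkage can be written schematically (our notation, standard for random effects capture-mark-recapture analyses, not quoted from the paper):

```latex
\widehat{\operatorname{Var}}(\hat\phi_t) = \sigma^2_{\mathrm{process}} + \sigma^2_{\mathrm{sampling},t},
\qquad
\tilde\phi_t = \hat{\mu} + \frac{\sigma^2_{\mathrm{process}}}{\sigma^2_{\mathrm{process}} + \sigma^2_{\mathrm{sampling},t}}\,(\hat\phi_t - \hat{\mu}),
```

where $\hat\phi_t$ is the annual MLE of apparent survival, $\hat\mu$ the overall mean, and $\tilde\phi_t$ the shrunken estimate. When process variance dominates, as found here, the shrinkage factor is close to one, which is why the shrunken estimates stay close to the MLEs.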

6.
This article proposes a simulation-based methodology for theoretical model comparison, applied to two time series road accident models. The model comparison exercise helps to quantify the main differences and similarities between the two models and comprises three main stages: (1) simulation of time series from a true model with predefined properties; (2) estimation of the alternative model using the simulated data; (3) sensitivity analysis to quantify the effect of changes in the true model parameters on the alternative model's parameter estimates through analysis of variance (ANOVA). The proposed methodology is applied to two time series road accident models: UCM (unobserved components model) and DRAG (Demand for Road Use, Accidents and their Severity). Assuming that the real data-generating process is the UCM, new datasets approximating the road accident data are generated, and DRAG models are estimated using the simulated data. Since these two methodologies are usually assumed to be equivalent, in the sense that both models accurately capture the true effects of the regressors, we specifically address the modeling of the stochastic trend through the alternative model. The stochastic trend is the time-varying component and is one of the crucial factors in time series road accident data. Theoretically, it can easily be modeled through the UCM, given its modeling properties. However, properly capturing the effect of a non-stationary component such as a stochastic trend in a stationary explanatory model such as DRAG is challenging. After obtaining the parameter estimates of the alternative model (DRAG), the estimates of both true and alternative models are compared and the differences are quantified through experimental design and ANOVA techniques. It is observed that the effects of the explanatory variables used in the UCM simulation are only partially captured by the respective DRAG coefficients. A priori, this could be due to multicollinearity, but the results of both the simulation of UCM data and the estimation of DRAG models reveal no significant static correlation among the regressors. Using ANOVA, it is instead determined that this bias in the regression coefficient estimates is caused by the stochastic trend present in the simulated data. Thus, the results suggest that the stochastic component present in the data should be treated accordingly through a preliminary, exploratory data analysis.
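The stochastic trend at issue is the random-walk level of a basic unobserved components model; a minimal local-level form (a standard specification offered as a sketch, not necessarily the exact model used in the study):

```latex
y_t = \mu_t + \mathbf{x}_t'\boldsymbol\beta + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{N}(0,\sigma^2_{\varepsilon}),
\qquad
\mu_t = \mu_{t-1} + \eta_t, \qquad \eta_t \sim \mathrm{N}(0,\sigma^2_{\eta}).
```

The level $\mu_t$ is non-stationary, which is precisely the component a stationary regression model such as DRAG cannot absorb into its coefficients without bias.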

7.
Predictive criteria, including the adjusted squared multiple correlation coefficient, the adjusted concordance correlation coefficient, and the predictive error sum of squares, are available for model selection in the linear mixed model. These criteria all involve some comparison of observed and predicted values, adjusted for the complexity of the model. The predicted values can be conditional on the random effects or marginal, i.e., based on averages over the random effects. The success of these criteria in selecting the correct model has not previously been investigated.

We used simulations to investigate selection success rates for several versions of these predictive criteria as well as several versions of Akaike's information criterion and the Bayesian information criterion, and the pseudo F-test. The simulations involved the simple scenario of selection of a fixed parameter when the covariance structure is known.

Several variance–covariance structures were used. For compound symmetry structures, higher success rates for the predictive criteria were obtained when marginal rather than conditional predicted values were used. Information criteria had higher success rates when a certain term (normally left out in SAS MIXED computations) was included in the criteria. Various penalty functions were used in the information criteria, but these had little effect on success rates. The pseudo F-test performed as expected. For the autoregressive with random effects structure, the results were the same except that success rates were higher for the conditional version of the predictive error sum of squares.

Characteristics of the data, such as the covariance structure, parameter values, and sample size, greatly impacted the performance of the various model selection criteria. No one criterion was consistently better than the others.
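Of the predictive criteria named above, the predictive error sum of squares has the simplest general shape (standard definition; in the mixed model setting the predictions may be conditional or marginal, as discussed):

```latex
\mathrm{PRESS} = \sum_{i=1}^{n} \bigl(y_i - \hat{y}_{(-i)}\bigr)^2,
```

where $\hat{y}_{(-i)}$ denotes the prediction of $y_i$ from the model fitted without observation $i$.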

8.
We propose a spatial-temporal stochastic model for daily average surface temperature data. First, we build a model for a single spatial location, independently of the spatial information. The model includes trend, seasonality, and mean reversion, together with a seasonally dependent variance of the residuals. The spatial dependency is modelled by a Gaussian random field. Empirical fitting to data collected at 16 measurement stations in Lithuania over more than 40 years shows that our model captures the seasonality in the autocorrelation of the squared residuals, a property of temperature data already observed by other authors. We demonstrate through examples that our spatial-temporal model is applicable to prediction and classification.
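A minimal single-station specification consistent with the abstract's ingredients of trend, seasonality, mean reversion, and seasonally varying residual variance might read (the concrete functional forms here are illustrative assumptions, not the authors' fitted model):

```latex
T_t = a + b\,t + c\,\sin\!\Bigl(\frac{2\pi t}{365} + \varphi\Bigr) + X_t,
\qquad
X_t = \alpha X_{t-1} + \sigma_t\,\varepsilon_t, \quad \varepsilon_t \sim \mathrm{N}(0,1),
```

where $|\alpha| < 1$ gives mean reversion and $\sigma_t$ is a deterministic seasonal volatility function, so that the squared residuals exhibit the seasonal structure noted in the abstract.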

9.
We develop and apply an approach to the spatial interpolation of a vector-valued random response field. The Bayesian approach we adopt enables uncertainty about the underlying models to be represented when expressing the accuracy of the resulting interpolants. The methodology is particularly relevant in environmetrics, where vector-valued responses are observed only at designated sites at successive time points. The theory allows space-time modelling at the second level of the hierarchical prior model so that uncertainty about the model parameters is fully expressed at the first level. In this way, we avoid unduly optimistic estimates of inferential accuracy. Moreover, the prior model can be updated with any available new data, while past data can be used in a systematic way to fit model parameters. The theory is based on the multivariate normal and related joint distributions. Our hierarchical prior models lead to posterior distributions which are robust with respect to the choice of the prior (hyperparameters). We illustrate our theory with an example involving monitoring stations in southern Ontario, where monthly average levels of ozone, sulphate, and nitrate are available and between-station response triplets are interpolated. In this example we use a recently developed method for interpolating spatial correlation fields.

10.
For constructing simultaneous confidence intervals for ratios of means of lognormal distributions, two approaches using a two-step method of variance estimates recovery are proposed. The first approach uses fiducial generalized confidence intervals (FGCIs) in the first step followed by the method of variance estimates recovery (MOVER) in the second step (FGCIs–MOVER). The second approach uses MOVER in both the first and second steps (MOVER–MOVER). The performance of the proposed approaches is compared with that of simultaneous fiducial generalized confidence intervals (SFGCIs). Monte Carlo simulation is used to evaluate these approaches in terms of coverage probability, average interval width, and computation time.
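To fix ideas, the basic MOVER step for a sum $\theta_1 + \theta_2$, given separate limits $(l_i, u_i)$ for each component, is (this is the standard MOVER recipe on which the two-step constructions build):

```latex
L = \hat\theta_1 + \hat\theta_2 - \sqrt{(\hat\theta_1 - l_1)^2 + (\hat\theta_2 - l_2)^2},
\qquad
U = \hat\theta_1 + \hat\theta_2 + \sqrt{(u_1 - \hat\theta_1)^2 + (u_2 - \hat\theta_2)^2}.
```

For lognormal data the mean is $\exp(\mu + \sigma^2/2)$, so a ratio of means becomes a difference of $\theta_i = \mu_i + \sigma_i^2/2$ on the log scale, to which the recipe above applies.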

11.
Monte Carlo methods are used to compare a number of adaptive strategies for deciding which of several covariates to incorporate into the analysis of a randomized experiment. Sixteen selection strategies in three categories are considered: (1) select covariates correlated with the response; (2) select covariates with means differing across groups; and (3) select covariates with means differing across groups that are also correlated with the response. The criteria examined are the type I error rate of the test for equality of adjusted group means and the variance of the estimated treatment effect. These strategies can result in either inflated or deflated type I errors, depending on the method and the population parameters. The adaptive methods in the first category sometimes yield point estimates of the treatment effect more precise than estimators derived using either all or none of the covariates.

12.
We consider the evaluation of laboratory practice through the comparison of measurements made by participating metrology laboratories when the measurement procedures are considered to have both fixed effects (the residual error due to unrecognised sources of error) and random effects (drawn from a distribution of known variance after correction for all known systematic errors). We show that, when estimating the participant fixed effects, the random effects described can be ignored. We also derive the adjustment to the variance estimates of the participant fixed effects due to these random effects.

13.
A simulation study of the binomial-logit model with correlated random effects is carried out based on the generalized linear mixed model (GLMM) methodology. Simulated data with various numbers of regression parameters and different values of the variance component are considered. The performance of approximate maximum likelihood (ML) and residual maximum likelihood (REML) estimators is evaluated. For a range of true parameter values, we report the average biases of the estimators, the standard error of the average bias, and the standard error of the estimates over the simulations. In general, in terms of bias, the two methods do not show significant differences in estimating the regression parameters. The REML estimation method is slightly better at reducing the bias of the variance component estimates.
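Schematically, the model being simulated is a generic binomial-logit GLMM with correlated random effects (our notation, offered as a sketch of the setting):

```latex
y_{ij} \mid u_i \sim \mathrm{Binomial}(n_{ij}, p_{ij}),
\qquad
\operatorname{logit}(p_{ij}) = \mathbf{x}_{ij}'\boldsymbol\beta + u_i,
\qquad
\mathbf{u} \sim \mathrm{N}(\mathbf{0}, \sigma^2\mathbf{R}),
```

where $\mathbf{R}$ is a correlation matrix inducing the correlation between random effects and $\sigma^2$ is the variance component whose ML and REML estimates the study compares.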

14.
Ridge regression addresses multicollinearity by introducing a biasing parameter, called the ridge parameter, which shrinks the estimates and their standard errors in order to reach acceptable results. The ridge parameter has previously been selected using a variety of subjective and objective techniques, each concerned with particular criteria. In this study, the selection of the ridge parameter draws on additional statistical measures to reach a better value. The proposed selection technique is based on a mathematical programming model, and its results are evaluated in a simulation study. The proposed method performs well when the error variance is greater than or equal to one, the sample consists of 20 observations, the model contains two explanatory variables, and the two explanatory variables are very strongly correlated.
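The estimator in question is the standard ridge form (the mathematical-programming rule for choosing $k$ is the paper's contribution and is not reproduced here):

```latex
\hat{\boldsymbol\beta}(k) = (\mathbf{X}'\mathbf{X} + k\mathbf{I})^{-1}\mathbf{X}'\mathbf{y}, \qquad k > 0,
```

where $k$ is the ridge parameter; $k = 0$ recovers ordinary least squares, and increasing $k$ shrinks the coefficients and their standard errors at the price of bias.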

15.
Implementation of a full Bayesian non-parametric analysis involving neutral to the right processes (apart from the special case of the Dirichlet process) has been difficult for two reasons: first, the posterior distributions are complex and therefore only Bayes estimates (posterior expectations) have previously been presented; second, it is difficult to obtain an interpretation for the parameters of a neutral to the right process. In this paper we extend Ferguson & Phadia (1979) by presenting a general method for specifying the prior mean and variance of a neutral to the right process, providing the interpretation of the parameters. Additionally, we provide the basis for a full Bayesian analysis, via simulation from the posterior process, using a hybrid of new algorithms that is applicable to a large class of neutral to the right processes (Ferguson & Phadia only provide posterior means). The ideas are exemplified through illustrative analyses.

16.
When modeling multilevel data, it is important to accurately represent the interdependence of observations within clusters. Ignoring data clustering may result in parameter misestimation. However, it is not well established to what degree parameter estimates are affected by model misspecification when applying missing data techniques (MDTs) to incomplete multilevel data. We compare the performance of three MDTs with incomplete hierarchical data. We consider the impact of imputation model misspecification on the quality of parameter estimates by employing multiple imputation under the assumptions of a normal model (MI/NM) with two-level cross-sectional data when values are missing at random on the dependent variable at rates of 10%, 30%, and 50%. Five criteria are used to compare estimates from MI/NM to estimates from MI assuming a linear mixed model (MI/LMM) and from maximum likelihood estimation applied to the same incomplete data sets. With 10% missing data (MD), the techniques performed similarly for fixed-effects estimates, but variance components were biased with MI/NM. The effects of model misspecification worsened at higher rates of MD, with the hierarchical structure of the data markedly underrepresented by biased variance component estimates. MI/LMM and maximum likelihood provided generally accurate and unbiased parameter estimates, but performance was negatively affected by increased rates of MD.

17.
Mixed effects models with two variance components are often used to analyze longitudinal data. For these models, we compare two approaches to estimating the variance components: the analysis of variance approach and the spectral decomposition approach. We establish a necessary and sufficient condition for the two approaches to yield identical estimates, and some sufficient conditions for the superiority of one approach over the other under the mean squared error criterion. Applications of the methods to circular models and longitudinal data are discussed. Furthermore, simulation results indicate that better estimates of the variance components do not necessarily imply higher power of the tests or shorter confidence intervals.
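The class of models compared can be written in the standard two-variance-component form (notation ours):

```latex
\mathbf{y} = \mathbf{X}\boldsymbol\beta + \mathbf{Z}\mathbf{u} + \boldsymbol\varepsilon,
\qquad
\mathbf{u} \sim \mathrm{N}(\mathbf{0}, \sigma^2_u\mathbf{I}), \quad
\boldsymbol\varepsilon \sim \mathrm{N}(\mathbf{0}, \sigma^2_e\mathbf{I}),
```

so that $\operatorname{Var}(\mathbf{y}) = \sigma^2_u\mathbf{Z}\mathbf{Z}' + \sigma^2_e\mathbf{I}$; the ANOVA and spectral decomposition approaches are two routes to estimating $(\sigma^2_u, \sigma^2_e)$.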

18.
Can we find some common principle in the three comparisons? Lacking adequate time for a thorough exploration, let me suggest that representation is that common principle. I suggested (section 4) that judgment selection of spatial versus temporal extensions distinguishes "longitudinal" local studies from "cross-section" population sampling. We noted (section 3) that censuses are taken for detailed representation of the spatial dimension but depend on judgmental selection of the temporal. Survey sampling lacks spatial detail but is spatially representative with randomization, and it can be made timely. Periodic samples can be designed that are representative of temporal extension. Furthermore, spatial and temporal detail can be obtained either through estimation or through cumulated samples [Purcell and Kish 1979, 1980; Kish 1979b, 1981, 1986 6.6]. Registers and administrative records can have good spatial and temporal representation, but representation may be lacking in population content, and surely in representation of variables. Representation of variables, and of the relations between variables and over the population, are the issues in conflict between surveys, experiments, and observations. This is a deep subject, too deep to be explored again as it was in section 2. A final point about the limits of randomization for achieving representation through sampling: randomization for selecting samples of variables is beyond me generally, because I cannot conceive of frames for defined populations of variables. Yet we can find attempts at randomized selection of variables: in the selection of items for the consumer price index, and of items for tests of IQ or achievement. Generally I believe that randomization is the way to achieve representation without complete coverage, and that it can be applied and practised in many dimensions.

19.
Development of anti-cancer therapies usually involves small to moderate size studies to provide initial estimates of response rates before initiating larger studies to better quantify response. These early trials often each contain a single tumor type, possibly using other stratification factors. The response rate for a given tumor type is routinely reported as the percentage of patients meeting a clinical criterion (e.g. tumor shrinkage), without any regard to response in the other studies. These estimates (maximum likelihood estimates, or MLEs) on average approximate the true value, but have variances that are usually large, especially for small to moderate size studies. The approach presented here is offered as a way to improve overall estimation of response rates when several small trials are considered, by reducing the total uncertainty. The shrinkage estimators considered here (James-Stein/empirical Bayes and hierarchical Bayes) are alternatives that use information from all studies to provide potentially better estimates for each study. While these estimates introduce a small bias, they have a considerably smaller variance, and thus tend to be better in terms of total mean squared error. These procedures provide a better view of drug performance in the group of tumor types as a whole, as opposed to estimating each response rate individually without consideration of the others. In technical terms, the vector of estimated response rates is nearer the vector of true values, on average, than the vector of the usual unbiased MLEs applied to such trials.
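In its simplest form, the James-Stein/empirical Bayes idea shrinks each study-specific estimate toward the pooled mean (a schematic version; the hierarchical Bayes variant replaces the plug-in weight with a posterior quantity):

```latex
\tilde{p}_i = \bar{p} + (1 - \hat{B})\,(\hat{p}_i - \bar{p}), \qquad 0 \le \hat{B} \le 1,
```

where $\hat{p}_i$ is the MLE for tumor type $i$, $\bar{p}$ the pooled mean, and $\hat{B}$ an estimate of the ratio of sampling variance to total variance; each estimate trades a small bias for a substantial reduction in variance.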
