Similar literature
 20 similar documents found (search time: 31 ms)
1.
The Bayesian information criterion (BIC) is widely used for variable selection. We focus on the regression setting for which variations of the BIC have been proposed. A version that includes the Fisher Information matrix of the predictor variables performed best in one published study. In this article, we extend the evaluation, introduce a performance measure involving how closely posterior probabilities are approximated, and conclude that the version that includes the Fisher Information often favors regression models having more predictors, depending on the scale and correlation structure of the predictor matrix. In the image analysis application that we describe, we therefore prefer the standard BIC approximation because of its relative simplicity and competitive performance at approximating the true posterior probabilities.
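As a rough illustration of what "approximating posterior probabilities" with the standard BIC can mean, the sketch below computes the usual Gaussian-likelihood BIC for ordinary least squares fits and converts a set of BIC values into approximate posterior model probabilities. It is not the procedure of the article above: the function names `bic_gaussian_ols` and `bic_posterior_probs` are hypothetical, equal prior probabilities across candidate models are assumed, and the Fisher-information variant discussed in the abstract is not implemented.

```python
import numpy as np

def bic_gaussian_ols(X, y):
    """Standard BIC for an OLS fit with intercept and Gaussian errors.

    X is an (n, p) array of predictors (p may be 0); y has length n.
    Assumption: the parameter count is p + 1 coefficients plus the error variance.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = np.sum((y - design @ beta) ** 2)
    k = design.shape[1] + 1
    # Maximized Gaussian log-likelihood with sigma^2 estimated by rss / n.
    loglik = -0.5 * n * (np.log(2.0 * np.pi * rss / n) + 1.0)
    return -2.0 * loglik + k * np.log(n)

def bic_posterior_probs(bics):
    """Approximate posterior model probabilities from BIC values.

    Assumes equal prior probabilities; uses exp(-BIC / 2), normalized.
    """
    bics = np.asarray(bics, dtype=float)
    weights = np.exp(-0.5 * (bics - bics.min()))  # subtract the minimum for stability
    return weights / weights.sum()
```

A typical use would be to evaluate `bic_gaussian_ols` on each candidate subset of predictors and pass the resulting list to `bic_posterior_probs`; the model with the smallest BIC then receives the largest approximate probability.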

2.
Rong Zhu, Xinyu Zhang. Statistics, 2018, 52(1): 205-227
The theories and applications of model averaging have been developed comprehensively in the past two decades. In this paper, we consider model averaging for multivariate multiple regression models. To make full use of the correlation information among the dependent variables, we propose a model averaging method based on the Mahalanobis distance, which is related to the correlation of the dependent variables. We prove the asymptotic optimality of the resulting Mahalanobis Mallows model averaging (MMMA) estimators under certain assumptions. In the simulation study, we show that the proposed MMMA estimators compare favourably with model averaging estimators based on AIC and BIC weights and the Mallows model averaging estimators from the single dependent variable regression models. We further apply our method to real data on the urbanization rate and the proportion of non-agricultural population in ethnic minority areas of China.

3.
Nonlinear regression-adjusted control variables are investigated for improving variance reduction in statistical and system simulations. To this end, simple control variables are piecewise sectioned and then transformed using linear and nonlinear transformations. Optimal parameters of these transformations are selected using linear or nonlinear least-squares regression algorithms. As an example, piecewise power-transformed variables are used in the estimation of the mean for the two-variable Anderson-Darling goodness-of-fit statistic $W_2^2$. Substantial variance reduction over straightforward controls is obtained. These parametric transformations are compared against optimal, additive nonparametric transformations obtained by using the ACE algorithm and are shown, in comparison to the results from ACE, to be nearly optimal.

4.
In this paper, we extend the focused information criterion (FIC) to copula models. Copulas are often used for applications where the joint tail behavior of the variables is of particular interest, and selecting a copula that captures this well is then essential. Traditional model selection methods such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) aim at finding the overall best-fitting model, which is not necessarily the one best suited for the application at hand. The FIC, on the other hand, evaluates and ranks candidate models based on the precision of their point estimates of a context-given focus parameter. This could be any quantity of particular interest, for example, the mean, a correlation, conditional probabilities, or measures of tail dependence. We derive FIC formulae for the maximum likelihood estimator, the two-stage maximum likelihood estimator, and the so-called pseudo-maximum-likelihood (PML) estimator combined with parametric margins. Furthermore, we confirm the validity of the AIC formula for the PML estimator combined with parametric margins. To study the numerical behavior of FIC, we have carried out a simulation study, and we have also analyzed a multivariate data set pertaining to abalones. The results from the study show that the FIC successfully ranks candidate models in terms of their performance, defined as how well they estimate the focus parameter. In terms of estimation precision, FIC clearly outperforms AIC, especially when the focus parameter relates to only a specific part of the model, such as the conditional upper-tail probability.

5.
The statistical shape theory via QR decomposition and based on Gaussian and isotropic models is extended in this paper to the families of non-isotropic elliptical distributions. The new shape distributions are easily computable, so the inference procedure can be studied with the resulting exact densities. An application in biology is studied under two Kotz models; the best (non-Gaussian) distribution is selected by using a modified Bayesian information criterion (BIC)*.

6.
Multivariate mixture regression models can be used to investigate the relationships between two or more response variables and a set of predictor variables by taking into consideration unobserved population heterogeneity. It is common to take multivariate normal distributions as mixing components, but this mixing model is sensitive to heavy-tailed errors and outliers. Although normal mixture models can approximate any distribution in principle, the number of components needed to account for heavy-tailed distributions can be very large. Mixture regression models based on the multivariate t distributions can be considered as a robust alternative approach. Missing data are inevitable in many situations and parameter estimates could be biased if the missing values are not handled properly. In this paper, we propose a multivariate t mixture regression model with missing information to model heterogeneity in regression function in the presence of outliers and missing values. Along with the robust parameter estimation, our proposed method can be used for (i) visualization of the partial correlation between response variables across latent classes and heterogeneous regressions, and (ii) outlier detection and robust clustering even under the presence of missing values. We also propose a multivariate t mixture regression model using MM-estimation with missing information that is robust to high-leverage outliers. The proposed methodologies are illustrated through simulation studies and real data analysis.

7.
A number of articles have discussed the way lower order polynomial and interaction terms should be handled in linear regression models. Only if all lower order terms are included in the model will the regression model be invariant with respect to coding transformations of the variables. If lower order terms are omitted, the regression model will not be well formulated. In this paper, we extend this work to examine the implications of the ordering of variables in the linear mixed-effects model. We demonstrate how linear transformations of the variables affect the model and tests of significance of fixed effects in the model. We show how the transformations modify the random effects in the model, as well as their covariance matrix and the value of the restricted log-likelihood. We suggest a variable selection strategy for the linear mixed-effects model.

8.
Ordered multiple categorical (MC) variables have been widely considered and studied as response variables, but few studies have carefully considered them as predictors in linear regression. In that setting, the existence of some pseudo-categories may result in overfitting, and detecting those pseudo-categories by hypothesis tests on all dummy variables might have low specificity. In this paper, we propose a transformation method of dummy variables for such ordered MC predictors, after which a model selection method combined with BIC is elaborated. Theoretical consistency of our model selection method is established under some common assumptions. Both simulation studies and real data analysis of a medical survey indicate that our method provides good performance and is applicable to a wide range of biomedical research.

9.
A note on the correlation structure of transformed Gaussian random fields   Cited by: 1 (self-citations: 0, citations by others: 1)
Transformed Gaussian random fields can be used to model continuous time series and spatial data when the Gaussian assumption is not appropriate. The main features of these random fields are specified in a transformed scale, while for modelling and parameter interpretation it is useful to establish connections between these features and those of the random field in the original scale. This paper provides evidence that for many ‘normalizing’ transformations the correlation function of a transformed Gaussian random field is not very dependent on the transformation that is used. Hence many commonly used transformations of correlated data have little effect on the original correlation structure. The property is shown to hold for some kinds of transformed Gaussian random fields, and a statistical explanation based on the concept of parameter orthogonality is provided. The property is also illustrated using two spatial datasets and several ‘normalizing’ transformations. Some consequences of this property for modelling and inference are also discussed.

10.
Recently, authors have studied the weighted version of the Kerridge inaccuracy measure for left/right truncated distributions. In the present communication we introduce the notion of a weighted interval inaccuracy measure and study it in the context of two-sided truncated random variables. In reliability theory and survival analysis, this measure may help to study the various characteristics of a system/component when it fails between two time points. We study various properties of this measure, including the effect of monotone transformations, and obtain its upper and lower bounds. It is shown that the proposed measure can uniquely determine the distribution function, and characterizations of some important life distributions are provided. This new measure is a generalization of the recent weighted residual (past) inaccuracy measure.

11.
Stock & Watson (1999) consider the relative quality of different univariate forecasting techniques. This paper extends their study on forecasting practice, comparing the forecasting performance of two popular model selection procedures, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). This paper considers several topics: how AIC and BIC choose lags in autoregressive models on actual series, how models so selected forecast relative to an AR(4) model, the effect of using a maximum lag on model selection, and the forecasting performance of combining AR(4), AIC, and BIC models with an equal weight.
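To make "how AIC and BIC choose lags in autoregressive models" concrete, here is a minimal sketch of lag selection under a fixed maximum lag. It is an illustration under stated assumptions, not the procedure of the paper above: each AR(p) model is fitted by OLS with an intercept on a common estimation sample, the criteria use the Gaussian form n·log(σ̂²) plus a penalty, and the names `fit_ar_ols` and `select_lag` are hypothetical.

```python
import numpy as np

def fit_ar_ols(y, p, max_lag):
    """Fit AR(p) with intercept by OLS, dropping the first max_lag points
    so that every candidate order is compared on the same sample."""
    y = np.asarray(y, dtype=float)
    target = y[max_lag:]
    columns = [np.ones_like(target)]
    for k in range(1, p + 1):
        columns.append(y[max_lag - k : len(y) - k])  # lag-k regressor
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    n = len(target)
    sigma2 = resid @ resid / n  # ML estimate of the innovation variance
    return n, sigma2

def select_lag(y, max_lag):
    """Return (lag chosen by AIC, lag chosen by BIC, all criterion values)."""
    results = {}
    for p in range(max_lag + 1):
        n, sigma2 = fit_ar_ols(y, p, max_lag)
        k = p + 1  # intercept plus p autoregressive coefficients
        aic = n * np.log(sigma2) + 2 * k
        bic = n * np.log(sigma2) + k * np.log(n)
        results[p] = (aic, bic)
    best_aic = min(results, key=lambda p: results[p][0])
    best_bic = min(results, key=lambda p: results[p][1])
    return best_aic, best_bic, results
```

Because the BIC penalty per parameter, log(n), exceeds the AIC penalty of 2 once n is larger than about 8, the lag chosen by BIC in this setup is never larger than the lag chosen by AIC. The choice of `max_lag` changes the estimation sample and the candidate set, which is one reason a maximum lag can affect selection.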

12.
Given dichotomized data from a bivariate normal distribution, the tetrachoric correlation coefficient provides a reasonable estimate of Pearson's correlation between the underlying variables. Greer et al. [2003. A Monte Carlo evaluation of the tetrachoric correlation coefficient. Educ. Psychol. Meas. 63, 931–950] suggested that this may be the case also under suitable transformations of the margins. As a complement to their work, this paper considers the estimation of Pearson's correlation between variables that are normal, but not jointly. A small Monte Carlo study is used to assess whether various approximations of the tetrachoric correlation coefficient could be helpful in this context. The results are encouraging, in terms of both bias and mean-square error.

13.
In many of the applied sciences, it is common that the forms of empirical relationships are almost completely unknown prior to study. Scatterplot smoothers used in nonparametric regression methods have considerable potential to ease the burden of model specification that a researcher would otherwise face in this situation. Occasionally the researcher will know the sign of the first or second derivatives, or both. This article develops a smoothing method that can incorporate this kind of information. I show that cubic regression splines with bounds on the coefficients offer a simple and effective approximation to monotonic, convex or concave transformations. I also discuss methods for testing whether the constraints should be imposed. Monte Carlo results indicate that this method, dubbed CoSmo, has a lower approximation error than either locally weighted regression or two other constrained smoothing methods. CoSmo has many potential applications and should be especially useful in applied econometrics. As an illustration, I apply CoSmo in a multivariate context to estimate a hedonic price function and to test for concavity in one of the variables.

14.
Reply     
In the class of stochastic volatility (SV) models, leverage effects are typically specified through the direct correlation between the innovations in both returns and volatility, resulting in the dynamic leverage (DL) model. Recently, two asymmetric SV models based on threshold effects have been proposed in the literature. As such models consider only the sign of the previous return and neglect its magnitude, this paper proposes a dynamic asymmetric leverage (DAL) model that accommodates the direct correlation as well as the sign and magnitude of the threshold effects. A special case of the DAL model with zero direct correlation between the innovations is the asymmetric leverage (AL) model. The dynamic asymmetric leverage models are estimated by the Monte Carlo likelihood (MCL) method. Monte Carlo experiments are presented to examine the finite sample properties of the estimator. For a sample size of T = 2000 with 500 replications, the sample means, standard deviations, and root mean squared errors of the MCL estimators indicate only a small finite sample bias. The empirical estimates for S&P 500 and TOPIX financial returns, and USD/AUD and YEN/USD exchange rates, indicate that the DAL class, including the DL and AL models, is generally superior to threshold SV models with respect to AIC and BIC, with AL typically providing the best fit to the data.

15.
Moderated multiple regression provides a useful framework for understanding moderator variables. These variables can also be examined within multilevel datasets, although the literature is not clear on the best way to assess data for significant moderating effects, particularly within a multilevel modeling framework. This study explores potential ways to test moderation at the individual level (level one) within a 2-level multilevel modeling framework, with varying effect sizes, cluster sizes, and numbers of clusters. The study examines five potential methods for testing interaction effects: the Wald test, F-test, likelihood ratio test, Bayesian information criterion (BIC), and Akaike information criterion (AIC). For each method, the simulation study examines Type I error rates and power. Following the simulation study, an applied study uses real data to assess interaction effects using the same five methods. Results indicate that the Wald test, F-test, and likelihood ratio test all perform similarly in terms of Type I error rates and power. Type I error rates for the AIC are more liberal, and for the BIC typically more conservative. A four-step procedure for applied researchers interested in examining interaction effects in multi-level models is provided.

16.
The goal of the current paper is to compare consistent and inconsistent model selection criteria by looking at their convergence rates (to be defined in the first section). The prototypes of the two types of criteria are the AIC and BIC criterion respectively. For linear regression models with normally distributed errors, we show that the convergence rates for AIC and BIC are $O(n^{-1})$ and $O((n \log n)^{-1/2})$ respectively. When the error distributions are unknown, the two criteria become indistinguishable, all having convergence rate $O(n^{-1/2})$. We also argue that the BIC criterion has nearly optimal convergence rate. The results partially justified some of the controversial simulation results in which inconsistent criteria seem to outperform consistent ones.

17.
In this article, we propose a new empirical information criterion (EIC) for model selection which penalizes the likelihood of the data by a non-linear function of the number of parameters in the model. It is designed to be used where there are a large number of time series to be forecast. However, a bootstrap version of the EIC can be used where there is a single time series to be forecast. The EIC provides a data-driven model selection tool that can be tuned to the particular forecasting task.

We compare the EIC with other model selection criteria including Akaike’s information criterion (AIC) and Schwarz’s Bayesian information criterion (BIC). The comparisons show that for the M3 forecasting competition data, the EIC outperforms both the AIC and BIC, particularly for longer forecast horizons. We also compare the criteria on simulated data and find that the EIC does better than existing criteria in that case also.

18.
In medical studies, the Cox proportional hazards model is a commonly used method to deal with right-censored survival data accompanied by many explanatory covariates. In practice, Akaike's information criterion (AIC) or the Bayesian information criterion (BIC) is usually used to select an appropriate subset of covariates. It is well known that neither the AIC nor the BIC dominates in all situations. In this paper, we propose an adaptive-Cox model averaging procedure to get a more robust hazard estimator. First, by applying AIC and BIC criteria to perturbed datasets, we obtain two model averaging (MA) estimated survival curves, called AIC-MA and BIC-MA. Then, based on Kullback–Leibler loss, the better estimate of the survival curve between AIC-MA and BIC-MA is chosen, which results in an adaptive-Cox estimate of the survival curve. Simulation results show the superiority of our approach, and an application of the proposed method is also presented by analyzing the German Breast Cancer Study dataset.

19.
This paper is concerned with model selection and model averaging procedures for partially linear single-index models. The profile least squares procedure is employed to estimate regression coefficients for the full model and submodels. We show that the estimators for submodels are asymptotically normal. Based on the asymptotic distribution of the estimators, we derive the focused information criterion (FIC), formulate the frequentist model average (FMA) estimators and construct proper confidence intervals for FMA estimators and FIC estimator, a special case of FMA estimators. Monte Carlo studies are performed to demonstrate the superiority of the proposed method over the full model, and over models chosen by AIC or BIC in terms of coverage probability and mean squared error. Our approach is further applied to real data from a male fertility study to explore potential factors related to sperm concentration and estimate the relationship between sperm concentration and monobutyl phthalate.

20.
It is common practice to compare the fit of non-nested models using the Akaike (AIC) or Bayesian (BIC) information criteria. The basis of these criteria is the log-likelihood evaluated at the maximum likelihood estimates of the unknown parameters. For the general linear model (and the linear mixed model, which is a special case), estimation is usually carried out using residual or restricted maximum likelihood (REML). However, for models with different fixed effects, the residual likelihoods are not comparable and hence information criteria based on the residual likelihood cannot be used. For model selection, it is often suggested that the models are refitted using maximum likelihood to enable the criteria to be used. The first aim of this paper is to highlight that both the AIC and BIC can be used for the general linear model by using the full log-likelihood evaluated at the REML estimates. The second aim is to provide a derivation of the criteria under REML estimation. This aim is achieved by noting that the full likelihood can be decomposed into a marginal (residual) and conditional likelihood and this decomposition then incorporates aspects of both the fixed effects and variance parameters. Using this decomposition, the appropriate information criteria for model selection of models which differ in their fixed effects specification can be derived. An example is presented to illustrate the results and code is available for analyses using the ASReml-R package.
