Similar literature
 20 similar documents found (search time: 15 ms)
1.
Inflated data are prevalent in many situations, and a variety of inflated models with extensions have been derived to fit data with excessive counts of some particular responses. The family of information criteria (IC) has been used to compare the fit of models for selection purposes. Yet despite their common use in statistical applications, few studies have evaluated the performance of IC in inflated models. In this study, we examined the performance of IC for dual-inflated data. The new zero- and K-inflated Poisson (ZKIP) regression model and conventional models, including Poisson regression and zero-inflated Poisson (ZIP) regression, were fitted to dual-inflated data, and the performance of the IC was compared. The effects of sample size and of the proportion of inflated observations on selection performance were also examined. The results suggest that the Bayesian information criterion (BIC) and the consistent Akaike information criterion (CAIC) are more accurate than the Akaike information criterion (AIC) in terms of model selection when the true model is simple (i.e., Poisson regression (POI)). For more complex models, such as ZIP and ZKIP, the AIC was consistently better than the BIC and CAIC, although it did not reach high levels of accuracy when the sample size and the proportion of zero observations were small. The AIC tended to over-fit the data for the POI, whereas the BIC and CAIC tended to under-parameterize the data for ZIP and ZKIP. Therefore, it is desirable to study other model selection criteria for dual-inflated data with small sample sizes.
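
The three criteria compared in this abstract can be sketched in a few lines. This is a minimal illustration using the standard textbook definitions, not the authors' code; the function name `information_criteria` and its arguments are hypothetical, and the maximized log-likelihood is assumed to come from whatever model-fitting routine is in use.

```python
import math


def information_criteria(log_lik: float, k: int, n: int) -> dict:
    """Return AIC, BIC and CAIC for one fitted model.

    log_lik: maximized log-likelihood of the fitted model
    k:       number of freely estimated parameters
    n:       number of observations
    """
    aic = -2.0 * log_lik + 2.0 * k                    # fixed penalty of 2 per parameter
    bic = -2.0 * log_lik + k * math.log(n)            # penalty grows with log(n)
    caic = -2.0 * log_lik + k * (math.log(n) + 1.0)   # BIC plus one extra unit per parameter
    return {"AIC": aic, "BIC": bic, "CAIC": caic}


# Illustrative call with made-up numbers (not taken from the study).
scores = information_criteria(log_lik=-100.0, k=3, n=50)
```

For each criterion the model with the smallest value is preferred. Because the BIC and CAIC penalties exceed the AIC penalty once n is moderately large, they lean toward smaller models, which is consistent with the under-parameterization reported above.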

2.
In this article, we propose a new empirical information criterion (EIC) for model selection which penalizes the likelihood of the data by a non-linear function of the number of parameters in the model. It is designed to be used where there are a large number of time series to be forecast. However, a bootstrap version of the EIC can be used where there is a single time series to be forecast. The EIC provides a data-driven model selection tool that can be tuned to the particular forecasting task.

We compare the EIC with other model selection criteria including Akaike’s information criterion (AIC) and Schwarz’s Bayesian information criterion (BIC). The comparisons show that for the M3 forecasting competition data, the EIC outperforms both the AIC and BIC, particularly for longer forecast horizons. We also compare the criteria on simulated data and find that the EIC does better than existing criteria in that case also.

3.
The autoregressive model is a popular method for analysing time-dependent data, where selection of the order parameter is imperative. Two commonly used selection criteria are the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), which are known to suffer from potential overfitting and underfitting, respectively. To our knowledge, no criterion in the literature performs satisfactorily under various situations. Therefore, in this paper, we focus on forecasting the future values of an observed time series and propose an adaptive idea that combines the advantages of AIC and BIC while mitigating their weaknesses, based on the concept of generalized degrees of freedom. Instead of applying a fixed criterion to select the order parameter, we propose an approximately unbiased estimator of mean squared prediction errors, based on a data perturbation technique, for a fair comparison between AIC and BIC. We then use the selected criterion to determine the final order parameter. Some numerical experiments are performed to show the superiority of the proposed method, and a real data set of the retail price index of China from 1952 to 2008 is also used for illustration.

4.
In time series modeling, consistent criteria such as the Bayesian information criterion (BIC) outperform loss-efficient criteria such as the Akaike information criterion (AIC) in terms of predictability when data are generated by a finite-order autoregressive process, and the reverse is true when data are generated by an infinite-order autoregressive process. Since in practice we don’t know the data-generating process, it is useful to have an adaptive criterion that behaves either as a consistent or as a loss-efficient criterion, whichever performs better. Here we derive such a criterion. Moreover, our criterion is adaptive to effective sample sizes and not sensitive to maximum a priori determined order limits.

5.
This paper derives the Akaike information criterion (AIC), the corrected AIC, the Bayesian information criterion (BIC) and Hannan and Quinn’s information criterion for approximate factor models assuming a large number of cross-sectional observations, and studies the consistency properties of these information criteria. It also reports extensive simulation results comparing the performance of the extant and new procedures for the selection of the number of factors. The simulation results show the difficulty of determining which criterion performs best. In practice, it is advisable to consider several criteria at the same time, especially Hannan and Quinn’s information criterion, Bai and Ng’s ICp2 and BIC3, and Onatski’s and Ahn and Horenstein’s eigenvalue-based criteria. The model-selection criteria considered in this paper are also applied to Stock and Watson’s two macroeconomic data sets. The results differ considerably depending on the model-selection criterion in use, but evidence suggesting five factors for the first data set and five to seven factors for the second data set is obtainable.

6.
In the problem of selecting variables in a multivariate linear regression model, we derive new Bayesian information criteria based on a prior mixing a smooth distribution and a delta distribution. Each of them can be interpreted as a fusion of the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). Inheriting their asymptotic properties, our information criteria are consistent in variable selection in both the large-sample and the high-dimensional asymptotic frameworks. In numerical simulations, variable selection methods based on our information criteria choose the true set of variables with high probability in most cases.

7.
In a recent volume of this journal, Holden [Testing the normality assumption in the Tobit Model, J. Appl. Stat. 31 (2004) pp. 521–532] presents Monte Carlo evidence comparing several tests for departures from normality in the Tobit Model. This study adds to the work of Holden by considering another test, and several information criteria, for detecting departures from normality in the Tobit Model. The test given here is a modified likelihood ratio statistic based on a partially adaptive estimator of the Censored Regression Model using the approach of Caudill [A partially adaptive estimator for the Censored Regression Model based on a mixture of normal distributions, Working Paper, Department of Economics, Auburn University, 2007]. The information criteria examined include Akaike’s Information Criterion (AIC), the Consistent AIC (CAIC), the Bayesian information criterion (BIC), and Akaike’s BIC (ABIC). In terms of fewest ‘rejections’ of a true null, the best performance is exhibited by the CAIC and the BIC, although, like some of the statistics examined by Holden, there are computational difficulties with each.

8.
It is quite common in epidemiology that we wish to assess the quality of estimators on a particular set of information, whereas the estimators may use a larger set of information. Two examples are studied: the first occurs when we construct a model for an event which happens if a continuous variable is above a certain threshold. We can compare estimators based on the observation of only the event or on the whole continuous variable. The other example is that of predicting the survival based only on survival information or using in addition information on a disease. We develop modified Akaike information criterion (AIC) and likelihood cross-validation (LCV) criteria to compare estimators in this non-standard situation. We show that a normalized difference of AIC has a bias equal to $o(n^{-1})$ if the estimators are based on well-specified models; a normalized difference of LCV always has a bias equal to $o(n^{-1})$. A simulation study shows that both criteria work well, although the normalized difference of LCV tends to be better and is more robust. Moreover, in the case of well-specified models the difference of risks boils down to the difference of statistical risks, which can be rather precisely estimated. For ‘compatible’ models the difference of risks is often the main term but there can also be a difference of mis-specification risks.

9.
In medical studies, the Cox proportional hazards model is a commonly used method for dealing with right-censored survival data accompanied by many explanatory covariates. In practice, Akaike's information criterion (AIC) or the Bayesian information criterion (BIC) is usually used to select an appropriate subset of covariates. It is well known that neither the AIC nor the BIC dominates in all situations. In this paper, we propose an adaptive-Cox model averaging procedure to obtain a more robust hazard estimator. First, by applying the AIC and BIC criteria to perturbed datasets, we obtain two model averaging (MA) estimated survival curves, called AIC-MA and BIC-MA. Then, based on Kullback–Leibler loss, the better of the AIC-MA and BIC-MA survival curve estimates is chosen, which results in an adaptive-Cox estimate of the survival curve. Simulation results show the superiority of our approach, and an application of the proposed method is also presented by analyzing the German Breast Cancer Study dataset.

10.
We introduce a modified version $f_n$ of the piecewise linear histogram of Beirlant et al. (1998) which is a true probability density, i.e., $f_n \ge 0$ and $\int f_n = 1$. We prove that $f_n$ estimates the underlying density $f$ strongly consistently in the $L_1$ norm, derive large deviation inequalities for the $L_1$ error $\|f_n - f\|$ and prove that $E\|f_n - f\|$ tends to zero with the rate $n^{-1/3}$. We also show that the derivative $f_n'$ estimates consistently in the expected $L_1$ error the derivative $f'$ of a sufficiently smooth density and evaluate the rate of convergence $n^{-1/5}$ for $E\|f_n' - f'\|$. The estimator $f_n$ thus makes it possible to approximate $f$ in the Besov space with a guaranteed rate of convergence. Optimization of the smoothing parameter is also studied. The theoretical or experimentally approximated values of the expected errors $E\|f_n - f\|$ and $E\|f_n' - f'\|$ are compared with the errors achieved by the histogram of Beirlant et al., and other nonparametric methods.

11.
Model selection strategies play an important, if not explicit, role in quantitative research. The inferential properties of these strategies are largely unknown; therefore, there is little basis for recommending (or avoiding) any particular set of strategies. In this paper, we evaluate several commonly used model selection procedures [Bayesian information criterion (BIC), adjusted $R^2$, Mallows’ $C_p$, Akaike information criterion (AIC), AICc, and stepwise regression] using Monte Carlo simulation of model selection when the true data generating processes (DGP) are known.

We find that the ability of these selection procedures to include important variables and exclude irrelevant variables increases with the size of the sample and decreases with the amount of noise in the model. None of the model selection procedures does well in small samples, even when the true DGP is largely deterministic; thus, data mining in small samples should be avoided entirely. Instead, the implicit uncertainty in model specification should be explicitly discussed. In large samples, BIC is better than the other procedures at correctly identifying most of the generating processes we simulated, and stepwise does almost as well. In the absence of strong theory, both BIC and stepwise appear to be reasonable model selection strategies in large samples. Under the conditions simulated, adjusted $R^2$, Mallows’ $C_p$, AIC, and AICc are clearly inferior and should be avoided.


12.
Moderated multiple regression provides a useful framework for understanding moderator variables. These variables can also be examined within multilevel datasets, although the literature is not clear on the best way to assess data for significant moderating effects, particularly within a multilevel modeling framework. This study explores potential ways to test moderation at the individual level (level one) within a 2-level multilevel modeling framework, with varying effect sizes, cluster sizes, and numbers of clusters. The study examines five potential methods for testing interaction effects: the Wald test, F-test, likelihood ratio test, Bayesian information criterion (BIC), and Akaike information criterion (AIC). For each method, the simulation study examines Type I error rates and power. Following the simulation study, an applied study uses real data to assess interaction effects using the same five methods. Results indicate that the Wald test, F-test, and likelihood ratio test all perform similarly in terms of Type I error rates and power. Type I error rates for the AIC are more liberal, and for the BIC typically more conservative. A four-step procedure for applied researchers interested in examining interaction effects in multi-level models is provided.

13.
Consider the regression model $Y_i = g(x_i) + e_i$, $i = 1, \dots, n$, where $g$ is an unknown function defined on $[0, 1]$, $0 = x_0 < x_1 < \dots < x_n \le 1$ are chosen so that $\max_{1 \le i \le n}(x_i - x_{i-1}) = O(n^{-1})$, and where $\{e_i\}$ are i.i.d. with $E e_1 = 0$ and $\operatorname{Var} e_1 = \sigma^2$. In a previous paper, Cheng & Lin (1979) study three estimators of $g$, namely, $g_{1n}$ of Cheng & Lin (1979), $g_{2n}$ of Clark (1977), and $g_{3n}$ of Priestley & Chao (1972). Consistency results are established and rates of strong uniform convergence are obtained. In the current investigation the limiting distribution of $g_{in}$, $i = 1, 2, 3$, and that of the isotonic estimator $g_n^{**}$ are considered.

14.
Stock & Watson (1999) consider the relative quality of different univariate forecasting techniques. This paper extends their study on forecasting practice, comparing the forecasting performance of two popular model selection procedures, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). This paper considers several topics: how AIC and BIC choose lags in autoregressive models on actual series, how models so selected forecast relative to an AR(4) model, the effect of using a maximum lag on model selection, and the forecasting performance of combining AR(4), AIC, and BIC models with an equal weight.
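
Lag selection of the kind described in this abstract can be illustrated with a short sketch. It is not the paper's procedure: it assumes a Gaussian AR model with an intercept fitted by ordinary least squares, evaluates every candidate order on the same estimation sample, and uses the criteria up to an additive constant. The function name `select_ar_order` and the simulated series are hypothetical.

```python
import numpy as np


def select_ar_order(y, max_lag):
    """Choose an AR order by AIC and by BIC, using OLS fits on a common sample."""
    y = np.asarray(y, dtype=float)
    n_eff = len(y) - max_lag          # same estimation sample for every order
    target = y[max_lag:]
    scores = {}
    for p in range(max_lag + 1):
        # Design matrix: intercept plus lags 1..p of the series.
        cols = [np.ones(n_eff)]
        cols += [y[max_lag - j: len(y) - j] for j in range(1, p + 1)]
        X = np.column_stack(cols)
        beta, *_ = np.linalg.lstsq(X, target, rcond=None)
        resid = target - X @ beta
        sigma2 = resid @ resid / n_eff    # ML estimate of the error variance
        k = p + 1                         # intercept plus p lag coefficients
        aic = n_eff * np.log(sigma2) + 2 * k
        bic = n_eff * np.log(sigma2) + k * np.log(n_eff)
        scores[p] = (aic, bic)
    best_aic = min(scores, key=lambda p: scores[p][0])
    best_bic = min(scores, key=lambda p: scores[p][1])
    return best_aic, best_bic, scores


# Simulated AR(2) series, for illustration only.
rng = np.random.default_rng(0)
y = np.zeros(500)
for t in range(2, len(y)):
    y[t] = 0.5 * y[t - 1] - 0.6 * y[t - 2] + rng.standard_normal()

order_aic, order_bic, _ = select_ar_order(y, max_lag=8)
```

The maximum lag matters because it fixes both the candidate set and the common estimation sample, which is one reason the abstract lists it as a separate topic.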

15.
We investigate an empirical Bayes testing problem in a positive exponential family having pdf $f(x \mid \theta) = c(\theta)u(x)\exp(-x/\theta)$, $x > 0$, $\theta > 0$. It is assumed that $\theta$ is in some known compact interval $[C_1, C_2]$. The value $C_1$ is used in the construction of the proposed empirical Bayes test $\delta_n^*$. The asymptotic optimality and rate of convergence of its associated Bayes risk are studied. It is shown that, under the assumption that $\theta$ is in $[C_1, C_2]$, $\delta_n^*$ is asymptotically optimal at a rate of convergence of order $O(n^{-1} \ln n)$. Also, $\delta_n^*$ is robust in the sense that it still possesses the asymptotic optimality even when the assumption that $C_1 \le \theta \le C_2$ does not hold.

16.
17.
In this paper, we use the Bayesian method in the application of hypothesis testing and model selection to determine the order of a Markov chain. The criteria used are based on Bayes factors with noninformative priors. Comparisons with the commonly used AIC and BIC criteria are made through an example and computer simulations. The results show that the proposed method is better than the AIC and BIC criteria, especially for Markov chains with higher orders and larger state spaces.

18.
Sharp rates of convergence of histogram estimates of the marginal density of a linear process are obtained. Histograms can achieve optimal rates of convergence $(n^{-1} \log n)^{1/3}$ under general conditions. The assumptions involved are easily verifiable. Histograms appear to be very good estimators from the point of view of uniform convergence.

19.
The supremum of random variables representing a sequence of rewards is of interest in establishing the existence of optimal stopping rules. Necessary and sufficient conditions are given for existence of moments of $\sup_n (X_n - c_n)$ and $\sup_n (S_n - c_n)$, where $X_1, X_2, \dots$ are i.i.d. random variables, $S_n = X_1 + \dots + X_n$, and $c_n = (nL(n))^{1/r}$, $0 < r < 2$, with $L = 1$, $L = \log$, and $L = \log\log$. Following Cohn (1974), “rates of convergence” results are used in the proof.

20.
This paper proposes an adaptive model selection criterion with a data-driven penalty term. We treat model selection as an equality-constrained minimization problem and develop an adaptive model selection procedure based on the Lagrange optimization method. In contrast to Akaike's information criterion (AIC), the Bayesian information criterion (BIC) and most other existing criteria, this new criterion minimizes the model size and takes a measure of lack-of-fit as an adaptive penalty. Both theoretical results and simulations illustrate the power of this criterion with respect to consistency and pointwise asymptotic loss efficiency in the parametric and nonparametric cases.
