To measure the distance between a robust function evaluated under the true regression model and under a fitted model, we propose generalized Kullback–Leibler information. Using this generalization we have developed three robust model selection criteria, AICR*, AICCR* and AICCR, that allow the selection of candidate models that not only fit the majority of the data but also take into account non-normally distributed errors. The AICR* and AICCR criteria can unify most existing Akaike information criteria; three examples of such unification are given. Simulation studies are presented to illustrate the relative performance of each criterion.

An asymptotic expansion of the cross-validation criterion (CVC) using the Kullback-Leibler distance is derived when the leave-k-out method is used and when parameters are estimated by the weighted score method. By this expansion, the asymptotic bias of the Takeuchi information criterion (TIC) is derived as well as that of the CVC. Under canonical parametrization in the exponential family of distributions when maximum likelihood estimation is used, the magnitudes of the asymptotic biases of the Akaike information criterion (AIC) and CVC are shown to be smaller than that of the TIC. Examples in typical statistical distributions are shown.

This article deals with a criterion for selecting variables in multiple-group discriminant analysis for high-dimensional data. The variable selection models considered for discriminant analysis in Fujikoshi (1985, 2002) are based on the additional-information formulation due to Rao (1948, 1970). Our criterion is based on the Akaike information criterion (AIC) for this model. The AIC has been used successfully in the literature for model selection when the dimension p is smaller than the sample size N. However, the case p > N has not been considered in the literature, because the MLE cannot be computed when the within-group covariance matrix is singular. A popular method for addressing the singularity problem in high-dimensional classification is regularization, which replaces the within-group sample covariance matrix with a ridge-type covariance estimate to stabilize the estimate. In this article, we propose an AIC-type criterion that replaces the MLE of the within-group covariance matrix with a ridge-type estimator. This idea follows Srivastava and Kubokawa (2008). Simulations revealed that our proposed criterion performs well.

The generalized method of moments (GMM) and empirical likelihood (EL) are popular methods for combining sample and auxiliary information. These methods are used in very diverse fields of research, where competing theories often suggest variables satisfying different moment conditions. Results in the literature have shown that the efficient-GMM (GMME) and maximum empirical likelihood (MEL) estimators have the same asymptotic distribution to order n^(-1/2) and that both estimators are asymptotically semiparametric efficient. In this paper, we demonstrate that when data are missing at random from the sample, the utilization of some well-known missing-data handling approaches proposed in the literature can yield GMME and MEL estimators with nonidentical properties; in particular, it is shown that the GMME estimator is semiparametric efficient under all the missing-data handling approaches considered but that the MEL estimator is not always efficient. A thorough examination of the reason for the nonequivalence of the two estimators is presented. A particularly strong feature of our analysis is that we do not assume smoothness in the underlying moment conditions. Our results are thus relevant to situations involving nonsmooth estimating functions, including quantile and rank regressions, robust estimation, the estimation of receiver operating characteristic (ROC) curves, and so on.

A class of predictive densities is derived by weighting the observed samples in maximizing the log-likelihood function. This approach is effective in cases such as sample surveys or design of experiments, where the observed covariate follows a different distribution than that in the whole population. Under misspecification of the parametric model, the optimal choice of the weight function is asymptotically shown to be the ratio of the density function of the covariate in the population to that in the observations. This is the pseudo-maximum likelihood estimation of sample surveys. The optimality is defined by the expected Kullback–Leibler loss, and the optimal weight is obtained by considering the importance sampling identity. Under correct specification of the model, however, the ordinary maximum likelihood estimate (i.e. the uniform weight) is shown to be optimal asymptotically. For moderate sample size, the situation is in between the two extreme cases, and the weight function is selected by minimizing a variant of the information criterion derived as an estimate of the expected loss. The method is also applied to a weighted version of the Bayesian predictive density. Numerical examples as well as Monte-Carlo simulations are shown for polynomial regression. A connection with the robust parametric estimation is discussed.

This paper is concerned with the problem of selecting variables in two-group discriminant analysis for high-dimensional data with fewer observations than the dimension. We consider a selection criterion based on an approximately unbiased estimator of an AIC-type risk. When the dimension is large compared to the sample size, the AIC-type risk cannot be defined. We propose an AIC that replaces the maximum likelihood estimator with a ridge-type estimator. This idea follows Srivastava and Kubokawa (2008). It has been further extended by Yamamura et al. (2010). Simulations revealed that the proposed AIC performs well.

Some statistics in common use take the form of a ratio of two statistics. In this paper, we discuss asymptotic properties of the ratio statistic. We obtain an asymptotic representation of the ratio with remainder term o_p(n^(-1)) and an Edgeworth expansion with remainder term o(n^(-1/2)). As an example, the asymptotic representation and the Edgeworth expansion of the jackknife skewness estimator for U-statistics are established, and we discuss the biases of the skewness estimator theoretically. We also apply the result to an estimator of Pearson's coefficient of variation and to the sample correlation coefficient.

The autoregressive model is a popular method for analysing time-dependent data, where selection of the order parameter is imperative. Two commonly used selection criteria are the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), which are known to suffer from potential overfitting and underfitting, respectively. To our knowledge, no criterion in the literature performs satisfactorily under various situations. Therefore, in this paper, we focus on forecasting the future values of an observed time series and propose an adaptive idea that combines the advantages of AIC and BIC while mitigating their weaknesses, based on the concept of generalized degrees of freedom. Instead of applying a fixed criterion to select the order parameter, we propose an approximately unbiased estimator of the mean squared prediction error, based on a data perturbation technique, for fairly comparing AIC and BIC. The selected criterion is then used to determine the final order parameter. Some numerical experiments are performed to show the superiority of the proposed method, and a real data set of the retail price index of China from 1952 to 2008 is also used for illustration.
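
As a rough illustration of why the two criteria can disagree on the order, the sketch below scores candidate AR orders with the usual Gaussian forms AIC = n log(sigma^2) + 2k and BIC = n log(sigma^2) + k log(n). It is not the adaptive method of the abstract. The residual variances are hypothetical numbers, and counting only the k autoregressive coefficients as parameters is an assumption.

```python
import math

def aic(n, resid_var, k):
    """Gaussian AIC up to an additive constant: n*log(sigma^2) + 2k."""
    return n * math.log(resid_var) + 2 * k

def bic(n, resid_var, k):
    """Gaussian BIC up to an additive constant: n*log(sigma^2) + k*log(n)."""
    return n * math.log(resid_var) + k * math.log(n)

def select_order(resid_vars, n, criterion):
    """Return the candidate order with the smallest criterion value.

    resid_vars maps each candidate order k to the residual variance
    of the AR(k) fit (assumed to be computed elsewhere).
    """
    return min(resid_vars, key=lambda k: criterion(n, resid_vars[k], k))

# Hypothetical residual variances for AR(1) to AR(4) fitted to n = 100 points.
resid_vars = {1: 1.50, 2: 1.00, 3: 0.97, 4: 0.96}
aic_order = select_order(resid_vars, 100, aic)
bic_order = select_order(resid_vars, 100, bic)
print(aic_order, bic_order)
```

With these made-up variances, the heavier log(n) penalty makes BIC settle on a smaller order than AIC, which is the overfit versus underfit tension the abstract describes.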

We define the maximum-relevance weighted likelihood estimator (MREWLE) using the relevance-weighted likelihood function introduced by Hu and Zidek (1995). Furthermore, we establish the consistency of the MREWLE under a wide range of conditions. Our results generalize those of Wald (1948) to both nonidentically distributed random variables and unequally weighted likelihoods (when dealing with independent data sets of varying relevance to the inferential problem of interest). Asymptotic normality is also proven. The application of these results to generalized smoothing models is discussed.

With respect to random sampling from a finite population, when the correlation between the auxiliary and the main characteristics is negative, the product estimator is often used to estimate the population mean. The product estimator, however, would have a large mean squared error (MSE) if the coefficients of variation for these two characteristics were large and the absolute value of the correlation between them was small. In this paper, we propose a general family of modified product estimators that includes the product estimator as a special case. We provide a discussion on the reduction of the MSE achieved by the optimal modified product estimator, which has the minimal MSE in the proposed family. In certain situations, these reductions of the MSE can be significant.
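
For orientation, the classical product estimator that this family generalizes multiplies the sample mean of the main characteristic by the ratio of the auxiliary sample mean to the known auxiliary population mean. The sketch below shows only that baseline, not the modified estimators of the paper. The function name and the sample values are hypothetical.

```python
from statistics import mean

def product_estimator(y_sample, x_sample, x_pop_mean):
    """Classical product estimator: ybar * xbar / Xbar.

    y_sample: sampled values of the main characteristic
    x_sample: sampled values of the auxiliary characteristic
    x_pop_mean: known population mean of the auxiliary characteristic
    """
    if x_pop_mean == 0:
        raise ValueError("auxiliary population mean must be nonzero")
    return mean(y_sample) * mean(x_sample) / x_pop_mean

# Hypothetical sample in which y decreases as x increases.
y = [12.0, 10.0, 8.0, 6.0]
x = [2.0, 3.0, 4.0, 5.0]
estimate = product_estimator(y, x, x_pop_mean=3.5)
print(estimate)
```

When the auxiliary sample mean falls below its population mean, the estimate is scaled down; with negative correlation, that offsets a sample in which the main characteristic tends to run high.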

Due to the widespread use of the coefficient of variation in empirical finance, we derive its asymptotic sampling distribution in the case of non-iid random variables to deal with autocorrelation and/or conditional heteroskedasticity stylized facts of financial returns. We also propose statistical tests for the comparison of two coefficients of variation based on asymptotic normality and studentized time-series bootstrap. In an illustrative example, we analyze the monthly return volatility of six stock market indexes during the years 1990–2007.

In this paper, we consider the problem of estimating the number of components of a superimposed nonlinear sinusoids model of a signal in the presence of additive noise. We propose and provide a detailed empirical comparison of robust methods for estimation of the number of components. The proposed methods, which are robust modifications of the commonly used information theoretic criteria, are based on various M-estimator approaches and are robust with respect to outliers present in the data and heavy-tailed noise. The proposed methods are compared with the usual non-robust methods through extensive simulations under varied model scenarios. We also present real signal analysis of two speech signals to show the usefulness of the proposed methodology.

In this paper, we study the asymptotic normality of kernel estimators of the density function and its derivatives, as well as of the mode, in the random right-censorship model. The mode estimator is defined as the random variable that maximizes the kernel density estimator. Our results are stated under suitable conditions on the kernel function, the smoothing parameter and both distribution functions that appear in this model. Here, the Kaplan–Meier estimator of the distribution function is used to build the estimates. We carry out a simulation study that shows how well the normal approximation works.

This article deals with some important computational aspects of the generalized von Mises distribution in relation to parameter estimation, model selection and simulation. The generalized von Mises distribution provides a flexible model for circular data allowing for symmetry, asymmetry, unimodality and bimodality. For this model, we show the equivalence between the trigonometric method of moments and the maximum likelihood estimators; we give their asymptotic distribution; we provide bias-corrected estimators of the entropy, the Akaike information criterion and the measured entropy for model selection; and we implement the ratio-of-uniforms method of simulation.

On the use of corrections for overdispersion (total citations: 3; self-citations: 0; citations by others: 3)
In studying fluctuations in the size of a black grouse (Tetrao tetrix) population, an autoregressive model using climatic conditions appears to follow the change quite well. However, the deviance of the model is considerably larger than its number of degrees of freedom. A widely used statistical rule of thumb holds that overdispersion is present in such situations, but model selection based on a direct likelihood approach can produce opposing results. Two further examples, of binomial and of Poisson data, have models with deviances that are almost twice the degrees of freedom and yet various overdispersion models do not fit better than the standard model for independent data. This can arise because the rule of thumb only considers a point estimate of dispersion, without regard for any measure of its precision. A reasonable criterion for detecting overdispersion is that the deviance be at least twice the number of degrees of freedom, the familiar Akaike information criterion, but the actual presence of overdispersion should then be checked by some appropriate modelling procedure.
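
The point estimate behind the rule of thumb can be sketched as the ratio of the deviance to its degrees of freedom. The example below assumes a Poisson model with the standard deviance formula and a hypothetical intercept-only fit. The counts are made up, and the "at least twice" threshold is taken from the abstract rather than derived here.

```python
import math

def poisson_deviance(y, mu):
    """Poisson deviance: 2 * sum(y*log(y/mu) - (y - mu)), with 0*log(0) = 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total

def dispersion_ratio(y, mu, n_params):
    """Deviance divided by residual degrees of freedom (n - n_params)."""
    df = len(y) - n_params
    if df <= 0:
        raise ValueError("need more observations than parameters")
    return poisson_deviance(y, mu) / df

# Hypothetical counts, fitted with an intercept-only model (one parameter).
counts = [0, 1, 3, 8, 2, 12]
fitted = [sum(counts) / len(counts)] * len(counts)
ratio = dispersion_ratio(counts, fitted, n_params=1)
flagged = ratio >= 2.0  # threshold suggested in the abstract
print(round(ratio, 2), flagged)
```

As the abstract cautions, a ratio above the threshold is only a flag: it says nothing about the precision of the dispersion estimate, so an explicit overdispersion model should still be fitted and compared.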

This article considers the first-order autoregressive panel model, a simple dynamic panel data (DPD) model. The generalized method of moments (GMM) gives efficient estimators for these models. This efficiency is affected by the choice of the weighting matrix used in GMM estimation. Conventional GMM estimators use non-optimal weighting matrices, which leads to a loss of efficiency. Therefore, we present new GMM estimators based on optimal or suboptimal weighting matrices. A Monte Carlo study indicates that the new estimators are more reliable than the conventional estimators in terms of bias and efficiency.

The use of asymptotic moments to increase the precision of the control variate technique for Monte Carlo estimation is discussed. An application is made to the estimation of the mean and variance of the likelihood ratio goodness-of-fit statistic with the Pearson statistic used as a control variate. Estimates of the variance reductions are given.
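
A generic control variate adjustment can be sketched as follows. This is a toy illustration, not the procedure of the paper: the target exp(U) and the control U with known mean 0.5 are assumptions standing in for the likelihood ratio and Pearson statistics, and the coefficient is estimated from the same simulated sample.

```python
import math
import random
from statistics import mean

def control_variate_estimate(target, control, control_mean):
    """Adjust the sample mean of `target` using a control with known mean.

    Returns tbar - beta * (cbar - control_mean), where beta is the
    sample regression slope of target on control.
    """
    t_bar = mean(target)
    c_bar = mean(control)
    cov = sum((t - t_bar) * (c - c_bar) for t, c in zip(target, control))
    var = sum((c - c_bar) ** 2 for c in control)
    beta = cov / var
    return t_bar - beta * (c_bar - control_mean)

# Toy problem: estimate E[exp(U)] for U ~ Uniform(0, 1); the exact value is e - 1.
rng = random.Random(0)
u = [rng.random() for _ in range(2000)]
target = [math.exp(v) for v in u]
estimate = control_variate_estimate(target, u, control_mean=0.5)
print(estimate, math.e - 1)
```

The stronger the correlation between the target and the control, the larger the variance reduction. The abstract's contribution is to use asymptotic moments where the exact mean of the control is not available.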

Summary. Data from 20 sporting contests in which the same two teams compete regularly are studied. Strong and weak symmetry requirements for possible models are identified, and some simple models are proposed and fitted to the data. The need to compute the exact likelihood function and the presence of missing values make this non-trivial. Forecasting match outcomes by using the models can give a modest improvement over a naïve forecast. Significance tests for studying the effect of 'match covariates' such as playing at home or away or winning the toss are introduced, and the effect of these covariates is in general found to be quite large.

Calibration of the estimators of variance (total citations: 2; self-citations: 0; citations by others: 2)
This investigation suggests new techniques to calibrate estimators of variance. Estimators of the variance of simple mean, ratio and regression estimators under different sampling schemes are shown to be special cases of the proposed calibration techniques. The approach has more practical use due to recent advances in programming techniques and computational speed. An empirical study has been carried out to address the properties of these proposed strategies.

In the present article, we propose generalized ratio-type and generalized ratio-exponential-type estimators for the population mean in adaptive cluster sampling (ACS) under the modified Horvitz-Thompson estimator. The proposed estimators use auxiliary information by combining conventional measures (coefficient of skewness, coefficient of variation, correlation coefficient, covariance, coefficient of kurtosis) with robust measures (tri-mean, Hodges-Lehmann, mid-range) to increase the efficiency of the estimators. Properties of the proposed estimators are discussed to the first order of approximation. A simulation study is conducted to evaluate the performance of the estimators. The results reveal that the proposed estimators are more efficient than competing estimators for the population mean in ACS under both the modified Hansen-Hurwitz and Horvitz-Thompson estimators.
