Similar articles
1.
We derive the influence function of the likelihood ratio test statistic for a multivariate normal sample. The derived influence function does not depend on the influence functions of the parameters under the null hypothesis, so the empirical influence function can be obtained directly from the maximum likelihood estimators under the null hypothesis alone. Because the derived formula is general, it can be applied to influence analysis in many statistical testing problems.

2.
The influence function introduced by Hampel (1968, 1973, 1974) is a tool that can be used for outlier detection. Campbell (1978) derived the influence function for Mahalanobis's distance between two populations, which can be used for detecting outliers in discriminant analysis. Radhakrishnan and Kshirsagar (1981) obtained influence functions for a variety of parametric functions in multivariate analysis. Radhakrishnan (1983) obtained influence functions for parameters corresponding to the "residual" Wilks's Λ and its "direction" and "collinearity" factors in discriminant analysis when a single discriminant function is adequate for discriminating among several groups. In this paper, influence functions are obtained for parameters that correspond to the "residual" Wilks's Λ and its "direction" and "coplanarity" factors, which are used to test the goodness of fit of s (s > 1) assigned discriminant functions for discriminating among several groups. These influence functions can be used for outlier detection in multivariate data when a single discriminant function is not adequate.

3.
In robust statistics, the influence function was developed as an important measure of the sensitivity of estimators to large values. The quintile share ratio was introduced as a measure of income inequality, but not much is known about the theoretical properties of its nonparametric estimator. One such property is its sensitivity to outliers. In this article, we derive the influence function of the quintile share ratio. As is to be expected from its definition, the influence function is unbounded. A nonparametric estimator for the quintile share ratio is defined and its sensitivity to outliers is investigated in a small simulation study.
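
The sensitivity described in this abstract can be illustrated with a minimal sketch. It assumes one common plug-in definition of the quintile share ratio (total income above the 80% quantile divided by total income at or below the 20% quantile); the paper's estimator may differ in detail. The function name `quintile_share_ratio` and the simulated lognormal incomes are illustrative assumptions, not taken from the article.

```python
import numpy as np

def quintile_share_ratio(income):
    """Plug-in estimate: income total of the top 20% over that of the bottom 20%.

    Assumed definition for illustration; the article's estimator may differ.
    """
    x = np.sort(np.asarray(income, dtype=float))
    q20, q80 = np.quantile(x, [0.2, 0.8])
    bottom = x[x <= q20].sum()   # total income at or below the 20% quantile
    top = x[x > q80].sum()       # total income above the 80% quantile
    return top / bottom

# Simulated incomes (hypothetical data, lognormal chosen only for illustration).
rng = np.random.default_rng(0)
incomes = rng.lognormal(mean=10.0, sigma=0.5, size=1000)
qsr_clean = quintile_share_ratio(incomes)

# Replace a single observation by an extreme value and recompute.
contaminated = incomes.copy()
contaminated[0] = 1e9
qsr_contaminated = quintile_share_ratio(contaminated)

print(qsr_clean, qsr_contaminated)
```

A single extreme income enters the numerator directly, so the estimate can be pushed arbitrarily high, which is what an unbounded influence function predicts.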

4.
Expressions are found for the influence function of the coefficient of variation, CV, and its reciprocal, the signal-to-noise ratio. These functions are unit-free, which permits comparing how the CVs of continuous positive distributions respond to a perturbation by a small amount of probability at x. For a CV ≤ 0.5, the influence function will be negative and of modest size for values of x near E(X). For such values of the CV and of x, the influence function for 1/CV will be positive and its values will be substantial. These results imply similar behavior by the sample coefficient of variation or its reciprocal, which is supported by simulation studies in the literature. Values of the CV ≥ 1 are associated with large negative responses of their influence functions. The distributions producing such responses often have densities that decrease from positive infinity to zero on the positive axis with a long tail to the right. An influence function for the difference of two coefficients of variation is also obtained.
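
A minimal sketch of the behavior near E(X) described above. It assumes the standard delta-method form obtained from the influence functions of the mean and the variance, IF(x; CV) = CV · [((x − μ)² − σ²) / (2σ²) − (x − μ) / μ]; this is not necessarily the expression given in the article. The function names and the values μ = 10, σ = 3 are illustrative assumptions.

```python
import numpy as np

def cv_influence(x, mu, sigma):
    """Influence function of CV = sigma / mu at x (delta-method form, assumed)."""
    cv = sigma / mu
    z2 = ((x - mu) / sigma) ** 2
    return cv * (0.5 * (z2 - 1.0) - (x - mu) / mu)

def inv_cv_influence(x, mu, sigma):
    """Influence function of 1 / CV via the chain rule: IF(1/T) = -IF(T) / T**2."""
    cv = sigma / mu
    return -cv_influence(x, mu, sigma) / cv**2

def contaminated_cv(x, mu, sigma, eps):
    """CV of the mixture (1 - eps) * F + eps * (point mass at x), from its moments."""
    mean = (1.0 - eps) * mu + eps * x
    second = (1.0 - eps) * (sigma**2 + mu**2) + eps * x**2
    return np.sqrt(second - mean**2) / mean

mu, sigma = 10.0, 3.0                      # hypothetical values, CV = 0.3
at_mean = cv_influence(mu, mu, sigma)      # equals -CV / 2
inv_at_mean = inv_cv_influence(mu, mu, sigma)

# Finite-difference check of the formula at an arbitrary point x = 14.
eps = 1e-6
fd = (contaminated_cv(14.0, mu, sigma, eps) - sigma / mu) / eps
print(at_mean, inv_at_mean, fd, cv_influence(14.0, mu, sigma))
```

At x = E(X) the assumed formula reduces to −CV/2 for the CV and 1/(2·CV) for its reciprocal, which matches the signs and relative sizes reported in the abstract for small CV.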

5.
This paper presents a unified method for influence analysis that deals with random effects appearing in additive nonlinear regression models for repeated measurement data. The basic idea is to apply the Q-function, the conditional expectation of the complete-data log-likelihood function obtained from the EM algorithm, instead of the observed-data log-likelihood function used in standard influence analysis. Diagnostic measures are derived based on the case-deletion approach and the local influence approach. Two real examples and a simulation study are examined to illustrate our methodology.

6.
Empirical influence functions for the Mahalanobis distance and for misclassification rates are presented for discriminant analysis with two multivariate normal populations, following Campbell (1978). Conclusions about the effects of outliers drawn from the empirical influence function are contrasted with exact calculations for four simple cases. These cases demonstrate that the higher-order terms discarded in deriving the empirical influence function can be important in practical problems.

7.
The local influence method is adapted to canonical correlation analysis for the purpose of investigating the influence of observations. We consider a perturbation based on the empirical distribution function. An illustrative example is given to show the effectiveness of the local influence method for the identification of influential observations.

8.
Logistic regression is frequently used for classifying observations into two groups. Unfortunately, there are often outlying observations in a data set, and these might affect the estimated model and the associated classification error rate. In this paper, the authors study the effect of observations in the training sample on the error rate by deriving influence functions. They obtain a general expression for the influence function of the error rate, and they compute it for the maximum likelihood estimator as well as for several robust logistic discrimination procedures. Besides being of interest in their own right, the influence functions are also used to derive asymptotic classification efficiencies of different logistic discrimination rules. The authors also show how influential points can be detected by means of a diagnostic plot based on the values of the influence function.

9.
Ordinal regression is used for modelling an ordinal response variable as a function of some explanatory variables. The classical technique for estimating the unknown parameters of this model is Maximum Likelihood (ML). The lack of robustness of this estimator is formally shown by deriving its breakdown point and its influence function. To robustify the procedure, a weighting step is added to the Maximum Likelihood estimator, yielding an estimator with a bounded influence function. We also show that the loss in efficiency due to the weighting step remains limited. A diagnostic plot based on the Weighted Maximum Likelihood estimator allows outliers of different types to be detected in a single plot.

10.
Hotelling's T² statistic has many applications in multivariate analysis. In particular, it can be used to measure the influence that a particular observation vector has on parameter estimation. For example, in the bivariate case, there exists a direct relationship between the ellipse generated using a T² statistic for individual observations and the hyperbolae generated using Hampel's influence function for the corresponding correlation coefficient. In this paper, we jointly use the components of an orthogonal decomposition of the T² statistic and some influence functions to identify outliers or influential observations. Since the conditional components in the T² statistic are related to the possible changes in the correlation between a variable and a group of other variables, we consider the theoretical influence functions of the correlations and multiple correlation coefficients. Finite-sample versions of these influence functions are used to find the estimated influence function values.

11.
To perform regression analysis in high dimensions, lasso and ridge estimation are common choices. However, it has been shown that these methods are not robust to outliers. Therefore, alternatives such as penalized M-estimation or the sparse least trimmed squares (LTS) estimator have been proposed. The robustness of these regression methods can be measured with the influence function, which quantifies the effect of infinitesimal perturbations in the data. Furthermore, it can be used to compute the asymptotic variance and the mean-squared error (MSE). In this paper we compute the influence function, the asymptotic variance and the MSE for penalized M-estimators and the sparse LTS estimator. The asymptotic bias of the estimators makes the calculations non-standard. We show that only M-estimators with a loss function with a bounded derivative are robust against regression outliers. In particular, the lasso has an unbounded influence function.
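
The bounded-derivative condition in this abstract can be illustrated by comparing two loss derivatives. This is only a sketch of the condition itself, not of the penalized estimators or the influence-function calculations in the article. The Huber loss and its conventional tuning constant 1.345 are swapped in here as an example of a loss with bounded derivative; the function names are illustrative assumptions.

```python
import numpy as np

def psi_squared(r):
    """Derivative of the squared-error loss r**2 / 2: grows without bound in r."""
    return np.asarray(r, dtype=float)

def psi_huber(r, c=1.345):
    """Derivative of the Huber loss: linear near zero, clipped at +/- c."""
    return np.clip(np.asarray(r, dtype=float), -c, c)

# Residuals of increasing size, the last ones playing the role of regression outliers.
residuals = np.array([0.5, 5.0, 50.0, 500.0])
print(psi_squared(residuals))  # keeps growing with the residual
print(psi_huber(residuals))    # saturates at c for large residuals
```

Under the squared-error loss used by the lasso, the contribution of an observation to the estimating equations grows with its residual, whereas a clipped derivative caps that contribution.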

12.
The influence function introduced by Hampel (1968, 1973, 1974) is a tool that can be used for outlier detection. Campbell (1978) obtained the influence function for Mahalanobis's distance between two populations, which can be used for detecting outliers in discriminant analysis. In this paper, influence functions for a variety of parametric functions in multivariate analysis are obtained. Influence functions are obtained for the generalized variance, the matrix of regression coefficients, the noncentrality matrix Σ⁻¹δ in multivariate analysis of variance and its eigenvalues, the matrix L, which is a generalization of 1 − R², canonical correlations, principal components, and parameters that correspond to Pillai's statistic (1955), Hotelling's (1951) generalized T₀² and Wilks's Λ (1932), all of which can be used for outlier detection in multivariate analysis. Devlin, Gnanadesikan and Kettenring (1975) obtained the influence function for the population correlation coefficient in the bivariate case. It is shown in this paper that influence functions for parameters corresponding to r², R², and Mahalanobis D² can be obtained as particular cases.

13.
In classical principal component analysis (PCA), the empirical influence function for the sensitivity coefficient ρ is used to detect observations that are influential on the subspace spanned by the dominant principal components. In this article, we derive the influence function of ρ in the case where the reweighted minimum covariance determinant (MCD1) is used as the estimator of multivariate location and scatter. Our aim is to confirm the reliability, in terms of robustness, of the MCD1 via the approach based on the influence function of the sensitivity coefficient.

14.
For data from multivariate t distributions, it is very hard to carry out an influence analysis based on the probability density function, since its expression is intractable. In this paper, we present a technique for influence analysis based on the mixture distribution and the EM algorithm. The multivariate t distribution can be considered as a particular Gaussian mixture by introducing weights from the Gamma distribution. We treat the weights as missing data and develop the influence analysis for data from multivariate t distributions based on the conditional expectation of the complete-data log-likelihood function in the EM algorithm. Several case-deletion measures are proposed for detecting influential observations from multivariate t distributions. Two numerical examples are given to illustrate our methodology.

15.
This paper considers the maximum likelihood type (M) estimator based on Student's t distribution for the location/scale model. The Student t M-estimator is generally thought to be robust to outliers. This paper shows that this is only true if the degrees of freedom parameter is kept fixed. By contrast, if the degrees of freedom parameter is also estimated from the data, the influence functions for the scale and degrees of freedom parameters become unbounded. Moreover, the influence function of the location parameter remains bounded, but its change-of-variance function is unbounded. The intuition behind these results is explained in the paper. The rates at which both the influence functions and the change-of-variance function diverge to infinity are very slow. This implies that outliers have to be extremely large in order to become detrimental to the performance of the Student t based M-estimator with estimated degrees of freedom. The theoretical results are illustrated in a simulation experiment using several related competing estimators and several distributions for the error process.
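
A minimal sketch of the fixed-degrees-of-freedom case mentioned in this abstract. It assumes the scale is also held fixed (the article treats the full location/scale model) and uses the usual iteratively reweighted mean with weights (ν + 1) / (ν + r²). The function name `t_location`, the choice ν = 3 and the simulated data are illustrative assumptions.

```python
import numpy as np

def t_location(x, nu=3.0, scale=1.0, n_iter=100, tol=1e-10):
    """Student t location M-estimate with fixed degrees of freedom and fixed scale.

    Simplified illustration: iteratively reweighted mean, started at the median.
    """
    x = np.asarray(x, dtype=float)
    mu = np.median(x)
    for _ in range(n_iter):
        r = (x - mu) / scale
        w = (nu + 1.0) / (nu + r**2)       # weights shrink toward 0 for large residuals
        mu_new = np.sum(w * x) / np.sum(w)
        converged = abs(mu_new - mu) < tol
        mu = mu_new
        if converged:
            break
    return mu

# Hypothetical data: standard normal sample plus one gross outlier.
rng = np.random.default_rng(1)
sample = np.append(rng.normal(0.0, 1.0, size=200), 1e6)

mean_estimate = sample.mean()
t_estimate = t_location(sample, nu=3.0, scale=1.0)
print(mean_estimate, t_estimate)
```

With ν fixed, the weight of an observation decays like 1/r², so the outlier has almost no effect on the location estimate while the sample mean is dragged far from zero. The sketch says nothing about the case where ν is estimated, which is the article's main point.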

16.
Robust estimates for the parameters in the general linear model are proposed which are based on weighted rank statistics. The method is based on the minimization of a dispersion function defined by a weighted Gini's mean difference. The asymptotic distribution of the estimate is derived with an asymptotic linearity result. An influence function is determined to measure how the weights can reduce the influence of high-leverage points. The weights can also be used to base the ranking on a restricted set of comparisons. This is illustrated in several examples with stratified samples, treatment vs control groups and ordered alternatives.

17.
The finite sample performance of the rank estimator of regression coefficients obtained using the iteratively reweighted least squares (IRLS) of Sievers and Abebe (2004) is evaluated. Efficiency comparisons show that the IRLS method does quite well in comparison to least squares or the traditional rank estimates in cases of moderate-tailed error distributions; however, the IRLS method does not appear to be suitable for heavy-tailed data. Moreover, our results show that the IRLS estimator will have an unbounded influence function even if we use an initial estimator with a bounded influence function.

18.
The influence function of the covariance matrix is decomposed into a finite number of components. This decomposition provides a useful tool to develop efficient methods for computing empirical influence curves related to various multivariate methods. It can also be used to characterize multivariate methods from the sensitivity perspective. A numerical example is given to demonstrate efficient computing and to characterize some procedures of exploratory factor analysis.
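
As background for this abstract, the sketch below computes the empirical version of the well-known influence function of the covariance matrix, (x − μ)(x − μ)ᵀ − Σ, with the sample mean and the divisor-n sample covariance plugged in. It does not implement the article's decomposition into components. The function name `covariance_influence` and the simulated data are illustrative assumptions.

```python
import numpy as np

def covariance_influence(X):
    """Empirical influence values (x_i - mean)(x_i - mean)^T - S for each row of X.

    S is the sample covariance with divisor n. Returns an array of shape (n, p, p).
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    cov = centered.T @ centered / X.shape[0]
    outer = np.einsum("ij,ik->ijk", centered, centered)  # one outer product per row
    return outer - cov

# Hypothetical data: bivariate normal sample with one planted outlier in row 0.
rng = np.random.default_rng(2)
data = rng.normal(size=(100, 2))
data[0] = [15.0, -15.0]

influence = covariance_influence(data)
sizes = np.linalg.norm(influence, axis=(1, 2))  # Frobenius norm per observation
most_influential = int(np.argmax(sizes))
print(most_influential, sizes[most_influential])
```

Ranking observations by the norm of their influence matrix is one simple way to use these values; influence curves for other multivariate methods are typically built from them by the chain rule.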

19.
Theories about the bandwidth of kernel density estimation have been well established by many statisticians. However, the influence function of the bandwidth has not been well investigated. The influence function of the optimal bandwidth that minimizes the mean integrated squared error is derived, and the asymptotic property of the bandwidth selectors based on the influence function is provided.

20.
Cook (1986) presented the idea of local influence to study the sensitivity of inferences to model assumptions: introduce a vector δ of perturbations to the model; choose a discrepancy function D to measure differences between the original inference and the inference under the perturbed model; study the behavior of D near δ = 0, the original model, usually by taking derivatives. Johnson and Geisser (1983) measure influence in Bayesian inference by the Kullback-Leibler divergence between predictive distributions. McCulloch (1989) is a synthesis of Cook and of Johnson and Geisser, using the Kullback-Leibler divergence between posterior or predictive distributions as the discrepancy function in Bayesian local influence analyses. We analyze a special case for which McCulloch gives the general theory, namely the linear model with conjugate prior. We present specific formulae for local influence measures for 1) changes in the parameters of the gamma prior for the precision, 2) changes in the mean of the normal prior for the regression coefficients, 3) changes in the covariance matrix of the normal prior for the regression coefficients and 4) changes in the case weights. Our method is an easy way to find locally influential subsets of points without knowing in advance the sizes of the subsets. The techniques are illustrated with a regression example.
