Similar literature
Found 20 similar articles; search time: 595 ms
1.
A note on the Cook's distance   Total citations: 1, self-citations: 0, citations by others: 1
A modification of the classical Cook's distance is proposed, providing us with a generalized Mahalanobis distance in the context of multivariate elliptical linear regression models. We establish the exact distribution of a pivotal-type statistic based on this generalized Mahalanobis distance, which provides critical points for the identification of outlying data points. Based on the equivalence between the modified Cook's distance and what is called the mean-shift multivariate outlier elliptical model, twelve new modifications are proposed for the Cook's distance. We also describe the explicit relationship between the Cook's distance and the likelihood displacement with the modified Cook's distance. We illustrate the procedure with some examples, in the context of multiple and multivariate linear regression.
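
As a point of reference for the abstract above, the following is a minimal sketch of the classical Cook's distance for a simple linear regression, not the modified distance the paper proposes. The function name `cooks_distance_simple` and the data are made up for illustration, and the sketch assumes a single predictor with an intercept and a fit that is not perfect.

```python
def cooks_distance_simple(x, y):
    """Classical Cook's distance for the simple regression y = a + b*x."""
    n = len(x)
    p = 2  # number of estimated coefficients (intercept and slope)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual variance estimate; it is zero for a perfect fit,
    # a case this sketch does not handle.
    s2 = sum(e ** 2 for e in residuals) / (n - p)
    # Leverage of each observation in simple linear regression.
    leverages = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return [
        (e ** 2 / (p * s2)) * (h / (1.0 - h) ** 2)
        for e, h in zip(residuals, leverages)
    ]


# Made-up data: the last point departs from the trend of the others.
x = [1, 2, 3, 4, 5, 6]
y = [1, 2, 3, 4, 5, 12]
distances = cooks_distance_simple(x, y)
most_influential = max(range(len(distances)), key=distances.__getitem__)
```

Here `most_influential` is the index of the observation with the largest distance, which is the usual starting point for inspecting candidate outliers.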

3.
Econometric Reviews, 2013, 32(4): 293-323
Abstract

This paper studies the efficient estimation of seemingly unrelated linear models with integrated regressors and stationary errors. We consider two cases. The first one has no common regressor among the equations. In this case, we show that by adding leads and lags of the first differences of the regressors and estimating this augmented dynamic regression model by generalized least squares using the long-run covariance matrix, we obtain an efficient estimator of the cointegrating vector that has a limiting mixed normal distribution. In the second case we consider, there is a common regressor to all equations, and we discuss efficient minimum distance estimation in this context. Simulation results suggest that our new estimator compares favorably with others already proposed in the literature. We apply these new estimators to the testing of the proportionality and symmetry conditions implied by purchasing power parity (PPP) among the G-7 countries. The tests based on the efficient estimates easily reject the joint hypotheses of proportionality and symmetry for all countries with either the United States or Germany as numeraire. Based on individual tests, our results suggest that Canada and Germany are the most likely countries for which the proportionality condition holds, and Italy and Japan for the symmetry condition relative to the United States.

4.
I consider the problem of estimating the Mahalanobis distance between multivariate normal populations when the population covariance matrix satisfies a graphical model. In addition to providing a clear understanding of the dependencies in a multivariate data set, the use of graphical models can reduce the variability of the estimated distances and improve inferences. I derive the asymptotic distribution of the estimated Mahalanobis distance under a general covariance model, which includes graphical models as a special case. Two examples are discussed.

5.
A minimum distance procedure, analogous to maximum likelihood for multinomial data, is employed to fit mixture models to mass-size relative frequencies recorded for some clay soils of southeastern Australia. Log hyperbolic component distributions are considered initially and it is shown how they can be fitted satisfactorily at least to ungrouped data using a generalized EM algorithm. A computationally more convenient model with log skew Laplace components is subsequently shown to suffice. It is demonstrated how it can be fitted to the data in their original grouped form. Consideration is given also to the provision of standard errors using the idea of a quasi-sample size.

6.
Abstract. Estimators based on data‐driven generalized weighted Cramér‐von Mises distances are defined for data that are subject to possible right censoring. The function used to measure the distance between the data, summarized by the Kaplan–Meier estimator, and the target model is allowed to depend on the sample size and, for example, on the number of censored items. It is shown that the estimators are consistent and asymptotically multivariate normal for every p-dimensional parametric family fulfilling some mild regularity conditions. The results are applied to finite mixtures. Simulation results for finite mixtures indicate that the estimators are useful for moderate sample sizes. Furthermore, the simulation results reveal the usefulness of sample size dependent and censoring sensitive distance functions for moderate sample sizes. Moreover, the estimators for the mixing proportion seem to be fairly robust against a ‘symmetric’ contamination model even when censoring is present.

7.
For the exploratory analysis of three-way data, the Tucker3 is one of the most applied models to study three-way arrays when the data are quadrilinear. When the data consist of vectors of positive values summing to a unit, as in the case of compositional data, this model should consider the specific problems that compositional data analysis brings. The main purpose of this paper is to describe how to do a Tucker3 analysis of compositional data, and to show the relationships between the loading matrices when different preprocessing procedures are used.

8.
The paper describes two regression models, principal components and maximum-likelihood factor analysis, which may be used when the stochastic predictor variables are highly intercorrelated and/or contain measurement error. The two problems can occur jointly, for example in social-survey data where the true (but unobserved) covariance matrix can be singular. Departure from singularity of the sample dispersion matrix is then due to measurement error. We first consider the more elementary principal components regression model, where it is shown that it can be derived as a special case of (i) canonical correlation, and (ii) restricted least squares. The second part consists of the more general maximum-likelihood factor-analysis regression model, which is derived from the generalized inverse of the product of two singular matrices. Also, it is proved that factor-analysis regression can be considered as an instrumental variables estimator and therefore does not depend on whether factors have been “properly” identified in terms of substantive behaviour. Consequently the additional task of rotating factors to “simple structure” does not arise.

9.
Summary. Semiparametric mixed models are useful in biometric and econometric applications, especially for longitudinal data. Maximum penalized likelihood estimators (MPLEs) have been shown to work well by Zhang and co-workers for both linear coefficients and nonparametric functions. This paper considers the role of influence diagnostics in the MPLE by extending the case deletion and subject deletion analysis of linear models to accommodate the inclusion of a nonparametric component. We focus on influence measures for the fixed effects and provide formulae that are analogous to those for simpler models and readily computable with the MPLE algorithm. We also establish an equivalence between the case or subject deletion model and a mean shift outlier model from which we derive tests for outliers. The influence diagnostics proposed are illustrated through a longitudinal hormone study on progesterone and a simulated example.

10.
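A comparative study of multilevel models and static panel data models   Total citations: 1, self-citations: 0, citations by others: 1
This paper compares two-level models with static panel data models. Multilevel models are mainly used to analyze statistical data with a hierarchical structure, while the panel data model is a widely used econometric model developed for panel data. Panel data can be viewed as two-level data with a cross-sectional level and a time level, so two-level models can also be used to analyze panel data, and under certain conditions the two kinds of model are similar. A multilevel static panel data model is therefore proposed, providing a tool for analyzing panel data with multiple hierarchical levels.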

11.
In this paper, two new multiple influential observation detection methods, GCD.GSPR and mCD*, are introduced for logistic regression. The proposed diagnostic measures are compared with the generalized difference in fits (GDFFITS) and the generalized squared difference in beta (GSDFBETA), which are multiple influential diagnostics. The simulation study is conducted with one, two and five independent variable logistic regression models. The performance of the diagnostic measures is examined for a single contaminated independent variable for each model and in the case where all the independent variables are contaminated with certain contamination rates and intensity. In addition, the performance of the diagnostic measures is compared in terms of the correct identification rate and swamping rate via a frequently referred to data set in the literature.

12.
For the first time, we introduce a generalized form of the exponentiated generalized gamma distribution [Cordeiro et al. The exponentiated generalized gamma distribution with application to lifetime data, J. Statist. Comput. Simul. 81 (2011), pp. 827–842.] that is the baseline for the log-exponentiated generalized gamma regression model. The new distribution can accommodate increasing, decreasing, bathtub- and unimodal-shaped hazard functions. A second advantage is that it includes classical distributions reported in the lifetime literature as special cases. We obtain explicit expressions for the moments of the baseline distribution of the new regression model. The proposed model can be applied to censored data since it includes as sub-models several widely known regression models. It therefore can be used more effectively in the analysis of survival data. We obtain maximum likelihood estimates for the model parameters by considering censored data. We show that our extended regression model is very useful by means of two applications to real data.

13.
This paper is motivated by a neurophysiological study of muscle fatigue, in which biomedical researchers are interested in understanding the time-dependent relationships of handgrip force and electromyography measures. A varying coefficient model is appealing here to investigate the dynamic pattern in the longitudinal data. The response variable in the study is continuous but bounded on the standard unit interval (0, 1) over time, while the longitudinal covariates are contaminated with measurement errors. We propose a generalization of varying coefficient models for the longitudinal proportional data with errors-in-covariates. We describe two estimation methods with penalized splines, which are formalized under a Bayesian inferential perspective. The first method is an adaptation of the popular regression calibration approach. The second method is based on a joint likelihood under the hierarchical Bayesian model. A simulation study is conducted to evaluate the efficacy of the proposed methods under different scenarios. The analysis of the neurophysiological data is presented to demonstrate the use of the methods.

14.
Summary. Several techniques for exploring an n×p data set are considered in the light of the statistical framework: data = structure + noise. The first application is to Principal Component Analysis (PCA), in fact generalized PCA with any metric M on the unit space ℝ^p. A natural model for supporting this analysis is the fixed-effect model where the expectation of each unit is assumed to belong to some q-dimensional linear manifold defining the structure, while the variance describes the noise. The best estimation of the structure is obtained for a proper choice of metric M and dimensionality q: guidelines are provided for both choices in section 2. The second application is to Projection Pursuit which aims to reveal structure in the original data by means of suitable low-dimensional projections of them. We suggest the use of generalized PCA with suitable metric M as a Projection Pursuit technique. According to the kind of structure which is looked for, two such metrics are proposed in section 3. Finally, the analysis of n×p contingency tables is considered in section 4. Since the data are frequencies, we assume a multinomial or Poisson model for the noise. Several models may be considered for the structural part; we can say that Correspondence Analysis rests on one of them, spherical factor analysis on another one; Goodman association models also provide an alternative modelling. These different approaches are discussed and compared from several points of view.

15.
Preference decisions will usually depend on the characteristics of both the judges and the objects being judged. In the analysis of paired comparison data concerning European universities and students' characteristics, it is demonstrated how to incorporate subject-specific information into Bradley–Terry-type models. Using this information it is shown that preferences for universities and therefore university rankings are dramatically different for different groups of students. A log-linear representation of a generalized Bradley–Terry model is specified which allows simultaneous modelling of subject- and object-specific covariates and interactions between them. A further advantage of this approach is that standard software for fitting log-linear models, such as GLIM, can be used.

16.
17.
We first compare correspondence analysis, which uses chi-square distance, and an alternative approach using Hellinger distance, for representing categorical data in a contingency table. We propose a coefficient which globally measures the similarity between these two approaches. This coefficient can be decomposed into several components, one component for each principal dimension, indicating the contribution of the dimensions to the difference between the two representations. We also make comparisons with the logratio approach based on compositional data. These three methods of representation can produce quite similar results. Two illustrative examples are given.
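
The two distances named in the abstract above can be sketched for the row profiles of a small contingency table. This is an illustrative sketch only: the helper names and the table are made up, the Hellinger distance is written without the 1/√2 normalisation that some authors include, and the similarity coefficient proposed in the paper is not reproduced.

```python
import math


def row_profiles(table):
    """Divide each row by its total so that every row sums to one."""
    return [[cell / sum(row) for cell in row] for row in table]


def column_masses(table):
    """Column totals divided by the grand total."""
    total = sum(sum(row) for row in table)
    n_cols = len(table[0])
    return [sum(row[j] for row in table) / total for j in range(n_cols)]


def chi_square_distance(profile_a, profile_b, masses):
    """Euclidean distance between profiles, weighted by inverse column mass."""
    return math.sqrt(
        sum((a - b) ** 2 / c for a, b, c in zip(profile_a, profile_b, masses))
    )


def hellinger_distance(profile_a, profile_b):
    """Euclidean distance between square-root-transformed profiles."""
    return math.sqrt(
        sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(profile_a, profile_b))
    )


# Made-up table: rows 0 and 1 are proportional, row 2 has a different profile.
table = [[10, 20, 30], [20, 40, 60], [30, 10, 20]]
profiles = row_profiles(table)
masses = column_masses(table)
```

Both distances act on profiles rather than raw counts, so the proportional rows 0 and 1 are at distance zero from each other under either definition, while row 2 is separated from both.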

18.
Efficiency and robustness are two fundamental concepts in parametric estimation problems. It was long thought that there was an inherent contradiction between the aims of achieving robustness and efficiency; that is, a robust estimator could not be efficient and vice versa. It is now known that the minimum Hellinger distance approach introduced by Beran [R. Beran, Annals of Statistics 1977;5:445–463] is one way of reconciling the conflicting concepts of efficiency and robustness. For parametric models, it has been shown that minimum Hellinger estimators achieve efficiency at the model density and simultaneously have excellent robustness properties. In this article, we examine the application of this approach in two semiparametric models. In particular, we consider a two‐component mixture model and a two‐sample semiparametric model. In each case, we investigate minimum Hellinger distance estimators of finite‐dimensional Euclidean parameters of particular interest and study their basic asymptotic properties. Small sample properties of the proposed estimators are examined using a Monte Carlo study. The results can be extended to semiparametric models of general form as well. The Canadian Journal of Statistics 37: 514–533; 2009 © 2009 Statistical Society of Canada

19.
A new method of discrimination and classification based on a Hausdorff type distance is proposed. In two groups, the Hausdorff distance is defined as the sum of the furthest distance of the nearest elements of one set to another. This distance has some useful properties and is exploited in developing a discriminant criterion between individual objects belonging to two groups based on a finite number of classification variables. The discrimination criterion is generalized to more than two groups in a couple of ways. Several data sets are analysed and their classification accuracy is compared to that obtained from the linear discriminant function and the results are encouraging. The method is simple, lends itself to parallel computation and imposes less stringent conditions on the data.
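
One possible reading of the distance described above is the sum of the two directed Hausdorff distances between the groups. The sketch below follows that reading, which is an assumption, and uses Euclidean distance between points. The function names and data are made up, and the paper's discriminant criterion itself is not reproduced.

```python
import math


def directed_distance(source, target):
    """Largest distance from a point in source to its nearest point in target."""
    return max(min(math.dist(s, t) for t in target) for s in source)


def hausdorff_type_distance(set_a, set_b):
    """Sum of the two directed distances (assumed reading of the abstract)."""
    return directed_distance(set_a, set_b) + directed_distance(set_b, set_a)


# Made-up groups of points in the plane.
group_a = [(0.0, 0.0), (1.0, 0.0)]
group_b = [(0.0, 0.0), (3.0, 0.0)]
distance = hausdorff_type_distance(group_a, group_b)
```

Each directed term only needs pairwise distances, which can be computed independently of one another.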

20.
A common approach to modelling extreme data is to consider the distribution of the exceedance value over a high threshold. This approach is based on the distribution of excess, which follows the generalized Pareto distribution (GPD) and has been shown to be adequate for this type of situation. As with all data involving analysis in time, excesses above a threshold may also vary and suffer from the influence of covariates. Thus, the GPD distribution can be modelled by entering the presence of these factors. This paper presents a new model for extreme values, where GPD parameters are written on the basis of a dynamic regression model. The estimation of the model parameters is made under the Bayesian paradigm, with sampling points via MCMC. As with environmental data, behaviour data are related to other factors such as time and covariates such as latitude and distance from the sea. Simulation studies have shown the efficiency and identifiability of the model, and applying real rain data from the state of Piaui, Brazil, shows the advantage in predicting and interpreting the model against other similar models proposed in the literature.
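
The threshold-excess idea described above can be sketched with the standard GPD distribution function with fixed scale and shape. The function names and parameter values are made up for illustration, and the dynamic regression structure and MCMC estimation of the paper are not reproduced.

```python
import math


def excesses(values, threshold):
    """Amounts by which observations exceed the threshold."""
    return [v - threshold for v in values if v > threshold]


def gpd_cdf(y, sigma, xi):
    """Distribution function of the GPD with scale sigma > 0 and shape xi."""
    if y <= 0:
        return 0.0
    if xi == 0:
        # Limiting case: exponential distribution with mean sigma.
        return 1.0 - math.exp(-y / sigma)
    base = 1.0 + xi * y / sigma
    if base <= 0:
        # Beyond the upper end point, which exists only when xi < 0.
        return 1.0
    return 1.0 - base ** (-1.0 / xi)


# Made-up observations and threshold.
observed = [1, 5, 3, 8]
over_threshold = excesses(observed, threshold=4)
```

A covariate-dependent model would replace the constants `sigma` and `xi` with functions of time or covariates, which is the direction the abstract describes.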
