Similar articles
20 similar articles found (search time: 31 ms)
1.
The influence function introduced by Hampel (1968, 1973, 1974) is a tool that can be used for outlier detection. Campbell (1978) obtained the influence function for the Mahalanobis distance between two populations, which can be used for detecting outliers in discriminant analysis. In this paper, influence functions for a variety of parametric functions in multivariate analysis are obtained. These include influence functions for the generalized variance; the matrix of regression coefficients; the noncentrality matrix Σ⁻¹δ in multivariate analysis of variance and its eigenvalues; the matrix L, which is a generalization of 1 − R²; canonical correlations; principal components; and the parameters that correspond to Pillai’s statistic (1955), Hotelling’s (1951) generalized T₀² and Wilks’ Λ (1932), all of which can be used for outlier detection in multivariate analysis. Devlin, Gnanadesikan and Kettenring (1975) obtained the influence function for the population correlation coefficient in the bivariate case. It is shown in this paper that influence functions for parameters corresponding to r², R² and the Mahalanobis D² can be obtained as particular cases.

2.
When a process is monitored with a T² control chart in a Phase II setting, the MYT decomposition is a valuable diagnostic tool for interpreting signals in terms of the process variables. The decomposition splits a signaling T² statistic into independent components that can be associated with either individual variables or groups of variables. Since these components are T² statistics with known distributions, they can be used to determine which of the process variables contribute to the signal. However, this procedure cannot be applied directly to Phase I, since the distributions of the individual components are unknown. In this article, we develop the MYT decomposition procedure for a Phase I operation, when monitoring a random sample of individual observations and identifying outliers. We use a relationship between the T² statistic in Phase I and the corresponding T² statistic that results when an observation is omitted from the sample to derive the distributions of these components, and we demonstrate the Phase I application of the MYT decomposition.
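
As an illustration of the decomposition described in this abstract, the following is a minimal sketch for the bivariate case only (p = 2), assuming the usual split of T² into an unconditional term for the first variable and a conditional term for the second variable given the first. The function names and the data are hypothetical, and the sketch shows only the algebraic identity, not the Phase I distributional results derived in the article.

```python
from statistics import mean


def sample_cov(xs, ys):
    """Unbiased sample covariance of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (len(xs) - 1)


def myt_bivariate(x1, x2, obs):
    """Return (T2, T2_1, T2_2given1) for one observation obs = (a, b).

    x1 and x2 hold the reference sample for the two process variables.
    """
    m1, m2 = mean(x1), mean(x2)
    s11, s22, s12 = sample_cov(x1, x1), sample_cov(x2, x2), sample_cov(x1, x2)
    d1, d2 = obs[0] - m1, obs[1] - m2

    # Full T2 = d' S^{-1} d, written out for a 2 x 2 covariance matrix.
    det = s11 * s22 - s12 ** 2
    t2_full = (s22 * d1 ** 2 - 2 * s12 * d1 * d2 + s11 * d2 ** 2) / det

    # Unconditional term for variable 1.
    t2_1 = d1 ** 2 / s11

    # Conditional term for variable 2 given variable 1:
    # regress x2 on x1 and standardise the residual.
    slope = s12 / s11
    resid_var = s22 - s12 ** 2 / s11
    t2_2g1 = (d2 - slope * d1) ** 2 / resid_var

    return t2_full, t2_1, t2_2g1


# Hypothetical reference sample and one observation to diagnose.
x1 = [9.8, 10.1, 10.0, 10.3, 9.9, 10.2, 10.0, 9.7]
x2 = [20.1, 20.4, 19.9, 20.6, 20.0, 20.3, 20.2, 19.8]
t2_full, t2_1, t2_2g1 = myt_bivariate(x1, x2, (10.4, 19.7))
print(t2_full, t2_1, t2_2g1)
```

The two components sum to the full T², so a large conditional term with a small unconditional term points to a broken relationship between the variables rather than a shift in the first variable alone.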

3.
Hotelling's T² statistic has been used in constructing a multivariate control chart for individual observations. In Phase II operations, the distribution of the T² statistic is related to the F distribution, provided the underlying population is multivariate normal. Thus, the upper control limit (UCL) is proportional to a percentile of the F distribution. However, if the process data show sufficient evidence of a marked departure from multivariate normality, the UCL based on the F distribution may be very inaccurate. In such situations, it will usually be helpful to determine the UCL based on a percentile of the estimated distribution of T². In this paper, we use a kernel smoothing technique to estimate the distribution of the T² statistic, as well as the UCL of the T² chart, when the process data are taken from a multivariate non-normal distribution. Through simulations, we examine the sample size requirement and the in-control average run length of the T² control chart for sample observations taken from a multivariate exponential distribution. The paper focuses on the Phase II situation with individual observations.

4.
For a 2-step monotone missing dataset drawn from a multivariate normal population, a T²-type test statistic (similar to Hotelling’s T² test statistic) and the likelihood ratio (LR) statistic are often used to test a mean vector. With complete data, Hotelling’s T² test and the LR test are equivalent; however, the T²-type test and the LR test are not equivalent for a 2-step monotone missing dataset. We are therefore interested in which statistic is preferable in terms of power. In this paper, we derive the asymptotic power functions of both statistics under a local alternative and obtain an explicit form for the difference between them. Furthermore, under several parameter settings, we compare the LR and T²-type tests numerically, using the differences in empirical power and in the asymptotic power functions. Summarizing the results, we recommend the LR test for testing a mean vector.

5.
A series expansion is obtained for the confluent hypergeometric function of the second kind when the argument is a 2 × 2 positive definite matrix. Applications are made to the distributions of Hotelling's generalized T₀² statistic and of the smallest latent root of the covariance matrix.

6.
In this paper, we apply five types of copulas to Hotelling's T² control chart when observations come from an exponential distribution, and we use Monte Carlo simulation to compare the performance of the control chart, measured by the average run length (ARL), for each copula. The five copula functions specify the dependence between the random variables, and the strength of that dependence is measured by Kendall's tau. The results show that the copula approach can be fitted to the observations and that copulas are a viable option for use with Hotelling's T² control chart.

7.
Multivariate control charts are used to monitor stochastic processes for changes and unusual observations. Hotelling's T² statistic is calculated for each new observation and an out‐of‐control signal is issued if it goes beyond the control limits. However, this classical approach becomes unreliable as the number of variables p approaches the number of observations n, and impossible when p exceeds n. In this paper, we devise an improvement to the monitoring procedure in high‐dimensional settings. We regularise the covariance matrix to estimate the baseline parameter and incorporate a leave‐one‐out re‐sampling approach to estimate the empirical distribution of future observations. An extensive simulation study demonstrates that the new method outperforms the classical Hotelling T² approach in power, and maintains appropriate false positive rates. We demonstrate the utility of the method using a set of quality control samples collected to monitor a gas chromatography–mass spectrometry apparatus over a period of 67 days.

8.
In this paper we consider the problem of testing the means of k multivariate normal populations with additional data from an unknown subset of the k populations. The purpose of this research is to offer test procedures that use all the available data for the multivariate analysis of variance problem, because the additional data may contain valuable information about the parameters of the k populations. The standard procedure uses only the data from identified populations. We provide a test using all available data, based on Hotelling's generalized T² statistic. The power of this test is computed using Betz's approximation of Hotelling's generalized T² statistic by an F distribution. A comparison of the power of this test with that of the standard test procedure is also given.

9.
This paper examines the high dimensional asymptotics of the naive Hotelling T² statistic. Naive Bayes has been utilized in high dimensional pattern recognition as a method to avoid singularities in the estimated covariance matrix. The naive Hotelling T² statistic, which is equivalent to the estimator of the naive canonical correlation, is a statistically important quantity in naive Bayes, and its high dimensional behavior has been studied under several conditions. In this paper, asymptotic normality of the naive Hotelling T² statistic under a high dimension, low sample size setting is developed using the central limit theorem for a martingale difference sequence.

10.
When performing the Wald-Wolfowitz runs test, observations from two samples are combined and ordered, and the test statistic is the number of sequences of observations from the same sample. This test statistic is equivalent to the number of links between observations from different samples, if we consider each observation to be linked to the next higher and next lower observations. While it is known that the Wald-Wolfowitz runs test is not very powerful, what would be the effect on the power of the Wald-Wolfowitz runs test if all observations within a specified Euclidean distance or “tolerance” were linked instead? This question is motivated by the simulation results of Whaley and Quade (1985), who found that for normal data, the power of the multi-dimensional runs test using a linkage tolerance compared favorably to Hotelling's T² in some instances. The results of a similar simulation procedure show that the power of the Wald-Wolfowitz runs test does indeed improve when observations are linked using a tolerance. The results also suggest that a better large sample approximation to the distribution of the test statistic needs to be found.
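
The link between runs and cross-sample links can be sketched as follows. This is an illustrative sketch, assuming no tied values across the two samples: the number of runs equals the number of adjacent cross-sample links plus one. The second function only counts cross-sample pairs within a Euclidean tolerance; how such a count is turned into a test statistic is not given in the abstract, so that part is an assumption, and all names and data are hypothetical.

```python
import math


def runs_and_links(sample_a, sample_b):
    """Return (number of runs, number of adjacent cross-sample links).

    Observations are pooled and sorted; each one is linked to its
    neighbours in the ordering. Assumes no ties between the samples.
    """
    pooled = sorted([(v, "A") for v in sample_a] + [(v, "B") for v in sample_b])
    labels = [label for _, label in pooled]
    cross_links = sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)
    return cross_links + 1, cross_links


def cross_links_within_tolerance(points_a, points_b, tol):
    """Count cross-sample pairs whose Euclidean distance is at most tol.

    Points are tuples of coordinates, so this also works in several dimensions.
    """
    return sum(
        1
        for p in points_a
        for q in points_b
        if math.dist(p, q) <= tol
    )


# Hypothetical univariate samples.
a = [1.2, 3.4, 2.2, 5.9]
b = [2.8, 4.1, 6.3, 0.5]
print(runs_and_links(a, b))  # (7, 6)

# Hypothetical bivariate samples with a linkage tolerance of 1.0.
pts_a = [(0.0, 0.0), (1.0, 1.0)]
pts_b = [(0.5, 0.0), (3.0, 3.0)]
print(cross_links_within_tolerance(pts_a, pts_b, 1.0))  # 1
```

Many runs, or equivalently many cross-sample links, indicate well-mixed samples and so are consistent with the two samples coming from the same distribution.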

11.
A general rank test procedure based on an underlying multinomial distribution is suggested for randomized block experiments with multifactor treatment combinations within each block. The Wald statistic for the multinomial is used to test hypotheses about the within-block rankings. This statistic is shown to be related to the one-sample Hotelling's T² statistic, suggesting a method for computing the test statistic using standard statistical computer packages.

12.
We consider the problem of testing the equality of two population means when the population variances are not necessarily equal. We propose a Welch-type statistic, say T*_c, based on Tiku's (1967, 1980) modified maximum likelihood estimators, and show that this statistic is robust for symmetric and moderately skew distributions. We investigate the power properties of the statistic T*_c; T*_c clearly seems to be more powerful than Yuen's (1974) Welch-type robust statistic based on the trimmed sample means and the matching sample variances. We show that the analogous statistics based on the ‘adaptive’ robust estimators give misleading Type I errors. We generalize the results to testing linear contrasts among k population means.

13.
We develop a ‘robust’ statistic T²_R, based on Tiku's (1967, 1980) MML (modified maximum likelihood) estimators of location and scale parameters, for testing an assumed mean vector of a symmetric multivariate distribution. We show that T²_R is on the whole considerably more powerful than the prominent Hotelling T² statistic. We also develop a robust statistic T²_D for testing that two multivariate distributions (skew or symmetric) are identical; T²_D seems to be usually more powerful than nonparametric statistics. The only assumption we make is that the marginal distributions are of the type (1/σ_k) f((x − μ_k)/σ_k) and that the means and variances of these marginal distributions exist.

14.
In this paper, we extend the univariate control median test to the multivariate case. We apply the permutation principle to obtain the null distribution function of the test statistic and thus a conditionally nonparametric test procedure. Because of the amount of computational work involved in implementing the test, we consider the normal approximation. We prove the consistency and derive the asymptotic efficiency of our control median test relative to Puri and Sen's median test. Finally, we compare the power of our control median test with that of Hotelling's T² test and Puri and Sen's median test through simulations.

15.
The small-sample accuracy of seven members of the family of power-divergence statistics for testing independence or homogeneity in contingency tables was studied via simulation. The likelihood ratio statistic G² and Pearson's X² statistic are among these seven members, whose behavior was studied at nominal test sizes of .01 and .05, with marginal distributions that could be uniform or skewed, and with a set of sample sizes that included sparseness conditions as measured through table density (i.e., the ratio of sample size to number of cells). The likelihood ratio statistic G² rejected the null hypothesis too often even with large table density, whereas Pearson's X² was sufficiently accurate and only presented a minor misbehavior when table density was less than two observations/cell. None of the other five statistics outperformed Pearson's X². A nonasymptotic variant of X² solved the minor inaccuracies of Pearson's X² and turned out to be the most accurate statistic for testing independence or homogeneity, even with table densities of one observation/cell. These results clearly advise against the use of the likelihood ratio statistic G².
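
To make the family concrete, here is a minimal sketch of the power-divergence statistic for a two-way table, assuming the standard Cressie-Read form 2/(λ(λ+1)) · Σ O[(O/E)^λ − 1] with expected counts taken under independence. Setting λ = 1 gives Pearson's X² and the limit λ → 0 gives G². The function names and the table are hypothetical, and the nonasymptotic variant of X² mentioned in the abstract is not reproduced here.

```python
import math


def expected_counts(table):
    """Expected cell counts under independence of rows and columns."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    return [[r * c / n for c in col_tot] for r in row_tot]


def power_divergence(table, lam):
    """Power-divergence statistic with index lam (lam = -1 is not handled)."""
    exp = expected_counts(table)
    pairs = [(o, e) for obs_row, exp_row in zip(table, exp)
             for o, e in zip(obs_row, exp_row)]
    if lam == 0:
        # Limiting case: the likelihood ratio statistic G2.
        # Cells with zero observed count contribute nothing.
        return 2 * sum(o * math.log(o / e) for o, e in pairs if o > 0)
    return 2 / (lam * (lam + 1)) * sum(o * ((o / e) ** lam - 1) for o, e in pairs)


# Hypothetical 2 x 2 table of counts.
table = [[10, 20], [30, 40]]
x2 = power_divergence(table, 1)   # Pearson's X2
g2 = power_divergence(table, 0)   # likelihood ratio G2
print(x2, g2)
```

Both statistics are referred to the same chi-square reference distribution, which is why their small-sample accuracy can be compared at a common nominal test size.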

16.
Goodness of fit testing for the binomial distribution can be carried out using Pearson's X²_p statistic and its components. Applications of this technique are considered and compared with recently suggested empirical distribution function tests. Diagnostic use of components is discussed.

17.
The problem of detecting influential observations in principal component analysis has been discussed by several authors. Radhakrishnan and Kshirsagar (1981), Critchley (1985) and Jolliffe (1986), among others, discussed this topic using the influence functions I(X; θ_s) and I(X; V_s) of the eigenvalues and eigenvectors, which were derived under the assumption that the eigenvalues of interest are simple. In this paper we propose the influence functions I(X; Σ_{s=1}^{q} θ_s V_s V_s^T) and I(X; Σ_{s=1}^{q} V_s V_s^T) (q < p, where p is the number of variables) to investigate the influence on the subspace spanned by the principal components. These influence functions are applicable not only when the eigenvalues of interest are all simple but also when some of the eigenvalues of interest are multiple.

18.
Using a result of Imhof, an integral representation of the distribution function of linear combinations of the components of a Dirichlet random vector is obtained. In fact, the distributions of several statistics such as Moran and Geary's indices, the Cliff‐Ord statistic for spatial correlation, the sample coefficient of determination, F‐ratios and the sample autocorrelation coefficient can be similarly determined. Linear combinations of the components of Dirichlet random vectors also turn out to be a key component in a decomposition of quadratic forms in spherically symmetric random vectors. An application involving the sample spectrum associated with series generated by ARMA processes is discussed.

19.
Moran's I statistic [Moran (1950), ‘Notes on Continuous Stochastic Phenomena’, Biometrika, 37, 17–23] has been widely used to evaluate spatial autocorrelation. This paper is concerned with the Moran's I-induced testing procedure in residual analysis. We begin by exploring Moran's I statistic in both its original and extended forms, analytically and numerically. We demonstrate that the magnitude of the statistic in general depends not only on the underlying correlation but also on certain heterogeneity in the individual observations. One should therefore exercise caution when interpreting the outcome of the Moran's I-induced procedure in terms of correlation. On the other hand, the effect of heterogeneity in the observations on Moran's I enables a regression model checking procedure based on the residuals. This novel application of Moran's I is justified by simulation and illustrated by an analysis of wildfire records from Alberta, Canada.
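
For reference, the original form of the statistic can be sketched as below, assuming the usual definition I = (n / S₀) · Σᵢⱼ wᵢⱼ zᵢ zⱼ / Σᵢ zᵢ², where zᵢ are deviations from the mean and S₀ is the sum of all spatial weights. The function name, the weight matrix and the data are hypothetical, and the extended forms and the residual-based checking procedure studied in the paper are not reproduced.

```python
def morans_i(values, weights):
    """Moran's I for values at n sites with an n x n spatial weight matrix."""
    n = len(values)
    xbar = sum(values) / n
    z = [v - xbar for v in values]                # deviations from the mean
    s0 = sum(sum(row) for row in weights)         # total weight
    num = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in z)
    return (n / s0) * (num / den)


# Four sites on a line; each site is a neighbour of the adjacent sites.
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i([1, 2, 3, 4], w))   # smooth trend: positive autocorrelation
print(morans_i([1, 4, 1, 4], w))   # alternating pattern: negative autocorrelation
```

Because the deviations zᵢ enter directly, sites with unusually large deviations can dominate the numerator, which is one way to see why the statistic responds to heterogeneity as well as to correlation.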

20.
This article performs a sensitivity analysis of the synthetic T² chart using a fractional factorial design, which incorporates the interaction effects. We are interested in the effects of the input parameters on the optimal cost, the chart's parameters, and the average run lengths. We also look at the input parameters responsible for the increase in cost and the improvement in statistical performance under statistical constraints, and investigate how the input parameters influence the binding effect of the statistical constraints. The sensitivity analysis of the synthetic T² chart is compared with that of Hotelling's T² chart, and the parameters responsible for the cost advantage of the synthetic T² chart are identified.
