Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
Based on a chi-square transform of the multivariate normal data set, we propose a test for multinormality whose statistic is the sum of interpoint squared distances between the ordered set of transformed observations and the set of population pth quantiles of the chi-squared distribution. The critical values of the test were evaluated for different sample sizes and random vector dimensions through extensive simulations. The empirical type I error rates and powers of the proposed test were compared with those of some other well-known tests for multivariate normality (MVN); the proposed test showed excellent results at large sample sizes.

2.
A goodness-of-fit test for multivariate normality is proposed which is based on Shapiro–Wilk's statistic for univariate normality and on an empirical standardization of the observations. The critical values can be approximated by using a transformation of the univariate standard normal distribution. A Monte Carlo study reveals that this test has a better power performance than some of the best known tests for multinormality against a wide range of alternatives.

3.
Necessary and sufficient conditions on the observation covariance structure and on the set of linear transformations are given under which the distribution of the multivariate maximum squared-radii statistic for detecting a single multivariate outlier is the same as its distribution under the usual independence covariance structure. Thus, we extend the work of Baksalary and Puntanen (1990), who have given necessary and sufficient conditions for an independence-distribution-preserving covariance structure for Grubbs' statistic for detecting a univariate outlier. We also extend the work of Marco, Young, and Turner (1987) and Pavur and Young (1991), who have given sufficient conditions for an independence-distribution-preserving dependency structure for the multivariate squared-radii statistic.

4.
We propose a multivariate extension of the univariate chi-squared normality test. Using a known result for the distribution of quadratic forms in normal variables, we show that the proposed test statistic has an approximate chi-squared distribution under the null hypothesis of multivariate normality. As in the univariate case, the new test statistic is based on a comparison of observed and expected frequencies for specified events in sample space. In the univariate case, these events are the standard class intervals, but in the multivariate extension we propose, they become hyper-ellipsoidal annuli in multivariate sample space. We assess the performance of the new test using Monte Carlo simulation. Keeping the type I error rate fixed, we show that the new test has power that compares favourably with other standard normality tests, though no uniformly most powerful test has been found. We recommend the new test due to its competitive advantages.

5.
Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components.

6.
Statistics that usually accompany the regression model do not provide insight into the quality of the data or the potential influence of the individual observations on the estimates. In this study, the Q² statistic is used as a criterion for detecting influential observations or outliers. The statistic is derived from the jackknifed residuals, the squared sum of which is generally known as the prediction sum of squares or PRESS. This article compares R² with Q² and suggests that the latter be used as part of the data-quality check. It is shown, for two separate data sets obtained from regional cost of living and U.S. food industry studies, that in the presence of outliers the Q² statistic can be negative, because it is sensitive to the choice of regressors and the inclusion of influential observations. Once the outliers are dropped from the sample, the discrepancy between Q² and R² values is negligible.
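
The relation between the two statistics can be sketched numerically. The abstract does not give formulas, so this minimal example assumes the common definitions R² = 1 - RSS/SST and Q² = 1 - PRESS/SST, with PRESS computed from leave-one-out residuals e_i / (1 - h_ii); the function name `r2_and_q2` is hypothetical.

```python
import numpy as np

def r2_and_q2(X, y):
    """Return (R2, Q2) for an ordinary least squares fit of y on X.

    Assumes X already contains an intercept column and has full column rank.
    Q2 is taken to be 1 - PRESS/SST (an assumed, commonly used definition).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Leverages h_ii are the squared row norms of Q in the reduced QR of X.
    Q, _ = np.linalg.qr(X)
    leverage = np.sum(Q ** 2, axis=1)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Leave-one-out (jackknifed) residuals, obtained without refitting.
    loo_resid = resid / (1.0 - leverage)
    sst = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - np.sum(resid ** 2) / sst
    q2 = 1.0 - np.sum(loo_resid ** 2) / sst  # PRESS replaces RSS
    return r2, q2
```

Because each leave-one-out residual is at least as large in magnitude as the ordinary residual, Q² never exceeds R² under these definitions, and a single high-leverage outlier can push Q² below zero.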

7.
We propose a new method to estimate the cumulative hazard function and the corresponding distribution function of survival times under randomly left-truncated and right-censored observations (LTRC). The new estimators are based on presmoothing, that is, on estimating the conditional expectation m of the censoring indicator. An almost sure representation for both estimators is established, from which a strong consistency rate and asymptotic normality are derived. It is shown that the presmoothed modification leads to a gain in terms of asymptotic mean squared error. This efficiency with respect to the classical estimators is also shown in a simulation study. Finally, an application to a real data set is provided.

8.

A basic graphical approach for checking normality is the Q-Q plot, which compares sample quantiles against the population quantiles. In the univariate analysis, the probability plot correlation coefficient test for normality has been studied extensively. We consider testing multivariate normality by using the correlation coefficient of the Q-Q plot. When multivariate normality holds, the sample squared distance should follow a chi-square distribution for large samples, and the plot should resemble a straight line. A correlation coefficient test can be constructed by using the pairs of points in the probability plot. When the correlation coefficient test does not reject the null hypothesis, the sample data may come from a multivariate normal distribution or from some other distribution. So, we use the following two steps to test multivariate normality. First, we check multivariate normality by using the probability plot correlation coefficient test. If the test does not reject the null hypothesis, then we test symmetry of the distribution and determine whether multivariate normality holds. This test procedure is called the combination test. The size and power of this test are studied, and it is found that the combination test, in general, is more powerful than other tests for multivariate normality.
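
The first step of the procedure can be sketched as follows. This is an illustration only: the function names are hypothetical, the plotting positions (i - 0.5)/n are an assumed choice, and the chi-square quantiles use the Wilson-Hilferty approximation so that only NumPy and the standard library are needed (`scipy.stats.chi2.ppf` would give exact quantiles). No critical values are computed, and the symmetry test of the second step is not covered.

```python
import numpy as np
from statistics import NormalDist

def chi2_quantile_wh(probs, df):
    """Wilson-Hilferty approximation to chi-square quantiles (an approximation,
    not exact; it degrades for very small probabilities when df is small)."""
    z = np.array([NormalDist().inv_cdf(float(u)) for u in probs])
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * np.sqrt(c)) ** 3

def qq_correlation(X):
    """Correlation between ordered squared Mahalanobis distances and
    chi-square quantiles with p degrees of freedom."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    centered = X - X.mean(axis=0)
    S = np.cov(X, rowvar=False)
    # Squared distance of each observation from the sample mean.
    d2 = np.einsum("ij,ij->i", centered @ np.linalg.inv(S), centered)
    probs = (np.arange(1, n + 1) - 0.5) / n  # assumed plotting positions
    return np.corrcoef(np.sort(d2), chi2_quantile_wh(probs, p))[0, 1]
```

A value close to 1 is consistent with a straight-line Q-Q plot; how close is close enough depends on critical values that would have to be obtained by simulation for the given n and p.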

9.
We provide numerically reliable analytical expressions for the score, Hessian, and information matrix of conditionally heteroscedastic dynamic regression models when the conditional distribution is multivariate t. We also derive one-sided and two-sided Lagrange multiplier tests for multivariate normality versus multivariate t based on the first two moments of the squared norm of the standardized innovations evaluated at the Gaussian pseudo-maximum likelihood estimators of the conditional mean and variance parameters. Finally, we illustrate our techniques through both Monte Carlo simulations and an empirical application to 26 U.K. sectorial stock returns that confirms that their conditional distribution has fat tails.

10.
Hotelling's T² statistic has been used in constructing a multivariate control chart for individual observations. In Phase II operations, the distribution of the T² statistic is related to the F distribution provided the underlying population is multivariate normal. Thus, the upper control limit (UCL) is proportional to a percentile of the F distribution. However, if the process data show sufficient evidence of a marked departure from multivariate normality, the UCL based on the F distribution may be very inaccurate. In such situations, it will usually be helpful to determine the UCL based on the percentile of the estimated distribution for T². In this paper, we use a kernel smoothing technique to estimate the distribution of the T² statistic as well as the UCL of the T² chart, when the process data are taken from a multivariate non-normal distribution. Through simulations, we examine the sample size requirement and the in-control average run length of the T² control chart for sample observations taken from a multivariate exponential distribution. The paper focuses on the Phase II situation with individual observations.
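
The idea of a kernel-based control limit can be sketched as below. The abstract does not specify the kernel, the bandwidth, or the estimation details, so everything here is an assumption: a Gaussian kernel, a normal-reference rule-of-thumb bandwidth, and hypothetical function names `hotelling_t2` and `kernel_ucl`.

```python
import math
import numpy as np

def hotelling_t2(X, mean, cov):
    """T2 value of each row of X relative to a given mean vector and covariance."""
    diff = np.asarray(X, dtype=float) - mean
    return np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)

def kernel_ucl(t2, alpha=0.005, bandwidth=None):
    """UCL as the (1 - alpha) quantile of a Gaussian-kernel CDF estimate of T2."""
    t2 = np.asarray(t2, dtype=float)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumed choice).
        bandwidth = 1.06 * t2.std(ddof=1) * t2.size ** (-0.2)
    erf = np.vectorize(math.erf)

    def cdf(t):
        # Smoothed CDF: average of normal CDFs centred at the observed T2 values.
        return np.mean(0.5 * (1.0 + erf((t - t2) / (bandwidth * math.sqrt(2.0)))))

    lo, hi = t2.min() - 10 * bandwidth, t2.max() + 10 * bandwidth
    for _ in range(60):  # bisection on the monotone smoothed CDF
        mid = 0.5 * (lo + hi)
        if cdf(mid) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

In use, the mean and covariance would be estimated from in-control reference data, `kernel_ucl` applied to the resulting T² values, and a new observation flagged when its T² exceeds the limit.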

11.
A control procedure is presented for monitoring changes in variation for a multivariate normal process in a Phase II operation where the subgroup size, m, is less than p, the number of variates. The methodology is based on a form of Wilks' statistic, which can be expressed as a function of the ratio of the determinants of two separate estimates of the covariance matrix. One estimate is based on the historical data set from Phase I and the other is based on an augmented data set including new data obtained in Phase II. The proposed statistic is shown to be distributed as the product of independent beta distributions that can be approximated using either a chi-square or F-distribution. An ARL study of the statistic is presented for a range of conditions for the population covariance matrix. Cases are considered where a p-variate process is being monitored using a sample of m observations per subgroup and m < p. Data from an industrial multivariate process are used to illustrate the proposed technique.

12.
If the asymptotic normality of a statistic is inadequate for approximating its distribution in practice, then the statistic may be transformed in order to improve the approximation by accelerating the convergence to normality. We treat a goodness-of-fit statistic, the sum of the logarithms of generalized uniform spacings introduced by Cressie (1976, 1978), in this spirit. Specifically, we apply the method of maximum likelihood to simulations of the statistic in order to estimate a power transformation, as in Box & Cox (1964), and hence develop a small sample normal approximation. This approximation provides a more versatile method of applying the statistic than currently available tables of percentiles.

13.
We define a chi-squared statistic for p-dimensional data as follows. First, we transform the data to remove the correlations between the p variables. Then, we discretize each variable into groups of equal size and compute the cell counts in the resulting p-way contingency table. Our statistic is just the usual chi-squared statistic for testing independence in a contingency table. Because the cells have been chosen in a data-dependent manner, this statistic does not have the usual limiting distribution. We derive the limiting joint distribution of the cell counts and the limiting distribution of the chi-squared statistic when the data are sampled from a multivariate normal distribution. The chi-squared statistic is useful in detecting hidden structure in raw data or residuals. It can also be used as a test for multivariate normality.
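
The three steps described above can be sketched as follows. The abstract does not say which decorrelating transform is used, so rotation onto the principal axes is an assumption here, as are the function name `decorrelated_chi2`, the rank-based grouping, and the default of k = 3 groups per variable. The sketch assumes p ≥ 2 and continuous data without ties.

```python
import numpy as np

def decorrelated_chi2(X, k=3):
    """Chi-squared independence statistic on a p-way table built from
    decorrelated data, with k equal-size groups per variable."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    # Step 1: remove correlations by rotating onto the principal axes (assumed).
    centered = X - X.mean(axis=0)
    _, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    Z = centered @ eigvecs
    # Step 2: equal-size groups per variable, assigned from ranks 0..n-1.
    ranks = np.argsort(np.argsort(Z, axis=0), axis=0)
    bins = (ranks * k) // n
    flat = np.ravel_multi_index(tuple(bins.T), (k,) * p)
    observed = np.bincount(flat, minlength=k ** p).reshape((k,) * p)
    # Step 3: expected counts under independence = n * product of margins.
    expected = np.full(observed.shape, float(n))
    for axis in range(p):
        others = tuple(a for a in range(p) if a != axis)
        shape = [1] * p
        shape[axis] = k
        expected = expected * (observed.sum(axis=others) / n).reshape(shape)
    stat = np.sum((observed - expected) ** 2 / expected)
    return stat, observed
```

No p-value is returned on purpose: as the abstract notes, the data-dependent cells mean the statistic does not follow the usual chi-squared limiting distribution.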

14.
Under a randomization model for a completely randomized design, permutation tests are considered based on the usual F statistic and on a multi-response permutation procedure statistic. For the first statistic, the first two moments are obtained so that a comparison with the distribution under the normal theory model can be made. The second statistic is shown to converge in distribution to an infinite weighted sum of chi-squared variates, the weights being the limits of the eigenvalues of a matrix depending on the distance measure used and the order statistics of the observations.
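
A permutation test based on the usual F statistic can be sketched as below. This is a Monte Carlo approximation to the randomization distribution rather than the moment-based comparison studied in the abstract; the function names and the default of 2000 random relabelings are illustrative assumptions.

```python
import numpy as np

def f_statistic(y, labels):
    """One-way ANOVA F statistic for responses y and treatment labels."""
    y = np.asarray(y, dtype=float)
    labels = np.asarray(labels)
    groups = [y[labels == g] for g in np.unique(labels)]
    n, k = y.size, len(groups)
    grand_mean = y.mean()
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def permutation_f_test(y, labels, n_perm=2000, seed=0):
    """Return the observed F and a Monte Carlo permutation p-value."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    observed = f_statistic(y, labels)
    # Re-randomize the treatment labels and count statistics at least as extreme.
    exceed = sum(
        f_statistic(y, rng.permutation(labels)) >= observed
        for _ in range(n_perm)
    )
    return observed, (exceed + 1) / (n_perm + 1)
```

Adding one to both numerator and denominator counts the observed assignment as one of the possible randomizations, so the reported p-value is never exactly zero.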

15.
The author presents a multivariate location model for cluster correlated observations. He proposes an affine-invariant multivariate sign statistic for testing the value of the location parameter. His statistic is an adaptation of that proposed by Randles (2000). The author shows, under very mild conditions, that his test statistic is asymptotically distributed as a chi-squared random variable under the null hypothesis. In particular, the test can be used for skewed populations. In the context of a general multivariate normal model, the author obtains values of his test's Pitman asymptotic efficiency relative to another test based on the overall average. He shows that there is an improvement in the relative performance of the new test as soon as intra-cluster correlation is present. Even in the univariate case, the new test can be very competitive for Gaussian data. Furthermore, the statistic is easy to compute, even for large dimensional data. The author shows through simulations that his test performs well compared to the average-based test. He illustrates its use with real data.

16.
In fitting a regression model, one or more observations may have substantial effects on the estimators. These unusual observations are precisely detected by a new diagnostic measure, Pena's statistic. In this article, we introduce a type of Pena's statistic for each point in Liu regression. Using the forecast change property, we simplify Pena's statistic in a numerical sense. It is found that the simplified Pena's statistic behaves quite well as far as detection of influential observations is concerned. We express Pena's statistic in terms of the Liu leverages and residuals. The normality of this statistic is also discussed, and it is demonstrated that it can identify a subset of high Liu leverage outliers. For numerical evaluation, simulation studies are given and a real data set is analysed for illustration.

17.
The main purpose of this paper is to give an algorithm to attain joint normality of non-normal multivariate observations through a new power normal family introduced by the author (Isogai, 1999). The algorithm tries to transform each marginal variable simultaneously to joint normality, but due to a large number of parameters it repeats a maximization process with respect to the conditional normal density of one transformed variable given the other transformed variables. A non-normal data set is used to examine the performance of the algorithm, and the degree of achievement of joint normality is evaluated by measures of multivariate skewness and kurtosis. Besides the above topic, making use of properties of our power normal family, we discuss not only a normal approximation formula of non-central F distributions in the framework of regression analysis but also some decomposition formulas of a power parameter, which appear in a Wilson-Hilferty power transformation setting.

18.
This paper deals with testing equality of variances of observations in the different treatment groups assuming treatment effects are fixed. We study the distribution of a test statistic which is known to perform comparably well with other statistics for the same purpose under normality. The statistic we consider is based on Shannon’s entropy for a distribution function. We will derive the asymptotic expansion for the distribution of the test statistic based on Shannon’s entropy under nonnormality and numerically examine its performance in comparison with the modified likelihood ratio criteria for normal and some nonnormal populations.

19.
A correlation-type statistic for assessing multivariate normality is described. Its estimated finite sample distribution is tabulated, and its performance against certain alternatives is compared with that of a competing Cramér-von Mises type statistic in a Monte Carlo power study. A set of quadrivariate data is examined as illustration of the procedure.

20.
In this article, we introduce a new multivariate cumulative sum chart, where the target mean shift is assumed to be a weighted sum of principal directions of the population covariance matrix. This chart provides an attractive performance in terms of average run length (ARL) for large-dimensional data and it also compares favorably to existing multivariate charts including Crosier's benchmark chart with updated values of the upper control limit and the associated ARL function. In addition, Monte Carlo simulations are conducted to assess the accuracy of the well-known Siegmund's approximation of the average ARL function when observations are normally distributed. As a byproduct of the article, we provide updated values of upper control limits and the associated ARL function for Crosier's multivariate CUSUM chart.

Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号