Similar literature
 20 similar documents found (search time: 31 ms)
1.
The surveillance of multivariate processes has received growing attention during the last decade. Several generalizations of well-known methods such as Shewhart, CUSUM and EWMA charts have been proposed. Many of these multivariate procedures are based on a univariate summary statistic of the multivariate observations, usually the likelihood ratio statistic. In this paper we consider the surveillance of multivariate observation processes for a shift between two fully specified alternatives. The effect of the dimension reduction using likelihood ratio statistics is discussed in the context of sufficiency properties. Also, an example of the loss of efficiency when not using the univariate sufficient statistic is given. Furthermore, a likelihood ratio method, the LR method, for constructing surveillance procedures is suggested for multivariate surveillance situations. It is shown to produce univariate surveillance procedures based on the sufficient likelihood ratios. As the LR procedure has several optimality properties in the univariate case, it is also used here as a benchmark for comparisons between multivariate surveillance procedures.

2.
We define a new family of influence measures, based on divergence measures, in the multivariate general linear model. Influence measures are obtained by quantifying the divergence between the sampling distribution of an estimate obtained with all the observations and the sampling distribution of the same estimate obtained after omitting any one observation. This approach is applied to best linear unbiased estimates of estimable functions. Therefore, these diagnostics can be applied to every multivariate statistical technique that can be formulated as a model of this kind. Some examples are considered to clarify the applicability of the introduced diagnostics.

3.
In a cluster analysis of a multivariate data set, it may happen that one or two observations have a disproportionately large effect on the analysis, in the sense that their removal causes a dramatic change to the results. It is important to be able to identify such influential observations, and the present paper addresses this problem. To do so, we must first quantify the effect of a single observation. Various definitions are discussed, and criteria for identifying influential observations are investigated; the minimum spanning tree and the number of neighbours of each observation are considered. The investigation concentrates on single-link cluster analysis, although complete-link analysis is also briefly discussed. Patterns emerge in both real and simulated data, which suggest ways of predicting observations with no effect and those with the greatest effect. It is not necessary to recalculate the results with each observation omitted, which is an economy of presentation as well as labour.

4.
The ideas of influence are now well known, and influence functions have been investigated widely, especially in the contexts of regression analysis and multivariate analysis. However, there seems to be no published account of the influence of single observations on simple one-sample tests for means and variance. The usual test statistics, based on normality assumptions, for a mean when the variance is known, and for the variance, display an obvious behaviour when an extra observation is added to a data set. However, the t-test for a single mean has a more interesting, and less predictable, pattern of behaviour. The t-statistic also demonstrates how the theoretical influence function can sometimes be misleading when used to estimate the effect of inserting or deleting a single observation.
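
The behaviour described in this abstract can be explored numerically. The sketch below is not taken from the paper: the function names `t_statistic` and `t_with_extra` and the sample values are hypothetical, chosen only for illustration. It recomputes the one-sample t-statistic after appending a single extra observation. One property that can be checked by hand is that, as the added value grows without bound, the sample mean and the standard error grow at the same rate, so the t-statistic tends to 1 rather than to infinity.

```python
import math
from statistics import mean, stdev


def t_statistic(sample, mu0=0.0):
    """One-sample t-statistic for the null hypothesis mean == mu0."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))


def t_with_extra(sample, x, mu0=0.0):
    """t-statistic after one extra observation x is added to the sample."""
    return t_statistic(list(sample) + [x], mu0)


# Hypothetical data, used only to illustrate the effect of one added point.
data = [0.8, 1.1, 1.9, 2.4, 1.3]
print("original t:", round(t_statistic(data), 3))
for x in (1.5, 3.0, 10.0, 100.0):
    print("added", x, "-> t =", round(t_with_extra(data, x), 3))
```

Adding a value near the sample mean raises the statistic, while a very large value of the same sign lowers it, which is one way to see why the pattern is less predictable than for the known-variance test.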

5.
A new test is proposed for comparing two multivariate distributions by using distances between observations. Unlike earlier tests using interpoint distances, the new test statistic has a known exact distribution and is exactly distribution free. The interpoint distances are used to construct an optimal non-bipartite matching, i.e. a matching of the observations into disjoint pairs to minimize the total distance within pairs. The cross-match statistic is the number of pairs containing one observation from the first distribution and one from the second. Distributions that are very different will exhibit few cross-matches. When comparing two discrete distributions with finite support, the test is consistent against all alternatives. The test is applied to a study of brain activation measured by functional magnetic resonance imaging during two linguistic tasks, comparing brains that are impaired by arteriovenous abnormalities with normal controls. A second exact distribution-free test is also discussed: it ranks the pairs and sums the ranks of the cross-matched pairs.
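
The cross-match statistic defined in this abstract can be sketched for very small samples as follows. This is an illustrative sketch only: the function name `cross_match_count` is hypothetical, Euclidean distance is assumed, and the optimal matching is found by exhaustive search over bitmasks, which is practical only for a pooled sample of roughly 20 points or fewer. It does not compute the exact null distribution mentioned in the abstract.

```python
import math
from functools import lru_cache


def cross_match_count(xs, ys):
    """Count cross-group pairs in a minimum-total-distance perfect matching.

    xs and ys are sequences of equal-length coordinate tuples; the pooled
    sample size must be even. Brute force, intended for tiny samples only.
    """
    points = list(xs) + list(ys)
    labels = [0] * len(xs) + [1] * len(ys)
    n = len(points)
    if n % 2:
        raise ValueError("pooled sample size must be even")
    dist = [[math.dist(p, q) for q in points] for p in points]

    @lru_cache(maxsize=None)
    def best(mask):
        # Bit i of mask is set while point i is still unmatched.
        if mask == 0:
            return 0.0, ()
        i = (mask & -mask).bit_length() - 1  # lowest unmatched index
        rest = mask & ~(1 << i)
        best_cost, best_pairs = math.inf, ()
        for j in range(i + 1, n):
            if (rest >> j) & 1:
                cost, pairs = best(rest & ~(1 << j))
                cost += dist[i][j]
                if cost < best_cost:
                    best_cost, best_pairs = cost, ((i, j),) + pairs
        return best_cost, best_pairs

    _, pairs = best((1 << n) - 1)
    return sum(labels[i] != labels[j] for i, j in pairs)


# Two well-separated hypothetical groups: no pair should straddle them.
group_a = [(0, 0), (0, 1), (1, 0), (1, 1)]
group_b = [(10, 10), (10, 11), (11, 10), (11, 11)]
print(cross_match_count(group_a, group_b))
```

For realistic sample sizes a polynomial-time general-graph matching algorithm would be needed in place of the exhaustive search.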

6.
Hotelling's T² statistic has many applications in multivariate analysis. In particular, it can be used to measure the influence that a particular observation vector has on parameter estimation. For example, in the bivariate case, there exists a direct relationship between the ellipse generated using a T² statistic for individual observations and the hyperbolae generated using Hampel's influence function for the corresponding correlation coefficient. In this paper, we jointly use the components of an orthogonal decomposition of the T² statistic and some influence functions to identify outliers or influential observations. Since the conditional components in the T² statistic are related to the possible changes in the correlation between a variable and a group of other variables, we consider the theoretical influence functions of the correlations and multiple correlation coefficients. Finite-sample versions of these influence functions are used to find the estimated influence function values.
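
As a rough illustration of the quantities named in this abstract, the sketch below computes the T² value of each individual observation in the bivariate case, together with an orthogonal split into an unconditional component for the first variable and a conditional component for the second variable given the first. The function name `t2_components` and the data are hypothetical, and the sample mean and sample covariance are assumed to be estimated from the same data; the paper's own procedure may differ in these details.

```python
from statistics import mean


def t2_components(data):
    """Return (T2, T2_1, T2_2given1) for each bivariate row (x, y).

    T2        = d' S^{-1} d, with d the deviation from the sample mean
    T2_1      = dx^2 / s_xx (first variable on its own)
    T2_2given1 = squared standardized residual of y regressed on x
    """
    n = len(data)
    xbar = mean(x for x, _ in data)
    ybar = mean(y for _, y in data)
    sxx = sum((x - xbar) ** 2 for x, _ in data) / (n - 1)
    syy = sum((y - ybar) ** 2 for _, y in data) / (n - 1)
    sxy = sum((x - xbar) * (y - ybar) for x, y in data) / (n - 1)
    det = sxx * syy - sxy ** 2
    out = []
    for x, y in data:
        dx, dy = x - xbar, y - ybar
        t2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        t2_1 = dx * dx / sxx
        resid = dy - (sxy / sxx) * dx          # residual of y given x
        t2_2g1 = resid ** 2 / (syy - sxy ** 2 / sxx)
        out.append((t2, t2_1, t2_2g1))
    return out


# Hypothetical data: the last row breaks the positive x-y relationship.
rows = [(1.0, 1.2), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1), (3.0, 0.5)]
for row, (t2, a, b) in zip(rows, t2_components(rows)):
    print(row, round(t2, 3), round(a, 3), round(b, 3))
```

In this example the last row has an unremarkable first component but a large conditional component, which is the kind of correlation-breaking observation the decomposition is meant to expose.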

7.
Covariance matrices, or in general matrices of sums of squares and cross-products, are used as input in many multivariate analysis techniques. The eigenvalues of these matrices play an important role in the statistical analysis of data, including estimation and hypothesis testing. It has been recognized that one or a few observations can exert an undue influence on the eigenvalues of a covariance matrix. The relationship between the eigenvalues of the covariance matrix computed from all data and the eigenvalues of the perturbed covariance matrix (a covariance matrix computed after a small subset of the observations has been deleted) cannot in general be written in closed form. Two methods for approximating the eigenvalues of a perturbed covariance matrix have been suggested by Hadi (1988) and Wang and Nyquist (1991) for the case of a perturbation by a single observation. In this paper we improve on these two methods and give some additional theoretical results that may give further insight into the problem. We also compare the two improved approximations in terms of their accuracy.

8.
Estimation in the multivariate context when the number of observations available is less than the number of variables is a classical theoretical problem. In order to ensure estimability, one has to assume certain constraints on the parameters. A method for maximum likelihood estimation under constraints is proposed to solve this problem. Even in the extreme case where only a single multivariate observation is available, this may provide a feasible solution. It simultaneously provides a simple, straightforward methodology to allow for specific structures within and between covariance matrices of several populations. This methodology yields exact maximum likelihood estimates.

9.
The analysis of a set of data consisting of N short (≤20 observations each) multivariate time series, where the observations are irregularly spaced and where observations for the different components of each multivariate series are observed at different times, is discussed. With the increased use of automatic recording devices in many fields, data such as these, which are of course samples from smooth response curves, are becoming more common. In this application, which was a clinical trial comparing two cements for use in hip replacement surgery, the key to the analysis was in recognizing that the interest lay in the degree to which the five curves representing a patient's vital signs deviated from baseline (i.e., normal for that patient) during surgery. This enabled the statisticians to define appropriate response variables. The analysis included Rousseeuw's (1984) technique for the identification of multivariate outliers and logistic regressions to identify any effects on the process producing the outliers due to treatment or covariates.

10.
Nonparametric estimation of copula-based measures of multivariate association in a continuous random vector X = (X1, …, Xd) is usually based on complete continuous data. In many practical applications, however, these types of data are not readily available; instead aggregated ordinal observations are given, for example, ordinal ratings based on a latent continuous scale. This article introduces a purely nonparametric and data-driven estimator of the unknown copula density and the corresponding copula based on multivariate contingency tables. Estimators for multivariate Spearman's rho and Kendall's tau are based thereon. The properties of these estimators in samples of medium and large size are evaluated in a simulation study. An increasing bias can be observed along with an increasing degree of association between the components. As is to be expected, the bias is severely influenced by the amount of information available. Additionally, the influence of sample size is only marginal. We further give an empirical illustration based on daily returns of five German stocks.

11.
Clinical trials involving multiple time-to-event outcomes are increasingly common. In this paper, permutation tests for group differences in multivariate time-to-event data are proposed. Unlike other two-sample tests for multivariate survival data, the proposed tests attain the nominal type I error rate. A simulation study shows that the proposed tests outperform their competitors when the degree of censoring is sufficiently high. When the degree of censoring is low, it is seen that naive tests such as Hotelling's T² outperform tests tailored to survival data. Computational and practical aspects of the proposed tests are discussed, and their use is illustrated by analyses of three publicly available datasets. Implementations of the proposed tests are available in an accompanying R package.

12.
In recent years, statistical process control (SPC) of multivariate and autocorrelated processes has received a great deal of attention. Modern manufacturing/service systems with more advanced technology and higher production rates can generate complex processes in which consecutive observations are dependent and the variables are correlated with one another. These processes obviously violate the assumption of the independence of each observation that underlies traditional SPC and thus deteriorate the performance of its traditional tools. The popular way to address this issue is to monitor the residuals (the difference between the actual value and the fitted value) with the traditional SPC approach. However, this residuals-based approach requires two steps: (1) finding the residuals; and (2) monitoring the process. Also, an accurate prediction model is necessary to obtain the uncorrelated residuals. Furthermore, these residuals are not the original values of the observations and consequently may have lost some useful information about the targeted process. The main purpose of this article is to examine the feasibility of using one-class classification-based control charts to handle multivariate and autocorrelated processes. The article uses simulated data to present an analysis and comparison of one-class classification-based control charts and the traditional Hotelling's T² chart.
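
The two-step residuals-based approach that this abstract contrasts with its own proposal can be sketched in a deliberately simplified form. The code below is an assumption-laden illustration, not the article's method: it treats a single variable rather than a multivariate process, uses a first-order autoregressive model fitted by least squares as the prediction model, and applies a Shewhart-style k-sigma rule to the residuals. The names `fit_ar1`, `ar1_residuals` and `shewhart_signals` are hypothetical.

```python
from statistics import mean


def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] on in-control data."""
    prev, curr = series[:-1], series[1:]
    mp, mc = mean(prev), mean(curr)
    phi = (sum((p - mp) * (c - mc) for p, c in zip(prev, curr))
           / sum((p - mp) ** 2 for p in prev))
    return mc - phi * mp, phi


def ar1_residuals(series, c, phi):
    """Step 1: one-step-ahead residuals, actual value minus fitted value."""
    return [x - (c + phi * p) for p, x in zip(series[:-1], series[1:])]


def shewhart_signals(resid, sigma, k=3.0):
    """Step 2: indices of residuals outside the +/- k*sigma limits."""
    return [i for i, r in enumerate(resid) if abs(r) > k * sigma]


# Hypothetical noise-free reference series obeying x[t] = 1 + 0.5 * x[t-1].
reference = [10.0, 6.0, 4.0, 3.0, 2.5, 2.25]
c_hat, phi_hat = fit_ar1(reference)

# New series whose last value jumps away from the fitted dynamics.
monitored = [10.0, 6.0, 4.0, 9.0]
resid = ar1_residuals(monitored, c_hat, phi_hat)
print(shewhart_signals(resid, sigma=1.0))
```

The sketch makes the dependence on the prediction model visible: if `fit_ar1` is a poor description of the process, the residuals stay autocorrelated and the limits in the second step lose their meaning.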

13.
Necessary and sufficient conditions are given for the covariance structure of all the observations in a multivariate factorial experiment under which certain multivariate quadratic forms are independent and distributed as a constant times a Wishart. It is also shown that exact multivariate test statistics can be formed for certain covariance structures of the observations when the assumption of equal covariance matrices for each normal population is relaxed. A characterization is given for the dependency structure between random vectors in which the sample mean and sample covariance matrix have certain properties.

14.
We propose a class of state-space models for multivariate longitudinal data where the components of the response vector may have different distributions. The approach is based on the class of Tweedie exponential dispersion models, which accommodates a wide variety of discrete, continuous and mixed data. The latent process is assumed to be a Markov process, and the observations are conditionally independent given the latent process, over time as well as over the components of the response vector. This provides a fully parametric alternative to the quasilikelihood approach of Liang and Zeger. We estimate the regression parameters for time-varying covariates entering either via the observation model or via the latent process, based on an estimating equation derived from the Kalman smoother. We also consider analysis of residuals from both the observation model and the latent process.

15.
In this paper, we generalize the notion of classification of an observation (sample) into one of the given n populations to the case where some or all of the populations into which the new observation is to be classified may be new but related in a simple way to the given n populations. The discussion is in the framework of the given set of observations obeying the usual multivariate general linear hypothesis model. The set of populations into which the new observation may be classified could be linear manifolds of the parameter space or their closed subsets or closed convex subsets or a combination of them or simply t subsets of the parameter space each of which has a finite number of elements. In the last case a likelihood ratio procedure can be obtained easily. Classification procedures given here are based on Mahalanobis distance. A Bonferroni lower-bound estimate of the probability of correctly classifying an observation is given for the case when the covariance matrix is known or is estimated from a large sample. A numerical example relating to the classification procedures suggested here is given.
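
The simplest instance of a Mahalanobis-distance classification rule, with each population reduced to a single known mean vector, can be sketched as follows. This covers only that special case and not the manifolds or convex subsets treated in the paper; a common known 2x2 covariance matrix is assumed, and the names `mahalanobis_sq` and `classify` are hypothetical.

```python
def mahalanobis_sq(x, mu, cov):
    """Squared Mahalanobis distance between 2-vectors x and mu."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def classify(x, means, cov):
    """Index of the population mean closest to x in Mahalanobis distance."""
    return min(range(len(means)), key=lambda k: mahalanobis_sq(x, means[k], cov))


# Hypothetical setting with strongly correlated components.
means = [(0.0, 0.0), (2.0, 0.0)]
cov = [[1.0, 0.9], [0.9, 1.0]]
print(classify((1.2, 1.2), means, cov))
```

With these illustrative numbers the point (1.2, 1.2) is closer to the second mean in Euclidean distance but is assigned to the first population, because a deviation along the direction of strong positive correlation is far less surprising than one across it.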

16.
A new density-based classification method that uses semiparametric mixtures is proposed. Like other density-based classifiers, it first estimates the probability density function for the observations in each class, with a semiparametric mixture, and then classifies a new observation by the highest posterior probability. By making proper use of a multivariate nonparametric density estimator that has been developed recently, it is able to produce adaptively smooth and complicated decision boundaries in a high-dimensional space and can thus work well in such cases. Issues specific to classification are studied and discussed. Numerical studies using simulated and real-world data show that the new classifier performs very well as compared with other commonly used classification methods.

17.
In geostatistics, detecting atypical observations is of special interest due to the changes they can cause in environmental and geological patterns. Several methods for detecting them have already been suggested for the univariate spatial case. However, the problem is more complicated when various variables are observed simultaneously and the spatial correlation among them must be taken into account. The aim of this paper is to detect outliers and influential observations in multivariate spatial linear models. For this purpose, we derive and explore two different methods. First, a multivariate version of the forward search algorithm is given, where locations with outliers are detected in the last steps of the procedure. Next, we derive influence measures to assess the impact of the observations on the multivariate spatial linear model. The procedures are easy to compute and to interpret by means of graphical representations. Finally, an example and a Monte Carlo study illustrate the performance of these methods for identification of outliers in multivariate spatial linear models.

18.
The main purpose of this paper is to give an algorithm to attain joint normality of non-normal multivariate observations through a new power normal family introduced by the author (Isogai, 1999). The algorithm tries to transform each marginal variable simultaneously to joint normality, but due to a large number of parameters it repeats a maximization process with respect to the conditional normal density of one transformed variable given the other transformed variables. A non-normal data set is used to examine the performance of the algorithm, and the degree of achievement of joint normality is evaluated by measures of multivariate skewness and kurtosis. In addition, making use of properties of our power normal family, we discuss not only a normal approximation formula for non-central F distributions in the framework of regression analysis but also some decomposition formulas for a power parameter, which appear in a Wilson-Hilferty power transformation setting.

19.
Multivariate extreme value statistical analysis is concerned with observations on several variables which are thought to possess some degree of tail dependence. The main approaches to inference for multivariate extremes consist in approximating either the distribution of block component-wise maxima or the distribution of the exceedances over a high threshold. Although the expressions of the asymptotic density functions of these distributions may be characterized, they cannot be computed in general. In this paper, we study the case where the spectral random vector of the multivariate max-stable distribution has known conditional distributions. The asymptotic density functions of the multivariate extreme value distributions may then be written through univariate integrals that are easily computed or simulated. The asymptotic properties of two likelihood estimators are presented, and the utility of the method is examined via simulation.

20.
This investigation considers a general linear model whose parameters change exactly once during the observation period. Assuming that all the parameters are unknown and that the prior distribution is proper, the Bayesian predictive distribution of future observations is derived. It is shown that the predictive distribution is a mixture of multivariate t distributions and that the mixing distribution is the marginal posterior mass function of the change point parameter.
