Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
In this paper, we revisit the alternative outlier model of Thompson [A note on restricted maximum likelihood estimation with an alternative outlier model, J. Roy. Stat. Soc. Ser. B 47 (1985), pp. 53–55] for detecting outliers in the linear model. Gumedze et al. [A variance shift model for detection of outliers in the linear mixed model, Comput. Statist. Data Anal. 54 (2010), pp. 2128–2144] called this model the variance shift outlier model (VSOM). The basic idea behind the VSOM is to detect observations with inflated variance and isolate them for further investigation. The VSOM is appealing because it downweights an outlier in the analysis, with the weighting determined automatically as part of the estimation procedure. We set up the VSOM as a linear mixed model and then use the likelihood ratio test (LRT) statistic as an objective measure for determining whether the weighting is required, i.e. whether the observation is an outlier. We also derive one-step updates of the variance parameter estimates based on observed, expected and average information matrices to obtain one-step LRT statistics, which usually require less computation. Both the fully iterated and one-step LRTs are functions of the squared standardized residuals from the null model and can therefore be computed directly without fitting the VSOM. We investigate the properties of the likelihood ratio tests and compare them. An extension of the model to detect a group of outliers is also given. We illustrate the proposed methodology using simulated datasets and a real dataset.
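The abstract above says the LRT statistics depend on the data only through the squared standardized residuals of the null model. As a rough illustration of that input quantity, the sketch below computes squared internally studentized residuals for a simple linear regression. The function name, the one-predictor setting, and the data are assumptions made for illustration. The paper's LRT formula itself is not reproduced here.

```python
def squared_standardized_residuals(x, y):
    """Squared internally studentized residuals from a simple linear
    regression of y on x (an illustrative stand-in for the null model)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)          # residual variance estimate
    lev = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]  # leverages h_ii
    return [e * e / (s2 * (1 - h)) for e, h in zip(resid, lev)]


# Made-up data: roughly y = 2x + 1, with an inflated value at index 3.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [1.1, 2.9, 5.2, 12.0, 9.1, 10.8, 13.1, 15.0]
r2 = squared_standardized_residuals(x, y)
print(r2.index(max(r2)))  # position of the most suspicious observation
```

Observations with large squared residuals are the natural candidates for a variance shift test.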

2.
Cylindrical data are bivariate data formed by combining a circular variable and a linear variable. Up to now, however, no work has been done on the detection of outliers in cylindrical data. We introduce a definition of an outlier for cylindrical data and present a new test of discordancy to detect outliers in this type of data, based on the k-nearest-neighbor distance. Cut-off points of the new test statistic based on the Johnson-Wehrly distribution are calculated, and its performance is examined using simulation. A practical example is presented using wind speed and wind direction data obtained from the Malaysian Meteorological Department.

3.
This article is concerned with testing the diagonality of a high-dimensional covariance matrix under non-normality, which is more practical than testing sphericity and identity in the high-dimensional setting. The existing testing procedure for diagonality is not robust against either the data dimension or the data distribution, producing tests with distorted type I error rates much larger than nominal levels. This is mainly due to bias from estimating some functions of the high-dimensional covariance matrix under non-normality. Compared with the sphericity and identity hypotheses, the asymptotic properties under the diagonality hypothesis are more involved, and the bias must be handled with more care. We develop a correction that makes the existing test statistic robust against both the data dimension and the data distribution. We show that the proposed test statistic is asymptotically normal without the normality assumption and without specifying an explicit relationship between the dimension p and the sample size n. Simulations show that it has good size and power for a wide range of settings.

4.
Testing for homogeneity in finite mixture models has been investigated by many researchers. The asymptotic null distribution of the likelihood ratio test (LRT) is very complex and difficult to use in practice. We propose a modified LRT for homogeneity in finite mixture models with a general parametric kernel distribution family. The modified LRT has a χ²-type null limiting distribution and is asymptotically most powerful under local alternatives. Simulations show that it performs better than competing tests. They also reveal that the limiting distribution, with some adjustment, can satisfactorily approximate the quantiles of the test statistic, even for moderate sample sizes.

5.
The presence of outliers inevitably leads to distorted analysis and inappropriate prediction, especially for multiple outliers in high-dimensional regression, where the high dimensionality of the data may increase the chance that one or more observations are outlying. Because outlier detection is both necessary and important in high-dimensional regression analysis, we propose in this paper a feasible outlier detection approach for the sparse high-dimensional linear regression model. First, we search for a clean subset using the sure independence screening method and least trimmed squares regression estimates. Then, we define a high-dimensional outlier detection measure and propose a multiple-outlier detection approach based on multiple testing procedures. In addition, to enhance efficiency, we refine the outlier detection rule after obtaining a relatively reliable non-outlier subset from the initial detection approach. Comparison studies based on Monte Carlo simulation show that the proposed method performs well for detecting multiple outliers in the sparse high-dimensional linear regression model. We further illustrate the application of the proposed method by an empirical analysis of real-life protein and gene expression data.

6.
In mixed linear models, it is frequently of interest to test hypotheses on the variance components. The F-test and the likelihood ratio test (LRT) are commonly used for such purposes. Current LRTs available in the literature are based on limiting distribution theory. With the development of finite sample distribution theory, it becomes possible to derive the exact test for the likelihood ratio statistic. In this paper, we consider the problem of testing null hypotheses on the variance component in a one-way balanced random effects model. We use the exact test for the likelihood ratio statistic and compare the performance of the F-test and the LRT. Simulations provide strong support for the equivalence between these two tests. Furthermore, we prove the equivalence between these two tests mathematically.
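For readers unfamiliar with the F-test mentioned in the abstract above, the following is a minimal sketch of the standard ANOVA F statistic for a balanced one-way layout, which is the usual test of a zero between-group variance component. The function name and the data are illustrative assumptions. The exact LRT studied in the paper is not implemented here.

```python
def one_way_f_statistic(groups):
    """Return (F, (df_between, df_within)) for a balanced one-way layout.

    groups: list of equal-length lists of observations, one list per group.
    F = MSB / MSW; under a zero between-group variance component and
    normal errors it follows an F(a - 1, a(n - 1)) distribution.
    """
    a = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("balanced design required")
    grand = sum(sum(g) for g in groups) / (a * n)
    means = [sum(g) / n for g in groups]
    ssb = n * sum((m - grand) ** 2 for m in means)                 # between groups
    ssw = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)  # within groups
    msb = ssb / (a - 1)
    msw = ssw / (a * (n - 1))
    return msb / msw, (a - 1, a * (n - 1))


# Made-up balanced data: three groups of three observations.
f_stat, df = one_way_f_statistic([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_stat, df)
```

A large F relative to the reference F distribution is evidence against a zero variance component.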

7.
This study investigates the influences of additive outliers on financial durations. An outlier test statistic and an outlier detection procedure are proposed to detect and estimate outlier effects for the logarithmic Autoregressive Conditional Duration (Log-ACD) model. The proposed test statistic has an exact sampling distribution and performs very well, in terms of size and power, in a series of Monte Carlo simulations. Furthermore, the test statistic is robust to several alternative distribution assumptions. An empirical application shows that parameter estimates without considering outliers tend to be biased.

8.
In this paper we present a "model-free" method of outlier detection for Gaussian time series that uses the autocorrelation structure of the time series. We also present a graphical diagnostic method to distinguish an additive outlier (AO) from an innovation outlier (IO). The test statistic for detecting the outlier has a χ² distribution with one degree of freedom. We show that this method works well when the time series contains either one type of outlier or both additive and innovation outliers, and that it has the advantage that no time series model needs to be estimated from the data. Simulation evidence shows that different types of outliers can be graphically distinguished using the proposed techniques.

9.
We propose a new method to test the order between two high-dimensional mean curves. The new statistic extends the approach of Follmann (1996) to high-dimensional data by adapting the strategy of Bai and Saranadasa (1996). The proposed procedure is an alternative to the non-negative basis matrix factorization (NBMF) based test of Lee et al. (2008) for the same hypothesis, but it is much easier to implement. We derive the asymptotic mean and variance of the proposed test statistic under the null hypothesis of equal mean curves. Based on theoretical results, we put forward a permutation procedure to approximate the null distribution of the new test statistic. We compare the power of the proposed test with that of the NBMF-based test via simulations. We illustrate the approach by an application to tidal volume traces.

10.
The effect of a single variable data point, x, on the usual test statistics for traditional hypothesis tests for means is analyzed. It is shown that an outlier may have a profound and unexpected effect on the test statistic. Although it might appear that an outlier would tend to lend support to the alternative hypothesis, it may in fact detract from the significance of the test. In one-population tests and analysis of variance (ANOVA), the value of x that maximizes the significance of the test statistic is given. This value does not have to be unusually large or small. In fact, it often falls within the range of the other sample points. In the general one-population case, the limiting value of the test statistic is shown to be +1. In the case involving more than one population, it is shown that the limiting value of the test statistic is a function only of the number of members in the samples and not of their relative values. Special cases are identified in which the test statistic is shown to have unique characteristics depending on the characteristics of the data.
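The limiting value of +1 stated in the abstract above can be checked numerically for the one-sample t statistic. The sketch below assumes the usual statistic t = (mean − μ0) / (s / √n) with μ0 = 0 and uses made-up sample values. As the added point x grows, the mean behaves like x/n and s/√n also behaves like x/n, so their ratio approaches 1.

```python
import math
import statistics


def one_sample_t(sample, mu0=0.0):
    """Usual one-sample t statistic for H0: mean == mu0."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return (mean - mu0) / (sd / math.sqrt(n))


base = [1.2, 0.8, 1.1, 0.9, 1.0]  # made-up sample, clearly away from zero
print("without extra point:", one_sample_t(base))
for x in (1.0, 10.0, 1e3, 1e6):
    # Append a single variable point x and recompute the statistic.
    print(x, one_sample_t(base + [x]))
```

In this example the statistic is large for the original sample and collapses toward 1 as x becomes extreme, which illustrates how an outlier can reduce rather than increase significance.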

11.
An outlier is defined as an observation that is significantly different from the others in its dataset. In high-dimensional regression analysis, datasets often contain a proportion of outliers. It is important to identify and eliminate the outliers when fitting a model to a dataset. In this paper, a novel outlier detection method is proposed for high-dimensional regression problems. The leave-one-out idea is used to construct a novel outlier detection measure based on distance correlation, and an outlier detection procedure is then proposed. The proposed method enjoys several advantages. First, the outlier detection measure is simple to calculate, and the detection procedure works efficiently even for high-dimensional regression data. Moreover, it can deal with a general regression and does not require specification of a linear regression model. Finally, simulation studies show that the proposed method behaves well for detecting outliers in high-dimensional regression models and performs better than some competing methods.

12.
A. Roy, D. Klein. Statistics, 2018, 52(2): 393-408
Testing hypotheses about the structure of a covariance matrix for doubly multivariate data is often considered in the literature. In this paper, Rao's score test (RST) is derived to test the block exchangeable covariance matrix or block compound symmetry (BCS) covariance structure under the assumption of multivariate normality. It is shown that the empirical distribution of the RST statistic under the null hypothesis is independent of the true values of the mean and the matrix components of a BCS structure. A significant advantage of the RST is that it can be performed for small samples, even smaller than the dimension of the data, where the likelihood ratio test (LRT) cannot be used, and it outperforms the standard LRT in a number of contexts. Simulation studies are performed for the sample size consideration and for the estimation of the empirical quantiles of the null distribution of the test statistic. The RST procedure is illustrated on a real data set from medical studies.

13.
The likelihood ratio test (LRT) for the mean direction in the von Mises distribution is modified so that it has a common asymptotic distribution for both large sample size and large concentration parameter. The test statistic of the modified LRT is compared with the F distribution rather than the chi-square distribution usually employed. The good performance of the modified LRT is shown by analytical studies and Monte Carlo simulation studies. A notable advantage of the test is that it forms part of a unified likelihood inference procedure that includes both the marginal MLE and the marginal LRT for the concentration parameter.

14.
Test procedures for outlier detection problems in the Gumbel distribution are rarely available. Hence, a test statistic is proposed here for the detection of a pair of upper and lower outliers from a Gumbel distribution with known scale parameter. The critical values of the statistic are obtained, and some examples are given to highlight the use of the statistic. The advantage of the proposed statistic is that the scale parameter, though assumed to be known, is not explicitly involved in the determination of the critical values.

15.
The likelihood-ratio test (LRT) is considered as a goodness-of-fit test for the null hypothesis that several distribution functions are uniformly stochastically ordered. Under the null hypothesis, H1: F1 ⪯ F2 ⪯ ··· ⪯ FN, where ⪯ denotes the uniform stochastic ordering, the asymptotic distribution of the LRT statistic is a convolution of several chi-bar-square distributions, each of which depends upon the location parameter. The least-favourable parameter configuration for the LRT is not unique. It can be of two different types, depending on the number of distributions, the number of intervals and the significance level α. This testing method is illustrated with a data set of survival times of five groups of male fruit flies.

16.
For a multivariate linear model, Wilks' likelihood ratio test (LRT) constitutes one of the cornerstone tools. However, the computation of its quantiles under the null or the alternative hypothesis requires complex analytic approximations and, more importantly, these distributional approximations are feasible only for moderate dimension of the dependent variable, say p≤20. On the other hand, assuming that the data dimension p as well as the number q of regression variables are fixed while the sample size n grows, several asymptotic approximations have been proposed in the literature for Wilks' Λ, including the widely used chi-square approximation. In this paper, we consider necessary modifications to Wilks' test in a high-dimensional context, specifically assuming a high data dimension p and a large sample size n. Based on recent random matrix theory, the correction we propose to Wilks' test is asymptotically Gaussian under the null hypothesis, and simulations demonstrate that the corrected LRT has very satisfactory size and power, certainly in the large p and large n context, but also for moderately large data dimensions such as p=30 or p=50. As a byproduct, we give a reason why the standard chi-square approximation fails for high-dimensional data. We also introduce a new procedure for the classical multiple sample significance test in multivariate analysis of variance which is valid for high-dimensional data.

17.
In this study, testing the equality of mean vectors in a one-way multivariate analysis of variance (MANOVA) is considered when each dataset has a monotone pattern of missing observations. The likelihood ratio test (LRT) statistic in a one-way MANOVA with monotone missing data is given. Furthermore, the modified test (MT) statistic based on the likelihood ratio (LR) and the modified LRT (MLRT) statistic with monotone missing data are proposed using the decomposition of the LR and an asymptotic expansion for each decomposed LR. The accuracy of the chi-square approximation is investigated using a Monte Carlo simulation. Finally, an example is given to illustrate the methods.

18.
Maclean et al. (1976) applied a specific Box-Cox transformation to test for mixtures of distributions against a single distribution. Their null hypothesis is that a sample of n observations is from a normal distribution with unknown mean and variance after a restricted Box-Cox transformation. The alternative is that the sample is from a mixture of two normal distributions, each with unknown mean and unknown, but equal, variance after another restricted Box-Cox transformation. We developed a computer program that calculated the maximum likelihood estimates (MLEs) and likelihood ratio test (LRT) statistic for the above. Our algorithm for the calculation of the MLEs of the unknown parameters used multiple starting points to protect against convergence to a local rather than global maximum. We then simulated the distribution of the LRT for samples drawn from a normal distribution and five Box-Cox transformations of a normal distribution. The null distribution appeared to be the same for the Box-Cox transformations studied and appeared to be distributed as a chi-square random variable for samples of 25 or more. The degrees of freedom parameter appeared to be a monotonically decreasing function of the sample size. The null distribution of this LRT appeared to converge to a chi-square distribution with 2.5 degrees of freedom. We estimated the critical values for the 0.10, 0.05, and 0.01 levels of significance.

19.
In an affected-sib-pair genetic linkage analysis, identical-by-descent (IBD) data for affected sib pairs are routinely collected at a large number of markers along chromosomes. Under very general genetic assumptions, the IBD distribution at each marker satisfies the possible triangle constraint. Statistical analysis of IBD data should thus utilize this information to improve efficiency. At the same time, this constraint means that the usual regularity conditions for likelihood-based statistical methods are not satisfied. In this paper, the authors study the asymptotic properties of the likelihood ratio test (LRT) under the possible triangle constraint. They derive the limiting distribution of the LRT statistic based on data from a single locus. They investigate the precision of the asymptotic distribution and the power of the test by simulation. They also study the test based on the supremum of the LRT statistics over the markers distributed throughout a chromosome. Instead of deriving a limiting distribution for this test, they use a mixture of chi-squared distributions to approximate its true distribution. Their simulation results show that this approach has desirable simplicity and satisfactory precision.
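The abstract above does not spell out the possible triangle constraint. A commonly cited form, assumed here, is that the probabilities (z0, z1, z2) of an affected sib pair sharing 0, 1 or 2 alleles IBD satisfy z0 + z1 + z2 = 1 and 2·z0 ≤ z1 ≤ 1/2. The sketch below only checks whether a candidate distribution lies in that region. The function name and tolerance are illustrative, and the constrained LRT itself is not implemented.

```python
def in_possible_triangle(z0, z1, z2, tol=1e-12):
    """Check IBD sharing probabilities against the assumed constraint
    z0 + z1 + z2 = 1 and 2*z0 <= z1 <= 1/2 (with all values non-negative)."""
    if min(z0, z1, z2) < -tol:
        return False
    if abs(z0 + z1 + z2 - 1.0) > 1e-9:
        return False
    return 2 * z0 <= z1 + tol and z1 <= 0.5 + tol


# The no-linkage null (1/4, 1/2, 1/4) sits at a vertex of the triangle.
print(in_possible_triangle(0.25, 0.5, 0.25))
# Excess sharing of zero alleles falls outside the region.
print(in_possible_triangle(0.5, 0.25, 0.25))
```

Because the null value lies on the boundary of the region, standard chi-square asymptotics for the LRT do not apply directly, which is the difficulty the abstract refers to.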

20.