Similar articles
20 similar articles found (search time: 15 ms)
1.
Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components.

2.
A method of bootstrapping the two-sample t-test after a Box-Cox transformation is proposed. The procedure is shown to be consistent and asymptotically as efficient as the non-bootstrapped Box-Cox t-test. Because the bootstrap samples are drawn without the assumption of the same distributional shapes, the procedure may be more robust against violation of this assumption. Simulation results support this conjecture.
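As a minimal sketch of the ingredients (not the authors' implementation), the following assumes the Box-Cox exponent `lam` has already been chosen; the two groups are centered and resampled separately, so the bootstrap does not impose equal distributional shapes:

```python
import math
import random

def box_cox(x, lam):
    """Box-Cox transform of a positive value x."""
    return math.log(x) if lam == 0 else (x ** lam - 1) / lam

def pooled_t(a, b):
    """Equal-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    return (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))

def bootstrap_boxcox_t(a, b, lam, n_boot=2000, seed=0):
    """Two-sided bootstrap p-value: each group is resampled separately
    after centering, so the equal-means null holds in the resampling."""
    rng = random.Random(seed)
    ta = [box_cox(x, lam) for x in a]
    tb = [box_cox(x, lam) for x in b]
    t_obs = pooled_t(ta, tb)
    ca = [x - sum(ta) / len(ta) for x in ta]
    cb = [x - sum(tb) / len(tb) for x in tb]
    hits = 0
    for _ in range(n_boot):
        ra = [rng.choice(ca) for _ in ca]
        rb = [rng.choice(cb) for _ in cb]
        hits += abs(pooled_t(ra, rb)) >= abs(t_obs)
    return t_obs, hits / n_boot
```

In practice the transformation parameter would itself be estimated (for example by maximum likelihood), which this sketch omits.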

3.
In this paper, we investigate different procedures for testing the equality of two mean survival times in paired lifetime studies. We consider Owen’s M-test and Q-test, a likelihood ratio test, the paired t-test, the Wilcoxon signed rank test and a permutation test based on log-transformed survival times in the comparative study. We also consider the paired t-test, the Wilcoxon signed rank test and a permutation test based on original survival times for the sake of comparison. The size and power characteristics of these tests are studied by means of Monte Carlo simulations under a frailty Weibull model. For less skewed marginal distributions, the Wilcoxon signed rank test based on original survival times is found to be desirable. Otherwise, the M-test and the likelihood ratio test are the best choices in terms of power. In general, one can choose a test procedure based on information about the correlation between the two survival times and the skewness of the marginal survival distributions.
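Two of the compared procedures are easy to sketch. The following illustrative code (not the study's own) computes the paired t statistic on log-transformed survival times and a two-sided sign-flip permutation p-value built on the same statistic:

```python
import math
import random

def paired_t(diffs):
    """One-sample t statistic of a list of paired differences."""
    n = len(diffs)
    m = sum(diffs) / n
    s2 = sum((d - m) ** 2 for d in diffs) / (n - 1)
    return m / math.sqrt(s2 / n)

def permutation_p(x, y, n_perm=4000, seed=0):
    """Sign-flip permutation p-value on log-transformed pairs: under
    the null, each within-pair difference is symmetric about zero."""
    rng = random.Random(seed)
    diffs = [math.log(a) - math.log(b) for a, b in zip(x, y)]
    t_obs = paired_t(diffs)
    hits = 0
    for _ in range(n_perm):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        hits += abs(paired_t(flipped)) >= abs(t_obs)
    return hits / n_perm
```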

4.
In this paper, we consider the family of skew generalized t (SGT) distributions originally introduced by Theodossiou [P. Theodossiou, Financial data and the skewed generalized t distribution, Manage. Sci. Part 1 44(12) (1998), pp. 1650–1661] as a skew extension of the generalized t (GT) distribution. The SGT distribution family warrants special attention, because it encompasses distributions having both heavy tails and skewness, and many of the widely used distributions such as Student's t, normal, Hansen's skew t, exponential power, and skew exponential power (SEP) distributions are included as limiting or special cases in the SGT family. We show that the SGT distribution can be obtained as the scale mixture of the SEP and generalized gamma distributions. We investigate several properties of the SGT distribution and consider the maximum likelihood estimation of the location, scale, and skewness parameters under the assumption that the shape parameters are known. We show that if the shape parameters are estimated along with the location, scale, and skewness parameters, the influence function for the maximum likelihood estimators becomes unbounded. We obtain the necessary conditions to ensure the uniqueness of the maximum likelihood estimators for the location, scale, and skewness parameters, with known shape parameters. We provide a simple iterative re-weighting algorithm to compute the maximum likelihood estimates for the location, scale, and skewness parameters and show that this simple algorithm can be identified as an EM-type algorithm. We finally present two applications of the SGT distributions in robust estimation.

5.
The one-sample Wilcoxon signed rank test was originally designed to test for a specified median under the assumption that the distribution is symmetric, but it can also serve as a test for symmetry if the median is known. In this article we derive the Wilcoxon statistic as the first component of Pearson's X² statistic for independence in a particularly constructed contingency table. The second and third components are new test statistics for symmetry. In the second part of the article, the Wilcoxon test is extended so that symmetry around the median and symmetry in the tails can be examined separately. A trimming proportion is used to split the observations in the tails from those around the median. We further extend the method so that no arbitrary choice of the trimming proportion has to be made. Finally, the new tests are compared to other tests for symmetry in a simulation study. It is concluded that our tests often have substantially greater power than most other tests.
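For reference, the one-sample Wilcoxon signed rank statistic W⁺ that the decomposition starts from can be computed directly; this sketch ignores ties and zero differences for simplicity:

```python
def wilcoxon_signed_rank(x, m0=0.0):
    """W+ statistic: rank the absolute deviations |x_i - m0| and sum
    the ranks of the observations lying above m0 (ties and exact zeros
    are not handled here)."""
    d = [xi - m0 for xi in x if xi != m0]
    ranked = sorted(range(len(d)), key=lambda i: abs(d[i]))
    w_plus = 0.0
    for rank, i in enumerate(ranked, start=1):
        if d[i] > 0:
            w_plus += rank
    return w_plus
```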

6.
We aimed to determine the most appropriate change measure among simple difference, percent change, and symmetrized percent change in simple paired designs. For this purpose, we devised a computer simulation program. Since the distributions of percent and symmetrized percent change values are skewed and bimodal, the paired t-test did not give good results in terms of Type I error and test power. To be able to use percent change or symmetrized percent change as the change measure, either the distribution of the test statistic should be transformed to a known theoretical distribution by transformation methods, or a new test statistic for these values should be developed.
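The three change measures under comparison, written out for a single pre/post pair; the symmetrized version divides by the average of the two values, which bounds it in (−200, 200) for positive data:

```python
def simple_change(pre, post):
    """Simple difference."""
    return post - pre

def percent_change(pre, post):
    """Change as a percentage of the baseline value."""
    return 100.0 * (post - pre) / pre

def symmetrized_percent_change(pre, post):
    """Change as a percentage of the average of the two values."""
    return 100.0 * (post - pre) / ((pre + post) / 2.0)
```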

7.
The present paper has as its objective an accurate quantification of the robustness of the two-sample t-test over an extensive practical range of distributions. The method is that of a major Monte Carlo study over the Pearson system of distributions, and the details indicate that the results are quite accurate. The study was conducted over the range β₁ = 0.0(0.4)2.0 (negative and positive skewness) and β₂ = 1.4(0.4)7.8 with equal sample sizes and for both the one- and two-tail t-tests. The significance level and power levels (for nominal values of 0.05, 0.50, and 0.95, respectively) were evaluated for each underlying distribution and for each sample size, with each probability evaluated from 100,000 generated values of the test statistic. The results precisely quantify the degree of robustness inherent in the two-sample t-test and indicate to a user the degree of confidence one can have in this procedure over various regions of the Pearson system. The results indicate that the equal-sample-size two-sample t-test is quite robust with respect to departures from normality, perhaps even more so than most people realize.
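A pared-down version of this Monte Carlo design can be sketched as follows. The exponential distribution stands in here for a skewed parent (the study itself sweeps the Pearson system), and the 5% critical value 2.1009 is the tabled two-tail point of t with 18 df, so it is tied to n = 10 per group:

```python
import math
import random

def pooled_t(a, b):
    """Equal-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    return (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))

def empirical_size(n=10, reps=5000, seed=0):
    """Fraction of replicates in which the nominal 5% two-tail test
    rejects when both samples are exponential with the same mean."""
    rng = random.Random(seed)
    crit = 2.1009  # tabled two-tail 5% point of t with 2n - 2 = 18 df
    rejections = 0
    for _ in range(reps):
        a = [rng.expovariate(1.0) for _ in range(n)]
        b = [rng.expovariate(1.0) for _ in range(n)]
        rejections += abs(pooled_t(a, b)) >= crit
    return rejections / reps
```

An empirical size close to 0.05 under this strongly skewed parent illustrates the robustness the paper quantifies.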

8.
Fisher's method of combining independent tests is used to construct tests of means of multivariate normal populations when the covariance matrix has intraclass correlation structure. Monte Carlo studies are reported which show that the tests are more powerful than Hotelling's T²-test in both one- and two-sample situations.
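Fisher's combination statistic is −2 Σ ln pᵢ, referred to a chi-square distribution with 2k degrees of freedom. Since 2k is even, the chi-square survival function has a closed form, so the combined p-value needs no special-function library:

```python
import math

def fisher_combined_p(pvalues):
    """Fisher's method: combined p-value of k independent p-values."""
    x = -2.0 * sum(math.log(p) for p in pvalues)
    k = len(pvalues)
    # Survival function of chi-square with 2k df at x:
    # exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= (x / 2.0) / j
        total += term
    return math.exp(-x / 2.0) * total
```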

9.
For a given significance level α, Welch's approximate t-test for the Behrens–Fisher problem is modified to obtain a test with size α. A useful result for carrying out the Berger and Boos test is provided. Simulation results give power comparisons of several size-α tests.
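Welch's approximate t-test, the starting point of the modification, computes the following statistic and Welch–Satterthwaite degrees of freedom; this sketch shows the unmodified test, not the size-α adjustment:

```python
import math

def welch_t(a, b):
    """Welch's statistic and its Welch-Satterthwaite degrees of
    freedom (no pooled variance, so unequal variances are allowed)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    se2 = va / na + vb / nb
    t = (ma - mb) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df
```

With equal variances and equal sample sizes n, the df reduces to 2(n − 1), matching the ordinary two-sample t-test.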

10.
Maclean et al. (1976) applied a specific Box-Cox transformation to test for mixtures of distributions against a single distribution. Their null hypothesis is that a sample of n observations is from a normal distribution with unknown mean and variance after a restricted Box-Cox transformation. The alternative is that the sample is from a mixture of two normal distributions, each with unknown mean and unknown, but equal, variance after another restricted Box-Cox transformation. We developed a computer program that calculated the maximum likelihood estimates (MLEs) and likelihood ratio test (LRT) statistic for the above. Our algorithm for the calculation of the MLEs of the unknown parameters used multiple starting points to protect against convergence to a local rather than global maximum. We then simulated the distribution of the LRT for samples drawn from a normal distribution and five Box-Cox transformations of a normal distribution. The null distribution appeared to be the same for the Box-Cox transformations studied and appeared to be distributed as a chi-square random variable for samples of 25 or more. The degrees of freedom parameter appeared to be a monotonically decreasing function of the sample size. The null distribution of this LRT appeared to converge to a chi-square distribution with 2.5 degrees of freedom. We estimated the critical values for the 0.10, 0.05, and 0.01 levels of significance.

11.
A finite mixture model using the Student's t distribution has been recognized as a robust extension of normal mixtures. Recently, a mixture of skew normal distributions has been found to be effective in the treatment of heterogeneous data involving asymmetric behaviors across subclasses. In this article, we propose a robust mixture framework based on the skew t distribution to efficiently deal with heavy-tailedness, extra skewness and multimodality in a wide range of settings. Statistical mixture modeling based on normal, Student's t and skew normal distributions can be viewed as special cases of the skew t mixture model. We present analytically simple EM-type algorithms for iteratively computing maximum likelihood estimates. The proposed methodology is illustrated by analyzing a real data example.

12.
Control charts have been used effectively for years to monitor processes and detect abnormal behaviors. However, most control charts require a specific distribution to establish their control limits. The bootstrap method is a nonparametric technique that does not rely on the assumption of a parametric distribution of the observed data. Although the bootstrap technique has been used to develop univariate control charts to monitor a single process, no effort has been made to integrate the effectiveness of the bootstrap technique with multivariate control charts. In the present study, we propose a bootstrap-based multivariate T² control chart that can efficiently monitor a process when the distribution of observed data is nonnormal or unknown. A simulation study was conducted to evaluate the performance of the proposed control chart and compare it with a traditional Hotelling's T² control chart and the kernel density estimation (KDE)-based T² control chart. The results showed that the proposed chart performed better than the traditional T² control chart and performed comparably with the KDE-based T² control chart. Furthermore, we present a case study to demonstrate the applicability of the proposed control chart to real situations.
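A minimal two-variable sketch of the idea (illustrative, not the proposed chart's full design): compute each in-control observation's T² value, then set the upper control limit to the average, over bootstrap resamples, of the (1 − α) empirical quantile of the resampled T² values, with no normality assumption:

```python
import random

def t2_values(data):
    """Hotelling T^2 of each row of a list of (x, y) pairs, relative
    to the sample mean and (n-1)-divisor sample covariance."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    sxx = sum((x - mx) ** 2 for x, _ in data) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in data) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in data) / (n - 1)
    det = sxx * syy - sxy * sxy
    out = []
    for x, y in data:
        dx, dy = x - mx, y - my
        # quadratic form d' S^{-1} d with the 2x2 inverse written out
        out.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return out

def bootstrap_ucl(data, alpha=0.05, n_boot=500, seed=0):
    """Bootstrap upper control limit: average of the per-resample
    (1 - alpha) empirical quantiles of the T^2 values."""
    rng = random.Random(seed)
    q_index = int((1 - alpha) * (len(data) - 1))
    quantiles = []
    for _ in range(n_boot):
        sample = [rng.choice(data) for _ in data]
        quantiles.append(sorted(t2_values(sample))[q_index])
    return sum(quantiles) / n_boot
```

A useful sanity check: the T² values of the in-control data always sum to p(n − 1), here 2(n − 1).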

13.
A power study suggests that a good test-of-fit analysis for the binomial distribution is provided by a data-dependent Chernoff–Lehmann X² test with class expectations greater than unity, together with its components. These data-dependent statistics involve arithmetically simple parameter estimation and convenient approximate distributions, and provide a comprehensive assessment of how well the data agree with a binomial distribution. As a single, well-performing test-of-fit statistic, we suggest the Anderson–Darling statistic.

14.
A sequential probability ratio test (SPRT) of the mean of a normal distribution with unknown variance, based on an independent sequence of groups of observations, is investigated and its efficiency compared with that of the WAGR sequential t-test, which is based on an invariantly sufficient sequence of test statistics.

15.
The Levene test is a widely used test for detecting differences in dispersion. The modified Levene transformation using sample medians is considered in this article. After Levene's transformation the data are not normally distributed, hence, nonparametric tests may be useful. As the Wilcoxon rank sum test applied to the transformed data cannot control the type I error rate for asymmetric distributions, a permutation test based on reallocations of the original observations rather than the absolute deviations was investigated. Levene's transformation is then only an intermediate step to compute the test statistic. Such a Levene test, however, cannot control the type I error rate when the Wilcoxon statistic is used; with the Fisher–Pitman permutation test it can be extremely conservative. The Fisher–Pitman test based on reallocations of the transformed data seems to be the only acceptable nonparametric test. Simulation results indicate that this test is on average more powerful than applying the t test after Levene's transformation, even when the t test is improved by the deletion of structural zeros.
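The favored construction can be sketched as follows (illustrative code, not the article's): the original observations are reallocated between the groups, and Levene's transformation (absolute deviation from the group median) is recomputed inside each permutation, serving only as an intermediate step for the statistic:

```python
import random

def median(v):
    s = sorted(v)
    n = len(s)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2.0

def levene_stat(a, b):
    """Difference of group means of absolute deviations from the
    group medians (the modified Levene transformation)."""
    da = [abs(x - median(a)) for x in a]
    db = [abs(x - median(b)) for x in b]
    return sum(da) / len(da) - sum(db) / len(db)

def fisher_pitman_p(a, b, n_perm=2000, seed=0):
    """Permutation p-value: reallocate the ORIGINAL observations and
    recompute the Levene transformation within each permutation."""
    rng = random.Random(seed)
    obs = abs(levene_stat(a, b))
    pool = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pool)
        hits += abs(levene_stat(pool[:len(a)], pool[len(a):])) >= obs
    return hits / n_perm
```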

16.
The skew normal distribution is an alternative to the normal distribution that accommodates asymmetry. Extensive studies have since applied Azzalini’s skewness mechanism to other well-known distributions, producing, for example, the skew-t distribution, which is more flexible and can better accommodate long-tailed data than the skew normal. The Kumaraswamy generalized distribution (Kw-F) is another new class of distribution capable of fitting skewed data that cannot be fitted well by existing distributions. This distribution family has been widely studied, and various generalizations of it have been introduced. In this article, we introduce a new generalization of the skew-t distribution based on the Kumaraswamy generalized distribution. The new class of distribution, which we call the Kumaraswamy skew-t (KwST), can fit skewed, long-tailed, and heavy-tailed data and is more flexible than the skew-t distribution, which it contains as a special case. Related properties of this distribution family, such as mathematical properties, moments, and order statistics, are discussed. The proposed distribution is applied to a real dataset to illustrate the estimation procedure.
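For context, the Kumaraswamy generalized construction takes a baseline cdf G and adds two shape parameters; its standard form is

```latex
F_{\mathrm{Kw}\text{-}G}(x) \;=\; 1 - \bigl(1 - G(x)^{a}\bigr)^{b}, \qquad a, b > 0,
```

so the KwST case takes G to be the skew-t cdf, and a = b = 1 recovers G itself.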

17.
This study proposes a simple way to perform a power analysis of Mantel's test applied to squared Euclidean distance matrices. The general statistical aspects of the simple Mantel test are reviewed. The Monte Carlo method is used to generate bivariate Gaussian variables from which squared Euclidean distance matrices are created. The power of the parametric correlation t-test applied to the raw data is also evaluated and compared with that of Mantel's test. The standard procedure for calculating power levels at individual points is used for validation. The proposed procedure allows one to draw the power curve by running the test only once, dispensing with the time-demanding standard procedure of Monte Carlo simulations. Unlike the standard procedure, it does not depend on knowledge of the distribution of the raw data. The simulated power function has all the properties required by power analysis theory and agrees with the results of the standard procedure.
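The simple Mantel test itself is compact: the statistic is the Pearson correlation of the upper triangles of the two distance matrices, and the reference distribution comes from permuting the objects (rows and columns simultaneously) of one matrix. A bare-bones sketch, not the proposed power-analysis procedure:

```python
import math
import random

def upper(m, order=None):
    """Upper-triangle entries of a square matrix, optionally after
    relabeling the objects by a permutation `order`."""
    n = len(m)
    idx = order if order is not None else list(range(n))
    return [m[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]

def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    su = math.sqrt(sum((x - mu) ** 2 for x in u))
    sv = math.sqrt(sum((y - mv) ** 2 for y in v))
    return sum((x - mu) * (y - mv) for x, y in zip(u, v)) / (su * sv)

def mantel_p(m1, m2, n_perm=999, seed=0):
    """Simple Mantel test: observed r and two-sided permutation
    p-value (with the +1 correction counting the observed labeling)."""
    r_obs = pearson(upper(m1), upper(m2))
    rng = random.Random(seed)
    order = list(range(len(m1)))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        hits += abs(pearson(upper(m1), upper(m2, order))) >= abs(r_obs)
    return r_obs, (hits + 1) / (n_perm + 1)
```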

18.
Use of the MVUE for the inverse Gaussian distribution has been recently proposed by Nguyen and Dinh [Nguyen, T. T., Dinh, K. T. (2003). Exact EDF goodness-of-fit tests for inverse Gaussian distributions. Comm. Statist. (Simulation and Computation) 32(2):505–516], where a sequential application based on Rosenblatt's transformation [Rosenblatt, M. (1952). Remarks on a multivariate transformation. Ann. Math. Statist. 23:470–472] led the authors to solve the composite goodness-of-fit problem by solving the surrogate simple goodness-of-fit problem of testing uniformity of the independent transformed variables. In this note, we observe first that the proposal is not new, since it was made in a rather general setting in O'Reilly and Quesenberry [O'Reilly, F., Quesenberry, C. P. (1973). The conditional probability integral transformation and applications to obtain composite chi-square goodness-of-fit tests. Ann. Statist. 1:74–83]. It is shown, on the other hand, that the results in the paper of Nguyen and Dinh (2003) are incorrect in their Sec. 4, especially the Monte Carlo figures reported. Power simulations are provided here comparing these corrected results with two previously reported goodness-of-fit tests for the inverse Gaussian: the modified Kolmogorov–Smirnov test in Edgeman et al. [Edgeman, R. L., Scott, R. C., Pavur, R. J. (1988). A modified Kolmogorov–Smirnov test for inverse Gaussian distribution with unknown parameters. Comm. Statist. 17(B):1203–1212] and the A²-based method in O'Reilly and Rueda [O'Reilly, F., Rueda, R. (1992). Goodness of fit for the inverse Gaussian distribution. Can. J. Statist. 20(4):387–397]. The results show clearly that there is a large loss of power in the method explored in Nguyen and Dinh (2003) due to an implicit exogenous randomization.

19.
In this article, we study the power of one-sample location tests under classical distributions and two supermodels that include the normal distribution as a special case. The distributions of the supermodels are chosen so that they lie at the same distance from the normal as the logistic, uniform, double exponential, and Cauchy distributions, respectively; as the measure of distance we use the Lévy metric. The tests considered are two parametric tests, the t-test and a trimmed t-test, and two nonparametric tests, the sign test and the Wilcoxon signed-rank test. It turns out that the power of the tests depends not on the Lévy distance but on the particular supermodel chosen.

20.
The distribution of the sample correlation coefficient is derived when the population is a mixture of two bivariate normal distributions with zero mean but different covariances and mixing proportions 1 - λ and λ respectively; λ will be called the proportion of contamination. The test of ρ = 0 based on Student's t, Fisher's z, arcsine, or Ruben's transformation is shown numerically to be nonrobust when λ, the proportion of contamination, lies between 0.05 and 0.50 and the contaminated population has 9 times the variance of the standard (bivariate normal) population. These tests are also sensitive to the presence of outliers.
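The contamination model behind the numerical study can be sketched as follows (settings illustrative): with probability λ a pair is drawn from a component with 9 times the variance, both components being uncorrelated, and the Fisher z statistic √(n − 3)·atanh(r) is then referred to N(0, 1):

```python
import math
import random

def contaminated_sample(n, lam, seed=0):
    """Pairs from a scale mixture: with probability lam, both
    coordinates are scaled by 3 (variance 9); correlation is zero in
    each component."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        s = 3.0 if rng.random() < lam else 1.0
        out.append((s * rng.gauss(0, 1), s * rng.gauss(0, 1)))
    return out

def fisher_z_stat(pairs):
    """sqrt(n - 3) * atanh(r); approximately N(0, 1) for rho = 0
    under bivariate normality, but not under contamination."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    r = sxy / math.sqrt(sxx * syy)
    return math.sqrt(n - 3) * math.atanh(r)
```

Repeating the draw-and-test cycle and counting rejections of |z| > 1.96 reproduces the kind of size inflation the paper reports.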


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号