1.

A comparative study of the K-means algorithm and the normal mixture model for clustering: Bivariate homoscedastic case

Dingxi Qiu 《Journal of statistical planning and inference》2010

The K-means algorithm and the normal mixture model method are two common clustering methods. The K-means algorithm is a popular heuristic approach which gives reasonable clustering results if the component clusters are ball-shaped. Currently, there are no analytical results for this algorithm if the component distributions deviate from the ball shape. This paper analytically studies how the K-means algorithm changes its classification rule as the normal component distributions become more elongated under the homoscedastic assumption and compares this rule with the Bayes rule from the mixture model method. We show that the classification rules of both methods are linear, but the slopes of the two classification lines change in opposite directions as the component distributions become more elongated. The classification performance of the K-means algorithm is then compared to that of the mixture model method via simulation. The comparison, which is limited to two clusters, shows that the K-means algorithm consistently provides poor classification performance as the component distributions become more elongated, while the mixture model method can potentially, but not necessarily, take advantage of this change and provide a much better classification performance.
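The contrast described in this abstract can be illustrated with a small simulation. The following is only a sketch under assumed settings, not the paper's simulation design: the sample size, the standard deviation of 6 along the long axis, the mean separation, and the function names `simulate`, `two_means`, and `error_rate` are all hypothetical. The two components share the covariance diag(36, 1) and differ only in their y-means, so the Bayes rule with known parameters and equal priors is the line y = 0.

```python
import math
import random

def simulate(n_per_class, sd_long, half_gap, rng):
    """Two homoscedastic bivariate normal components, elongated along x,
    with means (0, -half_gap) and (0, +half_gap) and unit variance in y."""
    points, labels = [], []
    for label, mean_y in ((0, -half_gap), (1, half_gap)):
        for _ in range(n_per_class):
            points.append((rng.gauss(0.0, sd_long), rng.gauss(mean_y, 1.0)))
            labels.append(label)
    return points, labels

def two_means(points, rng, restarts=5, max_iter=100):
    """Lloyd's algorithm with K = 2; returns the assignment from the restart
    with the lowest within-cluster sum of squares."""
    best_assign, best_cost = None, math.inf
    for _ in range(restarts):
        centers = rng.sample(points, 2)  # two distinct data points as starts
        assign = None
        for _ in range(max_iter):
            new_assign = [
                0 if (x - centers[0][0]) ** 2 + (y - centers[0][1]) ** 2
                <= (x - centers[1][0]) ** 2 + (y - centers[1][1]) ** 2 else 1
                for x, y in points
            ]
            if new_assign == assign:  # converged: assignments stopped changing
                break
            assign = new_assign
            for k in (0, 1):
                members = [p for p, a in zip(points, assign) if a == k]
                if members:
                    centers[k] = (
                        sum(p[0] for p in members) / len(members),
                        sum(p[1] for p in members) / len(members),
                    )
        cost = sum(
            (x - centers[a][0]) ** 2 + (y - centers[a][1]) ** 2
            for (x, y), a in zip(points, assign)
        )
        if cost < best_cost:
            best_assign, best_cost = assign, cost
    return best_assign

def error_rate(pred, truth):
    """Misclassification rate, minimized over the two possible labelings
    because cluster labels are arbitrary."""
    wrong = sum(p != t for p, t in zip(pred, truth)) / len(truth)
    return min(wrong, 1.0 - wrong)

rng = random.Random(0)
points, labels = simulate(n_per_class=500, sd_long=6.0, half_gap=1.5, rng=rng)

# Bayes rule for known parameters and equal priors: classify by the sign of y.
bayes_pred = [1 if y > 0 else 0 for _, y in points]
kmeans_pred = two_means(points, rng)

bayes_err = error_rate(bayes_pred, labels)
kmeans_err = error_rate(kmeans_pred, labels)
print(f"Bayes rule error: {bayes_err:.3f}")
print(f"K-means error:    {kmeans_err:.3f}")
```

In this deliberately extreme configuration K-means tends to split the data across the long axis, which carries almost no information about component membership, while the Bayes rule separates along the short axis. A fitted mixture model would have to estimate the parameters first, which is why the abstract says it can "potentially, but not necessarily" benefit.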

2.

On the comparison of the pre-test and shrinkage estimators for the univariate normal mean (Total citations: 1; self-citations: 1; citations by others: 0)

Shahjahan Khan A. K. Md. E. Saleh 《Statistical Papers》2001,42(4):451-473

The estimation of the mean of a univariate normal population with unknown variance is considered when uncertain non-sample prior information is available. Alternative estimators are defined to incorporate both the sample and the non-sample information in the estimation process. Some of the important statistical properties of the restricted, preliminary test, and shrinkage estimators are investigated. The performances of the estimators are compared based on the criteria of unbiasedness and mean square error in order to search for a ‘best’ estimator. Both analytical and graphical methods are explored. There is no superior estimator that uniformly dominates the others. However, if the non-sample information regarding the value of the mean is close to its true value, the shrinkage estimator outperforms the rest of the estimators. Received: June 19, 1999; revised version: March 23, 2000

3.

Consistent estimation of the minimum normal mean under the tree-order restriction

Sanjay Chaudhuri Michael D. Perlman 《Journal of statistical planning and inference》2007

4.

Partitioning k multivariate normal populations according to equivalence with respect to a standard vector

Weixing Cai Pinyuen Chen 《Journal of statistical planning and inference》2009

We propose optimal procedures to achieve the goal of partitioning k multivariate normal populations into two disjoint subsets with respect to a given standard vector. Good and bad multivariate normal populations are defined according to whether their Mahalanobis distances to a known standard vector are small or large. Partitioning k multivariate normal populations is reduced to partitioning k non-central chi-square or non-central F distributions with respect to the corresponding non-centrality parameters, depending on whether the covariance matrices are known or unknown. The minimum required sample size for each population is determined to ensure that the probability of a correct decision attains a certain level. An example is given to illustrate our procedures.
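The reduction mentioned in this abstract rests on standard distribution theory, sketched below. The notation is assumed here and is not taken from the paper: population i is N_p(μ_i, Σ_i), μ_0 is the standard vector, and X̄_i and S_i are the sample mean and the sample covariance matrix (divisor n − 1) from n observations. The statistics the authors actually use may differ in detail.

```latex
% Squared Mahalanobis distance of population i to the standard vector
\Delta_i^2 = (\mu_i - \mu_0)^\top \Sigma_i^{-1} (\mu_i - \mu_0)

% Known covariance matrix: non-central chi-square
n\,(\bar{X}_i - \mu_0)^\top \Sigma_i^{-1} (\bar{X}_i - \mu_0)
  \sim \chi^2_p\!\left(n \Delta_i^2\right)

% Unknown covariance matrix: Hotelling's T^2, non-central F
T_i^2 = n\,(\bar{X}_i - \mu_0)^\top S_i^{-1} (\bar{X}_i - \mu_0),
\qquad
\frac{n - p}{p\,(n - 1)}\, T_i^2 \sim F_{p,\,n-p}\!\left(n \Delta_i^2\right)
```

In both cases the non-centrality parameter is n Δ_i², so ordering populations by Mahalanobis distance amounts to ordering them by non-centrality.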

5.

On testing a class of restricted hypotheses

Céline Delmas Jean-Louis Foulley 《Journal of statistical planning and inference》2007

6.

A p-value for testing the equivalence of the variances of a bivariate normal distribution

Thomas Mathew Gitanjali Paul 《Journal of statistical planning and inference》2008

A p-value is developed for testing the equivalence of the variances of a bivariate normal distribution. The unknown correlation coefficient is a nuisance parameter in the problem. If the correlation is known, the proposed p-value provides an exact test. For large samples, the p-value can be computed by replacing the unknown correlation by the sample correlation, and the resulting test is quite satisfactory. For small samples, it is proposed to compute the p-value by replacing the unknown correlation by a scalar multiple of the sample correlation. However, a single scalar is not satisfactory, and it is proposed to use different scalars depending on the magnitude of the sample correlation coefficient. In order to implement this approach, tables are obtained providing sub-intervals for the sample correlation coefficient, and the scalars to be used if the sample correlation coefficient belongs to a particular sub-interval. Once such tables are available, the proposed p-value is quite easy to compute since it has an explicit analytic expression. Numerical results on the type I error probability and power of such a test are reported, and the proposed p-value test is also compared to another test based on a rejection region. The results are illustrated with two examples: an example dealing with the comparability of two measuring devices, and an example dealing with the assessment of bioequivalence.

7.

Canonical correlation analysis for the vector AR(1) model with ARCH innovations

Ruey S. Tsay Shiqing Ling 《Journal of statistical planning and inference》2008

This paper extends the results of canonical correlation analysis of Anderson [2002. Canonical correlation analysis and reduced-rank regression in autoregressive models. Ann. Statist. 30, 1134–1154] to a vector AR(1) process with vector ARCH(1) innovations. We obtain the limiting distributions of the sample matrices, the canonical correlations and the canonical vectors of the process. The extension is important because many time series in economics and finance exhibit conditional heteroscedasticity. We also use simulation to demonstrate the effects of ARCH innovations on the canonical correlation analysis in finite samples. Both the limiting distributions and simulation results show that overlooking the ARCH effects in canonical correlation analysis can easily lead to erroneous inference.

8.

A mixture of generalized hyperbolic distributions (Total citations: 1; self-citations: 0; citations by others: 1)

Ryan P. Browne Paul D. McNicholas 《Revue canadienne de statistique》2015,43(2):176-198

9.

Asymptotic properties of the MAMSE adaptive likelihood weights

Jean-François Plante 《Journal of statistical planning and inference》2009

The weighted likelihood is a generalization of the likelihood designed to borrow strength from similar populations while making minimal assumptions. If the weights are properly chosen, the maximum weighted likelihood estimate may perform better than the maximum likelihood estimate (MLE). In a previous article, the minimum averaged mean squared error (MAMSE) weights were proposed, and simulations showed that they can outperform the MLE in many cases. In this paper, we study the asymptotic properties of the MAMSE weights. In particular, we prove that the MAMSE-weighted mixture of empirical distribution functions converges uniformly to the target distribution and that the maximum weighted likelihood estimate is strongly consistent. A short simulation illustrates the use of the bootstrap in this context.

10.

Log-linear models for mutations in the HIV genome

C. Ahn G.G. Koch L. Paynter J.S. Preisser F. Seillier-Moiseiwitsch 《Journal of statistical planning and inference》2007

We discuss a general application of categorical data analysis to mutations along the HIV genome. We consider a multidimensional table for several positions at the same time. Due to the complexity of the multidimensional table, we may collapse it by pooling some categories. However, the association between the remaining variables may not be the same as before collapsing. We discuss the collapsibility of tables and the change in the meaning of parameters after collapsing categories. We also address this problem with a log-linear model. We present a parameterization with the consensus output as the reference cell as is appropriate to explain genomic mutations in HIV. We also consider five null hypotheses and some classical methods to address them. We illustrate methods for six positions along the HIV genome, through consideration of all triples of positions.
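The point that association can change after collapsing is easy to see in the extreme case where all categories of a third variable are pooled. The sketch below uses made-up counts, not data from the paper, and the helper `odds_ratio` is hypothetical. Within each level of the third variable the two remaining variables are unassociated (odds ratio 1), yet the collapsed table shows an association.

```python
def odds_ratio(table):
    """Odds ratio (a*d)/(b*c) of a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)

# Hypothetical 2x2 tables for two levels of a third variable.
stratum_1 = [[10, 20], [20, 40]]
stratum_2 = [[40, 10], [20, 5]]

# Collapse over the third variable by adding the tables cell by cell.
collapsed = [
    [x + y for x, y in zip(row_1, row_2)]
    for row_1, row_2 in zip(stratum_1, stratum_2)
]

print(odds_ratio(stratum_1))   # conditional association, level 1
print(odds_ratio(stratum_2))   # conditional association, level 2
print(collapsed, odds_ratio(collapsed))  # marginal association
```

Here the conditional odds ratios are both 1 while the marginal odds ratio is 1.875, so this table is not collapsible over the third variable. In log-linear terms, the two-way interaction parameter changes its meaning once the third variable is pooled away.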

11.

On the ABLUE of the normal mean from a censored sample

Smiley W. Cheng 《Journal of statistical planning and inference》1980,4(3):259-265

The asymptotically best linear unbiased estimate (ABLUE) of the normal mean is discussed. The estimate is based on k selected order statistics chosen from a singly or doubly censored large sample of size n(>k). The coefficients, the asymptotic relative efficiency of the estimate, and the optimum spacing of k real numbers between 0 and 1 which determines the optimum ranks of order statistics, are provided. A comparison between the ABLUE and the iterated maximum likelihood estimate is made.

12.

Asymptotic study of the multivariate functional model in the case of a random number of observations for each mean

Jeanne Fine 《Statistics》2013,47(4):285-306

13.

Improving on the sample covariance matrix for a complex elliptically contoured distribution

Yoshihiko Konno 《Journal of statistical planning and inference》2007

In this paper the problem of estimating the scale matrix in a complex elliptically contoured distribution (complex ECD) is addressed. An extended Haff–Stein identity for this model is derived. It is shown that the minimax estimators of the covariance matrix obtained under the complex normal model remain robust under the complex ECD model when the Stein loss function is employed.

14.

Review: Reversed low-rank ANOVA model for transforming high dimensional genetic data into low dimension

Yoonsuh Jung Jianhua Hu 《Journal of the Korean Statistical Society》2019,48(2):169-178

A general modeling procedure for analyzing genetic data is reviewed. We review an ANOVA-type model that can handle both continuous and discrete genetic variables in one modeling framework. Unlike regression-type models, which typically set the phenotype variable as the response, this ANOVA model treats the phenotype variable as an explanatory variable. By reversing the role of the phenotype variable, the usual high-dimensional problem is turned into a low-dimensional one. Instead, the ANOVA model always includes an interaction term between the genetic locations and the phenotype variable to find potential associations between them. The interaction term is designed to be low rank, as a product of bilinear terms, so that the required number of parameters is kept manageable. We compare the performance of the reviewed ANOVA model to other popular methods on microarray and SNP data sets.

15.

A class of Shrunken estimators for normal mean

V.K. Srivastava 《Journal of statistical planning and inference》1982,6(3):297-299

This paper presents a class of estimators for the mean of a normal population and determines the conditions on the characterizing scalars under which the class of estimators uniformly dominates the conventional sample mean according to the mean-square-error criterion.

16.

On some entropy and divergence type measures of variability and dependence for mixed continuous and discrete variables

K. Zografos 《Journal of statistical planning and inference》2008

17.

A set of independent sequential residuals for the multivariate regression model

Joaquín Diaz Federico J. O'Reilly Santiago Rincon-Gallardo 《Journal of statistical planning and inference》1983,8(1):21-25

In this paper a set of residuals for the multivariate linear regression model is introduced. These residuals are shown to be independent with known distributions which do not depend on the parameters of the model. Transformations of the mentioned residuals may be used to construct exact α goodness-of-fit tests for the multivariate regression model.

18.

Regularization and selection in Gaussian mixture of autoregressive models

Abbas Khalili Jiahua Chen David A. Stephens 《Revue canadienne de statistique》2017,45(4):356-374

19.

R. Van de Ven N. C. Weber 《Statistics》2013,47(3-4):345-352

Upper and lower bounds are obtained for the mean of the negative binomial distribution. These bounds are simple functions of a percentile determined by the shape parameter. The result is then used to obtain a robust estimate of the mean when the shape parameter is known.

20.

A Kolmogorov–Smirnov type test for skew normal distributions based on the empirical moment generating function

Simos G. Meintanis 《Journal of statistical planning and inference》2007

In this paper hypothesis tests are constructed for the family of skew normal distributions. The proposed tests utilize the fact that the moment generating function of the skew normal variable satisfies a simple differential equation. The empirical counterpart of this equation, involving the empirical moment generating function, yields simple consistent test statistics. Finite-sample results as well as results from real data are provided for the proposed procedures.
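The abstract does not state the differential equation. As a sketch, one equation of this kind follows directly from the standard-form skew normal moment generating function M(t) = 2 exp(t²/2) Φ(δt), with δ = λ/√(1 + λ²): differentiating gives M′(t) − t M(t) = δ √(2/π) exp((1 − δ²) t²/2). Whether this is the exact form used in the paper is an assumption, and the function names below are hypothetical. The code checks the identity numerically with a central difference.

```python
import math

def std_normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def skew_normal_mgf(t, lam):
    """MGF of the standard skew normal with shape parameter lam:
    M(t) = 2 * exp(t^2 / 2) * Phi(delta * t)."""
    delta = lam / math.sqrt(1.0 + lam * lam)
    return 2.0 * math.exp(0.5 * t * t) * std_normal_cdf(delta * t)

def ode_residual(t, lam, h=1e-5):
    """M'(t) - t*M(t) - delta*sqrt(2/pi)*exp((1 - delta^2) * t^2 / 2),
    with M'(t) approximated by a central difference of step h."""
    delta = lam / math.sqrt(1.0 + lam * lam)
    derivative = (skew_normal_mgf(t + h, lam) - skew_normal_mgf(t - h, lam)) / (2.0 * h)
    rhs = delta * math.sqrt(2.0 / math.pi) * math.exp(0.5 * (1.0 - delta * delta) * t * t)
    return derivative - t * skew_normal_mgf(t, lam) - rhs

# The residual should vanish up to numerical differentiation error.
for t in (-1.0, 0.0, 0.7):
    print(t, ode_residual(t, lam=2.0))
```

Replacing M by the empirical moment generating function, the sample average of exp(tX), turns the left-hand side into a quantity that should be close to the right-hand side under the null hypothesis, which is the idea behind the test statistics the abstract describes.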