Similar Literature
20 similar documents found.
1.
Statistics are developed for predicting the effect of data transformations on the F statistic when the assumptions of homoscedasticity and normality underlying the ANOVA are not necessarily satisfied. These statistics are useful for determining whether and how to transform. They are developed by partitioning the change in the observed value of the F statistic under the transformation into two expressions, one of which depends on the "truth" of H0 while the other does not. Using this partition, desirable properties are derived for transformations. Criteria are developed defining transformations which tend to preserve the Type I error while increasing power when needed. Using these criteria, the notion of model robustness is introduced. It is shown that the Box-Cox methodology for selecting a power transform may, under certain conditions, produce a transformation which does not permit inferences to be made about the parent population from the transformed population. An alternative approach suggested here does permit such inferences.
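As a concrete illustration of how a power transformation can change the observed F statistic, the sketch below compares a one-way ANOVA before and after a Box-Cox transformation on simulated skewed, heteroscedastic data. It shows the general phenomenon only; it is not the paper's partition of the F statistic, and the data are invented.

```python
# Sketch: one-way ANOVA F statistic before vs. after a Box-Cox transformation
# on simulated right-skewed data (illustration only, not the paper's method).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Three groups with lognormal (non-normal, heteroscedastic) responses
groups = [rng.lognormal(mean=m, sigma=0.8, size=30) for m in (0.0, 0.3, 0.6)]

F_raw, p_raw = stats.f_oneway(*groups)

# Estimate one Box-Cox lambda from the pooled sample, then transform each group
pooled = np.concatenate(groups)
_, lam = stats.boxcox(pooled)
transformed = [stats.boxcox(g, lmbda=lam) for g in groups]
F_tr, p_tr = stats.f_oneway(*transformed)

print(f"lambda = {lam:.2f}")
print(f"F before: {F_raw:.2f} (p = {p_raw:.3f}); F after: {F_tr:.2f} (p = {p_tr:.3f})")
```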

2.
A general method is proposed by which nonnormally distributed data can be transformed to achieve approximate normality. The method uses an empirical nonlinear data-fitting approach and can be applied to a broad class of transformations including the Box-Cox, arcsine, generalized logit, and Weibull-type transformations. It is easy to implement using standard statistical software packages. Several examples are provided.
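One way to realize such an empirical data-fitting approach, sketched below under the assumption of a Box-Cox family, is to choose the exponent that makes the standardized order statistics track the corresponding standard normal quantiles as closely as possible. This is an illustration of the general idea, not the authors' exact algorithm.

```python
# Sketch: pick a Box-Cox exponent by nonlinear fitting of the sample quantiles
# to standard normal quantiles (illustrative; not the authors' algorithm).
import numpy as np
from scipy import stats, optimize

def boxcox(y, lam):
    return np.log(y) if abs(lam) < 1e-8 else (y**lam - 1.0) / lam

def quantile_misfit(lam, y):
    z = boxcox(y, lam)
    z = (np.sort(z) - z.mean()) / z.std(ddof=1)
    n = len(z)
    # Normal quantiles at the usual plotting positions (i - 0.5) / n
    q = stats.norm.ppf((np.arange(1, n + 1) - 0.5) / n)
    return np.sum((z - q) ** 2)

rng = np.random.default_rng(1)
y = rng.gamma(shape=2.0, scale=3.0, size=200)          # skewed positive data
res = optimize.minimize_scalar(quantile_misfit, bounds=(-2, 2),
                               args=(y,), method="bounded")
print(f"fitted lambda = {res.x:.2f}")
```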

3.
Maclean et al. (1976) applied a specific Box-Cox transformation to test for mixtures of distributions against a single distribution. Their null hypothesis is that a sample of n observations is from a normal distribution with unknown mean and variance after a restricted Box-Cox transformation. The alternative is that the sample is from a mixture of two normal distributions, each with unknown mean and unknown, but equal, variance after another restricted Box-Cox transformation. We developed a computer program that calculated the maximum likelihood estimates (MLEs) and likelihood ratio test (LRT) statistic for the above. Our algorithm for the calculation of the MLEs of the unknown parameters used multiple starting points to protect against convergence to a local rather than global maximum. We then simulated the distribution of the LRT for samples drawn from a normal distribution and five Box-Cox transformations of a normal distribution. The null distribution appeared to be the same for the Box-Cox transformations studied and appeared to be distributed as a chi-square random variable for samples of 25 or more. The degrees of freedom parameter appeared to be a monotonically decreasing function of the sample size. The null distribution of this LRT appeared to converge to a chi-square distribution with 2.5 degrees of freedom. We estimated the critical values for the 0.10, 0.05, and 0.01 levels of significance.
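A simplified version of this test can be sketched with off-the-shelf tools: apply a single maximum likelihood Box-Cox transformation, then compare a one-component normal fit against a two-component, equal-variance mixture using a likelihood ratio statistic, with multiple random starts to avoid local maxima. The sketch below is not the authors' program (which re-estimates the transformation under each hypothesis); names and settings are illustrative.

```python
# Sketch: LRT of one normal vs. a two-component, equal-variance normal mixture
# after a Box-Cox transformation; multiple starts (n_init) guard against
# local maxima.  Simplified illustration, not the authors' own program.
import numpy as np
from scipy import stats
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
y = rng.lognormal(mean=1.0, sigma=0.5, size=100)

x = stats.boxcox(y)[0].reshape(-1, 1)       # MLE Box-Cox transform

# Null: single normal.  Alternative: two components with a shared ("tied") variance.
ll0 = GaussianMixture(n_components=1).fit(x).score(x) * len(x)
gm1 = GaussianMixture(n_components=2, covariance_type="tied",
                      n_init=20, random_state=0).fit(x)
ll1 = gm1.score(x) * len(x)

lrt = 2.0 * (ll1 - ll0)
# The abstract suggests referring the LRT to roughly a chi-square with ~2.5 df
print(f"LRT = {lrt:.2f}, approx p = {1 - stats.chi2.cdf(lrt, df=2.5):.3f}")
```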

4.
This paper studies four methods for estimating the Box-Cox parameter used to transform data to normality. Three of these are based on optimizing test statistics for standard normality tests (the Shapiro-Wilk, skewness, and kurtosis tests); the fourth uses the maximum likelihood estimator of the Box-Cox parameter. The four methods are compared and evaluated with a simulation study, where their performances under different skewness and kurtosis conditions are analyzed. The estimator based on optimizing the Shapiro-Wilk statistic generally gives rise to the best transformations, while the maximum likelihood estimator performs almost as well. Estimators based on optimizing skewness and kurtosis do not perform well in general.
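Two of the estimators compared above can be reproduced in a few lines: the Box-Cox lambda that maximizes the Shapiro-Wilk W statistic, and the usual maximum likelihood lambda. The following sketch uses simulated data and standard SciPy routines; it mirrors the idea rather than the paper's simulation design.

```python
# Sketch: Box-Cox lambda chosen by maximizing the Shapiro-Wilk W statistic,
# compared with the maximum likelihood lambda (simulated data, illustration only).
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(3)
y = rng.weibull(a=1.5, size=150) * 10.0 + 0.1   # positive, skewed sample

def neg_shapiro_W(lam):
    return -stats.shapiro(stats.boxcox(y, lmbda=lam)).statistic

res = optimize.minimize_scalar(neg_shapiro_W, bounds=(-2, 2), method="bounded")
lam_sw = res.x
_, lam_mle = stats.boxcox(y)                     # maximum likelihood lambda

print(f"Shapiro-Wilk-optimal lambda: {lam_sw:.2f}")
print(f"Maximum likelihood lambda:   {lam_mle:.2f}")
```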

5.
The Box-Cox power family of transformations for multivariate regression data is considered. The influence of cases on the maximum likelihood estimators of the transformation parameters is investigated using the local influence approach. An example is given to illustrate the local influence method and to show the effectiveness of the method.

6.
In this paper, we consider a Bayesian analysis of the unbalanced (general) growth curve model with AR(1) autoregressive dependence, while applying the Box-Cox power transformations. We propose exact, simple, and Markov chain Monte Carlo approximate parameter estimation and prediction of future values. Numerical results are illustrated with real and simulated data.

7.
This article describes estimation and inference procedures for the parameters of the Box-Cox and folded-power transformations in repeated measures and growth curve models. Procedures for computing maximum likelihood estimates of the transformation and covariance parameters under several covariance structures (omnibus sphericity, local sphericity, and unstructured) are described. Lack-of-fit statistics and hypothesis tests for comparing these structures also are described. The procedures are illustrated on three data sets. Software for performing the analyses in the SAS System is described and is available from the authors.

8.
It is desirable that the data for a statistical control chart be normally distributed. However, if the data are not normal, then a transformation can be used, e.g. Box-Cox transformations, to produce a suitable control chart. In this paper we will discuss a quantile approach to produce a control chart and to estimate the median rankit for various non-normal distributions. We will also provide examples of logistic data to indicate how a quantile approach could be used to construct a control chart for a non-normal distribution using a median rankit.
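The two routes mentioned above can be sketched as follows: set control limits on a Box-Cox-transformed scale and map them back, or take extreme empirical quantiles directly. This is a minimal illustration on simulated logistic data; it does not reproduce the authors' median-rankit estimation.

```python
# Sketch: individuals control chart for non-normal data via (a) Box-Cox
# transformation with back-transformed 3-sigma limits and (b) direct quantiles.
# Illustration only; not the authors' median-rankit procedure.
import numpy as np
from scipy import stats
from scipy.special import inv_boxcox

rng = np.random.default_rng(4)
x = rng.logistic(loc=10.0, scale=1.0, size=200)
x = x[x > 0]                                    # Box-Cox needs positive data

z, lam = stats.boxcox(x)
center_z, sd_z = z.mean(), z.std(ddof=1)
lcl = inv_boxcox(center_z - 3 * sd_z, lam)
ucl = inv_boxcox(center_z + 3 * sd_z, lam)
print(f"Box-Cox chart: LCL = {lcl:.2f}, CL = {inv_boxcox(center_z, lam):.2f}, UCL = {ucl:.2f}")

# Quantile approach: use the 0.00135 and 0.99865 empirical quantiles directly
q_lcl, q_med, q_ucl = np.quantile(x, [0.00135, 0.5, 0.99865])
print(f"Quantile chart: LCL = {q_lcl:.2f}, median = {q_med:.2f}, UCL = {q_ucl:.2f}")
```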

9.
The behavior of the Box-Cox estimate of power transformation is further examined. Through the asymptotic expansions and small-σ approximations, the exact nature of dependence of transformation estimation on the model structure, the spread of the means and the error variance is revealed. The results are shown to be useful in assessing what Box and Cox called the transformation potential of a particular data set.

10.
Data on the weights and heights of children 2-18 years old in Iran were obtained in a National Health Survey of 10 660 families in 1990-92. Data were 'cleaned' in 1-year age groups. After excluding gross outliers by inspection of bivariate scatter plots, Box-Cox power transformations were used to normalize the distributions of height and weight. If a multivariate Box-Cox power transformation to normality exists, then it is equivalent to normalizing the data variable by variable. After excluding gross outliers, exclusions based on the Mahalanobis distance were almost identical to those identified by Hadi's iterative procedure, because the percentages of outliers were small. In all, 1% of the observations were gross outliers and a further 0.4% were identified by multivariate analysis. Review of records showed that the outliers identified by multivariate analysis resulted from data-processing errors. After transformation and 'cleaning', the data quality was excellent and suitable for the construction of growth charts.
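The multivariate screening step described above, Box-Cox normalization variable by variable followed by a Mahalanobis-distance cutoff, can be sketched as follows. Simulated height and weight values stand in for the survey data; the cutoff and variable names are illustrative.

```python
# Sketch: Box-Cox-normalize each variable, then flag observations whose squared
# Mahalanobis distance exceeds a chi-square cutoff (simulated stand-in data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n = 1000
height = rng.lognormal(mean=4.9, sigma=0.07, size=n)                # cm, skewed
weight = np.exp(0.02 * height) * rng.lognormal(0.0, 0.1, size=n)    # kg, related to height

# Variable-by-variable Box-Cox normalization
X = np.column_stack([stats.boxcox(height)[0], stats.boxcox(weight)[0]])

mean = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
d2 = np.einsum("ij,jk,ik->i", X - mean, cov_inv, X - mean)   # squared distances

cutoff = stats.chi2.ppf(0.999, df=X.shape[1])
outliers = np.flatnonzero(d2 > cutoff)
print(f"{len(outliers)} of {n} observations flagged as multivariate outliers")
```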

11.
We develop semiparametric and parametric transformation models for estimation and comparison of ROC curves derived from measurements from two diagnostic tests on the same subjects. We assume the existence of transformed measurement scales, one for each test, on which the paired measurements have bivariate normal distributions. The resulting pair of ROC curves are estimated by maximum likelihood algorithms, using joint rank data in the semiparametric model with unspecified transformations and using Box-Cox transformations in the parametric transformation case. Several hypothesis tests for comparing the two ROC curves, or characteristics of them, are developed. Two clinical examples are presented and simulation results are provided.

12.
The emphasis in the literature is on normalizing transformations, despite the greater importance of the homogeneity of variance in analysis. A strategy for a choice of variance-stabilizing transformation is suggested. The relevant component of variation must be identified and, when this is not within-subject variation, a major explanatory variable must also be selected to subdivide the data. A plot of group standard deviation against group mean, or log standard deviation against log mean, may identify a simple power transformation or shifted log transformation. In other cases, within the shifted Box-Cox family of transformations, a contour plot to show the region of minimum heterogeneity defined by an appropriate index is proposed to enable an informed choice of transformation. If used in conjunction with the maximum-likelihood contour plot for the normalizing transformation, then it is possible to assess whether or not there exists a transformation that satisfies both criteria.
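The log SD versus log mean diagnostic mentioned above has a simple numerical counterpart: if the fitted slope is b, a power transformation with exponent roughly 1 - b (or a log transformation when b is near 1) tends to stabilize the variance. A minimal sketch on invented grouped data:

```python
# Sketch: regress group log SD on group log mean; slope b suggests a
# variance-stabilizing power transform y**(1 - b), or log when b is near 1.
import numpy as np

rng = np.random.default_rng(6)
group_means = np.array([5.0, 10.0, 20.0, 40.0, 80.0])
# Within-group SD grows roughly in proportion to the mean
groups = [m + rng.normal(0.0, 0.25 * m, size=40) for m in group_means]

log_mean = np.log([g.mean() for g in groups])
log_sd = np.log([g.std(ddof=1) for g in groups])

slope, intercept = np.polyfit(log_mean, log_sd, deg=1)
power = 1.0 - slope
print(f"slope of log SD on log mean: {slope:.2f}")
print("suggested transformation: log" if abs(power) < 0.2
      else f"suggested transformation: y**{power:.2f}")
```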

13.
We consider a semiparametric and a parametric transformation-to-normality model for bivariate data. After an unstructured or structured monotone transformation of the measurement scales, the measurements are assumed to have a bivariate normal distribution with correlation coefficient ρ, here termed the 'transformation correlation coefficient'. Under the semiparametric model with unstructured transformation, the principle of invariance leads to basing inference on the marginal ranks. The resulting rank-based likelihood function of ρ is maximized via a Monte Carlo procedure. Under the parametric model, we consider Box-Cox type transformations and maximize the likelihood of ρ along with the nuisance parameters. Efficiencies of competing methods are reported, both theoretically and by simulations. The methods are illustrated on a real-data example.

14.
We have tested alternative models of the demand for medical care using experimental data. The estimated response of demand to insurance plan is sensitive to the model used. We therefore use a split-sample analysis and find that a model that more closely approximates distributional assumptions and uses a nonparametric retransformation factor performs better in terms of mean squared forecast error. Simpler models are inferior either because they are not robust to outliers (e.g., ANOVA, ANOCOVA), or because they are inconsistent when strong distributional assumptions are violated (e.g., a two-parameter Box-Cox transformation).
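The nonparametric retransformation factor referred to above is commonly implemented as Duan's smearing estimator: predictions from a regression on the log scale are scaled by the average of the exponentiated residuals instead of assuming normal errors. The sketch below works under that assumption, with invented variables standing in for the spending data.

```python
# Sketch of a smearing-type nonparametric retransformation (assumed here to be
# Duan's estimator): fit OLS on log spending, then retransform predictions by
# the mean of exp(residuals) rather than a normal-theory correction.
import numpy as np

rng = np.random.default_rng(7)
n = 500
X = np.column_stack([np.ones(n), rng.binomial(1, 0.5, size=n)])   # intercept + plan dummy
log_y = X @ np.array([3.0, -0.4]) + rng.standard_t(df=5, size=n)  # heavy-tailed errors
y = np.exp(log_y)

beta, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
resid = np.log(y) - X @ beta
smear = np.mean(np.exp(resid))            # nonparametric retransformation factor

pred_naive = np.exp(X @ beta)             # biased if the errors are not normal
pred_smeared = smear * np.exp(X @ beta)   # retransformed prediction
print(f"smearing factor = {smear:.3f}")
print(f"mean spending: observed {y.mean():.1f}, naive {pred_naive.mean():.1f}, "
      f"smeared {pred_smeared.mean():.1f}")
```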

15.
Testing of a composite null hypothesis versus a composite alternative is considered when both have a related invariance structure. The goal is to develop conditional frequentist tests that allow the reporting of data-dependent error probabilities, error probabilities that have a strict frequentist interpretation and that reflect the actual amount of evidence in the data. The resulting tests are also seen to be Bayesian tests, in the strong sense that the reported frequentist error probabilities are also the posterior probabilities of the hypotheses under default choices of the prior distribution. The new procedures are illustrated in a variety of applications to model selection and multivariate hypothesis testing.

16.
In predicting a future lifetime based on a sample of past lifetimes, the Box-Cox transformation method provides a simple and unified procedure that is shown in this article to meet or often outperform the corresponding frequentist solution in terms of coverage probability and average length of prediction intervals. Kullback-Leibler information and second-order asymptotic expansion are used to justify the Box-Cox procedure. Extensive Monte Carlo simulations are also performed to evaluate the small-sample behavior of the procedure. Certain popular lifetime distributions, such as the Weibull, inverse Gaussian, and Birnbaum-Saunders, serve as illustrative examples. One important advantage of the Box-Cox procedure lies in its easy extension to linear model predictions where the exact frequentist solutions are often not available.
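The basic mechanics of such a prediction can be sketched in a few lines: transform the observed lifetimes with the maximum likelihood Box-Cox parameter, form the usual normal-theory prediction interval on the transformed scale, and back-transform its endpoints. The sketch below is an illustration only; the article's second-order refinements are not reproduced.

```python
# Sketch: Box-Cox prediction interval for one future lifetime
# (transform, normal-theory prediction interval, back-transform).
import numpy as np
from scipy import stats
from scipy.special import inv_boxcox

rng = np.random.default_rng(8)
lifetimes = rng.weibull(a=1.8, size=40) * 1000.0     # simulated past lifetimes

z, lam = stats.boxcox(lifetimes)
n = len(z)
zbar, s = z.mean(), z.std(ddof=1)
t = stats.t.ppf(0.975, df=n - 1)
half = t * s * np.sqrt(1.0 + 1.0 / n)                # prediction (not confidence) width

lower, upper = inv_boxcox(zbar - half, lam), inv_boxcox(zbar + half, lam)
print(f"lambda = {lam:.2f}; 95% prediction interval: ({lower:.0f}, {upper:.0f})")
```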

17.
Series evaluation of Tweedie exponential dispersion model densities
Exponential dispersion models, which are linear exponential families with a dispersion parameter, are the prototype response distributions for generalized linear models. The Tweedie family comprises those exponential dispersion models with power mean-variance relationships. The normal, Poisson, gamma and inverse Gaussian distributions belong to the Tweedie family. Apart from these special cases, Tweedie distributions do not have density functions which can be written in closed form. Instead, the densities can be represented as infinite summations derived from series expansions. This article describes how the series expansions can be summed in a numerically efficient fashion. The usefulness of the approach is demonstrated, but full machine accuracy is shown not to be obtainable using the series expansion method for all parameter values. Derivatives of the density with respect to the dispersion parameter are also derived to facilitate maximum likelihood estimation. The methods are demonstrated on two data examples and compared with Box-Cox transformations and extended quasi-likelihood.
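To make the "infinite summation" concrete, the sketch below evaluates a Tweedie density for power parameters 1 < p < 2 by summing the equivalent Poisson mixture of gamma densities, truncating when the Poisson weights become negligible. This is not the authors' series-expansion algorithm, only a direct illustration of summing such a series; parameter names follow the usual (mu, phi, p) convention.

```python
# Sketch: Tweedie density for 1 < p < 2 evaluated by truncating the
# Poisson-mixture-of-gammas series (illustration; not the article's algorithm).
import numpy as np
from scipy import stats

def tweedie_pdf(y, mu, phi, p, max_terms=200):
    """Continuous part of the Tweedie density for 1 < p < 2 at y > 0."""
    lam = mu ** (2 - p) / (phi * (2 - p))        # Poisson rate
    alpha = (2 - p) / (p - 1)                    # gamma shape per Poisson event
    scale = phi * (p - 1) * mu ** (p - 1)        # gamma scale
    dens = np.zeros_like(np.asarray(y, dtype=float))
    for n in range(1, max_terms + 1):
        w = stats.poisson.pmf(n, lam)
        if w < 1e-15 and n > lam:                # remaining terms are negligible
            break
        dens += w * stats.gamma.pdf(y, a=n * alpha, scale=scale)
    return dens

y = np.linspace(0.1, 10.0, 5)
print(tweedie_pdf(y, mu=2.0, phi=1.0, p=1.5))
```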

18.
In practice, it often happens that we have a number of base classification methods and cannot clearly determine which one is optimal in the sense of the smallest error rate. A combined method then allows us to consolidate information from multiple sources into a better classifier. I propose a different, sequential approach. Sequentiality is understood here in the sense of adding posterior probabilities to the original data set, and the data so created are used during the classification process. We combine the posterior probabilities obtained from the base classifiers using all combining methods, and finally combine these probabilities using a mean combining method. The resulting posterior probabilities are added to the original data set as additional features. In each step we update these additional probabilities to achieve the minimum error rate for the base methods. Experimental results on different data sets demonstrate that the method is efficient and that this approach outperforms the base methods, providing a reduction in the mean classification error rate.
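A single step of this idea can be sketched as follows: obtain posterior class probabilities from several base classifiers, average them, append the averaged probabilities to the feature matrix, and refit. The sketch uses scikit-learn with invented data and is a simplified, one-pass illustration rather than the author's full sequential procedure (which would also ideally use cross-validated probabilities).

```python
# Sketch: append mean-combined posterior probabilities from base classifiers
# as extra features and refit (single step; simplified illustration).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

base = [LogisticRegression(max_iter=1000), GaussianNB(),
        DecisionTreeClassifier(max_depth=4, random_state=0)]
probs_tr = np.mean([m.fit(X_tr, y_tr).predict_proba(X_tr) for m in base], axis=0)
probs_te = np.mean([m.predict_proba(X_te) for m in base], axis=0)

# Augment the original features with the combined posterior probabilities
clf = LogisticRegression(max_iter=1000).fit(np.hstack([X_tr, probs_tr]), y_tr)
print(f"augmented-model accuracy: {clf.score(np.hstack([X_te, probs_te]), y_te):.3f}")
```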

19.
Bayesian inclusion probabilities have become a popular tool for variable assessment. From a frequentist perspective, it is often difficult to evaluate these probabilities, as typically no Type I error rates are considered, nor are any explorations of the methods' power given. This paper considers how a frequentist may evaluate Bayesian inclusion probabilities for screening predictors. This evaluation looks at both unrestricted and restricted model spaces and develops a framework within which a frequentist can utilize inclusion probabilities while preserving Type I error rates. Furthermore, this framework is applied to an analysis of Arabidopsis thaliana aimed at determining quantitative trait loci associated with cotyledon opening angle.

20.
The general approach to generating random variates through transformations with multiple roots is discussed. Multinomial probabilities are determined for the selection of the different roots. An application of the general result yields a new and simple technique for the generation of variates from the inverse Gaussian distribution.
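The inverse Gaussian application referred to above is widely known as the Michael-Schucany-Haas method: the transformation involved has two roots, and one of them is selected with the appropriate probability. A sketch of that approach, using the IG(mu, lambda) parameterization with mean mu and variance mu^3/lambda:

```python
# Sketch of the multiple-roots approach for inverse Gaussian variates
# (Michael-Schucany-Haas style): compute the smaller root, then pick between
# the two roots x and mu**2/x with probability mu / (mu + x).
import numpy as np

def rand_inverse_gaussian(mu, lam, size, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    nu = rng.standard_normal(size) ** 2                  # chi-square(1) variates
    # Smaller root of the quadratic implied by the transformation
    x = mu + mu**2 * nu / (2 * lam) \
        - (mu / (2 * lam)) * np.sqrt(4 * mu * lam * nu + mu**2 * nu**2)
    u = rng.uniform(size=size)
    return np.where(u <= mu / (mu + x), x, mu**2 / x)

samples = rand_inverse_gaussian(mu=2.0, lam=3.0, size=100000,
                                rng=np.random.default_rng(9))
print(f"sample mean = {samples.mean():.3f} (theory: 2.0), "
      f"sample var = {samples.var():.3f} (theory: mu**3/lam = {2.0**3/3.0:.3f})")
```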
