Similar literature
 20 similar documents found (search time: 15 ms)
1.
In many areas of application mixed linear models serve as a popular tool for analyzing highly complex data sets. For inference about fixed effects and variance components, likelihood-based methods such as (restricted) maximum likelihood estimators, (RE)ML, are commonly pursued. However, it is well-known that these fully efficient estimators are extremely sensitive to small deviations from hypothesized normality of random components as well as to other violations of distributional assumptions. In this article, we propose a new class of robust-efficient estimators for inference in mixed linear models. The new three-step estimation procedure provides truncated generalized least squares and variance components' estimators with hard-rejection weights adaptively computed from the data. More specifically, our data re-weighting mechanism first detects and removes within-subject outliers, then identifies and discards between-subject outliers, and finally it employs maximum likelihood procedures on the “clean” data. Theoretical efficiency and robustness properties of this approach are established.

2.
Data from recordings of ore assays from the Western Australian goldfields provide motivation to devise new tests for outliers when observations are distributed with the same mean but differing variances. In the case of equal variances, tests for a single outlier reduce to well-known tests of discordancy. A block discordancy test for k outliers is also described. The question of whether or not one should omit any observation(s) in the calculation of the mean recoverable gold content is addressed in the context of whether or not the data contain outliers, as judged by a normal model for the 'logged' ore assay values. The given data suggest that models with 'logged' values that follow long-tailed approximately normal distributions may be appropriate.

3.
This paper studies subset selection procedures for screening in two-factor treatment designs that employ either a split-plot or strip-plot randomization restricted experimental design laid out in blocks. The goal is to select a subset of treatment combinations associated with the largest mean. In the split-plot design, it is assumed that the block effects, the confounding effects (whole-plot error) and the measurement errors are normally distributed. None of the selection procedures developed depend on the block variances. Subset selection procedures are given for both the case of additive and non-additive factors and for a variety of circumstances concerning the confounding effect and measurement error variances. In particular, procedures are given for (1) known confounding effect and measurement error variances, (2) unknown measurement error variance but known confounding effect variance, and (3) unknown confounding effect and measurement error variances. The constants required to implement the procedures are shown to be obtainable from available FORTRAN programs and tables. Generalization to the case of strip-plot randomization restriction is considered.

4.
This article presents a synthetic control chart for detection of shifts in the process median. The synthetic chart is a combination of a sign chart and a conforming run-length chart. The performance evaluation of the proposed chart indicates that the synthetic chart has a higher power of detecting shifts in the process median than the Shewhart charts based on the sign statistic as well as the classical Shewhart X-bar chart for various symmetric distributions. The improvement is significant for moderate to large shifts in the median. The robustness studies indicate that the proposed synthetic control chart is robust against contamination by outliers.
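
The combination described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the usual synthetic-chart convention in which a subgroup is flagged as nonconforming when its sign statistic reaches a limit, and a signal is raised when the conforming run length (the number of subgroups since the previous nonconforming one) is at most a threshold. The names `sign_statistic`, `synthetic_sign_chart`, `sign_limit` and `crl_limit`, and all numeric values, are hypothetical.

```python
def sign_statistic(sample, target_median):
    """Number of observations above the target median minus the number below it."""
    return sum((x > target_median) - (x < target_median) for x in sample)


def synthetic_sign_chart(samples, target_median, sign_limit, crl_limit):
    """Return the 1-based indices of subgroups at which the chart signals.

    Assumptions (illustrative only):
    - a subgroup is nonconforming when |sign statistic| >= sign_limit;
    - the conforming run length (CRL) is the number of subgroups since the
      previous nonconforming subgroup, counting the current one;
    - the chart signals when CRL <= crl_limit.
    """
    signals = []
    last_nonconforming = 0  # index of the previous nonconforming subgroup (0 = start)
    for t, sample in enumerate(samples, start=1):
        if abs(sign_statistic(sample, target_median)) >= sign_limit:
            crl = t - last_nonconforming
            if crl <= crl_limit:
                signals.append(t)
            last_nonconforming = t
    return signals


mixed = [1, -1, 2, -2, 1]      # sign statistic 1: conforming
shifted = [1, 2, 3, 4, 5]      # sign statistic 5: nonconforming
subgroups = [mixed, mixed, mixed, shifted, mixed, shifted]
signals = synthetic_sign_chart(subgroups, target_median=0, sign_limit=5, crl_limit=2)
print(signals)
```

In this toy run the first nonconforming subgroup (index 4) arrives after a long conforming run and does not trigger a signal, while the second (index 6) follows closely and does.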

5.
We show that the existing tests for asymptotic independence are sensitive to outliers. A robust test is proposed. The new test is made stable under contamination through a shrinkage scheme. Simulations show that the new test performs well in the presence of contaminated data while maintaining good properties when there is no contamination. An application to real data shows the added value of our new robust approach.

6.
This paper discusses asymptotically distribution free tests for the lack-of-fit of a parametric regression model in the Berkson measurement error model. These tests are based on a martingale transform of a certain marked empirical process of calibrated residuals. A simulation study is included to assess the effect of measurement error on the proposed test. It is observed that empirical level is more stable across the chosen measurement error variances when fitting a linear model compared to when fitting a nonlinear model, while, in both cases, the empirical power decreases as this error variance increases, against all chosen alternatives.

7.
There are three main problems in the existing procedures for detecting outliers in ARIMA models. The first one is the biased estimation of the initial parameter values that may strongly affect the power to detect outliers. The second problem is the confusion between level shifts and innovative outliers when the series has a level shift. The third problem is masking. We propose a procedure that keeps the powerful features of previous methods but improves the initial parameter estimate, avoids the confusion between innovative outliers and level shifts and includes joint tests for sequences of additive outliers in order to solve the masking problem. A Monte Carlo study and one example of the performance of the proposed procedure are presented.

8.
Rank tests are known to be robust to outliers and violation of distributional assumptions. Two major issues besetting microarray data are violation of the normality assumption and contamination by outliers. In this article, we formulate the normal theory simultaneous tests and their aligned rank transformation (ART) analog for detecting differentially expressed genes. These tests are based on the least-squares estimates of the effects when data follow a linear model. Application of the two methods is then demonstrated on a real data set. To evaluate the performance of the aligned rank transform method against the corresponding normal theory method, data were simulated according to the characteristics of a real gene expression data set. These simulated data are then used to compare the two methods with respect to their sensitivity to the distributional assumption and to outliers for controlling the family-wise Type I error rate, power, and false discovery rate. It is demonstrated that the ART generally possesses the robustness of validity property even for microarray data with a small number of replications. Although these methods can be applied to more general designs, in this article the simulation study is carried out for a dye-swap design since this design is broadly used in cDNA microarray experiments.

9.
We study the problem of merging homogeneous groups of pre-classified observations from a robust perspective motivated by the anti-fraud analysis of international trade data. This problem may be seen as a clustering task which exploits preliminary information on the potential clusters, available in the form of group-wise linear regressions. Robustness is then needed because of the sensitivity of likelihood-based regression methods to deviations from the postulated model. Through simulations run under different contamination scenarios, we assess the impact of outliers both on group-wise regression fitting and on the quality of the final clusters. We also compare alternative robust methods that can be adopted to detect the outliers and thus to clean the data. One major conclusion of our study is that the use of robust procedures for preliminary outlier detection is generally recommended, except perhaps when contamination is weak and the identification of cluster labels is more important than the estimation of group-specific population parameters. We also apply the methodology to find homogeneous groups of transactions in one empirical example that illustrates our motivating anti-fraud framework.

10.
We consider the two-sample t-test where error variances are unknown but with known relationships between them. This situation arises, for example, when two measuring instruments average different numbers of replicates to report the response. In particular, we compare our procedure with the usual Satterthwaite approximation in the two-sample t-test with unequal variances. Our procedure uses the knowledge of a known ratio of variances while the Satterthwaite approximation assumes only that the two variances are unequal. Simulations show that our procedure has both better size and better power than the Satterthwaite approximation. Finally, we consider an extension of our results to the General Linear Model.
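
The contrast drawn in this abstract can be illustrated with a short sketch. It is not the authors' procedure: it assumes normal data and a known ratio `ratio = var(y) / var(x)`, and uses the textbook pooled estimator that follows from that assumption, next to the standard Welch statistic with Satterthwaite degrees of freedom. The function names and the sample data are hypothetical.

```python
import math
import statistics


def welch_t(x, y):
    """Welch t statistic with Satterthwaite's approximate degrees of freedom."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    se2 = v1 / n1 + v2 / n2
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(se2)
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df


def known_ratio_t(x, y, ratio):
    """t statistic when var(y) = ratio * var(x) with ratio known (an assumption).

    Rescaling the second sample variance by the known ratio lets both samples
    estimate var(x), so the statistic has exactly n1 + n2 - 2 degrees of
    freedom under normality.
    """
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2 / ratio) / (n1 + n2 - 2)
    se2 = pooled * (1 / n1 + ratio / n2)
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(se2)
    return t, n1 + n2 - 2


x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]   # sample variance is exactly 4 times that of x
t_welch, df_welch = welch_t(x, y)
t_known, df_known = known_ratio_t(x, y, ratio=4)
print(t_welch, df_welch)
print(t_known, df_known)
```

In this constructed example the two statistics coincide because the sample variance ratio equals the assumed ratio, but the known-ratio version keeps all 8 degrees of freedom, whereas the Satterthwaite approximation gives fewer.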

11.
The properties of the tests and confidence regions for the parameters in the classical general linear model depend upon the equality of the variances of the error terms. The level and power of tests and the confidence coefficients associated with confidence regions are vitiated when the assumption of equality is not true. Even when the error variances are equal, the power of tests and the size of confidence regions depend upon the unknown common variance and hence are uncontrollable. This paper presents a two-stage procedure which yields tests and confidence regions which are completely independent of the variances of the errors and hence tests with controllable power and confidence regions of fixed controllable size are obtained.

12.
A number of nonparametric tests are compared empirically for a randomized block layout. We assess tests appropriate for when the data are not consistent with normality or when outliers invalidate traditional analysis of variance (ANOVA) tests. The objective is to assess, within this setting, tests that use ranks within blocks, the rank transform procedure that ranks the complete sample and continuous analogs of the Cochran–Mantel–Haenszel tests. The usual linear model is assumed, and our primary foci are tests of equality of means and component tests that assess linear and quadratic trends in the means. These tests include the traditional Page and Friedman tests. We conclude that the rank transform tests have competitive power and warrant greater use than is currently apparent.

13.
All-pairs power in a one-way ANOVA is the probability of detecting all true differences between pairs of means. Ramsey (1978) found that for normal distributions having equal variances, step-down multiple comparison procedures can have substantially more all-pairs power than single-step procedures, such as Tukey’s HSD, when equal sample sizes are randomly sampled from each group. This paper suggests a step-down procedure for the case of unequal variances and compares it to Dunnett's T3 technique. The new procedure is similar in spirit to one of the heteroscedastic procedures described by Hochberg and Tamhane (1987), but it has certain advantages that are discussed in the paper. Included are results on unequal sample sizes.

14.
Three procedures for testing the adequacy of a proposed linear multiresponse regression model against unspecified general alternatives are considered. The model has an error structure with a matrix normal distribution which allows the vector of responses for a particular run to have an unknown covariance matrix while the responses for different runs are uncorrelated. Furthermore, each response variable may be modeled by a separate design matrix. Multivariate statistics corresponding to the classical univariate lack of fit and pure error sums of squares are defined and used to determine the multivariate lack of fit tests. A simulation study was performed to compare the power functions of the test procedures in the case of replication. Generalizations of the tests for the case in which there are no independent replicates on all responses are also presented.

15.
Often, the response variables on sampling units are observed repeatedly over time. The sampling units may come from different populations, such as treatment groups. This setting is routinely modeled by a random coefficients growth curve model, and the techniques of general linear mixed models are applied to address the primary research aim. An alternative approach is to reduce each subject’s data to summary measures, such as within-subject averages or regression coefficients. One may then test for equality of means of the summary measures (or functions of them) among treatment groups. Here, we compare by simulation the performance characteristics of three approximate tests based on summary measures and one based on the full data, focusing mainly on accuracy of p-values. We find that performances of these procedures can be quite different for small samples in several different configurations of parameter values. The summary-measures approach performed at least as well as the full-data mixed models approach.

16.
The power of some rank tests, used for testing the hypothesis of shift, is found when the underlying distributions contain outliers. The outliers are assumed to occur as the result of mixing two normal distributions with common variance. A small sample case shows how the scores for the rank tests are found and the exact power is computed for each of these rank tests. A Monte Carlo study provides an estimate of the power of the usual two sample t-test.

17.
Psychometric growth curve modeling techniques are used to describe a person’s latent ability and how that ability changes over time based on a specific measurement instrument. However, the same instrument cannot always be used over a period of time to measure that latent ability. This is often the case when measuring traits longitudinally in children. Reasons may be that over time some measurement tools that were difficult for young children become too easy as they age, resulting in floor effects or ceiling effects or both. We propose a Bayesian hierarchical model for such a scenario. Within the Bayesian model we combine information from multiple instruments used at different age ranges and having different scoring schemes to examine growth in latent ability over time. The model includes between-subject variance and within-subject variance and does not require linking item specific difficulty between the measurement tools. The model’s utility is demonstrated on a study of language ability in children from ages one to ten who are hard of hearing, where measurement tool specific growth and subject-specific growth are shown in addition to a group level latent growth curve comparing the hard of hearing children to children with normal hearing. KEYWORDS: Bayesian hierarchical models, psychometric modeling, language ability, growth curve modeling, longitudinal analysis

18.
Designs based on any number of replicated Latin squares are examined for their robustness against the loss of up to three observations randomly scattered throughout the design. The information matrix for the treatment effects is used to evaluate the average variances of the treatment differences for each design in terms of the number of missing values and the size of the design. The resulting average variances are used to assess the overall robustness of the designs. In general, there are 16 different situations for the case of three missing values when there are at least three Latin square replicates in the design. Algebraic expressions may be determined for all possible configurations, but here the best and worst cases are given in detail. Numerical illustrations are provided for the average variances, relative efficiencies, minimum and maximum variances and the frequency counts, showing the effects of the missing values for a range of design sizes and levels of replication.

19.
This paper presents an investigation of the behavior of the levels of significance of the two-sample t and its related tests and the Mann-Whitney test when the samples are randomly drawn from mixtures of two normal populations (compound normals) and when the sample sizes are small (combined sample sizes ≤ 15). The use of the compound normal allows for investigation when the underlying populations are unequal, nonnormal, heterogeneous in variances, unimodal or bimodal, possessing smaller than normal kurtosis or containing contamination. The exact distributions of the t and its related tests are given. However, they are not readily amenable to calculations. Most of the numerical results presented were obtained by simulations.

20.
The coefficient of variation (CV) is commonly used to measure relative dispersion. However, since it is based on the sample mean and standard deviation, outliers can adversely affect it. Additionally, for skewed distributions the mean and standard deviation may be difficult to interpret and, consequently, that may also be the case for the CV. Here we investigate the extent to which quantile-based measures of relative dispersion can provide appropriate summary information as an alternative to the CV. In particular, we investigate two measures, the first being the interquartile range (in lieu of the standard deviation), divided by the median (in lieu of the mean), and the second being the median absolute deviation, divided by the median, as robust estimators of relative dispersion. In addition to comparing the influence functions of the competing estimators and their asymptotic biases and variances, we compare interval estimators using simulation studies to assess coverage.
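
The two quantile-based measures named in this abstract can be computed as in the sketch below, shown next to the classical CV. This is only an illustration under stated assumptions: the quartiles use the "inclusive" method of Python's `statistics.quantiles`, the median absolute deviation is left unscaled (no normal-consistency constant), and the function names and data are hypothetical. The abstract does not say which quartile definition or scaling the authors use.

```python
import statistics


def coefficient_of_variation(xs):
    """Classical CV: sample standard deviation divided by the sample mean."""
    return statistics.stdev(xs) / statistics.mean(xs)


def iqr_over_median(xs):
    """Interquartile range divided by the median (quartile method is an assumption)."""
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return (q3 - q1) / statistics.median(xs)


def mad_over_median(xs):
    """Unscaled median absolute deviation divided by the median."""
    med = statistics.median(xs)
    mad = statistics.median([abs(x - med) for x in xs])
    return mad / med


clean = [8, 9, 10, 11, 12]
contaminated = [8, 9, 10, 11, 120]   # one gross outlier replaces the largest value

for name, data in (("clean", clean), ("contaminated", contaminated)):
    print(name,
          coefficient_of_variation(data),
          iqr_over_median(data),
          mad_over_median(data))
```

On this toy data the single outlier inflates the CV roughly tenfold, while both quantile-based measures are unchanged.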
