Similar documents
20 similar documents found (search time: 31 ms)
1.
While analyzing 2 × 2 contingency tables, the log odds ratio for measuring the strength of association is often approximated by a normal distribution with some variance. We show that the expression of that variance needs to be modified in the presence of correlation between the two binomial distributions of the contingency table. In the present paper, we derive a correlation-adjusted variance of the limiting normal distribution of the log odds ratio. We also propose a correlation-adjusted test based on the standard odds ratio for analyzing matched-pair studies and any other study settings that induce correlated binary outcomes. We demonstrate that our proposed test outperforms the classical McNemar's test. Simulation studies show that the gains in power are especially manifest when the sample size is small and strong correlation is present. Two examples of real data sets are used to demonstrate that the proposed method may lead to conclusions significantly different from those reached using McNemar's test.
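The classical, independence-based quantities that this paper modifies can be sketched as follows. This is a minimal illustration (the function names are ours, and the paper's correlation-adjusted variance is not reproduced here): Woolf's variance for the log odds ratio of a 2 × 2 table, and McNemar's statistic for matched pairs.

```python
import math

# Log odds ratio of a 2x2 table [[a, b], [c, d]] and Woolf's asymptotic
# variance 1/a + 1/b + 1/c + 1/d, which assumes the two binomial rows are
# independent -- the assumption the paper relaxes.
def log_odds_ratio(a, b, c, d):
    lor = math.log((a * d) / (b * c))
    var = 1 / a + 1 / b + 1 / c + 1 / d
    return lor, var

# McNemar's statistic for a matched-pair table uses only the discordant
# counts b and c; it is approximately chi-square(1) under no association.
def mcnemar_stat(b, c):
    return (b - c) ** 2 / (b + c)

lor, var = log_odds_ratio(10, 20, 5, 40)
z = lor / math.sqrt(var)  # approximately N(0, 1) under the null
```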

2.
In this second part of a two-part paper, the reproducibility of discrete ordinal and nominal outcomes is addressed. The first part deals with continuous outcomes, concentrating on the intraclass correlation (ρ) in the context of one-way analysis of variance. For categorical data, the focus has generally not been on a meaningful population parameter such as ρ. However, intraclass correlation has been defined for discrete ordinal data, ρc, and for nominal data, κI. Therefore, a unified approach to reproducibility is proposed. The relevance of these parameters is outlined. Estimation and inferential procedures for ρc and κI are reviewed, together with worked examples. Topics related to reproducibility that are not addressed in either this or the previous paper are highlighted. Considerations for designing reproducibility studies and for interpreting their results are provided. Copyright © 2004 John Wiley & Sons, Ltd.
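A close, widely used relative of the intraclass kappa discussed in this abstract is Cohen's kappa for two raters on nominal categories; a minimal sketch of the chance-corrected agreement computation:

```python
# Cohen's kappa for two raters classifying the same subjects into nominal
# categories: table[i][j] counts subjects rated i by rater 1 and j by
# rater 2. Kappa = (observed agreement - chance agreement) / (1 - chance).
def cohens_kappa(table):
    n = sum(sum(row) for row in table)
    po = sum(table[i][i] for i in range(len(table))) / n        # observed
    pe = sum((sum(table[i]) / n) * (sum(row[i] for row in table) / n)
             for i in range(len(table)))                        # by chance
    return (po - pe) / (1 - pe)
```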

3.
The zero-inflated power series distribution is commonly used for modelling count data with extra zeros. Inflation at the point zero has been investigated extensively, and several tests for zero inflation have been examined. Sometimes, however, inflation occurs at a point other than zero; in this case, we say inflation occurs at an arbitrary point j. Such j-inflation has received much less attention than zero inflation. In this paper, inflation at an arbitrary point j is studied in more detail and a Bayesian test for detecting inflation at the point j is presented. The Bayesian method is then extended to inflation at two arbitrary points i and j. The relationship between the distributions for inflation at the point j, inflation at the points i and j, and missing-value imputation is studied. It is shown how to obtain a proper estimate of the population variance when a mean-imputed missing-at-random data set is used. Some simulation studies are conducted, and the proposed Bayesian test is applied to two real data sets.
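As a concrete special case of a j-inflated power series distribution, a j-inflated Poisson mixes a point mass at j (with weight w) with an ordinary Poisson; a sketch of its pmf (parameter names are illustrative):

```python
import math

# pmf of a j-inflated Poisson: with probability w the count equals the
# inflation point j, otherwise it comes from Poisson(lam).
def j_inflated_poisson_pmf(y, j, w, lam):
    base = math.exp(-lam) * lam ** y / math.factorial(y)
    return w * (y == j) + (1 - w) * base
```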

4.
When a two-level multilevel model (MLM) is used for repeated growth data, the individuals constitute level 2 and the successive measurements constitute level 1, which is nested within the individuals that make up level 2. The heterogeneity among individuals is represented by either the random-intercept or random-coefficient (slope) model. The variance components at level 1 involve serial effects and measurement errors under constant variance or heteroscedasticity. This study hypothesizes that missing serial effects and/or heteroscedasticity may bias the results obtained from two-level models. To illustrate this effect, we conducted two simulation studies in which the simulated data were based on the characteristics of an empirical mouse tumour data set. The results suggest that for repeated growth data with constant variance (measurement error) and misspecified serial effects (ρ > 0.3), the proportion of level-2 variation (intra-class correlation coefficient) increases with ρ, and the two-level random-coefficient model is the minimum-AIC (or AICc) model when compared with the fixed, heteroscedasticity, and random-intercept models. In addition, when the serial effect (ρ > 0.1) and heteroscedasticity are both misspecified, the two-level random-coefficient model is again the minimum-AIC (or AICc) model when compared with the fixed and random-intercept models. This study demonstrates that missing serial effects and/or heteroscedasticity may indicate heterogeneity among individuals in repeated growth data (mixed or two-level MLM). This issue is critical in biomedical research.

5.
Hartley's test for homogeneity of k normal-distribution variances is based on the ratio between the maximum sample variance and the minimum sample variance. In this paper, the author uses the same statistic to test for equivalence of k variances. Equivalence is defined in terms of the ratio between the maximum and minimum population variances, and one concludes equivalence when Hartley's ratio is small. Exact critical values for this test are obtained by using an integral expression for the power function and some theoretical results about the power function. These exact critical values are available both when sample sizes are equal and when sample sizes are unequal. One related result in the paper is that Hartley's test for homogeneity of variances is no longer unbiased when the sample sizes are unequal. The Canadian Journal of Statistics 38: 647–664; 2010 © 2010 Statistical Society of Canada
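Hartley's F_max statistic itself is elementary to compute; a sketch (the exact equivalence critical values derived in the paper are not reproduced here):

```python
import statistics

# Hartley's F_max: ratio of the largest to the smallest sample variance
# across k groups (classically with equal sample sizes). For the
# equivalence test described above, one concludes equivalence when F_max
# falls BELOW an exact critical value, rather than rejecting homogeneity
# when it is large.
def hartley_fmax(groups):
    variances = [statistics.variance(g) for g in groups]
    return max(variances) / min(variances)

groups = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]]
fmax = hartley_fmax(groups)  # second group is the first scaled by 2
```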

6.
The generalized variance is a measure of the dispersion of multivariate data. Comparing the dispersion of multivariate data arises in multivariate quality control, in testing homogeneity of multidimensional scatter, and elsewhere. In this article, Bartlett's modified likelihood ratio test (BMLRT) is proposed for testing equality of the generalized variances of k multivariate normal populations. Simulations comparing the Type I error rate and power of the BMLRT and the likelihood ratio test (LRT) are performed. These simulations show that the BMLRT has a better chi-square approximation under the null hypothesis. Finally, a practical example is given.
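The generalized variance is the determinant of the covariance matrix; a self-contained sketch for the bivariate case (the BMLRT itself is not shown):

```python
# Sample covariance matrix of p-dimensional observations, and the
# generalized variance (its determinant), computed here for p = 2.
def covariance_matrix(data):
    n, p = len(data), len(data[0])
    means = [sum(x[j] for x in data) / n for j in range(p)]
    return [[sum((x[i] - means[i]) * (x[j] - means[j]) for x in data) / (n - 1)
             for j in range(p)] for i in range(p)]

def generalized_variance(data):
    s = covariance_matrix(data)  # 2x2 determinant
    return s[0][0] * s[1][1] - s[0][1] * s[1][0]
```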

7.
When examining the effect of treatment A versus B, there may be a choice between a parallel group design, an AA/BB design, an AB/BA cross-over and Balaam's design. In case of a linear mixed effects regression, it is examined, starting from a flexible function of the costs involved and allowing for subject dropout, which design is most efficient in estimating this effect. For no carry-over, the AB/BA cross-over design is most efficient as long as the dropout rate at the second measurement does not exceed /(1 + ρ), ρ being the intraclass correlation. For steady-state carry-over, depending on the costs involved, the dropout rate and ρ, either a parallel design or an AA/BB design is most efficient. For types of carry-over that allow for self carry-over, interest is in the direct treatment effect plus the self carry-over effect, with either an AA/BB or Balaam's design being most efficient. In case of insufficient knowledge on the dropout rate or ρ, a maximin strategy is devised: choose the design that minimizes the maximum variance of the treatment estimator. Such maximin designs are derived for each type of carry-over. Copyright © 2012 John Wiley & Sons, Ltd.

8.
When the error terms are autocorrelated, the conventional t-tests for individual regression coefficients lead to over-rejection of the null hypothesis. We examine, by Monte Carlo experiments, the small-sample properties of the unrestricted estimator of ρ and of the estimator of ρ restricted by the null hypothesis. We compare the small-sample properties of the Wald, likelihood ratio, and Lagrange multiplier test statistics for individual regression coefficients. It is shown that when the null hypothesis is true, the unrestricted estimator of ρ is biased. It is also shown that the Lagrange multiplier test using the maximum likelihood estimator of ρ performs better than the Wald and likelihood ratio tests.
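A common simple estimator of the AR(1) parameter ρ in this setting is the lag-1 sample autocorrelation of the residuals, and a common simplified Lagrange multiplier statistic for H0: ρ = 0 is n·r1², asymptotically chi-square(1). A sketch under those standard simplifications (not this paper's exact Monte Carlo design):

```python
# Lag-1 sample autocorrelation of a residual series, and the simplified
# LM statistic n * r1^2 for testing H0: rho = 0.
def lag1_autocorr(e):
    n = len(e)
    m = sum(e) / n
    num = sum((e[t] - m) * (e[t - 1] - m) for t in range(1, n))
    den = sum((x - m) ** 2 for x in e)
    return num / den

def lm_stat(e):
    return len(e) * lag1_autocorr(e) ** 2
```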

9.
Across a variety of clinical settings, repeated measurements on an individual, obtained under identical circumstances, often differ from one another. This implies the measurements lack perfect reproducibility. Topics related to reproducibility of clinical measurements are introduced in this paper. In this first of two parts, continuous outcomes are addressed. The intraclass correlation coefficient, ρ, has been the traditional coefficient of reproducibility for continuous outcomes. The importance of ρ regarding observations on an individual, and observations among populations, is outlined. Estimation and inferential procedures for ρ are reviewed and worked examples are provided. Copyright © 2004 John Wiley & Sons, Ltd.
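The standard one-way ANOVA estimator of the intraclass correlation can be sketched as follows, for a balanced design with k repeated measurements on each of n subjects:

```python
# One-way ANOVA estimator of the intraclass correlation:
# ICC = (MSB - MSW) / (MSB + (k - 1) * MSW),
# where MSB / MSW are the between- and within-subject mean squares.
def icc_oneway(groups):          # groups[i] = k measurements on subject i
    k = len(groups[0])
    n = len(groups)
    grand = sum(sum(g) for g in groups) / (n * k)
    msb = k * sum((sum(g) / k - grand) ** 2 for g in groups) / (n - 1)
    msw = sum(sum((x - sum(g) / k) ** 2 for x in g) for g in groups) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)
```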

10.
In this study we are concerned with inference on the correlation parameter ρ of two Brownian motions, when only high-frequency observations from two one-dimensional continuous Itô semimartingales, driven by these particular Brownian motions, are available. Estimators for ρ are constructed in two situations: either when both components are observed (at the same time), or when only one component is observed and the other one represents its volatility process and thus has to be estimated from the data as well. In the first case it is shown that our estimator has the same asymptotic behaviour as the standard one for i.i.d. normal observations, whereas a feasible estimator can still be defined in the second framework, but with a slower rate of convergence.
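In the first situation (both components observed), the natural estimator is the realized correlation of the increments; a sketch:

```python
import math

# Realized correlation of two discretely observed paths: the sum of
# cross products of increments, normalized by the realized variances.
def realized_correlation(x, y):
    dx = [x[i + 1] - x[i] for i in range(len(x) - 1)]
    dy = [y[i + 1] - y[i] for i in range(len(y) - 1)]
    num = sum(a * b for a, b in zip(dx, dy))
    return num / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
```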

11.
The standard log-rank test has been extended by adopting various weight functions. Cancer vaccine or immunotherapy trials have shown a delayed onset of effect for the experimental therapy. This is manifested as a delayed separation of the survival curves. This work proposes new weighted log-rank tests to account for such delay. The weight function is motivated by the time-varying hazard ratio between the experimental and the control therapies. We implement a numerical evaluation of the Schoenfeld approximation (NESA) for the mean of the test statistic. The NESA enables us to assess the power and to calculate the sample size for detecting such delayed treatment effect and also for a more general specification of the non-proportional hazards in a trial. We further show a connection between our proposed test and the weighted Cox regression. Then the average hazard ratio using the same weight is obtained as an estimand of the treatment effect. Extensive simulation studies are conducted to compare the performance of the proposed tests with the standard log-rank test and to assess their robustness to model mis-specifications. Our tests outperform the G^{ρ,γ} class in general and have performance close to the optimal test. We demonstrate our methods on two cancer immunotherapy trials.

12.
Levene's tests of homogeneity of treatment variances in completely randomised and randomised complete block experiments are examined. These tests are essentially standard analysis of variance F-tests performed on functions of the absolute values of residuals. It is found that in order to achieve (i) equality of component mean squares under the null hypothesis, and (ii) nominal significance levels, the various standard degrees of freedom need to be modified.
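The basic (unmodified) Levene construction can be sketched directly: an ordinary one-way ANOVA F-statistic applied to the absolute deviations from the group means (the degrees-of-freedom modifications studied in the paper are not shown).

```python
# Levene's statistic for k independent groups: compute z_ij = |y_ij - mean_i|
# and run a standard one-way ANOVA F-test on the z values.
def levene_f(groups):
    z = [[abs(x - sum(g) / len(g)) for x in g] for g in groups]
    k = len(z)
    n = sum(len(g) for g in z)
    grand = sum(sum(g) for g in z) / n
    ssb = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in z)
    ssw = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in z)
    return (ssb / (k - 1)) / (ssw / (n - k))
```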

13.
In a quantitative linear model with errors following a stationary Gaussian, first-order autoregressive or AR(1) process, Generalized Least Squares (GLS) on raw data and Ordinary Least Squares (OLS) on prewhitened data are efficient methods of estimation of the slope parameters when the autocorrelation parameter of the error AR(1) process, ρ, is known. In practice, ρ is generally unknown. In the so-called two-stage estimation procedures, ρ is then estimated first before using the estimate of ρ to transform the data and estimate the slope parameters by OLS on the transformed data. Different estimators of ρ have been considered in previous studies. In this article, we study nine two-stage estimation procedures for their efficiency in estimating the slope parameters. Six of them (i.e., three noniterative, three iterative) are based on three estimators of ρ that have been considered previously. Two more (i.e., one noniterative, one iterative) are based on a new estimator of ρ that we propose: it is provided by the sample autocorrelation coefficient of the OLS residuals at lag 1, denoted r(1). Lastly, REstricted Maximum Likelihood (REML) represents a different type of two-stage estimation procedure whose efficiency has not been compared to the others yet. We also study the validity of the testing procedures derived from GLS and the nine two-stage estimation procedures. Efficiency and validity are analyzed in a Monte Carlo study. Three types of explanatory variable x in a simple quantitative linear model with AR(1) errors are considered in the time domain: Case 1, x is fixed; Case 2, x is purely random; and Case 3, x follows an AR(1) process with the same autocorrelation parameter value as the error AR(1) process. In a preliminary step, the number of inadmissible estimates and the efficiency of the different estimators of ρ are compared empirically, whereas their approximate expected value in finite samples and their asymptotic variance are derived theoretically. 
Thereafter, the efficiency of the estimation procedures and the validity of the derived testing procedures are discussed in terms of the sample size and the magnitude and sign of ρ. The noniterative two-stage estimation procedure based on the new estimator of ρ is shown to be more efficient for moderate values of ρ at small sample sizes. With the exception of small sample sizes, REML and its derived F-test perform the best overall. The asymptotic equivalence of two-stage estimation procedures, besides REML, is observed empirically. Differences related to the nature, fixed or random (uncorrelated or autocorrelated), of the explanatory variable are also discussed.
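A noniterative two-stage procedure of the kind described above can be sketched for a simple linear model: fit OLS, estimate ρ by the lag-1 autocorrelation r(1) of the OLS residuals, quasi-difference the data, and refit OLS. This is a generic Cochrane-Orcutt-style sketch, not the paper's exact implementation:

```python
# Simple OLS for y = b0 + b1 * x.
def ols(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - b1 * mx, b1

# Noniterative two-stage estimation with rho estimated by r(1), the lag-1
# autocorrelation of the OLS residuals (which already have mean zero).
def two_stage(x, y):
    b0, b1 = ols(x, y)
    e = [b - (b0 + b1 * a) for a, b in zip(x, y)]
    rho = sum(e[t] * e[t - 1] for t in range(1, len(e))) / sum(v * v for v in e)
    xs = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
    ys = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
    a0, a1 = ols(xs, ys)
    return a0 / (1 - rho), a1, rho  # intercept rescaled to the original model
```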

14.
Inference for the general linear model makes several assumptions, including independence of errors, normality, and homogeneity of variance. Departure from the latter two of these assumptions may indicate the need for data transformation or removal of outlying observations. Informal procedures such as diagnostic plots of residuals are frequently used to assess the validity of these assumptions or to identify possible outliers. A simulation-based approach is proposed, which facilitates the interpretation of various diagnostic plots by adding simultaneous tolerance bounds. Several tests exist for normality or homoscedasticity in simple random samples. These tests are often applied to residuals from a linear model fit. The resulting procedures are approximate in that correlation among residuals is ignored. The simulation-based approach accounts for the correlation structure of residuals in the linear model, allows simultaneous checking for possible outliers, non-normality, and heteroscedasticity, and does not rely on formal testing.

[Supplementary materials are available for this article. Go to the publisher's online edition of Communications in Statistics—Simulation and Computation® for the following three supplemental resources: a Word file containing figures illustrating the mode of operation of the bisectional algorithm, QQ-plots, and a residual plot for the mussels data.]
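The basic ingredient of such diagnostics, a simulated envelope for a normal QQ-plot, can be sketched as follows. This is the simple pointwise version; the paper's contribution is calibrating simultaneous bounds that respect the residual correlation structure:

```python
import random

# Simulated envelope for a normal QQ-plot of n residuals: simulate `sims`
# standard-normal samples, sort each, and take the pointwise min and max
# of each order statistic as the plotting bounds.
def qq_envelope(n, sims=199, seed=0):
    rng = random.Random(seed)
    sorted_sims = [sorted(rng.gauss(0, 1) for _ in range(n)) for _ in range(sims)]
    lower = [min(s[i] for s in sorted_sims) for i in range(n)]
    upper = [max(s[i] for s in sorted_sims) for i in range(n)]
    return lower, upper
```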

15.
Zero-inflated models are commonly used for modeling count and continuous data with extra zeros. Inflation at one or two points apart from zero for modeling continuous data has received less attention than zero inflation. In this article, inflation at an arbitrary point α in a semicontinuous distribution is presented, and mean imputation of a continuous response is discussed as a cause of semicontinuous data. Inflation at two points, and more generally at k arbitrary points, and its relation to cell-mean imputation in a mixture of continuous distributions are also studied. To analyze the imputed data, a mixture of semicontinuous distributions is used. The effects of covariates on the dependent variable in a mixture of k semicontinuous distributions with inflation at k points are also investigated. Parameter estimates are obtained with the expectation-maximization (EM) algorithm. Using real data from the Iranian Households Income and Expenditure Survey (IHIES), it is shown how to obtain a proper estimate of the population variance when continuous missing-at-random responses are mean imputed.

16.
Based on the large-sample normal distribution of the sample log odds ratio and its asymptotic variance from maximum likelihood logistic regression, shortest 95% confidence intervals for the odds ratio are developed. Although the usual confidence interval for the odds ratio is unbiased, the shortest interval is not. That is, while covering the true odds ratio with the stated probability, the shortest interval covers some values below the true odds ratio with higher probability. The upper and lower limits of the shortest interval are shifted to the left of those of the usual interval, with greater shifts in the upper limits. With a log odds model with intercept γ and a single binary covariate X, simulation studies showed that the approximate average percent difference in length is 7.4% for n (sample size) = 100, and 3.8% for n = 200. Precise estimates of the covering probabilities of the two types of intervals were obtained from simulation studies, and are compared graphically. For odds ratio estimates greater (less) than one, shortest intervals are more (less) likely to include one than are the usual intervals. The usual intervals are likelihood-based and the shortest intervals are not. The usual intervals have minimum expected length among the class of unbiased intervals. Shortest intervals do not provide important advantages over the usual intervals, which we recommend for practical use.
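The shortest interval can be found numerically: split the 5% error probability unevenly between the two normal tails of the log odds ratio so that the interval length on the odds-ratio scale is minimized. A grid-search sketch (the function name and grid resolution are ours):

```python
import math
from statistics import NormalDist

# Shortest 95% interval for exp(beta), given the MLE log odds ratio `lor`
# and its standard error `se`: search over the lower-tail probability a1,
# with the upper tail getting alpha - a1, minimizing exp(hi) - exp(lo).
def shortest_or_interval(lor, se, alpha=0.05, steps=2000):
    nd = NormalDist()
    best = None
    for i in range(1, steps):
        a1 = alpha * i / steps
        lo = math.exp(lor + nd.inv_cdf(a1) * se)
        hi = math.exp(lor + nd.inv_cdf(1 - (alpha - a1)) * se)
        if best is None or hi - lo < best[1] - best[0]:
            best = (lo, hi)
    return best
```

Because the exponential penalizes the upper end more, the optimum puts less than 2.5% in the lower tail, so both limits shift left relative to the usual equal-tailed interval, matching the behaviour described above.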

17.
In practice, the variance of the response variable may change as some specific factors change from one setting to another in a factorial experiment. These factors affecting the variation of the response are called dispersion factors, which can violate the usual assumption of variance homogeneity. In this study, we modify the conventional minimum aberration criterion to take the impact of dispersion factors into account. The situations of one or two dispersion factors are investigated. As a result, we present regular 2^(n-p) designs with run sizes equal to 16 and 32 using the modified minimum aberration criterion.

18.
Variance estimators for probability sample-based predictions of species richness (S) are typically conditional on the sample (expected variance). In practical applications, sample sizes are typically small, and the variance of input parameters to a richness estimator should not be ignored. We propose a modified bootstrap variance estimator that attempts to capture the sampling variance by generating B replications of the richness prediction from stochastically resampled data of species incidence. The variance estimator is demonstrated for the observed richness (SO), five richness estimators, and with simulated cluster sampling (without replacement) in 11 finite populations of forest tree species. A key feature of the bootstrap procedure is a probabilistic augmentation of a species incidence matrix by the number of species expected to be ‘lost’ in a conventional bootstrap resampling scheme. In Monte-Carlo (MC) simulations, the modified bootstrap procedure performed well in terms of tracking the average MC estimates of richness and standard errors. Bootstrap-based estimates of standard errors were as a rule conservative. Extensions to other sampling designs, estimators of species richness and diversity, and estimates of change are possible.
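The conventional bootstrap that the paper modifies can be sketched for the observed richness SO: resample the rows (sample units) of a species incidence matrix with replacement and take the standard deviation of the replicated richness values. The species-augmentation step that distinguishes the paper's procedure is not shown:

```python
import random
import statistics

# Observed species richness: number of species (columns) present in at
# least one sample unit (row) of a 0/1 incidence matrix.
def observed_richness(incidence):
    return sum(any(row[j] for row in incidence) for j in range(len(incidence[0])))

# Conventional bootstrap standard error of observed richness.
def bootstrap_se(incidence, B=200, seed=1):
    rng = random.Random(seed)
    reps = []
    for _ in range(B):
        sample = [rng.choice(incidence) for _ in incidence]
        reps.append(observed_richness(sample))
    return statistics.stdev(reps)
```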

19.
Fosdick and Raftery (2012) recently encountered the problem of inference for a bivariate normal correlation coefficient ρ with known variances. We derive a variance-stabilizing transformation y(ρ) analogous to Fisher’s classical z-transformation for the unknown-variance case. Adjusting y for the sample size n produces an improved “confidence-stabilizing” transformation yn(ρ) that provides more accurate interval estimates for ρ than the known-variance MLE. Interestingly, the z transformation applied to the unknown-but-equal-variance MLE performs well in the known-variance case for smaller values of |ρ|. Both methods are useful for comparing two or more correlation coefficients in the known-variance case.
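For reference, Fisher's classical z-transformation for the unknown-variance case, to which the paper's y(ρ) is analogous, gives a confidence interval as follows:

```python
import math
from statistics import NormalDist

# Fisher's z-transformation: z = atanh(r) is approximately normal with
# variance 1/(n - 3); back-transform the normal interval with tanh to get
# a confidence interval for rho.
def fisher_ci(r, n, conf=0.95):
    z = math.atanh(r)
    half = NormalDist().inv_cdf(0.5 + conf / 2) / math.sqrt(n - 3)
    return math.tanh(z - half), math.tanh(z + half)
```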

20.
This article considers the problem of testing the null hypothesis of stochastic stationarity in time series characterized by variance shifts at some (known or unknown) point in the sample. It is shown that existing stationarity tests can be severely biased in the presence of such shifts, either oversized or undersized, with associated spurious power gains or losses, depending on the values of the breakpoint parameter and on the ratio of the prebreak to postbreak variance. Under the assumption of a serially independent Gaussian error term with known break date and known variance ratio, a locally best invariant (LBI) test of the null hypothesis of stationarity in the presence of variance shifts is then derived. Both the test statistic and its asymptotic null distribution depend on the breakpoint parameter and also, in general, on the variance ratio. Modifications of the LBI test statistic are proposed for which the limiting distribution is independent of such nuisance parameters and belongs to the family of Cramér–von Mises distributions. One such modification is particularly appealing in that it is simultaneously exact invariant to variance shifts and to structural breaks in the slope and/or level of the series. Monte Carlo simulations demonstrate that the power loss from using our modified statistics in place of the LBI statistic is not large, even in the neighborhood of the null hypothesis, and particularly for series with shifts in the slope and/or level. The tests are extended to cover the cases of weakly dependent error processes and unknown breakpoints. The implementation of the tests is illustrated using output, inflation, and exchange rate data series.

