
1.

A Note on Sample Size Determination for the Estimation of the Mean Vector of a Multivariate Population

Wen Cui Feiqi Zhu 《Communications in Statistics - Theory and Methods》2013,42(8):1607-1610

In this article, we present a straightforward Bonferroni approach for determining sample size for estimating the mean vector of a multivariate population under two scenarios: (1) a pre-specified overall confidence level is desired; and (2) a pre-specified confidence level needs to be guaranteed for each individual variable. It is demonstrated that correlation between variables helps reduce the sample size. The formula to calculate the reduced sample size is derived. A binormal example is presented to illustrate the effect of correlation on sample size reduction for various values of the correlation coefficient.
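As a rough illustration of the Bonferroni idea in scenario (1), the sketch below computes the conservative sample size that ignores correlation altogether. It is not the article's reduced-sample-size formula. The function name, the assumption of known standard deviations, and the normal-quantile interval `mean_i ± z * sigma_i / sqrt(n)` are all assumptions made for this example.

```python
import math
from statistics import NormalDist


def bonferroni_sample_size(sigmas, half_widths, alpha=0.05):
    """Conservative n for p simultaneous intervals mean_i +/- d_i.

    Assumes (hypothetically) known standard deviations and normal data.
    Each interval is built at level 1 - alpha/p, so the overall level is
    at least 1 - alpha by the Bonferroni inequality, whatever the
    correlation between the variables.
    """
    p = len(sigmas)
    z = NormalDist().inv_cdf(1 - alpha / (2 * p))
    # The variable with the largest (z * sigma / d)^2 drives the sample size.
    return max(math.ceil((z * s / d) ** 2) for s, d in zip(sigmas, half_widths))


if __name__ == "__main__":
    print(bonferroni_sample_size([1.0], [0.5]))            # one variable
    print(bonferroni_sample_size([1.0, 1.0], [0.5, 0.5]))  # two variables
```

Because this baseline does not use the correlation between variables, it gives an upper reference point; the abstract's claim is that accounting for correlation allows a smaller n.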

2.

Sample Size Calculation for the Therapeutic Equivalence Problem

Yiannis C. Bassiakos Panos C. Katerelos 《Communications in Statistics - Simulation and Computation》2013,42(4):1019-1026

A method is proposed for the sample size calculation in the case of therapeutic equivalence of two pharmaceuticals, when the decision is based on post-treatment differences and the post-treatment values are dependent on the pretreatment ones. When the correlation coefficient is large (over 0.7), it is shown that sample size calculation (and the corresponding hypothesis test) based on the sample statistic formed by the mean difference of the post–pre differences of each group has smaller variance and hence leads to smaller sample sizes.

3.

Bootstrap power of the generalized correlation coefficient

Reza Modarres 《Statistics and Computing》1996,6(2):139-145

We present a bootstrap Monte Carlo algorithm for computing the power function of the generalized correlation coefficient. The proposed method makes no assumptions about the form of the underlying probability distribution and may be used with observed data to approximate the power function and pilot data for sample size determination. In particular, the bootstrap power functions of the Pearson product moment correlation and the Spearman rank correlation are examined. Monte Carlo experiments indicate that the proposed algorithm is reliable and compares well with the asymptotic values. An example which demonstrates how this method can be used for sample size determination and power calculations is provided.
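The general idea of estimating power by resampling pilot data can be sketched as follows. This is a minimal illustration, not the article's algorithm: it assumes a two-sided test of zero Pearson correlation with a Fisher z critical value, and the names `pearson_r` and `bootstrap_power` are hypothetical.

```python
import math
import random
from statistics import NormalDist


def pearson_r(xs, ys):
    """Pearson product moment correlation; returns 0.0 for degenerate samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def bootstrap_power(pilot, n, alpha=0.05, reps=2000, seed=0):
    """Fraction of bootstrap samples of size n (n > 3) that reject rho = 0.

    `pilot` is a list of (x, y) pairs. Pairs are resampled with replacement,
    so no distributional form is assumed for the data; only the critical
    value relies on the (assumed) Fisher z normal approximation.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        sample = rng.choices(pilot, k=n)
        r = pearson_r([p[0] for p in sample], [p[1] for p in sample])
        r = max(-0.999999, min(0.999999, r))  # keep atanh finite
        if abs(math.atanh(r)) * math.sqrt(n - 3) > z_crit:
            rejections += 1
    return rejections / reps
```

Calling `bootstrap_power` over a grid of candidate n values and picking the smallest n that reaches the target power is one way pilot data could feed a sample size decision.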

4.

Comparison of the estimators of the intra-cluster correlation for the nested error regression model

Sukanya Intarapak Rawee Suwandechochai 《Communications in Statistics - Simulation and Computation》2017,46(3):2057-2070

The intra-cluster correlation in the nested error regression model is rarely known in practice. This article examines the size of the generalized least squares (GLS) F-test using the Fuller–Battese transformation and of the modification F-test. For the balanced case, the former, using the strictly positive, analysis of covariance (ANCOVA) and analysis of variance (ANOVA) estimators of the intra-cluster correlation, can control the size for moderate intra-cluster correlations. For small intra-cluster correlations, they perform well when the number of clusters is large. The latter, using the ANOVA estimator, performs well except when the number of clusters is small. When the intra-cluster correlation is large, it cannot control the size. For the unbalanced case, the GLS F-test using the Fuller–Battese transformation and the modification F-test using the strictly positive, ANCOVA and ANOVA estimators maintain the significance level for small total sample sizes and small intra-cluster correlations when there is large variation in cluster sizes, and they perform well in controlling the size for large total sample sizes and small variation in cluster sizes. Besides, Henderson’s method 3 estimator maintains the significance level in a few situations.

5.

Bounding sample size projections for the area under a ROC curve

Jeffrey D. Blume 《Journal of statistical planning and inference》2009

Studies of diagnostic tests are often designed with the goal of estimating the area under the receiver operating characteristic curve (AUC) because the AUC is a natural summary of a test's overall diagnostic ability. However, sample size projections dealing with AUCs are very sensitive to assumptions about the variance of the empirical AUC estimator, which depends on two correlation parameters. While these correlation parameters can be estimated from the available data, in practice it is hard to find reliable estimates before the study is conducted. Here we derive achievable bounds on the projected sample size that are free of these two correlation parameters. The lower bound is the smallest sample size that would yield the desired level of precision for some model, while the upper bound is the smallest sample size that would yield the desired level of precision for all models. These bounds are important reference points when designing a single or multi-arm study; they are the absolute minimum and maximum sample size that would ever be required. When the study design includes multiple readers or interpreters of the test, we derive bounds pertaining to the average reader AUC and the ‘pooled’ or overall AUC for the population of readers. These upper bounds for multireader studies are not too conservative when several readers are involved.

6.

Evaluating the Significance Test When the Correlation Coefficient is Different from Zero in the Test of Hypothesis

Ersin Ogus A. Canan Yazici Fikret Gurbuz 《Communications in Statistics - Simulation and Computation》2013,42(4):847-854

The sample size and the population correlation coefficient are the most important factors influencing the statistical significance of the sample correlation coefficient. It is observed that, when evaluating a hypothesis in which the value of the correlation coefficient r is different from zero, Fisher's Z transformation may be inaccurate for small samples, especially when the population correlation coefficient ρ is large. In this study, a simulation program was written to illustrate how the bias in the Fisher transformation of the correlation coefficient affects the precision of the estimate when the sample size is small and ρ is large. From the simulation results, 90 and 95% confidence intervals for correlation coefficients have been constructed and tabulated. As a result, it is suggested that, especially when ρ is greater than 0.2 and the sample size is 18 or less, Tables 1 and 2 can be used for the significance test in correlations.

7.

Sample Sizes in the Interval Estimation of the Correlation Coefficient

Girma Wolde-Tsadik 《The American statistician》2013,67(4):227-228

An expression is derived for the maximum length of the interval estimator of the correlation coefficient, ρ, under bivariate normal assumptions. The prespecification of this minimum attainable precision and the confidence level results in an expression for the sample size required. An approximate expression for the sample size is proposed and is numerically shown to be as good as or better than that based on Fisher's Z transformation.
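For orientation, the Fisher's Z benchmark mentioned in the abstract can be sketched as a simple search for the smallest n whose interval is no wider than a target. This is only the textbook Fisher z interval, not the article's proposed expression; the function name and the planning value `rho` are assumptions for this example.

```python
import math
from statistics import NormalDist


def fisher_z_sample_size(width, rho=0.0, alpha=0.05, n_max=10**6):
    """Smallest n whose Fisher z interval for a correlation has length <= width.

    On the z scale the interval is atanh(rho) +/- z_crit / sqrt(n - 3);
    it is mapped back to the correlation scale with tanh. `rho` is a
    hypothetical planning value; rho = 0 gives the widest interval and
    hence the most conservative n under this approximation.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    centre = math.atanh(rho)
    for n in range(4, n_max + 1):
        h = z_crit / math.sqrt(n - 3)
        if math.tanh(centre + h) - math.tanh(centre - h) <= width:
            return n
    raise ValueError("no n up to n_max achieves the requested width")
```

Planning at `rho=0.0` corresponds to guarding against the maximum interval length, which is the quantity the abstract prespecifies.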

8.

THE EFFECTS OF CORRELATION AMONG OBSERVATIONS ON THE ACCURACY OF APPROXIMATING THE DISTRIBUTION OF SAMPLE MEAN BY ITS ASYMPTOTIC DISTRIBUTION

Subhash C. Sharma 《Australian & New Zealand Journal of Statistics》1985,27(2):138-150

In this paper, we investigate the effects of correlation among observations on the accuracy of approximating the distribution of sample mean by its asymptotic distribution. The accuracy is investigated by the Berry-Esseen bound (BEB), which gives an upper bound on the error of approximation of the distribution function of the sample mean from its asymptotic distribution for independent observations. For a given sample size (n₀) the BEB is obtained when the observations are independent. Let this be BEB₀. We then find the sample size (n_*) required to have BEB below BEB₀, when the observations are dependent. Comparison of n_* with n₀ reveals the effects of correlation among observations on the accuracy of the asymptotic distribution as an approximation. It is shown that the effects of correlation among observations are not appreciable if the correlation is moderate to small but it can be severe for extreme correlations.
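For reference, the classical Berry-Esseen bound for independent, identically distributed observations can be written as below. The symbols μ, σ, β₃, F_n and C are notation assumed here rather than taken from the paper, and the form of the bound under dependence is not reproduced because the abstract does not state it.

```latex
% X_1, ..., X_n i.i.d. with mean \mu, variance \sigma^2 > 0 and
% third absolute central moment \beta_3 = E|X_1 - \mu|^3 < \infty.
% F_n is the distribution function of \sqrt{n}(\bar{X}_n - \mu)/\sigma,
% \Phi is the standard normal distribution function, and C is an absolute constant.
\[
  \sup_{x} \bigl| F_n(x) - \Phi(x) \bigr|
  \;\le\; \frac{C\,\beta_3}{\sigma^{3}\sqrt{n}} .
\]
% With BEB_0 denoting the right-hand side at n = n_0, the abstract's n_* is
% the sample size at which the bound for dependent observations falls below BEB_0.
```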

9.

Sample size calculations for time-averaged difference of longitudinal binary outcomes

Ying Lou Jing Cao Song Zhang 《Communications in Statistics - Theory and Methods》2017,46(1):344-353

In clinical trials with repeated measurements, the responses from each subject are measured multiple times during the study period. Two approaches have been widely used to assess the treatment effect, one that compares the rate of change between two groups and the other that tests the time-averaged difference (TAD). While sample size calculations based on comparing the rate of change between two groups have been reported by many investigators, the literature has paid relatively little attention to the sample size estimation for time-averaged difference (TAD) in the presence of heterogeneous correlation structure and missing data in repeated measurement studies. In this study, we investigate sample size calculation for the comparison of time-averaged responses between treatment groups in clinical trials with longitudinally observed binary outcomes. The generalized estimating equation (GEE) approach is used to derive a closed-form sample size formula, which is flexible enough to account for arbitrary missing patterns and correlation structures. In particular, we demonstrate that the proposed sample size can accommodate a mixture of missing patterns, which is frequently encountered by practitioners in clinical trials. To our knowledge, this is the first study that considers the mixture of missing patterns in sample size calculation. Our simulation shows that the nominal power and type I error are well preserved over a wide range of design parameters. Sample size calculation is illustrated through an example.

10.

A test for the complete independence of high-dimensional random vectors

《Journal of Statistical Computation and Simulation》2012,82(16):3135-3140

This paper discusses the problem of testing the complete independence of random variables when the dimension of observations can be much larger than the sample size. It is reported that two typical tests based on, respectively, the biggest off-diagonal entry and the largest eigenvalue of the sample correlation matrix lose their control of type I error in such high-dimensional scenarios, and exhibit distinct behaviours in type II error under different types of alternative hypothesis. Given these facts, we propose a permutation test procedure by synthesizing these two extreme statistics. Simulation results show that for finite dimension and sample size the proposed test outperforms the existing methods in various cases.

11.

Some sampling effects of pairwise correlated observations on likelihood ratio tests for the difference between two means

Nicholas J. Schork M. Anthony Schork 《Communications in Statistics - Theory and Methods》2013,42(9):123-129

We study the effects of the inclusion of pairs of correlated observations in a sample on likelihood ratio tests for the difference in two means. In particular, we assess how the inclusion of correlated data pairs (e.g., data inadvertently collected from sib-pairs) affects the sample size requirements necessary for the implementation of a Likelihood Ratio (LR) test for the difference between two means. Our results suggest that correlated data pairs beneficially or adversely affect sample size requirements for an LR test to a degree functionally related to the mixture parameters dictating their relative frequencies in the larger sample on which the test will be performed, the strength of the correlation between the observations, and the size of imbalances in the sample with respect to the number of observations in each group. The relevance and implications of the results for genetic and epidemiologic research are discussed.

12.

On marginal likelihood inference for the intra-class correlation coefficient

M. Safiul Haq V. Ming Ng 《Communications in Statistics - Theory and Methods》2013,42(2):179-189

A p-component set of responses has been constructed by a location-scale transformation to a p-component set of error variables, the covariance matrix of the set of error variables being of intra-class covariance structure: all variances being unity, and covariance being equal [IML0001]. A sample of size n has been described as a conditional structural model, conditional on the value of the intra-class correlation coefficient ρ. The conditional technique of structural inference provides the marginal likelihood function of ρ based on the standardized residuals. For the normal case, the marginal likelihood function of ρ is seen to be dependent on the standardized residuals through the sample intra-class correlation coefficient. By the likelihood modulation technique, the nonnull distribution of the sample intra-class correlation coefficient has also been obtained.

13.

Learning causal structure from mixed data with missing values using Gaussian copula models

Cui Ruifei Groot Perry Heskes Tom 《Statistics and Computing》2019,29(2):311-333

We consider the problem of causal structure learning from data with missing values, assumed to be drawn from a Gaussian copula model. First, we extend the ‘Rank PC’ algorithm, designed for Gaussian copula models with purely continuous data (so-called nonparanormal models), to incomplete data by applying rank correlation to pairwise complete observations and replacing the sample size with an effective sample size in the conditional independence tests to account for the information loss from missing values. When the data are missing completely at random (MCAR), we provide an error bound on the accuracy of ‘Rank PC’ and show its high-dimensional consistency. However, when the data are missing at random (MAR), ‘Rank PC’ fails dramatically. Therefore, we propose a Gibbs sampling procedure to draw correlation matrix samples from mixed data that still works correctly under MAR. These samples are translated into an average correlation matrix and an effective sample size, resulting in the ‘Copula PC’ algorithm for incomplete data. Simulation study shows that: (1) ‘Copula PC’ estimates a more accurate correlation matrix and causal structure than ‘Rank PC’ under MCAR and, even more so, under MAR and (2) the usage of the effective sample size significantly improves the performance of ‘Rank PC’ and ‘Copula PC.’ We illustrate our methods on two real-world datasets: riboflavin production data and chronic fatigue syndrome data.


14.

Conservative confidence intervals for the intraclass correlation coefficient for clustered binary data

Guogen Shan 《Journal of applied statistics》2022,49(10):2535

Asymptotic approaches are traditionally used to calculate confidence intervals for the intraclass correlation coefficient in a clustered binary study. When the sample size is small to medium, or the correlation or response rate is near the boundary, asymptotic intervals often do not have satisfactory performance with regard to coverage. We propose using the importance sampling method to construct the profile confidence limits for the intraclass correlation coefficient. Importance sampling is a simulation based approach to reduce the variance of the estimated parameter. Four existing asymptotic limits are used as statistical quantities for sample space ordering in the importance sampling method. Simulation studies are performed to evaluate the performance of the proposed accurate intervals with regard to coverage and interval width. Simulation results indicate that the accurate intervals based on the asymptotic limits by Fleiss and Cuzick generally have shorter width than others in many cases, while the accurate intervals based on Zou and Donner asymptotic limits outperform others when correlation and response rate are close to their boundaries.

15.

Path analysis and determining the distribution of indirect effects via simulation

Öznur İşçi Güneri Atilla Göktaş Uğur Kayalı 《Journal of applied statistics》2017,44(7):1181-1210

The difference between path analysis and other multivariate analyses is that path analysis can compute indirect effects in addition to direct effects. The aim of this study is to investigate, via generated data, the distribution of indirect effects, which is one of the components of path analysis. To this end, a simulation study has been conducted with four different sample sizes, three different numbers of explanatory variables and three different correlation matrices. Each combination has been replicated 1000 times. According to the results obtained, path coefficients tend to be stable irrespective of the sample size. Moreover, path coefficients are not affected by correlation types either. Since the replication number is 1000, which is fairly large, the indirect effects from the path models have been treated as normal and their confidence intervals have been presented as well. It is also found that path analysis should not be used with three explanatory variables. We think that this study would help scientists who are working in both natural and social sciences to determine the sample size and the number of variables in path analysis.

16.

A note on the accuracy of Fisher's approximation to the large sample variance of an intraclass correlation

Allan Donner John J. Koval 《Communications in Statistics - Simulation and Computation》2013,42(4):443-449

The adequacy of Fisher's approximation to the large sample variance of an intraclass correlation is investigated in the context of family studies. It is found that the approximation is highly accurate in samples of moderately large size (≥ 30 families), and can also be used for significance-testing under a broad range of circumstances. The exact sampling distribution of the intraclass correlation coefficient is also derived.

17.

Robust confidence interval for a residual standard deviation

Douglas G. Bonett 《Journal of applied statistics》2005,32(10):1089-1094

The residual standard deviation of a general linear model provides information about predictive accuracy that is not revealed by the multiple correlation or regression coefficients. The classic confidence interval for a residual standard deviation is hypersensitive to minor violations of the normality assumption and its robustness does not improve with increasing sample size. An approximate confidence interval for the residual standard deviation is proposed and shown to be robust to moderate violations of the normality assumption with robustness to extreme non-normality that improves with increasing sample size.

18.

Asymptotic distributions of functions of a sample covariance matrix under the elliptical distribution

Toshiya Iwashita Minoru Siotani 《Revue canadienne de statistique》1994,22(2):273-283

This paper is concerned with asymptotic distributions of functions of a sample covariance matrix under the elliptical model. Simple but useful formulae for calculating asymptotic variances and covariances of the functions are derived. Also, an asymptotic expansion formula for the expectation of a function of a sample covariance matrix is derived; it is given up to the second-order term with respect to the inverse of the sample size. Two examples are given: one of calculating the asymptotic variances and covariances of the stepdown multiple correlation coefficients, and the other of obtaining the asymptotic expansion formula for the moments of sample generalized variance.

19.

An estimator of population total in multiple characteristics and its robustness to product moment correlation coefficient

《Journal of Statistical Computation and Simulation》2012,82(5):377-386

We developed an alternative estimator for the probability proportional to size with replacement sampling scheme when certain characteristics under study have low correlation with the size measure used for sample selection. The performance of the proposed estimator has been studied against other related alternative estimators by comparing their biases and variances. Most of the alternative estimators assume knowledge of the product moment correlation coefficient. Therefore an empirical study, using a wide variety of populations, has been carried out to study their respective efficiency when the correlation coefficient departs from its true value.

20.

Sample size estimation for a two-group comparison of repeated count outcomes using GEE

Ying Lou Jing Cao Song Zhang 《Communications in Statistics - Theory and Methods》2017,46(14):6743-6753

Randomized clinical trials with count measurements as the primary outcome are common in various medical areas such as seizure counts in epilepsy trials, or relapse counts in multiple sclerosis trials. Controlled clinical trials frequently use a conventional parallel-group design that assigns subjects randomly to one of two treatment groups and repeatedly evaluates them at baseline and intervals across a treatment period of a fixed duration. The primary interest is to compare the rates of change between treatment groups. Generalized estimating equations (GEEs) have been widely used to compare rates of change between treatment groups because of their robustness to misspecification of the true correlation structure. In this paper, we derive a sample size formula for comparing the rates of change between two groups in a repeatedly measured count outcome using GEE. The sample size formula incorporates general missing patterns such as independent missing and monotone missing, and general correlation structures such as AR(1) and compound symmetry (CS). The performance of the sample size formula is evaluated through simulation studies. Sample size estimation is illustrated by a clinical trial example from epilepsy.