Similar literature
 20 similar documents found
1.
Under simple random (multinomial) sampling the problem of estimating cell proportions for a contingency table subject to marginal constraints has been well explored. We briefly review methods that have been considered; then we develop a general method, for more complicated sampling, which reflects the variance structure of the estimated cell proportions. For stratified and cluster sampling we compare our method against earlier methods for the 2×2 table and find it potentially advantageous.
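As context for the marginal-adjustment problem sketched above, here is a minimal version of classical iterative proportional fitting (raking), one of the standard methods for adjusting estimated cell proportions to known margins under simple random sampling; it is not the authors' variance-based generalization, and the observed proportions and target margins below are hypothetical.

```python
import numpy as np

def ipf(p, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Rake a table of proportions to match given row and column margins."""
    p = p.astype(float).copy()
    for _ in range(max_iter):
        p *= (row_targets / p.sum(axis=1))[:, None]   # match row margins
        p *= (col_targets / p.sum(axis=0))[None, :]   # match column margins
        if (np.abs(p.sum(axis=1) - row_targets).max() < tol and
                np.abs(p.sum(axis=0) - col_targets).max() < tol):
            break
    return p

# Hypothetical observed cell proportions and known population margins
observed = np.array([[0.30, 0.20],
                     [0.25, 0.25]])
adjusted = ipf(observed,
               row_targets=np.array([0.55, 0.45]),
               col_targets=np.array([0.60, 0.40]))
print(adjusted, adjusted.sum())
```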

2.
Bayesian inference for categorical data analysis
This article surveys Bayesian methods for categorical data analysis, with primary emphasis on contingency table analysis. Early innovations were proposed by Good (1953, 1956, 1965) for smoothing proportions in contingency tables and by Lindley (1964) for inference about odds ratios. These approaches primarily used conjugate beta and Dirichlet priors. Altham (1969, 1971) presented Bayesian analogs of small-sample frequentist tests for 2 × 2 tables using such priors. An alternative approach using normal priors for logits received considerable attention in the 1970s by Leonard and others (e.g., Leonard 1972). Adopted usually in a hierarchical form, the logit-normal approach allows greater flexibility and scope for generalization. The 1970s also saw considerable interest in loglinear modeling. The advent of modern computational methods since the mid-1980s has led to a growing literature on fully Bayesian analyses with models for categorical data, with main emphasis on generalized linear models such as logistic regression for binary and multi-category response variables.
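As a minimal illustration of the Dirichlet-prior smoothing mentioned above (in the spirit of Good's early proposals), the sketch below computes posterior-mean cell probabilities under a symmetric Dirichlet prior; the counts and the choice α = 1 are hypothetical.

```python
import numpy as np

def dirichlet_smoothed(counts, alpha=1.0):
    """Posterior-mean cell probabilities under a symmetric Dirichlet(alpha) prior."""
    counts = np.asarray(counts, dtype=float)
    return (counts + alpha) / (counts.sum() + alpha * counts.size)

table = np.array([[12, 3],
                  [ 0, 5]])          # raw counts; note the empty cell
print(table / table.sum())           # raw proportions (zero for the empty cell)
print(dirichlet_smoothed(table))     # smoothed proportions, all strictly positive
```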

3.
Experience has shown us that when data are pooled from multiple studies to create an integrated summary, an analysis based on naïvely-pooled data is vulnerable to the mischief of Simpson's Paradox. Using the proportions of patients with a target adverse event (AE) as an example, we demonstrate the Paradox's effect on both the comparison and the estimation of the proportions. While meta-analytic approaches have been recommended and increasingly used for comparing safety data between treatments, reporting proportions of subjects experiencing a target AE based on data from multiple studies has received little attention. In this paper, we suggest two possible approaches to report these cumulative proportions. In addition, we urge that regulatory guidelines on reporting such proportions be established so that risks can be communicated in a scientifically defensible and balanced manner. Copyright © 2010 John Wiley & Sons, Ltd.
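A hypothetical two-study example of the reversal being warned about: the treatment has the higher response proportion in each study separately, yet the naïvely pooled proportions favour the comparator because study sizes and baseline rates differ. All counts are invented for illustration.

```python
# (responders, n) per study; all numbers hypothetical
treatment  = [(81, 87), (192, 263)]
comparator = [(234, 270), (55, 80)]

def pooled_prop(pairs):
    x = sum(r for r, _ in pairs)
    n = sum(m for _, m in pairs)
    return x / n

for i in range(2):
    t, c = treatment[i], comparator[i]
    print(f"study {i + 1}: treatment {t[0] / t[1]:.3f} vs comparator {c[0] / c[1]:.3f}")
print(f"pooled : treatment {pooled_prop(treatment):.3f} vs comparator {pooled_prop(comparator):.3f}")
# Study level: the treatment wins in both studies; pooled: the comparator looks better.
```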

4.
Compositional tables represent a continuous counterpart to well-known contingency tables. Their cells contain quantitatively expressed relative contributions of a whole, carry exclusively relative information, and are popularly represented as proportions or percentages. The resulting factors, corresponding to the rows and columns of the table, can be inspected much as in contingency tables, e.g. for their mutual independence. The nature of compositional tables requires a specific geometrical treatment, represented by the Aitchison geometry on the simplex. The properties of the Aitchison geometry allow a decomposition of the original table into its independent and interactive parts. Moreover, the specific case of 2×2 compositional tables allows the construction of easily interpretable orthonormal coordinates (resulting from the isometric logratio transformation) for the original table and its decompositions. Consequently, for a sample of compositional tables, both exploratory statistical analysis, such as graphical inspection of the independent and interactive parts, and statistical inference (odds-ratio-like testing of independence) can be performed. Theoretical advancements of the presented approach are demonstrated using two economic applications.
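For the 2×2 case, one commonly used set of orthonormal (ilr) coordinates consists of a row balance, a column balance, and an odds-ratio-like interaction balance; the sketch below computes them for a hypothetical compositional table. It is meant only as a plausible illustration of such coordinates, not necessarily the exact basis chosen in the paper.

```python
import numpy as np

def ilr_2x2(x):
    """Row, column and interaction balances of a 2x2 compositional table."""
    x11, x12, x21, x22 = np.asarray(x, dtype=float).ravel()
    z_row = 0.5 * np.log((x11 * x12) / (x21 * x22))   # row balance
    z_col = 0.5 * np.log((x11 * x21) / (x12 * x22))   # column balance
    z_int = 0.5 * np.log((x11 * x22) / (x12 * x21))   # odds-ratio-like interaction
    return z_row, z_col, z_int

# Hypothetical compositional table (proportions summing to 1)
table = np.array([[0.40, 0.20],
                  [0.25, 0.15]])
print(ilr_2x2(table))
# z_int is exactly zero when x11*x22 == x12*x21, i.e. when the independent part
# alone reproduces the table and the interactive part vanishes.
```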

5.
In this paper we introduce a new method for detecting outliers in a set of proportions. It is based on the construction of a suitable two-way contingency table and on the application of an algorithm for the detection of outlying cells in such a table. We exploit the special structure of the relevant contingency table to increase the efficiency of the method. The main properties of our algorithm, together with a guide for the choice of the parameters, are investigated through simulations, and in simple cases some theoretical justifications are provided. Several examples on synthetic data and an example based on pseudo-real data from biological experiments demonstrate the good performance of our algorithm.

6.
Fisher's exact test, difference in proportions, log odds ratio, Pearson's chi-squared, and likelihood ratio are compared as test statistics for testing independence of two dichotomous factors when the associated p values are computed by using the conditional distribution given the marginals. The statistics listed above that can be used for a one-sided alternative give identical p values. For a two-sided alternative, many of the above statistics lead to different p values. The p values are shown to differ only by which tables in the opposite tail from the observed table are considered more extreme than the observed table.
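To make the point concrete: conditioning on the margins of a 2×2 table makes the first cell hypergeometric, and a two-sided p value then depends on how tables in the opposite tail are ranked as "more extreme". The sketch below, with a hypothetical table, compares the probability ordering (Fisher's) with an ordering by the absolute difference in sample proportions.

```python
from scipy.stats import hypergeom

def conditional_pvalues(x11, n1, n2, m1):
    """Two-sided conditional p values for a 2x2 table with row totals n1, n2
    and first-column total m1, where x11 is the observed (1,1) cell."""
    rv = hypergeom(n1 + n2, m1, n1)              # X = first-column count in row 1
    support = range(max(0, m1 - n2), min(n1, m1) + 1)
    pmf = {k: rv.pmf(k) for k in support}

    def diff(k):                                  # |difference in sample proportions|
        return abs(k / n1 - (m1 - k) / n2)

    p_prob = sum(p for k, p in pmf.items() if p <= pmf[x11] * (1 + 1e-12))
    p_diff = sum(p for k, p in pmf.items() if diff(k) >= diff(x11) - 1e-12)
    return p_prob, p_diff

print(conditional_pvalues(x11=7, n1=10, n2=12, m1=9))   # hypothetical table
```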

7.
The conventional random effects model for meta-analysis of proportions approximates within-study variation using a normal distribution. Due to potential approximation bias, particularly for the estimation of rare events such as some adverse drug reactions, the conventional method is considered inferior to the exact methods based on binomial distributions. In this article, we compare two existing exact approaches—beta binomial (B-B) and normal-binomial (N-B)—through an extensive simulation study with a focus on the case of rare events that are commonly encountered in medical research. In addition, we implement the empirical (“sandwich”) estimator of variance in the two models to improve the robustness of the statistical inferences. To our knowledge, this is the first such application of the sandwich estimator of variance to meta-analysis of proportions. The simulation study shows that the B-B approach tends to have substantially smaller bias and mean squared error than N-B for rare events with occurrences under 5%, while N-B outperforms B-B for relatively common events. Use of the sandwich estimator of variance improves the precision of estimation for both models. We illustrate the two approaches by applying them to two published meta-analyses from the fields of orthopedic surgery and prevention of adverse drug reactions.
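To make the beta-binomial (B-B) idea concrete, here is a bare-bones maximum-likelihood fit of a common beta distribution to study-level event counts, with the pooled proportion estimated as a/(a+b). The data are hypothetical, and this sketch omits the paper's refinements, in particular the sandwich variance estimator.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import betaln, gammaln

# Hypothetical (events, sample size) pairs from several studies of a rare event
events = np.array([1, 0, 3, 2, 0, 1])
n      = np.array([120, 85, 240, 150, 60, 100])

def neg_loglik(log_ab):
    a, b = np.exp(log_ab)                     # keep a, b positive
    ll = (gammaln(n + 1) - gammaln(events + 1) - gammaln(n - events + 1)
          + betaln(events + a, n - events + b) - betaln(a, b))
    return -ll.sum()

fit = minimize(neg_loglik, x0=np.log([1.0, 50.0]), method="Nelder-Mead")
a_hat, b_hat = np.exp(fit.x)
print("pooled proportion estimate:", a_hat / (a_hat + b_hat))
```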

8.
This paper considers confidence intervals for the difference of two binomial proportions. Some currently used approaches are discussed and a new approach is proposed. The approaches are thoroughly compared under several commonly used criteria. The widely used Wald confidence interval (CI) is far from satisfactory, while Newcombe's CI, the new recentered CI, and the score CI perform very well. Recommendations are given for which approach to use in different situations.
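For a concrete version of one of the well-performing intervals mentioned, the sketch below implements Newcombe's hybrid score interval, which combines the two single-sample Wilson intervals; the counts in the final line are hypothetical, and this is the standard published construction rather than code from the paper.

```python
from math import sqrt
from scipy.stats import norm

def wilson(x, n, alpha=0.05):
    """Wilson score interval for a single binomial proportion."""
    z = norm.ppf(1 - alpha / 2)
    p = x / n
    centre = (p + z**2 / (2 * n)) / (1 + z**2 / n)
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / (1 + z**2 / n)
    return centre - half, centre + half

def newcombe_diff(x1, n1, x2, n2, alpha=0.05):
    """Newcombe's hybrid score CI for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson(x1, n1, alpha)
    l2, u2 = wilson(x2, n2, alpha)
    d = p1 - p2
    return (d - sqrt((p1 - l1)**2 + (u2 - p2)**2),
            d + sqrt((u1 - p1)**2 + (p2 - l2)**2))

print(newcombe_diff(x1=24, n1=30, x2=15, n2=32))   # hypothetical counts
```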

9.
This paper presents two approaches for comparing several exponential accelerated life models under the usual stress levels. The approaches are based on the likelihood ratio statistic and on the posterior Bayes factor (Aitkin, 1991). These procedures can be useful in many practical situations. An exact distribution and a table of critical values of the likelihood ratio statistic are presented. A simulation study is also performed.

10.
In this paper, we study a new Bayesian approach for the analysis of linearly mixed structures. In particular, we consider the case of hyperspectral images, which have to be decomposed into a collection of distinct spectra, called endmembers, and a set of associated proportions for every pixel in the scene. This problem, often referred to as spectral unmixing, is usually considered on the basis of the linear mixing model (LMM). In unsupervised approaches, the endmember signatures have to be calculated by an endmember extraction algorithm, which generally relies on the supposition that there are pure (unmixed) pixels contained in the image. In practice, this assumption may not hold for highly mixed data and consequently the extracted endmember spectra differ from the true ones. A way out of this dilemma is to consider the problem under the normal compositional model (NCM). Contrary to the LMM, the NCM treats the endmembers as random Gaussian vectors and not as deterministic quantities. Existing Bayesian approaches for estimating the proportions under the NCM are restricted to the case that the covariance matrix of the Gaussian endmembers is a multiple of the identity matrix. Consequently, this model is not suitable when the variance differs from one spectral channel to another, which is a common phenomenon in practice. In this paper, we first propose a Bayesian strategy for the estimation of the mixing proportions under the assumption of varying variances in the spectral bands. Then we generalize this model to handle the case of a completely unknown covariance structure. For both algorithms, we present Gibbs sampling strategies and compare their performance with other state-of-the-art unmixing routines on synthetic as well as on real hyperspectral fluorescence spectroscopy data.

11.
There are two common methods for statistical inference on 2 × 2 contingency tables. One is the widely taught Pearson chi-square test, which uses the well-known χ² statistic. The chi-square test is appropriate for large-sample inference, and in the 2 × 2 case it is equivalent to the Z-test based on the difference between the two sample proportions. The other method is Fisher’s exact test, which evaluates the likelihood of each table with the same marginal totals. This article shows mathematically that the two methods do not completely agree on which tables are deemed extreme. Our analysis obtains one-sided and two-sided conditions under which such a disagreement between the two tests can occur. We also address the question of whether this discrepancy in determining extreme tables would lead them to draw different conclusions when testing homogeneity or independence. Our examination of the two tests casts light on which test should be trusted when they draw different conclusions.
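A quick way to see the scope for disagreement is to compute both p values for the same 2 × 2 table; near a conventional significance threshold they can point in different directions. The table below is a hypothetical example, not one from the article.

```python
from scipy.stats import chi2_contingency, fisher_exact

table = [[9, 1],
         [4, 6]]        # hypothetical 2x2 counts

chi2_stat, p_chi2, dof, _ = chi2_contingency(table, correction=False)
odds_ratio, p_fisher = fisher_exact(table, alternative="two-sided")

print(f"Pearson chi-square p = {p_chi2:.4f}")
print(f"Fisher exact       p = {p_fisher:.4f}")
# The tests rank "extreme" tables differently, so their p values
# (and sometimes their accept/reject decisions) need not agree.
```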

12.
This paper gives a comparative study of the K-means algorithm and the mixture model (MM) method for clustering normal data. The EM algorithm is used to compute the maximum likelihood estimators (MLEs) of the parameters of the MM model. These parameters include mixing proportions, which may be thought of as the prior probabilities of different clusters; the maximum posterior (Bayes) rule is used for clustering. Hence, asymptotically the MM method approaches the Bayes rule for known parameters, which is optimal in terms of minimizing the expected misclassification rate (EMCR).
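As an illustration of the comparison described above, the sketch below clusters simulated two-component normal data with both k-means and a Gaussian mixture fitted by EM, and scores agreement with the true labels; the simulation settings are arbitrary and far simpler than the study's designs.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)
# Simulate two normal clusters with unequal mixing proportions (0.7 / 0.3)
X = np.vstack([rng.normal([0.0, 0.0], 1.0, size=(700, 2)),
               rng.normal([2.5, 2.5], 1.5, size=(300, 2))])
truth = np.repeat([0, 1], [700, 300])

km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
mm_labels = GaussianMixture(n_components=2, random_state=0).fit(X).predict(X)

print("k-means agreement with truth:", adjusted_rand_score(truth, km_labels))
print("mixture agreement with truth:", adjusted_rand_score(truth, mm_labels))
# The mixture model can adapt to unequal mixing proportions and variances,
# which k-means implicitly assumes away.
```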

13.
In the estimation of cell probabilities from a two-way contingency table, suppose that a priori the classification variables are believed to be independent. New empirical Bayes and Bayes estimators are proposed which shrink the observed proportions towards the classical estimates under the model of independence. The estimators, based on a Dirichlet mixture class of priors, compare favorably to an estimator of Laird (1978) that is based on a normal prior on the terms of a log-linear model. The methods are generalized to three-way tables.
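The flavour of such estimators can be conveyed by a simple linear shrinkage of the observed cell proportions toward the independence fit built from the margins. The weight below is fixed by hand, whereas the paper's empirical Bayes and Bayes estimators effectively choose it from the data through a Dirichlet-mixture prior; counts and weight are hypothetical.

```python
import numpy as np

def shrink_toward_independence(counts, weight):
    """Shrink observed cell proportions toward the independence estimate."""
    counts = np.asarray(counts, dtype=float)
    p_obs = counts / counts.sum()
    independence = np.outer(p_obs.sum(axis=1), p_obs.sum(axis=0))
    return weight * p_obs + (1 - weight) * independence

counts = np.array([[18,  7],
                   [ 5, 20]])
print(shrink_toward_independence(counts, weight=0.6))
```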

14.
Large-sample Wilson-type confidence intervals (CIs) are derived for a parameter of interest in many clinical trial situations: the log-odds-ratio, in a two-sample experiment comparing binomial success proportions, say between cases and controls. The methods cover several scenarios: (i) results embedded in a single 2 × 2 contingency table; (ii) a series of K 2 × 2 tables with a common parameter; or (iii) K tables, where the parameter may change across tables under the influence of a covariate. Calculation of the Wilson CI requires only simple numerical assistance and, for example, is easily carried out using Excel. The main competitor, the exact CI, has two disadvantages: it requires burdensome search algorithms for the multi-table case and results in strong over-coverage associated with long confidence intervals. All the application cases are illustrated through a well-known example. A simulation study then investigates how the Wilson CI performs among several competing methods. The Wilson interval is shortest, except for very large odds ratios, while maintaining coverage similar to Wald-type intervals. An alternative to the Wald CI is the Agresti-Coull CI, calculated from the Wilson and Wald CIs, which has the same length as the Wald CI but improved coverage.

15.
Bayesian models for relative archaeological chronology building
For many years, archaeologists have postulated that the numbers of various artefact types found within excavated features should give insight into their relative dates of deposition even when stratigraphic information is not present. A typical data set used in such studies can be reported as a cross-classification table (often called an abundance matrix or, equivalently, a contingency table) of excavated features against artefact types. Each entry of the table represents the number of a particular artefact type found in a particular archaeological feature. Methodologies for attempting to identify temporal sequences on the basis of such data are commonly referred to as seriation techniques. Several different procedures for seriation, including both parametric and non-parametric statistics, have been used in an attempt to reconstruct relative chronological orders on the basis of such contingency tables. We develop some possible model-based approaches that might be used to aid in relative archaeological chronology building. We use the recently developed Markov chain Monte Carlo method based on Langevin diffusions to fit some of the models proposed. Predictive Bayesian model choice techniques are then employed to ascertain which of the models that we develop are most plausible. We analyse two data sets taken from the literature on archaeological seriation.

16.
In this article, we present a procedure for approximate negative binomial tolerance intervals. We utilize an approach, well studied for approximating tolerance intervals in the binomial and Poisson settings, that is based on the confidence interval for the parameter of the respective distribution. A simulation study is performed to assess the coverage probabilities and expected widths of the tolerance intervals. The simulation study also compares eight different confidence interval approaches for the negative binomial proportions. We recommend for practical use those approaches that perform best in our simulation results. The method is also illustrated using two real data examples.

17.
Assessment of severity is essential for the management of chronic diseases. Continuous variables like scores obtained from the Hamilton Rating Scale for Depression or the Psoriasis Area and Severity Index (PASI) are standard measures used in clinical trials of depression and psoriasis. In clinical trials of psoriasis, for example, the reduction of PASI from baseline in response to therapy, in particular the proportion of patients achieving at least 75%, 90%, or 100% improvement of disease (PASI 75, PASI 90, or PASI 100), is typically used to evaluate treatment efficacy. However, evaluation of the proportions of patients reaching absolute PASI values (e.g., ≤1, ≤2, ≤3, or ≤5) has recently gained greater clinical interest and is increasingly being reported. When relative rather than absolute scores are the standard, as is the case with the PASI in psoriasis, it is difficult to compare absolute changes using existing published data. Thus, we developed a method to estimate absolute PASI levels from aggregated relative levels. This conversion method is based on a latent 2-dimensional normal distribution for the absolute score at baseline and at a specific endpoint, with a truncation to allow for the baseline inclusion criterion. The model was fitted to aggregated results from simulations and from 3 phase III studies that had known absolute PASI proportions. The predictions represented the actual results quite precisely. This model might be applied to other conditions, such as depression, to estimate proportions of patients achieving an absolute low level of disease activity, given absolute values at baseline and proportions of patients achieving relative improvements at a subsequent time point.
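To illustrate the latent-normal idea, the Monte Carlo sketch below draws baseline and endpoint absolute scores from a bivariate normal, truncates to a baseline inclusion criterion, and reads off both a relative-improvement (PASI 75-style) proportion and an absolute-threshold proportion. Every parameter (means, covariance, thresholds) is hypothetical; the paper fits such a model to aggregated trial results rather than simulating from known values.

```python
import numpy as np

rng = np.random.default_rng(1)
mean = [18.0, 5.0]                      # hypothetical baseline / endpoint means
cov = [[36.0, 12.0],                    # hypothetical SDs about 6 and 4, correlation 0.5
       [12.0, 16.0]]
draws = rng.multivariate_normal(mean, cov, size=200_000)

baseline, endpoint = draws[:, 0], draws[:, 1]
keep = baseline >= 12                   # baseline inclusion criterion (truncation)
baseline, endpoint = baseline[keep], np.clip(endpoint[keep], 0, None)

pasi75 = np.mean(endpoint <= 0.25 * baseline)      # at least 75% relative improvement
abs_le_2 = np.mean(endpoint <= 2)                  # absolute threshold
print(f"PASI 75-style proportion: {pasi75:.3f}, absolute <= 2: {abs_le_2:.3f}")
```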

18.
One of the major objections to the standard multiple-recapture approach to population estimation is the assumption of homogeneity of individual 'capture' probabilities. Modelling individual capture heterogeneity is complicated by the fact that it shows up as a restricted form of interaction among lists in the contingency table cross-classifying list memberships for all individuals. Traditional log-linear modelling approaches to capture–recapture problems are well suited to modelling interactions among lists but ignore the special dependence structure that individual heterogeneity induces. A random-effects approach, based on the Rasch model from educational testing and introduced in this context by Darroch and co-workers and Agresti, provides one way to introduce the dependence resulting from heterogeneity into the log-linear model; however, previous efforts to combine the Rasch-like heterogeneity terms additively with the usual log-linear interaction terms suggest that a more flexible approach is required. In this paper we consider both classical multilevel approaches and fully Bayesian hierarchical approaches to modelling individual heterogeneity and list interactions. Our framework encompasses both the traditional log-linear approach and various elements from the full Rasch model. We compare these approaches on two examples, the first arising from an epidemiological study of a population of diabetics in Italy, and the second a study intended to assess the 'size' of the World Wide Web. We also explore extensions allowing for interactions between the Rasch and log-linear portions of the models in both the classical and the Bayesian contexts.

19.
Cohen’s kappa is a weighted average

20.
Three methods for testing the equality of nonindependent proportions were compared using Monte Carlo techniques. The three methods included Cochran's test, an ANOVA F test, and Hotelling's T2 test. With respect to empirical significance levels, the ANOVA F test is recommended as the preferred method of analysis.

Oftentimes an experimenter is interested in testing the equality of several proportions. When the proportions are independent, Kemp and Butcher (1972) and Butcher and Kemp (1974) compared several methods for analysing large-sample binomial data for the case of a 3 × 3 factorial design without replication. In addition, Levy and Narula (1977) compared many of the same methods for analyzing binomial data; however, Levy and Narula investigated the relative utility of the methods for small sample sizes.
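Of the three procedures compared, Cochran's test is the one most commonly quoted in closed form; the sketch below computes Cochran's Q for a small hypothetical subjects-by-treatments matrix of binary outcomes and refers it to a chi-square distribution with k − 1 degrees of freedom. The ANOVA F and Hotelling's T2 alternatives would be applied to the same within-subject data.

```python
import numpy as np
from scipy.stats import chi2

def cochrans_q(x):
    """Cochran's Q test for equal proportions across k correlated (within-subject)
    treatments. x is a subjects-by-treatments 0/1 matrix."""
    x = np.asarray(x)
    k = x.shape[1]
    col = x.sum(axis=0)                   # successes per treatment
    row = x.sum(axis=1)                   # successes per subject
    total = x.sum()
    q = (k - 1) * (k * (col**2).sum() - total**2) / (k * total - (row**2).sum())
    return q, chi2.sf(q, k - 1)

# Hypothetical binary responses of 8 subjects under 3 conditions
data = np.array([[1, 1, 0],
                 [1, 0, 0],
                 [1, 1, 1],
                 [1, 1, 0],
                 [0, 0, 0],
                 [1, 1, 0],
                 [1, 0, 0],
                 [1, 1, 0]])
print(cochrans_q(data))
```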
