期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Objective Bayesian analysis of counting experiments with correlated sources of background

Diego Casadei Cornelius Grunwald Florian Mentzel 《Journal of applied statistics》2018,45(4):649-667

Searches for faint signals in counting experiments are often encountered in particle physics and astrophysics, as well as in other fields. Many problems can be reduced to the case of a model with independent and Poisson-distributed signal and background. Often several background contributions are present at the same time, possibly correlated. We provide the analytic solution of the statistical inference problem of estimating the signal in the presence of multiple backgrounds, in the framework of objective Bayes statistics. The model can be written in the form of a product of a single Poisson distribution with a multinomial distribution. The first is related to the total number of events, whereas the latter describes the fraction of events coming from each individual source. Correlations among different backgrounds can be included in the inference problem by a suitable choice of the priors. 相似文献

2.

Large sample results for tests of association ii. multinomial and stratified sampling

Elena Kulinskaya 《统计学通讯:理论与方法》2013,42(5):1121-1150

This paper deals with the asymptotics of a class of tests for association in 2-way contingency tables based on square forms in cell frequencies, given the total number of observations (multinomial sampling) or one set of marginal totals (stratified sampling). The case when both row and column marginal totals are fixed (hypergeometric sampling) was studied in Kulinskaya (1994), The class of tests under consideration includes a number of classical measures for association, Its two subclasses are the tests based on statistics using centralized cell frequencies (asymptotically distributed as weighted sums of central chi-squares) and those using the non-centralized cell frequencies (asymptotically normal). The parameters of asymptotic distributions depend on the sampling model and on true marginal probabilities. Maximum efficiency for asymptotically normal statistics is achieved under hypergeometric sampling, If the cell frequencies or the statistic as a whole are centralized using marginal proportions as estimates for marginal probabilities, the asymptotic distribution does not differ much between models and it is equivalent to that under hypergeometric sampling. These findings give an extra justification for the use of permutation tests for association (which are based on hypergeometric sampling). As an application, several well known measures of association are analysed. 相似文献

3.

On square ordinal contingency tables: a comparison of social class and income mobility for the same individuals

D. R. Cox Michelle Jackson Shiwei Lu 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2009,172(2):483-493

Summary. Square contingency tables with matching ordinal rows and columns arise in particular as empirical transition matrices and the paper considers these in the context of social class and income mobility tables. Such tables relate the socio-economic position of parents to the socio-economic position of their child in adulthood. The level of association between parental and child socio-economic position is taken as a measure of mobility. Several approaches to analysis are described and illustrated by UK data in which interest focuses on comparisons of social class and income mobility tables that are derived from the same individuals. Account is taken of the use of the same individuals in the two tables. Additionally comparisons over time are considered. 相似文献

4.

Conditional statistical inference and quantification of relevance

Rolf Sundberg 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2003,65(1):299-315

Summary. We argue that it can be fruitful to take a predictive view on notions such as the precision of a point estimator and the confidence of an interval estimator in frequentist inference. This predictive approach has implications for conditional inference, because it immediately allows a quantification of the concept of relevance for conditional inference. Conditioning on an ancillary statistic makes inference more relevant in this sense, provided that the ancillary is a precision index. Not all ancillary statistics satisfy this demand. We discuss the problem of choice between alternative ancillary statistics. The approach also has implications for the best choice of variance estimator, taking account of correlations with the squared error of estimation itself. The theory is illustrated by numerous examples, many of which are classical. 相似文献

5.

Interpretation of Kappa and B statistics measures of agreement

SERGIO R. MUNOZ SHRIKANT I. BANGDIWALA 《Journal of applied statistics》1997,24(1):105-112

SUMMARY The Kappa statistic proposed by Cohen and the B statistic proposed by Bangdiwala are used to quantify the agreement between two observers, independently classifying the same n units into the same k categories. Both statistics correct for the agreement expected to result from chance alone, but the Kappa statistic is a measure that adjusts the observed proportion of agreement and ranges from- pc/(1- pc) to 1, where pc is the expected agreement that results from chance, and the B statistic is a measure that adjusts the observed area of agreement with that expected to result from chance, and ranges from 0 to 1. Statistical guidelines for the interpretation of either statistic are not available. For the Kappa statistic, the suggested arbitrary interpretation given by Landis and Koch is commonly quoted. This paper compares the behavior of the Kappa statistic and the B statistic in 3 3 and 4 4 contingency tables, under different agreement patterns. Based on simulation results, non-arbitrary guidelines for the interpretation of both statistics are provided. 相似文献

6.

Data augmentation in multi-way contingency tables with fixed marginal totals

《Journal of statistical planning and inference》2006,136(2):355-372

We describe and illustrate approaches to data augmentation in multi-way contingency tables for which partial information, in the form of subsets of marginal totals, is available. In such problems, interest lies in questions of inference about the parameters of models underlying the table together with imputation for the individual cell entries. We discuss questions of structure related to the implications for inference on cell counts arising from assumptions about log-linear model forms, and a class of simple and useful prior distributions on the parameters of log-linear models. We then discuss “local move” and “global move” Metropolis–Hastings simulation methods for exploring the posterior distributions for parameters and cell counts, focusing particularly on higher-dimensional problems. As a by-product, we note potential uses of the “global move” approach for inference about numbers of tables consistent with a prescribed subset of marginal counts. Illustration and comparison of MCMC approaches is given, and we conclude with discussion of areas for further developments and current open issues. 相似文献

7.

The effect of category choice on the odds ratio and several measures of association in case-control studies

Thomas W. O'Gorman Robert F. Woolson 《统计学通讯:理论与方法》2013,42(4):1157-1171

In many case-control studies the risk factors are categorized in order to clarify the analysis and presentation of the data. However, inconsistent categorization of continuous risk factors may make interpretation difficult. This paper attempts to evaluate the effect of the categorization procedure on the odds ratio and several measures of association. Often the risk factor is dichotomized and the data linking the risk factor and the disease is presented in a 2 x 2 table. We show that the odds ratio obtained from the 2x2 table is usually considerably larger than the comparable statistic that would have been obtained had a large number of outpoints been used. Also, if 2 x 2, 2 x 3, or 2 x 4 tables are obtained by using a few outpoints on the risk factor, the measures of association for these tables are usually greater than the measure that would have been obtained had a large number of cntpoints been used. We propose an odds ratio measure that more closely approximates the odds ratio between the continuous risk factor and disease. A corresponding measure of association is also proposed for 2 x 2, 2x3, and 2x4 tables. 相似文献

8.

Spectral density-based and measure-preserving ABC for partially observed diffusion processes. An illustration on Hamiltonian SDEs

Buckwar Evelyn Tamborrino Massimiliano Tubikanec Irene 《Statistics and Computing》2020,30(3):627-648

Approximate Bayesian computation (ABC) has become one of the major tools of likelihood-free statistical inference in complex mathematical models. Simultaneously, stochastic differential equations (SDEs) have developed to an established tool for modelling time-dependent, real-world phenomena with underlying random effects. When applying ABC to stochastic models, two major difficulties arise: First, the derivation of effective summary statistics and proper distances is particularly challenging, since simulations from the stochastic process under the same parameter configuration result in different trajectories. Second, exact simulation schemes to generate trajectories from the stochastic model are rarely available, requiring the derivation of suitable numerical methods for the synthetic data generation. To obtain summaries that are less sensitive to the intrinsic stochasticity of the model, we propose to build up the statistical method (e.g. the choice of the summary statistics) on the underlying structural properties of the model. Here, we focus on the existence of an invariant measure and we map the data to their estimated invariant density and invariant spectral density. Then, to ensure that these model properties are kept in the synthetic data generation, we adopt measure-preserving numerical splitting schemes. The derived property-based and measure-preserving ABC method is illustrated on the broad class of partially observed Hamiltonian type SDEs, both with simulated data and with real electroencephalography data. The derived summaries are particularly robust to the model simulation, and this fact, combined with the proposed reliable numerical scheme, yields accurate ABC inference. In contrast, the inference returned using standard numerical methods (Euler–Maruyama discretisation) fails. The proposed ingredients can be incorporated into any type of ABC algorithm and directly applied to all SDEs that are characterised by an invariant distribution and for which a measure-preserving numerical method can be derived.

相似文献

9.

Improved likelihood inference in beta regression

《Journal of Statistical Computation and Simulation》2012,82(4):431-443

We consider the issue of performing accurate small-sample likelihood-based inference in beta regression models, which are useful for modelling continuous proportions that are affected by independent variables. We derive small-sample adjustments to the likelihood ratio statistic in this class of models. The adjusted statistics can be easily implemented from standard statistical software. We present Monte Carlo simulations showing that inference based on the adjusted statistics we propose is much more reliable than that based on the usual likelihood ratio statistic. A real data example is presented. 相似文献

10.

Non-linear regression models for Approximate Bayesian Computation

Michael G. B. Blum Olivier François 《Statistics and Computing》2010,20(1):63-73

Approximate Bayesian inference on the basis of summary statistics is well-suited to complex problems for which the likelihood is either mathematically or computationally intractable. However the methods that use rejection suffer from the curse of dimensionality when the number of summary statistics is increased. Here we propose a machine-learning approach to the estimation of the posterior density by introducing two innovations. The new method fits a nonlinear conditional heteroscedastic regression of the parameter on the summary statistics, and then adaptively improves estimation using importance sampling. The new algorithm is compared to the state-of-the-art approximate Bayesian methods, and achieves considerable reduction of the computational burden in two examples of inference in statistical genetics and in a queueing model. 相似文献

11.

Certain nonparametric classification rules and their asymptotic efficiencies

Z. Govindarajulu A. K. Gupta 《Revue canadienne de statistique》1977,5(2):167-178

Two nonparametric classification rules for e-univariace populations are proposed, one in which the probability of correct classification is a specified number and the other in which one has to evaluate the probability of correct classification. In each case the classification is with respect to the Chernoff and Savage (1958) class of statistics, with possible specialization to populations having different location shifts and different changes of scale. An optimum property, namely the consistency of the classification procedure, is established for the second rule, when the distributions are either fixed or “near” in the Pitman sense and are tending to a common distribution at a specified rate. A measure of asymptotic efficiency is defined for the second rule and its asymptotic efficiency based on the Chernoff-Savage class of statistics relative to the parametric competitors ie the case of location shifts and scale changes is shown to be equal to the analogous Pitman efficiency. 相似文献

12.

Semiparametric Mixtures of Generalized Exponential Families

RICHARD CHARNIGO RAMANI S. PILLA 《Scandinavian Journal of Statistics》2007,34(3):535-551

Abstract. A semiparametric mixture model is characterized by a non-parametric mixing distribution Q (with respect to a parameter θ ) and a structural parameter β common to all components. Much of the literature on mixture models has focused on fixing β and estimating Q . However, this can lead to inconsistent estimation of both Q and the order of the model m . Creating a framework for consistent estimation remains an open problem and is the focus of this article. We formulate a class of generalized exponential family (GEF) models and establish sufficient conditions for the identifiability of finite mixtures formed from a GEF along with sufficient conditions for a nesting structure. Finite identifiability and nesting structure lead to the central result that semiparametric maximum likelihood estimation of Q and β fails. However, consistent estimation is possible if we restrict the class of mixing distributions and employ an information-theoretic approach. This article provides a foundation for inference in semiparametric mixture models, in which GEFs and their structural properties play an instrumental role. 相似文献

13.

Using information on realized effects to determine prospective causal effects

Marshall M. Joffe 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2001,63(4):759-774

The potential outcomes approach to causal inference postulates that each individual has a number of possibly latent outcomes, each of which would be observed under a different treatment. For any individual, some of these outcomes will be unobservable or counterfactual. Information about post-treatment characteristics sometimes allows statements about what would have happened if an individual or group with these characteristics had received a different treatment. These are statements about the realized effects of the treatment. Determining the likely effect of an intervention before making a decision involves inference about effects in populations defined only by characteristics observed before decisions about treatment are made. Information on realized effects can tighten bounds on these prospectively defined measures of the intervention effect. We derive formulae for the bounds and their sampling variances and illustrate these points with data from a hypothetical study of the efficacy of screening mammography. 相似文献

14.

Fast and Stable Algorithms for Computing and Sampling From the Noncentral Hypergeometric Distribution

《The American statistician》2013,67(4):366-369

Although the noncentral hypergeometric distribution underlies conditional inference for 2 × 2 tables, major statistical packages lack support for this distribution. This article introduces fast and stable algorithms for computing the noncentral hypergeometric distribution and for sampling from it. The algorithms avoid the expensive and explosive combinatorial numbers by using a recursive relation. The algorithms also take advantage of the sharp concentration of the distribution around its mode to save computing time. A modified inverse method substantially reduces the number of searches in generating a random deviate. The algorithms are implemented in a Java class, Hypergeometric, available on the World Wide Web. 相似文献

15.

Non-parametric predictive inference for future order statistics

Frank P. A. Coolen Tahani Coolen-Maturi Hana N. Alqifari 《统计学通讯:理论与方法》2018,47(10):2527-2548

This article presents non-parametric predictive inference for future order statistics. Given the data consisting of n real-valued observations, m future observations are considered and predictive probabilities are presented for the rth-ordered future observation. In addition, joint and conditional probabilities for events involving multiple future order statistics are presented. The article further presents the use of such predictive probabilities for order statistics in statistical inference, in particular considering pairwise and multiple comparisons based on two or more independent groups of data. 相似文献

16.

Extended tables for moments of gamma distribution order statistics

S. D. Walter L. W. Stitt 《统计学通讯:模拟与计算》2013,42(2):471-487

Apart from having intrinsic mathematical interest, order statistics are also useful in the solution of many applied sampling and analysis problems. For a general review of the properties and uses of order statistics, see David (1981). This paper provides tabulations of means and variances of certain order statistics from the gamma distribution, for parameter values not previously available. The work was motivated by a particular quota sampling problem, for which existing tables are not adequate. The solution to this sampling problem actually requires the moments of the highest order statistic within a given set; however the calculation algorithm used involves a recurrence relation, which causes all the lower order statistics to be calculated first. Therefore we took the opportunity to develop more extensive tables for the gamma order statistic moments in general. Our tables provide values for the order statistic moments which were not available in previous tables, notably those for higher values of m, the gamma distribution shape parameter. However we have also retained the corresponding statistics for lower values of m, first to allow for checking accuracy of the computtions agtainst previous tables, and second to provide an integrated presentation of our new results with the previously known values in a consistent format 相似文献

17.

A bayesian test for a two-way contingency table using independence priors

James H. Albert 《Revue canadienne de statistique》1990,18(4):347-363

One method of testing for independence in a two-way table is based on the Bayes factor, the ratio of the likelihoods under the independence hypothesis H and the alternative hypothesis H. The main difficulty in this approach is the specification of prior distributions on the composite hypotheses H and H. A new Bayesian test statistic is constructed by using a prior distribution on H that is concentrated about the “independence surface” H. Approximations are proposed which simplify the computation of the test statistic. The values of the Bayes factor are compared with values of statistics proposed by Gunel and Dickey (1974), Good and Crook (1987), and Spiegelhalter and Smith (1982) for a number of two-way tables. This investigation suggests a strong relationship between the new statistic and the p-value. 相似文献

18.

The Effects of Rounding on Likelihood Procedures

L. Pace A. Salvan L. Ventura 《Journal of applied statistics》2004,31(1):29-48

The aim of this paper is to investigate the robustness properties of likelihood inference with respect to rounding effects. Attention is focused on exponential families and on inference about a scalar parameter of interest, also in the presence of nuisance parameters. A summary value of the influence function of a given statistic, the local-shift sensitivity, is considered. It accounts for small fluctuations in the observations. The main result is that the local-shift sensitivity is bounded for the usual likelihood-based statistics, i.e. the directed likelihood, the Wald and score statistics. It is also bounded for the modified directed likelihood, which is a higher-order adjustment of the directed likelihood. The practical implication is that likelihood inference is expected to be robust with respect to rounding effects. Theoretical analysis is supplemented and confirmed by a number of Monte Carlo studies, performed to assess the coverage probabilities of confidence intervals based on likelihood procedures when data are rounded. In addition, simulations indicate that the directed likelihood is less sensitive to rounding effects than the Wald and score statistics. This provides another criterion for choosing among first-order equivalent likelihood procedures. The modified directed likelihood shows the same robustness as the directed likelihood, so that its gain in inferential accuracy does not come at the price of an increase in instability with respect to rounding. 相似文献

19.

Deprivation analysis based on Bayesian latent class models

Carla Machado Francisco Nunes 《Journal of applied statistics》2009,36(8):871-891

This article seeks to measure deprivation among Portuguese households, taking into account four well-being dimensions – housing, durable goods, economic strain and social relationships – with survey data from the European Community Household Panel. We propose a multi-stage approach to a cross-sectional analysis, side-stepping the sparse nature of the contingency tables caused by the large number of variables considered and bringing together partial and overall analyses of deprivation that are based on Bayesian latent class models via Markov Chain Monte Carlo methods. The outcomes demonstrate that there was a substantial improvement on household overall well-being between 1995 and 2001. The dimensions that most contributed to the risk of household deprivation were found to be economic strain and social relationships. 相似文献

20.

Catanova for multidimensional contingency tables: Nominal-scale response

Robert J. Anderson J. Richard Landis 《统计学通讯:理论与方法》2013,42(11):1191-1206

This paper extends an analysis of variance for categorical data (CATANOVA) procedure to multidimensional contingency tables involving several factors and a response variable measured on a nominal scale. Using an appropriate measure of total variation for multinomial data, partial and multiple association measures are developed as R² quantities which parallel the analogous statistics in multiple linear regression for quantitative data. In addition, test statistics are derived in terms of these R² criteria. Finally, this CATANOVA approach is illustrated within the context of 2 three-way contingency table from a multicenter clinicaltrial. 相似文献