1.

Bayesian testing for independence of two categorical variables under two-stage cluster sampling with covariates

Dilli Bhatta Balgobin Nandram Joseph Sedransk 《Journal of applied statistics》2018,45(13):2365-2393

We consider Bayesian testing for independence of two categorical variables with covariates for a two-stage cluster sample. This is a difficult problem because we have a complex sample (i.e. cluster sample), not a simple random sample. Our approach is to convert the cluster sample with covariates into an equivalent simple random sample without covariates, which provides a surrogate of the original sample. Then, this surrogate sample is used to compute the Bayes factor to make an inference about independence. We apply our methodology to the data from the Trends in International Mathematics and Science Study [30] for fourth grade US students to assess the association between the mathematics and science scores represented as categorical variables. We show that if there is strong association between two categorical variables, there is no significant difference between the tests with and without the covariates. We also performed a simulation study to further understand the effect of covariates in various situations. We found that for borderline cases (moderate association between the two categorical variables), there are noticeable differences in the test with and without covariates.
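
As a rough illustration of the final step only (a Bayes factor for independence computed from a simple random sample), the sketch below compares an independence model with a saturated multinomial model. The uniform Dirichlet priors and the function names are assumptions made for this example; the abstract does not specify the priors, and the surrogate-sample construction is not reproduced here.

```python
from math import lgamma


def log_multivariate_beta(alpha):
    """Log of the multivariate beta function B(alpha)."""
    return sum(lgamma(a) for a in alpha) - lgamma(sum(alpha))


def log_marginal(counts, prior=1.0):
    """Log marginal likelihood of multinomial counts under a symmetric
    Dirichlet(prior) prior. The multinomial coefficient is omitted because
    it cancels in the Bayes factor."""
    posterior = [c + prior for c in counts]
    return log_multivariate_beta(posterior) - log_multivariate_beta([prior] * len(counts))


def log_bayes_factor_independence(table, prior=1.0):
    """Log Bayes factor of independence versus the saturated model.
    Positive values favour independence, negative values favour association."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    cells = [x for row in table for x in row]
    # Under independence the likelihood factorises into row and column parts.
    return log_marginal(row_totals, prior) + log_marginal(col_totals, prior) - log_marginal(cells, prior)
```

For example, `log_bayes_factor_independence([[50, 5], [5, 50]])` is strongly negative (evidence against independence), while a balanced table such as `[[25, 25], [25, 25]]` gives a positive value.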

2.

Testing for Serial Independence: Beyond the Portmanteau Approach

Luca Bagnato Lucio De Capitani Antonio Punzo 《The American statistician》2018,72(3):219-238

Portmanteau tests are typically used to test serial independence even if, by construction, they are generally powerful only in the presence of pairwise dependence between lagged variables. In this article, we present a simple statistic defining a new serial independence test, which is able to detect more general forms of dependence. In particular, unlike the Portmanteau tests, the resulting test is also powerful under a dependent process characterized by pairwise independence. A diagram, based on p-values from the proposed test, is introduced to investigate serial dependence. Finally, the effectiveness of the proposal is evaluated in a simulation study and with an application on financial data. Both show that the new test, used in synergy with the existing ones, helps in the identification of the true data-generating process. Supplementary materials for this article are available online.
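
For readers unfamiliar with portmanteau tests, here is a minimal sketch of one common member of that family, the Ljung-Box statistic. It is chosen here as a representative example and is not named in the abstract. Because it is built only from lagged autocorrelations, it can only react to pairwise dependence, which is the limitation the abstract points to.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag (x must not be constant)."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    numer = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return numer / denom


def ljung_box(x, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_k r_k^2 / (n - k).
    Under serial independence Q is approximately chi-square with
    max_lag degrees of freedom."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorrelation(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )
```

A strictly alternating series such as `[1, -1] * 50` has lag-1 autocorrelation -0.99, so `ljung_box([1, -1] * 50, 1)` is about 100.98, far beyond any usual chi-square critical value.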

3.

Correlation-adjusted estimation of sensitivity and specificity of two diagnostic tests (total citations: 7; self-citations: 0; citations by others: 7)

Marios P. Georgiadis Wesley O. Johnson Ian A. Gardner Ramanpreet Singh 《Journal of the Royal Statistical Society. Series C, Applied statistics》2003,52(1):63-76

Summary. Models for multiple-test screening data generally require the assumption that the tests are independent conditional on disease state. This assumption may be unreasonable, especially when the biological basis of the tests is the same. We propose a model that allows for correlation between two diagnostic test results. Since models that incorporate test correlation involve more parameters than can be estimated with the available data, posterior inferences will depend more heavily on prior distributions, even with large sample sizes. If we have reasonably accurate information about one of the two screening tests (perhaps the standard currently used test) or the prevalences of the populations tested, accurate inferences about all the parameters, including the test correlation, are possible. We present a model for evaluating dependent diagnostic tests and analyse real and simulated data sets. Our analysis shows that, when the tests are correlated, a model that assumes conditional independence can perform very poorly. We recommend that, if the tests are only moderately accurate and measure the same biological responses, researchers use the dependence model for their analyses.

4.

PARTIAL CORRELATION AND CONDITIONAL CORRELATION AS MEASURES OF CONDITIONAL INDEPENDENCE (total citations: 1; self-citations: 0; citations by others: 1)

Kunihiro Baba Ritei Shibata Masaaki Sibuya 《Australian & New Zealand Journal of Statistics》2004,46(4):657-664

This paper investigates the roles of partial correlation and conditional correlation as measures of the conditional independence of two random variables. It first establishes a sufficient condition for the coincidence of the partial correlation with the conditional correlation. The condition is satisfied not only for multivariate normal but also for elliptical, multivariate hypergeometric, multivariate negative hypergeometric, multinomial and Dirichlet distributions. Such families of distributions are characterized by a semigroup property as a parametric family of distributions. A necessary and sufficient condition for the coincidence of the partial covariance with the conditional covariance is also derived. However, a known family of multivariate distributions which satisfies this condition cannot be found, except for the multivariate normal. The paper also shows that conditional independence has no close ties with zero partial correlation except in the case of the multivariate normal distribution; it has rather close ties to the zero conditional correlation. It shows that the equivalence between zero conditional covariance and conditional independence for normal variables is retained by any monotone transformation of each variable. The results suggest that care must be taken when using such correlations as measures of conditional independence unless the joint distribution is known to be normal. Otherwise a new concept of conditional independence may need to be introduced in place of conditional independence through zero conditional correlation or other statistics.
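
To make the first of the two quantities concrete, the sketch below computes the first-order partial correlation of X and Y given a single variable Z from the three pairwise correlations, using the standard textbook formula. The function name is an illustrative choice. The conditional correlation, by contrast, is a function of the value of Z and cannot be obtained from pairwise correlations alone, which is why the two can disagree outside the families listed above.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """Partial correlation of X and Y given Z:
    (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)).
    Requires |r_xz| < 1 and |r_yz| < 1."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

For instance, `partial_correlation(0.25, 0.5, 0.5)` is 0: all of the marginal correlation between X and Y is accounted for by Z. As the abstract stresses, such a zero implies conditional independence only in special cases like the multivariate normal.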

5.

A likelihood ratio test of quasi-independence for sparse two-way contingency tables

《Journal of Statistical Computation and Simulation》2012,82(2):284-304

We consider a likelihood ratio test of independence for large two-way contingency tables having both structural (non-random) and sampling (random) zeros in many cells. The solution of this problem is not available using standard likelihood ratio tests. One way to bypass this problem is to remove the structural zeros from the table and implement a test on the remaining cells which incorporates the randomness in the sampling zeros; the resulting test is a test of quasi-independence of the two categorical variables. This test is based only on the positive counts in the contingency table and is valid when there is at least one sampling (random) zero. The proposed (likelihood ratio) test is an alternative to the commonly used ad hoc procedures of converting the zero cells to positive ones by adding a small constant. One practical advantage of our procedure is that there is no need to know if a zero cell is a structural zero or a sampling zero. We model the positive counts using a truncated multinomial distribution. In fact, we have two truncated multinomial distributions; one for the null hypothesis of independence and the other for the unrestricted parameter space. We use Monte Carlo methods to obtain the maximum likelihood estimators of the parameters and also the p-value of our proposed test. To obtain the sampling distribution of the likelihood ratio test statistic, we use bootstrap methods. We discuss many examples, and also empirically compare the power function of the likelihood ratio test relative to those of some well-known test statistics.

6.

In Memoriam: Bernard George Greenberg 1919-1985

James R. Abernathy Pranab K. Sen 《The American statistician》2013,67(3):183-184

The bivariate normal density with unit variance and correlation ρ is well known. We show that by integrating out ρ, the result is a function of the maximum norm. The Bayesian interpretation of this result is that if we put a uniform prior over ρ, then the marginal bivariate density depends only on the maximal magnitude of the variables. The square-shaped isodensity contour of this resulting marginal bivariate density can also be regarded as the equally weighted mixture of bivariate normal distributions over all possible correlation coefficients. This density links to the Khintchine mixture method of generating random variables. We use this method to construct the higher dimensional generalizations of this distribution. We further show that for each dimension, there is a unique multivariate density that is a differentiable function of the maximum norm and is marginally normal, and the bivariate density from the integral over ρ is its special case in two dimensions.
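
The maximum-norm claim can be checked numerically. The sketch below integrates the standard bivariate normal density over ρ with a uniform prior on (-1, 1), using the substitution ρ = sin θ to remove the endpoint singularity. The closed form used for comparison, (1 - Φ(max(|x|, |y|))) / 2, is not quoted from the abstract; it is derived here from the identity that the derivative of the bivariate normal CDF with respect to ρ equals the density, and should be read as this example's own working.

```python
from math import pi, sin, cos, exp, erfc, sqrt


def mixture_density(x, y, steps=20000):
    """Bivariate normal density averaged over rho ~ Uniform(-1, 1).
    With rho = sin(theta), the integrand becomes
    exp(-(x^2 - 2 sin(theta) x y + y^2) / (2 cos^2(theta))) / (2 pi),
    integrated by the midpoint rule over theta in (-pi/2, pi/2)."""
    h = pi / steps
    total = 0.0
    for i in range(steps):
        theta = -pi / 2 + (i + 0.5) * h
        q = x * x - 2 * sin(theta) * x * y + y * y
        total += exp(-q / (2 * cos(theta) ** 2))
    # Factor 0.5 is the uniform prior density on (-1, 1).
    return 0.5 * total * h / (2 * pi)


def max_norm_closed_form(x, y):
    """(1 - Phi(m)) / 2 with m = max(|x|, |y|), written via erfc."""
    m = max(abs(x), abs(y))
    return 0.25 * erfc(m / sqrt(2))
```

Points with the same maximum norm, such as (1.0, 0.3) and (-0.7, 1.0), give the same value, which is what produces the square-shaped contours.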

7.

Nonlinear measures of association with kernel canonical correlation analysis and applications

Su-Yun Huang Mei-Hsien Lee Chuhsing Kate Hsiao 《Journal of statistical planning and inference》2009

Measures of association between two sets of random variables have long been of interest to statisticians. The classical canonical correlation analysis (LCCA) can characterize, but is also limited to, linear association. This article introduces a nonlinear and nonparametric kernel method for association study and proposes a new independence test for two sets of variables. This nonlinear kernel canonical correlation analysis (KCCA) can also be applied to the nonlinear discriminant analysis. Implementation issues are discussed. We place the implementation of KCCA in the framework of classical LCCA via a sequence of independent systems in the kernel associated Hilbert spaces. Such a placement provides an easy way to carry out the KCCA. Numerical experiments and comparison with other nonparametric methods are presented.

8.

The difference-sign runs length distribution in testing for serial independence

Camillo Cammarota 《Journal of applied statistics》2011,38(5):1033-1043

We investigate the sequence of difference-sign runs length of a time series in the context of non-parametric tests for serial independence. This sequence is, under suitable conditioning, a stationary sequence and we prove that the normalized correlation of two consecutive runs length is small (≈0.0427). We use this result in a test based on the relative entropy of the empirical distribution of the runs length. We investigate the performance of the test in simulated series and test serial independence of cardiac data series in atrial fibrillation.
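
The object being studied can be made concrete with a short sketch that extracts the lengths of consecutive runs of rising and falling steps from a series. Dropping tied consecutive values is an assumption of this example, not something stated in the abstract, and the relative-entropy test itself is not reproduced because the reference distribution is not given here.

```python
def difference_sign_run_lengths(series):
    """Lengths of maximal runs of equal difference signs.
    Ties (zero differences) are skipped, which is an assumption made here."""
    signs = [1 if b > a else -1 for a, b in zip(series, series[1:]) if b != a]
    lengths = []
    previous = None
    for s in signs:
        if s == previous:
            lengths[-1] += 1  # the current run continues
        else:
            lengths.append(1)  # a new run starts
            previous = s
    return lengths
```

For the series `[1, 3, 5, 4, 2, 6, 7, 8, 1]` the differences have signs `+ + - - + + + -`, so the function returns `[2, 2, 3, 1]`.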

9.

Global sensitivity analysis with dependence measures

Sebastien Da Veiga 《Journal of Statistical Computation and Simulation》2015,85(7):1283-1305

Global sensitivity analysis with variance-based measures suffers from several theoretical and practical limitations, since they focus only on the variance of the output and handle multivariate variables in a limited way. In this paper, we introduce a new class of sensitivity indices based on dependence measures which overcomes these insufficiencies. Our approach originates from the idea to compare the output distribution with its conditional counterpart when one of the input variables is fixed. We establish that this comparison yields previously proposed indices when it is performed with Csiszár f-divergences, as well as sensitivity indices which are well-known dependence measures between random variables. This leads us to investigate completely new sensitivity indices based on recent state-of-the-art dependence measures, such as distance correlation and the Hilbert–Schmidt independence criterion. We also emphasize the potential of feature selection techniques relying on such dependence measures as alternatives to screening in high dimension.

10.

Kernel partial correlation: a novel approach to capturing conditional independence in graphical models for noisy data

Jihwan Oh Faye Zheng R. W. Doerge 《Journal of applied statistics》2018,45(14):2677-2696

Graphical models capture the conditional independence structure among random variables via the existence of edges among vertices. One way of inferring a graph is to identify zero partial correlation coefficients, which is an effective way of finding conditional independence under a multivariate Gaussian setting. For more general settings, we propose kernel partial correlation, which extends partial correlation with a combination of two kernel methods. First, a nonparametric function estimation is employed to remove effects from other variables, and then the dependence between remaining random components is assessed through a nonparametric association measure. The proposed approach is not only flexible but also robust under high levels of noise owing to the robustness of the nonparametric approaches.

11.

Efficient estimation and model selection in large graphical models

Dag Wedelin 《Statistics and Computing》1996,6(4):313-323

We develop a computationally efficient method to determine the interaction structure in a multidimensional binary sample. We use an interaction model based on orthogonal functions, and give a result on independence properties in this model. Using this result we develop an efficient approximation algorithm for estimating the parameters in a given undirected model. To find the best model, we use a heuristic search algorithm in which the structure is determined incrementally. We also give an algorithm for reconstructing the causal directions, if such exist. We demonstrate that together these algorithms are capable of discovering almost all of the true structure for a problem with 121 variables, including many of the directions.

12.

Nonparametric Testing for Asymmetric Information

Liangjun Su Martin Spindler 《Journal of Business & Economic Statistics》2013,31(2):208-225

Asymmetric information is an important phenomenon in many markets and in particular in insurance markets. Testing for asymmetric information has become a very important issue in the literature in the last two decades. Almost all testing procedures that are used in empirical studies are parametric, which may yield misleading conclusions in the case of misspecification of either functional or distributional relationships among the variables of interest. Motivated by the literature on testing conditional independence, we propose a new nonparametric test for asymmetric information, which is applicable in a variety of situations. We demonstrate that the test works reasonably well through Monte Carlo simulations and apply it to an automobile insurance dataset and a long-term care insurance (LTCI) dataset. Our empirical results consolidate Chiappori and Salanié’s findings that there is no evidence for the presence of asymmetric information in the French automobile insurance market. While Finkelstein and McGarry found no positive correlation between risk and coverage in the LTCI market in the United States, our test detects asymmetric information using only the information that is available to the insurance company, and our investigation of the source of asymmetric information suggests some sort of asymmetric information that is related to risk preferences as opposed to risk types and thus lends support to Finkelstein and McGarry.

13.

A test for the complete independence of high-dimensional random vectors

《Journal of Statistical Computation and Simulation》2012,82(16):3135-3140

This paper discusses the problem of testing the complete independence of random variables when the dimension of observations can be much larger than the sample size. It is reported that two typical tests based on, respectively, the biggest off-diagonal entry and the largest eigenvalue of the sample correlation matrix lose their control of type I error in such high-dimensional scenarios, and exhibit distinct behaviours in type II error under different types of alternative hypothesis. Given these facts, we propose a permutation test procedure by synthesizing these two extreme statistics. Simulation results show that for finite dimension and sample size the proposed test outperforms the existing methods in various cases.

14.

Taux de résistance des tests de rang d'indépendance

Philippe Capra Ana Isabel Garralda Guillem 《Revue canadienne de statistique》1997,25(1):113-124

The resistance of tests to acceptance and rejection of null hypotheses was defined and studied by Ylvisaker in the context of one-sample problems. This notion provides a measure of a test's resistance to outliers. In this paper, we propose an extension of this notion to rank-based tests of independence for bivariate random variables. We show, among other things, that Kendall's test of independence is more resistant than Spearman's test.

15.

Score test of homogeneity for survival data (total citations: 3; self-citations: 0; citations by others: 3)

D. Commenges P. K. Andersen 《Lifetime data analysis》1995,1(2):145-156

If follow-up is made for subjects which are grouped into units, such as familial or spatial units, then it may be interesting to test whether the groups are homogeneous (or independent for given explanatory variables). The effect of the groups is modelled as random and we consider a frailty proportional hazards model which allows adjustment for explanatory variables. We derive the score test of homogeneity from the marginal partial likelihood and it turns out to be the sum of a pairwise correlation term of martingale residuals and an overdispersion term. In the particular case where the sizes of the groups are equal to one, this statistic can be used for testing overdispersion. The asymptotic variance of this statistic is derived using counting process arguments. An extension to the case of several strata is given. The resulting test is computationally simple; its use is illustrated using both simulated and real data. In addition a decomposition of the score statistic is proposed as a sum of a pairwise correlation term and an overdispersion term. The pairwise correlation term can be used for constructing a statistic more robust to departure from the proportional hazard model, and the overdispersion term for constructing a test of fit of the proportional hazard model.

16.

On testing for independence between the innovations of several time series

Pierre Duchesne Kilani Ghoudi Bruno Rémillard 《Revue canadienne de statistique》2012,40(3):447-479

Test statistics for checking the independence between the innovations of several time series are developed. The time series models considered allow for general specifications for the conditional mean and variance functions that could depend on common explanatory variables. In testing for independence between more than two time series, checking pairwise independence does not lead to consistent procedures. Thus a finite family of empirical processes relying on multivariate lagged residuals is constructed, and we derive their asymptotic distributions. In order to obtain simple asymptotic covariance structures, Möbius transformations of the empirical processes are studied, and simplifications occur. Under the null hypothesis of independence, we show that these transformed processes are asymptotically Gaussian, independent, and with tractable covariance functions not depending on the estimated parameters. Various procedures are discussed, including Cramér–von Mises test statistics and tests based on non‐parametric measures. The ranks of the residuals are considered in the new methods, giving test statistics which are asymptotically margin‐free. Generalized cross‐correlations are introduced, extending the concept of cross‐correlation to an arbitrary number of time series; portmanteau procedures based on them are discussed. In order to detect the dependence visually, graphical devices are proposed. Simulations are conducted to explore the finite sample properties of the methodology, which is found to be powerful against various types of alternatives when the independence is tested between two and three time series. An application is considered, using the daily log‐returns of Apple, Intel and Hewlett‐Packard traded on the Nasdaq financial market.

17.

An approximated distribution of the Gini's rank association coefficient

G. Landenna A. Scagni M. Boldrini 《Communications in Statistics - Theory and Methods》2013,42(6):2017-2026

An approximate distribution is proposed for Gini's rank association coefficient g which is, like Kendall's and Spearman's rank correlation coefficients, a statistic to test independence between two random variables. The proposed distribution can be simply transformed into a Student's t distribution, so hypothesis testing is made much easier.
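
For orientation, the sketch below computes Gini's coefficient g from two samples using the usual rank-based definition, g = Σ(|p_i + q_i - n - 1| - |p_i - q_i|) / ⌊n²/2⌋, where p_i and q_i are the ranks. This definition is assumed from the general literature rather than taken from the abstract, ties are assumed absent, and the transformation to a Student's t distribution is not shown because the abstract does not give it.

```python
def ranks(values):
    """Ranks 1..n of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def gini_rank_association(x, y):
    """Gini's rank association coefficient g, ranging from -1 to 1."""
    n = len(x)
    p, q = ranks(x), ranks(y)
    numerator = sum(abs(a + b - n - 1) - abs(a - b) for a, b in zip(p, q))
    return numerator / (n * n // 2)
```

Identical orderings give g = 1 and exactly reversed orderings give g = -1, so values near 0 are what one expects under independence.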

18.

The modified permutation entropy-based independence test of time series

Emad Ashtari Nezhad G. R. Mohtashami Borzadaran H. R. Nilli Sani Hadi Alizadeh Noughabi 《Communications in Statistics - Simulation and Computation》2013,42(10):2877-2897

In time series, it is essential to check the independence of data by means of a proper method or an appropriate statistical test before any further analysis. Among different independence tests, a powerful and productive test has been introduced by Matilla-García and Marín via an m-dimensional vectorial process, in which the value of the process at time t includes m-histories of the primary process. However, this method causes a dependency among the vectors even when the random variables are assumed independent. Considering this dependency, a modified test is obtained in this article by presenting a new asymptotic distribution based on weighted chi-square random variables. Some other alterations to the test have also been made via the bootstrap method and by controlling the overlap. Compared with the primary test, the modified test is found to be not only more accurate but also more powerful.
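
The construction can be sketched as follows: each window of m consecutive values is mapped to its ordinal pattern, and the entropy of the pattern frequencies is compared with its maximum, ln(m!). The statistic 2R(ln(m!) - h), with R the number of windows, is this example's recollection of the original permutation-entropy test and should be treated as an assumption. Ties are broken by position. Note that consecutive windows share m - 1 values, which is the source of the dependency the abstract describes; the modified weighted chi-square reference distribution is not reproduced.

```python
from collections import Counter
from math import factorial, log


def ordinal_pattern(window):
    """Indices of the window sorted by value (ties broken by position)."""
    return tuple(sorted(range(len(window)), key=lambda i: window[i]))


def permutation_entropy(series, m):
    """Shannon entropy (natural log) of the ordinal patterns of order m."""
    patterns = [ordinal_pattern(series[t:t + m]) for t in range(len(series) - m + 1)]
    total = len(patterns)
    return -sum(c / total * log(c / total) for c in Counter(patterns).values())


def permutation_entropy_statistic(series, m):
    """2 R (ln(m!) - h): zero when all m! patterns are equally frequent,
    large when a few patterns dominate (assumed form of the statistic)."""
    windows = len(series) - m + 1
    return 2 * windows * (log(factorial(m)) - permutation_entropy(series, m))
```

A monotone series has a single ordinal pattern, so its permutation entropy is 0 and the statistic takes its largest possible value for that length.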

19.

A Projection-Based Nonparametric Test of Conditional Quantile Independence

《Econometric Reviews》2012,31(1):1-26

This paper proposes a nonparametric procedure for testing conditional quantile independence using projections. Relative to existing smoothed nonparametric tests, the resulting test statistic: (i) detects the high frequency local alternatives that converge to the null hypothesis in probability at a faster rate and (ii) yields improvements in the finite sample power when a large number of variables are included under the alternative. In addition, it allows the researcher to include qualitative information and, if desired, direct the test against specific subsets of alternatives without imposing any functional form on them. We use the weighted Nadaraya-Watson (WNW) estimator of the conditional quantile function, avoiding the boundary problems in estimation and testing, and prove weak uniform consistency (with rate) of the WNW estimator for absolutely regular processes. The procedure is applied to a study of risk spillovers among the banks. We show that the methodology generalizes some of the recently proposed measures of systemic risk and we use the quantile framework to assess the intensity of risk spillovers among individual financial institutions.

20.

Test of Association Between Two Ordinal Variables While Adjusting for Covariates

Li C Shepherd BE 《Journal of the American Statistical Association》2010,105(490):612-620

We propose a new set of test statistics to examine the association between two ordinal categorical variables X and Y after adjusting for continuous and/or categorical covariates Z. Our approach first fits multinomial (e.g., proportional odds) models of X and Y, separately, on Z. For each subject, we then compute the conditional distributions of X and Y given Z. If there is no relationship between X and Y after adjusting for Z, then these conditional distributions will be independent, and the observed value of (X, Y) for a subject is expected to follow the product distribution of these conditional distributions. We consider two simple ways of testing the null of conditional independence, both of which treat X and Y equally, in the sense that they do not require specifying an outcome and a predictor variable. The first approach adds these product distributions across all subjects to obtain the expected distribution of (X, Y) under the null and then contrasts it with the observed unconditional distribution of (X, Y). Our second approach computes "residuals" from the two multinomial models and then tests for correlation between these residuals; we define a new individual-level residual for models with ordinal outcomes. We present methods for computing p-values using either the empirical or asymptotic distributions of our test statistics. Through simulations, we demonstrate that our test statistics perform well in terms of power and Type I error rate when compared to proportional odds models which treat X as either a continuous or categorical predictor. We apply our methods to data from a study of visual impairment in children and to a study of cervical abnormalities in human immunodeficiency virus (HIV)-infected women. Supplemental materials for the article are available online.
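
The first approach can be sketched in a few lines. The example assumes the per-subject conditional distributions P(X | Z_i) and P(Y | Z_i) have already been obtained from fitted models; the fitting step, the contrast with the observed table, and the residual-based second approach are not shown. The function and argument names are illustrative.

```python
def expected_joint_under_null(px_given_z, py_given_z):
    """Sum over subjects of the outer product P(X | Z_i) x P(Y | Z_i).
    px_given_z[i][a] is the fitted probability that subject i has X = a,
    and py_given_z[i][b] the fitted probability that subject i has Y = b.
    Returns the expected counts of (X, Y) under conditional independence."""
    n_x, n_y = len(px_given_z[0]), len(py_given_z[0])
    table = [[0.0] * n_y for _ in range(n_x)]
    for px, py in zip(px_given_z, py_given_z):
        for a in range(n_x):
            for b in range(n_y):
                table[a][b] += px[a] * py[b]
    return table
```

With two subjects, `px_given_z = [[0.5, 0.5], [1.0, 0.0]]` and `py_given_z = [[0.2, 0.8], [0.5, 0.5]]`, the expected table is approximately `[[0.6, 0.9], [0.1, 0.4]]`, whose entries sum to the number of subjects.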