Similar articles
Found 20 similar articles (search time: 31 ms)
1.
We present a test for detecting 'multivariate structure' in data sets. This procedure consists of transforming the data to remove the correlations, then discretizing the data and, finally, studying the cell counts in the resulting contingency table. A formal test can be performed using the usual chi-squared test statistic. We give the limiting distribution of the chi-squared statistic and also present simulation results to examine the accuracy of this limiting distribution in finite samples. Several examples show that our procedure can detect a variety of different types of structure. Our examples include data with clustering, digitized speech data, and residuals from a fitted time series model. The chi-squared statistic can also be used as a test for multivariate normality.
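The three steps named in this abstract (decorrelate, discretize, count cells) can be illustrated with a minimal sketch for bivariate data. Everything specific here is an assumption made for illustration and not the authors' procedure: the function names `whiten_2d` and `cell_count_statistic`, Cholesky whitening as the decorrelating transform, standard normal quantiles as cut points, and equal expected counts in every cell. The sketch returns only a Pearson-type statistic; the limiting distribution derived in the paper is not reproduced.

```python
import bisect
import math
from statistics import NormalDist


def whiten_2d(points):
    """Centre 2-D points and remove their correlation (Cholesky whitening)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # Cholesky factor of the sample covariance: [[a, 0], [b, c]]
    a = math.sqrt(sxx)
    b = sxy / a
    c = math.sqrt(syy - b * b)
    # Forward substitution gives coordinates with identity sample covariance
    return [((x - mx) / a, ((y - my) - b * (x - mx) / a) / c) for x, y in points]


def cell_count_statistic(points, k=3):
    """Return (statistic, k-by-k table of cell counts) for whitened, binned data."""
    z = whiten_2d(points)
    # Assumption: cut each coordinate at standard normal quantiles i/k
    cuts = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    table = [[0] * k for _ in range(k)]
    for z1, z2 in z:
        table[bisect.bisect_right(cuts, z1)][bisect.bisect_right(cuts, z2)] += 1
    # Assumption: with no structure, every cell has the same expected count
    expected = len(points) / (k * k)
    stat = sum((obs - expected) ** 2 / expected for row in table for obs in row)
    return stat, table
```

Large values of the statistic indicate cell counts that are far from uniform, which is the kind of departure the abstract refers to as structure.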

2.
Summary. We propose an approach for assessing the risk of individual identification in the release of categorical data. This requires the accurate calculation of predictive probabilities for those cells in a contingency table which have small sample frequencies, making the problem somewhat different from usual contingency table estimation, where interest is generally focused on regions of high probability. Our approach is Bayesian and provides posterior predictive probabilities of identification risk. By incorporating model uncertainty in our analysis, we can provide more realistic estimates of disclosure risk for individual cell counts than are provided by methods which ignore the multivariate structure of the data set.

3.
We consider a Bayesian approach to the study of independence in a two-way contingency table which has been obtained from a two-stage cluster sampling design. If a procedure based on single-stage simple random sampling (rather than the appropriate cluster sampling) is used to test for independence, the p-value may be too small, resulting in a conclusion that the null hypothesis is false when it is, in fact, true. For many large complex surveys the Rao–Scott corrections to the standard chi-squared (or likelihood ratio) statistic provide appropriate inference. For smaller surveys, though, the Rao–Scott corrections may not be accurate, partly because the chi-squared test is inaccurate. In this paper, we use a hierarchical Bayesian model to convert the observed cluster samples to simple random samples. This provides surrogate samples which can be used to derive the distribution of the Bayes factor. We demonstrate the utility of our procedure using an example and also provide a simulation study which establishes our methodology as a viable alternative to the Rao–Scott approximations for relatively small two-stage cluster samples. We also show the additional insight gained by displaying the distribution of the Bayes factor rather than simply relying on a summary of the distribution.

4.
This paper extends the classical methods of analysis of a two-way contingency table to the fuzzy environment for two cases: (1) when the available sample of observations is reported as imprecise data, and (2) the case in which we prefer to categorize the variables based on linguistic terms rather than as crisp quantities. For this purpose, the α-cuts approach is used to extend the usual concepts of the test statistic and p-value to the fuzzy test statistic and fuzzy p-value. In addition, some measures of association are extended to the fuzzy version in order to evaluate the dependence in such contingency tables. Some practical examples are provided to explain the applicability of the proposed methods in real-world problems.

5.
We consider a likelihood ratio test of independence for large two-way contingency tables having both structural (non-random) and sampling (random) zeros in many cells. The solution of this problem is not available using standard likelihood ratio tests. One way to bypass this problem is to remove the structural zeros from the table and implement a test on the remaining cells that incorporates the randomness in the sampling zeros; the resulting test is a test of quasi-independence of the two categorical variables. This test is based only on the positive counts in the contingency table and is valid when there is at least one sampling (random) zero. The proposed (likelihood ratio) test is an alternative to the commonly used ad hoc procedures of converting the zero cells to positive ones by adding a small constant. One practical advantage of our procedure is that there is no need to know whether a zero cell is a structural zero or a sampling zero. We model the positive counts using a truncated multinomial distribution. In fact, we have two truncated multinomial distributions: one for the null hypothesis of independence and the other for the unrestricted parameter space. We use Monte Carlo methods to obtain the maximum likelihood estimators of the parameters and also the p-value of our proposed test. To obtain the sampling distribution of the likelihood ratio test statistic, we use bootstrap methods. We discuss many examples, and also empirically compare the power function of the likelihood ratio test with those of some well-known test statistics.

6.
7.
We investigated the empirical likelihood inference approach under a general class of semiparametric hazards regression models with survival data subject to right-censoring. An empirical likelihood ratio for the full 2p regression parameters involved in the model is obtained. We showed that it converged weakly to a random variable which could be written as a weighted sum of 2p independent chi-squared variables with one degree of freedom. Using this, we could construct a confidence region for parameters. We also suggested an adjusted version for the preceding statistic, whose limit followed a standard chi-squared distribution with 2p degrees of freedom.

8.
Square contingency tables with the same row and column classification occur frequently in a wide range of statistical applications, e.g. whenever the members of a matched pair are classified on the same scale, which is usually ordinal. Such tables are analysed by choosing an appropriate loglinear model. We focus on the models of symmetry, triangular, diagonal and ordinal quasi symmetry. The fit of a specific model is tested by the chi-squared test or the likelihood-ratio test, where p-values are calculated from the asymptotic chi-square distribution of the test statistic or, if this seems unjustified, from the exact conditional distribution. Since the calculation of exact p-values is often not feasible, we propose alternatives based on algebraic statistics combined with MCMC methods.
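For the simplest of the models listed, complete symmetry, the chi-squared fit statistic has a closed form. Under symmetry the fitted count for cells (i, j) and (j, i) is the average of the two observed counts, and the Pearson statistic reduces to the sum over pairs i < j of (n_ij − n_ji)² / (n_ij + n_ji), with k(k − 1)/2 degrees of freedom for a k×k table. The sketch below covers only this asymptotic statistic; the function name `symmetry_chi_squared` is hypothetical, and the exact and MCMC-based p-values proposed in the paper are not attempted.

```python
def symmetry_chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for the symmetry model."""
    k = len(table)
    stat = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            total = table[i][j] + table[j][i]
            # Assumption: off-diagonal pairs with no observations add nothing
            if total > 0:
                stat += (table[i][j] - table[j][i]) ** 2 / total
    df = k * (k - 1) // 2
    return stat, df
```

Diagonal cells do not enter the statistic, because the symmetry model fits them exactly.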

9.
The multivariate maximum squared-radii (MMSR) statistic is commonly used to detect multivariate outliers. We characterize the general form of the nonnegative-definite observation covariance structure for which the distribution of the MMSR statistic is the same as the distribution resulting from the usual independence covariance structure. Thus, we extend the work of Young, Seaman, and Meaux (1992), who have characterized the general form of the positive-definite independence-distribution-preserving (IDP) dependency structure for the MMSR statistic. We also improve upon the results of Young et al. (1992) in that we give a more complete and simple proof of the characterization of the general positive-definite IDP covariance structure for the MMSR statistic.

10.
This paper presents a generalization of the partition of the chi-squared statistic presented in Beh & Davy (1998). For a three-way contingency table with one or two sets of ordered categories, the chi-squared statistic partition is defined using orthogonal polynomials. Using this partition, information about the relationship between the variables can be obtained by identifying important associations in terms of the location (linear), dispersion (quadratic) and higher order components. The paper compares these partitions with log-linear models for ordinal data.

11.
Under the hypothesis of independence, the chi-squared test statistic for independence in a two-way contingency table asymptotically follows a chi-squared distribution under both the multinomial and the product-multinomial models. Alalouf (1987) showed that the same result holds for the third case, where both margins are fixed. In this paper an intuitively easier proof using conditional limit theorems is suggested and some points are discussed.
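As a reminder of the statistic under discussion, the following minimal sketch computes the Pearson chi-squared statistic for independence in a two-way table, with expected counts (row total × column total) / n and (r − 1)(c − 1) degrees of freedom. The function name `pearson_independence` is hypothetical, and the sketch assumes every row and column total is positive. The statistic is computed the same way under all three sampling schemes mentioned in the abstract.

```python
def pearson_independence(table):
    """Pearson chi-squared statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df
```

For the table [[10, 20], [20, 10]] every expected count is 15, so the statistic is 4 × 25/15 = 20/3 on 1 degree of freedom.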

12.
Taguchi's statistic has long been known to be a more appropriate measure of association for ordinal variables than the Pearson chi-squared statistic. Therefore, there is some advantage in using Taguchi's statistic in the correspondence analysis context when a two-way contingency table contains at least one ordinal categorical variable. The aim of this paper, considering the contingency table with two ordinal categorical variables, is to show a decomposition of Taguchi's index into linear, quadratic and higher-order components. This decomposition has been developed using Emerson's orthogonal polynomials. Moreover, two case studies are analyzed to illustrate the methodology.

13.
We propose a Bayesian computation and inference method for the Pearson-type chi-squared goodness-of-fit test with right-censored survival data. Our test statistic is derived from the classical Pearson chi-squared test using the differences between the observed and expected counts in the partitioned bins. In the Bayesian paradigm, we generate posterior samples of the model parameter using the Markov chain Monte Carlo procedure. By replacing the maximum likelihood estimator in the quadratic form with a random observation from the posterior distribution of the model parameter, we can easily construct a chi-squared test statistic. The degrees of freedom of the test equal the number of bins and are thus independent of the dimensionality of the underlying parameter vector. The test statistic recovers the conventional Pearson-type chi-squared structure. Moreover, the proposed algorithm circumvents the burden of evaluating the Fisher information matrix, its inverse and the rank of the variance–covariance matrix. We examine the proposed model diagnostic method using simulation studies and illustrate it with a real data set from a prostate cancer study.

14.
Taguchi's statistic has long been known to be a more appropriate measure of association for ordinal variables than the Pearson chi-squared statistic. Therefore, there is some advantage in using Taguchi's statistic for performing correspondence analysis when a two-way contingency table consists of one ordinal categorical variable. This article will explore the development of correspondence analysis using a decomposition of Taguchi's statistic.

15.
In this paper, we obtain an adjusted version of the likelihood ratio (LR) test for errors-in-variables multivariate linear regression models. The error terms are allowed to follow a multivariate distribution in the class of the elliptical distributions, which has the multivariate normal distribution as a special case. We derive a modified LR statistic that follows a chi-squared distribution with a high degree of accuracy. Our results generalize those in Melo and Ferrari (Advances in Statistical Analysis, 2010, 94, pp. 75–87) by allowing the parameter of interest to be vector-valued in the multivariate errors-in-variables model. We report a simulation study which shows that the proposed test displays superior finite sample behavior relative to the standard LR test.

16.
A limiting distribution of the likelihood ratio statistic for the test of the equality of the q smallest eigenvalues of a covariance matrix is obtained. This distribution can be used as an alternative to the chi-squared distribution which is usually used with this test. It is shown that this new method yields reasonable significance levels for those situations in which the chi-squared approximation is inadequate.

17.
Consider the problem of estimating the coverage function of a usual confidence interval for a randomly chosen linear combination of the elements of the mean vector of a p-dimensional normal distribution. The usual constant coverage probability estimator is shown to be admissible under the ancillary statistic everywhere-valid constraint. Note that this estimator is not admissible in the usual sense if p⩾5. Since admissibility under the ancillary statistic everywhere-valid constraint is a reasonable criterion, the common acceptance of the constant coverage probability estimator is justified.

18.
In the mid-1950s S.N. Roy and his students contributed two landmark articles to the contingency table literature [Roy, S.N., Kastenbaum, M.A., 1956. On the hypothesis of no “interaction” in a multiway contingency table. Ann. Math. Statist. 27, 749–757; Roy, S.N., Mitra, S.K., 1956. An introduction to some nonparametric generalizations of analysis of variance and multivariate analysis. Biometrika 43, 361–376]. The first article generalized concepts of interaction from 2×2×2 contingency tables to three-way tables of arbitrary size and to larger tables. In the second article, which is the source of our primary focus, various notions of independence were clarified for three-way contingency tables, Roy's union–intersection test was applied to construct chi-squared tests of hypotheses about the structure of such tables, and the chi-squared statistics were shown not to depend on the distinction between response and explanatory variables. This work pre-dates by many years later developments that expressed such results in the context of loglinear models. It pre-dates by a quarter century the development of graphical models. We summarize the main results in these key articles and discuss the connection between them and the later developments of loglinear modeling and of graphical modeling. We also mention ways in which these later developments have themselves been further generalized.

19.
We propose a multivariate extension of the univariate chi-squared normality test. Using a known result for the distribution of quadratic forms in normal variables, we show that the proposed test statistic has an approximate chi-squared distribution under the null hypothesis of multivariate normality. As in the univariate case, the new test statistic is based on a comparison of observed and expected frequencies for specified events in sample space. In the univariate case, these events are the standard class intervals, but in the multivariate extension we propose these become hyper-ellipsoidal annuli in multivariate sample space. We assess the performance of the new test using Monte Carlo simulation. Keeping the type I error rate fixed, we show that the new test has power that compares favourably with other standard normality tests, though no uniformly most powerful test has been found. We recommend the new test due to its competitive advantages.
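The idea of counting observations in ellipsoidal annuli can be sketched for the bivariate case, where the annuli are elliptical rings defined by the squared Mahalanobis distance from the sample mean. This is a hedged illustration and not the authors' construction: the function name `annuli_chi_squared`, the choice of k equal-probability rings, and the use of the chi-squared(2) quantiles −2 ln(1 − i/k) as ring boundaries are all assumptions. With an estimated mean and covariance the squared distances are only approximately chi-squared(2), and the reference distribution derived in the paper is not reproduced.

```python
import bisect
import math


def annuli_chi_squared(points, k=4):
    """Return (statistic, ring counts) for k equal-probability elliptical rings."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    # The chi-squared(2) CDF is 1 - exp(-d/2), so its i/k quantile is -2 ln(1 - i/k)
    bounds = [-2.0 * math.log(1.0 - i / k) for i in range(1, k)]
    counts = [0] * k
    for x, y in points:
        dx, dy = x - mx, y - my
        # Squared Mahalanobis distance using the closed-form 2x2 inverse
        d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        counts[bisect.bisect_right(bounds, d2)] += 1
    # Assumption: under bivariate normality each ring expects n / k points
    expected = n / k
    stat = sum((obs - expected) ** 2 / expected for obs in counts)
    return stat, counts
```

Ring counts that differ sharply from n/k give a large statistic, suggesting a departure from bivariate normality.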

20.
This article considers statistical inference for partially linear varying-coefficient models when the responses are missing at random. We propose a profile least-squares estimator for the parametric component with complete-case data and show that the resulting estimator is asymptotically normal. To avoid estimating the asymptotic covariance when establishing a confidence region for the parametric component with the normal-approximation method, we define an empirical-likelihood-based statistic and show that its limiting distribution is a chi-squared distribution. Then, confidence regions for the parametric component with asymptotically correct coverage probabilities can be constructed from this result. To check the validity of the linear constraints on the parametric component, we construct a modified generalized likelihood ratio test statistic and demonstrate that it asymptotically follows a chi-squared distribution under the null hypothesis. Then, we extend the generalized likelihood ratio technique to the context of missing data. Finally, some simulations are conducted to illustrate the proposed methods.
