Similar articles (20 found)
1.
In this paper, we consider the auto-odds ratio function (AORF) as a measure of serial association for a stationary time series process of categorical data at two different time points. Numerical measures such as the autocorrelation function (ACF) have no meaningful interpretation unless the time series data are numerical. Instead, we use the AORF as a measure of association to study the serial dependency of the categorical time series for both ordinal and nominal categories. Biswas and Song [Discrete-valued ARMA processes. Stat Probab Lett. 2009;79(17):1884–1889] provided some results on this measure for Pegram's operator-based AR(1) process with binary responses. Here, we extend this measure to more general set-ups, i.e. to AR(p) and MA(q) processes and to a general number of categories. We discuss how this method can effectively be used in parameter estimation and model selection. Following Weiß [Empirical measures of signed serial dependence in categorical time series. J Stat Comput Simul. 2011;81(4):411–429], we derive the large sample distribution of the estimator of the AORF under the independent and identically distributed (iid) set-up. Some simulation results and two categorical data examples (one ordinal and the other nominal) are presented to illustrate the proposed method.

2.
In this paper, we first propose a new estimator of entropy for continuous random variables. Our estimator is obtained by correcting the coefficients of Vasicek's [A test for normality based on sample entropy, J. R. Statist. Soc. Ser. B 38 (1976), pp. 54–59] entropy estimator. We prove the consistency of our estimator. Monte Carlo studies show that our estimator is better than the entropy estimators proposed by Vasicek, Ebrahimi et al. [Two measures of sample entropy, Stat. Probab. Lett. 20 (1994), pp. 225–234] and Correa [A new estimator of entropy, Commun. Stat. Theory Methods 24 (1995), pp. 2439–2449] in terms of root mean square error. We then derive the non-parametric distribution function corresponding to our proposed entropy estimator as a piece-wise uniform distribution. We also introduce goodness-of-fit tests for exponentiality and normality based on this distribution and compare their performance with that of their leading competitors.
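
For orientation, the baseline that this abstract modifies, Vasicek's spacing-based estimator, can be sketched as follows. This is a minimal illustration of the classical formula only, not of the corrected estimator proposed in the paper (whose coefficients the abstract does not give). The function name `vasicek_entropy` and the boundary convention (order statistics clamped to the sample minimum and maximum) are assumptions of the sketch.

```python
import math


def vasicek_entropy(sample, m):
    """Classical Vasicek (1976) spacing estimator of differential entropy.

    Hypothetical helper for illustration. `m` is the window size and must
    satisfy 1 <= m < n/2. Tied observations can give a zero spacing, in
    which case math.log raises ValueError.
    """
    n = len(sample)
    if not 1 <= m < n / 2:
        raise ValueError("window size m must satisfy 1 <= m < n/2")
    xs = sorted(sample)
    total = 0.0
    for i in range(n):
        # Order statistics outside the sample are clamped to the extremes.
        lower = xs[max(i - m, 0)]
        upper = xs[min(i + m, n - 1)]
        total += math.log(n / (2 * m) * (upper - lower))
    return total / n
```

On an evenly spaced grid over [0, 1) the estimate is close to 0, the entropy of the uniform distribution, and rescaling the data by a factor c shifts the estimate by log c, as differential entropy should.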

3.
This paper extends an analysis of variance for categorical data (CATANOVA) procedure to multidimensional contingency tables involving several factors and a response variable measured on a nominal scale. Using an appropriate measure of total variation for multinomial data, partial and multiple association measures are developed as R² quantities which parallel the analogous statistics in multiple linear regression for quantitative data. In addition, test statistics are derived in terms of these R² criteria. Finally, this CATANOVA approach is illustrated within the context of a three-way contingency table from a multicenter clinical trial.

4.
For the analysis of square contingency tables with nominal categories, Tomizawa and coworkers have considered measures that represent the degree of departure from symmetry. This paper proposes a measure that represents the degree of asymmetry for square contingency tables with ordered categories (instead of those with nominal categories). The measure proposed is expressed using the Cressie–Read power-divergence or Patil–Taillie diversity index, defined for the cumulative probabilities that an observation falls in row (column) category i or below and column (row) category j (> i) or above. The measure depends on the order of listing the categories. It should be useful for comparing the degree of asymmetry in several tables with ordered categories. The relationship between the measure and the normal distribution is shown.

5.
The paper introduces an estimator of the entropy of a continuous random variable. The estimator is obtained by modifying the estimator proposed by Ebrahimi et al. [Two measures of sample entropy, Statist. Probab. Lett. 20 (1994), pp. 225–234]. The consistency of the estimator is proved and comparisons are made with Vasicek's estimator [A test for normality based on sample entropy, J. R. Stat. Soc. Ser. B 38 (1976), pp. 54–59], van Es's estimator [Estimating functionals related to a density by a class of statistics based on spacings, Scand. J. Statist. 19 (1992), pp. 61–72], the Ebrahimi et al. estimator and Correa's estimator [A new estimator of entropy, Comm. Statist. Theory Methods 24 (1995), pp. 2439–2449]. The results indicate that the proposed estimator has smaller mean-squared error than the above estimators. A real example is presented and analysed.

6.
For square contingency tables that have nominal categories, Tomizawa considered two kinds of measure to represent the degree of departure from symmetry. This paper proposes a generalization of those measures. The proposed measure is expressed by using the average of the power divergence of Cressie and Read, or the average of the diversity index of Patil and Taillie. Special cases of the proposed measure include Tomizawa's measures. The proposed measure would be useful for comparing the degree of departure from symmetry in several tables.

7.
In this paper we apply the sequential bootstrap method proposed by Collet et al. [Bootstrap Central Limit theorem for chains of infinite order via Markov approximations, Markov Processes and Related Fields 11(3) (2005), pp. 443–464] to estimate the variance of the empirical mean of a special class of chains of infinite order called sparse chains. For this process, we show that we are able to compute numerically the true value of the standard error with any fixed error.

Our main goal is to present a comparison, for sparse chains, among sequential bootstrap, the block bootstrap method proposed by Künsch [The jackknife and the Bootstrap for general stationary observations, Ann. Statist. 17 (1989), pp. 1217–1241] and improved by Liu and Singh [Moving blocks jackknife and Bootstrap capture weak dependence, in Exploring the limits of the Bootstrap, R. Lepage and L. Billard, eds., Wiley, New York, 1992, pp. 225–248] and the bootstrap method proposed by Bühlmann [Blockwise bootstrapped empirical process for stationary sequences, Ann. Statist. 22 (1994), pp. 995–1012].

8.
9.
We study bias arising from rounding categorical variables following multivariate normal (MVN) imputation. This task has been well studied for binary variables, but not for more general categorical variables. Three methods that assign imputed values to categories based on fixed reference points are compared using 25 specific scenarios covering variables with k=3, …, 7 categories, and five distributional shapes, and for each k=3, …, 7, we examine the distribution of bias arising over 100,000 distributions drawn from a symmetric Dirichlet distribution. We observed, on both empirical and theoretical grounds, that one method (projected-distance-based rounding) is superior to the other two methods, and that the risk of invalid inference with the best method may be too high at sample sizes n≥150 at 50% missingness, n≥250 at 30% missingness and n≥1500 at 10% missingness. Therefore, these methods are generally unsatisfactory for rounding categorical variables (with up to seven categories) following MVN imputation.

10.
In the second part of this paper, reproducibility of discrete ordinal and nominal outcomes is addressed. The first part deals with continuous outcomes, concentrating on intraclass correlation (ρ) in the context of one‐way analysis of variance. For categorical data, the focus has generally not been on a meaningful population parameter such as ρ. However, intraclass correlation has been defined for discrete ordinal data, ρc, and for nominal data, κI. Therefore, a unified approach to reproducibility is proposed. The relevance of these parameters is outlined. Estimation and inferential procedures for ρc and κI are reviewed, together with worked examples. Topics related to reproducibility that are not addressed in either this or the previous paper are highlighted. Considerations for designing reproducibility studies and for interpreting their results are provided. Copyright © 2004 John Wiley & Sons, Ltd.

11.
In this paper, we present new one-stage multiple comparison procedures with the average for location parameters of two-parameter exponential distributions under heteroscedasticity by modifying the existing one proposed by Wu [One stage multiple comparisons with the average for exponential location parameters under heteroscedasticity. Comput Stat Data Anal. 2013;68:352–360] with unequal sample sizes. A simulation study shows that the proposed procedures yield shorter confidence intervals with coverage probabilities closer to the nominal ones. Finally, an example comparing the survival days of patients for four categories of lung cancer is given to demonstrate the proposed procedures.

12.
《Econometric Reviews》2013,32(4):337-349
This paper reconsiders the nonlinearity test proposed by Kočenda (Kočenda, E. (2001). An alternative to the BDS test: integration across the correlation integral. Econometric Reviews 20:337–351). When the analyzed series is non‐Gaussian, the empirical rejection rates can be much larger than the nominal size. In this context, the necessity of tabulating the empirical distribution of the statistic each time the test is computed is stressed. To that end, simple random permutation works reasonably well. This paper also shows, through Monte Carlo experiments, that Kočenda's test can be more powerful than the Brock et al. (Brock, W., Dechert, D., Scheinkman, J., LeBaron, B. (1996). A test for independence based on the correlation dimension. Econometric Reviews 15:197–235) procedure. However, more than one range of values for the proximity parameter should be used. Finally, empirical evidence on exchange rates is reassessed.

13.
The multinomial logistic regression model (MLRM) can be interpreted as a natural extension of the binomial model with logit link function to situations where the response variable can have three or more possible outcomes. In addition, when the categories of the response variable are nominal, the MLRM can be expressed in terms of two or more logistic models and analyzed in both frequentist and Bayesian approaches. However, few discussions about post modeling in categorical data models are found in the literature, and they mainly use Bayesian inference. The objective of this work is to present classic and Bayesian diagnostic measures for categorical data models. These measures are applied to a dataset (status) of patients undergoing kidney transplantation.

14.
In this paper, we consider the bootstrap procedure for the augmented Dickey–Fuller (ADF) unit root test by implementing the modified divergence information criterion (MDIC, Mantalos et al. [An improved divergence information criterion for the determination of the order of an AR process, Commun. Statist. Comput. Simul. 39(5) (2010a), pp. 865–879; Forecasting ARMA models: A comparative study of information criteria focusing on MDIC, J. Statist. Comput. Simul. 80(1) (2010b), pp. 61–73]) for the selection of the optimum number of lags in the estimated model. The asymptotic distribution of the resulting bootstrap ADF/MDIC test is established and its finite sample performance is investigated through Monte Carlo simulations. The proposed bootstrap tests are found to have finite sample sizes that are generally much closer to their nominal values than those of tests that rely on other information criteria, such as the Akaike information criterion [H. Akaike, Information theory and an extension of the maximum likelihood principle, in Proceedings of the 2nd International Symposium on Information Theory, B.N. Petrov and F. Csáki, eds., Akadémiai Kiadó, Budapest, 1973, pp. 267–281]. The simulations reveal that the proposed procedure is quite satisfactory even for models with large negative moving average coefficients.

15.
The analysis of variance of cross-classified (categorical) data (CATANOVA) is a technique designed to identify the variation between treatments of interest to the researcher. There are well-established links between CATANOVA and the Goodman and Kruskal tau statistic as well as the Light and Margolin R² for the purposes of the graphical identification of this variation.

The aim of this article is to present a partition of the numerator of the tau statistic, or equivalently, the BSS measure in the CATANOVA framework, into location, dispersion, and higher order components. Even if a CATANOVA identifies an overall lack of variation, by considering this partition and calculations derived from it, it is possible to identify hidden, but statistically significant, sources of variation.

16.
Very often, in psychometric research, as in educational assessment, it is necessary to analyze item responses from clustered respondents. The multiple group item response theory (IRT) model proposed by Bock and Zimowski [12] provides a useful framework for analyzing this type of data. In this model, the selected groups of respondents are of specific interest such that group-specific population distributions need to be defined. The usual assumption for parameter estimation in this model, which is that the latent traits are random variables following different symmetric normal distributions, has been questioned in many works found in the IRT literature. Furthermore, when this assumption does not hold, misleading inference can result. In this paper, we consider that the latent traits for each group follow different skew-normal distributions, under the centered parameterization. We call this the skew multiple group IRT model. This modeling extends the works of Azevedo et al. [4], Bazán et al. [11] and Bock and Zimowski [12] (concerning the latent trait distribution). Our approach ensures that the model is identifiable. We propose and compare, concerning convergence issues, two Markov chain Monte Carlo (MCMC) algorithms for parameter estimation. A simulation study was performed in order to evaluate parameter recovery for the proposed model and the selected algorithm concerning convergence issues. Results reveal that the proposed algorithm properly recovers all model parameters. Furthermore, we analyzed a real data set which presents asymmetry concerning the latent trait distributions. The results obtained by using our approach confirmed the presence of negative asymmetry for some latent trait distributions.

17.
A new measure for evaluating the strength of the association between a nominal variable and an ordered categorical response variable is introduced. The introduction of a new measure is justified by analysing the characteristics of a measure of the nominal-ordinal association proposed by Agresti (1981), especially with respect to the problem of the 'choice' of a predictive variable. The sample-based version of the index is studied, and its asymptotic standard error and asymptotic distribution are derived. Simulations are considered to evaluate the adequacy of the asymptotic approximation determined, following Goodman & Kruskal (1963).

18.
Clustered or correlated samples of categorical response data arise frequently in many fields of application. The method of generalized estimating equations (GEEs) introduced in Liang and Zeger [Longitudinal data analysis using generalized linear models, Biometrika 73 (1986), pp. 13–22] is often used to analyse this type of data. GEEs give consistent estimates of the regression parameters and their variance based upon the Pearson residuals. Park et al. [Alternative GEE estimation procedures for discrete longitudinal data, Comput. Stat. Data Anal. 28 (1998), pp. 243–256] considered a modification of the GEE approach using the Anscombe residual and the deviance residual. In this work, we propose to extend this idea to a family of generalized residuals. A wide simulation study is conducted for binary and Poisson correlated outcomes, and two numerical illustrations are presented.

19.
As is well known, the least-squares estimator of the slope of a univariate linear model sets to zero the covariance between the regression residuals and the values of the explanatory variable. To prevent the estimation process from being influenced by outliers, which can be theoretically modelled by a heavy-tailed distribution for the error term, one can substitute covariance with some robust measure of association, for example Kendall's tau in the popular Theil–Sen estimator. In a little-known Italian paper, Cifarelli [(1978), ‘La Stima del Coefficiente di Regressione Mediante l'Indice di Cograduazione di Gini’, Rivista di matematica per le scienze economiche e sociali, 1, 7–38. A translation into English is available at http://arxiv.org/abs/1411.4809 and will appear in Decisions in Economics and Finance] shows that a gain of efficiency can be obtained by using Gini's cograduation index instead of Kendall's tau. This paper introduces a new estimator, derived from another association measure recently proposed. Such a measure is strongly related to Gini's cograduation index, as they are both built to vanish in the general framework of indifference. The newly proposed estimator is shown to be unbiased and asymptotically normally distributed. Moreover, all considered estimators are compared via their asymptotic relative efficiency and a small simulation study. Finally, some indications about the performance of the considered estimators in the presence of contaminated normal data are provided.
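
The contrast this abstract draws between the least-squares slope and the Theil–Sen slope can be sketched in a few lines. This is a minimal illustration of the two classical estimators only, not of the new estimator the paper proposes; the function names are hypothetical. The Theil–Sen slope is computed in its usual form, as the median of all pairwise slopes.

```python
from itertools import combinations
from statistics import median


def ols_slope(x, y):
    """Least-squares slope: zeroes the covariance between residuals and x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def theil_sen_slope(x, y):
    """Theil-Sen slope: median of the slopes over all pairs with distinct x."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    if not slopes:
        raise ValueError("need at least two distinct x values")
    return median(slopes)


# Illustrative data: an exact line y = 1 + 2x with the last point corrupted.
x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 100]
robust = theil_sen_slope(x, y)
classical = ols_slope(x, y)
```

With a single gross outlier in the last observation, the pairwise-median slope stays at the slope of the uncontaminated points, while the least-squares slope is pulled strongly towards the outlier.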

20.
When measuring units is expensive or time consuming, while ranking them is relatively easy and inexpensive, it is known that ranked set sampling (RSS) is preferable to simple random sampling (SRS). Many authors have suggested extensions of RSS. As a variation, Al-Saleh and Al-Kadiri [Double ranked set sampling, Statist. Probab. Lett. 48 (2000), pp. 205–212] introduced double ranked set sampling (DRSS), which was extended by Al-Saleh and Al-Omari [Multistage ranked set sampling, J. Statist. Plann. Inference 102 (2002), pp. 273–286] to multistage ranked set sampling (MSRSS). The entropy of a random variable (r.v.) is a measure of its uncertainty. It is a measure of the amount of information required on average to determine the value of a (discrete) r.v. In this work, we discuss entropy estimation in the RSS design and the aforementioned extensions and compare the results with those in the SRS design in terms of bias and root mean square error (RMSE). Motivated by the efficiency observed, we go on to investigate an entropy-based goodness-of-fit test for the inverse Gaussian distribution using RSS. Critical values for some sample sizes, determined by means of Monte Carlo simulations, are presented for each design. A Monte Carlo power analysis is performed under various alternative hypotheses in order to compare the proposed testing procedure with the existing methods. The results indicate that tests based on RSS and its extensions are superior alternatives to the entropy test based on SRS.
