
1.

Time series analysis of categorical data using auto-odds ratio function

Raju Maiti Atanu Biswas 《Statistics》2018,52(2):426-444

In this paper, we consider the auto-odds ratio function (AORF) as a measure of serial association for a stationary time series process of categorical data at two different time points. Numerical measures such as the autocorrelation function (ACF) have no meaningful interpretation unless the time series data are numerical. Instead, we use the AORF as a measure of association to study the serial dependency of the categorical time series for both ordinal and nominal categories. Biswas and Song [Discrete-valued ARMA processes. Stat Probab Lett. 2009;79(17):1884–1889] provided some results on this measure for Pegram's operator-based AR(1) process with binary responses. Here, we extend this measure to more general set-ups, i.e. for AR(p) and MA(q) processes and for a general number of categories. We discuss how this method can effectively be used in parameter estimation and model selection. Following Weiß [Empirical measures of signed serial dependence in categorical time series. J Stat Comput Simul. 2011;81(4):411–429], we derive the large sample distribution of the estimator of the AORF under an independent and identically distributed (iid) set-up. Some simulation results and two categorical data examples (one ordinal and the other nominal) are presented to illustrate the proposed method.
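
As a rough illustration of the idea for the binary case only, the sketch below computes a sample odds ratio from the 2x2 table of pairs (x_t, x_{t+lag}). It is not the paper's estimator for general categories; the function name `lag_odds_ratio` and the 0.5 cell correction (a common device for avoiding division by zero) are assumptions made here.

```python
from collections import Counter


def lag_odds_ratio(series, lag, smoothing=0.5):
    """Sample odds ratio of the 2x2 table of (x_t, x_{t+lag}) for a 0/1 series.

    `smoothing` is added to every cell; 0.5 is an assumed default, not
    something taken from the abstract above.
    """
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    # Count how often each (current, lagged) pair occurs.
    counts = Counter(zip(series[:-lag], series[lag:]))
    n00 = counts[(0, 0)] + smoothing
    n01 = counts[(0, 1)] + smoothing
    n10 = counts[(1, 0)] + smoothing
    n11 = counts[(1, 1)] + smoothing
    # Values above 1 suggest positive serial association, below 1 negative.
    return (n00 * n11) / (n01 * n10)
```

A value near 1 at every lag would be consistent with serial independence, which is the setting in which the abstract's large-sample result is stated.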

2.

Goodness-of-fit test based on correcting moments of modified entropy estimator

《Journal of Statistical Computation and Simulation》2012,82(12):2077-2093

In this paper, we first propose a new estimator of entropy for continuous random variables. Our estimator is obtained by correcting the coefficients of Vasicek's [A test for normality based on sample entropy, J. R. Statist. Soc. Ser. B 38 (1976), pp. 54–59] entropy estimator. We prove the consistency of our estimator. Monte Carlo studies show that our estimator is better than the entropy estimators proposed by Vasicek, Ebrahimi et al. [Two measures of sample entropy, Stat. Probab. Lett. 20 (1994), pp. 225–234] and Correa [A new estimator of entropy, Commun. Stat. Theory Methods 24 (1995), pp. 2439–2449] in terms of root mean square error. We then derive the non-parametric distribution function corresponding to our proposed entropy estimator as a piece-wise uniform distribution. We also introduce goodness-of-fit tests for testing exponentiality and normality based on the said distribution and compare their performance with that of leading competitors.
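
For orientation, here is a minimal sketch of the baseline Vasicek spacing estimator that the abstract starts from, not the corrected estimator the paper proposes. It uses the usual convention that order statistics below the first or above the last are replaced by the sample minimum and maximum; the function name and the choice of window `m` are left to the caller and are assumptions of this sketch.

```python
import math


def vasicek_entropy(sample, m):
    """Vasicek's spacing-based entropy estimate with window size m.

    Computes (1/n) * sum_i log( n / (2m) * (x_(i+m) - x_(i-m)) ),
    clamping indices to the smallest and largest order statistics.
    Tied observations can give a zero spacing, in which case math.log fails.
    """
    x = sorted(sample)
    n = len(x)
    if not 1 <= m < n / 2:
        raise ValueError("m must satisfy 1 <= m < n/2")
    total = 0.0
    for i in range(n):
        lo = x[max(i - m, 0)]        # x_(i-m), clamped at the minimum
        hi = x[min(i + m, n - 1)]    # x_(i+m), clamped at the maximum
        total += math.log(n / (2 * m) * (hi - lo))
    return total / n
```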

3.

Catanova for multidimensional contingency tables: Nominal-scale response

Robert J. Anderson J. Richard Landis 《Communications in Statistics - Theory and Methods》2013,42(11):1191-1206

This paper extends an analysis of variance for categorical data (CATANOVA) procedure to multidimensional contingency tables involving several factors and a response variable measured on a nominal scale. Using an appropriate measure of total variation for multinomial data, partial and multiple association measures are developed as R² quantities which parallel the analogous statistics in multiple linear regression for quantitative data. In addition, test statistics are derived in terms of these R² criteria. Finally, this CATANOVA approach is illustrated within the context of a three-way contingency table from a multicenter clinical trial.

4.

Measure of Asymmetry for Square Contingency Tables Having Ordered Categories

Sadao Tomizawa Nobuko Miyamoto & Yusuke Hatanaka 《Australian & New Zealand Journal of Statistics》2001,43(3):335-349

For the analysis of square contingency tables with nominal categories, Tomizawa and coworkers have considered measures that represent the degree of departure from symmetry. This paper proposes a measure that represents the degree of asymmetry for square contingency tables with ordered categories (instead of those with nominal categories). The measure proposed is expressed using the Cressie–Read power-divergence or Patil–Taillie diversity index, defined for the cumulative probabilities that an observation falls in row (column) category i or below and column (row) category j (> i) or above. The measure depends on the order of listing the categories. It should be useful for comparing the degree of asymmetry in several tables with ordered categories. The relationship between the measure and the normal distribution is shown.

5.

On the entropy estimators

Havva Alizadeh Noughabi Reza Alizadeh Noughabi 《Journal of Statistical Computation and Simulation》2013,83(4):784-792

The paper introduces an estimator of the entropy of a continuous random variable. The estimator is obtained by modifying the estimator proposed by Ebrahimi et al. [Two measures of sample entropy, Statist. Probab. Lett. 20 (1994), pp. 225–234]. The consistency of the estimator is proved and comparisons are made with Vasicek's estimator [A test for normality based on sample entropy, J. R. Stat. Soc. Ser. B 38 (1976), pp. 54–59], van Es estimator [Estimating functionals related to a density by class of statistics based on spacings, Scand. J. Statist. 19 (1992), pp. 61–72], Ebrahimi et al. estimator and Correa estimator [A new estimator of entropy, Comm. Statist. Theory Methods 24 (1995), pp. 2439–2449]. The results indicate that the proposed estimator has smaller mean-squared error than the above estimators. A real example is presented and analysed.

6.

Power-divergence-type measure of departure from symmetry for square contingency tables that have nominal categories

Sadao Tomizawa Takashi Seo Hideharu Yamamoto 《Journal of applied statistics》1998,25(3):387-398

For square contingency tables that have nominal categories, Tomizawa considered two kinds of measure to represent the degree of departure from symmetry. This paper proposes a generalization of those measures. The proposed measure is expressed by using the average of the power divergence of Cressie and Read, or the average of the diversity index of Patil and Taillie. Special cases of the proposed measure include Tomizawa's measures. The proposed measure would be useful for comparing the degree of departure from symmetry in several tables.
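
The building block named in this abstract, the Cressie–Read power divergence between two probability vectors, can be sketched as follows. This shows only the divergence itself, with its standard limiting cases at λ = 0 and λ = −1; the averaging and normalisation that turn it into the proposed symmetry measure are not reproduced. The function name is illustrative, and strictly positive probabilities are assumed.

```python
import math


def power_divergence(p, q, lam):
    """Cressie-Read power divergence between probability vectors p and q.

    For lam not in {0, -1}: sum_i p_i * ((p_i / q_i)**lam - 1) / (lam * (lam + 1)).
    lam = 0 gives the Kullback-Leibler divergence of p from q,
    lam = -1 gives the Kullback-Leibler divergence of q from p.
    All entries of p and q are assumed to be strictly positive.
    """
    if lam == 0:
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    if lam == -1:
        return sum(qi * math.log(qi / pi) for pi, qi in zip(p, q))
    total = sum(pi * ((pi / qi) ** lam - 1.0) for pi, qi in zip(p, q))
    return total / (lam * (lam + 1.0))
```

With λ = 1 the value is half of the Pearson chi-square distance, which gives a quick sanity check on any implementation.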

7.

A block bootstrap comparison for sparse chains

《Journal of Statistical Computation and Simulation》2012,82(8):889-902

In this paper we apply the sequential bootstrap method proposed by Collet et al. [Bootstrap Central Limit theorem for chains of infinite order via Markov approximations, Markov Processes and Related Fields 11(3) (2005), pp. 443–464] to estimate the variance of the empirical mean of a special class of chains of infinite order called sparse chains. For this process, we show that we are able to compute numerically the true value of the standard error with any fixed error.

Our main goal is to present a comparison, for sparse chains, among sequential bootstrap, the block bootstrap method proposed by Künsch [The jackknife and the Bootstrap for general stationary observations, Ann. Statist. 17 (1989), pp. 1217–1241] and improved by Liu and Singh [Moving blocks jackknife and Bootstrap capture weak dependence, in Exploring the limits of the Bootstrap, R. Lepage and L. Billard, eds., Wiley, New York, 1992, pp. 225–248] and the bootstrap method proposed by Bühlmann [Blockwise bootstrapped empirical process for stationary sequences, Ann. Statist. 22 (1994), pp. 995–1012].
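
To make the block bootstrap idea concrete, the following is a minimal sketch of a moving-block bootstrap estimate of the standard error of the sample mean. It illustrates only the generic overlapping-block scheme; it is not the sequential bootstrap of Collet et al. or Bühlmann's variant, and the function name, defaults and seeding are assumptions of this sketch.

```python
import random
import statistics


def moving_block_bootstrap_se(series, block_length, n_boot=1000, seed=0):
    """Estimate the standard error of the mean by resampling overlapping blocks."""
    n = len(series)
    if not 1 <= block_length <= n:
        raise ValueError("block_length must be between 1 and len(series)")
    rng = random.Random(seed)
    n_starts = n - block_length + 1      # number of overlapping blocks
    n_blocks = -(-n // block_length)     # ceil(n / block_length)
    means = []
    for _ in range(n_boot):
        resample = []
        for _ in range(n_blocks):
            start = rng.randrange(n_starts)
            resample.extend(series[start:start + block_length])
        # Trim to the original length before averaging.
        means.append(statistics.fmean(resample[:n]))
    return statistics.stdev(means)
```

Keeping consecutive observations together inside each block is what lets the resamples retain short-range dependence; the block length controls how much of it is kept.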

8.

Beyond kappa: A review of interrater agreement measures

Mousumi Banerjee Michelle Capozzoli Laura McSweeney Debajyoti Sinha 《Revue canadienne de statistique》1999,27(1):3-23


9.

Rounding non-binary categorical variables following multivariate normal imputation: evaluation of simple methods and implications for practice

《Journal of Statistical Computation and Simulation》2012,82(4):798-811

We study bias arising from rounding categorical variables following multivariate normal (MVN) imputation. This task has been well studied for binary variables, but not for more general categorical variables. Three methods that assign imputed values to categories based on fixed reference points are compared using 25 specific scenarios covering variables with k=3, …, 7 categories, and five distributional shapes, and for each k=3, …, 7, we examine the distribution of bias arising over 100,000 distributions drawn from a symmetric Dirichlet distribution. We observed, on both empirical and theoretical grounds, that one method (projected-distance-based rounding) is superior to the other two methods, and that the risk of invalid inference with the best method may be too high at sample sizes n≥150 at 50% missingness, n≥250 at 30% missingness and n≥1500 at 10% missingness. Therefore, these methods are generally unsatisfactory for rounding categorical variables (with up to seven categories) following MVN imputation.

10.

Reproducibility of clinical data II: categorical outcomes

Nicole Jill‐Marie Blackman 《Pharmaceutical statistics》2004,3(2):109-122

In the second part of this paper, reproducibility of discrete ordinal and nominal outcomes is addressed. The first part deals with continuous outcomes, concentrating on intraclass correlation (ρ) in the context of one‐way analysis of variance. For categorical data, the focus has generally not been on a meaningful population parameter such as ρ. However, intraclass correlation has been defined for discrete ordinal data, ρ_c, and for nominal data, κ_I. Therefore, a unified approach to reproducibility is proposed. The relevance of these parameters is outlined. Estimation and inferential procedures for ρ_c and κ_I are reviewed, together with worked examples. Topics related to reproducibility that are not addressed in either this or the previous paper are highlighted. Considerations for designing reproducibility studies and for interpreting their results are provided. Copyright © 2004 John Wiley & Sons, Ltd.

11.

New one-stage multiple comparison procedures with the average for exponential location parameters under heteroscedasticity

《Journal of Statistical Computation and Simulation》2012,82(14):2740-2748

In this paper, we present new one-stage multiple comparison procedures with the average for location parameters of two-parameter exponential distributions under heteroscedasticity by modifying the existing one proposed by Wu [One stage multiple comparisons with the average for exponential location parameters under heteroscedasticity. Comput Stat Data Anal. 2013;68:352–360] with unequal sample sizes. A simulation study is done and the results show that the proposed procedures have shorter confidence length with coverage probabilities closer to the nominal ones. Finally, an example comparing the survival days of patients for four categories of lung cancer is given to demonstrate the proposed procedures.

12.

A Note on Resampling the Integration Across the Correlation Integral with Alternative Ranges

《Econometric Reviews》2013,32(4):337-349

This paper reconsiders the nonlinearity test proposed by Kočenda (Kočenda, E. (2001). An alternative to the BDS test: integration across the correlation integral. Econometric Reviews 20:337–351). When the analyzed series is non‐Gaussian, the empirical rejection rates can be much larger than the nominal size. In this context, the necessity of tabulating the empirical distribution of the statistic each time the test is computed is stressed. To that end, simple random permutation works reasonably well. This paper also shows, through Monte Carlo experiments, that Kočenda's test can be more powerful than the Brock et al. (Brock, W., Dechert, D., Scheinkman, J., LeBaron, B. (1996). A test for independence based on the correlation dimension. Econometric Reviews 15:197–235) procedure. However, more than one range of values for the proximity parameter should be used. Finally, empirical evidence on exchange rates is reassessed.

13.

The multinomial logistic regression model for predicting the discharge status after liver transplantation: estimation and diagnostics analysis

E. M. Hashimoto E. M. M. Ortega G. M. Cordeiro A. K. Suzuki M. W. Kattan 《Journal of applied statistics》2020,47(12):2159

The multinomial logistic regression model (MLRM) can be interpreted as a natural extension of the binomial model with logit link function to situations where the response variable can have three or more possible outcomes. In addition, when the categories of the response variable are nominal, the MLRM can be expressed in terms of two or more logistic models and analyzed in both frequentist and Bayesian approaches. However, few discussions about post modeling in categorical data models are found in the literature, and they mainly use Bayesian inference. The objective of this work is to present classic and Bayesian diagnostic measures for categorical data models. These measures are applied to a dataset (status) of patients undergoing kidney transplantation.
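
The "two or more logistic models" view can be illustrated with the baseline-category parameterisation, in which one category has its linear predictor fixed at zero and every other category has its own coefficient vector. The sketch below only evaluates category probabilities for given coefficients; it does not fit the model or compute any diagnostic measure from the paper, and the function name and argument layout are assumptions.

```python
import math


def mlrm_probabilities(x, coefs):
    """Category probabilities under a baseline-category multinomial logit model.

    x     : list of covariate values (include 1.0 for an intercept).
    coefs : one coefficient list per non-baseline category.
    Returns probabilities ordered as [baseline, category 1, category 2, ...].
    """
    # The baseline category has linear predictor 0 by construction.
    etas = [0.0] + [sum(b * v for b, v in zip(beta, x)) for beta in coefs]
    shift = max(etas)  # subtract the maximum for numerical stability
    weights = [math.exp(eta - shift) for eta in etas]
    total = sum(weights)
    return [w / total for w in weights]
```

With a single non-baseline category this reduces to the ordinary logistic model, which is the sense in which the MLRM extends the binomial model with logit link.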

14.

Bootstrapping the augmented Dickey–Fuller test for unit root using the MDIC

《Journal of Statistical Computation and Simulation》2012,82(3):431-443

In this paper, we consider the bootstrap procedure for the augmented Dickey–Fuller (ADF) unit root test by implementing the modified divergence information criterion (MDIC, Mantalos et al. [An improved divergence information criterion for the determination of the order of an AR process, Commun. Statist. Comput. Simul. 39(5) (2010a), pp. 865–879; Forecasting ARMA models: A comparative study of information criteria focusing on MDIC, J. Statist. Comput. Simul. 80(1) (2010b), pp. 61–73]) for the selection of the optimum number of lags in the estimated model. The asymptotic distribution of the resulting bootstrap ADF/MDIC test is established and its finite sample performance is investigated through Monte-Carlo simulations. The proposed bootstrap tests are found to have finite sample sizes that are generally much closer to their nominal values than those of tests that rely on other information criteria, like the Akaike information criterion [H. Akaike, Information theory and an extension of the maximum likelihood principle, in Proceedings of the 2nd International Symposium on Information Theory, B.N. Petrov and F. Csáki, eds., Akadémiai Kiadó, Budapest, 1973, pp. 267–281]. The simulations reveal that the proposed procedure is quite satisfactory even for models with large negative moving average coefficients.

15.

Catanova for Two-Way Contingency Tables with Ordinal Variables Using Orthogonal Polynomials

《Communications in Statistics - Theory and Methods》2013,42(8):1755-1769

The analysis of variance of cross-classified (categorical) data (CATANOVA) is a technique designed to identify the variation between treatments of interest to the researcher. There are well-established links between CATANOVA and the Goodman and Kruskal tau statistic as well as the Light and Margolin R² for the purposes of the graphical identification of this variation.

The aim of this article is to present a partition of the numerator of the tau statistic, or equivalently, the BSS measure in the CATANOVA framework, into location, dispersion, and higher order components. Even if a CATANOVA identifies an overall lack of variation, by considering this partition and calculations derived from it, it is possible to identify hidden, but statistically significant, sources of variation.
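
The link between BSS, R² and tau can be checked numerically for a plain two-way table. The sketch below computes the total, within-group and between-group sums of squares in the usual CATANOVA form and returns R² = BSS/TSS, which coincides with the Goodman and Kruskal tau for predicting the row (response) category from the column (group). The partition into location, dispersion and higher order components described in the abstract is not implemented; the function name and table layout are assumptions.

```python
def catanova_r2(table):
    """R^2 = BSS / TSS for a two-way table of counts.

    table[i][j] is the count for response category i in group j.
    Every group (column) is assumed to contain at least one observation.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Total variation of the response, ignoring groups.
    tss = n / 2 - sum(r * r for r in row_totals) / (2 * n)
    # Variation of the response within each group, summed over groups.
    wss = n / 2 - 0.5 * sum(
        sum(cell * cell for cell in col) / total
        for col, total in zip(zip(*table), col_totals)
    )
    bss = tss - wss
    return bss / tss
```

For example, a table in which each group contains a single response category gives R² = 1, while a table with identical response distributions in every group gives R² = 0.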

16.

A multiple group item response theory model with centered skew-normal latent trait distributions under a Bayesian framework

Jose R.S. Santos Heleno Bolfarine 《Journal of applied statistics》2013,40(10):2129-2149

Very often, in psychometric research, as in educational assessment, it is necessary to analyze item responses from clustered respondents. The multiple group item response theory (IRT) model proposed by Bock and Zimowski [12] provides a useful framework for analyzing this type of data. In this model, the selected groups of respondents are of specific interest such that group-specific population distributions need to be defined. The usual assumption for parameter estimation in this model, which is that the latent traits are random variables following different symmetric normal distributions, has been questioned in many works found in the IRT literature. Furthermore, when this assumption does not hold, misleading inference can result. In this paper, we consider that the latent traits for each group follow different skew-normal distributions, under the centered parameterization. We call it the skew multiple group IRT model. This modeling extends the works of Azevedo et al. [4], Bazán et al. [11] and Bock and Zimowski [12] (concerning the latent trait distribution). Our approach ensures that the model is identifiable. We propose and compare, concerning convergence issues, two Monte Carlo Markov Chain (MCMC) algorithms for parameter estimation. A simulation study was performed in order to evaluate parameter recovery for the proposed model and the selected algorithm concerning convergence issues. Results reveal that the proposed algorithm recovers properly all model parameters. Furthermore, we analyzed a real data set which presents asymmetry concerning the latent traits distribution. The results obtained by using our approach confirmed the presence of negative asymmetry for some latent trait distributions.

17.

A new measure of nominal-ordinal association

Raffaella Piccarreta 《Journal of applied statistics》2001,28(1):107-120

A new measure for evaluating the strength of the association between a nominal variable and an ordered categorical response variable is introduced. The introduction of a new measure is justified by analysing the characteristics of a measure of the nominal-ordinal association proposed by Agresti (1981), especially with respect to the problem of the 'choice' of a predictive variable. The sample-based version of the index is studied, and its asymptotic standard error and asymptotic distribution are derived. Simulations are considered to evaluate the adequacy of the asymptotic approximation determined, following Goodman & Kruskal (1963).

18.

GEEs for repeated categorical responses based on generalized residuals

《Journal of Statistical Computation and Simulation》2012,82(2):344-359

Clustered or correlated samples of categorical response data arise frequently in many fields of application. The method of generalized estimating equations (GEEs) introduced in Liang and Zeger [Longitudinal data analysis using generalized linear models, Biometrika 73 (1986), pp. 13–22] is often used to analyse this type of data. GEEs give consistent estimates of the regression parameters and their variance based upon the Pearson residuals. Park et al. [Alternative GEE estimation procedures for discrete longitudinal data, Comput. Stat. Data Anal. 28 (1998), pp. 243–256] considered a modification of the GEE approach using the Anscombe residual and the deviance residual. In this work, we propose to extend this idea to a family of generalized residuals. A wide simulation study is conducted for binary and Poisson correlated outcomes and also two numerical illustrations are presented.

19.

Some maximum-indifference estimators for the slope of a univariate linear model

Claudio G. Borroni D. Michele Cifarelli 《Journal of nonparametric statistics》2016,28(2):395-412

As is well known, the least-squares estimator of the slope of a univariate linear model sets to zero the covariance between the regression residuals and the values of the explanatory variable. To prevent the estimation process from being influenced by outliers, which can be theoretically modelled by a heavy-tailed distribution for the error term, one can substitute covariance with some robust measures of association, for example Kendall's tau in the popular Theil–Sen estimator. In a scarcely known Italian paper, Cifarelli [(1978), ‘La Stima del Coefficiente di Regressione Mediante l'Indice di Cograduazione di Gini’, Rivista di matematica per le scienze economiche e sociali, 1, 7–38. A translation into English is available at http://arxiv.org/abs/1411.4809 and will appear in Decisions in Economics and Finance] shows that a gain of efficiency can be obtained by using Gini's cograduation index instead of Kendall's tau. This paper introduces a new estimator, derived from another association measure recently proposed. Such a measure is strongly related to Gini's cograduation index, as they are both built to vanish in the general framework of indifference. The newly proposed estimator is shown to be unbiased and asymptotically normally distributed. Moreover, all considered estimators are compared via their asymptotic relative efficiency and a small simulation study. Finally, some indications about the performance of the considered estimators in the presence of contaminated normal data are provided.
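
The Theil–Sen estimator mentioned here has a simple closed form: the slope that makes Kendall's tau between residuals and the explanatory variable vanish is the median of all pairwise slopes. A minimal sketch follows; it covers only this classical estimator, not the Gini-based estimator or the newly proposed one, and the function name is illustrative.

```python
from itertools import combinations
from statistics import median


def theil_sen_slope(x, y):
    """Median of the pairwise slopes (y_j - y_i) / (x_j - x_i) over all i < j.

    Pairs with equal x values are skipped because their slope is undefined.
    """
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    return median(slopes)
```

A single gross outlier in `y` shifts only a minority of the pairwise slopes, so the median is largely unaffected, which is the robustness property the abstract refers to.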

20.

Efficiency of ranked set sampling in entropy estimation and goodness-of-fit testing for the inverse Gaussian law

《Journal of Statistical Computation and Simulation》2012,82(7):761-774

When measuring units are expensive or time consuming, while ranking them is relatively easy and inexpensive, it is known that ranked set sampling (RSS) is preferable to simple random sampling (SRS). Many authors have suggested several extensions of RSS. As a variation, Al-Saleh and Al-Kadiri [Double ranked set sampling, Statist. Probab. Lett. 48 (2000), pp. 205–212] introduced double ranked set sampling (DRSS) and it was extended by Al-Saleh and Al-Omari [Multistage ranked set sampling, J. Statist. Plann. Inference 102 (2002), pp. 273–286] to multistage ranked set sampling (MSRSS). The entropy of a random variable (r.v.) is a measure of its uncertainty. It is a measure of the amount of information required on the average to determine the value of a (discrete) r.v. In this work, we discuss entropy estimation in the RSS design and the aforementioned extensions and compare the results with those in the SRS design in terms of bias and root mean square error (RMSE). Motivated by the above observed efficiency, we continue to investigate entropy-based goodness-of-fit test for the inverse Gaussian distribution using RSS. Critical values for some sample sizes determined by means of Monte Carlo simulations are presented for each design. A Monte Carlo power analysis is performed under various alternative hypotheses in order to compare the proposed testing procedure with the existing methods. The results indicate that tests based on RSS and its extensions are superior alternatives to the entropy test based on SRS.
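
For readers unfamiliar with the design, a basic (single-stage) ranked set sample can be simulated as below. In each cycle, `set_size` independent sets of `set_size` units are drawn, each set is ranked, and only the unit with the r-th rank from the r-th set is measured. The sketch assumes perfect ranking (sets are ranked by their true values) and does not cover DRSS or MSRSS; the function name and the `draw` callback are assumptions.

```python
import random


def ranked_set_sample(draw, set_size, cycles, rng):
    """Simulate a balanced ranked set sample under perfect ranking.

    draw     : function taking rng and returning one unit's value.
    set_size : number of units per set (and number of sets per cycle).
    cycles   : number of repetitions of the whole scheme.
    Returns set_size * cycles measured values.
    """
    sample = []
    for _ in range(cycles):
        for rank in range(set_size):
            # Draw a fresh set, rank it, and keep only one unit from it.
            candidates = sorted(draw(rng) for _ in range(set_size))
            sample.append(candidates[rank])
    return sample
```

A call such as `ranked_set_sample(lambda r: r.gauss(0.0, 1.0), set_size=3, cycles=10, rng=random.Random(1))` yields 30 measured values while ranking 90 units, which reflects the trade-off the abstract describes between cheap ranking and expensive measurement.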