Similar articles
1.
ABSTRACT

Data sets originating from a wide range of research studies are composed of multiple variables that are correlated and of dissimilar types, primarily count, binary/ordinal and continuous attributes. The present paper builds on previous work on multivariate data generation and develops a framework for generating multivariate mixed data with a pre-specified correlation matrix. The generated data consist of components that are marginally count, binary, ordinal and continuous, where the count and continuous variables follow the generalized Poisson and normal distributions, respectively. The use of the generalized Poisson distribution provides a flexible mechanism that allows for the under- and over-dispersed count variables generally encountered in practice. A step-by-step algorithm is provided and its performance is evaluated using simulated and real-data scenarios.
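
The dispersion behaviour mentioned in this abstract can be illustrated with a small sketch. It assumes Consul's parameterization of the generalized Poisson distribution, with parameters named `theta` and `lam` here for illustration, and it is restricted to 0 <= lam < 1 (the over-dispersed case). The abstract does not state which parameterization the paper uses, so this is not necessarily the authors' form.

```python
import math

def generalized_poisson_pmf(x, theta, lam):
    """P(X = x) under Consul's generalized Poisson (assumed form), for 0 <= lam < 1."""
    if x < 0:
        return 0.0
    # Work in log space to avoid overflow in (theta + lam*x)**(x-1) and x!.
    log_p = (math.log(theta) + (x - 1) * math.log(theta + lam * x)
             - theta - lam * x - math.lgamma(x + 1))
    return math.exp(log_p)

def summed_moments(theta, lam, upper=400):
    """Total mass, mean and variance by direct summation over 0..upper.
    The truncation at `upper` is an approximation chosen for this sketch."""
    probs = [generalized_poisson_pmf(x, theta, lam) for x in range(upper + 1)]
    mean = sum(x * p for x, p in enumerate(probs))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
    return sum(probs), mean, var

theta, lam = 2.0, 0.3
total_mass, gp_mean, gp_var = summed_moments(theta, lam)

# Closed forms for this parameterization: mean = theta/(1-lam), var = theta/(1-lam)^3.
closed_mean = theta / (1 - lam)
closed_var = theta / (1 - lam) ** 3
print(total_mass, gp_mean, closed_mean, gp_var, closed_var)
```

With lam = 0 the formula reduces to the ordinary Poisson distribution, where the mean and variance coincide; any lam in (0, 1) pushes the variance above the mean. The under-dispersed case (lam < 0) needs a truncated support and is left out of this sketch.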

2.
ABSTRACT

Often in data arising from epidemiologic studies, covariates are subject to measurement error. In addition, ordinal responses may be misclassified into a category that does not reflect the true state of the respondents. The goal of the present work is to develop an ordered probit model that corrects for classification errors in ordinal responses and/or measurement error in covariates. The maximum likelihood method of estimation is used. A simulation study reveals the effect of ignoring measurement error and/or classification errors on the estimates of the regression coefficients. The methodology developed is illustrated through a numerical example.

3.
In this article, the operational details of the R package PoisNor that is designed for simulating multivariate data with count and continuous variables with a prespecified correlation matrix are described, and examples of some important functions are given. The data-generation mechanism is a combination of the “NORmal To Anything” principle and a recently established connection between Poisson and normal correlations. The package provides a unique and useful tool that has been lacking for generating multivariate mixed data with Poisson and normal components.
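
The "NORmal To Anything" principle named in this abstract can be sketched in a few lines: draw correlated standard normals, map one of them to a uniform through the normal CDF, then push that uniform through the Poisson quantile function. The sketch below is a stdlib-only Python illustration, not the PoisNor API; all function names are hypothetical. It deliberately skips the step the package adds, namely adjusting the normal-scale correlation (`rho_z` here) so that the final count/continuous correlation hits a prespecified target.

```python
import math
import random
from statistics import NormalDist, mean

def poisson_quantile(u, lam, k_max=10_000):
    """Smallest k with P(X <= k) >= u for X ~ Poisson(lam)."""
    pmf = math.exp(-lam)
    cdf = pmf
    k = 0
    while cdf < u and k < k_max:
        k += 1
        pmf *= lam / k
        cdf += pmf
    return k

def norta_poisson_normal(n, lam, rho_z, seed=0):
    """Return n (count, continuous) pairs built from normals with correlation rho_z."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho_z * z1 + math.sqrt(1.0 - rho_z ** 2) * rng.gauss(0.0, 1.0)
        u = min(std_normal.cdf(z1), 1.0 - 1e-12)  # guard against u == 1
        pairs.append((poisson_quantile(u, lam), z2))
    return pairs

def pearson(xs, ys):
    """Sample Pearson correlation."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

sample = norta_poisson_normal(n=5000, lam=3.0, rho_z=0.6)
counts = [c for c, _ in sample]
normals = [z for _, z in sample]
count_mean = mean(counts)
observed_corr = pearson(counts, normals)
print(count_mean, observed_corr)
```

The observed correlation is no larger in magnitude than `rho_z`, because the quantile transform is nonlinear. That gap is why a mapping between Poisson and normal correlations is needed before a target correlation matrix can be matched.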

4.
Abstract

The regression model with an ordinal outcome has been widely used in many fields. Moreover, predictors measured with error and multicollinearity are long-standing problems that often occur in regression analysis. However, few studies deal with measurement error models with a general ordinal response, and even fewer address the case where they also suffer from multicollinearity. The purpose of this article is to estimate the parameters of ordinal probit models with measurement error and multicollinearity. First, we propose to use regression calibration and refined regression calibration to estimate parameters in ordinal probit models with measurement error. Second, we develop new methods to obtain estimators of parameters in the presence of multicollinearity and measurement error in the ordinal probit model. Furthermore, we extend all the methods to quadratic ordinal probit models and discuss the situation in ordinal logistic models. These estimators are consistent and asymptotically normally distributed under general conditions. They are easy to compute, perform well and are robust against the normality assumption for the predictor variables in our simulation studies. The proposed methods are applied to some real datasets.

5.
Abstract

Missing data arise frequently in clinical and epidemiological fields, in particular in longitudinal studies. This paper describes the core features of the R package wgeesel, which implements marginal model fitting (i.e., weighted generalized estimating equations, WGEE; doubly robust GEE) for longitudinal data with dropouts under the assumption of missing at random. More importantly, this package comprehensively provides existing information criteria for WGEE model selection on marginal mean or correlation structures. It can also serve as a valuable tool for simulating longitudinal data with missing outcomes. Lastly, a real data example and simulations are presented to illustrate and validate the package.

6.
ABSTRACT

We introduce a semi-parametric Bayesian approach based on skewed Dirichlet process priors for location parameters in the ordinal calibration problem. This approach allows the modeling of asymmetrical error distributions. Conditional posterior distributions are implemented, thus allowing the use of Markov chain Monte Carlo to generate the posterior distributions. The methodology is applied to both simulated and real data.

7.

Influence diagnostics are investigated in this study. In particular, an approach based on the generalized linear mixed model setting is presented for formulating ordered categorical counts in stratified contingency tables. Deletion diagnostics and their first-order approximations are developed for assessing the stratum-specific influence on parameter estimates in the models. To illustrate the proposed model diagnostic technique, the method is applied to analyze two sets of data: a clinical trial and a survey study. The two examples demonstrate that the presence of influential strata may substantially change the results in ordinal contingency table analysis.

8.
Nonparametric estimation of copula-based measures of multivariate association in a continuous random vector X=(X1, …, Xd) is usually based on complete continuous data. In many practical applications, however, these types of data are not readily available; instead, aggregated ordinal observations are given, for example, ordinal ratings based on a latent continuous scale. This article introduces a purely nonparametric and data-driven estimator of the unknown copula density and the corresponding copula based on multivariate contingency tables. Estimators for multivariate Spearman's rho and Kendall's tau are based thereon. The properties of these estimators in samples of medium and large size are evaluated in a simulation study. An increasing bias can be observed along with an increasing degree of association between the components. As is to be expected, the bias is severely influenced by the amount of information available. Additionally, the influence of sample size is only marginal. We further give an empirical illustration based on daily returns of five German stocks.

9.
Measurement error and misclassification arise commonly in various data collection processes. It is well known that ignoring these features in the data analysis usually leads to biased inference. Within the generalized linear model setting, Yi et al. [Functional and structural methods with mixed measurement error and misclassification in covariates. J Am Stat Assoc. 2015;110:681–696] developed inference methods to adjust for the effects of measurement error in continuous covariates and misclassification in discrete covariates simultaneously for the scenario where validation data are available. The augmented simulation-extrapolation (SIMEX) approach they developed generalizes the usual SIMEX method, which is applicable only to continuous error-prone covariates. To implement this method, we develop an R package, augSIMEX, for public use. Simulation studies are conducted to illustrate the use of the algorithm. This package is available at CRAN.

10.
ABSTRACT

Online consumer product ratings data are increasing rapidly. While most current graphical displays mainly represent the average ratings, Ho and Quinn proposed an easily interpretable graphical display based on an ordinal item response theory (IRT) model, which successfully accounts for systematic interrater differences. Conventionally, the discrimination parameters in IRT models are constrained to be positive, particularly in the modeling of scored data from educational tests. In this article, we use real-world ratings data to demonstrate that such a constraint can have a great impact on parameter estimation. This impact on estimation is explained through rater behavior. We also discuss correlation among raters and assess the prediction accuracy for both the constrained and the unconstrained models. The results show that the unconstrained model performs better when a larger fraction of rater pairs exhibit negative correlations in ratings.

11.
ABSTRACT

For many years, detection of clusters has been of great public health interest and widely studied. Several methods have been developed to detect clusters and their performance has been evaluated in various contexts. Spatial scan statistics are widely used for geographical cluster detection and inference. Different types of discrete or continuous data can be analyzed using spatial scan statistics for Bernoulli, Poisson, ordinal, exponential, and normal models. In this paper, we propose a scan statistic for survival data which is based on a generalized life distribution model that provides three important life distributions, viz. Weibull, exponential, and Rayleigh. The proposed method is applied to the survival data of tuberculosis patients in the Nainital district of Uttarakhand, India, for the year 2004–05. Monte Carlo simulation studies reveal that the proposed method performs well for different survival distributions.

12.

When analyzing categorical data using loglinear models in sparse contingency tables, asymptotic results may fail. In this paper, the empirical properties of three commonly used asymptotic tests of independence, based on the uniform association model for ordinal data, are investigated by means of Monte Carlo simulation. Five different bootstrapped tests of independence are presented and compared to the asymptotic tests. The comparisons are made with respect to both the size and power properties of the tests. Results indicate that the asymptotic tests have poor size control. The test based on the estimated association parameter is severely conservative and the two chi-squared tests (Pearson, likelihood-ratio) are both liberal. The bootstrap tests that either use a parametric assumption or are based on non-pivotal test statistics do not perform better than the asymptotic tests in all situations. The bootstrap tests that are based on approximately pivotal statistics provide both adjustment of size and enhancement of power. These tests are therefore recommended for use in situations similar to those included in the simulation study.

13.
This article addresses issues in creating public-use data files in the presence of missing ordinal responses and subsequent statistical analyses of the dataset by users. The authors propose a fully efficient fractional imputation (FI) procedure for ordinal responses with missing observations. The proposed imputation strategy retrieves the missing values through the full conditional distribution of the response given the covariates and results in a single imputed data file that can be analyzed by different data users with different scientific objectives. The two most critical aspects of statistical analyses based on the imputed data set, validity and efficiency, are examined through regression analysis involving the ordinal response and a selected set of covariates. It is shown through both theoretical development and simulation studies that, when the ordinal responses are missing at random, the proposed FI procedure leads to valid and highly efficient inferences as compared to existing methods. Variance estimation using the fractionally imputed data set is also discussed. The Canadian Journal of Statistics 48: 138–151; 2020 © 2019 Statistical Society of Canada

14.
A random effects model for analyzing mixed longitudinal count and ordinal data is presented, where the count response is inflated at two points (k and l) and a (k,l)-Inflated Power series distribution is used as its distribution. A full likelihood-based approach is used to obtain maximum likelihood estimates of the parameters of the model. For data with non-ignorable missing values, models with a probit model for the missing mechanism are used. The dependence between longitudinal sequences of responses and inflation parameters is investigated using a random effects approach. Also, to investigate the correlation between the mixed ordinal and count responses of each individual at each time, a shared random effect is used. In order to assess the performance of the model, a simulation study is performed for the case in which the count response has a (k,l)-Inflated Binomial distribution. Performance comparisons of the count-ordinal random effects model, the Zero-Inflated ordinal random effects model and the (k,l)-Inflated ordinal random effects model are also given. The model is applied to a real social data set from the first two waves of the National Longitudinal Study of Adolescent to Adult Health (Add Health study). In this data set, the joint responses are the number of days in a month that each individual smoked as the count response and the general health condition of each individual as the ordinal response. For the count response there is an excess of the values 0 and 30.

15.
ABSTRACT

Motivated by a longitudinal oral health study, the Signal-Tandmobiel® study, a Bayesian approach has been developed to model misclassified ordinal response data. Two regression models have been considered to incorporate misclassification in the categorical response. Specifically, probit and logit models have been developed. The computational difficulties have been avoided by using data augmentation. This idea is exploited to derive efficient Markov chain Monte Carlo methods. Although the method is proposed for ordered categories, it can also be implemented for unordered ones in a simple way. The model performance is shown through a simulation-based example and the analysis of the motivating study.

16.
In order to accelerate object evaluation, some measurement systems commonly use an ordinal scale (e.g., stick results, quality estimation). This paper presents a way to analyze ordinal data variation. As in classical ANOVA for continuous data, ORDANOVA for ordinal data splits the total variation into within and between components. This decomposition has various practical applications such as classification, cluster analysis and the identification of distinguishing features.
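
The within/between split described in this abstract can be made concrete with a short sketch. It assumes a dispersion measure built from cumulative relative frequencies, the sum over cut-points of F_k(1 - F_k), which is one common choice for ordinal data. The abstract does not give the paper's exact definition or normalization, so the measure and all function names below are assumptions for illustration.

```python
def cumulative_freqs(counts):
    """Cumulative relative frequencies F_1..F_{K-1} for K ordered categories."""
    n = sum(counts)
    running, out = 0, []
    for c in counts[:-1]:  # F_K is always 1 and carries no information
        running += c
        out.append(running / n)
    return out

def ordinal_decomposition(groups):
    """Split total ordinal variation into within-group and between-group parts.

    groups: list of per-group category counts, all of the same length K.
    Uses the (assumed, unnormalized) measure sum_k F_k * (1 - F_k).
    """
    sizes = [sum(g) for g in groups]
    n = sum(sizes)
    weights = [s / n for s in sizes]
    pooled = [sum(col) for col in zip(*groups)]
    f_all = cumulative_freqs(pooled)
    f_grp = [cumulative_freqs(g) for g in groups]
    total = sum(f * (1 - f) for f in f_all)
    within = sum(w * sum(f * (1 - f) for f in fg)
                 for w, fg in zip(weights, f_grp))
    between = sum(w * sum((f - fa) ** 2 for f, fa in zip(fg, f_all))
                  for w, fg in zip(weights, f_grp))
    return total, within, between

# Two hypothetical groups rated on a 4-level ordinal scale.
ord_total, ord_within, ord_between = ordinal_decomposition(
    [[8, 5, 2, 1], [1, 3, 6, 10]]
)
print(ord_total, ord_within, ord_between)
```

For this measure the identity total = within + between holds exactly, because each cut-point contributes the variance of a binary indicator and that variance decomposes in the usual ANOVA way. A between share close to the total suggests the groups occupy different parts of the scale.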

17.
Agreement among raters is an important issue in medicine, as well as in education and psychology. The agreement between two raters on a nominal or ordinal rating scale has been investigated in many articles. The multi-rater case with normally distributed ratings has also been explored at length. However, there is a lack of research on multiple raters using an ordinal rating scale. In this simulation study, several methods for analyzing rater agreement were compared. The special case that was focused on was the multi-rater case using a bounded ordinal rating scale. The proposed methods for agreement were compared within different settings. Three main ordinal data simulation settings were used (normal, skewed and shifted data). In addition, the proposed methods were applied to a real data set from dermatology. The simulation results showed that Kendall's W and mean gamma highly overestimated the agreement in data sets with shifts in data. ICC4 for bounded data should be avoided in agreement studies with rating scales < 5, where this method highly overestimated the simulated agreement. The difference in bias for all methods under study, except the mean gamma and Kendall's W, decreased as the rating scale increased. The bias of ICC3 was consistent and small for nearly all simulation settings except the low agreement setting in the shifted data set. Researchers should be careful in selecting agreement methods, especially if shifts in ratings between raters exist, and may apply more than one method before any conclusions are drawn.
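
Of the agreement measures compared in this study, Kendall's W has a short closed form, sketched below in stdlib Python. The sketch uses average ranks for tied scores but applies no tie correction, which matters on short ordinal scales where ties are frequent; whether the study used a tie-corrected version is not stated in the abstract. Function names are illustrative.

```python
def average_ranks(values):
    """Ranks 1..n, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kendalls_w(ratings):
    """Kendall's coefficient of concordance, without tie correction.

    ratings[r][s] is the score rater r gave to subject s.
    """
    m, n = len(ratings), len(ratings[0])
    rank_rows = [average_ranks(row) for row in ratings]
    rank_sums = [sum(row[s] for row in rank_rows) for s in range(n)]
    mean_sum = m * (n + 1) / 2
    s_stat = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * s_stat / (m ** 2 * (n ** 3 - n))

w_identical = kendalls_w([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
w_opposed = kendalls_w([[1, 2, 3, 4], [4, 3, 2, 1]])
print(w_identical, w_opposed)
```

Because W depends only on ranks within each rater, a rater who scores every subject one category higher than a colleague produces the same ranks and therefore the same W. This is consistent with the overestimation the abstract reports for shifted data.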

18.
A popular choice when analyzing ordinal data is to consider the cumulative proportional odds model to relate the marginal probabilities of the ordinal outcome to a set of covariates. However, application of this model relies on the condition of identical cumulative odds ratios across the cut-offs of the ordinal outcome, the well-known proportional odds assumption. This paper focuses on the assessment of this assumption while accounting for repeated and missing data. In this respect, we develop a statistical method built on multiple imputation (MI) based on generalized estimating equations that allows testing of the proportionality assumption under the missing at random setting. The performance of the proposed method is evaluated for two MI algorithms for incomplete longitudinal ordinal data. The impact of both MI methods is compared with respect to the type I error rate and the power for situations covering various numbers of categories of the ordinal outcome, sample sizes, rates of missingness, and well-balanced and skewed data. A comparison of both MI methods with the complete-case analysis is also provided. We illustrate the use of the proposed methods on quality-of-life data from a cancer clinical trial.
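
The proportional odds assumption discussed in this abstract can be checked numerically on a toy model. The sketch below assumes the parameterization logit P(Y <= j | x) = alpha_j - beta * x (some software uses the opposite sign for beta) and uses hypothetical cut-points and slope; it is an illustration of the assumption, not the testing procedure the paper develops.

```python
import math

def cumulative_probs(cutpoints, beta, x):
    """P(Y <= j | x) under logit P(Y <= j | x) = alpha_j - beta * x (assumed sign)."""
    return [1.0 / (1.0 + math.exp(-(a - beta * x))) for a in cutpoints]

def odds(p):
    """Odds corresponding to probability p."""
    return p / (1.0 - p)

def cumulative_odds_ratios(cutpoints, beta, x1, x0):
    """Odds ratio of Y <= j for x1 versus x0, one value per cut-off j."""
    p1 = cumulative_probs(cutpoints, beta, x1)
    p0 = cumulative_probs(cutpoints, beta, x0)
    return [odds(a) / odds(b) for a, b in zip(p1, p0)]

# Hypothetical 4-category outcome: three cut-points, one covariate.
po_ratios = cumulative_odds_ratios(
    cutpoints=[-1.0, 0.5, 2.0], beta=0.8, x1=1.0, x0=0.0
)
print(po_ratios, math.exp(-0.8))
```

Every cut-off yields the same odds ratio, exp(-beta * (x1 - x0)), which is exactly what "identical cumulative odds ratios across the cut-offs" means. If the slope were allowed to differ by cut-off, the ratios would diverge, and that is the departure a test of the assumption looks for.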

19.
Using a multivariate latent variable approach, this article proposes some new general models to analyze correlated bounded continuous and categorical (nominal and/or ordinal) responses with and without non-ignorable missing values. First, we discuss regression methods for jointly analyzing continuous, nominal, and ordinal responses, motivated by the analysis of data from studies of toxicity development. Second, using the beta and Dirichlet distributions, we extend the models so that some bounded continuous responses replace continuous responses. The joint distribution of the bounded continuous, nominal and ordinal variables is decomposed into a marginal multinomial distribution for the nominal variable and a conditional multivariate joint distribution for the bounded continuous and ordinal variables given the nominal variable. We estimate the regression parameters under the new general location models using the maximum-likelihood method. Sensitivity analysis is also performed to study the influence of small perturbations of the parameters of the missing mechanisms of the model on the maximal normal curvature. The proposed models are applied to two data sets: BMI, Steatosis and Osteoporosis data and Tehran household expenditure budgets.

20.
Multiple imputation (MI) is now a reference solution for handling missing data. The default method for MI is the Multivariate Normal Imputation (MNI) algorithm that is based on the multivariate normal distribution. In the presence of longitudinal ordinal missing data, where the Gaussian assumption is no longer valid, application of the MNI method is questionable. This simulation study compares the performance of the MNI and ordinal imputation regression model for incomplete longitudinal ordinal data for situations covering various numbers of categories of the ordinal outcome, time occasions, sample sizes, rates of missingness, well-balanced, and skewed data.
