Similar literature
Found 20 similar documents; search took 250 ms
1.
AStA Advances in Statistical Analysis - Modeling human ratings data subject to raters’ decision uncertainty is an attractive problem in applied statistics. In view of the complex interplay...

2.
Agreement among raters is an important issue in medicine, as well as in education and psychology. The agreement between two raters on a nominal or ordinal rating scale has been investigated in many articles. The multi-rater case with normally distributed ratings has also been explored at length. However, there is a lack of research on multiple raters using an ordinal rating scale. In this simulation study, several methods for analyzing rater agreement were compared. The special case of interest was the multi-rater case using a bounded ordinal rating scale. The proposed methods for agreement were compared within different settings. Three main ordinal data simulation settings were used (normal, skewed and shifted data). In addition, the proposed methods were applied to a real data set from dermatology. The simulation results showed that Kendall's W and mean gamma highly overestimated the agreement in data sets with shifts in data. ICC4 for bounded data should be avoided in agreement studies with rating scales < 5, where this method highly overestimated the simulated agreement. The difference in bias for all methods under study, except mean gamma and Kendall's W, decreased as the rating scale increased. The bias of ICC3 was consistent and small for nearly all simulation settings except the low agreement setting in the shifted data set. Researchers should be careful in selecting agreement methods, especially if shifts in ratings between raters exist, and may apply more than one method before drawing any conclusions.

3.
Generalized linear mixed models are widely used for describing overdispersed and correlated data. Such data arise frequently in studies involving clustered and hierarchical designs. A more flexible class of models has been developed here through the Dirichlet process mixture. An additional advantage of using such mixture models is that the observations can be grouped together on the basis of the overdispersion present in the data. This paper proposes a partial empirical Bayes method for estimating all the model parameters by adopting a version of the EM algorithm. An augmented model that helps to implement an efficient Gibbs sampling scheme, under the non‐conjugate Dirichlet process generalized linear model, generates observations from the conditional predictive distribution of unobserved random effects and provides an estimate of the average number of mixing components in the Dirichlet process mixture. A simulation study has been carried out to demonstrate the consistency of the proposed method. The approach is also applied to a study on outdoor bacteria concentration in the air and to data from 14 retrospective lung‐cancer studies.

4.
Jointly modeling longitudinal and survival data has been an active research area. Most research focuses on improving estimation efficiency but ignores many data features frequently encountered in practice. In the current study, we develop joint models that concurrently account for longitudinal and survival data with multiple features. Specifically, the proposed model handles skewness, missingness and measurement errors in covariates, which are typically observed in the collection of longitudinal survival data from many studies. We employ a Bayesian inferential method to make inference on the proposed model. We apply the proposed model to a real data study. A few alternative models under different conditions are compared. We conduct extensive simulations in order to evaluate how the method works.

5.
Online consumer product ratings data are increasing rapidly. While most of the current graphical displays mainly represent the average ratings, Ho and Quinn proposed an easily interpretable graphical display based on an ordinal item response theory (IRT) model, which successfully accounts for systematic interrater differences. Conventionally, the discrimination parameters in IRT models are constrained to be positive, particularly in the modeling of scored data from educational tests. In this article, we use real-world ratings data to demonstrate that such a constraint can have a great impact on the parameter estimation. This impact on estimation was explained through rater behavior. We also discuss correlation among raters and assess the prediction accuracy for both the constrained and the unconstrained models. The results show that the unconstrained model performs better when a larger fraction of rater pairs exhibit negative correlations in ratings.

6.
This paper aims to classify the elderly population on the basis of disability profile and to quantify the number of those with a very low level of functioning in a central region of Italy. This is accomplished using a set of variables on the difficulty of accomplishing everyday tasks (Activities of Daily Living, ADL) and functions. This issue is very important for National and Local Health organizations in order to evaluate the need for care, plan services, elaborate policies and allocate resources. Latent class models are applied to data from the Italian National Survey on Health Conditions and Appeal to Medicare to extract the latent trait of disability and classify the elderly population according to disability profile. Model selection leads to a classification into four latent classes. Looking at posterior probabilities, the classes may be interpreted as follows: elderly without disability, with difficulties in movements, with difficulties in movements and daily tasks, and with a very low functioning level. Estimates of the size of the population aged 65 or more falling in each class are also provided. Cross-validation provides evidence of the robustness of this classification. Item response theory models are also applied to the items considered to study how functions are lost with increasing levels of disability. In particular, the abilities to climb stairs and stoop down are lost first, while the abilities to eat and wash are lost last.

7.
Cross-classified data are often obtained in controlled experimental situations and in epidemiologic studies. As an example of the latter, occupational health studies sometimes require personal exposure measurements on a random sample of workers from one or more job groups, in one or more plant locations, on several different sampling dates. Because the marginal distributions of exposure data from such studies are generally right-skewed and well-approximated as lognormal, researchers in this area often consider the use of ANOVA models after a logarithmic transformation. While it is then of interest to estimate original-scale population parameters (e.g., the overall mean and variance), standard candidates such as maximum likelihood estimators (MLEs) can be unstable and highly biased. Uniformly minimum variance unbiased (UMVU) estimators offer a viable alternative, and are adaptable to sampling schemes that are typical of experimental or epidemiologic studies. In this paper, we provide UMVU estimators for the mean and variance under two random effects ANOVA models for log-transformed data. We illustrate substantial mean squared error gains relative to the MLE when estimating the mean under a one-way classification. We illustrate that the results can readily be extended to encompass a useful class of purely random effects models, provided that the study data are balanced.

8.
Copula models describe the dependence structure of two random variables separately from their marginal distributions and hence are particularly useful in studying the association for bivariate survival data. Semiparametric inference for bivariate survival data based on copula models has been studied for various types of data, including complete data, right-censored data, and current status data. This article discusses the boundary effect on these inference procedures, a problem that has been neglected in the previous literature. Specifically, the asymptotic distribution of the association estimator on the boundary of the parameter space is derived for one-dimensional copula models. The boundary properties are applied to test independence and to study the estimation efficiency. A simulation study is conducted for bivariate right-censored data and current status data.

9.
Outliers are commonly observed in psychosocial research, generally resulting in biased estimates when comparing group differences using popular mean-based models such as the analysis of variance model. Rank-based methods such as the popular Mann–Whitney–Wilcoxon (MWW) rank sum test are more effective at addressing such outliers. However, available methods for inference are limited to cross-sectional data and cannot be applied to longitudinal studies with missing data. In this paper, we propose a generalized MWW test for comparing multiple groups with covariates within a longitudinal data setting, by utilizing functional response models. Inference is based on a class of U-statistics-based weighted generalized estimating equations, providing consistent and asymptotically normal estimates not only under complete data but under missing data as well. The proposed approach is illustrated with both real and simulated study data.

10.
Survival data involving silent events are often subject to interval censoring (the event is known to occur within a time interval) and to classification errors if a test without perfect sensitivity and specificity is applied. Taking the nature of these data into account plays an important role in estimating the distribution of the time until the occurrence of the event. In this context, we incorporate validation subsets into the parametric proportional hazard model, and show that these additional data, combined with Bayesian inference, compensate for the lack of knowledge about test sensitivity and specificity, improving the parameter estimates. The proposed model is evaluated through simulation studies, and Bayesian analysis is conducted within a Gibbs sampling procedure. The posterior estimates obtained under validation subset models present lower bias and standard deviation compared to the scenario with no validation subset or the model that assumes perfect sensitivity and specificity. Finally, we illustrate the usefulness of the new methodology with an analysis of real data about HIV acquisition in female sex workers that have been discussed in the literature.

11.
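Using 2014 sample survey data on migrant workers in Liaoning Province, this study examines whether vocational training can reduce migrant workers' job switching. Taking the heterogeneity of human capital into account, it answers this question with a discrete choice model. The study finds that vocational training provided by firms significantly reduces migrant workers' job switching by increasing firm-specific human capital; training provided by the government or by migrant workers themselves adds general human capital and therefore cannot effectively reduce job switching. The study therefore concludes that, to effectively address migrant workers' frequent job changes, firms should bring migrant workers into their training systems. At the same time, local governments can use subsidy policies to help firms train migrant workers and to share firms' training costs and risks.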

12.
In the past decades, the number of variables explaining observations in different practical applications has increased gradually. This has led to heavy computational tasks, despite the wide use of provisional variable selection methods in data processing. Therefore, more methodological techniques have appeared to reduce the number of explanatory variables without losing much of the information. In these techniques, two distinct approaches are apparent: ‘shrinkage regression’ and ‘sufficient dimension reduction’. Surprisingly, there has not been any communication or comparison between these two methodological categories, and it is not clear when each of these two approaches is appropriate. In this paper, we fill some of this gap by first reviewing each category in brief, paying special attention to the most commonly used methods in each category. We then compare commonly used methods from both categories based on their accuracy, computation time, and their ability to select effective variables. A simulation study on the performance of the methods in each category is carried out as well. The selected methods are concurrently tested on two sets of real data, which allows us to recommend conditions under which one approach is more appropriate for high-dimensional data.

13.
Statistics as data is ancient, but as a discipline of study and research it has a short history. Courses leading to degrees in statistics were introduced in universities some sixty to seventy years ago. They were not considered to constitute a basic discipline with a subject matter of its own. However, during the last seventy-five years, it has developed as a powerful blend of science, technology and art for solving problems in all areas of human endeavor. Nowadays statistics is used in scientific research, economic development through optimum use of resources, increasing industrial productivity, medical diagnosis, legal practice, disputed authorship, and optimum decision making at individual and institutional levels. What is the future of statistics in the coming millennium dominated by information technology encompassing the whole of communications, interaction with intelligent systems, massive databases, and complex information processing networks? The current statistical methodology based on probabilistic models applied to small data sets appears to be inadequate to meet the needs of society in terms of quick processing of data and making the information available for practical purposes. Ad hoc methods are being put forward under the title Data Mining by computer scientists and engineers to meet the needs of customers. The paper reviews the current state of the art in statistics and discusses possible future developments considering the availability of large data sets, enormous computing power and efficient optimization techniques using genetic algorithms and neural networks.

14.
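Li Bowen et al., 《统计研究》 (Statistical Research), 2021, 38(10): 105-120
Using the 2013 and 2016 "employer-employee" matched databases for Nanhai District, Foshan City, Guangdong Province, and estimation methods such as propensity score matching and instrumental variables, this paper empirically tests the effect of union membership on migrant workers' wage rates and how that effect differs between two generations of migrant workers, and explores and analyzes the mechanisms involved. The study finds that, when the wage premium of union membership is decomposed into a coverage effect and a membership effect, the coverage effect has little influence, while the membership effect differs across generations: it is significant only among new-generation migrant workers and not among first-generation migrant workers. Further analysis finds that the coverage effect is weak mainly because unions have failed to carry out collective bargaining effectively, while the generational difference in the membership effect stems from differences in the motivations for joining a union, which arise from the two generations' different levels of needs. This research can help enterprise unions focus on resolving migrant workers' wage problems and establish a long-term mechanism for wage growth, and it offers a reference for enterprises seeking to build harmonious labor relations and implement diversified human resource management strategies.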

15.
Non-nested hypothesis tests provide a way to test the specification of an econometric model against the evidence provided by one or more non-nested alternatives. This paper surveys the recent literature on non-nested hypothesis testing in the context of regression and related models. Much of the purely statistical literature which has evolved from the fundamental work of Cox (1961, 1962) is discussed briefly or not at all. Instead, emphasis is placed on those techniques which are easy to employ in practice and are likely to be useful to applied workers.

16.
In biomedical studies, frailty models are commonly used in analyzing multivariate survival data, where the objective of the study is to estimate both the covariate effect and the dependence between the multivariate survival times. However, inference based on these models is dependent on the distributional assumption of frailty. We propose a diagnostic plot for assessing the frailty assumption. The proposed method is based on the cross-ratio function and the diagnostic plot suggested by Oakes (1989). We use kernel regression smoothing, with bandwidth choice by cross-validation, to obtain the proposed plot. The resulting plot is capable of differentiating between the gamma and positive stable frailty models when strong association is present. We illustrate the feasibility of our method using simulation studies under known frailty distributions. The approach is applied to data on blindness for each eye of diabetic patients with adult onset diabetes, and a reasonable fit to the gamma frailty model is found.

17.
Non-nested hypothesis tests provide a way to test the specification of an econometric model against the evidence provided by one or more non-nested alternatives. This paper surveys the recent literature on non-nested hypothesis testing in the context of regression and related models. Much of the purely statistical literature which has evolved from the fundamental work of Cox (1961, 1962) is discussed briefly or not at all. Instead, emphasis is placed on those techniques which are easy to employ in practice and are likely to be useful to applied workers.

18.
Summary. In a large, prospective longitudinal study designed to monitor cardiac abnormalities in children born to women who are infected with the human immunodeficiency virus, instead of a single outcome variable, there are multiple binary outcomes (e.g. abnormal heart rate, abnormal blood pressure and abnormal heart wall thickness) considered as joint measures of heart function over time. In the presence of missing responses at some time points, longitudinal marginal models for these multiple outcomes can be estimated by using generalized estimating equations (GEEs), and consistent estimates can be obtained under the assumption of a missingness completely at random mechanism. When the missing data mechanism is missingness at random, i.e. the probability of missing a particular outcome at a time point depends on observed values of that outcome and the remaining outcomes at other time points, we propose joint estimation of the marginal models by using a single modified GEE based on an EM-type algorithm. The method proposed is motivated by the longitudinal study of cardiac abnormalities in children who were born to women infected with the human immunodeficiency virus, and analyses of these data are presented to illustrate the application of the method. Further, in an asymptotic study of bias, we show that, under a missingness at random mechanism in which missingness depends on all observed outcome variables, our joint estimation via the modified GEE produces almost unbiased estimates, provided that the correlation model has been correctly specified, whereas estimates from standard GEEs can lead to substantial bias.

19.
Quantile regression (QR) models have received increasing attention recently for longitudinal data analysis. When continuous responses exhibit non-centrality due to outliers and/or heavy tails, commonly used mean regression models may fail to produce efficient estimators, whereas QR models may perform satisfactorily. In addition, longitudinal outcomes are often measured with non-normality, substantial errors and non-ignorable missing values. When carrying out statistical inference in such a data setting, it is important to account for these data features simultaneously; otherwise, erroneous or even misleading results may be produced. In the literature, there has been considerable interest in accommodating either one or some of these data features. However, there is relatively little work concerning all of them simultaneously. There is a need to fill this gap, as longitudinal data often do have these characteristics. Inferential procedures can become dramatically more complicated when these data features arise in longitudinal response and covariate outcomes. In this article, our objective is to develop QR-based Bayesian semiparametric mixed-effects models to address the simultaneous impact of these multiple data features. The proposed models and method are applied to analyse a longitudinal data set arising from an AIDS clinical study. Simulation studies are conducted to assess the performance of the proposed method under various scenarios.

20.
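New-generation migrant workers are the main source of new urban labor, and raising their level of human capital is important for advancing supply-side structural reform and strengthening the momentum of sustained economic growth. To study the composition, measurement, and current state of new-generation migrant workers' human capital, an indicator system is built along three dimensions (health capital, experience capital, and skill capital), and a latent-variable-based measurement model of new-generation migrant workers' human capital is constructed. Using questionnaire survey data, the partial least squares method is applied to measure their human capital. The results show that health capital contributes positively to the formation of skill capital; health capital affects income through skill capital; differences in skill capital and experience capital lead directly to differences in income; and the skill capital and experience capital of new-generation migrant workers remain low. Therefore, efforts to raise the human capital of new-generation migrant workers should move beyond the limits of formal academic education and shift toward vocational skills training that meets the needs of industrial development and toward the accumulation of technical experience.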
