Similar Literature
20 similar documents found (search time: 171 ms)
1.
Simpson, Carroll, Zhou and Guth (1996) developed an ordinal response regression approach to meta-analysis of data from diverse toxicology studies, applying the methodology to a database of acute inhalation studies of tetrachloroethylene. We present an alternative analysis of the same data, with two major differences: (1) interval-censored scores are assigned worst-case values, e.g., a score known to lie in the interval [0,1] is set equal to 1; and (2) the response is reduced to a binary one (adverse, nonadverse). We explore the stability of the analysis by varying a robustness parameter and graphing the curves traced out by the estimates and confidence intervals.
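The two preprocessing steps described in this abstract (worst-case coding of interval-censored scores, then binarization) can be sketched as follows; the severity scale 0..3 and the adverse threshold of 1 are illustrative assumptions, not taken from the paper.

```python
# Sketch of the two preprocessing steps: worst-case coding of
# interval-censored ordinal scores, then reduction to a binary response.
# The 0..3 severity scale and threshold are hypothetical illustrations.

def worst_case(interval):
    """Collapse an interval-censored score [lo, hi] to its worst case (hi)."""
    lo, hi = interval
    return hi

def binarize(score, adverse_threshold=1):
    """Reduce an ordinal severity score to adverse (1) / nonadverse (0)."""
    return 1 if score >= adverse_threshold else 0

raw = [(0, 1), (2, 2), (0, 0), (1, 3)]       # interval-censored scores
scores = [worst_case(iv) for iv in raw]      # worst-case values
binary = [binarize(s) for s in scores]       # adverse / nonadverse
```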

2.

Time-to-event data often violate the proportional hazards assumption inherent in the popular Cox regression model. Such violations are especially common in biological and medical data, where latent heterogeneity due to unmeasured covariates or time-varying effects is common. A variety of parametric survival models have been proposed in the literature which make more appropriate assumptions on the hazard function, at least for certain applications. One such model is derived from the First Hitting Time (FHT) paradigm, which assumes that a subject's event time is determined by a latent stochastic process reaching a threshold value. Several random-effects specifications of the FHT model have also been proposed which allow for better modeling of data with unmeasured covariates. While useful, these methods often display limited flexibility due to their inability to model a wide range of heterogeneity. To address this issue, we propose a Bayesian model which loosens the assumptions on the mixing distribution inherent in the random-effects FHT models currently in use. We demonstrate via a simulation study that the proposed model greatly improves both survival and parameter estimation in the presence of latent heterogeneity. We also apply the proposed methodology to data from a toxicology/carcinogenicity study which exhibits nonproportional hazards and contrast the results with both the Cox model and two popular FHT models.
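In the basic FHT setup, a Wiener process started at a positive level with negative drift hits zero at an inverse Gaussian time; its survival function has a closed form. The sketch below shows that baseline model only (the illustrative parameter values are assumptions), not the Bayesian random-effects extension the abstract proposes.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def fht_survival(t, x0=1.0, drift=-0.5, sigma=1.0):
    """Survival function of the first hitting time of 0 for a Wiener
    process started at x0 > 0 with the given drift and diffusion sigma
    (the basic FHT model; an inverse Gaussian event time when drift < 0).
    Parameter values here are illustrative."""
    if t <= 0:
        return 1.0
    s = sigma * math.sqrt(t)
    return (norm_cdf((x0 + drift * t) / s)
            - math.exp(-2.0 * drift * x0 / sigma ** 2)
              * norm_cdf((drift * t - x0) / s))
```

With negative drift the process hits the threshold almost surely, so the survival curve decays from 1 toward 0.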


3.
Skewed models are important and necessary when parametric analyses are carried out on data. Mixture distributions produce widely flexible models with good statistical and probabilistic properties, and the mixture inverse Gaussian (MIG) model is one of these. Transformations of the MIG model also create new parametric distributions, which are useful in diverse situations. The aim of this paper is to discuss several aspects of the MIG distribution useful for modelling positive data. We specifically discuss transformations, the derivation of moments, the fitting of models, and a shape analysis of the transformations. Finally, real examples from engineering, the environment, insurance, and toxicology are presented to illustrate some of the results developed here. Three of the four data sets, which arose from the consulting work of the authors, are new and have not been previously analysed. All of these examples show that the empirical fit of the MIG distribution to the data is very good.
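One common parameterization of the MIG mixes an inverse Gaussian density with its length-biased counterpart; that form is assumed in the sketch below (the paper may use a different parameterization), with illustrative parameter values. Since the length-biased density (x/mu)f(x) integrates to E[X]/mu = 1 for an IG(mu, lambda) variable, the mixture remains a proper density.

```python
import math

def ig_pdf(x, mu=1.0, lam=1.0):
    """Inverse Gaussian density IG(mu, lambda)."""
    if x <= 0:
        return 0.0
    return (math.sqrt(lam / (2.0 * math.pi * x ** 3))
            * math.exp(-lam * (x - mu) ** 2 / (2.0 * mu ** 2 * x)))

def mig_pdf(x, p=0.7, mu=1.0, lam=1.0):
    """Mixture inverse Gaussian: convex combination of IG(mu, lambda)
    and its length-biased counterpart (one common parameterization;
    p, mu, lam are illustrative values, not from the paper)."""
    return p * ig_pdf(x, mu, lam) + (1.0 - p) * (x / mu) * ig_pdf(x, mu, lam)

# The mixture should still integrate to 1 over (0, infinity); a crude
# rectangle-rule check on a truncated grid:
grid = [i * 0.01 for i in range(1, 5000)]
area = sum(mig_pdf(x) for x in grid) * 0.01
```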

4.
In classical factor analysis, a few outliers can bias the factor structure extracted from the relationships between manifest variables. As in least-squares regression analysis, there is no built-in protection against deviant observations. This paper discusses estimation methods which aim to extract the "true" factor structure reflecting the relationships within the bulk of the data. Such estimation methods constitute the core of robust factor analysis. By means of a simulation study, we illustrate that an implementation of robust estimation methods can lend considerable improvement to the validity of a factor analysis.

5.
A Cluster Analysis Method Based on Weighted Principal Component Distance   (Cited by 1: 0 self-citations, 1 other citation)
Lü Yanwei, Li Ping. Statistical Research (《统计研究》), 2016, 33(11): 102-108
High correlation among indicators, together with differences in their importance, often prevents traditional cluster analysis methods from achieving good classification results. Building on a discussion of the limitations of traditional cluster analysis and its various refinements, this paper mathematically reconstructs the distance concept in the definition of classification and, by defining an adaptively weighted principal component distance as the classification statistic, proposes a new, improved principal component clustering method: weighted principal component distance cluster analysis. Theoretical study shows that the method systematically integrates the advantages of existing cluster analysis methods and rests on a solid theoretical foundation. Simulation results show that weighted principal component distance cluster analysis effectively resolves the distortions that existing clustering methods suffer in certain settings and yields better classification results.
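The classification statistic described above amounts to a weighted Euclidean distance computed on principal component scores. A minimal sketch of that distance is below; using each component's share of explained variance as its weight is an illustrative assumption, and the paper's adaptive weighting scheme may differ.

```python
import math

def weighted_pc_distance(u, v, weights):
    """Weighted Euclidean distance between two observations expressed as
    principal-component scores. `weights` (e.g. each component's share of
    explained variance, an illustrative choice) are normalized to sum to 1."""
    total = sum(weights)
    w = [wi / total for wi in weights]
    return math.sqrt(sum(wi * (a - b) ** 2 for wi, a, b in zip(w, u, v)))
```

With equal weights this reduces to a rescaled Euclidean distance; putting all weight on one component ignores the others entirely.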

6.
Informative dropout is a vexing problem for any biomedical study. Most existing statistical methods attempt to correct estimation bias related to this phenomenon by specifying unverifiable assumptions about the dropout mechanism. We consider a cohort study in Africa that uses an outreach programme to ascertain the vital status for dropout subjects. These data can be used to identify a number of relevant distributions. However, as only a subset of dropout subjects were followed, vital status ascertainment was incomplete. We use semi‐competing risk methods as our analysis framework to address this specific case where the terminal event is incompletely ascertained and consider various procedures for estimating the marginal distribution of dropout and the marginal and conditional distributions of survival. We also consider model selection and estimation efficiency in our setting. Performance of the proposed methods is demonstrated via simulations, asymptotic study and analysis of the study data.

7.
Functional data analysis (FDA)—the analysis of data that can be considered a set of observed continuous functions—is an increasingly common class of statistical analysis. One of the most widely used FDA methods is the cluster analysis of functional data; however, little work has been done to compare the performance of clustering methods on functional data. In this article, a simulation study compares the performance of four major hierarchical methods for clustering functional data. The simulated data varied in three ways: the nature of the signal functions (periodic, nonperiodic, or mixed), the amount of noise added to the signal functions, and the pattern of the true cluster sizes. The Rand index was used to compare the performance of each clustering method. As a secondary goal, clustering methods were also compared when the number of clusters has been misspecified. To illustrate the results, a real set of functional data was clustered where the true clustering structure is believed to be known. Comparing the clustering methods for the real data set confirmed the findings of the simulation. This study yields concrete suggestions to help future researchers determine the best method for clustering their functional data.
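The Rand index used as the performance measure above is the share of item pairs on which two clusterings agree (grouped together in both, or separated in both). A minimal stdlib implementation:

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Rand index between two clusterings of the same items: the fraction
    of item pairs on which the clusterings agree (together in both, or
    apart in both). Ranges from 0 to 1; 1 means identical partitions."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)
```

Note the index depends only on the partition structure, not on the label names, so relabeled but identical clusterings score 1.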

8.
The problem of two-group classification has implications in a number of fields, such as medicine, finance, and economics. This study aims to compare methods of two-group classification. The minimum-sum-of-deviations linear programming model, linear discriminant analysis, quadratic discriminant analysis, logistic regression, classification based on the multivariate analysis of variance (MANOVA) test and on the unpooled T-square test, support vector machines, k-nearest neighbor methods, and a combined classification method are compared on data structures exhibiting fat tails and/or skewness. The comparison has been carried out by using a simulation procedure designed for various stable distribution structures and sample sizes.

9.
Wang Binhui. Statistical Research (《统计研究》), 2007, 24(8): 72-76
Traditional multivariate methods such as principal component analysis and factor analysis share a common starting point: they compute the sample mean vector and covariance matrix and base all further statistics on these two quantities. When the sample contains no outliers, these methods give excellent results. When it does, however, the results are easily distorted, because the conventional mean vector and covariance matrix are not robust statistics. This paper studies the algorithm of the currently popular FAST-MCD method, constructs a robust mean vector and a robust covariance matrix, applies them to principal component analysis, and proposes improvements to address the method's shortcomings. Simulation and empirical results show that the improved method and the new robust estimators do indeed resist outliers well, greatly reducing their influence on the results.
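The MCD idea behind FAST-MCD is to base location and scatter on the h-point subset with the smallest covariance determinant. In one dimension this can be computed exactly by scanning contiguous windows of the sorted sample, which makes for a compact illustration of the robustness the abstract describes (the multivariate FAST-MCD algorithm itself is much more involved).

```python
from statistics import mean, variance

def mcd_location_1d(data, h):
    """Exact 1-D MCD location estimate: the mean of the h-point contiguous
    window of the sorted sample with the smallest variance. Outliers fall
    outside the chosen window and so cannot drag the estimate."""
    xs = sorted(data)
    windows = [xs[i:i + h] for i in range(len(xs) - h + 1)]
    best = min(windows, key=variance)
    return mean(best)

sample = [1, 2, 3, 4, 6, 100]               # one gross outlier
robust_loc = mcd_location_1d(sample, h=4)   # ignores the outlier
naive_loc = mean(sample)                    # dragged upward by it
```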

10.
The purpose of this paper is to examine the multiple group (>2) discrimination problem in which the group sizes are unequal and the variables used in the classification are correlated with skewed distributions. Using statistical simulation based on data from a clinical study, we compare the performances, in terms of misclassification rates, of nine statistical discrimination methods. These methods are linear and quadratic discriminant analysis applied to untransformed data, rank transformed data, and inverse normal scores data, as well as fixed kernel discriminant analysis, variable kernel discriminant analysis, and variable kernel discriminant analysis applied to inverse normal scores data. It is found that the parametric methods with transformed data generally outperform the other methods, and the parametric methods applied to inverse normal scores usually outperform the parametric methods applied to rank transformed data. Although the kernel methods often have very biased estimates, the variable kernel method applied to inverse normal scores data provides considerable improvement in terms of total nonerror rate.
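The inverse normal scores transformation referred to above replaces each value by a normal quantile of its rank. A common recipe (the Blom constants used here are one standard choice, not necessarily the paper's) is sketched below:

```python
from statistics import NormalDist

def inverse_normal_scores(data):
    """Blom-type inverse normal scores: rank the data, convert ranks to
    probabilities (r - 3/8)/(n + 1/4), then apply the standard normal
    quantile function. A common way to normalize a skewed variable before
    a parametric discriminant analysis. Assumes no ties."""
    n = len(data)
    order = sorted(range(n), key=lambda i: data[i])
    ranks = [0] * n
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    z = NormalDist()
    return [z.inv_cdf((r - 0.375) / (n + 0.25)) for r in ranks]

# A heavily skewed sample maps to symmetric, order-preserving scores:
scores = inverse_normal_scores([10, 200, 3000, 40000, 500000])
```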

11.
Exposure Stratified Case-Cohort Designs   (Cited by 5: 1 self-citation, 4 other citations)
A variant of the case-cohort design is proposed for the situation in which a correlate of the exposure (or prognostic factor) of interest is available for all cohort members, and exposure information is to be collected for a case-cohort sample. The cohort is stratified according to the correlate, and the subcohort is selected by stratified random sampling. A number of possible methods for the analysis of such exposure stratified case-cohort samples are presented, some of their statistical properties developed, and approximate relative efficiency and optimal allocation to the strata discussed. The methods are compared to each other, and to randomly sampled case-cohort studies, in a limited computer simulation study. We found that all of the proposed analysis methods performed well and were more efficient than a randomly sampled case-cohort study.
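The stratified subcohort selection step can be sketched in a few lines; the stratum names and sampling fractions below are illustrative (the paper discusses how to choose the allocation optimally).

```python
import random

def stratified_subcohort(cohort_strata, fractions, seed=42):
    """Draw a stratified random subcohort. `cohort_strata` maps each
    stratum of the exposure correlate to its subject ids; `fractions`
    maps each stratum to its sampling fraction (illustrative values;
    optimal allocation is a separate question)."""
    rng = random.Random(seed)
    subcohort = {}
    for stratum, ids in cohort_strata.items():
        k = round(fractions[stratum] * len(ids))
        subcohort[stratum] = rng.sample(ids, k)
    return subcohort

# Oversample the small high-exposure-correlate stratum:
strata = {"low": list(range(100)), "high": list(range(100, 140))}
sub = stratified_subcohort(strata, {"low": 0.10, "high": 0.50})
```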

12.
The statistical analysis of late‐stage variety evaluation trials using a mixed model is described, with one‐ or two‐stage approaches to the analysis. Two sets of trials, from Australia and the UK, were used to provide realistic scenarios for a simulation study to evaluate the different methods of analysis. This study showed that a one‐stage approach gave the most accurate predictions of variety performance overall or within each environment, across a range of models, as measured by mean squared error of prediction or realized genetic gain. A weighted two‐stage approach performed adequately for variety predictions both overall and within environments, but a two‐stage unweighted approach performed poorly in both cases. A generalized heritability measure was developed to compare methods.

13.
In pre-clinical oncology studies, tumor-bearing animals are treated and observed over a period of time in order to measure and compare the efficacy of one or more cancer-intervention therapies along with a placebo/standard of care group. A data analysis is typically carried out by modeling and comparing tumor volumes, functions of tumor volumes, or survival. Data analysis on tumor volumes is complicated because animals under observation may be euthanized prior to the end of the study for one or more reasons, such as when an animal's tumor volume exceeds an upper threshold. In such a case, the tumor volume is missing not-at-random for the time remaining in the study. To work around the non-random missingness issue, several statistical methods have been proposed in the literature, including the rate of change in log tumor volume and partial area under the curve. In this work, an examination and comparison of the test size and statistical power of these and other popular methods for the analysis of tumor volume data is performed through realistic Monte Carlo computer simulations. The performance, advantages, and drawbacks of popular statistical methods for animal oncology studies are reported. The recommended methods are applied to a real data set.
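The rate-of-change-in-log-tumor-volume summary mentioned above is simply a per-animal OLS slope of log(volume) on time, which uses only the time points actually observed before euthanasia. A minimal sketch (the growth-rate value is an illustrative example):

```python
import math

def log_volume_slope(times, volumes):
    """Per-animal rate of change in log tumor volume: the OLS slope of
    log(volume) regressed on time. Only observed time points enter, so
    the summary remains computable when follow-up ends early."""
    ys = [math.log(v) for v in volumes]
    n = len(times)
    tbar = sum(times) / n
    ybar = sum(ys) / n
    sxx = sum((t - tbar) ** 2 for t in times)
    sxy = sum((t - tbar) * (y - ybar) for t, y in zip(times, ys))
    return sxy / sxx

# Exponential growth V(t) = 5 * exp(0.2 t) recovers the rate 0.2:
ts = [0, 2, 4, 6]
slope = log_volume_slope(ts, [5 * math.exp(0.2 * t) for t in ts])
```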

14.
This paper compares and contrasts two methods of obtaining opinions using questionnaires. As the name suggests, a conjoint study makes it possible to consider several attributes jointly. Conjoint analysis is a statistical method for analysing preferences. However, conjoint analysis requires a certain amount of effort from the respondent. The alternative is ordinary survey questions, answered one at a time. Survey questions are easier to grasp mentally, but they do not challenge the respondent to prioritize. This investigation utilized both methods, survey and conjoint, making it possible to compare them on real data. Attribute importance, attribute correlations, case clustering and attribute grouping are evaluated by both methods. Correspondence between how the two methods measure each attribute is also given. Overall, both methods yield the same picture concerning the relative importance of the attributes. Taken one attribute at a time, the correspondence between the methods varies from good to nonexistent. Considering all attributes together by cluster analysis of the cases, the conjoint and survey data yield different cluster structures. The attributes are grouped by factor analysis, and there is reasonable correspondence. The data originate from the EU project 'New Intermediary services and the transformation of urban water supply and wastewater disposal systems in Europe'.

15.
This article reports results of an extensive simulation study which investigated the performances of some commonly used methods of estimating error rates in discriminant analysis. Earlier research papers limited their comparisons of these methods to independent training data. This study allows for a simple auto-regressive dependence among the training data. The results suggest that the estimation methods based on the normal distribution perform adequately well under conditions of negative or mild positive correlation in the data, and small dimensions (p) of the observation vectors. For large p or strong positive correlation structures the conclusion is that one of the better non-parametric methods should be used. Special circumstances and conditions which notably affect the relative performances of the methods are identified.

16.
Blind source separation (BSS) is an important analysis tool in various signal processing applications like image, speech or medical signal analysis. The most popular BSS solutions have been developed for independent component analysis (ICA) with independent and identically distributed (iid) observation vectors. In many BSS applications the iid assumption is not realistic, however, as the data are often an observed time series with temporal correlation and even nonstationarity. In this paper, some BSS methods for time series with nonstationary variances are discussed. We also suggest ways to robustify these methods and illustrate their performance in a simulation study.

17.
Regression analysis is one of the methods most widely used in prediction problems. Although there are many methods for parameter estimation in regression analysis, the ordinary least squares (OLS) technique is the most commonly used among them. However, this technique is highly sensitive to outlying observations. Therefore, the literature suggests robust techniques when a data set includes outliers. Moreover, in a prediction problem, techniques that reduce the influence of outliers, and that use a median rather than a mean of the errors as the target function, will be more successful in modeling such data. In this study, a new parameter estimation method based on particle swarm optimization was proposed, one that minimizes the median of the absolute relative errors, where each relative error is the difference between the observed and predicted values divided by the observed value. The performance of the proposed method was evaluated in a simulation study comparing it with OLS and some other robust methods from the literature.
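The target function described above, the median of absolute relative errors, can be written in a few lines; a particle swarm optimizer would then minimize it over the model parameters (the PSO itself is omitted here, and the data are illustrative). A single gross outlier leaves the criterion untouched, unlike a mean-based loss.

```python
from statistics import median

def median_abs_relative_error(observed, predicted):
    """The target function described in the abstract: the median of
    |obs - pred| / obs over all observations. The median mutes the
    influence of outliers that would dominate a mean-based loss."""
    return median(abs(o - p) / o for o, p in zip(observed, predicted))

obs = [10.0, 20.0, 30.0, 40.0, 1000.0]   # last observation is an outlier
pred = [10.0, 20.0, 30.0, 40.0, 50.0]    # fit that ignores the outlier
loss = median_abs_relative_error(obs, pred)
```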

18.
The multiple imputation technique has proven to be a useful tool in missing data analysis. We propose a Markov chain Monte Carlo method to conduct multiple imputation for incomplete correlated ordinal data using the multivariate probit model. We conduct a thorough simulation study to compare the performance of our proposed method with two available imputation methods – multivariate normal-based and chained-equation methods – for various missing data scenarios. For illustration, we present an application using the data from the smoking cessation treatment study for low-income community corrections smokers.

19.
The efficient use of surrogate or auxiliary information has been investigated within both model-based and design-based approaches to data analysis, particularly in the context of missing data. Here we consider the use of such data in epidemiological studies of disease incidence in which surrogate measures of disease status are available for all subjects at two time points, but definitive diagnoses are available only in stratified subsamples. We briefly review methods for the analysis of two-phase studies of disease prevalence at a single time point, and we discuss the extension of four of these methods to the analysis of incidence studies. Their performance is compared with special reference to a study of the incidence of senile dementia.

20.
Interim analysis is important in a large clinical trial for ethical and cost considerations. Sometimes, an interim analysis needs to be performed at an earlier than planned time point. In that case, methods using stochastic curtailment are useful in examining the data for early stopping while controlling the inflation of type I and type II errors. We consider a three-arm randomized study of treatments to reduce perioperative blood loss following major surgery. Owing to slow accrual, an unplanned interim analysis was required by the study team to determine whether the study should be continued. We distinguish two different cases: when all treatments are under direct comparison and when one of the treatments is a control. We used simulations to study the operating characteristics of five different stochastic curtailment methods. We also considered the influence of the timing of the interim analyses on the type I error and power of the test. We found that the type I error and power can differ considerably between the methods. The analysis for the perioperative blood loss trial was carried out at approximately a quarter of the planned sample size. We found little evidence that the active treatments are better than placebo and recommended closure of the trial.
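One widely used stochastic-curtailment quantity is the conditional power under the current trend, in the B-value formulation: project the interim drift forward and ask how likely the final statistic is to clear the critical value. The sketch below shows that generic quantity only, as an assumption-laden illustration, not the five specific methods the paper compares.

```python
from statistics import NormalDist

def conditional_power(z_obs, t, alpha=0.05):
    """Conditional power under the current trend (B-value formulation):
    given the interim z-statistic z_obs at information fraction t, the
    drift is estimated as B(t)/t and projected to the end of the trial.
    One generic stochastic-curtailment criterion, shown as a sketch."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)   # two-sided critical value
    b = z_obs * t ** 0.5                     # B-value B(t) = Z(t) * sqrt(t)
    drift = b / t                            # trend estimate
    mean_final = b + drift * (1.0 - t)       # projected final B-value
    sd_final = (1.0 - t) ** 0.5              # sd of remaining increment
    return 1.0 - nd.cdf((z_crit - mean_final) / sd_final)
```

A strong interim signal at half the information yields conditional power near 1; a null trend yields conditional power near the unconditional tail probability, supporting futility-based closure.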


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号