Similar Documents
20 similar documents found (search time: 31 ms)
1.
Various statistical tests have been developed for testing the equality of means in matched pairs with missing values. However, most existing methods are commonly based on certain distributional assumptions such as normality, 0-symmetry or homoscedasticity of the data. The aim of this paper is to develop a statistical test that is robust against deviations from such assumptions and also leads to valid inference in case of heteroscedasticity or skewed distributions. This is achieved by applying a clever randomization approach to handle missing data. The resulting test procedure is not only shown to be asymptotically correct but is also finitely exact if the distribution of the data is invariant with respect to the considered randomization group. Its small sample performance is further studied in an extensive simulation study and compared to existing methods. Finally, an illustrative data example is analysed.
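The abstract does not spell out the randomization scheme, but the core idea, restricted here to complete pairs for simplicity, can be sketched as a sign-flipping randomization test (a minimal illustration, not the paper's actual procedure; the data are invented):

```python
import random

def sign_flip_test(diffs, n_perm=2000, seed=0):
    """One-sample randomization test of H0: mean paired difference = 0.

    Randomly flips the sign of each paired difference; under a null of
    symmetry about zero, every sign pattern is equally likely.
    """
    rng = random.Random(seed)
    observed = abs(sum(diffs) / len(diffs))
    hits = 0
    for _ in range(n_perm):
        flipped = [d * rng.choice((-1, 1)) for d in diffs]
        if abs(sum(flipped) / len(flipped)) >= observed:
            hits += 1
    return hits / n_perm

# Paired differences with an obvious positive shift.
diffs = [1.2, 0.8, 1.5, 0.9, 1.1, 1.3, 0.7, 1.0]
p = sign_flip_test(diffs)
```

Because the null only requires symmetry, the test is exactly valid without normality or homoscedasticity; the paper's contribution is extending this style of argument to pairs with missing components.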

2.
The use of surrogate end points has become increasingly common in medical and biological research. This is primarily because, in many studies, the primary end point of interest is too expensive or too difficult to obtain. There is now a large volume of statistical methods for analysing studies with surrogate end point data. However, to our knowledge, there has not been a comprehensive review of these methods to date. This paper reviews some existing methods and summarizes the strengths and weaknesses of each. It also discusses the assumptions made by each method and assesses how likely these assumptions are to hold in practice.

3.
Lee  Chi Hyun  Ning  Jing  Shen  Yu 《Lifetime data analysis》2019,25(1):79-96

Length-biased data are frequently encountered in prevalent cohort studies. Many statistical methods have been developed to estimate the covariate effects on the survival outcomes arising from such data while properly adjusting for length-biased sampling. Among them, regression methods based on the proportional hazards model have been widely adopted. However, little work has focused on checking the proportional hazards model assumptions with length-biased data, which is essential to ensure the validity of inference. In this article, we propose a statistical tool for testing the assumed functional form of covariates and the proportional hazards assumption graphically and analytically under the setting of length-biased sampling, through a general class of multiparameter stochastic processes. The finite sample performance is examined through simulation studies, and the proposed methods are illustrated with the data from a cohort study of dementia in Canada.


4.
Recurrent events in clinical trials have typically been analysed using either a multiple time-to-event method or a direct approach based on the distribution of the number of events. An area of application for these methods is exacerbation data from respiratory clinical trials. The different approaches to the analysis and the issues involved are illustrated for a large trial (n = 1465) in chronic obstructive pulmonary disease (COPD). For exacerbation rates, clinical interest centres on a direct comparison of rates for each treatment, which favours the distribution-based analysis rather than a time-to-event approach. Poisson regression has often been employed and has recently been recommended as the appropriate method of analysis for COPD exacerbations, but its key assumptions often appear unreasonable for this analysis. By contrast, the use of a negative binomial model, which corresponds to assuming a separate Poisson parameter for each subject, offers a more appealing approach. Non-parametric methods avoid some of the assumptions required by these models, but do not provide appropriate estimates of treatment effects because of the discrete and bounded nature of the data.
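The point that a negative binomial model "corresponds to assuming a separate Poisson parameter for each subject" can be illustrated with a small simulation (a sketch with invented numbers, not the trial's analysis): drawing each subject's Poisson rate from a gamma distribution yields negative binomial counts whose variance exceeds the mean, violating the equidispersion that plain Poisson regression assumes.

```python
import math
import random
import statistics

def poisson_draw(rng, lam):
    """Knuth's inversion method; adequate for small rates."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

def simulate_exacerbations(n=5000, mean_rate=2.0, shape=1.5, seed=1):
    """Each subject gets an individual Poisson rate drawn from a gamma
    distribution; the marginal counts are then negative binomial."""
    rng = random.Random(seed)
    return [poisson_draw(rng, rng.gammavariate(shape, mean_rate / shape))
            for _ in range(n)]

counts = simulate_exacerbations()
m = statistics.mean(counts)
v = statistics.variance(counts)
# A Poisson model forces variance == mean; the gamma mixing inflates
# the variance to roughly mean + mean**2 / shape.
```

The excess of `v` over `m` is exactly the kind of overdispersion that makes a Poisson analysis of exacerbation counts unreasonable.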

5.
On Testing Equality of Distributions of Technical Efficiency Scores
The challenge of the econometric problem in production efficiency analysis is that the efficiency scores to be analyzed are unobserved. Statistical properties have recently been discovered for a type of estimator popular in the literature, known as data envelopment analysis (DEA). This opens up a wide range of possibilities for well-grounded statistical inference about the true efficiency scores from their DEA estimates. In this paper we investigate the possibility of using existing tests for the equality of two distributions in such a context. Considering the statistical complications pertinent to our context, we consider several approaches to adapting the Li test to the context and explore their performance in terms of the size and power of the test in various Monte Carlo experiments. One of these approaches shows good performance for both the size and the power of the test, thus encouraging its use in empirical studies. We also present an empirical illustration analyzing the efficiency distributions of countries in the world, following up a recent study by Kumar and Russell (2002), and report very interesting results.

6.
This article introduces BestClass, a set of SAS macros, available in the mainframe and workstation environment, designed for solving two-group classification problems using a class of recently developed nonparametric classification methods. The criteria used to estimate the classification function are based on either minimizing a function of the absolute deviations from the surface which separates the groups, or directly minimizing a function of the number of misclassified entities in the training sample. The solution techniques used by BestClass to estimate the classification rule use the mathematical programming routines of the SAS/OR software. Recently, a number of research studies have reported that under certain data conditions this class of classification methods can provide more accurate classification results than existing methods, such as Fisher's linear discriminant function and logistic regression. However, these robust classification methods have not yet been implemented in the major statistical packages, and hence are beyond the reach of those statistical analysts who are unfamiliar with mathematical programming techniques. We use a limited simulation experiment and an example to compare and contrast properties of the methods included in BestClass with existing parametric and nonparametric methods. We believe that BestClass contributes significantly to the field of nonparametric classification analysis, in that it provides the statistical community with convenient access to this recently developed class of methods. BestClass is available from the authors.

7.
Decisions concerning the management of fisheries are founded on confidence statements for interest parameters such as biomass and exploitation rate, derived from complex structural models that describe the dynamics of fisheries. We identify four generic statistical issues and focus on how they impact on the reliability of those confidence statements: (a) parameters for which the data have little or no information; (b) competing structural relationships; (c) weighting of observations; and (d) alternative methods for computing confidence statements. Our purpose is to give an exposition of how these issues impact on fisheries' analyses, with the intent of stimulating thought on more effective alternatives. We describe the fisheries' management context and use two specific studies to illustrate how these generic statistical issues impact on fisheries assessment results. It is demonstrated that these statistical issues can have a profound impact on fishery management decisions and that established approaches to handle them have not been fully developed.

8.
Nonparametric predictive inference (NPI) is a statistical approach based on few assumptions about probability distributions, with inferences based on data. NPI assumes exchangeability of random quantities, both related to observed data and future observations, and uncertainty is quantified using lower and upper probabilities. In this paper, units from several groups are placed simultaneously on a lifetime experiment and times-to-failure are observed. The experiment may be ended before all units have failed. Depending on the available data and few assumptions, we present lower and upper probabilities for selecting the best group, the subset of best groups and the subset including the best group. We also compare our approach of selecting the best group with some classical precedence selection methods. Throughout, examples are provided to demonstrate our method.

9.
10.
Skewed models are important and necessary when parametric analyses are carried out on data. Mixture distributions produce widely flexible models with good statistical and probabilistic properties, and the mixture inverse Gaussian (MIG) model is one of those. Transformations of the MIG model also create new parametric distributions, which are useful in diverse situations. The aim of this paper is to discuss several aspects of the MIG distribution useful for modelling positive data. We specifically discuss transformations, the derivation of moments, fitting of models, and a shape analysis of the transformations. Finally, real examples from engineering, environment, insurance, and toxicology are presented for illustrating some of the results developed here. Three of the four data sets, which have arisen from the consulting work of the authors, are new and have not been previously analysed. All these examples show that the empirical fit of the MIG distribution to the data is very good.

11.
Modern statistical methods using incomplete data have been increasingly applied in a wide variety of substantive problems. Similarly, receiver operating characteristic (ROC) analysis, a method used in evaluating diagnostic tests or biomarkers in medical research, has also become increasingly popular, both in its development and its application. While missing-data methods have been applied in ROC analysis, the impact of model mis-specification and/or of the assumptions (e.g. missing at random) underlying the missing data has not been thoroughly studied. In this work, we study the performance of multiple imputation (MI) inference in ROC analysis. In particular, we investigate parametric and non-parametric techniques for MI inference under common missingness mechanisms. Depending on the coherency of the imputation model with the underlying data-generation mechanism, our results show that MI generally leads to well-calibrated inferences under ignorable missingness mechanisms.

12.
Statistical Analysis of Functional Data: Ideas, Methods and Applications
严明义 《统计研究》(Statistical Research) 2007, 24(2):87-94
Abstract: In practice, a growing number of research fields collect sample observations with functional characteristics. Such functional data combine time-series and cross-sectional features, and some are even curves or other functional images. Although the panel-data methods developed in econometrics over the past two decades have proved valuable in applications, panel data are only a special type of functional data, and their analysis relies heavily on linear model structures and restrictive assumptions. Based on the general characteristics of functional data, this paper introduces a new method for analysing them and is the first to apply it to economic functional data, broadening the range of applications of functional data analysis. The results show that functional data analysis has advantages over econometric and other statistical methods, and in particular can reveal features of the data that other methods cannot.

13.
Pragmatic trials offer practical means of obtaining real-world evidence to help improve decision-making in comparative effectiveness settings. Unfortunately, incomplete adherence is a common problem in pragmatic trials. The methods commonly used in randomized controlled trials often cannot handle the added complexity imposed by incomplete adherence, resulting in biased estimates. Several naive methods and advanced causal inference methods (e.g., inverse probability weighting and instrumental variable-based approaches) have been used in the literature to deal with incomplete adherence. Practitioners and applied researchers are often confused about which method to consider under a given setting. The present work reviews commonly used statistical methods for dealing with non-adherence, along with their key assumptions, advantages, and limitations, with a particular focus on pragmatic trials. We have listed the applicable settings for these methods and provided a summary of available software. All methods were applied to two hypothetical datasets to demonstrate how they perform in a given scenario, along with R code. The key considerations include the type of intervention strategy (point treatment settings, where treatment is administered only once, versus sustained treatment settings, where treatment has to be continued over time) and availability of data (e.g., the extent of measured or unmeasured covariates that are associated with adherence, dependent confounding impacted by past treatment, and potential violation of assumptions). This study will guide practitioners and applied researchers to use the appropriate statistical method to address incomplete adherence in pragmatic trial settings for both the point and sustained treatment strategies.
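For the point-treatment setting, the inverse probability weighting idea mentioned in the abstract can be sketched in a few lines (a toy simulation with invented numbers; the `p_adhere` table plays the role of a fitted adherence model, which in practice would be estimated, e.g. by logistic regression):

```python
import random

rng = random.Random(42)

# One binary covariate X; adherence probability and mean outcome
# under treatment both depend on X (all values invented).
p_adhere = {0: 0.9, 1: 0.5}
mean_outcome = {0: 1.0, 1: 3.0}  # true population mean = 2.0

subjects = []  # only adherers contribute observed outcomes
for _ in range(20000):
    x = int(rng.random() < 0.5)          # X ~ Bernoulli(0.5)
    if rng.random() < p_adhere[x]:        # adherence depends on X
        y = mean_outcome[x] + rng.gauss(0, 1)
        subjects.append((x, y))

# Naive mean over adherers is biased: X=1 subjects (with higher
# outcomes) drop out more often, so they are under-represented.
naive = sum(y for _, y in subjects) / len(subjects)

# IPW: reweight each adherer by 1 / P(adherence | X) to recover
# the full-population mean.
wsum = sum(1 / p_adhere[x] for x, _ in subjects)
ipw = sum(y / p_adhere[x] for x, y in subjects) / wsum
```

Here the weighting removes the selection induced by covariate-dependent adherence; the harder settings the paper covers (sustained treatment, time-varying confounding) generalize this with time-varying weights.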

14.
Measuring the efficiency of public services: the limits of analysis
Summary.  Policy makers are increasingly seeking to develop overall measures of the efficiency of public service organizations. To that end, the use of 'off-the-shelf' statistical tools such as data envelopment analysis and stochastic frontier analysis has been advocated for measuring organizational efficiency. The analytical sophistication of such methods has reached an advanced stage of development. We discuss the context within which such models are deployed, their underlying assumptions and their usefulness for a regulator of public services. Four specific model building issues are discussed: the weights that are attached to public service outputs; the specification of the statistical model; the treatment of environmental influences on performance; the treatment of dynamic effects. The paper concludes with recommendations for policy makers and researchers on the development and use of efficiency measurement techniques.

15.
Classification of high-dimensional data sets is a major challenge for statistical learning and data mining algorithms. To apply classification methods effectively to high-dimensional data sets, feature selection is an indispensable pre-processing step of the learning process. In this study, we consider the problem of constructing an effective feature selection and classification scheme for data sets with a small sample size and a large number of features. A novel feature selection approach, named Four-Staged Feature Selection, has been proposed to overcome the high-dimensional data classification problem by selecting informative features. The proposed method first selects candidate features with a number of filtering methods based on different metrics, and then applies semi-wrapper, union and voting stages, respectively, to obtain final feature subsets. Several statistical learning and data mining methods have been carried out to verify the efficiency of the selected features. In order to test the adequacy of the proposed method, 10 different microarray data sets are employed due to their high number of features and small sample size.
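The filtering stage the abstract describes can be illustrated with one simple filter metric (a hypothetical sketch using absolute Pearson correlation with the class label; the actual method combines several metrics and adds semi-wrapper, union and voting stages):

```python
import math
import random

def correlation_filter(X, y, k):
    """Rank features by |Pearson correlation| with the class label
    and keep the indices of the top k."""
    n, p = len(X), len(X[0])
    my = sum(y) / n
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    scores = []
    for j in range(p):
        col = [row[j] for row in X]
        mx = sum(col) / n
        sx = math.sqrt(sum((a - mx) ** 2 for a in col) / n)
        cov = sum((a - mx) * (b - my) for a, b in zip(col, y)) / n
        scores.append(abs(cov / (sx * sy)) if sx > 0 and sy > 0 else 0.0)
    return sorted(range(p), key=lambda j: scores[j], reverse=True)[:k]

# Microarray-like toy data: 30 samples, 50 features, and only
# feature 7 actually tracks the binary class label.
rng = random.Random(3)
y = [i % 2 for i in range(30)]
X = [[rng.gauss(0, 1) for _ in range(50)] for _ in range(30)]
for i, row in enumerate(X):
    row[7] = y[i] + rng.gauss(0, 0.1)

top = correlation_filter(X, y, 5)
```

A filter like this is cheap enough to run on thousands of features before handing a small candidate set to the more expensive wrapper-style stages.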

16.
Summary. The problem of analysing longitudinal data that are complicated by possibly informative drop-out has received considerable attention in the statistical literature. Most researchers have concentrated on either methodology or application, but we begin this paper by arguing that more attention could be given to study objectives and to the relevant targets for inference. Next we summarize a variety of approaches that have been suggested for dealing with drop-out. A long-standing concern in this subject area is that all methods require untestable assumptions. We discuss circumstances in which we are willing to make such assumptions and we propose a new and computationally efficient modelling and analysis procedure for these situations. We assume a dynamic linear model for the expected increments of a constructed variable, under which subject-specific random effects follow a martingale process in the absence of drop-out. Informal diagnostic procedures to assess the tenability of the assumption are proposed. The paper is completed by simulations and a comparison of our method and several alternatives in the analysis of data from a trial into the treatment of schizophrenia, in which approximately 50% of recruited subjects dropped out before the final scheduled measurement time.

17.
Informative dropout is a vexing problem for any biomedical study. Most existing statistical methods attempt to correct estimation bias related to this phenomenon by specifying unverifiable assumptions about the dropout mechanism. We consider a cohort study in Africa that uses an outreach programme to ascertain the vital status for dropout subjects. These data can be used to identify a number of relevant distributions. However, as only a subset of dropout subjects were followed, vital status ascertainment was incomplete. We use semi‐competing risk methods as our analysis framework to address this specific case where the terminal event is incompletely ascertained and consider various procedures for estimating the marginal distribution of dropout and the marginal and conditional distributions of survival. We also consider model selection and estimation efficiency in our setting. Performance of the proposed methods is demonstrated via simulations, asymptotic study and analysis of the study data.

18.
Real-time polymerase chain reaction (PCR) is a reliable quantitative technique in gene expression studies. The statistical analysis of real-time PCR data is crucial for interpreting and explaining results. Statistical procedures for analyzing real-time PCR data determine the slope of the regression line and calculate the reaction efficiency. Mathematical functions have been applied to quantify the target gene relative to the reference gene(s). Moreover, these statistical techniques compare Ct (threshold cycle) numbers between control and treatment groups. There are many different procedures in SAS for real-time PCR data evaluation. In this study, the efficiency-calibrated model and the delta-delta Ct model have been statistically tested and explained. Several methods were tested to compare control with treatment means of Ct, including the t-test (parametric), the Wilcoxon test (non-parametric) and multiple regression. Results showed that the applied methods led to similar conclusions, and no significant difference was observed between results of gene expression measurement by the relative method.
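The delta-delta Ct calculation the abstract refers to is simple enough to state directly (the Ct values below are hypothetical, and the basic method assumes roughly 100% amplification efficiency, i.e. a doubling per cycle):

```python
def fold_change(ct_target_trt, ct_ref_trt, ct_target_ctl, ct_ref_ctl):
    """Relative expression by the delta-delta Ct method:
    normalize target Ct against the reference gene within each
    condition, difference the two, and exponentiate base 2."""
    delta_trt = ct_target_trt - ct_ref_trt
    delta_ctl = ct_target_ctl - ct_ref_ctl
    ddct = delta_trt - delta_ctl
    return 2 ** (-ddct)

# Hypothetical Ct values: the target crosses threshold 2 cycles
# earlier under treatment, relative to the reference gene.
fc = fold_change(ct_target_trt=22.0, ct_ref_trt=18.0,
                 ct_target_ctl=24.0, ct_ref_ctl=18.0)
```

The efficiency-calibrated model the study also tests replaces the fixed base 2 with per-gene efficiencies estimated from standard-curve slopes.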

19.
New data collection and storage technologies have given rise to a new field of streaming data analytics, called real-time statistical methodology for online data analyses. Most existing online learning methods are based on homogeneity assumptions, which require the samples in a sequence to be independent and identically distributed. However, inter-data batch correlation and dynamically evolving batch-specific effects are among the key defining features of real-world streaming data such as electronic health records and mobile health data. This article is built under a state-space mixed model framework in which the observed data stream is driven by a latent state process that follows a Markov process. In this setting, online maximum likelihood estimation is made challenging by high-dimensional integrals and complex covariance structures. In this article, we develop a real-time Kalman-filter-based regression analysis method that updates both point estimates and their standard errors for fixed population average effects while adjusting for dynamic hidden effects. Both theoretical justification and numerical experiments demonstrate that our proposed online method has statistical properties similar to those of its offline counterpart and enjoys great computational efficiency. We also apply this method to analyze an electronic health record dataset.
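The abstract does not give the recursion, but the Kalman-filter regression update it builds on can be sketched in its simplest scalar form (a toy version with one coefficient, no hidden batch effects, and invented data; the article's method handles vector states and dynamic latent effects):

```python
import random

def kalman_step(m, P, x, y, q, r):
    """One Kalman update for y_t = x_t * beta_t + noise, where beta_t
    follows a random walk with variance q and r is the observation
    noise variance. Returns the updated posterior (mean, variance)."""
    P = P + q                    # predict: random-walk state
    S = x * x * P + r            # innovation variance
    K = P * x / S                # Kalman gain
    m = m + K * (y - x * m)      # correct mean with the new observation
    P = (1 - K * x) * P          # shrink posterior variance
    return m, P

# Stream of (x, y) pairs generated with a fixed beta = 2; with q = 0
# the recursion reduces to online (recursive) least squares.
rng = random.Random(0)
m, P = 0.0, 10.0                 # diffuse prior on beta
for _ in range(500):
    x = rng.gauss(0, 1)
    y = 2.0 * x + rng.gauss(0, 0.5)
    m, P = kalman_step(m, P, x, y, q=0.0, r=0.25)
```

Each batch is processed once and discarded, which is the source of the computational efficiency the abstract highlights; `P` doubles as the running standard-error estimate for the coefficient.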

20.
This article presents a general Bayesian analysis of incomplete categorical data considered as generated by a statistical model involving the categorical sampling process and the observable censoring process. The novelty is that we allow dependence of the censoring-process parameters on the sampling categories, i.e., an informative censoring process. In this way, we relax the assumptions under which both classical and Bayesian solutions have been developed. The proposed solution is outlined for the relevant case of censoring patterns based on partitions. It is completely developed for a simple but typical example. Several possible extensions of our procedure are discussed in the final remarks.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号