Similar Documents
20 similar documents found (search time: 15 ms)
1.
2.
The ecological fallacy is related to Simpson's paradox (1951), in which relationships among group means may be counterintuitive and substantially different from the relationships within groups; the groups are usually geographic entities such as census tracts. We consider the problem of estimating the correlation between two jointly normal random variables when only ecological data (group means) are available. Two empirical Bayes estimators and one fully Bayesian estimator are derived and compared with the usual ecological estimator, which is simply the Pearson correlation coefficient of the group sample means. We simulate the bias and mean squared error performance of these estimators, and also give an example employing a dataset in which the individual-level data are available for model checking. The results indicate the superiority of the empirical Bayes estimators in a variety of practical situations where, though individual-level data are lacking, other relevant prior information is available.
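As a minimal illustration of the ecological fallacy described above, the following sketch (illustrative simulated data, not the paper's estimators) contrasts the naive ecological estimator, the Pearson correlation of the group means, with the average within-group correlation:

```python
import numpy as np

rng = np.random.default_rng(0)

# 20 groups of 50 individuals; x and y are negatively correlated within
# each group, but the group means line up positively.
group_centers = np.linspace(0.0, 10.0, 20)
xs, ys = [], []
for m in group_centers:
    x = m + rng.normal(0.0, 1.0, 50)
    y = m - 0.8 * (x - m) + rng.normal(0.0, 0.5, 50)  # within-group slope -0.8
    xs.append(x)
    ys.append(y)

# Naive ecological estimator: Pearson correlation of the group sample means.
xbar = np.array([x.mean() for x in xs])
ybar = np.array([y.mean() for y in ys])
r_ecological = np.corrcoef(xbar, ybar)[0, 1]

# Average within-group correlation (requires individual-level data).
r_within = np.mean([np.corrcoef(x, y)[0, 1] for x, y in zip(xs, ys)])
```

Here the ecological correlation is strongly positive while every within-group correlation is negative, exactly the reversal that motivates estimators using more than the group means.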

3.
The Wilcoxon-Mann-Whitney statistic is commonly used for a distribution-free comparison of two groups. One requirement for its use is that the sample sizes of the two groups are fixed. This requirement is violated in some applications, such as medical imaging and diagnostic marker studies: in the former, the number of correctly localized abnormal images is random, while in the latter some subjects lack observable measurements. For this reason, we propose a random-sum Wilcoxon statistic for comparing two groups in the presence of ties, and derive its variance as well as its asymptotic distribution for large sample sizes. The proposed statistic includes the regular Wilcoxon rank-sum statistic as a special case. Finally, we apply the proposed statistic to summarize location response operating characteristic data from a liver computed tomography study, and also to summarize the diagnostic accuracy of biomarker data.
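The fixed-sample-size statistic that the random-sum version generalizes can be sketched as follows; the toy data are illustrative, and ties are handled with midranks:

```python
import numpy as np
from scipy.stats import rankdata, mannwhitneyu

x = np.array([1, 2, 2, 3, 5, 5, 7])   # group 1 (contains ties)
y = np.array([2, 4, 5, 5, 6, 8])      # group 2

combined = np.concatenate([x, y])
ranks = rankdata(combined)            # midranks for tied values
W = ranks[: len(x)].sum()             # Wilcoxon rank-sum statistic for group 1

# Cross-check against the equivalent Mann-Whitney U statistic,
# which satisfies U = W - n1 * (n1 + 1) / 2.
U = mannwhitneyu(x, y, alternative="two-sided").statistic
```

The random-sum extension in the paper additionally treats the group sizes as random, which this fixed-size sketch does not attempt.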

4.
In case-control evaluations of cancer screening, subjects who have died from the cancer in question (cases) are compared with those who have not (controls) with respect to their screening histories. Among other biases, this method is subject to a rather subtle one whereby the cases have had greater opportunity to be screened than the controls. In this paper, we propose a method of correction for this bias and demonstrate its use on two case-control studies of mammographic screening for breast cancer.

5.
Longitudinal health-related quality-of-life (QOL) data are often collected as part of clinical studies. Here two analyses of QOL data from a prospective study of breast cancer patients evaluate how physical performance is related to factors such as age, menopausal status and type of adjuvant treatment. The first analysis uses summary statistic methods. The same questions are then addressed using a multilevel model. Because of the structure of the physical performance response, regression models for the analysis of ordinal data are used. The analyses of base-line and follow-up QOL data at four time points over two years from 257 women show that reported base-line physical performance was consistently associated with later performance and that women who had received chemotherapy in the month before the QOL assessment had a greater physical performance burden. There is a slight power gain of the multilevel model over the summary statistic analysis. The multilevel model also allows relationships with time-dependent covariates to be included, highlighting treatment-related factors affecting physical performance that could not be considered within the summary statistic analysis. Checking of the multilevel model assumptions is exemplified.

6.
Clinical studies with small numbers of patients are conducted by pharmaceutical companies and research institutions. Constraints that lead to a small clinical study include a single investigative site with highly specialized expertise or equipment, rare diseases, and limited time and budget. We consider the following topics, which we believe will be helpful for the investigator and statistician working together on the design and analysis of small clinical studies: definitions of various types of small studies (exploratory, pilot, proof of concept); bias and ways to mitigate it; commonly used study designs for randomized and nonrandomized studies, and some less commonly used designs; potential ethical issues associated with small underpowered clinical studies; sample size for small studies; and statistical analysis methods for different types of variables and multiplicity issues. We conclude the paper with recommendations made by an Institute of Medicine committee that was asked to assess the current methodologies and the appropriate situations for conducting small clinical studies.

7.
We performed a simulation study comparing the statistical properties of the estimated log odds ratio from propensity score analyses of a binary response variable in which missing baseline data had been imputed using a simple scheme (Treatment Mean Imputation), three ways of performing multiple imputation (MI), or a Complete Case analysis. MI that included treatment (treated/untreated) and outcome (here, an adverse event [yes/no]) in the imputer's model had the best statistical properties of the imputation schemes we studied, and MI is feasible to use in situations where one has just a few outcomes to analyze. We also found that Treatment Mean Imputation performed quite well and is a reasonable alternative to MI in situations where MI is not feasible; indeed, it performed better than MI methods that did not include both the treatment and the outcome in the imputer's model.
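A minimal numpy sketch of the Treatment Mean Imputation scheme compared in this study (simulated data with illustrative values; the subsequent propensity-score step is omitted):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
treated = rng.integers(0, 2, n).astype(bool)
baseline = rng.normal(50.0, 10.0, n) + 5.0 * treated  # baseline differs by arm

# Introduce roughly 20% missingness in the baseline covariate.
missing = rng.random(n) < 0.2
x = baseline.copy()
x[missing] = np.nan

# Treatment Mean Imputation: replace each missing value with the mean of
# the observed values in the same treatment arm.
x_imp = x.copy()
for arm in (True, False):
    idx = treated == arm
    arm_mean = np.nanmean(x[idx])
    x_imp[idx & missing] = arm_mean
```

The imputed covariate `x_imp` would then feed into the propensity score model; the point of the scheme is that it preserves the between-arm difference in the covariate means.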

8.
We define the odd log-logistic exponential Gaussian regression with two systematic components, which extends the heteroscedastic Gaussian regression and is suitable for the bimodal data quite common in agriculture. We estimate the parameters by the method of maximum likelihood. Simulations indicate that the maximum-likelihood estimators are accurate. The model assumptions are checked through case deletion and quantile residuals. The usefulness of the new regression model is illustrated on three real data sets from different areas of agriculture, all of which exhibit bimodality.

9.
To study the relationship between a sensitive binary response variable and a set of non-sensitive covariates, this paper develops a hidden logistic regression to analyse non-randomized response data collected via the parallel model originally proposed by Tian (2014). This is the first paper to employ logistic regression analysis in the field of non-randomized response techniques. Both the Newton–Raphson algorithm and a monotone quadratic lower-bound algorithm are developed to derive the maximum likelihood estimates of the parameters of interest. In particular, the proposed logistic parallel model can be used to study the association between a sensitive binary variable and another non-sensitive binary variable via the odds ratio. Simulations are performed, and a study of sexual practice data from the United States is used to illustrate the proposed methods.

10.
An alarming report from an environmental pressure group raised concerns about childhood leukaemia and the Irish Sea. In response, this ecological study explores the hypotheses that childhood cancer rates are increased by living near the coast of Wales, especially in the north, and in particular near estuaries and mud-flats. Using Poisson regression to adjust for possible confounding variables, no evidence was found for a coastline proximity effect at the level of census wards (5 km). Moreover, the rates were significantly lower near estuaries than along the rest of the coast, while there was a small but non-significant increase near mud-flats. Case–control modelling of postcoded cases living within the coastal wards using Stone's method also failed to detect any monotonic reduction in relative risk near the coastline.
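The covariate-adjusted Poisson regression used in this kind of ecological analysis can be sketched as follows; this is an illustrative simulation with no true coastline effect, fitted by Newton-Raphson, and all variable names and values are assumptions rather than the study's data or model:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
near_coast = rng.integers(0, 2, n).astype(float)   # 1 = ward near the coast
deprivation = rng.normal(0.0, 1.0, n)              # confounder
log_pop = np.log(rng.uniform(500.0, 5000.0, n))    # offset: ward population

# Simulate counts under NO coastline effect, mirroring the null finding.
counts = rng.poisson(np.exp(-7.0 + 0.3 * deprivation + log_pop))

X = np.column_stack([np.ones(n), near_coast, deprivation])

# Newton-Raphson (IRLS) for the Poisson log-linear model with an offset.
beta = np.array([-5.0, 0.0, 0.0])
for _ in range(30):
    mu = np.exp(X @ beta + log_pop)
    score = X.T @ (counts - mu)            # gradient of the log-likelihood
    info = (X * mu[:, None]).T @ X         # Fisher information
    beta = beta + np.linalg.solve(info, score)
```

The fitted `beta[1]` is the log rate ratio for coastal proximity after adjusting for the confounder; under the null simulation it stays near zero.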

11.
Since the late 1980s, regular monitoring of the human immunodeficiency virus epidemic in England and Wales has been carried out through the work of successive national working groups. One of their tasks has been to provide short-term projections of the incidence of acquired immune deficiency syndrome. In this paper the data and methods used in this projection work are reviewed and results critically assessed with the aim of highlighting the strong interaction between methodological developments and data acquisition.

12.
The analysis of exogeneity in econometric time-series models as formalized in the seminal paper by Engle et al. [Econometrica 51 (1983), 277–304] is extended to cover a more general class of models, including error-components models. The Bayesian framework adopted here allows us to take full advantage of a number of statistical tools, related to the reduction of Bayesian experiments, and motivates a careful consideration of prediction issues, leading to a concept of predictive exogeneity. We also adapt the formal definitions of weak and strong exogeneity introduced in Engle et al. (1983), and provide a naturally nested set of definitions for exogeneity. An example highlights the main implications of our analysis for econometric modelling.

13.
The instigation of mass screening for breast cancer has, over the last three decades, raised various statistical issues and led to the development of new statistical approaches. Initially, the design of screening trials was the main focus of research but, as the evidence in favour of population-based screening programmes mounts, a variety of other applications have also been identified. These include administrative and quality control tasks, for monitoring routine screening services, as well as epidemiological modelling of incidence and mortality. We review the commonly used methods of cancer screening evaluation, highlight some current issues in breast screening and, using examples from randomized trials and established screening programmes, illustrate the role that statistical science has played in the development of clinical research in this field.

14.
Preventive maintenance (PM) scheduling of units is a crucial issue that affects both the economics and the reliability of power systems. In this paper, we describe an application of statistical analysis for determining the best PM strategy in the case of parallel, series, and single-item replacement systems. A key aspect of industrial maintenance is the trade-off between the cost and the timing of PM operations. The goal of this study is to determine the best time for performing PM operations in each system, and also to find the number of spare parts in single-item replacement systems and the number of facilities in parallel systems, so that the average cost per unit time is minimized. In the proposed maintenance strategy, PM operations are performed regularly on the production unit at equal time intervals. Finally, three examples are presented to demonstrate the effectiveness of the proposed models.
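A standard way to formalize this cost/time trade-off is the age-replacement cost-rate criterion, sketched below with illustrative Weibull failure times and costs (assumed values, not the paper's models): replace preventively at age T at cost cp, or on failure at cost cf, and minimize the long-run average cost per unit time.

```python
import numpy as np

# Weibull lifetime with increasing hazard (shape > 1); illustrative values.
shape, scale = 2.5, 100.0      # hours
cp, cf = 50.0, 500.0           # preventive vs failure replacement cost

t = np.linspace(1.0, 300.0, 3000)
R = np.exp(-(t / scale) ** shape)            # survival function
F = 1.0 - R

# Expected cycle length for replacement at age T: integral_0^T R(u) du
cycle_len = np.cumsum(R) * (t[1] - t[0])

# Long-run average cost per unit time and its minimizing PM interval.
cost_rate = (cp * R + cf * F) / cycle_len
T_star = t[np.argmin(cost_rate)]
```

Because failures cost ten times a planned replacement and the hazard is increasing, the optimal interval `T_star` is well below the mean lifetime; with constant hazard (shape = 1) no interior optimum would exist.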

15.
Benzene is classified as a group 1 human carcinogen by the International Agency for Research on Cancer, and it is now accepted that occupational exposure is associated with an increased risk of various leukaemias. However, occupational exposure accounts for less than 1% of all benzene exposures, the major sources being cigarette smoking and vehicle exhaust emissions. Whether such low level exposures to environmental benzene are also associated with the risk of leukaemia is currently not known. In this study, we investigate the relationship between benzene emissions arising from outdoor sources (predominantly road traffic and petrol stations) and the incidence of childhood leukaemia in Greater London. An ecological design was used because of the rarity of the disease, the difficulty of obtaining individual level measurements of benzene exposure and the availability of data. However, some methodological difficulties were encountered, including problems of case registration errors, the choice of geographical areas for analysis, exposure measurement errors and ecological bias. We use a Bayesian hierarchical modelling framework to address these issues, and we investigate the sensitivity of our inference to various modelling assumptions.

16.
The use of the logit transformation on paired-comparison data in the weighted least squares analysis of response surfaces for aesthetic qualities of products is discussed. Monte Carlo simulations are employed to investigate the small-sample properties of the estimators and test statistics; a secondary objective of the simulations is the comparison of two transformation procedures. The simulations are of standard-item paired-comparison experiments in which ties are not allowed.
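A minimal sketch of the logit transformation with asymptotic weighted least squares on paired-comparison proportions (illustrative counts and a single attribute; not the paper's response-surface design):

```python
import numpy as np

# Proportion preferring the test item over the standard at each setting.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # attribute level
n = np.array([40, 40, 40, 40, 40])        # comparisons per level
wins = np.array([8, 14, 20, 27, 33])      # times the test item was preferred

p = wins / n
logit = np.log(p / (1.0 - p))
w = n * p * (1.0 - p)    # asymptotic WLS weights for the empirical logit

# Weighted least squares fit of logit(p) = b0 + b1 * x.
X = np.column_stack([np.ones_like(x), x])
W = np.diag(w)
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ logit)
```

With several attributes, the same weighted fit extends to a full response surface in the transformed scale; the weights account for the binomial variance of each observed proportion.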

17.
A method is proposed for the estimation of missing data in analysis of covariance models, based on obtaining an estimate of the missing observation that minimizes the error sum of squares. The estimate is derived explicitly for the one-factor analysis of covariance, and numerical examples are given to show the nature of the estimates produced. Parameter estimates from the imputed data are then compared with those from the incomplete data.
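The idea of choosing the missing value to minimize the error sum of squares can be sketched for a one-factor analysis of covariance; the classical result is that the minimizing value is simply the fitted value from the observed cases (illustrative data throughout):

```python
import numpy as np

# One-factor ANCOVA: y_ij = alpha_i + beta * x_ij + error
groups = np.repeat([0, 1, 2], 4)
x = np.array([2.0, 3.0, 5.0, 6.0, 1.0, 4.0, 4.0, 7.0, 2.0, 3.0, 6.0, 8.0])
y = np.array([5.1, 6.0, 8.2, 9.1, 4.0, 6.9, 7.1, 10.2, 6.2, 7.0, 10.1, 12.0])

miss = 7                                    # pretend y[7] is missing
obs = np.ones(len(y), dtype=bool)
obs[miss] = False

# Design matrix: group indicators plus the covariate.
D = np.column_stack([(groups == g).astype(float) for g in range(3)] + [x])

# Fit on the observed cases; the SSE-minimizing imputation for the
# missing cell is its fitted value under this fit.
coef, *_ = np.linalg.lstsq(D[obs], y[obs], rcond=None)
y_hat_missing = D[miss] @ coef

# Refitting after imputation reproduces the observed-case estimates,
# since the imputed point contributes a zero residual.
y_filled = y.copy()
y_filled[miss] = y_hat_missing
coef_full, *_ = np.linalg.lstsq(D, y_filled, rcond=None)
```

This also shows why imputed-data standard errors are too small unless the lost degree of freedom is accounted for: the filled-in point adds nothing to the error sum of squares.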

18.
To obtain information about the contribution of individual and area level factors to population health, it is desirable to use both data collected on areas, such as censuses, and on individuals, e.g. survey and cohort data. Recently developed models allow us to carry out simultaneous regressions on related data at the individual and aggregate levels. These can reduce 'ecological bias' that is caused by confounding, model misspecification or lack of information and increase power compared with analysing the data sets singly. We use these methods in an application investigating individual and area level sociodemographic predictors of the risk of hospital admissions for heart and circulatory disease in London. We discuss the practical issues that are encountered in this kind of data synthesis and demonstrate that this modelling framework is sufficiently flexible to incorporate a wide range of sources of data and to answer substantive questions. Our analysis shows that the variations that are observed are mainly attributable to individual level factors rather than the contextual effect of deprivation.

19.
One of the fundamental issues in analyzing microarray data is to determine which genes are expressed and which ones are not for a given group of subjects. In datasets where many genes are expressed and many are not expressed (i.e., underexpressed), a bimodal distribution for the gene expression levels often results, where one mode of the distribution represents the expressed genes and the other mode represents the underexpressed genes. To model this bimodality, we propose a new class of mixture models that utilize a random threshold value for accommodating bimodality in the gene expression distribution. Theoretical properties of the proposed model are carefully examined. We use this new model to examine the problem of differential gene expression between two groups of subjects, develop prior distributions, and derive a new criterion for determining which genes are differentially expressed between the two groups. Prior elicitation is carried out using empirical Bayes methodology in order to estimate the threshold value as well as elicit the hyperparameters for the two component mixture model. The new gene selection criterion is demonstrated via several simulations to have excellent false positive rate and false negative rate properties. A gastric cancer dataset is used to motivate and illustrate the proposed methodology.
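A simplified stand-in for such a two-component mixture, fitted by plain EM rather than the paper's random-threshold and empirical Bayes machinery (simulated bimodal log-expression data with assumed component means):

```python
import numpy as np

rng = np.random.default_rng(3)
# Simulated log-expression: underexpressed genes near 2, expressed near 7.
data = np.concatenate([rng.normal(2.0, 0.7, 300), rng.normal(7.0, 1.0, 200)])

# EM for a two-component normal mixture.
pi = 0.5                       # weight of the "expressed" component
mu = np.array([1.0, 8.0])
sd = np.array([1.0, 1.0])
for _ in range(200):
    # E-step: posterior probability that each gene is expressed (comp 1).
    d0 = np.exp(-0.5 * ((data - mu[0]) / sd[0]) ** 2) / sd[0] * (1 - pi)
    d1 = np.exp(-0.5 * ((data - mu[1]) / sd[1]) ** 2) / sd[1] * pi
    r = d1 / (d0 + d1)
    # M-step: update weight, means, and standard deviations.
    pi = r.mean()
    mu = np.array([np.sum((1 - r) * data) / np.sum(1 - r),
                   np.sum(r * data) / np.sum(r)])
    sd = np.array([np.sqrt(np.sum((1 - r) * (data - mu[0]) ** 2) / np.sum(1 - r)),
                   np.sqrt(np.sum(r * (data - mu[1]) ** 2) / np.sum(r))])
```

The posterior probabilities `r` play the role of a gene-selection score; the paper's contribution is to make the boundary between the two modes itself random and to elicit the hyperparameters empirically.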

20.
Imputation methods for missing data on a time-dependent variable within time-dependent Cox models are investigated in a simulation study. Quality-of-life (QoL) assessments were removed from complete simulated datasets (in which QoL was positively related to disease-free survival (DFS), as was delayed chemotherapy) by missing at random and missing not at random (MNAR) mechanisms. Standard imputation methods were applied before analysis. Method performance was influenced by the missing data mechanism, with one exception for simple imputation. The greatest bias occurred under MNAR with large effect sizes. It is important to investigate the missing data mechanism carefully.
