Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Overcoming biases and misconceptions in ecological studies   Cited by: 2 (self-citations: 1, other citations: 1)
The aggregate data study design provides an alternative group level analysis to ecological studies in the estimation of individual level health risks. An aggregate model is derived by aggregating a plausible individual level relative rate model within groups, such that population-based disease rates are modelled as functions of individual level covariate data. We apply an aggregate data method to a series of fictitious examples from a review paper by Greenland and Robins which illustrated the problems that can arise when using the results of ecological studies to make inferences about individual health risks. We use simulated data based on their examples to demonstrate that the aggregate data approach can address many of the sources of bias that are inherent in typical ecological analyses, even though the limited between-region covariate variation in these examples reduces the efficiency of the aggregate study. The aggregate method has the potential to estimate exposure effects of interest in the presence of non-linearity, confounding at individual and group levels, effect modification, classical measurement error in the exposure and non-differential misclassification in the confounder.

2.
The ecological fallacy is related to Simpson's paradox (1951) where relationships among group means may be counterintuitive and substantially different from relationships within groups, where the groups are usually geographic entities such as census tracts. We consider the problem of estimating the correlation between two jointly normal random variables where only ecological data (group means) are available. Two empirical Bayes estimators and one fully Bayesian estimator are derived and compared with the usual ecological estimator, which is simply the Pearson correlation coefficient of the group sample means. We simulate the bias and mean squared error performance of these estimators, and also give an example employing a dataset where the individual level data are available for model checking. The results indicate superiority of the empirical Bayes estimators in a variety of practical situations where, though we lack individual level data, other relevant prior information is available.
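
The gap between the usual ecological estimator and the within-group relationship can be illustrated with a small simulation. The sketch below is not the empirical Bayes or fully Bayesian estimator from the abstract; it only computes the Pearson correlation of group sample means and contrasts it with the pooled within-group correlation. The data-generating process (30 groups, group centres that move together, a negative within-group slope) and the function names are assumptions made purely for illustration.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ecological_vs_within(n_groups=30, n_per_group=200, seed=0):
    """Return (ecological correlation, pooled within-group correlation)."""
    rng = random.Random(seed)
    mean_x, mean_y = [], []  # group means: all an ecological study observes
    dev_x, dev_y = [], []    # individual deviations from the group means
    for _ in range(n_groups):
        # Assumed process: group centres are positively related ...
        cx = rng.gauss(0.0, 3.0)
        cy = cx + rng.gauss(0.0, 0.5)
        gx, gy = [], []
        for _ in range(n_per_group):
            d = rng.gauss(0.0, 1.0)
            gx.append(cx + d)
            # ... while inside a group, y falls as x rises.
            gy.append(cy - 0.8 * d + rng.gauss(0.0, 0.6))
        mx, my = sum(gx) / n_per_group, sum(gy) / n_per_group
        mean_x.append(mx)
        mean_y.append(my)
        dev_x.extend(v - mx for v in gx)
        dev_y.extend(v - my for v in gy)
    return pearson(mean_x, mean_y), pearson(dev_x, dev_y)


eco, within = ecological_vs_within()
print(f"ecological r = {eco:.2f}, within-group r = {within:.2f}")
```

Under these assumed settings the two correlations have opposite signs, which is the kind of discrepancy the ecological fallacy describes.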

3.
The relationship between socioeconomic factors and health has been studied in many circumstances. Whether the association takes place at the individual level only, or also at the population level (contextual effect), is still unclear. We present a multilevel hierarchical Bayesian model to investigate the joint contribution of individual and population-based socioeconomic factors to mortality, using data from the census cohort of the general population of the city of Florence, Italy (Tuscany Longitudinal Study, 1991-1995). Evidence supporting a contextual effect of deprivation on mortality at a very fine level of aggregation is found. Inappropriate modelling of individual and aggregate variables could strongly bias effect estimates. Received: 10 January 2002; revised: 23 June 2003. The research on the Tuscany Longitudinal Study (Studio Longitudinale Toscano, SLTo) was supported by the Regione Toscana Servizio Statistica.

4.
A statistical framework for ecological and aggregate studies   Cited by: 6 (self-citations: 2, other citations: 4)
Inference from studies that make use of data at the level of the area, rather than at the level of the individual, is more difficult for a variety of reasons. Some of these difficulties arise because frequently exposures (including confounders) vary within areas. In the most basic form of ecological study the outcome measure is regressed against a simple area level summary of exposure. In the aggregate data approach a survey of exposures and confounders is taken within each area. An alternative approach is to assume a parametric form for the within-area exposure distribution. We provide a framework within which ecological and aggregate data studies may be viewed, and we review some approaches to inference in such studies, clarifying the assumptions on which they are based. General strategies for analysis are provided including an estimator based on Monte Carlo integration that allows inference in the case of a general risk–exposure model. We also consider the implications of the introduction of random effects, and the existence of confounding and errors in variables.
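
The idea of averaging an individual-level risk model over a within-area exposure distribution by Monte Carlo integration can be sketched as follows. This is a minimal illustration, not the estimator from the paper: the log-linear risk model, the normal within-area exposure distribution, the coefficient values and all function names are assumptions. A closed form exists for this particular combination, so it is used to check the simulation; the plug-in value at the area mean shows what a basic ecological summary would use instead.

```python
import math
import random


def risk(x, b0=-5.0, b1=0.5):
    """Assumed individual-level log-linear risk model r(x) = exp(b0 + b1*x)."""
    return math.exp(b0 + b1 * x)


def area_rate_mc(mu, sd, n_draws=100_000, seed=0):
    """Monte Carlo estimate of E[r(X)] for X ~ Normal(mu, sd) within one area."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        total += risk(rng.gauss(mu, sd))
    return total / n_draws


def area_rate_exact(mu, sd, b0=-5.0, b1=0.5):
    """Closed form for the same expectation (mean of a log-normal variable)."""
    return math.exp(b0 + b1 * mu + 0.5 * (b1 * sd) ** 2)


mu, sd = 2.0, 1.5  # hypothetical within-area exposure mean and spread
print("Monte Carlo   :", area_rate_mc(mu, sd))
print("closed form   :", area_rate_exact(mu, sd))
print("plug-in r(mu) :", risk(mu))  # area-mean shortcut, too low for a convex r
```

Monte Carlo integration earns its keep when the risk model has no closed-form expectation; in that case only `risk` needs to change.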

5.
In ecological studies, individual-level inference is based on results from ecological models. Interpreting the results requires caution, since findings from a group-level ecological analysis may not hold at the individual level within the groups, leading to the ecological fallacy. Using an ecological regression example for analyzing voting behavior, we highlight that the explicit use of individual-level models is crucial in understanding the results of ecological studies. In particular, we clarify three relevant statistical issues for individual-level models: assessment of the uncertainty of parameter estimates obtained from a wrong model, the use of shrinkage estimation for simultaneous estimation of many parameters, and the necessity of sensitivity analysis rather than adherence to one seemingly most compelling assumption.

6.
Summary.  Statistical methods of ecological analysis that attempt to reduce ecological bias are empirically evaluated to determine in which circumstances each method might be practicable. The method that is most successful at reducing ecological bias is stratified ecological regression. It allows individual level covariate information to be incorporated into a stratified ecological analysis, as well as the combination of disease and risk factor information from two separate data sources, e.g. outcomes from a cancer registry and risk factor information from the census sample of anonymized records data set. The aggregated individual level model compares favourably with this model but has convergence problems. In addition, it is shown that the large areas that are covered by local authority districts seem to reduce between-area variability and may therefore not be as informative as conducting a ward level analysis. This has policy implications because access to ward level data is restricted.

7.
This article seeks to measure deprivation among Portuguese households, taking into account four well-being dimensions (housing, durable goods, economic strain and social relationships) with survey data from the European Community Household Panel. We propose a multi-stage approach to a cross-sectional analysis, side-stepping the sparse nature of the contingency tables caused by the large number of variables considered and bringing together partial and overall analyses of deprivation that are based on Bayesian latent class models via Markov chain Monte Carlo methods. The outcomes demonstrate that there was a substantial improvement in overall household well-being between 1995 and 2001. The dimensions that contributed most to the risk of household deprivation were found to be economic strain and social relationships.

8.
Evolutionary ecology is the study of evolutionary processes, and the ecological conditions that influence them. A fundamental paradigm underlying the study of evolution is natural selection. Although there are a variety of operational definitions for natural selection in the literature, perhaps the most general one is that which characterizes selection as the process whereby heritable variation in fitness associated with variation in one or more phenotypic traits leads to intergenerational change in the frequency distribution of those traits. The past 20 years have witnessed a marked increase in the precision and reliability of our ability to estimate one or more components of fitness and characterize natural selection in wild populations, owing particularly to significant advances in methods for analysis of data from marked individuals. In this paper, we focus on several issues that we believe are important considerations for the application and development of these methods in the context of addressing questions in evolutionary ecology. First, our traditional approach to estimation often rests upon analysis of aggregates of individuals, which in the wild may reflect increasingly non-random (selected) samples with respect to the trait(s) of interest. In some cases, analysis at the aggregate level, rather than the individual level, may obscure important patterns. While there are a growing number of analytical tools available to estimate parameters at the individual level, and which can cope (to varying degrees) with progressive selection of the sample, the advent of new methods does not reduce the need to consider carefully the appropriate level of analysis in the first place. Estimation should be motivated a priori by strong theoretical analysis. 
Doing so provides clear guidance, in terms of both (i) assisting in the identification of realistic and meaningful models to include in the candidate model set, and (ii) providing the appropriate context under which the results are interpreted. Second, while it is true that selection (as defined) operates at the level of the individual, the selection gradient is often (if not generally) conditional on the abundance of the population. As such, it may be important to consider estimating transition rates conditional on both the parameter values of the other individuals in the population (or at least their distribution), and population abundance. This will undoubtedly pose a considerable challenge, for both single- and multi-strata applications. It will also require renewed consideration of the estimation of abundance, especially for open populations. Thirdly, selection typically operates on dynamic, individually varying traits. Such estimation may require characterizing fitness in terms of individual plasticity in one or more state variables, constituting analysis of the norms of reaction of individuals to variable environments. This can be quite complex, especially for traits that are under facultative control. Recent work has indicated that the pattern of selection on such traits is conditional on the relative rates of movement among and frequency of spatially heterogeneous habitats, suggesting analyses of evolution of life histories in open populations can be misleading in some cases.

10.
Parallel individual and ecological analyses of data on residential radon have been performed using information on cases of lung cancer and population controls from a recent study in south-west England. For the individual analysis the overall results indicated that the relative risk of lung cancer at 100 Bq m−3 compared with at 0 Bq m−3 was 1.12 (95% confidence interval (0.99, 1.27)) after adjusting for age, sex, smoking, county of residence and social class. In the ecological analysis substantial bias in the estimated effect of radon was present for one of the two counties involved unless an additional variable, urban–rural status, was included in the model, although this variable was not an important confounder in the individual level analysis. Most of the methods that have been recommended for overcoming the limitations of ecological studies would not in practice have proved useful in identifying this variable as an appreciable source of bias.
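
The quoted relative risk and confidence interval can be converted into a per-unit coefficient with a short calculation. The sketch below rests on two assumptions that the abstract does not state: that risk is log-linear in radon concentration, and that the interval is a Wald interval, symmetric on the log scale. If the study used a different dose-response form (for example a linear excess relative risk), these numbers would not apply. The names `beta`, `se` and `relative_risk` are illustrative only.

```python
import math

RR_100, CI_LOW, CI_HIGH = 1.12, 0.99, 1.27  # values quoted in the abstract
Z_95 = 1.959964                              # standard normal 97.5% quantile

# Log-scale coefficient per Bq m^-3 and its approximate standard error,
# both under the assumed log-linear model and Wald-type interval.
beta = math.log(RR_100) / 100.0
se = (math.log(CI_HIGH) - math.log(CI_LOW)) / (2.0 * Z_95) / 100.0


def relative_risk(concentration_bq_m3):
    """Relative risk versus 0 Bq m^-3 under the assumed log-linear model."""
    return math.exp(beta * concentration_bq_m3)


print(f"beta = {beta:.6f} per Bq m^-3, se = {se:.6f}, z = {beta / se:.2f}")
print(f"RR at 200 Bq m^-3 = {relative_risk(200):.3f}")
```

The ratio `beta / se` falls just short of 1.96, in line with a 95% interval whose lower limit of 0.99 sits slightly below 1.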

11.
Summary.  To obtain information about the contribution of individual and area level factors to population health, it is desirable to use both data collected on areas, such as censuses, and on individuals, e.g. survey and cohort data. Recently developed models allow us to carry out simultaneous regressions on related data at the individual and aggregate levels. These can reduce 'ecological bias' that is caused by confounding, model misspecification or lack of information and increase power compared with analysing the data sets singly. We use these methods in an application investigating individual and area level sociodemographic predictors of the risk of hospital admissions for heart and circulatory disease in London. We discuss the practical issues that are encountered in this kind of data synthesis and demonstrate that this modelling framework is sufficiently flexible to incorporate a wide range of sources of data and to answer substantive questions. Our analysis shows that the variations that are observed are mainly attributable to individual level factors rather than the contextual effect of deprivation.

12.
13.
Researchers familiar with spatial models are aware of the challenge of choosing the level of spatial aggregation. Few studies have been published on the investigation of temporal aggregation and its impact on inferences regarding disease outcome in space–time analyses. We perform a case study for modelling individual disease outcomes using several Bayesian hierarchical spatio‐temporal models, while taking into account the possible impact of spatial and temporal aggregation. Using longitudinal breast cancer data from South East Queensland, Australia, we consider both parametric and non‐parametric formulations for temporal effects at various levels of aggregation. Two temporal smoothness priors are considered separately; each is modelled with fixed effects for the covariates and an intrinsic conditional autoregressive prior for the spatial random effects. Our case study reveals that different model formulations produce considerably different model performances. For this particular dataset, a classical parametric formulation that assumes a linear time trend produces the best fit among the five models considered. Different aggregation levels of temporal random effects were found to have little impact on model goodness‐of‐fit and estimation of fixed effects.

14.
The potential of cycle helmets to reduce head injury remains controversial. Although several case‐control studies have been published, ecological analyses of head injury remain commonplace, presumably because of the availability of data and policy‐makers’ preference for ‘whole population’ studies. Given that such population‐level analysis will be conducted, this paper models the odds ratio between different road‐user groups over time. We use a Bayesian implementation of a vector generalized additive model in order to examine the odds ratio for head injury when comparing male cyclists with female cyclists, male pedestrians with male cyclists, and female pedestrians with female cyclists over a period when helmet‐wearing rates were thought to diverge by gender.

15.
Spatial variation in teenage conceptions in south and west England   Cited by: 1 (self-citations: 0, other citations: 1)
Multilevel Poisson models are used to identify factors influencing variation in census ward level teenage conception rates. Multilevel logistic models are also employed to examine the outcome of these conceptions. Demographic and socioeconomic characteristics are accounted for as well as access to family planning services. The paper emphasizes the importance of customized deprivation indices that are specific to the health outcome in urban and rural areas.

16.
"One can often gain insight into the aetiology of a disease by relating mortality rates in different areas to explanatory variables. Multiple regression techniques are usually employed, but unweighted least squares may be inappropriate if the areas vary in population size. Also, a fully weighted regression, with weights inversely proportional to binomial sampling variances, is usually too extreme. This paper proposes an intermediate solution via maximum likelihood which takes account of three sources of variation in death rates: sampling error, explanatory variables and unexplained differences between areas. The method is also adapted for logit (death rates), standardized mortality ratios (SMRs) and log (SMRs). Two [United Kingdom] examples are presented."

17.
The case for small area microdata   Cited by: 3 (self-citations: 2, other citations: 1)
Summary.  Census data are available in aggregate form for local areas and, through the samples of anonymized records (SARs), as samples of microdata for households and individuals. In 1991 there were two SAR files: a household file and an individual file. These have a high degree of detail on the census variables but little geographical detail, a situation that will be exacerbated for the 2001 SAR owing to the loss of district level geography on the individual SAR. The paper puts forward the case for an additional sample of microdata, also drawn from the census, that has much greater geographical detail. Small area microdata (SAM) are individual level records with local area identifiers and, to maintain confidentiality, reduced detail on the census variables. Population data from seven local authorities, including rural and urban areas, are used to define prototype samples of SAM. The rationale for SAM is given, with examples that demonstrate the role of local area information in the analysis of census data. Since there is a trade-off between the extent of local detail and the extent of detail on variables that can be made available, the confidentiality risk of SAM is assessed empirically. An indicative specification of the SAM is given, having taken into account the results of the confidentiality analysis.

18.
We propose a general Bayesian joint modeling approach to model mixed longitudinal outcomes from the exponential family for taking into account any differential misclassification that may exist among categorical outcomes. Under this framework, outcomes observed without measurement error are related to latent trait variables through generalized linear mixed effect models. The misclassified outcomes are related to the latent class variables, which represent unobserved real states, using mixed hidden Markov models (MHMMs). In addition to enabling the estimation of parameters in prevalence, transition and misclassification probabilities, MHMMs capture cluster level heterogeneity. A transition modeling structure allows the latent trait and latent class variables to depend on observed predictors at the same time period and also on latent trait and latent class variables at previous time periods for each individual. Simulation studies are conducted to make comparisons with traditional models in order to illustrate the gains from the proposed approach. The new approach is applied to data from the Southern California Children Health Study to jointly model questionnaire-based asthma state and multiple lung function measurements in order to gain better insight about the underlying biological mechanism that governs the inter-relationship between asthma state and lung function development.

19.
Recent analyses seeking to explain variation in area health outcomes often consider the impact on them of latent measures (i.e. unobserved constructs) of population health risk. The latter are typically obtained by forms of multivariate analysis, with a small set of latent constructs derived from a collection of observed indicators, and a few recent area studies take such constructs to be spatially structured rather than independent over areas. A confirmatory approach is often applicable to the model linking indicators to constructs, based on substantive knowledge of relevant risks for particular diseases or outcomes. In this paper, population constructs relevant to a particular set of health outcomes are derived using an integrated model containing all the manifest variables, namely health outcome variables, as well as indicator variables underlying the latent constructs. A further feature of the approach is the use of variable selection techniques to select significant loadings and factors (especially in terms of effects of constructs on health outcomes), so ensuring parsimonious models are selected. A case study considers suicide mortality and self-harm contrasts in the East of England in relation to three latent constructs: deprivation, fragmentation and urbanicity.

20.
Collecting individual patient data has been described as the 'gold standard' for undertaking meta-analysis. If studies involve time-to-event outcomes, conducting a meta-analysis based on aggregate data can be problematical. Two meta-analyses of randomized controlled trials with time-to-event outcomes are used to illustrate the practicality and value of several proposed methods to obtain summary statistic estimates. In the first example the results suggest that further effort should be made to find unpublished trials. In the second example the use of aggregate data for trials where no individual patient data have been supplied allows the totality of evidence to be assessed and indicates previously unrecognized heterogeneity.
