Asymptotic variance plays an important role in inference based on interval estimates of attributable risk. This paper compares the asymptotic variances of the attributable risk estimate obtained from the delta method and from the Fisher information matrix for a 2×2 case–control study, a design chosen for its practical relevance. The expressions for these two asymptotic variance estimates are shown to be equivalent. Because the asymptotic variance usually underestimates the standard error, the bootstrap standard error is also used to construct interval estimates of attributable risk, and these are compared with the intervals based on the asymptotic estimates. A simulation study shows that the bootstrap interval estimate performs well in terms of coverage probability and confidence interval length. An exact test procedure for testing independence between the risk factor and the disease outcome using attributable risk is proposed, and its use is justified with real-life examples in small-sample situations where inference using the asymptotic variance may not be valid.
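
As a rough illustration of the two kinds of interval discussed in this abstract, the sketch below uses the commonly cited case–control estimator AR = 1 - P(unexposed | case) / P(unexposed | control), which relies on a rare-disease assumption, together with a delta-method variance on the log(1 - AR) scale and a percentile bootstrap. The abstract does not give the paper's own expressions, so the estimator, the function names and the table layout (a, c = exposed and unexposed cases; b, d = exposed and unexposed controls) are assumptions made for this example.

```python
import math
import random


def attributable_risk(a, b, c, d):
    """AR from a 2x2 case-control table (rare-disease assumption).

    a, c: exposed / unexposed cases; b, d: exposed / unexposed controls.
    Requires d > 0.
    """
    return 1.0 - (c / (a + c)) / (d / (b + d))


def delta_interval(a, b, c, d, z=1.96):
    """Interval from the delta method applied to log(1 - AR); needs c, d > 0."""
    log_comp = math.log(1.0 - attributable_risk(a, b, c, d))
    # Var(log p_hat) for a binomial proportion is (1 - p) / (n p), applied to
    # the unexposed proportions among cases and among controls.
    se = math.sqrt(a / (c * (a + c)) + b / (d * (b + d)))
    # Back-transform: the upper limit of log(1 - AR) gives the lower limit of AR.
    return 1.0 - math.exp(log_comp + z * se), 1.0 - math.exp(log_comp - z * se)


def bootstrap_interval(a, b, c, d, reps=2000, seed=0):
    """Percentile bootstrap, resampling cases and controls separately."""
    rng = random.Random(seed)
    n1, n0 = a + c, b + d
    draws = []
    for _ in range(reps):
        a_star = sum(rng.random() < a / n1 for _ in range(n1))
        b_star = sum(rng.random() < b / n0 for _ in range(n0))
        if b_star == n0:  # no unexposed controls: AR undefined, skip the draw
            continue
        draws.append(attributable_risk(a_star, b_star, n1 - a_star, n0 - b_star))
    draws.sort()
    return draws[int(0.025 * len(draws))], draws[int(0.975 * len(draws)) - 1]
```

Calling `delta_interval(60, 30, 40, 70)` and `bootstrap_interval(60, 30, 40, 70)` on the same hypothetical table gives two intervals around the point estimate 3/7 that can be compared directly; in small samples the two may differ noticeably.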

Lifetime Data Analysis - Assuming Cox’s regression model, we consider a penalized full likelihood approach to conduct variable selection under nested case–control (NCC) sampling....

Two-phase case–control studies address the problem of confounding by obtaining the required additional information for a subset (phase 2) of all individuals (phase 1). Nowadays, studies with rich phase 1 data are available in which only a few unmeasured confounders need to be obtained in phase 2. The extended conditional maximum likelihood (ECML) approach in two-phase logistic regression is a novel method for analysing such data. Alternatively, two-phase case–control studies can be analysed by multiple imputation (MI), where phase 2 information for individuals included only in phase 1 is treated as missing. We conducted a simulation of two-phase studies in which we compared the performance of ECML and MI in typical scenarios with rich phase 1 data. Regarding the exposure effect, MI was less biased and more precise than ECML. Furthermore, ECML was sensitive to misspecification of the participation model. We therefore recommend MI for analysing two-phase case–control studies in situations with rich phase 1 data.

We propose a semiparametric approach for the analysis of case–control genome-wide association studies. Parametric components are used to model both the conditional distribution of the case status given the covariates and the distribution of genotype counts, whereas the distribution of the covariates is modelled nonparametrically. This yields a direct and joint model of the case status, covariates and genotype counts, gives a better understanding of the disease mechanism and results in more reliable conclusions. Side information, such as the disease prevalence, can be conveniently incorporated into the model by an empirical likelihood approach, which leads to more efficient estimates and a powerful test for detecting disease-associated SNPs. Profiling is used to eliminate a nuisance nonparametric component, and the resulting profile empirical likelihood estimates are shown to be consistent and asymptotically normal. For the hypothesis test on disease association, we apply the approximate Bayes factor (ABF), which is computationally simple and therefore well suited to genome-wide association studies, where hundreds of thousands to a million genetic markers are tested. We treat the approximate Bayes factor as a hybrid Bayes factor that replaces the full data by the maximum likelihood estimates of the parameters of interest in the full model, and we derive it under a general setting. The deviation from Hardy–Weinberg Equilibrium (HWE) is also taken into account, and the ABF for HWE using cases is shown to provide evidence of association between a disease and a genetic marker. Simulation studies and an application are further provided to illustrate the utility of the proposed methodology.

Consider a population of individuals who are free of a disease under study and who are exposed simultaneously, at random exposure levels X,Y,Z,…, to several risk factors suspected of causing the disease in the population. At any specified levels X=x, Y=y, Z=z, …, the incidence rate of the disease in the population at risk is given by the exposure–response relationship r(x,y,z,…) = P(disease|x,y,z,…). The present paper examines the relationship between the joint distribution of the exposure variables X,Y,Z,… in the population at risk and the joint distribution of the exposure variables U,V,W,… among cases under the linear and the exponential risk models. It is proven that under the exponential risk model, these two joint distributions belong to the same family of multivariate probability distributions, possibly with different parameter values. For example, if the exposure variables in the population at risk jointly have a multivariate normal distribution, so do the exposure variables among cases; if the former variables jointly have a multinomial distribution, so do the latter. More generally, it is demonstrated that if the joint distribution of the exposure variables in the population at risk belongs to the exponential family of multivariate probability distributions, so does the joint distribution of the exposure variables among cases. If the epidemiologist can specify the difference between the mean exposure levels in the case and control groups that is considered clinically or etiologically important in the study, the results of the present paper may be used to make sample size determinations for the case–control study, corresponding to specified protection levels, i.e., the size α and power 1–β of a statistical test. The multivariate normal, the multinomial, the negative multinomial and Fisher's multivariate logarithmic series exposure distributions are used to illustrate our results.
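
The closure property described in this abstract can be checked numerically in the simplest univariate case. The sketch below assumes that the exponential risk model means r(x) proportional to exp(βx), which the abstract does not spell out, so that the exposure density among cases is proportional to f(x)·exp(βx). For a normal exposure N(μ, σ²) this tilted density is again normal, with mean μ + βσ² and unchanged variance; the function name and the grid settings are illustrative choices only.

```python
import math


def case_mean_and_variance(mu, sigma, beta, half_width=12.0, steps=20001):
    """Mean and variance of exposure among cases, by numerical integration.

    The case density is taken proportional to f(x) * exp(beta * x), where f is
    the N(mu, sigma^2) density in the population at risk (assumed risk model).
    """
    lo = mu - half_width * sigma
    h = 2.0 * half_width * sigma / (steps - 1)
    xs = [lo + i * h for i in range(steps)]
    # Unnormalised case density on the grid; using beta * (x - mu) instead of
    # beta * x only rescales the weights and avoids overflow.
    w = [math.exp(-0.5 * ((x - mu) / sigma) ** 2 + beta * (x - mu)) for x in xs]
    total = sum(w)
    mean = sum(x * wi for x, wi in zip(xs, w)) / total
    var = sum((x - mean) ** 2 * wi for x, wi in zip(xs, w)) / total
    return mean, var
```

With μ = 1, σ = 2 and β = 0.5, the numerical mean and variance should agree closely with the closed-form values μ + βσ² = 3 and σ² = 4. A shift in mean of this kind between the population at risk and the cases is the sort of quantity that would enter a sample size calculation for a case–control study.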

In this paper, the destructive negative binomial (DNB) cure rate model with a latent activation scheme [V. Cancho, D. Bandyopadhyay, F. Louzada, and B. Yiqi, The DNB cure rate model with a latent activation scheme, Statistical Methodology 13 (2013b), pp. 48–68] is extended to the case where the observations are grouped into clusters. Parameter estimation is performed based on the restricted maximum likelihood approach and on a Bayesian approach based on Dirichlet process priors. An application to a real data set related to a sealant study in a dentistry experiment is considered to illustrate the performance of the proposed model.

In this paper a test for model selection is proposed which extends the usual goodness-of-fit test in several ways. First, it is assumed that the underlying distribution H depends on a covariate value in a fixed design setting. Secondly, instead of one parametric class we consider two competing classes, one of which may contain the underlying distribution. The test allows one to select whichever of two equally treated model classes fits the underlying distribution better. Various measures are available to define the distance between distributions; here the Cramér–von Mises distance has been chosen. The null hypothesis that both parametric classes have the same distance to the underlying distribution H can be checked by means of a test statistic, the asymptotic properties of which are shown under a set of suitable conditions. The performance of the test is demonstrated by Monte Carlo simulations. Finally, the procedure is applied to a data set from an endurance test on electric motors.

The case–control design for assessing the accuracy of a binary diagnostic test (BDT) is very common in clinical practice. This design consists of applying the diagnostic test to all of the individuals in a sample of those who have the disease and in another sample of those who do not have the disease. The sensitivity of the diagnostic test is estimated from the case sample and the specificity is estimated from the control sample. Another parameter used to assess the performance of a BDT is the weighted kappa coefficient, which depends on the sensitivity and specificity of the diagnostic test, on the disease prevalence and on the weighting index. In this article, confidence intervals are studied for the weighted kappa coefficient under a case–control design, and a method is proposed to calculate the sample sizes needed to estimate this parameter. The results obtained were applied to a real example.
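
To make the dependence on sensitivity, specificity, prevalence and weighting index concrete, the sketch below computes the weighted kappa coefficient in the form commonly given in the diagnostic-test literature, κ(c) = p·q·Y / (p·(1 − Q)·c + q·Q·(1 − c)), where p is the prevalence, q = 1 − p, Y = Se + Sp − 1 is the Youden index and Q = p·Se + q·(1 − Sp) is the probability of a positive test. The abstract does not state the formula, so this expression and the function name are assumptions for illustration. Because a case–control sample does not identify the prevalence, it is passed in as an externally supplied value.

```python
def weighted_kappa(se, sp, prevalence, c):
    """Weighted kappa coefficient of a binary diagnostic test.

    se, sp: sensitivity and specificity; prevalence: disease prevalence,
    supplied from outside the case-control sample; c: weighting index in [0, 1].
    """
    if not 0.0 <= c <= 1.0:
        raise ValueError("weighting index c must lie in [0, 1]")
    p, q = prevalence, 1.0 - prevalence
    youden = se + sp - 1.0
    positive = p * se + q * (1.0 - sp)  # probability of a positive test result
    return p * q * youden / (p * (1.0 - positive) * c + q * positive * (1.0 - c))
```

Under this form, c = 1 reduces the coefficient to (Se − Q)/(1 − Q) and c = 0 to (Sp − (1 − Q))/Q, so the weighting index moves the measure between those two extremes.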

In this paper, a new survival cure rate model is introduced considering the Yule–Simon distribution [H.A. Simon, On a class of skew distribution functions, Biometrika 42 (1955), pp. 425–440] to model the number of concurrent causes. We study some properties of this distribution and the model arising when the distribution of the competing causes is the Weibull model. We call this distribution the Weibull–Yule–Simon distribution. Maximum likelihood estimation is conducted for the model parameters. A small-scale simulation study indicates satisfactory parameter recovery by the estimation approach. Results are applied to a real data set (melanoma), illustrating that the proposed model can outperform traditional alternative models in terms of model fit.

This paper deals with estimating a green tree frog population in an urban setting using a repeated capture–mark–recapture (CMR) method over several weeks with an individual tagging system, which gives rise to a complicated generalization of the hypergeometric distribution. Based on maximum likelihood estimation, a parametric bootstrap approach is adopted to obtain interval estimates of the weekly population size, which is the main objective of our work. The method is computation-based, and implementing the resampling algorithm is programming-intensive. This method can be applied to estimate the population size of any species based on a repeated CMR method at multiple time points. Further, it is pointed out that the well-known Jolly–Seber method, which is based on some strong assumptions, either produces unrealistic estimates or rests on assumptions that are not valid for our observed data set.
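
The general idea of a parametric bootstrap around a hypergeometric maximum likelihood estimate can be sketched in the much simpler two-sample setting. This is not the paper's generalized model for repeated weekly captures: the example substitutes the classical Lincoln–Petersen setup, in which n1 animals are marked, n2 are caught later and m of those are found to be marked. All names and the example counts are hypothetical.

```python
import random


def lincoln_petersen(n1, n2, m):
    """An integer MLE of N under the hypergeometric model: floor(n1 * n2 / m)."""
    return (n1 * n2) // m


def parametric_bootstrap_interval(n1, n2, m, reps=2000, seed=0):
    """Point estimate and 95% percentile interval for the population size."""
    rng = random.Random(seed)
    n_hat = lincoln_petersen(n1, n2, m)
    estimates = []
    for _ in range(reps):
        # Simulate from the fitted model: animals 0..n1-1 are the marked ones,
        # and the second sample is drawn without replacement from n_hat animals.
        second = rng.sample(range(n_hat), n2)
        m_star = sum(1 for animal in second if animal < n1)
        if m_star == 0:  # no recaptures: the estimate is undefined, skip
            continue
        estimates.append(lincoln_petersen(n1, n2, m_star))
    estimates.sort()
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[int(0.975 * len(estimates)) - 1]
    return n_hat, lower, upper
```

For instance, `parametric_bootstrap_interval(50, 40, 10)` returns the point estimate 200 together with a percentile interval. The interval is typically asymmetric, because a small number of recaptures inflates the estimate much more than a large number deflates it.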

