20 similar documents retrieved.
1.
Mei-Ling Ting Lee, G.A. Whitmore, Francine Laden, Jaime E. Hart, Eric Garshick 《Journal of Statistical Planning and Inference》2009
A case–control study of lung cancer mortality in U.S. railroad workers in jobs with and without diesel exhaust exposure is reanalyzed using a new threshold regression methodology. The study included 1256 workers who died of lung cancer and 2385 controls who died primarily of circulatory system diseases. Diesel exhaust exposure was assessed using railroad job history from the US Railroad Retirement Board and an industrial hygiene survey. Smoking habits were available from next-of-kin and potential asbestos exposure was assessed by job history review. The new analysis reassesses lung cancer mortality and examines circulatory system disease mortality. Jobs with regular exposure to diesel exhaust had a survival pattern characterized by an initial delay in mortality, followed by a rapid deterioration of health prior to death. The pattern is seen in subjects dying of lung cancer, circulatory system diseases, and other causes. The unique pattern is illustrated using a new type of Kaplan–Meier survival plot in which the time scale represents a measure of disease progression rather than calendar time. The disease progression scale accounts for a healthy-worker effect when describing the effects of cumulative exposures on mortality.
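The key display in this entry is a Kaplan–Meier plot drawn against a disease-progression measure rather than calendar time. As a reminder, the product-limit estimator itself is unchanged; only the axis is reinterpreted (a sketch under the assumption that the plot is built from an ordered progression score s):

\hat{S}(s) = \prod_{s_{(i)} \le s} \left(1 - \frac{d_i}{n_i}\right),

where s_{(i)} are the ordered progression scores at which deaths occur, d_i is the number of deaths at s_{(i)}, and n_i is the number still at risk just before s_{(i)}.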
2.
3.
William B. Smith 《Communications in Statistics - Theory and Methods》2013,42(1):237-241
Exact probabilities (under a Markov-like assumption) are calculated for overlapping 2×2 contingency tables. These contingency tables often arise in biological and legal situations that yield dichotomous responses, including the evaluation of clinical trials and the determination of prima facie evidence of employer discrimination. In this note, probabilities, means and variances are derived and comparisons are made with results when assuming no overlap exists.
4.
B. Raja Rao 《Communications in Statistics - Theory and Methods》2013,42(10):3035-3065
Consider a population of individuals who are free of a disease under study, and who are exposed simultaneously, at random exposure levels, say X, Y, Z, …, to several risk factors suspected to cause the disease in the population. At any specified levels X=x, Y=y, Z=z, …, the incidence rate of the disease in the population at risk is given by the exposure–response relationship r(x,y,z,…) = P(disease|x,y,z,…). The present paper examines the relationship between the joint distribution of the exposure variables X, Y, Z, … in the population at risk and the joint distribution of the exposure variables U, V, W, … among cases under the linear and the exponential risk models. It is proven that under the exponential risk model, these two joint distributions belong to the same family of multivariate probability distributions, possibly with different parameter values. For example, if the exposure variables in the population at risk jointly have a multivariate normal distribution, so do the exposure variables among cases; if the former variables jointly have a multinomial distribution, so do the latter. More generally, it is demonstrated that if the joint distribution of the exposure variables in the population at risk belongs to the exponential family of multivariate probability distributions, so does the joint distribution of the exposure variables among cases. If the epidemiologist can specify the difference between the mean exposure levels in the case and control groups that is considered to be clinically or etiologically important in the study, the results of the present paper may be used to make sample size determinations for the case–control study corresponding to specified protection levels, i.e., the size α and power 1 − β of a statistical test. The multivariate normal, the multinomial, the negative multinomial and Fisher's multivariate logarithmic series exposure distributions are used to illustrate the results.
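A short calculation makes the closure property in this entry transparent (a sketch of the standard exponential-tilting argument, not the paper's full multivariate treatment). By Bayes' theorem, the exposure density among cases is

f_{\text{case}}(x) = \frac{r(x)\, f(x)}{\int r(u)\, f(u)\, du},

and under the exponential risk model r(x) = \exp(a + b^{\top}x) this is proportional to \exp(b^{\top}x)\, f(x). If f belongs to an exponential family with x among its sufficient statistics, the tilt only shifts the natural parameter, so the case distribution stays in the same family; for instance, X \sim N(\mu, \Sigma) in the population at risk gives X \sim N(\mu + \Sigma b, \Sigma) among cases.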
5.
This paper studies the construction of a Bayesian confidence interval for the risk ratio (RR) in a 2 × 2 table with a structural zero. Under a Dirichlet prior distribution, the exact posterior distribution of the RR is derived, and a tail-based interval is suggested for constructing the Bayesian confidence interval. The frequentist performance of this confidence interval is investigated by simulation and compared with the score-based interval in terms of the mean coverage probability and mean expected width of the interval. An advantage of the Bayesian confidence interval is that it is well defined for all data structures and has a shorter expected width. Our simulation shows that the Bayesian tail-based interval under Jeffreys' prior performs as well as or better than the score-based confidence interval.
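In practice, a tail-based interval of the kind described above can be approximated by posterior simulation. The sketch below assumes three observable cells (the fourth being the structural zero), a Dirichlet prior with a common concentration parameter, and a purely illustrative risk-ratio functional rr(); the paper's exact RR definition and prior should be substituted.

```python
import numpy as np

def tail_based_ci(counts, rr, alpha=0.05, prior=0.5, n_draws=100_000, seed=0):
    """Equal-tailed Bayesian interval for a functional rr(p) of the cell
    probabilities of a 2 x 2 table with a structural zero (three free cells),
    under a Dirichlet prior with common concentration `prior`."""
    rng = np.random.default_rng(seed)
    # Conjugacy: the posterior is Dirichlet(counts + prior).
    draws = rng.dirichlet(np.asarray(counts, dtype=float) + prior, size=n_draws)
    rr_draws = np.apply_along_axis(rr, 1, draws)
    return np.quantile(rr_draws, [alpha / 2, 1 - alpha / 2])

# Illustrative (hypothetical) risk-ratio functional; replace with the paper's definition.
def rr(p):
    p11, p12, p21 = p
    return (p11 / (p11 + p12)) / (p11 + p21)

print(tail_based_ci([20, 15, 10], rr))
```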
6.
Sangwook Kang 《Journal of Statistical Computation and Simulation》2017,87(4):652-663
A nested case–control (NCC) study is an efficient cohort-sampling design in which a subset of controls is sampled from the risk set at each event time. Since covariate measurements are taken only for the sampled subjects, the time and effort of conducting a full-scale cohort study can be saved. In this paper, we consider fitting a semiparametric accelerated failure time model to failure time data from an NCC study. We propose an efficient induced smoothing procedure for the rank-based estimating method used for regression parameter estimation. For variance estimation, we propose an efficient resampling method that utilizes the robust sandwich form. We extend the proposed methods to a generalized NCC study that allows sampling of cases. Finite-sample properties of the proposed estimators are investigated via an extensive simulation study. An application to a tumor study illustrates the utility of the proposed method in routine data analysis.
7.
8.
J. A. Roldán-Nofuentes, R. M. Amro 《Journal of Statistical Computation and Simulation》2017,87(3):530-545
The case–control design for assessing the accuracy of a binary diagnostic test (BDT) is very frequent in clinical practice. This design consists of applying the diagnostic test to all of the individuals in a sample of those who have the disease and in another sample of those who do not have the disease. The sensitivity of the diagnostic test is estimated from the case sample and the specificity from the control sample. Another parameter used to assess the performance of a BDT is the weighted kappa coefficient. The weighted kappa coefficient depends on the sensitivity and specificity of the diagnostic test, on the disease prevalence and on the weighting index. In this article, confidence intervals are studied for the weighted kappa coefficient subject to a case–control design, and a method is proposed to calculate the sample sizes needed to estimate this parameter. The results obtained were applied to a real example.
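For context, one common parameterization of the weighted kappa coefficient of a BDT (in the spirit of Kraemer's quality indices; whether the article uses exactly this form is an assumption) writes it in terms of sensitivity Se, specificity Sp, prevalence p (with q = 1 − p), the probability of a positive test Q = p·Se + q(1 − Sp), and the weighting index c ∈ [0, 1]:

\kappa(c) = \frac{p\,q\,(Se + Sp - 1)}{c\,p\,(1-Q) + (1-c)\,q\,Q},

which makes the dependence on the prevalence and the weighting index explicit.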
9.
Jörg Drechsler 《AStA Advances in Statistical Analysis》2011,95(1):1-26
Multiple imputation is widely accepted as the method of choice to address item nonresponse in surveys. Nowadays most statistical software packages include features to multiply impute missing values in a dataset. Nevertheless, application to real data poses many implementation problems. Defining useful imputation models for a dataset that consists of categorical and possibly skewed continuous variables and that contains skip patterns and all sorts of logical constraints is a challenging task. Besides, in most applications little attention is paid to the evaluation of the underlying assumptions behind the imputation models.
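As a concrete illustration of the workflow described above, the sketch below generates m completed datasets with chained-equations-style imputation in scikit-learn and pools a simple mean estimate with Rubin's rules. It is a minimal sketch with hypothetical variables, not the survey-specific imputation models the article has in mind.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

def multiply_impute(df, m=5):
    """Return m completed copies of df (numeric columns only)."""
    completed = []
    for i in range(m):
        imputer = IterativeImputer(sample_posterior=True, random_state=i)
        completed.append(pd.DataFrame(imputer.fit_transform(df), columns=df.columns))
    return completed

def pool_mean(completed, col):
    """Rubin's rules for the mean of one column: point estimate and total SE."""
    ests = np.array([d[col].mean() for d in completed])
    within = np.array([d[col].var(ddof=1) / len(d) for d in completed])
    qbar, ubar, b = ests.mean(), within.mean(), ests.var(ddof=1)
    total_var = ubar + (1 + 1 / len(completed)) * b
    return qbar, np.sqrt(total_var)

# Hypothetical data: a skewed income variable with 20% missingness and an age covariate.
rng = np.random.default_rng(1)
df = pd.DataFrame({"income": rng.lognormal(10, 1, 200), "age": rng.normal(45, 12, 200)})
df.loc[rng.random(200) < 0.2, "income"] = np.nan
print(pool_mean(multiply_impute(df), "income"))
```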
10.
We propose a semiparametric approach for the analysis of case–control genome-wide association studies. Parametric components are used to model both the conditional distribution of the case status given the covariates and the distribution of the genotype counts, whereas the distribution of the covariates is modelled nonparametrically. This yields a direct and joint modelling of the case status, covariates and genotype counts, gives a better understanding of the disease mechanism and results in more reliable conclusions. Side information, such as the disease prevalence, can be conveniently incorporated into the model by an empirical likelihood approach and leads to more efficient estimates and a powerful test in the detection of disease-associated SNPs. Profiling is used to eliminate a nuisance nonparametric component, and the resulting profile empirical likelihood estimates are shown to be consistent and asymptotically normal. For the hypothesis test on disease association, we apply the approximate Bayes factor (ABF), which is computationally simple and most desirable in genome-wide association studies where hundreds of thousands to a million genetic markers are tested. We treat the approximate Bayes factor as a hybrid Bayes factor, which replaces the full data by the maximum likelihood estimates of the parameters of interest in the full model, and derive it under a general setting. The deviation from Hardy–Weinberg equilibrium (HWE) is also taken into account, and the ABF for HWE using cases is shown to provide evidence of association between a disease and a genetic marker. Simulation studies and an application are further provided to illustrate the utility of the proposed methodology.
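The computational appeal of the approximate Bayes factor comes from its closed form under a normal approximation. A commonly used version (in the spirit of Wakefield's ABF; whether the article's hybrid Bayes factor takes exactly this form is an assumption) needs only the estimated log odds ratio for a marker, its variance V, and a normal prior N(0, W) on the log odds ratio:

\mathrm{ABF}_{01} = \sqrt{\frac{V+W}{V}}\;\exp\!\left(-\frac{z^{2}W}{2(V+W)}\right), \qquad z = \hat\beta/\sqrt{V},

the Bayes factor in favour of no association, so small values flag potentially disease-associated SNPs. Its per-marker cost is a handful of arithmetic operations, which is what makes it attractive when hundreds of thousands of markers are scanned.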
11.
Parametric and semiparametric mixture models have been widely used in applications from many areas, and it is often of interest to test homogeneity in these models. However, hypothesis testing is non-standard because several regularity conditions do not hold under the null hypothesis. We consider a semiparametric mixture case–control model, in the sense that the density ratio of two distributions is assumed to be of an exponential form, while the baseline density is unspecified. This model was first considered by Qin and Liang (2011, Biometrics), who proposed a modified score statistic for testing homogeneity. In this article, we consider alternative testing procedures based on supremum statistics, which could improve power against certain types of alternatives. We demonstrate the connection and comparison among the proposed and existing approaches. In addition, we provide a unified theoretical justification of the supremum test and other existing test statistics from an empirical likelihood perspective. The finite-sample performance of the supremum test statistics is evaluated in simulation studies.
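For orientation, the model described above can be written as follows (a sketch consistent with the exponential density-ratio assumption; the exact notation in Qin and Liang may differ). Controls have an unspecified baseline density f(x), a susceptible subgroup has the tilted density exp(α + βx) f(x), and cases follow the mixture

h(x) = \lambda \exp(\alpha + \beta x)\, f(x) + (1-\lambda)\, f(x), \qquad 0 \le \lambda \le 1,

so homogeneity corresponds to λ = 0 or β = 0. The null therefore places a parameter on the boundary and leaves a nuisance parameter unidentified, which is why standard likelihood-ratio asymptotics fail and supremum-type statistics are natural.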
12.
The Birnbaum–Saunders distribution is a positively skewed distribution that is frequently used for analyzing lifetime data. Regression analysis is widely used in this context when some covariates are involved in the life-test. In this article, we discuss the maximum likelihood estimation of the model parameters and the associated inference. We discuss likelihood-ratio tests for some hypotheses of interest as well as some interval estimation methods. A Monte Carlo simulation study is then carried out to examine the performance of the proposed estimators and the interval estimation methods. Finally, some numerical data analyses are presented to illustrate all the inferential methods developed here.
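For reference, the two-parameter Birnbaum–Saunders distribution with shape α > 0 and scale β > 0 that underlies this regression setting has distribution function

F(t; \alpha, \beta) = \Phi\!\left(\frac{1}{\alpha}\left(\sqrt{t/\beta} - \sqrt{\beta/t}\,\right)\right), \qquad t > 0,

where Φ is the standard normal distribution function; regression versions of the model typically relate the logarithm of the scale (equivalently, the log-lifetime) to the covariates.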
13.
The Birnbaum–Saunders distribution is widely used in reliability applications to model failure times. For several samples from possibly different Birnbaum–Saunders distributions, if their means can be considered the same, it is of importance to make inference about the common mean. This paper presents procedures for interval estimation and hypothesis testing for the common mean of several Birnbaum–Saunders populations. The proposed approaches are hybrids between the generalized inference method and large-sample theory. Simulation results are presented to assess the performance of the proposed approaches, and they indicate that the approaches perform well. Finally, the proposed approaches are applied to a real example on the fatigue life of 6061-T6 aluminum coupons for illustration.
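The common-mean problem is well defined here because the Birnbaum–Saunders mean has a simple closed form in the two parameters,

E(T) = \beta\left(1 + \alpha^{2}/2\right),

so equality of the means across populations is a nonlinear constraint on the pairs (α_i, β_i); the generalized-inference/large-sample hybrid procedures target exactly this functional.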
14.
In this article, we discuss the estimation of model parameters of the Type II bivariate Pólya–Aeppli distribution using the method of moments and the maximum likelihood method. We also compare some interval estimation methods. We then carry out a Monte Carlo simulation study to evaluate the performance of the proposed point and interval estimation methods. Finally, we present an example to illustrate all the inferential methods developed here.
15.
When analyzing 2 × 2 contingency tables, the log odds ratio for measuring the strength of association is often approximated by a normal distribution with some variance. We show that the expression for that variance needs to be modified in the presence of correlation between the two binomial distributions of the contingency table. In the present paper, we derive a correlation-adjusted variance of the limiting normal distribution of the log odds ratio. We also propose a correlation-adjusted test based on the standard odds ratio for analyzing matched-pair studies and any other study settings that induce correlated binary outcomes. We demonstrate that our proposed test outperforms the classical McNemar's test. Simulation studies show that the gains in power are especially manifest when the sample size is small and strong correlation is present. Two examples with real data sets are used to demonstrate that the proposed method may lead to conclusions significantly different from those reached using McNemar's test.
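For orientation, the large-sample approximation being modified here is the usual one for a 2 × 2 table with cell counts a, b, c, d and independent rows,

\operatorname{Var}\!\left(\log \widehat{OR}\right) \approx \frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d},

and the article's contribution is an additional term that corrects this expression when the two binomial samples are correlated, as in matched-pair data (the exact adjusted form is derived in the paper).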
16.
Tie-Hua Ng 《Communications in Statistics - Theory and Methods》2013,42(2):435-450
Preliminary tests of significance on crucial assumptions are often carried out before drawing the inferences of primary interest. In a factorial trial, the data may be pooled across the columns or rows to make inferences concerning the efficacy of the drugs (the simple effect) in the absence of interaction. Pooling the data has the advantage of higher power due to the larger sample size. On the other hand, in the presence of interaction, such pooling may seriously inflate the type I error rate in testing for the simple effect. A preliminary test for interaction is therefore in order. If this preliminary test is not significant at some prespecified level of significance, the data are pooled for testing the efficacy of the drugs at a specified α level. Otherwise, use of the corresponding cell means for testing the efficacy of the drugs at the specified α is recommended. This paper demonstrates that this adaptive procedure may seriously inflate the overall type I error rate. Such inflation happens even in the absence of interaction. One interesting result is that the type I error rate of the adaptive procedure depends on the interaction and the square root of the sample size only through their product. One consequence of this result is as follows: no matter how small the non-zero interaction might be, the inflation of the type I error rate of the always-pool procedure will eventually become unacceptable as the sample size increases. Therefore, in a very large study, even if the interaction is suspected to be very small but non-zero, the always-pool procedure may seriously inflate the type I error rate in testing for the simple effects. It is concluded that the 2 × 2 factorial design is not an efficient design for detecting simple effects unless the interaction is negligible.
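The inflation described above is easy to reproduce by simulation. The sketch below uses normal responses with unit variance in a 2 × 2 factorial, no true simple effect of drug A at the reference level of the other factor, and a non-zero interaction; the test levels, sample sizes and the exact form of the pooled z-test are illustrative assumptions rather than the article's specification.

```python
import numpy as np
from scipy.stats import norm

def adaptive_type1(gamma, n, alpha=0.05, alpha_pre=0.05, n_sim=20_000, seed=0):
    """Monte Carlo type I error of the 'test interaction, then pool or not'
    procedure for the simple effect of drug A at B = 0, which is truly null."""
    rng = np.random.default_rng(seed)
    # Cell means: no simple effect of A at B = 0; the interaction gamma appears at B = 1.
    mu = {(0, 0): 0.0, (1, 0): 0.0, (0, 1): 0.0, (1, 1): gamma}
    se = 1.0 / np.sqrt(n)                  # standard error of each cell mean
    z_main = norm.ppf(1 - alpha / 2)
    z_pre = norm.ppf(1 - alpha_pre / 2)
    rejections = 0
    for _ in range(n_sim):
        xbar = {cell: rng.normal(m, se) for cell, m in mu.items()}
        interaction = xbar[1, 1] - xbar[0, 1] - xbar[1, 0] + xbar[0, 0]
        if abs(interaction) / (2 * se) <= z_pre:
            # Not significant: pool over B and compare A = 1 with A = 0 using both strata.
            diff = (xbar[1, 0] + xbar[1, 1]) / 2 - (xbar[0, 0] + xbar[0, 1]) / 2
            z = diff / se
        else:
            # Significant interaction: use the B = 0 cell means only (the actual simple effect).
            z = (xbar[1, 0] - xbar[0, 0]) / (np.sqrt(2) * se)
        rejections += abs(z) > z_main
    return rejections / n_sim

# The inflation grows with gamma * sqrt(n), illustrating the product result in the abstract.
for n in (25, 100, 400):
    print(n, round(adaptive_type1(gamma=0.3, n=n), 3))
```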
17.
Dirk Enders, Bianca Kollhorst, Susanne Engel, Roland Linder, Iris Pigeot 《Journal of Statistical Computation and Simulation》2018,88(11):2201-2214
Two-phase case–control studies cope with the problem of confounding by obtaining required additional information for a subset (phase 2) of all individuals (phase 1). Nowadays, studies with rich phase 1 data are available in which only a few unmeasured confounders need to be obtained in phase 2. The extended conditional maximum likelihood (ECML) approach in two-phase logistic regression is a novel method for analysing such data. Alternatively, two-phase case–control studies can be analysed by multiple imputation (MI), where phase 2 information for individuals included only in phase 1 is treated as missing. We conducted a simulation of two-phase studies in which we compared the performance of ECML and MI in typical scenarios with rich phase 1 data. Regarding the exposure effect, MI was less biased and more precise than ECML. Furthermore, ECML was sensitive to misspecification of the participation model. We therefore recommend MI for analysing two-phase case–control studies in situations with rich phase 1 data.
18.
When an existing risk prediction model is not sufficiently predictive, additional variables are sought for inclusion in the model. This paper addresses study designs to evaluate the improvement in prediction performance that is gained by adding a new predictor to a risk prediction model. We consider studies that measure the new predictor in a case–control subset of the study cohort, a practice that is common in biomarker research. We ask whether matching controls to cases with regard to baseline predictors improves efficiency. A variety of measures of prediction performance are studied. We find through simulation studies that matching improves the efficiency with which most measures are estimated, but can reduce efficiency for some. Efficiency gains are smaller when more controls per case are included in the study. A method that models the distribution of the new predictor in controls appears to improve estimation efficiency considerably.
19.
The aim of this paper is twofold. First, we discuss the maximum likelihood estimators of the unknown parameters of a two-parameter Birnbaum–Saunders distribution when the data are progressively Type-II censored. The maximum likelihood estimators are obtained using the EM algorithm by exploiting the property that the Birnbaum–Saunders distribution can be expressed as an equal mixture of an inverse Gaussian distribution and its reciprocal. From the proposed EM algorithm, the observed information matrix can be obtained quite easily, and it can be used to construct asymptotic confidence intervals. We analyse two real data sets and one simulated data set for illustrative purposes, and the performance is quite satisfactory. We further propose the use of different criteria to compare two different sampling schemes, and then find the optimal sampling scheme for a given criterion. It is observed that finding the optimal censoring scheme is a discrete optimization problem and quite a computer-intensive process. We examine a sub-optimal censoring scheme obtained by restricting the choice to one-step censoring schemes, as suggested by Balakrishnan (2007), which can be found quite easily. We compare the performance of the sub-optimal censoring schemes with that of the optimal ones, and observe that the loss of information is quite insignificant.
20.
《Journal of Statistical Computation and Simulation》2012,82(1-4):65-82
For the situation of several 2 × 2 tables, two approaches are presented for jackknifing the well-known estimators of a common odds ratio proposed by Woolf (1955) and by Mantel and Haenszel (1959). These estimators are compared with respect to their bias and mean squared error by means of a Monte Carlo study covering a wide range of parameters.
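For reference, the two estimators being jackknifed are, for K tables with cell counts (a_k, b_k, c_k, d_k) and totals n_k,

\log \widehat{OR}_{W} = \frac{\sum_k w_k \log\!\big(a_k d_k / (b_k c_k)\big)}{\sum_k w_k}, \qquad w_k = \left(\frac{1}{a_k}+\frac{1}{b_k}+\frac{1}{c_k}+\frac{1}{d_k}\right)^{-1},

\widehat{OR}_{MH} = \frac{\sum_k a_k d_k / n_k}{\sum_k b_k c_k / n_k}.

A leave-one-table-out jackknife simply recomputes either estimator with one stratum removed at a time, which is presumably the basic operation behind the two jackknife approaches compared in the study.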