1.

Bayesian Inference from Case–cohort Data with Multiple End-points

SANGITA KULATHINAL ELJA ARJAS 《Scandinavian Journal of Statistics》2006,33(1):25-36

Abstract. In a case–cohort design a random sample from the study cohort, referred to as a subcohort, and all the cases outside the subcohort are selected for collecting extra covariate data. The union of the selected subcohort and all cases is referred to as the case–cohort set. Such a design is generally employed when the collection of information on an extra covariate for the study cohort is expensive. An advantage of the case–cohort design over the more traditional case–control and nested case–control designs is that it provides a set of controls which can be used for multiple end-points, in which case there is information on some covariates and event follow-up for the whole study cohort. Here, we propose a Bayesian approach to analyse such a case–cohort design as a cohort design with incomplete data on the extra covariate. We construct likelihood expressions when multiple end-points are of interest simultaneously and propose a Bayesian data augmentation method to estimate the model parameters. A simulation study is carried out to illustrate the method and the results are compared with the complete cohort analysis.
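The selection step described in this abstract can be illustrated with a minimal Python sketch. It only shows how a case–cohort set is formed (random subcohort plus all cases); it is not the Bayesian data augmentation method of the paper. The function name `case_cohort_set`, the cohort of 100 identifiers and the four case identifiers are hypothetical.

```python
import random

def case_cohort_set(cohort_ids, case_ids, subcohort_size, seed=0):
    """Return (subcohort, selected) for a simple case-cohort design.

    Hypothetical helper: the subcohort is a simple random sample of the
    cohort, and the case-cohort set is its union with all cases.
    """
    rng = random.Random(seed)
    subcohort = set(rng.sample(sorted(cohort_ids), subcohort_size))
    # Every case is selected, whether or not it falls inside the subcohort.
    selected = subcohort | set(case_ids)
    return subcohort, selected

# Illustrative data only: a cohort of 100 subjects with four cases.
cohort = range(1, 101)
cases = {3, 17, 42, 88}
subcohort, selected = case_cohort_set(cohort, cases, subcohort_size=10)
```

Extra covariate data would then be collected only for the subjects in `selected`, while follow-up information remains available for the whole cohort.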

2.

Generalized staggered nested designs for variance components estimation

Yoshikazu Ojima 《Journal of applied statistics》2000,27(5):541-553

Staggered nested experimental designs are the most popular class of unbalanced nested designs in practical fields. The most important features of the staggered nested design are that it has a very simple open-ended structure and that each sum of squares in the analysis of variance has almost the same degrees of freedom. Based on these features, a class of unbalanced nested designs that generalizes the staggered nested design is proposed in this paper. Formulae for the estimation of variance components and their sums are provided. Comparing the variances of the estimators with those under the staggered nested design, it is found that some of the generalized staggered nested designs are more efficient than the traditional staggered nested design in estimating some of the variance components and their sums. An example is provided for illustration.

3.

Weighted Likelihood for Semiparametric Models and Two-phase Stratified Samples, with Application to Cox Regression (cited by: 2; self-citations: 1; citations by others: 1)

NORMAN E. BRESLOW JON A. WELLNER 《Scandinavian Journal of Statistics》2007,34(1):86-102

Abstract. We consider semiparametric models for which solution of Horvitz–Thompson or inverse probability weighted (IPW) likelihood equations with two-phase stratified samples leads to consistent and asymptotically Gaussian estimators of both Euclidean and non-parametric parameters. For Bernoulli (independent and identically distributed) sampling, standard theory shows that the Euclidean parameter estimator is asymptotically linear in the IPW influence function. By proving weak convergence of the IPW empirical process, and borrowing results on weighted bootstrap empirical processes, we derive a parallel asymptotic expansion for finite population stratified sampling. Several of our key results have been derived already for Cox regression with stratified case–cohort and more general survey designs. This paper is intended to help interpret this previous work and to pave the way towards a general Horvitz–Thompson approach to semiparametric inference with data from complex probability samples.
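As a reminder of what Horvitz–Thompson (inverse probability) weighting does, here is a minimal Python sketch applied to the simplest target, a population total, rather than to the likelihood equations studied in the paper. The function name `horvitz_thompson_total` and the two-stratum example are assumptions made for illustration.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total by weighting each sampled value by
    the inverse of its inclusion probability."""
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have equal length")
    if any(p <= 0 or p > 1 for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return sum(y / p for y, p in zip(values, inclusion_probs))

# Illustrative stratified sample: two units from a stratum sampled with
# probability 0.5, and one unit from a stratum sampled with probability 0.1.
y = [3.0, 5.0, 2.0]
pi = [0.5, 0.5, 0.1]
total = horvitz_thompson_total(y, pi)  # 3/0.5 + 5/0.5 + 2/0.1, about 36
```

In an IPW likelihood equation the same weights `1 / pi` multiply each sampled subject's score contribution instead of its observed value.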

4.

Case–control studies with complex sampling

Alastair Scott & Chris Wild 《Journal of the Royal Statistical Society. Series C, Applied statistics》2001,50(3):389-401

The use of complex sampling designs in population-based case–control studies is becoming more common, particularly for sampling the control population. This is prompted by all the usual cost and logistical benefits that are conferred by multistage sampling. Complex sampling has often been ignored in analysis but, with the advent of packages like SUDAAN, survey-weighted analyses that take account of the sample design can be carried out routinely. This paper explores this approach and more efficient alternatives, which can also be implemented by using readily available software.

5.

Efficient estimators for adaptive stratified sequential sampling

《Journal of Statistical Computation and Simulation》2012,82(10):1163-1179

In stratified sampling, methods for the allocation of effort among strata usually rely on some measure of within-stratum variance. If we do not have enough information about these variances, adaptive allocation can be used. In adaptive allocation designs, surveys are conducted in two phases. Information from the first phase is used to allocate the remaining units among the strata in the second phase. Brown et al. [Adaptive two-stage sequential sampling, Popul. Ecol. 50 (2008), pp. 239–245] introduced an adaptive allocation sampling design, in which the final sample size was random, and an unbiased estimator. Here, we derive an unbiased variance estimator for the design, and consider a related design where the final sample size is fixed. Having a fixed final sample size can make survey-planning easier. We introduce a biased Horvitz–Thompson type estimator and a biased sample mean type estimator for the sampling designs. We conduct two simulation studies on honey producers in Kurdistan and synthetic zirconium distribution in a region on the moon. Results show that the introduced estimators are more efficient than the available estimators for both variable and fixed sample size designs, and the conventional unbiased estimator of stratified simple random sampling design. In order to further evaluate the efficiency of the introduced designs and their estimators, we first review some well-known adaptive allocation designs and compare their estimators with the introduced estimators. Simulation results show that the introduced estimators are more efficient than available estimators of these well-known adaptive allocation designs.

6.

Survival analysis with incomplete genetic data

D. Y. Lin 《Lifetime data analysis》2014,20(1):16-22

Genetic data are now collected frequently in clinical studies and epidemiological cohort studies. For a large study, it may be prohibitively expensive to genotype all study subjects, especially with the next-generation sequencing technology. Two-phase sampling, such as case-cohort and nested case-control sampling, is cost-effective in such settings but entails considerable analysis challenges, especially if efficient estimators are desired. Another type of missing data arises when the investigators are interested in the haplotypes or the genetic markers that are not on the genotyping platform used for the current study. Valid and efficient analysis of such missing data is also interesting and challenging. This article provides an overview of these issues and outlines some directions for future research.

7.

New two-stage sampling designs based on neoteric ranked set sampling

Cesar Augusto Taconeli Angelo da Silva Cabral 《Journal of Statistical Computation and Simulation》2019,89(2):232-248

Neoteric ranked set sampling (NRSS) is a recently developed sampling plan, derived from the well-known ranked set sampling (RSS) scheme. It has already been proved that NRSS provides more efficient estimators for population mean and variance compared to RSS and other sampling designs based on ranked sets. In this work, we propose and evaluate the performance of some two-stage sampling designs based on NRSS. Five different sampling schemes are proposed. Through an extensive Monte Carlo simulation study, we verified that all proposed sampling designs outperform RSS, NRSS, and the original double RSS design, producing estimators for the population mean with a lower mean square error. Furthermore, as with NRSS, two-stage NRSS estimators present some bias for asymmetric distributions. We complement the study with a discussion on the relative performance of the proposed estimators. Moreover, an additional simulation based on data of the diameter and height of pine trees is presented.

8.

Fitting semiparametric accelerated failure time models for nested case–control data

Sangwook Kang 《Journal of Statistical Computation and Simulation》2017,87(4):652-663

A nested case–control (NCC) study is an efficient cohort-sampling design in which a subset of controls is sampled from the risk set at each event time. Since covariate measurements are taken only for the sampled subjects, the time and effort of conducting a full-scale cohort study can be saved. In this paper, we consider fitting a semiparametric accelerated failure time model to failure time data from an NCC study. We propose to employ an efficient induced smoothing procedure for the rank-based estimating method for regression parameter estimation. For variance estimation, we propose to use an efficient resampling method that utilizes the robust sandwich form. We extend our proposed methods to a generalized NCC study that allows a sampling of cases. Finite sample properties of the proposed estimators are investigated via an extensive simulation study. An application to a tumor study illustrates the utility of the proposed method in routine data analysis.
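The sampling step of an NCC design (controls drawn from the risk set at each event time) can be sketched in Python as follows. This covers only the sampling, not the accelerated failure time fitting or the induced smoothing proposed in the paper. The function name `nested_case_control`, the choice of `m` controls per case and the toy follow-up data are hypothetical.

```python
import random

def nested_case_control(times, events, m, seed=0):
    """Sample up to m controls from the risk set at each event time.

    times[i]  : follow-up time of subject i
    events[i] : 1 if subject i failed at times[i], 0 if censored
    Returns a list of (case_index, [control_indices]) pairs.
    """
    rng = random.Random(seed)
    sampled_sets = []
    for i, (t, d) in enumerate(zip(times, events)):
        if d != 1:
            continue  # only observed failures define a sampled risk set
        # Risk set at time t: subjects still under follow-up, excluding the case.
        at_risk = [j for j, tj in enumerate(times) if tj >= t and j != i]
        controls = rng.sample(at_risk, min(m, len(at_risk)))
        sampled_sets.append((i, controls))
    return sampled_sets

# Toy data for illustration: six subjects, three observed failures.
times = [2.0, 5.0, 3.5, 7.0, 6.0, 1.0]
events = [1, 0, 1, 1, 0, 0]
ncc_sets = nested_case_control(times, events, m=2)
```

Covariates would then be measured only for the cases and their sampled controls. The last failure here has an empty risk set, so it contributes no controls.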

9.

A Generalized Class of Ratio Type Exponential Estimators of Population Mean Under Linear Transformation of Auxiliary Variable

Lovleen Kumar Grover Parmdeep Kaur 《Communications in Statistics - Simulation and Computation》2013,42(7):1552-1574

Recently, Shabbir and Gupta [Shabbir, J. and Gupta, S. (2011). On estimating finite population mean in simple and stratified random sampling. Communications in Statistics-Theory and Methods, 40(2), 199–212] defined a class of ratio type exponential estimators of population mean under a very specific linear transformation of auxiliary variable. In the present article, we propose a generalized class of ratio type exponential estimators of population mean in simple random sampling under a very general linear transformation of auxiliary variable. Shabbir and Gupta's (2011) class of estimators is a particular member of our proposed class of estimators. It has been found that the optimal estimator of our proposed generalized class of estimators is always more efficient than almost all the existing estimators defined under the same situations. Moreover, in comparison to a few existing estimators, our proposed estimator becomes more efficient under some simple conditions. Theoretical results obtained in the article have been verified by taking a numerical illustration. Finally, a simulation study has been carried out to see the relative performance of our proposed estimator with respect to some existing estimators which are less efficient under certain conditions as compared to the proposed estimator.

10.

Stratified Case-Cohort Analysis of General Cohort Sampling Designs (cited by: 1; self-citations: 0; citations by others: 1)

SVEN OVE SAMUELSEN HALLVARD ÅNESTAD ANDERS SKRONDAL 《Scandinavian Journal of Statistics》2007,34(1):103-119

Abstract. It is shown that variance estimates for regression coefficients in exposure-stratified case-cohort studies (Borgan et al., Lifetime Data Anal., 6, 2000, 39–58) can easily be obtained from influence terms routinely calculated in the standard software for Cox regression. By allowing for post-stratification on outcome we also place the estimators proposed by Chen (J. R. Statist. Soc. Ser. B, 63, 2001, 791–809) for a general class of cohort sampling designs within the framework of Borgan et al., facilitating simple variance estimation for these designs. Finally, the Chen approach is extended to accommodate stratified designs with surrogate variables available for all cohort members, such as stratified case-cohort and counter-matching designs.

11.

Nested exposure case-control sampling: a sampling scheme to analyze rare time-dependent exposures

Feifel Jan Gebauer Madlen Schumacher Martin Beyersmann Jan 《Lifetime data analysis》2020,26(1):21-44

For large cohort studies with rare outcomes, the nested case-control design only requires data collection of small subsets of the individuals at risk. These are typically randomly sampled at the observed event times and a weighted, stratified analysis takes over the role of the full cohort analysis. Motivated by observational studies on the impact of hospital-acquired infection on hospital stay outcome, we are interested in situations where it is not necessarily the outcome that is rare, but a time-dependent exposure such as the occurrence of an adverse event or disease progression. Using the counting process formulation of general nested case-control designs, we propose three sampling schemes where not all commonly observed outcomes need to be included in the analysis. Rather, inclusion probabilities may be time-dependent and may even depend on the past sampling and exposure history. A bootstrap analysis of a full cohort data set from hospital epidemiology allows us to investigate the practical utility of the proposed sampling schemes in comparison to a full cohort analysis and to an overly simple application of the nested case-control design when the outcome is not rare.

12.

Statistics for weighing benefits and harms in a proposed genetic substudy of a randomized cancer prevention trial

Stuart G. Baker Barnett S. Kramer 《Journal of the Royal Statistical Society. Series C, Applied statistics》2005,54(5):941-954

Summary. When evaluating potential interventions for cancer prevention, it is necessary to compare benefits and harms. With new study designs, new statistical approaches may be needed to facilitate this comparison. A case in point arose in a proposed genetic substudy of a randomized trial of tamoxifen versus placebo in asymptomatic women who were at high risk for breast cancer. Although the randomized trial showed that tamoxifen substantially reduced the risk of breast cancer, the harms from tamoxifen were serious and some were life threatening. In hopes of finding a subset of women with inherited risk genes who derive greater benefits from tamoxifen, we proposed a nested case–control study to test some trial subjects for various genes and new statistical methods to extrapolate benefits and harms to the general population. An important design question is whether or not the study should target common low penetrance genes. Our calculations show that useful results are only likely with rare high penetrance genes.

13.

A jackknife variance estimator for unequal probability sampling

Yves G. Berger Chris J. Skinner 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2005,67(1):79-89

Summary. The jackknife method is often used for variance estimation in sample surveys but has only been developed for a limited class of sampling designs. We propose a jackknife variance estimator which is defined for any without-replacement unequal probability sampling design. We demonstrate design consistency of this estimator for a broad class of point estimators. A Monte Carlo study shows how the proposed estimator may improve on existing estimators.
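For orientation, the classical delete-one jackknife for an equal-probability sample can be sketched in Python as below. This is the textbook version that the abstract describes as limited, not the unequal-probability estimator proposed in the paper. The function names and the five-point sample are illustrative assumptions.

```python
def jackknife_variance(data, estimator):
    """Classical delete-one jackknife variance of estimator(data)."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    # Recompute the estimator with each observation left out in turn.
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    center = sum(leave_one_out) / n
    return (n - 1) / n * sum((t - center) ** 2 for t in leave_one_out)

def mean(xs):
    return sum(xs) / len(xs)

# Illustrative sample; for the mean, the jackknife variance equals s^2 / n.
sample = [2.0, 4.0, 4.0, 5.0, 7.0]
v_jack = jackknife_variance(sample, mean)
```

For the sample mean this reproduces the usual variance estimate `s^2 / n`, which makes the mean a convenient check before applying the function to a nonlinear estimator such as a ratio.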

14.

Pseudo‐R2 statistics under complex sampling

Thomas Lumley 《Australian & New Zealand Journal of Statistics》2017,59(2):187-194

Model summaries based on the ratio of fitted and null likelihoods have been proposed for generalised linear models, reducing to the familiar R² coefficient of determination in the Gaussian model with identity link. In this note I show how to define the Cox–Snell and Nagelkerke summaries under arbitrary probability sampling designs, giving a design‐consistent estimator of the population model summary. It is also shown that for logistic regression models under case–control sampling the usual Cox–Snell and Nagelkerke R² are not design‐consistent, but are systematically larger than would be obtained with a cross‐sectional or cohort sample from the same population, even in settings where the weighted and unweighted logistic regression estimators are similar or identical. Implementation of the new estimators is straightforward and code is provided in R.
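For reference, the usual unweighted definitions that this abstract refers to can be written as follows, with L_0 the maximised likelihood of the null (intercept-only) model, L_1 that of the fitted model, and n the sample size. These are the standard textbook forms, not the design-based versions derived in the paper, and the symbols are chosen here for illustration.

```latex
% Cox--Snell summary: based on the ratio of null to fitted likelihoods.
R^2_{\mathrm{CS}} = 1 - \left( \frac{L_0}{L_1} \right)^{2/n}

% Nagelkerke summary: Cox--Snell rescaled by its maximum attainable value.
R^2_{\mathrm{N}} = \frac{R^2_{\mathrm{CS}}}{1 - L_0^{2/n}}
```

The rescaling matters for discrete outcomes, where the likelihood is at most 1 and the Cox–Snell summary therefore cannot reach 1.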

15.

Inverse Adaptive Cluster Sampling with Unequal Selection Probabilities: Case Studies on Crab Holes and Arsenic Pollution

Mohammad Salehi Mohammad Moradi Jassim A. Al Khayat Jennifer Brown Adil Eltayeb Mohamed Yousif 《Australian & New Zealand Journal of Statistics》2015,57(2):189-201

Adaptive cluster sampling is an efficient method of estimating the parameters of rare and clustered populations. The method mimics how biologists would like to collect data in the field by targeting survey effort to localised areas where the rare population occurs. Another popular sampling design is inverse sampling. Inverse sampling was developed so as to be able to obtain a sample of rare events having a predetermined size. Ideally, in inverse sampling, the resultant sample set will be sufficiently large to ensure reliable estimation of population parameters. In an effort to combine the good properties of these two designs, adaptive cluster sampling and inverse sampling, we introduce inverse adaptive cluster sampling with unequal selection probabilities. We develop an unbiased estimator of the population total that is applicable to data obtained from such designs. We also develop numerical approximations to this estimator. The efficiency of the estimators that we introduce is investigated through simulation studies based on two real populations: crabs in Al Khor, Qatar and arsenic pollution in Kurdistan, Iran. The simulation results show that our estimators are efficient.

16.

General formulae for expectations, variances and covariances of the mean squares for staggered nested designs

Yoshikazu Ojima 《Journal of applied statistics》1998,25(6):785-799

Staggered nested experimental designs are the most popular class of unbalanced nested designs. Using a special notation which covers the particular structure of the staggered nested design, this paper systematically derives the canonical form for the arbitrary m-factors. Under the normality assumption for every random variable, a vector comprising m canonical variables from each experimental unit is normally independently and identically distributed. Every sum of squares used in the analysis of variance (ANOVA) can be expressed as the sum of squares of the corresponding canonical variables. Hence, general formulae for the expectations, variances and covariances of the mean squares are directly obtained from the canonical form. Applying the formulae, the explicit forms of the ANOVA estimators of the variance components and unbiased estimators of the ratios of the variance components are introduced in this paper. The formulae are easily applied to obtain the variances and covariances of any linear combinations of the mean squares, especially the ANOVA estimators of the variance components. These results are effectively applied for the standardization of measurement methods.

17.

Estimation of contingency tables in complex survey sampling using probabilistic expert systems

Marco Ballin Mauro Scanu Paola Vicard 《Journal of statistical planning and inference》2010

In this paper we explore the possibility of using a particular class of models, known as probabilistic expert systems, to define two classes of estimators of a contingency table in the case of stratified sampling designs. The two classes are characterized by the different role of the sampling design: in the first, the sampling design is treated as an additional variable; in the second, it is used only for estimation purposes by means of the survey weights. The bias/variance trade-off of these estimators is analyzed and the consequences of model misspecification are illustrated. Furthermore, it is shown that the Horvitz–Thompson estimator belongs to both classes of estimators. It turns out that the Horvitz–Thompson estimator is almost always inefficient but robust. Monte Carlo simulations illustrate the efficiency of the proposed estimators.

18.

A note on the prospective analysis of outcome-dependent samples

Hua Yun Chen 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2003,65(2):575-584

Summary. Two likelihood representations corresponding to the prospective and retrospective analyses of the case–control design are derived for general outcome-dependent samples with arbitrary discrete or continuous outcomes and possibly non-multiplicative models. Parameter identification in the general outcome-dependent design is reduced to the simple problem of parameter identification in the general odds ratio function. Both likelihoods are shown to generate the same profile likelihood for the common parameter of interest. Maximum likelihood estimators based on either likelihood are semiparametric efficient for the identifiable parameters.

19.

On Entropy-Based Test of Exponentiality in Ranked Set Sampling

M. Mahdizadeh 《Communications in Statistics - Simulation and Computation》2015,44(4):979-995

When the sampling units can be more easily ranked than quantified, ranked set sampling (RSS) is a viable alternative to the traditional simple random sampling (SRS). Much effort has been made to modify the basic RSS protocol with the aim of deriving more efficient estimators of the population attributes. Entropy has been seminal in developing measures of distributional disparities as a tool for statistical inference. This article is concerned with testing exponentiality based on sample entropy under some RSS-based designs. A simulation study shows that the proposed tests possess good power properties against several alternatives as compared with the ordinary test based on SRS.
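The basic balanced RSS protocol mentioned in this abstract can be sketched in Python as follows. The sketch assumes perfect ranking (units are ranked by their true values, which in practice would be replaced by judgement or a cheap auxiliary variable) and does not implement the entropy-based test of the paper. The function name `ranked_set_sample` and the population of 100 values are hypothetical.

```python
import random

def ranked_set_sample(population, set_size, cycles, seed=0):
    """Draw a balanced ranked set sample of size set_size * cycles.

    In each cycle, set_size independent sets of set_size units are drawn;
    the unit with rank r (r = 1..set_size) is measured from the r-th set.
    Assumption: ranking is perfect, i.e. done on the true values.
    """
    rng = random.Random(seed)
    sample = []
    for _ in range(cycles):
        for rank in range(set_size):
            candidates = sorted(rng.sample(population, set_size))
            sample.append(candidates[rank])  # keep only the rank-th unit
    return sample

# Illustrative population of 100 values.
population = [float(v) for v in range(1, 101)]
rss = ranked_set_sample(population, set_size=3, cycles=4)
```

Only the retained units need to be quantified, so the twelve measurements here are based on ranking thirty-six drawn units.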

20.

The Efficiency of Simple and Counter-matched Nested Case-control Sampling

Ornulf Borgan & Espen F. Olsen 《Scandinavian Journal of Statistics》1999,26(4):493-509

This paper presents a study of the performance of simple and counter-matched nested case-control sampling relative to a full cohort study. First we review methods for estimating the regression parameters and the integrated baseline hazard for Cox's proportional hazards model from cohort and case-control data. Then the asymptotic distributional properties of these estimators are recapitulated, and relative efficiency results are presented both for regression and baseline hazard estimation.