Similar literature
1.
In some studies that relate covariates to times of failure it is not feasible to observe all covariates for all subjects. For example, some covariates may be too costly in terms of time, money, or effect on the subject to record for all subjects. This paper considers the relative efficiencies of several designs for sampling a portion of the cohort on which the costly covariates will be observed. Such designs typically measure all covariates for each failure and control for covariates of lesser interest. Control subjects are sampled either from risk sets at times of observed failures or from the entire cohort. A new design in which the sampling probability for each individual depends on the amount of information that the individual can contribute to estimated coefficients is shown to be superior to other sampling designs under certain conditions. The primary focus of our designs is on time-invariant covariates, but some methods easily generalize to the time-varying setting. Data from a study conducted by the AIDS Clinical Trials Group are used to illustrate the new sampling procedure and to explore the relative efficiency of several sampling schemes.

2.
In this paper a method for the construction of a class of row-column designs with good statistical properties and high efficiency is presented. The class of designs produced is shown to exhibit balance, orthogonality and adjusted orthogonality. The efficiencies of these designs are investigated in detail, and they are shown to be very high, and possibly maximal in some cases.

3.
Ranked set sampling is a sampling technique that provides substantial cost efficiency in experiments where a quick, inexpensive ranking procedure is available to rank the units prior to formal, expensive and precise measurements. Although the theoretical properties and relative efficiencies of this approach with respect to simple random sampling have been extensively studied in the literature for the infinite population setting, the use of ranked set sampling methods has not yet been explored widely for finite populations. The purpose of this study is to use sheep population data from the Research Farm at Ataturk University, Erzurum, Turkey, to demonstrate the practical benefits of ranked set sampling procedures relative to the more commonly used simple random sampling estimation of the population mean and variance in a finite population. It is shown that the ranked set sample mean remains unbiased for the population mean as is the case for the infinite population, but the variance estimators are unbiased only with use of the finite population correction factor. Both mean and variance estimators provide substantial improvement over their simple random sample counterparts.
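As a rough illustration of the ranked set sampling procedure described in this abstract, the following minimal sketch draws a balanced ranked set sample from a finite population. It assumes perfect ranking (sorting by the true values stands in for the cheap ranking step), and the function name, the population, and the set size are hypothetical choices made here, not taken from the paper. Units are drawn without replacement within a cycle but may recur across cycles.

```python
import random


def ranked_set_sample(population, set_size, cycles, rng):
    """Draw a balanced ranked set sample, assuming perfect ranking.

    In each cycle, set_size disjoint sets of set_size units are drawn
    without replacement, and the r-th smallest unit of the r-th set
    is the one that gets "measured".
    """
    measured = []
    for _ in range(cycles):
        # One cycle uses set_size * set_size distinct units.
        drawn = rng.sample(population, set_size * set_size)
        for r in range(set_size):
            one_set = drawn[r * set_size:(r + 1) * set_size]
            # Sorting by true value is a stand-in for the inexpensive ranking.
            measured.append(sorted(one_set)[r])
    return measured


rng = random.Random(0)
population = [float(v) for v in range(1, 201)]  # hypothetical finite population
sample = ranked_set_sample(population, set_size=3, cycles=4, rng=rng)
rss_mean = sum(sample) / len(sample)  # ranked set sample mean
```

Only `set_size * cycles` units are measured, although `set_size ** 2 * cycles` units are ranked; that gap is where the cost saving comes from when ranking is cheap.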

4.
The case-crossover design has been used by many researchers to study the transient effect of an exposure on the risk of a rare outcome. In a case-crossover design, only cases are sampled and each case acts as his/her own control. The time of failure acts as the case and non-failure times act as the controls. Case-crossover designs have frequently been used to study the effect of environmental exposures on rare diseases or mortality. Time trends and seasonal confounding may be present in environmental studies and thus need to be controlled for by the sampling design. Several sampling methods are available for this purpose. In time-stratified sampling, disjoint strata of equal size are formed and the control times within the case stratum are used for comparison. The random semi-symmetric sampling design randomly selects a control time for comparison from two possible control times. The fixed semi-symmetric sampling design is a modified version of the random semi-symmetric sampling design that removes the random selection. Simulations show that the fixed semi-symmetric sampling design improves the variance of the random semi-symmetric sampling estimator by at least 35% for the exposures we studied. We derive expressions for the asymptotic variance of risk estimators for these designs and show that, while the designs are not theoretically equivalent, in many realistic situations the random semi-symmetric sampling design has similar efficiency to a time-stratified sampling design of size two and the fixed semi-symmetric sampling design has similar efficiency to a time-stratified sampling design of size three.
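The control-time selection rules mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: time is an integer index (for example, a day number), the two candidate control times of the semi-symmetric design are taken to lie a fixed `lag` before and after the case time, and candidates outside the study window are dropped. The function names, `lag`, and the window bounds are hypothetical, and the fixed semi-symmetric variant is not shown because the abstract does not describe its rule.

```python
import random


def time_stratified_controls(case_time, stratum_size):
    """Return the other times in the disjoint, equal-size stratum of the case."""
    start = (case_time // stratum_size) * stratum_size
    return [t for t in range(start, start + stratum_size) if t != case_time]


def semi_symmetric_candidates(case_time, lag, first_time, last_time):
    """Return the (at most two) candidate control times inside the study window."""
    candidates = (case_time - lag, case_time + lag)
    return [t for t in candidates if first_time <= t <= last_time]


def random_semi_symmetric_control(case_time, lag, first_time, last_time, rng):
    """Pick one candidate control time at random (assumes at least one exists)."""
    return rng.choice(semi_symmetric_candidates(case_time, lag, first_time, last_time))


rng = random.Random(0)
controls = time_stratified_controls(case_time=7, stratum_size=3)
chosen = random_semi_symmetric_control(case_time=50, lag=7, first_time=0, last_time=100, rng=rng)
```

With `stratum_size=3`, a case at time 7 falls in the stratum {6, 7, 8}, so its controls are times 6 and 8.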

5.
Rosén [1997. J. Statist. Plann. Inference 62, 159–191] introduced order sampling schemes of fixed shape which have inclusion probabilities roughly proportional to given size measures (πps schemes). Three particular cases where the fixed shape distributions are Pareto, exponential and uniform, respectively, are specially treated. In this paper, we give general algorithms for computing the first- and second-order inclusion probabilities for a general fixed shape order sampling scheme and explicit formulae for the three special cases. Identities are given that can be used to check the accuracy of the numerical results. Examples are included as well as some comments on improving the computational efficiency and accuracy of the algorithms.
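For orientation, the Pareto case of order sampling can be sketched as below. The sketch draws samples rather than computing inclusion probabilities exactly, so it is not the algorithm of this paper: each unit gets a ranking variable built from a uniform random number and its target inclusion probability, and the units with the smallest ranking variables are selected. The size measures and the function name are hypothetical, and the Monte Carlo frequencies are only a crude stand-in for the exact first-order inclusion probabilities.

```python
import random


def pareto_order_sample(sizes, n, rng):
    """Select n unit indices by Pareto order sampling.

    Target inclusion probabilities are lam_i = n * size_i / sum(sizes);
    the ranking variable is Q_i = [U_i / (1 - U_i)] / [lam_i / (1 - lam_i)].
    """
    total = sum(sizes)
    targets = [n * s / total for s in sizes]
    if any(not 0.0 < lam < 1.0 for lam in targets):
        raise ValueError("every target inclusion probability must lie in (0, 1)")
    ranking = []
    for i, lam in enumerate(targets):
        u = rng.random()  # uniform on [0, 1), so 1 - u is never zero
        ranking.append(((u / (1.0 - u)) / (lam / (1.0 - lam)), i))
    ranking.sort()
    return sorted(i for _, i in ranking[:n])


rng = random.Random(1)
sizes = [12.0, 30.0, 7.0, 21.0, 15.0, 9.0, 26.0]  # hypothetical size measures
sample = pareto_order_sample(sizes, n=3, rng=rng)

# Monte Carlo estimate of the first-order inclusion probabilities.
reps = 2000
counts = [0] * len(sizes)
for _ in range(reps):
    for i in pareto_order_sample(sizes, n=3, rng=rng):
        counts[i] += 1
estimated = [c / reps for c in counts]
```

Because every draw has exactly `n` units, the estimated probabilities sum to `n`. The same holds for the exact first-order inclusion probabilities of any fixed-size design, which gives one simple accuracy check; whether it is among the identities the paper gives is not stated in the abstract.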

6.
I consider the design of multistage sampling schemes for epidemiologic studies involving latent variable models, with surrogate measurements of the latent variables on a subset of subjects. Such models arise in various situations: when detailed exposure measurements are combined with variables that can be used to assign exposures to unmeasured subjects; when biomarkers are obtained to assess an unobserved pathophysiologic process; or when additional information is to be obtained on confounding or modifying variables. In such situations, it may be possible to stratify the subsample on data available for all subjects in the main study, such as outcomes, exposure predictors, or geographic locations. Three circumstances where analytic calculations of the optimal design are possible are considered: (i) when all variables are binary; (ii) when all are normally distributed; and (iii) when the latent variable and its measurement are normally distributed, but the outcome is binary. In each of these cases, it is often possible to considerably improve the cost efficiency of the design by appropriate selection of the sampling fractions. More complex situations arise when the data are spatially distributed: the spatial correlation can be exploited to improve exposure assignment for unmeasured locations using available measurements on neighboring locations; some approaches for informative selection of the measurement sample using location and/or exposure predictor data are considered.

7.
This study proposes estimators of the mean, and of its variance, of the number of respondents who possess a rare sensitive attribute, based on stratified sampling schemes (stratified sampling and stratified double sampling). It extends the estimation reported in Land et al. [Estimation of a rare sensitive attribute using Poisson distribution, Statistics (2011), in press. DOI: 10.1080/02331888.2010.524300], which uses a Poisson distribution, and the unrelated question randomized response model reported in Greenberg et al. [The unrelated question randomized response model: Theoretical framework, J. Amer. Statist. Assoc. 64 (1969), 520–539]. In stratified sampling, estimators are proposed both for the case where the parameter of the rare unrelated attribute is known and for the case where it is unknown. The variances of the estimators under proportional and optimum allocation are also given. The proposed estimators are evaluated by their relative efficiency, comparing their variances with those of the estimators reported in Land et al. for various values of the parameters and of the probability of selecting a question. We show that the proposed methods are more efficient than Land et al.'s randomized response model under some conditions. When the sizes of the stratified populations are not given, other estimators are suggested using stratified double sampling. For proportional allocation, the difference between the variances under stratified sampling and stratified double sampling is given for the case where the rare unrelated attribute is known.

8.
A class of cohort sampling designs, including nested case–control, case–cohort and classical case–control designs involving survival data, is studied through a unified approach using Cox's proportional hazards model. By finding an optimal sample reuse method via local averaging, a closed form estimating function is obtained, leading directly to the estimators of the regression parameters that are relatively easy to compute and are more efficient than some commonly used estimators in case–cohort and nested case–control studies. A semiparametric efficient estimator can also be found with some further computation. In addition, the class of sampling designs in this study provides a variety of sampling options and relaxes the restrictions of sampling schemes that are currently available.

9.
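The various forms of rotation pattern used in current multi-level sample surveys have been widely applied in Western countries, but they also suffer from a series of problems. In view of this, by unifying the various forms of rotation pattern and reviewing them theoretically, we arrive at a three-dimensional balanced multi-level rotation pattern design method. The method unifies multi-level rotation pattern design with the subsequent study of sampling estimation methods. It not only reduces the negative effects of the various kinds of rotation bias but also accurately measures the correlation between rotation samples, and it yields more accurate continuous sampling data in multi-level surveys. This design method has great value for wider adoption.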

10.
Hypothesis Testing in Two-Stage Cluster Sampling
Correlated observations often arise in complex sampling schemes such as two-stage cluster sampling. The resulting observations from this sampling scheme usually exhibit certain positive intracluster correlation, as a result of which the standard statistical procedures for testing hypotheses concerning linear combinations of the parameters may lack some of the optimal properties that these possess when the data are uncorrelated. The aim of this paper is to present exact methods for testing these hypotheses by combining within and between cluster information much as in Zhou & Mathew (1993).

11.
In this paper, a new sampling method is suggested, namely truncation-based ranked set sampling (TBRSS), for estimating the population mean and median. The suggested method is compared with the simple random sampling (SRS), ranked set sampling (RSS), extreme ranked set sampling (ERSS) and median-ranked set sampling (MRSS) methods. It is shown that for estimating the population mean when the underlying distribution is symmetric, the TBRSS estimator is unbiased and is more efficient than the SRS estimator based on the same number of measured units. For the asymmetric distributions considered in this study, the TBRSS estimator is more efficient than the SRS estimator for all considered distributions except the exponential distribution when the selection coefficient gets large. When compared with the ERSS and MRSS methods, TBRSS performs well with respect to ERSS for all considered distributions except the U(0, 1) distribution, while the TBRSS efficiency is higher than that of MRSS for the U(0, 1) distribution. For estimating the population median, the TBRSS estimators have higher efficiencies when compared with SRS and ERSS. A real data set is used to illustrate the suggested method.

12.
Although Markov chain Monte Carlo methods have been widely used in many disciplines, exact eigenanalysis for such generated chains has been rare. In this paper, a special Metropolis-Hastings algorithm, Metropolized independent sampling, proposed first in Hastings (1970), is studied in full detail. The eigenvalues and eigenvectors of the corresponding Markov chain, as well as a sharp bound for the total variation distance between the nth updated distribution and the target distribution, are provided. Furthermore, the relationships between this scheme, rejection sampling, and importance sampling are studied with emphasis on their relative efficiencies. It is shown that Metropolized independent sampling is superior to rejection sampling in two respects: asymptotic efficiency and ease of computation.
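A minimal sketch of Metropolized independent sampling on a finite state space may help fix ideas. Proposals are drawn from a fixed distribution that does not depend on the current state, and a proposed state y replaces the current state x with probability min(1, w(y)/w(x)), where w is the importance weight target/proposal. The three-state target and the uniform proposal are hypothetical, the function name is chosen here for illustration, and the sketch assumes all probabilities are strictly positive.

```python
import random


def metropolized_independence_sampler(target, proposal, n_steps, rng):
    """Run the chain; return the visited states and the number of accepted moves."""
    if any(p <= 0 for p in target) or any(q <= 0 for q in proposal):
        raise ValueError("this sketch assumes strictly positive probabilities")
    states = range(len(target))
    # Importance weights w(x) = target(x) / proposal(x).
    weights = [p / q for p, q in zip(target, proposal)]
    current = rng.choices(states, weights=proposal)[0]
    chain, accepted = [], 0
    for _ in range(n_steps):
        # The proposal ignores the current state (an "independence" proposal).
        candidate = rng.choices(states, weights=proposal)[0]
        if rng.random() < min(1.0, weights[candidate] / weights[current]):
            current = candidate
            accepted += 1
        chain.append(current)
    return chain, accepted


rng = random.Random(2)
target = [0.5, 0.3, 0.2]          # hypothetical target distribution
proposal = [1 / 3, 1 / 3, 1 / 3]  # hypothetical proposal distribution
chain, accepted = metropolized_independence_sampler(target, proposal, 1000, rng)
```

When the proposal equals the target, all weights are equal and every proposal is accepted, so the chain reduces to independent draws from the target.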

13.
The efficiency of observational studies may be increased by applying multistage sampling designs. It is, however, not always transparent how to construct such a design to obtain increased efficiency. We here present a general statistical framework for describing and constructing multistage designs. We also provide tools for efficiency and cost-efficiency comparisons, to facilitate the choice of sampling scheme. The comparisons are based on Fisher information matrices and the results are presented in graphs, where either efficiency or cost-adjusted efficiency is plotted against a normalized measure of cost. The former curve resides in the unit square and is analogous to the receiver operating characteristic curve used for testing.

14.
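Current rotation sample surveys use various types of single-level rotation pattern, which have been widely applied in Western countries but also suffer from a series of problems. Therefore, by unifying the various types of rotation pattern and studying them systematically and theoretically, we derive a two-dimensional balanced single-level rotation pattern design method and summarize its advantages in application. This design method not only unifies rotation pattern design with the subsequent study of estimation methods, but also reduces the negative effects of the various kinds of rotation bias and accurately measures the correlation between rotation samples, ultimately yielding more accurate estimators for continuous sampling.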

15.
We examine the efficiency of several sampling plans for use in certain agricultural, ecological and environmental studies. One concern for such studies is that plots that are physically close might be more similar than distant plots. We consider sampling plans that are designed to generate samples that represent the entire population while avoiding the selection of units that provide essentially redundant information. All plans have the property that they avoid the simultaneous selection of units that are, in some sense, neighboring units. By means of a simulation study, the efficiency of these plans is compared to simple random sampling. Factors that influence the relative efficiencies are examined. This is done for a number of different populations, representing various possible patterns for a response variable.

16.
Generalized aberration (GA) is one of the most frequently used criteria to quantify the suitability of an orthogonal array (OA) to be used as an experimental design. The two main motivations for GA are that it quantifies bias in a main-effects only model and that it is a good surrogate for estimation efficiencies of models with all the main effects and some two-factor interaction components. We demonstrate that these motivations are not appropriate for three-level OAs of strength 3 and we propose a direct classification with other criteria instead. To illustrate, we classified complete series of three-level strength-3 OAs with 27, 54 and 81 runs using the GA criterion, the rank of the matrix with two-factor interaction contrasts, the estimation efficiency of two-factor interactions, the projection estimation capacity, and a new model robustness criterion. For all of the series, we provide a list of admissible designs according to these criteria.

17.
I analyze efficient estimation of a cointegrating vector when the regressand and regressor are observed at different frequencies. Previous authors have examined the effects of specific temporal aggregation or sampling schemes, finding conventionally efficient techniques to be efficient only when both the regressand and the regressors are average sampled. Using an alternative method for analyzing aggregation under more general weighting schemes, I derive an efficiency bound that is conditional on the type of aggregation used on the low-frequency series and differs from the unconditional bound defined by the full-information high-frequency data-generating process, which is infeasible due to aggregation of at least one series. I modify a conventional estimator, canonical cointegrating regression (CCR), to accommodate cases in which the aggregation weights are known. The correlation structure may be utilized to offset the potential information loss from aggregation, resulting in a conditionally efficient estimator. In the case of unknown weights, the correlation structure of the error term generally confounds identification of conditionally efficient weights. Efficiency is illustrated using a simulation study and an application to estimating a gasoline demand equation.

18.
19.
In statistical practice, systematic sampling (SYS) is used in many variants because it is simple to apply. In addition, SYS may provide efficiency gains if it is well adjusted to the structure of the population under study. However, if SYS is based on an inappropriate picture of the population, a large loss of efficiency, i.e. a large increase in variance, may result from changing from simple random sampling to SYS. In the context of two-stage designs, SYS so far seems to be used mostly for subsampling within the primary units. As an alternative to this practice, we propose to randomize the order of the primary units, then to select a number of primary units systematically and, thereafter, to draw secondary units by simple random sampling without replacement within the primary units selected. This procedure is more efficient than simple random sampling with replacement from the whole population of all secondary units: the variance of an adequate estimator for a total is never increased by changing from simple random sampling to randomized SYS, whatever values the characteristic takes on the secondary units, while there are values for which the variance decreases under this change. This result should hold generally, even if our proof, so far, is not complete for general sample sizes.
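The proposed two-stage procedure can be sketched roughly as follows: shuffle the primary units, take every k-th one from a random start, and then draw secondary units by simple random sampling without replacement inside each selected primary unit. The sketch assumes that the number of primary units is a multiple of the number to be selected and that the same number of secondary units is drawn in each. The function name, the toy clusters, and the sample sizes are hypothetical and not taken from the paper.

```python
import random


def randomized_systematic_two_stage(primary_units, n_primary, n_secondary, rng):
    """Return {primary index: sampled secondary units} for a two-stage sample."""
    if len(primary_units) % n_primary != 0:
        raise ValueError("this sketch assumes len(primary_units) is a multiple of n_primary")
    # Stage 1a: randomize the order of the primary units.
    order = list(range(len(primary_units)))
    rng.shuffle(order)
    # Stage 1b: systematic selection with a random start and a fixed step.
    step = len(order) // n_primary
    start = rng.randrange(step)
    chosen = order[start::step]
    # Stage 2: simple random sampling without replacement within each chosen unit.
    return {i: rng.sample(primary_units[i], n_secondary) for i in chosen}


rng = random.Random(4)
# Hypothetical population: 6 primary units with 5 secondary units each.
primary_units = [[10 * c + j for j in range(5)] for c in range(6)]
two_stage = randomized_systematic_two_stage(primary_units, n_primary=3, n_secondary=2, rng=rng)
```

Shuffling first means the systematic step no longer depends on any ordering of the primary units in the frame.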

20.
In this paper we present a new method of calculating sampling intervals, so-called windows, which allow an experimenter some flexibility in timing the sample collection while a minimum required design efficiency for parameter estimation is assured. The method is based on the Equivalence Theorem for D-optimality, which makes the length of each window related to the parameter sensitivities. An example of calculating the windows in a pharmacokinetic study is presented. Some other methods of calculating efficient sampling windows are briefly discussed.
