Similar literature
 Found 20 similar documents (search time: 93 ms)
1.
Abstract. Two new unequal probability sampling methods are introduced: conditional and restricted Pareto sampling. The advantage of conditional Pareto sampling compared with standard Pareto sampling, introduced by Rosén (J. Statist. Plann. Inference, 62, 1997, 135, 159), is that the factual inclusion probabilities better agree with the desired ones. Restricted Pareto sampling, preferably conditioned or adjusted, is able to handle cases where there are several restrictions on the sample and is an alternative to the recent cube method for balanced sampling introduced by Deville and Tillé (Biometrika, 91, 2004, 893). The new sampling designs have high entropy and the involved random numbers can be seen as permanent random numbers.
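For orientation, standard Pareto sampling (the baseline the abstract compares against) can be sketched as follows. This is a minimal illustration, not the conditional or restricted variants proposed in the paper; the function name `pareto_sample` and its arguments are assumptions made for this example. Each unit gets a uniform random number, a ranking variable is formed from the odds of that number divided by the odds of the target inclusion probability, and the n units with the smallest ranking variables are selected.

```python
import random


def pareto_sample(target_probs, n, rng=None):
    """Draw a fixed-size Pareto (order) sample.

    target_probs: desired inclusion probabilities, each strictly between
                  0 and 1 and summing to n (hypothetical input format).
    n:            sample size.
    rng:          optional random.Random instance, for reproducibility.
    Returns the sorted indices of the n selected units.
    """
    rng = rng or random.Random()
    ranking = []
    for i, lam in enumerate(target_probs):
        u = rng.random()  # plays the role of a permanent random number
        # Ranking variable: odds of u divided by odds of the target probability.
        q = (u / (1.0 - u)) / (lam / (1.0 - lam))
        ranking.append((q, i))
    ranking.sort()
    # Keep the n units with the smallest ranking variables.
    return sorted(i for _, i in ranking[:n])


if __name__ == "__main__":
    probs = [0.2, 0.4, 0.6, 0.8]  # sums to n = 2
    print(pareto_sample(probs, 2, random.Random(0)))
```

The realized inclusion probabilities of this scheme only approximate the targets, which is the gap the conditional variant in the abstract is meant to narrow.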

2.
Abstract. We consider semiparametric models for which solution of Horvitz–Thompson or inverse probability weighted (IPW) likelihood equations with two-phase stratified samples leads to consistent and asymptotically Gaussian estimators of both Euclidean and non-parametric parameters. For Bernoulli (independent and identically distributed) sampling, standard theory shows that the Euclidean parameter estimator is asymptotically linear in the IPW influence function. By proving weak convergence of the IPW empirical process, and borrowing results on weighted bootstrap empirical processes, we derive a parallel asymptotic expansion for finite population stratified sampling. Several of our key results have been derived already for Cox regression with stratified case–cohort and more general survey designs. This paper is intended to help interpret this previous work and to pave the way towards a general Horvitz–Thompson approach to semiparametric inference with data from complex probability samples.

3.
A class of designs for sampling two units without replacement with inclusion probability proportional to size is proposed in this article. Many well-known probability proportional to size sampling designs are special cases of this class. The first- and second-order inclusion probabilities of this class satisfy important properties and provide a non-negative variance estimator of the Horvitz and Thompson estimator for the population total. A suitable choice of the first- and second-order inclusion probabilities from this class can be used to reduce the variance estimator of the Horvitz and Thompson estimator. Comparisons between different proportional to size sampling designs through real data and artificial examples are given. Examples show that the minimum variance of the Horvitz and Thompson estimator obtained from the proposed design is, in most cases, not attainable under any of the well-known designs.
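As a reminder of the quantities involved, the Horvitz–Thompson estimator of a total and the Sen–Yates–Grundy variance estimator for a sample of two units can be sketched as below. The abstract does not say which variance estimator the article uses, so the Sen–Yates–Grundy form is an assumption here, and the function names are hypothetical. The estimator is non-negative whenever the joint inclusion probability does not exceed the product of the two first-order probabilities.

```python
def horvitz_thompson_total(y, pi):
    """Horvitz-Thompson estimate of the population total.

    y:  observed values of the sampled units.
    pi: first-order inclusion probabilities of the same units.
    """
    return sum(y_k / pi_k for y_k, pi_k in zip(y, pi))


def syg_variance_n2(y_i, y_j, pi_i, pi_j, pi_ij):
    """Sen-Yates-Grundy variance estimator for a sample of two units.

    pi_ij is the joint (second-order) inclusion probability of the pair.
    The result is non-negative when pi_ij <= pi_i * pi_j.
    """
    return (pi_i * pi_j - pi_ij) / pi_ij * (y_i / pi_i - y_j / pi_j) ** 2


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the article.
    print(horvitz_thompson_total([10.0, 20.0], [0.5, 0.25]))
    print(syg_variance_n2(10.0, 20.0, 0.5, 0.25, 0.1))
```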

4.
An outcome-dependent sampling (ODS) design is a retrospective sampling scheme where one observes the primary exposure variables with a probability that depends on the observed value of the outcome variable. When the outcome of interest is failure time, the observed data are often censored. By allowing the selection of the supplemental samples to depend on whether the event of interest happens or not, and by oversampling subjects from the most informative regions, an ODS design for time-to-event data can reduce the cost of the study and improve efficiency. We review recent progress and advances in research on ODS designs with failure time data. This includes research on ODS-related designs such as the case–cohort design, generalized case–cohort design, stratified case–cohort design, general failure-time ODS design, length-biased sampling design and interval sampling design.

5.
A new method for sampling from a finite population that is spread in one, two or more dimensions is presented. Weights are used to create strong negative correlations between the inclusion indicators of nearby units. The method can be used to produce unequal probability samples that are well spread over the population in every dimension, without any spatial stratification. Since the method is very general, there are numerous possible applications, especially in sampling of natural resources where spatially balanced sampling has proven to be efficient. Two examples show that the method gives better estimates than other commonly used designs.

6.
We give a formal definition of a representative sample, but roughly speaking, it is a scaled‐down version of the population, capturing its characteristics. New methods for selecting representative probability samples in the presence of auxiliary variables are introduced. Representative samples are needed for multipurpose surveys, when several target variables are of interest. Such samples also enable estimation of parameters in subspaces and improved estimation of target variable distributions. We describe how two recently proposed sampling designs can be used to produce representative samples. Both designs use distance between population units when producing a sample. We propose a distance function that can calculate distances between units in general auxiliary spaces. We also propose a variance estimator for the commonly used Horvitz–Thompson estimator. Real data as well as illustrative examples show that representative samples are obtained and that the variance of the Horvitz–Thompson estimator is reduced compared with simple random sampling.

7.
The case-crossover design has been used by many researchers to study the transient effect of an exposure on the risk of a rare outcome. In a case-crossover design, only cases are sampled and each case acts as his/her own control. The time of failure acts as the case and non-failure times act as the controls. Case-crossover designs have frequently been used to study the effect of environmental exposures on rare diseases or mortality. Time trends and seasonal confounding may be present in environmental studies and thus need to be controlled for by the sampling design. Several sampling methods are available for this purpose. In time-stratified sampling, disjoint strata of equal size are formed and the control times within the case stratum are used for comparison. The random semi-symmetric sampling design randomly selects a control time for comparison from two possible control times. The fixed semi-symmetric sampling design is a modified version of the random semi-symmetric sampling design that removes the random selection. Simulations show that the fixed semi-symmetric sampling design improves the variance of the random semi-symmetric sampling estimator by at least 35% for the exposures we studied. We derive expressions for the asymptotic variance of risk estimators for these designs and show that, while the designs are not theoretically equivalent, in many realistic situations the random semi-symmetric sampling design has similar efficiency to a time-stratified sampling design of size two and the fixed semi-symmetric sampling design has similar efficiency to a time-stratified sampling design of size three.

8.
Based on a progressively Type-I interval censored sample, the problem of estimating the unknown parameters of a two-parameter generalized half-normal (GHN) distribution is considered. Different methods of estimation are discussed. They include maximum likelihood estimation, the midpoint approximation method, approximate maximum likelihood estimation, the method of moments, and estimation based on the probability plot. Several Bayesian estimates with respect to different symmetric and asymmetric loss functions, such as squared error, LINEX, and general entropy, are calculated. Lindley’s approximation method is applied to determine the Bayesian estimates. Monte Carlo simulations are performed to compare the performance of the different methods. Finally, an analysis is carried out for a real dataset.

9.
In some studies that relate covariates to times of failure it is not feasible to observe all covariates for all subjects. For example, some covariates may be too costly in terms of time, money, or effect on the subject to record for all subjects. This paper considers the relative efficiencies of several designs for sampling a portion of the cohort on which the costly covariates will be observed. Such designs typically measure all covariates for each failure and control for covariates of lesser interest. Control subjects are sampled either from risk sets at times of observed failures or from the entire cohort. A new design in which the sampling probability for each individual depends on the amount of information that the individual can contribute to estimated coefficients is shown to be superior to other sampling designs under certain conditions. The primary focus of our designs is on time-invariant covariates, but some methods easily generalize to the time-varying setting. Data from a study conducted by the AIDS Clinical Trials Group are used to illustrate the new sampling procedure and to explore the relative efficiency of several sampling schemes.

10.
For fixed size sampling designs with high entropy, it is well known that the variance of the Horvitz–Thompson estimator can be approximated by the Hájek formula. The interest of this asymptotic variance approximation is that it only involves the first order inclusion probabilities of the statistical units. We extend this variance formula when the variable under study is functional, and we prove, under general conditions on the regularity of the individual trajectories and the sampling design, that we can get a uniformly convergent estimator of the variance function of the Horvitz–Thompson estimator of the mean function. Rates of convergence to the true variance function are given for rejective sampling. We deduce, under conditions on the entropy of the sampling design, that it is possible to build confidence bands whose coverage is asymptotically the desired one via simulation of Gaussian processes with variance function given by the Hájek formula. Finally, the accuracy of the proposed variance estimator is evaluated on samples of electricity consumption data measured every half hour over a period of 1 week.
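To make the role of the first-order inclusion probabilities concrete, one commonly quoted scalar form of the Hájek approximation to the variance of the Horvitz–Thompson total can be sketched as follows. The exact constants and the functional extension used in the paper are not stated in the abstract, so this is a sketch under assumptions: the finite-population factor N/(N−1) and the function name `hajek_variance_approx` are choices made for this example.

```python
def hajek_variance_approx(y, pi):
    """Hajek-type approximation to the variance of the HT total.

    y:  values of the study variable for ALL N population units.
    pi: first-order inclusion probabilities of the same units,
        each strictly between 0 and 1.
    Only first-order inclusion probabilities are needed.
    """
    N = len(y)
    # Weights b_k = pi_k (1 - pi_k) N / (N - 1); the N / (N - 1) factor
    # is an assumed finite-population correction.
    b = [p * (1.0 - p) * N / (N - 1) for p in pi]
    # Weighted mean of the expanded values y_k / pi_k.
    r = sum(bk * yk / pk for bk, yk, pk in zip(b, y, pi)) / sum(b)
    # Weighted sum of squared deviations of the expanded values from r.
    return sum(bk * (yk / pk - r) ** 2 for bk, yk, pk in zip(b, y, pi))


if __name__ == "__main__":
    # With equal probabilities n / N this reduces to the usual
    # simple-random-sampling variance N^2 (1 - n/N) S^2 / n.
    print(hajek_variance_approx([1.0, 2.0, 3.0, 4.0], [0.5] * 4))
```

With equal inclusion probabilities the expression coincides with the textbook variance under simple random sampling without replacement, which gives a quick sanity check of the constants chosen here.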

11.
In some applications it is cost efficient to sample data in two or more stages. In the first stage a simple random sample is drawn and then stratified according to some easily measured attribute. In each subsequent stage a random subset of previously selected units is sampled for more detailed and costly observation, with a unit's sampling probability determined by its attributes as observed in the previous stages. This paper describes multistage sampling designs and estimating equations based on the resulting data. Maximum likelihood estimates (MLEs) and their asymptotic variances are given for designs using parametric models. Horvitz–Thompson estimates are introduced as alternatives to MLEs, their asymptotic distributions are derived and their strengths and weaknesses are evaluated. The designs and the estimates are illustrated with data on corn production.

12.
Summary. The jackknife method is often used for variance estimation in sample surveys but has only been developed for a limited class of sampling designs. We propose a jackknife variance estimator which is defined for any without-replacement unequal probability sampling design. We demonstrate design consistency of this estimator for a broad class of point estimators. A Monte Carlo study shows how the proposed estimator may improve on existing estimators.

13.
Suppose there are k (≥ 2) treatments and each treatment is a Bernoulli process with binomial sampling. The problem of selecting a random-sized subset which contains the treatment with the largest survival probability (reliability or probability of success) is considered. Based on ideas from both the classical approaches and the general Bayesian statistical decision approach, a new subset selection procedure is proposed to solve this kind of problem in both balanced and unbalanced designs. Compared with the classical procedures, the proposed procedure selects a significantly smaller subset. Its optimal properties and performance are examined. The methods of selecting and fitting the priors and the results of Monte Carlo simulations on selected important cases are also studied.

14.
In this article, we present a study carried out to compare the effectiveness of the normal probability plot (NPP) and a simple dot plot in assessing the significance of the effects in experimental designs with factors at two levels (2^(k-p) designs). Several groups of students who had just completed a course that covered factorial designs were asked to identify the significant effects in a total of 32 situations, 16 of which were represented using NPPs and the other 16 using dot plots. Although the 32 scenarios were said to be different, there were really only 16 different situations, each of which was represented using the two methods to be compared. A simple graphical analysis shows no evidence that there is a difference between the two procedures. However, in designs with 16 runs there are some cases where NPP seems to give slightly better results.

15.
Modern sampling designs in survey statistics, in general, are constructed in order to optimize the accuracy of estimators such as totals, means and proportions. In stratified random sampling a variance minimal solution was introduced by Neyman and Tschuprov. However, practical constraints may lead to limitations of the domain of sampling fractions which have to be considered within the optimization process. Special attention has to be paid to the complexity of numerical solutions in cases with many strata or when the optimal allocation has to be applied repeatedly, such as in iterative solutions of stratification problems. The present article gives an overview of recent numerical algorithms which allow adequate inclusion of box constraints in the numerical optimization process. These box constraints may play an important role in statistical modeling. Furthermore, a new approach through a fixed point iteration with a finite termination property is presented.
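The variance-minimal allocation mentioned above assigns sample sizes proportionally to N_h · S_h (stratum size times stratum standard deviation). A minimal sketch with upper bounds only is given below; it uses a simple cap-and-reallocate loop, which is not the fixed point iteration presented in the article, and it ignores lower bounds and integer rounding. The function name and arguments are assumptions made for this example.

```python
def neyman_allocation(n, sizes, sds, upper=None):
    """Allocate n sample units to strata proportionally to N_h * S_h.

    sizes: stratum population sizes N_h.
    sds:   stratum standard deviations S_h.
    upper: optional per-stratum upper bounds (defaults to the stratum sizes).
    Returns real-valued allocations; rounding is left to the caller.
    """
    upper = list(upper) if upper is not None else list(sizes)
    alloc = [0.0] * len(sizes)
    free = set(range(len(sizes)))  # strata not yet fixed at their bound
    remaining = float(n)
    while free:
        total = sum(sizes[h] * sds[h] for h in free)
        if total == 0:
            break  # no variability left to allocate against
        # Unconstrained Neyman allocation among the free strata.
        trial = {h: remaining * sizes[h] * sds[h] / total for h in free}
        over = [h for h in free if trial[h] > upper[h]]
        if not over:
            for h in free:
                alloc[h] = trial[h]
            break
        # Fix violating strata at their bound and reallocate the rest.
        for h in over:
            alloc[h] = float(upper[h])
            remaining -= upper[h]
            free.remove(h)
    return alloc


if __name__ == "__main__":
    print(neyman_allocation(100, [1000, 1000], [1.0, 3.0]))
    print(neyman_allocation(100, [1000, 50], [1.0, 30.0]))
```

In the second call the small, highly variable stratum would receive 60 units without constraints, so it is capped at its size of 50 and the remaining 50 units go to the other stratum.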

16.
The choice of smoothing determines the properties of nonparametric estimates of probability densities. In the discrimination problem, the choice is often tied to loss functions. A framework for the cross-validatory choice of smoothing parameters based on general loss functions is given. Several loss functions are considered as special cases. In particular, a family of loss functions, which is connected to discrimination problems, is directly related to measures of performance used in discrimination. Consistency results are given for a general class of loss functions which comprise this family of discriminant loss functions.

17.
Two sampling designs via inverse sampling for generating record data and their concomitants are considered: single sample and multisample. The purpose here is to compare the Fisher information in these two sampling schemes. It is shown that the comparison criterion depends on the underlying distribution. Several general results are established for some parametric families and their well-known subclasses such as location-scale and shape families, the exponential family and the proportional (reversed) hazard model. The Farlie-Gumbel-Morgenstern (FGM) family, the bivariate normal distribution, and some other common bivariate distributions are considered as examples for illustration and are classified according to this criterion.

18.
Not having a variance estimator is a seriously weak point of a sampling design from a practical perspective. This paper provides unbiased variance estimators for several sampling designs based on inverse sampling, both with and without an adaptive component. It proposes a new design, which is called the general inverse sampling design, that avoids sampling an infeasibly large number of units. The paper provides estimators for this design as well as its adaptive modification. A simple artificial example is used to demonstrate the computations. The adaptive and non‐adaptive designs are compared using simulations based on real data sets. The results indicate that, for appropriate populations, the adaptive version can have a substantial variance reduction compared with the non‐adaptive version. Also, adaptive general inverse sampling with a limitation on the initial sample size has a greater variance reduction than without the limitation.

19.
If a model is fitted to empirical data, bias can arise from terms which are not incorporated in the model assumptions. As a consequence the commonly used optimality criteria based on the generalized variance of the estimator of the model parameters may not lead to efficient designs for the statistical analysis. In this note some general aspects of all-bias designs are presented, which were introduced in this context by Box and Draper (1959). Using an interesting correspondence between the points of all-bias designs and the knots of quadrature formulas we establish sufficient conditions such that a given design is an all-bias design. The results are illustrated in the special case of spline regression models. In particular our results generalize recent findings of Woods and Lewis (2006).

20.
Weighted analyses for cohort sampling designs   Total citations: 1, self-citations: 1, citations by others: 0
Weighted analysis methods are considered for cohort sampling designs that allow subsampling of both cases and non-cases, but with cases generally sampled more intensively. The methods fit into the general framework for the analysis of survey sampling designs considered by Lin (Biometrika 87:37–47, 2000). Details are given for applying the general methodology in this setting. In addition to considering proportional hazards regression, methods for evaluating the representativeness of the sample and for estimating event-free probabilities are given. In a small simulation study, the one-sample cumulative hazard estimator and its variance estimator were found to be nearly unbiased, but the true coverage probabilities of confidence intervals computed from these sometimes deviated significantly from the nominal levels. Methods for cross-validation and for bootstrap resampling, which take into account the dependencies in the sample, are also considered. An erratum to this article can be found at
