Similar literature
Found 20 similar articles; search took 421 ms
1.
2.
Communications in Statistics - Theory and Methods, 2012, 41(16-17): 3278-3300
Under complex survey sampling, in particular when selection probabilities depend on the response variable (informative sampling), the sample and population distributions differ, possibly resulting in selection bias. This article addresses the problem by fitting two statistical models, namely the variance components model (a two-stage model) and the fixed effects model (a single-stage model) for one-way analysis of variance, under complex survey designs such as two-stage sampling, stratification, and unequal probability of selection. Classical theory underlying the use of the two-stage model involves simple random sampling at each of the two stages. In such cases the model in the sample, after sample selection, is the same as the model for the population before sample selection. When the selection probabilities are related to the values of the response variable, standard estimates of the population model parameters may be severely biased, possibly leading to false inference. The idea behind the approach is to extract the model holding for the sample data as a function of the model in the population and of the first-order inclusion probabilities, and then to fit the sample model using analysis of variance, maximum likelihood, and pseudo maximum likelihood methods of estimation. The main feature of the proposed techniques is related to their behavior in terms of the informativeness parameter. We also show that using the population model, which ignores the informative sampling design, yields biased model fitting.

3.
Calibration on the available auxiliary variables is widely used to increase the precision of parameter estimates. Singh and Sedory [Two-step calibration of design weights in survey sampling. Commun Stat Theory Methods. 2016;45(12):3510–3523.] considered two-step calibration of design weights with a single auxiliary variable. In the first step, for a given sample, the design weights and calibrated weights are set proportional to each other. In the second step, the value of the proportionality constant is determined by the objectives of the individual investigator or user, for example, to obtain minimum mean squared error or to reduce bias. In this paper, we suggest using two auxiliary variables for two-step calibration of the design weights and compare the results with those for a single auxiliary variable for different sample sizes, based on simulated and real-life data sets. The simulated and real-life application results show that the two-step calibration estimator based on two auxiliary variables outperforms the estimator based on a single auxiliary variable in terms of minimum mean squared error.

4.
This paper considers the effects of informative two-stage cluster sampling on estimation and prediction. The aims of this article are twofold: first, to estimate the parameters of the superpopulation model for two-stage cluster sampling from a finite population, when the sampling design for both stages is informative, using maximum likelihood estimation methods based on the sample-likelihood function; second, to predict the finite population total, the cluster-specific effects, and the cluster totals for clusters in the sample and for clusters not in the sample. To achieve this we derive the sample and sample-complement distributions and the moments of the first and second stage measurements. We also derive the conditional sample and conditional sample-complement distributions and the moments of the cluster-specific effects given the cluster measurements. It should be noted that classical design-based inference, which consists of weighting the sample observations by the inverse of the sample selection probabilities, cannot be applied to the prediction of the cluster-specific effects for clusters not in the sample. We also give an alternative justification of the Royall [1976. The linear least squares prediction approach to two-stage sampling. Journal of the American Statistical Association 71, 657–664] predictor of the finite population total under two-stage cluster population. Furthermore, small-area models are studied under informative sampling.

5.
Semiparametric regression models with multiple covariates are commonly encountered. When some covariates are not associated with the response variable, variable selection may lead to sparser models, more lucid interpretations, and more accurate estimation. In this study, we adopt a sieve approach for the estimation of nonparametric covariate effects in semiparametric regression models. We adopt a two-step iterated penalization approach for variable selection. In the first step, a mixture of the Lasso and group Lasso penalties is employed to conduct the first-round variable selection and obtain the initial estimate. In the second step, a mixture of the weighted Lasso and weighted group Lasso penalties, with weights constructed using the initial estimate, is employed for variable selection. We show that the proposed iterated approach has the variable selection consistency property, even when the number of unknown parameters diverges with the sample size. Numerical studies, including simulation and analysis of a diabetes dataset, show satisfactory performance of the proposed approach.

6.
Flexible designs offer a large amount of flexibility in clinical trials with control of the type I error rate. This allows the combination of trials from different clinical phases of a drug development process. Such combinations require designs where hypotheses are selected and/or added at an interim analysis without knowing the selection rule in advance, so that both flexibility and multiplicity issues arise. The paper reviews the basic principles and some of the common methods for reaching flexibility while controlling the family-wise error rate in the strong sense. Flexible designs have been criticized because they may lead to different weights for the patients from the different stages when reassessing sample sizes. Analyzing the data in a conventional way avoids such unequal weighting but may inflate the multiple type I error rate. In cases where the conditional type I error rates of the new design (and conventional analysis) are below the conditional type I error rates of the initial design, the conventional analysis may, however, be done without inflating the type I error rate. Focusing on a parallel group design with two treatments and a common control, we use this principle to investigate when we can select one treatment, reassess sample sizes, and test the corresponding null hypotheses by the conventional level alpha z-test without compromising on the multiple type I error rate.

7.
In non‐randomized biomedical studies using the proportional hazards model, the data often constitute an unrepresentative sample of the underlying target population, which results in biased regression coefficients. The bias can be avoided by weighting included subjects by the inverse of their respective selection probabilities, as proposed by Horvitz & Thompson (1952) and extended to the proportional hazards setting for use in surveys by Binder (1992) and Lin (2000). In practice, the weights are often estimated and must be treated as such in order for the resulting inference to be accurate. The authors propose a two‐stage weighted proportional hazards model in which, at the first stage, weights are estimated through a logistic regression model fitted to a representative sample from the target population. At the second stage, a weighted Cox model is fitted to the biased sample. The authors propose estimators for the regression parameter and cumulative baseline hazard. They derive the asymptotic properties of the parameter estimators, accounting for the difference in the variance introduced by the randomness of the weights. They evaluate the accuracy of the asymptotic approximations in finite samples through simulation. They illustrate their approach in an analysis of renal transplant patients using data obtained from the Scientific Registry of Transplant Recipients.

8.
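A self-weighting stratified multistage sampling design has three main features: first, the sample sizes at every stage other than the first are constant; second, the sample is allocated to the strata in proportion to the number of ultimate units in each stratum; third, the earlier stages use sampling while the last stage uses simple random sampling with or without replacement. Based on these three features, a self-weighting sampling design is developed for China's population change survey.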
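
The self-weighting property described above can be illustrated with a minimal sketch. It assumes a two-stage design in which the first stage selects clusters with probability proportional to size and the second stage takes a fixed number of units by simple random sampling; the abstract does not state the first-stage method, so that choice, the cluster sizes, and the function name are all hypothetical.

```python
from fractions import Fraction

def inclusion_probability(n_clusters, cluster_size, total_size, units_per_cluster):
    """Overall inclusion probability of one unit under the assumed design."""
    # Assumed first stage: probability proportional to size, n * M_i / M.
    first_stage = Fraction(n_clusters * cluster_size, total_size)
    # Second stage: simple random sample of a constant m units out of M_i.
    second_stage = Fraction(units_per_cluster, cluster_size)
    return first_stage * second_stage

cluster_sizes = [120, 300, 80, 500]  # hypothetical cluster sizes M_i
total = sum(cluster_sizes)
probs = [inclusion_probability(2, size, total, 10) for size in cluster_sizes]
print(probs)  # the cluster size cancels, so every unit has the same probability
```

Because the cluster size cancels between the two stages, every unit carries the same weight, which is what makes such a design self-weighting.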

9.
In this article, a new two-step calibration technique of design weights is proposed. In the first step, the calibration weights are set proportional to the design weights in a given sample. In the second step, the constants of proportionality are determined based on different objectives of the investigator such as bias reduction or minimum mean squared error. Many estimators available in the literature can be shown to be special cases of the proposed two-step calibrated estimator. A simulation study, based on a real data set, is also included at the end. A few technical issues are raised with respect to the use of the proposed calibration technique: both limitations and benefits are discussed.

10.
We present methodology for estimating age-specific reference ranges by using data from two-stage samples. On the basis of the information obtained in the first stage, the initial sample is stratified and random subsamples are drawn from each stratum, where the selection probabilities in this second-stage sampling may be different across strata in the population. The variable for which the reference ranges are to be established is measured at the second phase. The approach involves maximum likelihood estimation of the parameters of the age-specific distributions and separate estimation of the population stratum probabilities. These are combined to yield estimates of the quantiles of interest. The issue of variance estimation for the estimated quantiles is also addressed. The methodology is applied to the estimation of reference ranges for a cognitive test score in a study of non-demented older Japanese-Americans.

11.
Often the variables in a regression model are difficult or expensive to obtain, so auxiliary variables are collected in a preliminary step of a study and the model variables are measured at later stages on only a subsample of the study participants, called the validation sample. We consider a study in which at the first stage some variables, throughout called auxiliaries, are collected; at the second stage the true outcome is measured on a subsample of the first-stage sample; and at the third stage the true covariates are collected on a subset of the second-stage sample. In order to increase efficiency, the probabilities of selection into the second and third-stage samples are allowed to depend on the data observed at the previous stages. In this paper we describe a class of inverse-probability-of-selection-weighted semiparametric estimators for the parameters of the model for the conditional mean of the outcomes given the covariates. We assume that a subject's probability of being sampled at subsequent stages is bounded away from zero and depends only on the subject's data collected at the previous sampling stages. We show that the asymptotic variance of the optimal estimator in our class is equal to the semiparametric variance bound for the model. Since the optimal estimator depends on unknown population parameters, it is not available for data analysis. We therefore propose an adaptive estimation procedure for locally efficient inferences. A simulation study is carried out to study the finite sample properties of the proposed estimators.

12.
The success rate of drug development has declined dramatically in recent years and the current paradigm of drug development is no longer functioning. It requires a major undertaking on breakthrough strategies and methodology for designs to minimize sample sizes and to shorten the duration of development. We propose an alternative phase II/III design based on continuous efficacy endpoints, which consists of two stages: a selection stage and a confirmation stage. For the selection stage, a randomized parallel design with several doses and a placebo group is employed for selection of doses. After the best dose is chosen, the patients of the selected dose group and placebo group continue to enter the confirmation stage. New patients will also be recruited and randomized to receive the selected dose or placebo. The final analysis is performed with the cumulative data of patients from both stages. With the pre‐specified probabilities of rejecting the drug at each stage, sample sizes and critical values for both stages can be determined. As it is a single trial that controls the overall type I and II error rates, the proposed phase II/III adaptive design may not only reduce the sample size but also improve the success rate. An example illustrates the applications of the proposed phase II/III adaptive design. Copyright © 2010 John Wiley & Sons, Ltd.

13.
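From the perspective of unequal selection probabilities, this paper presents two common types of weight adjustment and their methods: scale adjustment, which makes the sum of the sample unit weights equal to the population size, and structure adjustment, which makes the sample structure match the population structure. It also constructs a design effect model for weighting adjustment and applies it to complex sample designs. A case analysis shows that weighting adjustment often increases the design effect, which is a negative effect, whereas calibration adjustment can reduce the design effect and improve the precision of estimates.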
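
A minimal sketch of the scale adjustment described above follows. The abstract does not give its design effect model, so the sketch substitutes Kish's well-known approximation for the design effect due to unequal weighting, deff = n * sum(w^2) / (sum(w))^2, as an assumption; the function names and the weights are hypothetical.

```python
def scale_adjust(weights, population_size):
    """Rescale weights so that they sum to the population size."""
    total = sum(weights)
    return [w * population_size / total for w in weights]

def kish_deff(weights):
    """Kish's approximate design effect from unequal weights (an assumed stand-in)."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2

base_weights = [10.0, 10.0, 40.0, 40.0]  # hypothetical base weights
scaled = scale_adjust(base_weights, 1000)

print(sum(scaled))              # equals the population size after adjustment
print(kish_deff(base_weights))  # above 1 because the weights are unequal
print(kish_deff(scaled))        # unchanged: pure rescaling does not alter this measure
```

Under this approximation a pure rescaling leaves the design effect unchanged, so any increase would have to come from adjustments that make the weights more variable.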

14.
Sampling designs dependent on sample moments of auxiliary variables are well known. Lahiri (Bull Int Stat Inst 33:133–140, 1951) considered a sampling design proportionate to a sample mean of an auxiliary variable. Singh and Srivastava (Biometrika 67(1):205–209, 1980) proposed a sampling design proportionate to a sample variance, while Wywiał (J Indian Stat Assoc 37:73–87, 1999) proposed a sampling design proportionate to a sample generalized variance of auxiliary variables. Some other sampling designs dependent on moments of an auxiliary variable were considered, e.g., in Wywiał (Some contributions to multivariate methods in survey sampling. Katowice University of Economics, Katowice, 2003a; Stat Transit 4(5):779–798, 2000), where the accuracy of some sampling strategies was compared, too. These sampling designs cannot be useful when there are censored observations of the auxiliary variable. Moreover, they can be much too sensitive to outlying observations. In these cases a sampling design proportionate to an order statistic of an auxiliary variable can be more useful. That is why such an unequal probability sampling design is proposed here. Its particular cases as well as its conditional version are considered, too. A sampling scheme implementing this sampling design is proposed. The inclusion probabilities of the first and second orders are evaluated. The well-known Horvitz–Thompson estimator is taken into account. A ratio estimator dependent on an order statistic is constructed. It is similar to the well-known ratio estimator based on the population and sample means. Moreover, it is an unbiased estimator of the population mean when the sample is drawn according to the proposed sampling design dependent on the appropriate order statistic.
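
The Horvitz–Thompson estimator mentioned above can be sketched briefly. For simplicity the sketch assumes simple random sampling without replacement rather than the order-statistic design proposed in the article, and the population values and function name are hypothetical. Enumerating every possible sample shows that the estimator averages to the true total.

```python
from itertools import combinations

def ht_total(sample_values, inclusion_probs):
    """Horvitz-Thompson estimate of a total: each value divided by its inclusion probability."""
    return sum(y / p for y, p in zip(sample_values, inclusion_probs))

population = [3.0, 5.0, 8.0, 12.0]  # hypothetical study variable
n = 2
pi = n / len(population)  # first-order inclusion probability under the assumed design

# Every sample of size n is equally likely, so a plain average is the expectation.
estimates = [ht_total(sample, [pi] * n) for sample in combinations(population, n)]
mean_estimate = sum(estimates) / len(estimates)

print(mean_estimate, sum(population))  # the two agree, illustrating unbiasedness
```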

15.
A technique of systematically allocating a sample to the strata formed by double stratification is presented. The method can proportionally allocate the sample along each variable of stratification. If there are R strata and C strata for the first and second variable of stratification respectively, the technique requires that the total sample size be at least as large as max(R, C). An unbiased estimator of the population mean is given and its variance is obtained. The technique is compared with a random allocation procedure given by Bryant, Hartley, and Jessen (1960). Numerical examples are given suggesting when one technique is superior to the other.

16.
The author considers the use of auxiliary information available at the population level to improve the estimation of finite population totals. She introduces a new type of model‐assisted estimator based on nonparametric regression splines. The estimator is a weighted linear combination of the study variable with weights calibrated to the known population totals of the B‐splines. The author shows that the estimator is asymptotically design‐unbiased and consistent under conditions which do not require the superpopulation model to be correct. She proposes a design‐based variance approximation and shows that the anticipated variance is asymptotically equivalent to the Godambe‐Joshi lower bound. She also shows through simulations that the estimator has good properties.

17.
This article proposes a new mixed variable lot-size multiple dependent state sampling plan in which an attribute sampling plan is used in the first stage and a variables multiple dependent state sampling plan based on the process capability index is used in the second stage for the inspection of measurable quality characteristics. The proposed mixed plan is developed for both symmetric and asymmetric fraction nonconforming. The optimal plan parameters can be determined by considering the satisfaction levels of the producer and the consumer simultaneously at an acceptable quality level and a limiting quality level, respectively. The performance of the proposed plan relative to the mixed single sampling plan based on Cpk and the mixed variable lot-size plan based on Cpk, with respect to the average sample number, is also investigated. Tables are constructed for easy selection of plan parameters for both symmetric and asymmetric fraction nonconforming, and real-world examples are given to illustrate the practical implementation of the proposed mixed variable lot-size plan.

18.
Sarjinder Singh, Statistics, 2013, 47(3): 566-574
In this note, a dual problem to the calibration of design weights by the Deville and Särndal [Calibration estimators in survey sampling, J. Amer. Statist. Assoc. 87 (1992), pp. 376–382] method is considered. We conclude that, to obtain the linear regression estimator in survey sampling, the chi-squared distance between the design weights and the calibrated weights equals the square of the standardized Z-score obtained by dividing the difference between the known population total of the auxiliary variable and its corresponding Horvitz and Thompson [A generalization of sampling without replacement from a finite universe, J. Amer. Statist. Assoc. 47 (1952), pp. 663–685] estimator by the sample standard deviation of the auxiliary variable.
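
The relation between the chi-squared distance and the calibration constraint can be checked numerically with a minimal sketch. It assumes the standard linear calibration with a single auxiliary variable and unit tuning constants, so that w_i = d_i * (1 + lambda * x_i); the note's exact standardization is not reproduced here, and the data and function name are hypothetical.

```python
def calibrate(design_weights, x, x_total):
    """Linear (chi-squared distance) calibration with one auxiliary variable."""
    x_ht = sum(d * xi for d, xi in zip(design_weights, x))        # Horvitz-Thompson total of x
    denom = sum(d * xi * xi for d, xi in zip(design_weights, x))  # sum of d_i * x_i^2
    lam = (x_total - x_ht) / denom                                # Lagrange multiplier
    weights = [d * (1 + lam * xi) for d, xi in zip(design_weights, x)]
    return weights, x_ht, denom

d = [10.0, 10.0, 10.0, 10.0]  # hypothetical design weights
x = [1.0, 2.0, 3.0, 4.0]      # hypothetical auxiliary values in the sample
X_total = 110.0               # hypothetical known population total of x

w, x_ht, denom = calibrate(d, x, X_total)
distance = sum((wi - di) ** 2 / di for wi, di in zip(w, d))
closed_form = (X_total - x_ht) ** 2 / denom

print(sum(wi * xi for wi, xi in zip(w, x)))  # reproduces the known total
print(distance, closed_form)                 # the distance is a squared standardized difference
```

Under these assumptions the distance reduces to the squared difference between the known total and its Horvitz-Thompson estimate, divided by a scale term, which is the kind of squared standardized quantity the note describes.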

19.
Impulse response functions are often used to investigate the relationships between the components of a VAR (vector autoregressive) process. A hypothesis of particular interest is that a variable does not react to impulses in another variable, i.e., the impulse responses are zero. Two types of tests for such hypotheses are considered. The first type is based on finite-order VAR assumptions and the second allows for possibly infinite-order processes. It is found that both types of tests have to be used cautiously because small sample and asymptotic properties may differ substantially.
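
The hypothesis of zero impulse responses can be made concrete with a minimal sketch. It assumes a known VAR(1) process y_t = A y_{t-1} + u_t, for which the (non-orthogonalized) impulse response at horizon h is the matrix power A^h; the coefficient matrix and function names are hypothetical, and no estimation or testing is performed.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def impulse_responses(A, horizons):
    """Return [I, A, A^2, ..., A^horizons] for an assumed VAR(1) process."""
    k = len(A)
    phi = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    responses = [phi]
    for _ in range(horizons):
        phi = matmul(A, phi)
        responses.append(phi)
    return responses

# Hypothetical coefficients: variable 1 does not depend on lagged variable 2.
A = [[0.5, 0.0],
     [0.3, 0.4]]
irf = impulse_responses(A, 5)

# Entry [i][j] is the response of variable i to a unit impulse in variable j.
print([step[0][1] for step in irf])  # variable 1 never reacts to variable 2
print([step[1][0] for step in irf])  # variable 2 does react to variable 1
```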

20.
Suppose that the conditional density of a response variable given a vector of explanatory variables is parametrically modelled, and that data are collected by a two-phase sampling design. First, a simple random sample is drawn from the population. The stratum membership in a finite number of strata of the response and explanatory variables is recorded for each unit. Second, a subsample is drawn from the phase-one sample such that the selection probability is determined by the stratum membership. The response and explanatory variables are fully measured at this phase. We synthesize existing results on nonparametric likelihood estimation and present a streamlined approach for the computation and the large sample theory of profile likelihood in four different situations. The amount of information in terms of data and assumptions varies depending on whether the phase-one data are retained, the selection probabilities are known, and/or the stratum probabilities are known. We establish and illustrate numerically the order of efficiency among the maximum likelihood estimators, according to the amount of information utilized, in the four situations.
