Similar literature
 20 similar documents found (search time: 46 ms)
1.
Large governmental surveys typically provide accurate national statistics. To decrease the mean squared error of estimates for small areas, i.e., domains in which the sample size is small, auxiliary variables from administrative records are often used as covariates in a mixed linear model. It is generally assumed that the auxiliary information is available for every small area. In many cases, though, such information is available for only some of the small areas, either from another survey or from a previous administration of the same survey. The authors propose and study small area estimators that use multivariate models to combine information from several surveys. They discuss computational algorithms, and a simulation study indicates that if quantities in the different surveys are sufficiently correlated, substantial gains in efficiency can be achieved.

2.
In randomized clinical trials, we are often concerned with comparing two-sample survival data. Although the log-rank test is usually suitable for this purpose, it may result in substantial power loss when the two groups have nonproportional hazards. In a more general class of survival models of Yang and Prentice (Biometrika 92:1–17, 2005), which includes the log-rank test as a special case, we improve model efficiency by incorporating auxiliary covariates that are correlated with the survival times. In a model-free form, we augment the estimating equation with auxiliary covariates, and establish the efficiency improvement using the semiparametric theories in Zhang et al. (Biometrics 64:707–715, 2008) and Lu and Tsiatis (Biometrics, 95:674–679, 2008). Under minimal assumptions, our approach produces an unbiased, asymptotically normal estimator with additional efficiency gain. Simulation studies and an application to a leukemia study show the satisfactory performance of the proposed method.

3.
Survey statisticians make use of auxiliary information to improve estimates. One important example is calibration estimation, which constructs new weights that match benchmark constraints on auxiliary variables while remaining “close” to the design weights. Multiple-frame surveys are increasingly used by statistical agencies and private organizations to reduce sampling costs and/or avoid frame undercoverage errors. Several ways of combining estimates derived from such frames have been proposed elsewhere; in this paper, we extend the calibration paradigm, previously used for single-frame surveys, to calculate the total value of a variable of interest in a dual-frame survey. Calibration is a general tool that allows auxiliary information from two frames to be included. It also incorporates, as a special case, certain dual-frame estimators that have been proposed previously. The theoretical properties of our class of estimators are derived and discussed, and simulation studies are conducted to compare the efficiency of the procedure using different sets of auxiliary variables. Finally, the proposed methodology is applied to real data obtained from the Barometer of Culture of Andalusia survey.
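The calibration idea in this abstract can be illustrated with a minimal single-frame sketch. It is not the dual-frame estimator of the paper: it assumes one auxiliary variable, the chi-square distance to the design weights, and made-up data. The function name `linear_calibration_weights` and all numbers are hypothetical.

```python
def linear_calibration_weights(design_weights, x, x_total):
    """Chi-square-distance calibration with a single auxiliary variable.

    Returns weights w_i = d_i * (1 + lam * x_i) chosen so that
    sum(w_i * x_i) equals the known benchmark total x_total.
    """
    # Estimate of the auxiliary total under the original design weights.
    ht_total = sum(d * xi for d, xi in zip(design_weights, x))
    # Closed-form Lagrange multiplier for the one-constraint case.
    denom = sum(d * xi * xi for d, xi in zip(design_weights, x))
    lam = (x_total - ht_total) / denom
    return [d * (1.0 + lam * xi) for d, xi in zip(design_weights, x)]


# Hypothetical sample: four units, each with design weight 10.
design_weights = [10.0, 10.0, 10.0, 10.0]
x = [2.0, 4.0, 6.0, 8.0]      # auxiliary variable observed in the sample
y = [5.0, 9.0, 13.0, 15.0]    # variable of interest
x_total = 220.0               # assumed known population total of x

weights = linear_calibration_weights(design_weights, x, x_total)
calibrated_x_total = sum(w * xi for w, xi in zip(weights, x))
calibrated_y_total = sum(w * yi for w, yi in zip(weights, y))
design_y_total = sum(d * yi for d, yi in zip(design_weights, y))
```

Because the design-weighted total of `x` falls short of the benchmark and `y` increases with `x` in this toy sample, the calibrated estimate of the `y` total is pulled upward relative to the design-weighted one.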

4.
The present study confirms the influential role of positively and negatively correlated auxiliary variables in enhancing the precision of estimates of the current population mean in two-occasion rotation (successive) sampling. Exponential-type estimators of the current population mean are proposed for three situations: (i) information on a positively correlated auxiliary variable is readily available on both occasions; (ii) information on a negatively correlated auxiliary variable is readily available on both occasions; and (iii) information on both positively and negatively correlated auxiliary variables is readily available on both occasions. The characteristics of the proposed estimators are explored, and their performance is compared with that of the natural and recent contemporary estimators. Optimum replacement strategies for the proposed estimation procedures are formulated. Simulation and empirical studies are carried out to justify the proposed estimators, and appropriate recommendations are made to survey practitioners.

5.
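Research on estimation methods for continuous sampling based on the regression composite technique   Total citations: 1, self-citations: 1, citations by others: 0
In continuous sampling surveys with sample rotation, one can use not only information on the study variable from the previous survey period but also auxiliary-variable information from the current period to build a regression model and carry out regression estimation, and from this construct a regression composite estimator. On this basis, the optimal sample rotation rate and the optimal weight coefficient are determined so that the variance of the regression composite estimator is minimized, thereby further improving the estimation precision of continuous sampling surveys.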

6.
Ranked set sampling is a sampling approach that leads to improved statistical inference in situations where the units to be sampled can be ranked relative to each other prior to formal measurement. This ranking may be done either by subjective judgment or according to an auxiliary variable, and it need not be completely accurate. In fact, results in the literature have shown that no matter how poor the quality of the ranking, procedures based on ranked set sampling tend to be at least as efficient as procedures based on simple random sampling. However, efforts to quantify the gains in efficiency for ranked set sampling procedures have been hampered by a shortage of available models for imperfect rankings. In this paper, we introduce a new class of models for imperfect rankings, and we provide a rigorous proof that essentially any reasonable model for imperfect rankings is a limit of models in this class. We then describe a specific, easily applied method for selecting an appropriate imperfect rankings model from the class.
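The efficiency claim for ranked set sampling can be checked with a small simulation. The sketch below assumes perfect ranking on the measured variable itself and a standard normal population, so it does not implement the imperfect-ranking model class of the paper; the function names, seed, set size, and replication count are all illustrative choices.

```python
import random
import statistics


def srs_mean(rng, n):
    """Mean of a simple random sample of n standard normal draws."""
    return statistics.fmean([rng.gauss(0.0, 1.0) for _ in range(n)])


def rss_mean(rng, set_size):
    """Mean of one balanced ranked set sample cycle with perfect ranking.

    For each rank r, draw a fresh set of `set_size` units, rank them,
    and measure only the unit holding rank r.
    """
    measured = []
    for rank in range(set_size):
        candidates = sorted(rng.gauss(0.0, 1.0) for _ in range(set_size))
        measured.append(candidates[rank])
    return statistics.fmean(measured)


rng = random.Random(12345)
replications = 5000
set_size = 3  # both designs measure 3 units per replication

srs_var = statistics.pvariance(
    [srs_mean(rng, set_size) for _ in range(replications)]
)
rss_var = statistics.pvariance(
    [rss_mean(rng, set_size) for _ in range(replications)]
)
relative_efficiency = srs_var / rss_var
```

Both designs measure the same number of units; ranked set sampling additionally requires ranking nine units per cycle, which is the cost that the auxiliary or judgment ranking is meant to keep low.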

7.
In many socio-economic surveys the objective is estimation of the total or proportion of persons with a particular attribute. Multi-stage area samples are drawn from geographic strata, and the population within areal units is used as an auxiliary variable in ratio estimation. For large administrative areas, the auxiliary variable totals are available as population projections based on the last census. For small areas, however, population changes are significantly affected by non-demographic factors, and hence sufficiently reliable projections are not available. In such situations the efficiency of design-based estimators for small areas can be improved by a ratio adjustment based on the auxiliary variable total for a large area. An inequality on the efficiency of the ratio-adjusted estimator is established, and its bias and variance are investigated.
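One plausible reading of the ratio adjustment described here is to scale a direct small-area estimate by the ratio of the known large-area auxiliary total to its survey estimate. The sketch below assumes that form; the abstract does not give the formula, so the function name and the numbers are hypothetical.

```python
def ratio_adjusted_estimate(small_area_estimate,
                            large_area_aux_estimate,
                            large_area_aux_total):
    """Scale a direct small-area estimate by a large-area ratio.

    large_area_aux_estimate: survey estimate of the auxiliary total
        (e.g. population) for the large area containing the small area.
    large_area_aux_total: known or projected value of that same total.
    """
    adjustment = large_area_aux_total / large_area_aux_estimate
    return small_area_estimate * adjustment


# Hypothetical figures: the survey underestimates the large-area
# population (48,000 estimated vs. 50,000 projected), so the
# small-area estimate is scaled up by the same factor.
adjusted = ratio_adjusted_estimate(
    small_area_estimate=1200.0,
    large_area_aux_estimate=48000.0,
    large_area_aux_total=50000.0,
)
```

The adjustment helps when the sampling error in the small-area estimate moves together with the error in the large-area auxiliary estimate, which is the kind of condition an efficiency inequality would formalize.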

8.
In successive sampling, some recent works use super-population models that exploit information on an auxiliary variable that is stable over occasions. This stability may not hold if the duration between occasions is large. To cope with such situations, the present work develops estimation procedures that use information on two independent auxiliary variables through a linear super-population model. Estimators are proposed for the current population mean in two-occasion successive (rotation) sampling. Optimum replacement strategies are formulated and the performance of the proposed estimators is discussed. Results are interpreted through empirical studies.

9.
The authors describe a method for fitting failure time mixture models that postulate the existence of both susceptibles and long-term survivors when covariate data are only partially observed. Their method is based on a joint model that combines a Weibull regression model for the susceptibles, a logistic regression model for the probability of being a susceptible, and a general location model for the distribution of the covariates. A Bayesian approach is taken, and Gibbs sampling is used to fit the model to the incomplete data. An application to clinical data on tonsil cancer and a small Monte Carlo study indicate potentially large gains in efficiency over standard complete-case analysis as well as reasonable performance in a variety of situations.

10.
In many biomedical studies, budget constraints mean that the primary covariate is collected only in a randomly selected subset of the full study cohort. Often, an inexpensive auxiliary covariate for the primary exposure variable is readily available for all cohort subjects. Valid statistical methods that make use of the auxiliary information to improve study efficiency need to be developed. To this end, we develop an estimated partial likelihood approach for correlated failure time data with auxiliary information. We assume a marginal hazard model with a common baseline hazard function. The asymptotic properties of the proposed estimators are developed. The proof of the asymptotic results is nontrivial since the moments used in the estimating equation are not martingale-based and classical martingale theory is not sufficient. Instead, our proofs rely on modern empirical process theory. The proposed estimator is evaluated through simulation studies and is shown to have increased efficiency compared to existing methods. The proposed method is illustrated with a data set from the Framingham study.

11.
The literature on multivariate linear regression includes multivariate normal models, models that are used in survival analysis and a variety of models that are used in other areas such as econometrics. The paper considers the class of location–scale models, which includes a large proportion of the preceding models. It is shown that, for complete data, the maximum likelihood estimators for regression coefficients in a linear location–scale framework are consistent even when the joint distribution is misspecified. In addition, gains in efficiency arising from the use of a bivariate model, as opposed to separate univariate models, are studied. A major area of application for multivariate regression models is to clustered, 'parallel' lifetime data, so we also study the case of censored responses. Estimators of regression coefficients are no longer consistent under model misspecification, but we give simulation results that show that the bias is small in many practical situations. Gains in efficiency from bivariate models are also examined in the censored data setting. The methodology in the paper is illustrated by using lifetime data from the Diabetic Retinopathy Study.

12.
The present investigation addresses the problem of estimating a finite population mean in two-phase cluster sampling in the presence of random non-response. Utilizing information on an auxiliary variable, regression-type estimators have been proposed. Effective imputation techniques have been suggested to deal with random non-response. The properties of the proposed estimation strategies have been studied for different cases of random non-response arising in practical surveys. The superiority of the suggested methodology over the natural sample mean estimator of the population mean has been established through empirical studies carried out on data sets from natural and artificially generated populations.

13.
Randomized response methods for quantitative sensitive data are treated in a unified approach which includes the use of auxiliary information at the estimation stage. A class of estimators for the mean of a sensitive variable is proposed under a generic randomization model and the optimum estimator is obtained. Some special models are discussed in detail. To evaluate the degree of respondents’ confidentiality in models using auxiliary variables, a new measure of privacy protection is introduced. Different models are then compared from the perspectives of both efficiency and privacy protection.

14.
Whenever auxiliary information is available in any form, researchers want to utilize it in the method of estimation to obtain the most efficient estimator. When there is sufficient correlation between the study and auxiliary variables, the ranks of the auxiliary variable are also correlated with the study variable and can be used as a valuable device for enhancing the precision of an estimator. This article addresses the problem of estimating the finite population mean in the presence of non-response using the complementary information provided by (i) the auxiliary variable and (ii) the ranks of the auxiliary variable. We suggest an improved estimator of the finite population mean that uses this auxiliary information in the presence of non-response. Expressions for the bias and mean squared error of the considered estimators are derived up to the first order of approximation. The performance of the estimators is compared theoretically and through a numerical study. It is observed that the proposed estimator is more efficient than the usual sample mean and regression estimators, and than some other families of ratio and exponential-type estimators.

15.
The article considers Bayesian analysis of hierarchical models for count, binomial and multinomial data using efficient MCMC sampling procedures. To this end, an improved method of auxiliary mixture sampling is proposed. In contrast to previously proposed samplers the method uses a bounded number of latent variables per observation, independent of the intensity of the underlying Poisson process in the case of count data, or of the number of experiments in the case of binomial and multinomial data. The bounded number of latent variables results in a more general error distribution, which is a negative log-Gamma distribution with arbitrary integer shape parameter. The required approximations of these distributions by Gaussian mixtures have been computed. Overall, the improvement leads to a substantial increase in efficiency of auxiliary mixture sampling for highly structured models. The method is illustrated for finite mixtures of generalized linear models and an epidemiological case study.

16.
This paper studies penalized quantile regression for dynamic panel data with fixed effects, where the penalty involves ℓ1 shrinkage of the fixed effects. Using extensive Monte Carlo simulations, we present evidence that the penalty term reduces the dynamic panel bias and increases the efficiency of the estimators. The underlying intuition is that there is no need to use instrumental variables for the lagged dependent variable in the dynamic panel data model without fixed effects. This provides an additional use for the shrinkage models, other than model selection and efficiency gains. We propose a Bayesian information criterion based estimator for the parameter that controls the degree of shrinkage. We illustrate the usefulness of the novel econometric technique by estimating a “target leverage” model that includes a speed of capital structure adjustment. Using the proposed penalized quantile regression model, the estimates of the adjustment speeds lie between 3% and 44% across the quantiles, showing strong evidence that there is substantial heterogeneity in the speed of adjustment among firms.

17.
We propose correcting for non-compliance in randomized trials by estimating the parameters of a class of semi-parametric failure time models, the rank preserving structural failure time models, using a class of rank estimators. These models are the structural or strong version of the “accelerated failure time model with time-dependent covariates” of Cox and Oakes (1984). In this paper we develop a large sample theory for these estimators, derive the optimal estimator within this class, and briefly consider the construction of “partially adaptive” estimators whose efficiency may approach that of the optimal estimator. We show that in the absence of censoring the optimal estimator attains the semiparametric efficiency bound for the model.

18.
In many randomized clinical trials, the primary response variable, for example, the survival time, is not observed directly after the patients enroll in the study but rather after some period of time (lag time). Such a response variable is often missing for some patients due to censoring, which occurs when the study ends before the patient’s response is observed or when the patient drops out of the study. It is often assumed that censoring occurs at random, which is referred to as noninformative censoring; however, in many cases such an assumption may not be reasonable. If the missing data are not analyzed properly, the estimator or test for the treatment effect may be biased. In this paper, we use semiparametric theory to derive a class of consistent and asymptotically normal estimators for the treatment effect parameter which are applicable when the response variable is right censored. The baseline auxiliary covariates and post-treatment auxiliary covariates, which may be time-dependent, are also considered in our semiparametric model. These auxiliary covariates are used to derive estimators that both account for informative censoring and are more efficient than the estimators which do not consider the auxiliary covariates.

19.
In this paper we consider the calibration procedure for a rare sensitive attribute with a Poisson distribution, as suggested by Land et al. (2012), using auxiliary information associated with the variable of interest. In the calibration procedure, we can use auxiliary information from an external source, such as socio-demographic variables for the respondents to rare sensitive attribute questions, and the estimator can then be improved with respect to the problems of non-coverage or non-response. From the efficiency comparison study, we show that the calibrated Poisson RR estimators are more efficient than those of Land et al. (2012) when the known population cell and marginal counts of auxiliary information are used for the calibration procedure.

20.
In this paper we propose a modified version of the estimator of Hansen and Hurwitz [12] for the case of a quantitative sensitive variable and consider a randomization mechanism on the second call that provides privacy protection to the respondents in order to obtain truthful information. We use the variance of the modified estimator as a tool to measure privacy protection, and observe that the higher the variance, the lower the efficiency but the higher the privacy protection. To overcome this efficiency loss, we consider a linear regression estimator using known non-sensitive auxiliary information. Considering four scrambled models, we seek a trade-off between efficiency and privacy protection. To show this compromise, analytical and numerical comparisons are obtained.
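The efficiency-versus-privacy trade-off can be seen in a generic additive scrambling model, in which each respondent reports the sensitive value plus noise drawn from a known distribution. The sketch below is only that generic model, not the modified Hansen and Hurwitz estimator or any of the four scrambled models of the paper; the population, the scrambling distribution, and the seed are assumptions.

```python
import random
import statistics

rng = random.Random(2024)

# Hypothetical sensitive values; the interviewer never sees these.
true_values = [rng.uniform(0.0, 100.0) for _ in range(20000)]

# Scrambling noise S with known mean and standard deviation (assumed).
mean_s, sd_s = 50.0, 5.0

# Each respondent reports Z = Y + S instead of Y.
reported = [y + rng.gauss(mean_s, sd_s) for y in true_values]

# Since E[Z] = E[Y] + E[S], subtracting the known mean of S gives an
# unbiased estimate of the mean of Y.
estimated_mean = statistics.fmean(reported) - mean_s
true_mean = statistics.fmean(true_values)

# The price of privacy: Var(Z) = Var(Y) + Var(S) when S is independent
# of Y, so the estimator is noisier than one based on direct answers.
reported_variance = statistics.pvariance(reported)
true_variance = statistics.pvariance(true_values)
```

Raising `sd_s` hides individual answers better but inflates the variance of the estimated mean, which is the loss that an auxiliary-variable regression estimator would aim to recover.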
