Similar literature
1.
Two‐phase sampling is often used for estimating a population total or mean when the cost per unit of collecting auxiliary variables, x, is much smaller than the cost per unit of measuring a characteristic of interest, y. In the first phase, a large sample $s_1$ is drawn according to a specific sampling design $p(s_1)$, and auxiliary data x are observed for the units $i \in s_1$. Given the first‐phase sample $s_1$, a second‐phase sample $s_2$ is selected from $s_1$ according to a specified sampling design $\{p(s_2 \mid s_1)\}$, and (y, x) is observed for the units $i \in s_2$. In some cases, the population totals of some components of x may also be known. Two‐phase sampling is used for stratification at the second phase or both phases and for regression estimation. Horvitz–Thompson‐type variance estimators are used for variance estimation. However, the Horvitz–Thompson (Horvitz & Thompson, J. Amer. Statist. Assoc. 1952) variance estimator in uni‐phase sampling is known to be highly unstable and may take negative values when the units are selected with unequal probabilities. On the other hand, the Sen–Yates–Grundy variance estimator is relatively stable and non‐negative for several unequal probability sampling designs with fixed sample sizes. In this paper, we extend the Sen–Yates–Grundy (Sen, J. Ind. Soc. Agric. Statist. 1953; Yates & Grundy, J. Roy. Statist. Soc. Ser. B 1953) variance estimator to two‐phase sampling, assuming fixed first‐phase sample size and fixed second‐phase sample size given the first‐phase sample. We apply the new variance estimators to two‐phase sampling designs with stratification at the second phase or both phases. We also develop Sen–Yates–Grundy‐type variance estimators of the two‐phase regression estimators that make use of the first‐phase auxiliary data and known population totals of some of the auxiliary variables.
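As a rough illustration of the single-phase building blocks this abstract refers to, the sketch below computes the Horvitz–Thompson estimate of a total and the Sen–Yates–Grundy variance estimate for a fixed-size design. It is not the two-phase extension proposed in the paper. The function names `ht_total` and `syg_variance` and the toy data are assumptions made for the example.

```python
import numpy as np


def ht_total(y, pi):
    """Horvitz-Thompson estimate of a population total from sampled values y
    and first-order inclusion probabilities pi."""
    y = np.asarray(y, dtype=float)
    pi = np.asarray(pi, dtype=float)
    return float(np.sum(y / pi))


def syg_variance(y, pi, pi_joint):
    """Sen-Yates-Grundy variance estimate for a fixed-size, single-phase design.

    pi_joint[i, j] is the joint inclusion probability of sampled units i and j;
    the diagonal is ignored.
    """
    y = np.asarray(y, dtype=float)
    pi = np.asarray(pi, dtype=float)
    pij = np.array(pi_joint, dtype=float)   # copy, so the caller's array is untouched
    np.fill_diagonal(pij, 1.0)              # placeholder to avoid dividing by zero
    weight = (np.outer(pi, pi) - pij) / pij
    np.fill_diagonal(weight, 0.0)           # only pairs i != j contribute
    z = y / pi
    diff2 = (z[:, None] - z[None, :]) ** 2
    # The factor 0.5 corrects for counting each unordered pair twice.
    return float(0.5 * np.sum(weight * diff2))


# Hypothetical example: simple random sampling without replacement, n = 4 of N = 10.
N, n = 10, 4
y_sample = np.array([3.0, 5.0, 2.0, 8.0])
pi = np.full(n, n / N)
pi_joint = np.full((n, n), n * (n - 1) / (N * (N - 1)))

total_hat = ht_total(y_sample, pi)
var_hat = syg_variance(y_sample, pi, pi_joint)
```

Under simple random sampling without replacement, the Sen–Yates–Grundy estimate coincides with the textbook formula $N^2 (1 - n/N) s^2 / n$, which gives a convenient sanity check for the implementation.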

2.
Extreme value theory models have found applications in myriad fields. Maximum likelihood (ML) is attractive for fitting the models because it is statistically efficient and flexible. However, in small samples, ML is biased to $O(N^{-1})$ and some classical hypothesis tests suffer from size distortions. This paper derives the analytical Cox–Snell bias correction for the generalized extreme value (GEV) model, and for the model's extension to multiple order statistics (GEVr). Using simulations, the paper compares this correction to bootstrap-based bias corrections, for the generalized Pareto, GEV, and GEVr. It then compares eight approaches to inference with respect to primary parameters and extreme quantiles, some including corrections. The Cox–Snell correction is not markedly superior to bootstrap-based correction. The likelihood ratio test appears most accurately sized. The methods are applied to the distribution of geomagnetic storms.

3.
Longitudinal surveys have emerged in recent years as an important data collection tool for population studies where the primary interest is to examine population changes over time at the individual level. Longitudinal data are often analyzed through the generalized estimating equations (GEE) approach. The vast majority of the existing literature on the GEE method, however, is developed under non‐survey settings and is inappropriate for data collected through complex sampling designs. In this paper the authors develop a pseudo‐GEE approach for the analysis of survey data. They show that survey weights must and can be appropriately accounted for in the GEE method under a joint randomization framework. The consistency of the resulting pseudo‐GEE estimators is established under the proposed framework. Linearization variance estimators are developed for the pseudo‐GEE estimators when the finite population sampling fractions are small or negligible, a scenario that often holds for large‐scale surveys. Finite sample performances of the proposed estimators are investigated through an extensive simulation study using data from the National Longitudinal Survey of Children and Youth. The results show that the pseudo‐GEE estimators and the linearization variance estimators perform well under several sampling designs and for both continuous and binary responses. The Canadian Journal of Statistics 38: 540–554; 2010 © 2010 Statistical Society of Canada

4.
Many robust regression estimators are defined by minimizing a measure of spread of the residuals. An accompanying $R^2$-measure, or multiple correlation coefficient, is then easily obtained. In this paper, local robustness properties of these robust $R^2$-coefficients are investigated. It is also shown how confidence intervals for the population multiple correlation coefficient can be constructed in the case of multivariate normality.

5.
Kadilar and Cingi [Ratio estimators in simple random sampling, Appl. Math. Comput. 151 (3) (2004), pp. 893–902] introduced some ratio-type estimators of finite population mean under simple random sampling. Recently, Kadilar and Cingi [New ratio estimators using correlation coefficient, Interstat 4 (2006), pp. 1–11] have suggested another form of ratio-type estimators by modifying the estimator developed by Singh and Tailor [Use of known correlation coefficient in estimating the finite population mean, Stat. Transit. 6 (2003), pp. 655–560]. Kadilar and Cingi [Improvement in estimating the population mean in simple random sampling, Appl. Math. Lett. 19 (1) (2006), pp. 75–79] have suggested yet another class of ratio-type estimators by taking a weighted average of the two known classes of estimators referenced above. In this article, we propose an alternative form of ratio-type estimators which are better than the competing ratio, regression, and other ratio-type estimators considered here. The results are also supported by the analysis of three real data sets that were considered by Kadilar and Cingi.
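For orientation, the classical ratio estimator that these ratio-type proposals modify can be sketched as follows. This is only the textbook baseline, $\bar{y}\,\bar{X}/\bar{x}$, and not any of the estimators proposed in the cited papers. The function name `ratio_estimator_mean` and the toy numbers are assumptions for the example.

```python
import numpy as np


def ratio_estimator_mean(y, x, x_pop_mean):
    """Classical ratio estimator of the population mean of y under simple
    random sampling, using the known population mean of the auxiliary x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # Scale the sample mean of y by how far the sample mean of x
    # sits from the known population mean of x.
    return float(y.mean() * x_pop_mean / x.mean())


# Hypothetical sample in which y is exactly proportional to x.
estimate = ratio_estimator_mean([2.0, 4.0, 6.0], [1.0, 2.0, 3.0], x_pop_mean=5.0)
```

The estimator gains over the plain sample mean when y and x are strongly and positively correlated, which is why modifications based on the correlation coefficient are natural variants.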

6.
Analyses of randomised trials are often based on regression models which adjust for baseline covariates, in addition to randomised group. Based on such models, one can obtain estimates of the marginal mean outcome for the population under assignment to each treatment, by averaging the model‐based predictions across the empirical distribution of the baseline covariates in the trial. We identify under what conditions such estimates are consistent, and in particular show that for canonical generalised linear models, the resulting estimates are always consistent. We show that a recently proposed variance estimator underestimates the variance of the estimator around the true marginal population mean when the baseline covariates are not fixed in repeated sampling and provide a simple adjustment to remedy this. We also describe an alternative semiparametric estimator, which is consistent even when the outcome regression model used is misspecified. The different estimators are compared through simulations and application to a recently conducted trial in asthma.
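The averaging step described here can be sketched with a linear working model (identity link) fitted by least squares. The abstract covers generalised linear models more broadly, and the variance adjustment it proposes is not shown. The function name `marginal_means_linear` and the toy data are assumptions for the example.

```python
import numpy as np


def marginal_means_linear(y, z, x):
    """Fit y ~ 1 + z + x by least squares, then average the fitted
    predictions over the observed covariates with z set to 0 and to 1."""
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    x = np.asarray(x, dtype=float)
    ones = np.ones_like(x)
    design = np.column_stack([ones, z, x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    # Predict for every trial participant as if assigned to each arm in turn.
    pred_control = np.column_stack([ones, np.zeros_like(x), x]) @ beta
    pred_treated = np.column_stack([ones, ones, x]) @ beta
    return float(pred_control.mean()), float(pred_treated.mean())


# Hypothetical noise-free trial data generated from y = 1 + 2 z + 3 x.
x_cov = np.array([0.0, 1.0, 2.0, 3.0])
z_arm = np.array([0.0, 1.0, 0.0, 1.0])
y_out = 1.0 + 2.0 * z_arm + 3.0 * x_cov

mean_control, mean_treated = marginal_means_linear(y_out, z_arm, x_cov)
```

With a non-identity link the same recipe applies, except that predictions are averaged on the outcome scale after applying the inverse link.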

7.
The method of target estimation developed by Cabrera and Fernholz [(1999). Target estimation for bias and mean square error reduction. The Annals of Statistics, 27(3), 1080–1104.] to reduce bias and variance is applied to logistic regression models of several parameters. The expectation functions of the maximum likelihood estimators for the coefficients in the logistic regression models of one and two parameters are analyzed and simulations are given to show a reduction in both bias and variability after targeting the maximum likelihood estimators. In addition to bias and variance reduction, it is found that targeting can also correct the skewness of the original statistic. An example based on real data is given to show the advantage of using target estimators for obtaining better confidence intervals of the corresponding parameters. The notion of the target median is also presented with some applications to the logistic models.

8.
Generalized autoregressive conditional heteroscedastic (GARCH) models have been widely used for analyzing financial time series with time‐varying volatilities. To overcome the defect of the Gaussian quasi‐maximum likelihood estimator (QMLE) when the innovations follow either heavy‐tailed or skewed distributions, Berkes & Horváth (Ann. Statist., 32, 633, 2004) and Lee & Lee (Scand. J. Statist. 36, 157, 2009) considered likelihood methods that use two‐sided exponential, Cauchy and normal mixture distributions. In this paper, we extend their methods to the Box–Cox transformed threshold GARCH model by allowing distributions used in the construction of likelihood functions to include parameters and employing the estimated quasi‐likelihood estimators (QELE) to handle those parameters. We also demonstrate that the proposed QMLE and QELE are consistent and asymptotically normal under regularity conditions. Simulation results are provided for illustration.

9.
In the present article, calibration estimators of the population mean are developed under a two-stage stratified random sampling design when auxiliary information is available at the primary stage unit (psu) level. The properties of the developed estimators are derived in terms of the design-based approximate variance and an approximately consistent design-based estimator of the variance. Simulation studies have been conducted to investigate the performance of the calibration estimators relative to the usual estimator of the population mean, which does not use auxiliary information, in two-stage stratified random sampling. The proposed calibration estimators outperformed the usual estimator.

10.
In this paper, we consider a regression analysis for a missing data problem in which the variables of primary interest are unobserved under a general biased sampling scheme, an outcome‐dependent sampling (ODS) design. We propose a semiparametric empirical likelihood method for assessing the association between a continuous outcome response and unobservable factors of interest. Simulation study results show that the ODS design can produce more efficient estimators than the simple random design of the same sample size. We demonstrate the proposed approach with a data set from an environmental study of the genetic effects on human lung function in COPD smokers. The Canadian Journal of Statistics 40: 282–303; 2012 © 2012 Statistical Society of Canada

11.
Conditional logistic regression is a popular method for estimating a treatment effect while eliminating cluster-specific nuisance parameters when they are not of interest. Under a cluster-specific 1: m matched treatment–control study design, we present a new closed-form relationship between the conditional logistic regression estimator and the ordinary logistic regression estimator. In addition, we prove an equivalence between the ordinary logistic regression and the conditional logistic regression estimators when the clusters are replicated infinitely often, which indicates potential bias concerns when applying conditional logistic regression to complex survey samples.

12.
Motivated by Sampath [Finite population variance estimation under LSS with multiple random starts, Commun. Statist. – Theory Methods 38 (2009), pp. 3596–3607], in this paper unbiased estimators for population variance have been developed under linear systematic sampling, balanced systematic sampling and modified systematic sampling with multiple random starts. Expressions for variances of the estimators are also developed. Detailed numerical comparative studies have been carried out to study the performances of the estimators under various systematic sampling schemes with multiple random starts, and some interesting conclusions have been drawn from the study.

13.
Recently, Shabbir and Gupta [Shabbir, J. and Gupta, S. (2011). On estimating finite population mean in simple and stratified random sampling. Communications in Statistics-Theory and Methods, 40(2), 199–212] defined a class of ratio type exponential estimators of population mean under a very specific linear transformation of auxiliary variable. In the present article, we propose a generalized class of ratio type exponential estimators of population mean in simple random sampling under a very general linear transformation of auxiliary variable. Shabbir and Gupta's [Shabbir, J. and Gupta, S. (2011). On estimating finite population mean in simple and stratified random sampling. Communications in Statistics-Theory and Methods, 40(2), 199–212] class of estimators is a particular member of our proposed class of estimators. It has been found that the optimal estimator of our proposed generalized class of estimators is always more efficient than almost all the existing estimators defined under the same situations. Moreover, in comparison to a few existing estimators, our proposed estimator becomes more efficient under some simple conditions. Theoretical results obtained in the article have been verified by taking a numerical illustration. Finally, a simulation study has been carried out to see the relative performance of our proposed estimator with respect to some existing estimators which are less efficient under certain conditions as compared to the proposed estimator.

14.
In this paper, we suggest three new ratio estimators of the population mean using quartiles of the auxiliary variable when there are missing data from the sample units. The suggested estimators are investigated under the simple random sampling method. We obtain the mean square error equations for these estimators. The suggested estimators are compared with the sample mean and ratio estimators in the case of missing data. They are also compared with the estimators in Singh and Horn [Compromised imputation in survey sampling, Metrika 51 (2000), pp. 267–276], Singh and Deo [Imputation by power transformation, Statist. Papers 45 (2003), pp. 555–579], and Kadilar and Cingi [Estimators for the population mean in the case of missing data, Commun. Stat.-Theory Methods, 37 (2008), pp. 2226–2236], and we present the conditions under which the proposed estimators are more efficient than the other estimators. In terms of accuracy and of the coverage of the bootstrap confidence intervals, the suggested estimators performed better than the other estimators.

15.
Rp of a linear regression model of the type Y = Xθ + ɛ, where X is the design matrix, Y the vector of the response variable and ɛ the random error vector that follows an AR(1) correlation structure. These estimators are asymptotically analyzed, by proving their strong consistency, asymptotic normality and asymptotic efficiency. In a simulation study, a better behaviour of the Mean Squared Error of the proposed estimator with respect to that of the generalized least squares estimators is observed. Received: November 16, 1998; revised version: May 10, 2000

16.
The regression model with ordinal outcome has been widely used in many fields owing to its effectiveness. Predictors measured with error and multicollinearity are long-standing problems that often occur in regression analysis. However, there are few studies on measurement error models with a general ordinal response, and even fewer when the models also suffer from multicollinearity. The purpose of this article is to estimate the parameters of ordinal probit models with measurement error and multicollinearity. First, we propose to use regression calibration and refined regression calibration to estimate parameters in ordinal probit models with measurement error. Second, we develop new methods to obtain estimators of the parameters in the presence of both multicollinearity and measurement error in the ordinal probit model. Furthermore, we extend all the methods to quadratic ordinal probit models and discuss the case of ordinal logistic models. These estimators are consistent and asymptotically normally distributed under general conditions. In our simulation studies they are easy to compute, perform well and are robust against violations of the normality assumption for the predictor variables. The proposed methods are applied to some real datasets.

17.
Many sampling problems from multiple populations can be considered under the semiparametric framework of the biased, or weighted, sampling model. Included under this framework is logistic regression under case–control sampling. For any model, atypical observations can greatly influence the maximum likelihood estimate of the parameters. Several robust alternatives have been proposed for the special case of logistic regression. However, some current techniques can exhibit poor behavior in many common situations. In this paper a new family of procedures is constructed to estimate the parameters in the semiparametric biased sampling model. The procedures incorporate a minimum distance approach, but are instead based on characteristic functions. The estimators can also be represented as the minimizers of quadratic forms in simple residuals, thus yielding straightforward computation. For the case of logistic regression, the resulting estimators are shown to be competitive with the existing robust approaches in terms of both robustness and efficiency, while maintaining affine equivariance. The approach is developed under the case–control sampling scheme, yet is shown to be applicable under prospective sampling logistic regression as well.

18.
Motivated by a recent tuberculosis (TB) study, this paper is concerned with covariates missing not at random (MNAR) and models the potential intracluster correlation by a frailty. We consider the regression analysis of right‐censored event times from clustered subjects under a Cox proportional hazards frailty model and present the semiparametric maximum likelihood estimator (SPMLE) of the model parameters. An easy‐to‐implement pseudo‐SPMLE is then proposed to accommodate more realistic situations using readily available supplementary information on the missing covariates. Algorithms are provided to compute the estimators and their consistent variance estimators. We demonstrate that both the SPMLE and the pseudo‐SPMLE are consistent and asymptotically normal by the arguments based on the theory of modern empirical processes. The proposed approach is examined numerically via simulation and illustrated with an analysis of the motivating TB study data.

19.
This paper addresses the problem of estimating the population variance $S_y^2$ of the study variable y using auxiliary information in sample surveys. We suggest a class of estimators of the population variance $S_y^2$ of the study variable y when the population variance $S_x^2$ of the auxiliary variable x is known. Asymptotic expressions for the bias and mean squared error (MSE) of the proposed class of estimators have been obtained. Asymptotically optimum estimators in the proposed class have also been identified along with their MSE formula. A comparison has been provided. We have further provided the double sampling version of the proposed class of estimators. The properties of the double sampling version have been provided under large sample approximation. In addition, we support the present study with the aid of a numerical illustration.
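The abstract does not spell out the proposed class, so the sketch below shows only the classical ratio-type variance estimator, $s_y^2 \, S_x^2 / s_x^2$, as an example of how a known auxiliary variance can be used. Whether this estimator belongs to the proposed class is an assumption. The function name `ratio_variance_estimator` and the toy data are hypothetical.

```python
import numpy as np


def ratio_variance_estimator(y, x, x_pop_var):
    """Classical ratio-type estimator of the population variance of y,
    using the known population variance of the auxiliary variable x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    s2_y = y.var(ddof=1)   # sample variance of the study variable
    s2_x = x.var(ddof=1)   # sample variance of the auxiliary variable
    return float(s2_y * x_pop_var / s2_x)


# Hypothetical sample with x = 2 y and an assumed known S_x^2 of 10.
var_estimate = ratio_variance_estimator([1.0, 2.0, 3.0, 4.0],
                                        [2.0, 4.0, 6.0, 8.0],
                                        x_pop_var=10.0)
```

Because x is exactly twice y in this toy sample, the estimate equals the assumed $S_x^2$ divided by four.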

20.
Sarjinder Singh 《Statistics》2013,47(3):566-574
In this note, a dual problem to the calibration of design weights of the Deville and Särndal [Calibration estimators in survey sampling, J. Amer. Statist. Assoc. 87 (1992), pp. 376–382] method has been considered. We conclude that the chi-squared distance between the design weights and the calibrated weights equals the square of the standardized Z-score obtained by the difference between the known population total of the auxiliary variable and its corresponding Horvitz and Thompson [A generalization of sampling without replacement from a finite universe, J. Amer. Statist. Assoc. 47 (1952), pp. 663–685] estimator divided by the sample standard deviation of the auxiliary variable to obtain the linear regression estimator in survey sampling.
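The relationship described here can be checked numerically in the simplest setting: one auxiliary variable, unit tuning constants, and the unscaled chi-squared distance $\sum_i (w_i - d_i)^2 / d_i$. The note's exact normalisation may differ, so this is a sketch under those assumptions. The function name `chi2_calibrate` and the toy data are hypothetical.

```python
import numpy as np


def chi2_calibrate(d, x, x_total):
    """Calibrated weights minimising sum((w - d)**2 / d) subject to
    sum(w * x) == x_total, for a single auxiliary variable x."""
    d = np.asarray(d, dtype=float)
    x = np.asarray(x, dtype=float)
    ht_estimate = np.sum(d * x)                        # Horvitz-Thompson estimate of the x total
    lam = (x_total - ht_estimate) / np.sum(d * x**2)   # Lagrange multiplier
    return d * (1.0 + lam * x)


# Hypothetical design weights, auxiliary values and known population total.
d = np.array([2.0, 2.0, 2.0])
x = np.array([1.0, 2.0, 3.0])
X_TOTAL = 15.0

w = chi2_calibrate(d, x, X_TOTAL)
distance = float(np.sum((w - d) ** 2 / d))
# The squared gap between the known total and its HT estimate, scaled by sum(d * x^2).
closed_form = float((X_TOTAL - np.sum(d * x)) ** 2 / np.sum(d * x**2))
```

In this setting the distance depends on the data only through the gap between the known total and its Horvitz–Thompson estimate, which is the quantity the abstract interprets as a squared standardized score.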
