1.

High breakdown point robust estimators with missing data

Florencia Statti Victor J. Yohai 《Communications in Statistics - Theory and Methods》2018,47(21):5145-5162

In this paper, we propose a new procedure to estimate the distribution of a variable y when there are missing data. To compensate for the presence of missing responses, it is assumed that a covariate vector x is observed and that y and x are related by means of a semi-parametric regression model. Observed residuals are combined with predicted values to estimate the missing response distribution. Once the response distribution is consistently estimated, we can estimate any parameter defined through a continuous functional T using a plug-in procedure. We prove that the proposed estimators have a high breakdown point.
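
As a rough illustration of the idea in the abstract above (not the authors' estimator), the sketch below fits a regression on the complete cases, builds pseudo-responses for each missing case by adding every observed residual to its predicted value, and applies a functional (here the median) to the resulting weighted empirical distribution. The ordinary least-squares fit, the weighting scheme, and the names `weighted_quantile` and `plug_in_median` are assumptions made for this example; the paper works with a semi-parametric model and robust estimators.

```python
import numpy as np

def weighted_quantile(values, weights, q):
    """Quantile q of the discrete distribution putting `weights` on `values`."""
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    cdf = np.cumsum(weights) / np.sum(weights)
    idx = np.searchsorted(cdf, q)  # first index whose cumulative weight reaches q
    return values[min(idx, len(values) - 1)]

def plug_in_median(x, y):
    """x: covariate values; y: responses, with np.nan marking missing entries."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    obs = ~np.isnan(y)
    m = int(obs.sum())

    # Assumption: a straight-line fit by least squares on the complete cases.
    X = np.column_stack([np.ones(m), x[obs]])
    beta, *_ = np.linalg.lstsq(X, y[obs], rcond=None)
    resid = y[obs] - X @ beta

    # Each missing response is represented by "prediction + observed residual",
    # one pseudo-value per observed residual, each carrying weight 1/m.
    pred_missing = beta[0] + beta[1] * x[~obs]
    pseudo = (pred_missing[:, None] + resid[None, :]).ravel()

    values = np.concatenate([y[obs], pseudo])
    weights = np.concatenate([np.ones(m), np.full(pseudo.size, 1.0 / m)])
    return weighted_quantile(values, weights, 0.5)
```

Any other functional of the estimated distribution (a trimmed mean, another quantile) could be plugged in the same way by replacing the last line.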

2.

Improvement over variance estimation using auxiliary information in sample surveys

Housila P. Singh 《Communications in Statistics - Theory and Methods》2017,46(15):7732-7750

This paper addresses the problem of estimating the population variance S²_y of the study variable y using auxiliary information in sample surveys. We suggest a class of estimators of S²_y for the case where the population variance S²_x of the auxiliary variable x is known. Asymptotic expressions for the bias and mean squared error (MSE) of the proposed class of estimators are obtained. Asymptotically optimum estimators in the proposed class are also identified, along with their MSE formulas. A comparison has been provided. We further provide the double sampling version of the proposed class of estimators, whose properties are derived under a large-sample approximation. In addition, we support the present study with the aid of a numerical illustration.

3.

VARIANCE ESTIMATION IN TWO-PHASE SAMPLING

M.A. Hidiroglou J.N.K. Rao David Haziza 《Australian & New Zealand Journal of Statistics》2009,51(2):127-141

Two‐phase sampling is often used for estimating a population total or mean when the cost per unit of collecting auxiliary variables, x, is much smaller than the cost per unit of measuring a characteristic of interest, y. In the first phase, a large sample s₁ is drawn according to a specific sampling design p(s₁), and auxiliary data x are observed for the units i∈s₁. Given the first‐phase sample s₁, a second‐phase sample s₂ is selected from s₁ according to a specified sampling design {p(s₂∣s₁)}, and (y, x) is observed for the units i∈s₂. In some cases, the population totals of some components of x may also be known. Two‐phase sampling is used for stratification at the second phase or both phases and for regression estimation. Horvitz–Thompson‐type variance estimators are used for variance estimation. However, the Horvitz–Thompson (Horvitz & Thompson, J. Amer. Statist. Assoc. 1952) variance estimator in uni‐phase sampling is known to be highly unstable and may take negative values when the units are selected with unequal probabilities. On the other hand, the Sen–Yates–Grundy variance estimator is relatively stable and non‐negative for several unequal probability sampling designs with fixed sample sizes. In this paper, we extend the Sen–Yates–Grundy (Sen, J. Ind. Soc. Agric. Statist. 1953; Yates & Grundy, J. Roy. Statist. Soc. Ser. B 1953) variance estimator to two‐phase sampling, assuming fixed first‐phase sample size and fixed second‐phase sample size given the first‐phase sample. We apply the new variance estimators to two‐phase sampling designs with stratification at the second phase or both phases. We also develop Sen–Yates–Grundy‐type variance estimators of the two‐phase regression estimators that make use of the first‐phase auxiliary data and known population totals of some of the auxiliary variables.
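
For orientation, the two single-phase variance estimators contrasted in the abstract above can be sketched as follows, using their standard textbook forms for the Horvitz–Thompson estimator of a total. This covers only the uni-phase case, not the two-phase extension developed in the paper; the argument names `pi` (first-order inclusion probabilities) and `pi_joint` (matrix of joint inclusion probabilities) are assumptions of this example.

```python
import numpy as np

def ht_variance(y, pi, pi_joint):
    """Horvitz-Thompson variance estimator (can be negative for unequal pi)."""
    y, pi = np.asarray(y, dtype=float), np.asarray(pi, dtype=float)
    pi_joint = np.asarray(pi_joint, dtype=float)
    n = len(y)
    z = y / pi  # expanded values y_i / pi_i
    v = np.sum((1.0 - pi) * z**2)
    for i in range(n):
        for j in range(n):
            if i != j:
                v += (pi_joint[i, j] - pi[i] * pi[j]) / pi_joint[i, j] * z[i] * z[j]
    return v

def syg_variance(y, pi, pi_joint):
    """Sen-Yates-Grundy variance estimator (fixed sample size designs)."""
    y, pi = np.asarray(y, dtype=float), np.asarray(pi, dtype=float)
    pi_joint = np.asarray(pi_joint, dtype=float)
    n = len(y)
    z = y / pi
    v = 0.0
    for i in range(n):
        for j in range(i + 1, n):  # each unordered pair once
            v += (pi[i] * pi[j] - pi_joint[i, j]) / pi_joint[i, j] * (z[i] - z[j]) ** 2
    return v

# Simple random sampling without replacement, n = 3 out of N = 5:
# pi_i = n/N and pi_ij = n(n-1) / (N(N-1)) for i != j.
y = [1.0, 4.0, 6.0]
pi = np.full(3, 3 / 5)
pi_joint = np.full((3, 3), (3 * 2) / (5 * 4))
print(ht_variance(y, pi, pi_joint), syg_variance(y, pi, pi_joint))
```

Under simple random sampling the two estimators coincide; they differ under unequal probability designs, which is where the stability and sign issues mentioned in the abstract arise.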

4.

Estimation of the linear-plateau segmented regression model in the presence of measurement error

Scott D. Grimshaw 《Communications in Statistics - Theory and Methods》2013,42(8):2399-2413

It is well known that when the true values of the independent variable are unobservable due to measurement error, the least squares estimator for a regression model is biased and inconsistent. When repeated observations on each x_i are taken, consistent estimators for the linear-plateau model can be formed. The repeated observations are required to classify each observation to the appropriate line segment. Two cases of repeated observations are treated in detail. First, when a single value of y_i is observed with the repeated observations of x_i, the least squares estimator using the mean of the repeated x_i observations is consistent and asymptotically normal. Second, when repeated observations on the pair (x_i, y_i) are taken, the least squares estimator is inconsistent, but two consistent estimators are proposed: one that consistently estimates the bias of the least squares estimator and adjusts accordingly; the second is the least squares estimator using the mean of the repeated observations on each pair.

5.

Improved estimator of population total in PPS sampling

Housila P. Singh Abhishek C. Mishra 《Communications in Statistics - Theory and Methods》2018,47(4):912-934

In this paper, we consider the estimation of the population total Y of the study variable y, making use of information on an auxiliary variable x. A class of estimators for the population total Y, using transformations of both the study and the auxiliary variables, is suggested under probability proportional to size sampling with replacement (PPSWR). Among many others, the usual PPS estimator, Reddy and Rao's (1977) estimator and Srivenkataramana and Tracy's (1979, 1984, 1986) estimators are shown to be members of the proposed class of estimators. The variance of the proposed class of estimators has been obtained. In particular, the properties of 75 estimators based on different known population parameters of the study and auxiliary variables have been derived from the proposed class of estimators. In support of the present study, numerical illustrations are given.

6.

A study of ratio and product estimators under a super population model

Yogendra P. Chaubey 《Communications in Statistics - Theory and Methods》2013,42(5-6):1731-1746

In this paper, ratio and product estimators are studied under a super population model considered by Durbin (1959, Biometrika), where a regression model of y (the characteristic variable) on x (the auxiliary variable) is assumed. The comparison of the ratio and the product estimators has been made in the literature (see Chaubey, Dwivedi and Singh (1984), Commun. Statist. - Theor. Meth.) when the auxiliary variable has a gamma distribution. In this paper, a similar analysis is carried out when the auxiliary variable has an inverse Gaussian distribution.

7.

Estimation of regression parameters in missing data problems

Donald L. Mcleish Cyntha A. Struthers 《Revue canadienne de statistique》2006,34(2):233-259

Let Y be a response variable, possibly multivariate, with a density function f(y|x, v; β) conditional on vectors x and v of covariates and a vector β of unknown parameters. The authors consider the problem of estimating β when the values taken by the covariate vector v are available for all observations while some of those taken by the covariate x are missing at random. They compare the profile estimator to several alternatives, both in terms of bias and standard deviation, when the response and covariates are discrete or continuous.

8.

Partly linear models on Riemannian manifolds

Wenceslao Gonzalez-Manteiga Guillermo Henry 《Journal of applied statistics》2012,39(8):1797-1809

In partly linear models, the dependence of the response y on (x^T, t) is modeled through the relationship y = x^T β + g(t) + ε, where ε is independent of (x^T, t). We are interested in developing an estimation procedure that retains the flexibility of the partly linear models studied by several authors while allowing some variables to belong to a non-Euclidean space. The motivating application of this paper deals with the explanation of atmospheric SO₂ pollution incidents using these models when some of the predictive variables belong to a cylinder. In this paper, the estimators of β and g are constructed when the explanatory variables t take values on a Riemannian manifold, and the asymptotic properties of the proposed estimators are obtained under suitable conditions. We illustrate the use of this estimation approach using an environmental data set and we explore the performance of the estimators through a simulation study.

9.

A REGRESSION APPROACH TO THE ESTIMATION OF THE FINITE POPULATION MEAN IN THE PRESENCE OF NON‐RESPONSE

Housila P. Singh Sunil Kumar 《Australian & New Zealand Journal of Statistics》2008,50(4):395-408

This paper presents various estimators for estimating the population mean of the study variable y using information on the auxiliary variable x in the presence of non‐response. Properties of the suggested estimators are studied and compared with those of existing estimators. It is shown that the estimators suggested in this paper are among the best of all the estimators considered. An empirical study is carried out to demonstrate the performance of the suggested estimators and of others, and it is found that the empirical results support the theoretical study.

10.

A study on the chain ratio-ratio-type exponential estimator for finite population variance

Housila P. Singh Anita Yadav 《Communications in Statistics - Theory and Methods》2018,47(6):1442-1458

This paper considers the problem of estimating the population variance S²_y of the study variable y using auxiliary information in sample surveys. We suggest (i) a chain ratio-type estimator (on the lines of Kadilar and Cingi (2003)) and (ii) a chain ratio-ratio-type exponential estimator, together with their generalized version (on the lines of Singh and Pal (2015)), and study their properties under a large-sample approximation. Conditions are obtained under which the proposed estimators are more efficient than the usual unbiased estimator s²_y and the Isaki (1983) ratio estimator. An improved version of the suggested class of estimators is also given along with its properties. An empirical study is carried out in support of the present study.
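
The two benchmark estimators named in the abstract above are simple to write down. The sketch below computes the usual unbiased estimator s²_y and the ratio-type estimator s²_y · S²_x / s²_x commonly attributed to Isaki, assuming the population variance S²_x of the auxiliary variable is known. The function names are hypothetical, and the chain and exponential estimators proposed in the paper are not reproduced here.

```python
import numpy as np

def sample_variance(v):
    """Usual unbiased sample variance (divisor n - 1)."""
    return np.var(np.asarray(v, dtype=float), ddof=1)

def ratio_variance_estimator(y, x, S2_x):
    """Ratio-type estimator of S²_y: s²_y scaled by the known-to-sample ratio S²_x / s²_x."""
    return sample_variance(y) * S2_x / sample_variance(x)

# Illustrative sample; S2_x = 10.0 is a made-up "known" population variance.
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 4.0, 8.0, 14.0]
print(sample_variance(y), ratio_variance_estimator(y, x, S2_x=10.0))
```

The ratio adjustment helps when s²_y and s²_x are strongly positively correlated across samples, which is the situation the efficiency conditions in the abstract refer to.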

11.

Confidence Intervals for Nonparametric Regression Functions with Missing Data

Yongsong Qin Tao Qiu Qingzhu Lei 《Communications in Statistics - Theory and Methods》2014,43(19):4123-4142

Suppose that we have a nonparametric regression model Y = m(X) + ε with X ∈ R^p, where X is a random design variable and is observed completely, and Y is the response variable and some Y-values are missing at random. Based on the “complete” data sets for Y after nonparametric regression imputation and inverse probability weighted imputation, two estimators of the regression function m(x₀) for fixed x₀ ∈ R^p are proposed. Asymptotic normality of the two estimators is established, which is used to construct normal approximation-based confidence intervals for m(x₀). We also construct an empirical likelihood (EL) statistic for m(x₀) with limiting distribution χ²₁, which is used to construct an EL confidence interval for m(x₀).

12.

Predictive Influence of Unavailable Values of Future Explanatory Variables in a Linear Model

S. K. Bhattacharjee Ahmed Shamiri Md. Sabiruzzaman S. Rao Jammalamadaka 《Communications in Statistics - Theory and Methods》2013,42(24):4458-4466

We consider an approach to prediction in a linear model when values of the future explanatory variables are unavailable: we predict a future response y^f at a future sample point x^f when some components of x^f are unavailable. We consider both the cases where the components of x^f are dependent and where they are independent but normally distributed. A Taylor expansion is used to derive an approximation to the predictive density, and the influence of missing future explanatory variables (the loss or discrepancy) is assessed using the Kullback–Leibler measure of divergence. This discrepancy is compared in different scenarios, including the situation where the missing variables are dropped entirely.
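
The abstract above does not give the discrepancy formula. As a reminder of what the Kullback–Leibler measure looks like in the simplest case, here is its closed form for two univariate normal densities. This is a generic illustration only; the paper applies the measure to approximate predictive densities, and the function name is hypothetical.

```python
import math

def kl_normal(mu1, sigma1, mu2, sigma2):
    """KL divergence KL(N(mu1, sigma1^2) || N(mu2, sigma2^2)) for sigma1, sigma2 > 0."""
    return (math.log(sigma2 / sigma1)
            + (sigma1**2 + (mu1 - mu2) ** 2) / (2.0 * sigma2**2)
            - 0.5)

# Identical densities have zero divergence; a unit mean shift with unit variance gives 0.5.
print(kl_normal(0.0, 1.0, 0.0, 1.0), kl_normal(0.0, 1.0, 1.0, 1.0))
```

The divergence is not symmetric in its two arguments, so the order in which the full-information and reduced-information densities are compared matters.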

13.

Penalized inverse probability weighted estimators for weighted rank regression with missing covariates

Hu Yang Jing Lv 《Communications in Statistics - Theory and Methods》2013,42(5):1388-1402

In this article, we study variable selection and estimation for linear regression models with missing covariates. The proposed estimation method is almost as efficient as the popular least-squares-based estimation method for normal random errors and is empirically shown to be much more efficient and robust with respect to heavy-tailed errors or outliers in the responses and covariates. To achieve sparsity, a variable selection procedure based on SCAD is proposed to conduct estimation and variable selection simultaneously. The procedure is shown to possess the oracle property. To deal with the missing covariates, we consider inverse probability weighted estimators for the linear model when the selection probability is known or unknown. It is shown that the estimator using the estimated selection probability has a smaller asymptotic variance than that using the true selection probability, and is thus more efficient. Therefore, the important Horvitz-Thompson property is verified for the penalized rank estimator with missing covariates in the linear model. Some numerical examples are provided to demonstrate the performance of the estimators.
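
To make the inverse probability weighting step in the abstract above concrete, the sketch below weights each complete case by the reciprocal of its selection probability. It deliberately swaps in ordinary weighted least squares for the paper's penalized weighted rank regression, and it assumes the selection probabilities are known; the function name `ipw_least_squares` and its arguments are hypothetical.

```python
import numpy as np

def ipw_least_squares(X, y, observed, prob):
    """Weighted least squares on complete cases with weights 1 / selection probability.

    X        : covariates (rows with missing covariates may contain np.nan)
    y        : responses, observed for every unit
    observed : boolean flags, True when the covariates of a unit are fully observed
    prob     : selection probabilities P(observed = True) for each unit
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = np.asarray(observed, dtype=bool)
    w = 1.0 / np.asarray(prob, dtype=float)[keep]

    # Design matrix with intercept, restricted to complete cases.
    Xc = np.column_stack([np.ones(int(keep.sum())), X[keep]])

    # Scaling rows by sqrt(w) turns weighted least squares into ordinary least squares.
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(Xc * sw[:, None], y[keep] * sw, rcond=None)
    return beta  # [intercept, slopes...]
```

When the selection probabilities are unknown they would first be estimated, for example with a logistic model on the always-observed variables; the abstract reports that doing so lowers the asymptotic variance of the resulting estimator.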

14.

A Class of Estimators Using Auxiliary Information for Estimating Finite Population Variance in Presence of Measurement Errors

Housila P. Singh 《Communications in Statistics - Theory and Methods》2013,42(5):734-741

This article addresses the problem of estimating the population variance using auxiliary information in the presence of measurement errors. When the measurement error variance associated with the study variable is known, a class of estimators of the population variance using auxiliary information is proposed. We obtain the bias and mean squared error of the suggested class of estimators up to terms of order n⁻¹, and also the asymptotically optimum estimators of the class together with their approximate mean squared error formula.

15.

RANKED SET SAMPLING FROM LOCATION-SCALE FAMILIES OF SYMMETRIC DISTRIBUTIONS

《Communications in Statistics - Theory and Methods》2013,42(8-9):1641-1659

Statistical inference based on ranked set sampling has primarily been motivated by nonparametric problems. However, the sampling procedure can provide an improved estimator of the population mean when the population is partially known. In this article, we consider estimation of the population mean and variance for the location-scale families of distributions. We derive and compare different unbiased estimators of these parameters based on r independent replications of a ranked set sample of size n. Large sample properties, along with asymptotic relative efficiencies, help identify which estimators are best suited for different location-scale distributions.
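
The sampling scheme referred to in the abstract above, r independent replications of a ranked set sample of size n, can be simulated in a few lines. The sketch assumes perfect ranking (units are ranked by their true values) and a standard normal population; both choices, and the helper name `ranked_set_sample`, are assumptions of this example rather than details taken from the article.

```python
import numpy as np

def ranked_set_sample(draw, n, r, rng):
    """Return an (r, n) array: r cycles of a ranked set sample of set size n.

    In each cycle, n independent sets of n units are drawn; from the i-th set
    only the i-th smallest unit is measured.
    """
    sample = np.empty((r, n))
    for cycle in range(r):
        for i in range(n):
            sample[cycle, i] = np.sort(draw(rng, n))[i]
    return sample

def draw_normal(rng, size):
    return rng.normal(0.0, 1.0, size)

rng = np.random.default_rng(0)
reps = 2000
# Compare the sample mean under RSS and under simple random sampling,
# both using n * r = 6 measured units per replication.
rss_means = [ranked_set_sample(draw_normal, 3, 2, rng).mean() for _ in range(reps)]
srs_means = [draw_normal(rng, 6).mean() for _ in range(reps)]
print(np.var(rss_means), np.var(srs_means))
```

Both means are unbiased for the population mean; the simulation only illustrates that the ranked set sample mean tends to have the smaller variance for the same number of measured units.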

16.

Estimating the population mean in the case of missing data using simple random sampling

Carlos N. Bouza Amer Ibrahim Al-Omari 《Statistics》2013,47(2):279-290

In this paper, we suggest three new ratio estimators of the population mean using quartiles of the auxiliary variable when there are missing data from the sample units. The suggested estimators are investigated under the simple random sampling method. We obtain the mean squared error equations for these estimators. The suggested estimators are compared with the sample mean and ratio estimators in the case of missing data. They are also compared with the estimators in Singh and Horn [Compromised imputation in survey sampling, Metrika 51 (2000), pp. 267–276], Singh and Deo [Imputation by power transformation, Statist. Papers 45 (2003), pp. 555–579], and Kadilar and Cingi [Estimators for the population mean in the case of missing data, Commun. Stat.-Theory Methods, 37 (2008), pp. 2226–2236], and we present the conditions under which the proposed estimators are more efficient than the other estimators. In terms of accuracy and of the coverage of the bootstrap confidence intervals, the suggested estimators performed better than the other estimators.

17.

The Relative Effectiveness of Procedures Commonly Used in Multiple Regression Analysis for Dealing with Missing Values

Allan Donner 《The American statistician》2013,67(4):378-381

Expressions are derived for the bias and variance associated with procedures frequently used to estimate partial regression coefficients in a linear model having the two explanatory variables x₁ and x₂, with missing values on x₂ only. The expressions are used to help gain insight into the relative effectiveness of these procedures for handling more complex patterns of missing data.

18.

Missing at random (MAR) in nonparametric regression - A simulation experiment

Thomas Nittner 《Statistical Methods and Applications》2003,12(2):195-210

The additive model is considered when some observations on x are missing at random but the corresponding observations on y are available. For this model in particular, missing at random is an interesting case because the complete case analysis is no longer expected to be suitable. A simulation experiment is reported and the different methods are compared with respect to the sample mean squared error. Some attention is also given to the sample variance and the estimated bias. In detail, the complete case analysis, a kind of stochastic mean imputation, a single imputation and the nearest neighbor imputation are discussed.

19.

Combining unbiased estimates —a further examination of some old estimators

《Journal of Statistical Computation and Simulation》2012,82(2):147-160

This paper considers the problem of combining k unbiased estimates, x_i, of a parameter μ, where each estimate x_i is the average of n_i + 1 independent normal observations with unknown mean μ and unknown variance σ_i². The behavior of several commonly used estimators of μ is studied by means of an empirical sampling study, and the empirical results of this experiment are interpreted in terms of previous theoretical results. Finally, some extrapolations are made as to how these estimators might behave under varying conditions, and some new estimators are proposed which might have higher efficiencies under certain conditions than those which are generally used.
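
One classical estimator for the setting described above weights each estimate by the reciprocal of its estimated variance (the Graybill-Deal form). The abstract does not say which estimators the paper studies, so this choice, like the function name and the toy numbers, is an assumption made purely for illustration.

```python
import numpy as np

def inverse_variance_combination(means, sample_vars, sizes):
    """Combine k unbiased estimates with weights size_i / s_i².

    means       : the k sample means x_i
    sample_vars : the k sample variances s_i² (estimates of sigma_i²)
    sizes       : the number of observations behind each mean (n_i + 1 in the abstract)
    """
    means = np.asarray(means, dtype=float)
    # Var(x_i) is sigma_i² / size_i, so its estimated reciprocal is size_i / s_i².
    weights = np.asarray(sizes, dtype=float) / np.asarray(sample_vars, dtype=float)
    return np.sum(weights * means) / np.sum(weights)

# Two sources with equal sizes; the second is four times more precise.
print(inverse_variance_combination([10.0, 12.0], [4.0, 1.0], [4, 4]))
```

Because the weights depend on the random s_i², the combined estimate is not guaranteed to beat every individual x_i in small samples, which is the kind of behavior an empirical sampling study can probe.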

20.

Robust Linear Calibration

Christos P. Kitsos Christine H. Müller 《Statistics》2013,47(1-2):93-106

We consider the simple linear calibration problem where only the response y of the regression line y = β₀ + β₁ t is observed with errors. The experimental conditions t are observed without error. For the errors of the observations y we assume that there may be some gross errors providing outlying observations. This situation can be modeled by a conditionally contaminated regression model. In this model the classical calibration estimator based on the least squares estimator has an unbounded asymptotic bias. Therefore we introduce calibration estimators based on robust one-step M-estimators, which have a bounded asymptotic bias. For this class of estimators we discuss two problems: the optimal estimators and their corresponding optimal designs. We derive the locally optimal solutions and show that the maximin efficient designs for non-robust estimation and robust estimation coincide.
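
The classical calibration estimator mentioned in the abstract above inverts the fitted least squares line: given a new response y₀, it returns (y₀ − b₀) / b₁. The sketch below shows how a single gross error in y moves that estimate; the data are made up, the function name is hypothetical, and the robust one-step M-estimators of the paper are not implemented here.

```python
import numpy as np

def classical_calibration(t, y, y0):
    """Estimate the condition t0 that produced response y0 by inverting the LS line."""
    b1, b0 = np.polyfit(t, y, 1)  # polyfit returns [slope, intercept] for degree 1
    return (y0 - b0) / b1

t = np.array([0.0, 1.0, 2.0, 3.0])
y_clean = 1.0 + 2.0 * t           # exact line y = 1 + 2 t
y_outlier = y_clean.copy()
y_outlier[-1] = 70.0              # one gross error in the responses

print(classical_calibration(t, y_clean, 5.0))    # clean data
print(classical_calibration(t, y_outlier, 5.0))  # contaminated data
```

With clean data the inverted line recovers the condition behind y₀ = 5; with a single contaminated response the estimate is pulled well away from it, which is the sensitivity that motivates bounded-bias alternatives.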