
1.

周巍 (Zhou Wei) et al.《统计研究》(Statistical Research) 2015,32(7):81-86

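Remote sensing imagery is a form of big data. Estimating crop sown area from remote sensing commonly relies on regression estimators or calibration estimators, which usually require combining ground sample data with remote sensing classification information. For most regression estimators, however, crop area estimates for a provincial population meet precision requirements only at the provincial level and cannot be disaggregated to smaller areas such as counties and townships. Using ground survey sample data from Heilongjiang Province for 2011 together with remote sensing classification results, this paper builds a unit-level small area model in the form of a multivariate regression with multiple response variables, with the small area effects specified as fixed. Under this regression-based approach, the sown area of the main crops can be estimated for each county, and the county estimates sum to the provincial total implied by the regression model. An accuracy assessment (coefficient of variation, C.V.) of the county-level small area estimates for maize, rice, and soybean in Heilongjiang shows that, on average, they meet county-level precision requirements. The results indicate that small area estimation is well suited to the multi-level problem of estimating crop planting area for both the province as a whole and its individual counties.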

2.

Smoothing parameter selection methods for nonparametric regression with spatially correlated errors

Mario Francisco‐Fernandez Jean D. Opsomer 《Revue canadienne de statistique》2005,33(2):279-295

When spatial data are correlated, currently available data‐driven smoothing parameter selection methods for nonparametric regression will often fail to provide useful results. The authors propose a method that adjusts the generalized cross‐validation criterion for the effect of spatial correlation in the case of bivariate local polynomial regression. Their approach uses a pilot fit to the data and the estimation of a parametric covariance model. The method is easy to implement and leads to improved smoothing parameter selection, even when the covariance model is misspecified. The methodology is illustrated using water chemistry data collected in a survey of lakes in the Northeastern United States.

3.

Combining information from multiple surveys by using regression for efficient small domain estimation (cited by: 1; self-citations: 0; citations by others: 1)

Takis Merkouris 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2010,72(1):27-48

Summary. In sample surveys of finite populations, subpopulations for which the sample size is too small for estimation of adequate precision are referred to as small domains. Demand for small domain estimates has been growing in recent years among users of survey data. We explore the possibility of enhancing the precision of domain estimators by combining comparable information collected in multiple surveys of the same population. For this, we propose a regression method of estimation that is essentially an extended calibration procedure whereby comparable domain estimates from the various surveys are calibrated to each other. We show through analytic results and an empirical study that this method may greatly improve the precision of domain estimators for the variables that are common to these surveys, as these estimators make effective use of increased sample size for the common survey items. The design-based direct estimators proposed involve only domain-specific data on the variables of interest. This is in contrast with small domain (mostly small area) indirect estimators, based on a single survey, which incorporate through modelling data that are external to the targeted small domains. The approach proposed is also highly effective in handling the closely related problem of estimation for rare population characteristics.

4.

Use of satellite data in agricultural surveys

A. G. Houston F. G. Hall 《Communications in Statistics - Theory and Methods》2013,42(23):2857-2880

A major application of satellite remote sensing is the estimation of the acreage of agricultural crops. The potential for crop yield estimation using satellite remote sensing exists, but research in this area is still in its early stages. In this paper we survey the methodology for using remotely sensed data in agricultural surveys, based primarily on research conducted during the Large Area Crop Inventory Experiment (LACIE) and the follow-on program Agricultural Research and Inventory Surveys Through Aerospace Remote Sensing (AgRISTARS). The data obtained from multispectral scanner (MSS) and thematic mapper (TM) sensors onboard the Landsat series of satellites are described. Approaches for preprocessing, transferring, and modeling these data for understanding the relationship between their temporal behavior and crop growth cycles are discussed. Finally, techniques for crop identification and area and yield estimation are briefly described.

5.

Generalized Additive Models for Current Status Data

Shiboski Stephen C. 《Lifetime data analysis》1998,4(1):29-50

Current status data arise in studies where the target measurement is the time of occurrence of some event, but observations are limited to indicators of whether or not the event has occurred at the time the sample is collected - only the current status of each individual with respect to event occurrence is observed. Examples of such data arise in several fields, including demography, epidemiology, econometrics and bioassay. Although estimation of the marginal distribution of times of event occurrence is well understood, techniques for incorporating covariate information are not well developed. This paper proposes a semiparametric approach to estimation for regression models of current status data, using techniques from generalized additive modeling and isotonic regression. This procedure provides simultaneous estimates of the baseline distribution of event times and covariate effects. No parametric assumptions about the form of the baseline distribution are required. The results are illustrated using data from a demographic survey of breastfeeding practices in developing countries, and from an epidemiological study of heterosexual Human Immunodeficiency Virus (HIV) transmission. This revised version was published online in July 2006 with corrections to the Cover Date.

6.

The use of power transformations in small area estimation

Getachew Asfaw Dagne 《Journal of applied statistics》2003,30(4):411-423

Sample surveys are usually designed and analysed to produce estimates for larger areas. Nevertheless, sample sizes are often not large enough to give adequate precision for small area estimates of interest. To overcome such difficulties, borrowing strength from related small areas via modelling becomes essential. In line with this, we propose components of variance models with power transformations for small area estimation. This paper reports the results of a study aimed at incorporating the power transformation in small area estimation for improving the quality of small area predictions. The proposed methods are demonstrated on satellite data in conjunction with survey data to estimate mean acreage under a specified crop for counties in Iowa.

7.

Examination of Influential Observations to Improve Discriminant Analysis

Elizabeth Lesquoy Richard Tomassone 《Statistics》2013,47(4):597-606

Three simple transformations are proposed in the context of ratio and product methods of estimation, based on any probability sampling design, and the usual unbiased estimation under varying probability sampling. These transformations may be effected after the data are collected in a survey. The objective is to obtain improved estimators of the population total.

8.

Functional regression concurrent model with spatially correlated errors: application to rainfall ground validation

Johann Ospína-Galindez Mercedes Andrade-Bejarano 《Journal of applied statistics》2019,46(8):1350-1363

In this paper, we give an extension of the functional regression concurrent model to the case of spatially correlated errors. We propose estimating the spatial correlation structure by using functional geostatistics. The estimation of the regression parameters is carried out by feasible generalized least squares. This modeling approach is motivated by the problem of validating rainfall data retrieved from satellite sensors. In this sense, we use the methodology to study the relationship between satellite and ground rainfall time series recorded at 82 weather stations in the Department of Valle del Cauca, Colombia. The model obtained allows predicting pentadal rainfall curves at many sites of the region of interest by using the satellite information as input. A residual analysis shows a good performance of the methodology proposed.

9.

ROBUST ESTIMATION OF SMALL‐AREA MEANS AND QUANTILES

Nikos Tzavidis Stefano Marchetti Ray Chambers 《Australian & New Zealand Journal of Statistics》2010,52(2):167-186

Small‐area estimation techniques have typically relied on plug‐in estimation based on models containing random area effects. More recently, regression M‐quantiles have been suggested for this purpose, thus avoiding conventional Gaussian assumptions, as well as problems associated with the specification of random effects. However, the plug‐in M‐quantile estimator for the small‐area mean can be shown to be the expected value of this mean with respect to a generally biased estimator of the small‐area cumulative distribution function of the characteristic of interest. To correct this problem, we propose a general framework for robust small‐area estimation, based on representing a small‐area estimator as a functional of a predictor of this small‐area cumulative distribution function. Key advantages of this framework are that it naturally leads to integrated estimation of small‐area means and quantiles and is not restricted to M‐quantile models. We also discuss mean squared error estimation for the resulting estimators, and demonstrate the advantages of our approach through model‐based and design‐based simulations, with the latter using economic data collected in an Australian farm survey.

10.

Statistical data integration using multilevel models to predict employee compensation

Andreea L. Erciulescu Jean D. Opsomer Benjamin J. Schneider 《Revue canadienne de statistique》2023,51(1):312-326

This article considers the case where two surveys collect data on a common variable, with one survey being much smaller than the other. The smaller survey collects data on an additional variable of interest, related to the common variable collected in the two surveys, and out-of-scope with respect to the larger survey. Estimation of the two related variables is of interest at domains defined at a granular level. We propose a multilevel model for integrating data from the two surveys, by reconciling survey estimates available for the common variable, accounting for the relationship between the two variables, and expanding estimation for the other variable, for all the domains of interest. The model is specified as a hierarchical Bayes model for domain-level survey data, and posterior distributions are constructed for the two variables of interest. A synthetic estimation approach is considered as an alternative to the hierarchical modelling approach. The methodology is applied to wage and benefits estimation using data from the National Compensation Survey and the Occupational Employment Statistics Survey, available from the Bureau of Labor Statistics, Department of Labor, United States.

11.

Semi‐parametric small‐area estimation by combining time‐series and cross‐sectional data methods

Farhad Shokoohi Mahmoud Torabi 《Australian & New Zealand Journal of Statistics》2018,60(3):323-342

In survey sampling, policymaking regarding the allocation of resources to subgroups (called small areas) or the determination of subgroups with specific properties in a population should be based on reliable estimates. Information, however, is often collected at a different scale than that of these subgroups; hence, the estimation can only be obtained on finer scale data. Parametric mixed models are commonly used in small‐area estimation. The relationship between predictors and response, however, may not be linear in some real situations. Recently, small‐area estimation using a generalised linear mixed model (GLMM) with a penalised spline (P‐spline) regression model, for the fixed part of the model, has been proposed to analyse cross‐sectional responses, both normal and non‐normal. However, there are many situations in which the responses in small areas are serially dependent over time. Such a situation is exemplified by a data set on the annual number of visits to physicians by patients seeking treatment for asthma, in different areas of Manitoba, Canada. In cases where covariates that can possibly predict physician visits by asthma patients (e.g. age and genetic and environmental factors) may not have a linear relationship with the response, new models for analysing such data sets are required. In the current work, using both time‐series and cross‐sectional data methods, we propose P‐spline regression models for small‐area estimation under GLMMs. Our proposed model covers both normal and non‐normal responses. In particular, the empirical best predictors of small‐area parameters and their corresponding prediction intervals are studied with the maximum likelihood estimation approach being used to estimate the model parameters. The performance of the proposed approach is evaluated using some simulations and also by analysing two real data sets (precipitation and asthma).

12.

Logistic regression analyses for indirect data

Heiko Groenitz 《Communications in Statistics - Theory and Methods》2018,47(16):3838-3856

The article’s topic is logistic regression for direct data on the covariates, but indirect data on the endogenous variable. The indirect data may result from a privacy-protecting survey procedure for sensitive characteristics or from statistical disclosure control. Various procedures to generate the indirect data exist. However, we show that it is possible to develop a general approach for logistic regression analyses with indirect data that covers many procedures. We first derive a general algorithm for the maximum likelihood estimation and a general procedure for variance estimation. Subsequently, lots of examples demonstrate the broad applicability of our general framework.

13.

The use of LANDSAT for county estimates of crop areas: evaluation of the Huddleston-Ray and the Battese-Fuller estimators for the case of stratified sampling

Gail Walker Richard Sigman 《Communications in Statistics - Theory and Methods》2013,42(23):2975-2996

The paper addresses the problem of using LANDSAT data to obtain estimates of crop areas at the county level. In the paper, LANDSAT data are used to supplement ground data collected in a nationwide agricultural survey. The paper extends the Battese-Fuller estimation model to a stratified sample design. The resulting estimator is evaluated on a six-county area in South Dakota.

14.

Asymptotic design-cum-model based estimation of variances of estimated linear regression coefficients in survey sampling with unequal probabilities

Arijit Chaudhuri Tapabrata Maiti 《Statistical Papers》1996,37(1):79-84

Postulating a super-population linear regression model for a variable of interest on an auxiliary variable, we consider design-based estimation of regression coefficients on drawing a sample with unequal probabilities from a survey population. Asymptotic design-cum-model based variance estimation procedures are proposed.

15.

On a Unified Generalized Quasi–likelihood Approach for Familial–Longitudinal Non‐Stationary Count Data

BRAJENDRA C. SUTRADHAR VANDNA JOWAHEER GARY SNEDDON 《Scandinavian Journal of Statistics》2008,35(4):597-612

Abstract. In this paper, conditional on random family effects, we consider an auto‐regression model for repeated count data and their corresponding time‐dependent covariates, collected from the members of a large number of independent families. The count responses, in such a set up, unconditionally exhibit a non‐stationary familial–longitudinal correlation structure. We then take this two‐way correlation structure into account, and develop a generalized quasilikelihood (GQL) approach for the estimation of the regression effects and the familial correlation index parameter, whereas the longitudinal correlation parameter is estimated by using the well‐known method of moments. The performance of the proposed estimation approach is examined through a simulation study. Some model mis‐specification effects are also studied. The estimation methodology is illustrated by analysing real life healthcare utilization count data collected from 36 families of size four over a period of 4 years.

16.

The relationship between crime, punishment and economic conditions: is reliable inference possible when crimes are under-recorded?

S. Pudney D. Deadman & D. Pyle 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》1999,163(1):81-97

We investigate the estimation of dynamic models of criminal activity, when there is significant under-recording of crime. We give a theoretical analysis and use simulation techniques to investigate the resulting biases in conventional regression estimates. We find the biases to be of little practical significance. We develop and apply a new simulated maximum likelihood procedure that estimates simultaneously the measurement error and crime processes, using extraneous survey data. This also confirms that measurement error biases are small. Our estimation results for data from England and Wales imply a significant response of crime to both the economic and the enforcement environment.

17.

A unified empirical likelihood approach for testing MCAR and subsequent estimation

Shixiao Zhang Peisong Han Changbao Wu 《Scandinavian Journal of Statistics》2019,46(1):272-288

For an estimation with missing data, a crucial step is to determine if the data are missing completely at random (MCAR), in which case a complete‐case analysis would suffice. Most existing tests for MCAR do not provide a method for a subsequent estimation once the MCAR is rejected. In the setting of estimating means, we propose a unified approach for testing MCAR and the subsequent estimation. Upon rejecting MCAR, the same set of weights used for testing can then be used for estimation. The resulting estimators are consistent if the missingness of each response variable depends only on a set of fully observed auxiliary variables and the true outcome regression model is among the user‐specified functions for deriving the weights. The proposed method is based on the calibration idea from survey sampling literature and the empirical likelihood theory.

18.

Population constraints on pooled surveys in demographic hazard modeling

Michael S. Rendall Ryan Admiraal Alessandra DeRose Paola DiGiulio Mark S. Handcock Filomena Racioppi 《Statistical Methods and Applications》2008,17(4):519-539

In non-experimental research, data on the same population process may be collected simultaneously by more than one instrument. For example, in the present application, two sample surveys and a population birth registration system all collect observations on first births by age and year, while the two surveys additionally collect information on women’s education. To make maximum use of the three data sources, the survey data are pooled and the population data introduced as constraints in a logistic regression equation. Reductions in standard errors about the age and birth-cohort parameters of the regression equation in the order of three-quarters are obtained by introducing the population data as constraints. A halving of the standard errors about the education parameters is achieved by pooling observations from the larger survey dataset with those from the smaller survey. The percentage reduction in the standard errors through imposing population constraints is independent of the total survey sample size.

19.

Construction and application of a generalized regression sampling estimation system for China

陈光慧 (Chen Guanghui)《统计研究》(Statistical Research) 2015,32(7):93-99

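In both the theory and application of sampling, China has long emphasized sample design while neglecting research on estimation methods. Drawing on a systematic review of successful experience in Canada and other Western countries, this paper introduces and refines a generalized regression estimation system and applies it to complex continuous multi-stage sample surveys. Taking common sampling designs as the basis, the paper extends existing superpopulation models through model groups and model levels, builds various types of regression models for model-assisted generalized regression estimation, and ultimately forms a generalized regression estimation system that lays a theoretical foundation for applied research on sampling estimation in China. Finally, using China's continuous multi-stage sample survey of agricultural output as an example, the paper gives a concrete regression estimation procedure, thereby demonstrating the system's practicality and application value.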

20.

An elastic net penalized small area model combining unit- and area-level data for regional hypertension prevalence estimation

J. P. Burgard J. Krause R. Münnich 《Journal of applied statistics》2021,48(9):1659

Hypertension is a highly prevalent cardiovascular disease. It marks a considerable cost factor to many national health systems. Despite its prevalence, regional disease distributions are often unknown and must be estimated from survey data. However, health surveys frequently lack in regional observations due to limited resources. Obtained prevalence estimates suffer from unacceptably large sampling variances and are not reliable. Small area estimation solves this problem by linking auxiliary data from multiple regions in suitable regression models. Typically, either unit- or area-level observations are considered for this purpose. But with respect to hypertension, both levels should be used. Hypertension has characteristic comorbidities and is strongly related to lifestyle features, which are unit-level information. It is also correlated with socioeconomic indicators that are usually measured on the area-level. But the level combination is challenging as it requires multi-level model parameter estimation from small samples. We use a multi-level small area model with level-specific penalization to overcome this issue. Model parameter estimation is performed via stochastic coordinate gradient descent. A jackknife estimator of the mean squared error is presented. The methodology is applied to combine health survey data and administrative records to estimate regional hypertension prevalence in Germany.