Similar documents
 Found 20 similar documents; search time 774 ms
1.
Summary. The number of people to select within selected households has significant consequences for the conduct and output of household surveys. The operational and data quality implications of this choice are carefully considered in many surveys, but the effect on statistical efficiency is not well understood. The usual approach is to select all people in each selected household, where operational and data quality concerns make this feasible. If not, one person is usually selected from each selected household. We find that this strategy is not always justified, and we develop intermediate designs between these two extremes. Current practices were developed when household survey field procedures needed to be simple and robust; however, more complex designs are now feasible owing to the increasing use of computer-assisted interviewing. We develop more flexible designs by optimizing survey cost, based on a simple cost model, subject to a required variance for an estimator of population total. The innovation lies in the fact that household sample sizes are small integers, which creates challenges in both design and estimation. The new methods are evaluated empirically by using census and health survey data, showing considerable improvement over existing methods in some cases.

2.
Summary. Traffic particle concentrations show considerable spatial variability within a metropolitan area. We consider latent variable semiparametric regression models for modelling the spatial and temporal variability of black carbon and elemental carbon concentrations in the greater Boston area. Measurements of these pollutants, which are markers of traffic particles, were obtained from several individual exposure studies that were conducted at specific household locations as well as 15 ambient monitoring sites in the area. The models allow for both flexible non-linear effects of covariates and for unexplained spatial and temporal variability in exposure. In addition, the different individual exposure studies recorded different surrogates of traffic particles, with some recording only outdoor concentrations of black or elemental carbon, some recording indoor concentrations of black carbon and others recording both indoor and outdoor concentrations of black carbon. A joint model for outdoor and indoor exposure that specifies a spatially varying latent variable provides greater spatial coverage in the area of interest. We propose a penalized spline formulation of the model that relates to generalized kriging of the latent traffic pollution variable and leads to a natural Bayesian Markov chain Monte Carlo algorithm for model fitting. We propose methods that allow us to control the degrees of freedom of the smoother in a Bayesian framework. Finally, we present results from an analysis that applies the model to data from summer and winter separately.

3.
A survey on health insurance was conducted in July and August of 2011 in three major cities in China. In this study, we analyze the household coverage rate, which is an important index of the quality of health insurance. The coverage rate is restricted to the unit interval [0, 1], and it may differ from other rate data in that the “two corners” are nonzero. That is, there are nonzero probabilities of zero and full coverage. Such data may also be encountered in economics, finance, medicine, and many other areas. The existing approaches may not be able to properly accommodate such data. In this study, we develop a three-part model that properly describes fractional response variables with non-ignorable zeros and ones. We investigate estimation and inference under two proportional constraints on the regression parameters. Such constraints may lead to more lucid interpretations and fewer unknown parameters and hence more accurate estimation. A simulation study is conducted to compare the performance of constrained and unconstrained models and show that estimation under constraint can be more efficient. The analysis of household health insurance coverage data suggests that household size, income, expense, and presence of chronic disease are associated with insurance coverage.

4.
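The urban-rural dual structure of China's economic development and the regional pattern of gradient development across the eastern, central and western regions inevitably affect residents' consumption behavior to some degree, and studying differences in consumption habit formation between urban and rural residents and across regions can reveal many policy-relevant facts. Drawing on the life-cycle consumption model with habit formation proposed by Dynan, the paper first combines it with a state-space model to obtain the habit parameters of urban and rural residents at each point in time; the empirical results reveal fairly significant differences in the paths of habit formation of urban and rural residents. Second, it combines the model with a panel data model to obtain habit formation parameters for residents of the eastern, central and western regions, revealing regional differences in consumption habit formation among Chinese residents. The analysis shows that consumption habit formation is related to the level of economic development, current and expected income, and the degree of marketization.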

5.
In this article we argue that the life-cycle model that allows demographics to affect household preferences and relaxes the assumption of certainty equivalence can generate hump-shaped consumption profiles over age that are very similar to those observed in household-level data sources and, in particular, match the differences in shape across different education groups. Liquidity constraints or myopia are not required to explain the empirical features of observed life-cycle patterns.

6.
Mixed models are regularly used in the analysis of clustered data, but are only recently being used for imputation of missing data. In household surveys where multiple people are selected from each household, imputation of missing values should preserve the structure pertaining to people within households and should not artificially change the apparent intracluster correlation (ICC). This paper focuses on the use of multilevel models for imputation of missing data in household surveys. In particular, the performance of a best linear unbiased predictor for both stochastic and deterministic imputation using a linear mixed model is compared to imputation based on a single level linear model, both with and without information about household respondents. In this paper an evaluation is carried out in the context of imputing hourly wage rate in the Household, Income and Labour Dynamics of Australia Survey. Nonresponse is generated under various assumptions about the missingness mechanism for persons and households, and with low, moderate and high intra‐household correlation to assess the benefits of the multilevel imputation model under different conditions. The mixed model and single level model with information about the household respondent lead to clear improvements when the ICC is moderate or high, and when there is informative missingness.

7.
This paper demonstrates the usefulness of nonparametric regression analysis for functional specification of household Engel curves.

After a brief review in section 2 of the literature on demand functions and equivalence scales and the functional specifications used, we first discuss in section 3 the issues of using income versus total expenditure, the origin and nature of the error terms in the light of utility theory, and the interpretation of empirical demand functions. We shall reach the unorthodox view that household demand functions should be interpreted as conditional expectations relative to prices, household composition and either income or the conditional expectation of total expenditure (rather than total expenditure itself), where the latter conditional expectation is taken relative to income, prices and household composition. These two forms appear to be equivalent. This result also solves the simultaneity problem: the error variance matrix is no longer singular. Moreover, the errors are in general heteroskedastic.

In section 4 we discuss the model and the data, and in section 5 we review the nonparametric kernel regression approach.

In section 6 we derive the functional form of our household Engel curves from nonparametric regression results, using the 1980 budget survey for the Netherlands, in order to avoid model misspecification. Thus the model is derived directly from the data, without restricting its functional form. The nonparametric regression results are then translated to suitable parametric functional specifications, i.e., we choose parametric functional forms in accordance with the nonparametric regression results. These parametric specifications are estimated by least squares, and various parameter restrictions are tested in order to simplify the models. This yields very simple final specifications of the household Engel curves involved, namely linear functions of income and the number of children in two age groups.

8.
Specification of household Engel curves by nonparametric regression   Cited by: 1 (self-citations: 0, citations by others: 1)
This paper demonstrates the usefulness of nonparametric regression analysis for functional specification of household Engel curves.

After a brief review in section 2 of the literature on demand functions and equivalence scales and the functional specifications used, we first discuss in section 3 the issues of using income versus total expenditure, the origin and nature of the error terms in the light of utility theory, and the interpretation of empirical demand functions. We shall reach the unorthodox view that household demand functions should be interpreted as conditional expectations relative to prices, household composition and either income or the conditional expectation of total expenditure (rather than total expenditure itself), where the latter conditional expectation is taken relative to income, prices and household composition. These two forms appear to be equivalent. This result also solves the simultaneity problem: the error variance matrix is no longer singular. Moreover, the errors are in general heteroskedastic.

In section 4 we discuss the model and the data, and in section 5 we review the nonparametric kernel regression approach.

In section 6 we derive the functional form of our household Engel curves from nonparametric regression results, using the 1980 budget survey for the Netherlands, in order to avoid model misspecification. Thus the model is derived directly from the data, without restricting its functional form. The nonparametric regression results are then translated to suitable parametric functional specifications, i.e., we choose parametric functional forms in accordance with the nonparametric regression results. These parametric specifications are estimated by least squares, and various parameter restrictions are tested in order to simplify the models. This yields very simple final specifications of the household Engel curves involved, namely linear functions of income and the number of children in two age groups.

9.
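Using data from the 2008 Chinese General Social Survey (CGSS), a multiple linear regression model is used to test the effects of objective social stratification variables, such as socioeconomic status, and subjective social stratification variables, such as class consciousness, on residents' self-rated health. The study shows that objective stratification variables (education level, personal annual income, household annual income) and subjective stratification variables (class consciousness, self-assessed household economic status) are significantly associated with the level of self-rated health. In addition, individual characteristics such as age, gender and political affiliation explain differences in residents' self-rated health well.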

10.
In this study, we propose a multivariate stochastic model for Web site visit duration, page views, purchase incidence, and the sale amount for online retailers. The model is constructed by composition from carefully selected distributions and involves copula components. It allows for the strong nonlinear relationships between the sales and visit variables to be explored in detail, and can be used to construct sales predictions. The model is readily estimated using maximum likelihood, making it an attractive choice in practice given the large sample sizes that are commonplace in online retail studies. We examine a number of top-ranked U.S. online retailers, and find that the visit duration and the number of pages viewed are both related to sales, but in very different ways for different products. Using Bayesian methodology, we show how the model can be extended to a finite mixture model to account for consumer heterogeneity via latent household segmentation. The model can also be adjusted to accommodate a more accurate analysis of online retailers like apple.com that sell products at a very limited number of price points. In a validation study across a range of different Web sites, we find that the purchase incidence and sales amount are both forecast more accurately using our model, when compared to regression, probit regression, a popular data-mining method, and a survival model employed previously in an online retail study. Supplementary materials for this article are available online.

11.
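Xu Aiting, Statistical Research (《统计研究》), 2008, 25(12): 79-84
The NOE has reached a considerable scale in China and has had a non-negligible impact on the normal operation of the national economy. This paper analyzes estimation approaches based on the income-expenditure discrepancy method. On the one hand, it examines the feasibility of applying the national accounts income-expenditure discrepancy method to estimating China's NOE; on the other hand, based on the laws governing the movement of social funds, it examines the sources and uses of funds of the household sector and proposes a method for estimating the size of the NOE from a single institutional sector, the micro discrepancy method. Finally, the new method is used to estimate the size of China's NOE for 1985-2006.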

12.
This study considers semiparametric spatial autoregressive models that allow for endogenous regressors, as well as the heterogenous effects of these regressors across spatial units. For the model estimation, we propose a semiparametric series generalized method of moments estimator. We establish that the proposed estimator is both consistent and asymptotically normal. As an empirical illustration, we apply the proposed model and method to Tokyo crime data to estimate how the existence of a neighborhood police substation (NPS) affects the household burglary rate. The results indicate that the presence of an NPS helps reduce household burglaries, and that the effects of some variables are heterogenous with respect to residential distribution patterns. Furthermore, we show that using a model that does not adjust for the endogeneity of NPS does not allow us to observe the significant relationship between NPS and the household burglary rate. Supplementary materials for this article are available online.

13.
Recurrent events involve the occurrences of the same type of event repeatedly over time and are commonly encountered in longitudinal studies. Examples include seizures in epileptic studies or occurrence of cancer tumors. In such studies, interest lies in the number of events that occur over a fixed period of time. One considerable challenge in analyzing such data arises when a large proportion of patients discontinues before the end of the study, for example, because of adverse events, leading to partially observed data. In this situation, data are often modeled using a negative binomial distribution with time‐in‐study as offset. Such an analysis assumes that data are missing at random (MAR). As we cannot test the adequacy of MAR, sensitivity analyses that assess the robustness of conclusions across a range of different assumptions need to be performed. Sophisticated sensitivity analyses for continuous data are being frequently performed. However, this is less the case for recurrent event or count data. We will present a flexible approach to perform clinically interpretable sensitivity analyses for recurrent event data. Our approach fits into the framework of reference‐based imputations, where information from reference arms can be borrowed to impute post‐discontinuation data. Different assumptions about the future behavior of dropouts dependent on reasons for dropout and received treatment can be made. The imputation model is based on a flexible model that allows for time‐varying baseline intensities. We assess the performance in a simulation study and provide an illustration with a clinical trial in patients who suffer from bladder cancer. Copyright © 2015 John Wiley & Sons, Ltd.

14.
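Sheng Laiyun et al., Statistical Research (《统计研究》), 2021, 38(11): 35-46
Household consumption is closely related to household demographic structure. Based on National Bureau of Statistics household survey data for 2018 and 2019, this paper builds random-effects and fixed-effects models on a balanced panel of individual households and, taking into account the expected future trends of China's population, empirically analyzes how changes in household demographic structure affect the household average consumption rate and the income elasticity of consumption along three dimensions: household age structure, urbanization status, and education level. The results show that a higher urbanization rate, better quality of urbanization and higher education levels help raise the household average consumption rate and the income elasticity of consumption; population aging has a negative effect on the household average consumption rate; and more proactive fertility policies can promote household consumption. The paper recommends continuing to advance people-centered new urbanization, giving priority to the development of education, continuing to optimize fertility policy, responding actively to population aging, tapping the consumption potential of the elderly population, and promoting a high-level domestic consumer market.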

15.
In recent years, there has been considerable interest in regression models based on zero-inflated distributions. These models are commonly encountered in many disciplines, such as medicine, public health, and environmental sciences, among others. The zero-inflated Poisson (ZIP) model has been typically considered for these types of problems. However, the ZIP model can fail if the non-zero counts are overdispersed in relation to the Poisson distribution, hence the zero-inflated negative binomial (ZINB) model may be more appropriate. In this paper, we present a Bayesian approach for fitting the ZINB regression model. This model considers that an observed zero may come from a point mass distribution at zero or from the negative binomial model. The likelihood function is utilized to compute not only some Bayesian model selection measures, but also to develop Bayesian case-deletion influence diagnostics based on q-divergence measures. The approach can be easily implemented using standard Bayesian software, such as WinBUGS. The performance of the proposed method is evaluated with a simulation study. Further, a real data set is analyzed, where we show that ZINB regression models seem to fit the data better than the Poisson counterpart.
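
The two sources of zeros described in this abstract can be made concrete with a small sketch of the ZINB probability mass function. This is only an illustration under stated assumptions, not the paper's Bayesian fitting procedure: the function names `nb_pmf` and `zinb_pmf`, the mean/size parameterization of the negative binomial, and the parameter values are all chosen here for the example.

```python
import math


def nb_pmf(k, mu, size):
    """Negative binomial pmf with mean `mu` and dispersion `size` (assumed parameterization)."""
    log_p = (
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(size / (size + mu))
        + k * math.log(mu / (size + mu))
    )
    return math.exp(log_p)


def zinb_pmf(k, pi_zero, mu, size):
    """Zero-inflated negative binomial pmf.

    With probability `pi_zero` the count is a structural zero (point mass at 0);
    otherwise it is drawn from the negative binomial, which can also yield 0.
    """
    if k == 0:
        return pi_zero + (1.0 - pi_zero) * nb_pmf(0, mu, size)
    return (1.0 - pi_zero) * nb_pmf(k, mu, size)


if __name__ == "__main__":
    # Illustrative values only: 30% structural zeros, NB mean 2.0, dispersion 1.5.
    for k in range(4):
        print(k, zinb_pmf(k, pi_zero=0.3, mu=2.0, size=1.5))
```

Setting `pi_zero` to 0 recovers the ordinary negative binomial, which is one way to see that the zero-inflated model nests the standard count model.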

16.
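Sun Yi, Wang Zheng, Statistical Research (《统计研究》), 2010, 27(10): 56-62
A multi-regional payment policy model and simulation system for China in the post-crisis setting is built to carry out policy simulations. The model is grounded in multi-regional economic theory, applies computable general equilibrium techniques and methods, and uses a social accounting matrix with 8 regions and 8 sectors. It accounts for population grouping and mobility, capital mobility and regional equilibrium mechanisms, so that labor and capital can move across regions and sectors, and regional disparities can be adjusted and measured through regional variables. Finally, the model is used to simulate three regional payment policy schemes for China (single-region and multi-region). The simulations show that when a local government increases payments to local residents, the lifetime cumulative utility of urban residents rises, the utility of rural residents in developed regions falls slightly, and that of rural residents in less developed regions rises slightly; local urban employment increases while rural employment decreases somewhat; and central government fiscal revenue from the region grows, while local government revenue falls because of the transfer expenditure.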

17.
The Weibull, log-logistic and log-normal distributions are extensively used to model time-to-event data. The Weibull family accommodates only monotone hazard rates, whereas the log-logistic and log-normal are widely used to model unimodal hazard functions. The increasing availability of lifetime data with a wide range of characteristics motivates us to develop more flexible models that accommodate both monotone and nonmonotone hazard functions. One such model is the exponentiated Weibull distribution which not only accommodates monotone hazard functions but also allows for unimodal and bathtub shape hazard rates. This distribution has demonstrated considerable potential in univariate analysis of time-to-event data. However, the primary focus of many studies is rather on understanding the relationship between the time to the occurrence of an event and one or more covariates. This leads to a consideration of regression models that can be formulated in different ways in survival analysis. One such strategy involves formulating models for the accelerated failure time family of distributions. The most commonly used distributions serving this purpose are the Weibull, log-logistic and log-normal distributions. In this study, we show that the exponentiated Weibull distribution is closed under the accelerated failure time family. We then formulate a regression model based on the exponentiated Weibull distribution, and develop large sample theory for statistical inference. We also describe a Bayesian approach for inference. Two comparative studies based on real and simulated data sets reveal that the exponentiated Weibull regression can be valuable in adequately describing different types of time-to-event data.
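
The hazard shapes mentioned in this abstract can be explored numerically. The sketch below assumes the common form of the exponentiated Weibull distribution, with CDF F(t) = [1 - exp(-(t/scale)^alpha)]^theta; the function name `ew_hazard` and the parameter values are illustrative choices made here, not taken from the paper.

```python
import math


def ew_hazard(t, alpha, theta, scale=1.0):
    """Hazard f(t) / (1 - F(t)) of the exponentiated Weibull distribution.

    Assumed CDF: F(t) = (1 - exp(-(t / scale) ** alpha)) ** theta, for t > 0.
    """
    u = (t / scale) ** alpha
    base = -math.expm1(-u)  # 1 - exp(-u), computed stably for small u
    density = (
        theta * (alpha / scale) * (t / scale) ** (alpha - 1.0)
        * math.exp(-u) * base ** (theta - 1.0)
    )
    survival = 1.0 - base ** theta
    return density / survival


if __name__ == "__main__":
    # theta = 1 reduces to the ordinary Weibull hazard (monotone).
    print([round(ew_hazard(t, alpha=2.0, theta=1.0), 3) for t in (0.5, 1.0, 1.5)])
    # With alpha = 3 and theta = 0.2 the hazard first falls and then rises.
    print([round(ew_hazard(t, alpha=3.0, theta=0.2), 3) for t in (0.05, 0.5, 2.0)])
```

With `theta = 1` the extra shape parameter drops out, so the second parameter is what lets the same family produce nonmonotone hazards.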

18.
We call a sample design that allows for different patterns, or sets, of data items to be collected from different sample units a Split Questionnaire Design (SQD). SQDs can be thought of as incorporating missing data into survey design. This paper examines the situation where data that are not collected by an SQD can be treated as Missing Completely At Random or Missing At Random, targets are regression coefficients in a generalised linear model fitted to binary variables, and targets are estimated using Maximum Likelihood. A key finding is that it can be easy to measure the relative contribution of a respondent to the accuracy of estimated model parameters before collecting all the respondent's model covariates. We show empirically and theoretically that we could achieve a significant reduction in respondent burden with a negligible impact on the accuracy of estimates by not collecting model covariates from respondents who we identify as contributing little to the accuracy of estimates. We discuss the general implications for SQDs.

19.
Tail probabilities are calculated by saddle-point approximation in a probabilistic-statistical model for the accumulated splice loss that results from a number of fusion splices in the installation of fibre-optic networks. When these probabilities, representing the risk of exceeding a specified total loss, can be controlled and kept low, the requirements on the individual losses can be substantially relaxed from their customary settings. As a consequence, it should be possible to save considerable installation time and cost. The probabilistic model, which can be theoretically motivated, states that the individual loss is basically exponentially distributed, but with a Gaussian contribution added and truncated at a set value, and that the loss is additive over splices. An extensive set of installation data fitted well with this model, except for occasional high losses. Therefore, the model described was extended to allow for a frequency of unspecified high losses of this sort. It is also indicated how the model parameters can be estimated from data.
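
The loss model described in this abstract (exponential plus Gaussian, truncated, additive over splices) can be sketched by simulation. Note that this uses plain Monte Carlo rather than the saddle-point approximation of the paper, and that the details are assumptions made for illustration: truncation is implemented as redrawing any loss above `limit`, the extension for occasional high losses is omitted, and all names and parameter values are hypothetical.

```python
import random


def draw_splice_loss(rng, exp_mean, noise_sd, limit):
    """One splice loss: exponential part plus Gaussian part, truncated at `limit`.

    Truncation is modeled as rejection (assumption): a splice whose loss exceeds
    `limit` is redone until the loss is acceptable.
    """
    while True:
        loss = rng.expovariate(1.0 / exp_mean) + rng.gauss(0.0, noise_sd)
        if loss <= limit:
            return loss


def tail_probability(n_splices, total_limit, exp_mean=0.03, noise_sd=0.01,
                     limit=0.1, n_sim=20000, seed=1):
    """Monte Carlo estimate of P(sum of n_splices losses > total_limit)."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_sim):
        total = sum(
            draw_splice_loss(rng, exp_mean, noise_sd, limit)
            for _ in range(n_splices)
        )
        if total > total_limit:
            exceed += 1
    return exceed / n_sim


if __name__ == "__main__":
    # Hypothetical setting: 10 splices, total loss budgets of 0.3 and 0.5.
    print(tail_probability(10, 0.3))
    print(tail_probability(10, 0.5))
```

Simulation of this kind becomes inefficient for very small tail probabilities, which is one reason an analytic approximation such as the saddle-point method is attractive in that regime.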

20.
Abstract. Much recent methodological progress in the analysis of infectious disease data has been due to Markov chain Monte Carlo (MCMC) methodology. In this paper, it is illustrated that rejection sampling can also be applied to a family of inference problems in the context of epidemic models, avoiding the issues of convergence associated with MCMC methods. Specifically, we consider models for epidemic data arising from a population divided into households. The models allow individuals to be potentially infected both from outside and from within the household. We develop methodology for selection between competing models via the computation of Bayes factors. We also demonstrate how an initial sample can be used to adjust the algorithm and improve efficiency. The data are assumed to consist of the final numbers ultimately infected within a sample of households in some community. The methods are applied to data taken from outbreaks of influenza.
