Similar literature
 Found 20 similar documents; search took 31 ms
1.
Small area estimation has received considerable attention in recent years because of growing demand for small area statistics. Basic area‐level and unit‐level models have been studied in the literature to obtain empirical best linear unbiased prediction (EBLUP) estimators of small area means. Although this classical method is useful for estimating the small area means efficiently under normality assumptions, it can be highly influenced by the presence of outliers in the data. In this article, the authors investigate the robustness properties of the classical estimators and propose a resistant method for small area estimation, which is useful for downweighting any influential observations in the data when estimating the model parameters. To estimate the mean squared errors of the robust estimators of small area means, a parametric bootstrap method is adopted here, which is applicable to models with block diagonal covariance structures. Simulations are carried out to study the behaviour of the proposed robust estimators in the presence of outliers, and these estimators are also compared to the EBLUP estimators. Performance of the bootstrap mean squared error estimator is also investigated in the simulation study. The proposed robust method is also applied to some real data to estimate crop areas for counties in Iowa, using farm‐interview data on crop areas and LANDSAT satellite data as auxiliary information. The Canadian Journal of Statistics 37: 381–399; 2009 © 2009 Statistical Society of Canada

2.
Survival data with missing censoring indicators are frequently encountered in biomedical studies. In this paper, we consider statistical inference for this type of data under the additive hazard model. Reweighting methods based on simple and augmented inverse probability are proposed. The asymptotic properties of the proposed estimators are established. Furthermore, we provide a numerical technique for checking adequacy of the fitted model with missing censoring indicators. Our simulation results show that the proposed estimators outperform the simple and augmented inverse probability weighted estimators without reweighting. The proposed methods are illustrated by analyzing a dataset from a breast cancer study.

3.
This paper deals with statistical inference on the parameters of a stochastic model, describing curved fibrous objects in three dimensions, that is based on multivariate autoregressive processes. The model is fitted to experimental data consisting of a large number of short independently sampled trajectories of multivariate autoregressive processes. We discuss relevant statistical properties (e.g. asymptotic behaviour as the number of trajectories tends to infinity) of the maximum likelihood (ML) estimators for such processes. Numerical studies are also performed to analyse some of the more intractable properties of the ML estimators. Finally the whole methodology, i.e., the fibre model and its statistical inference, is applied to appropriately describe the tracking of fibres in real materials.

4.
In this paper we suggest several nonparametric quantile estimators based on the Beta kernel. They are applied to data transformed by the generalized Champernowne distribution initially fitted to the data. A Monte Carlo study has shown that these estimators improve on the efficiency of the traditional ones, not only for light-tailed distributions but also for heavy-tailed ones, when the probability level is close to 1. We also compare these estimators with the Extreme Value Theory quantile applied to Danish data on large fire insurance losses.

5.
We construct nonparametric estimators of state waiting time distribution functions in a Markov multistate model using current status data. This is a particularly difficult problem since neither the entry nor the exit times of a given state are directly observed. These estimators are obtained, using the Markov property, from estimators of counting processes of state entry and exit times, as well as the sizes of the “at risk” sets of state entry and transitions out of that state. Consistency of our estimators is established. Finite-sample behavior of our estimators is studied by simulation, in which we show that our estimators based on current status data compare well with those based on complete data. We also illustrate our method using a pubertal development data set obtained from the NHANES III [1997. NHANES III Reference Manuals and Reports (CD-ROM). Analytic and Reporting Guidelines: The Third National Health and Nutrition Examination Survey (1988–94). National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD] study.

6.
Inverse probability weighting (IPW) can deal with confounding in non-randomized studies. The inverse weights are probabilities of treatment assignment (propensity scores), estimated by regressing assignment on predictors. Problems arise if predictors can be missing. Solutions previously proposed include assuming assignment depends only on observed predictors and multiple imputation (MI) of missing predictors. For the MI approach, it was recommended that missingness indicators be used with the other predictors. We determine when the two MI approaches (with and without missingness indicators) yield consistent estimators and compare their efficiencies. We find that, although including indicators can reduce bias when predictors are missing not at random, it can induce bias when they are missing at random. We propose a consistent variance estimator and investigate performance of the simpler Rubin’s Rules variance estimator. In simulations we find both estimators perform well. IPW is also used to correct bias when an analysis model is fitted to incomplete data by restricting to complete cases. Here, weights are inverse probabilities of being a complete case. We explain how the same MI methods can be used in this situation to deal with missing predictors in the weight model, and illustrate this approach using data from the National Child Development Survey.
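The weighting idea in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes the propensity scores have already been estimated and are fully observed, it ignores the missing-predictor and multiple-imputation issues the paper studies, and the function name `ipw_ate` and the normalized (Hajek-type) form of the weights are illustrative choices.

```python
def ipw_ate(outcomes, treated, propensity):
    """Normalized IPW estimate of the average treatment effect.

    outcomes:   observed outcome for each unit
    treated:    1 if the unit received treatment, 0 otherwise
    propensity: assumed-known probability of treatment for each unit
    """
    num_t = den_t = num_c = den_c = 0.0
    for y, t, p in zip(outcomes, treated, propensity):
        if t:
            w = 1.0 / p          # treated units weighted by 1 / P(treated)
            num_t += w * y
            den_t += w
        else:
            w = 1.0 / (1.0 - p)  # controls weighted by 1 / P(not treated)
            num_c += w * y
            den_c += w
    # Difference of the two weighted means
    return num_t / den_t - num_c / den_c
```

Units that were unlikely to receive the treatment they actually got receive larger weights, which is how the weighted sample is made to resemble a randomized one.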

7.
As the number of random variables for categorical data increases, the number of log-linear models that can be fitted to the data increases rapidly, so various model selection methods have been developed. However, we often find that the models chosen by different selection criteria do not coincide. In this paper, we propose a comparison method to test final models that are non-nested. The statistic of Cox (1961, 1962) is applied to log-linear models for testing non-nested models, and the Kullback-Leibler measure of closeness (Pesaran 1987) is explored. In log-linear models, pseudo estimators for the expectation and the variance of Cox's statistic are derived and shown to be consistent.

8.
Summary. We consider non-stationary spatiotemporal modelling in an investigation into karst water levels in western Hungary. A strong feature of the data set is the extraction of large amounts of water from mines, which caused the water levels to reduce until about 1990 when the mining ceased, and then the levels increased quickly. We discuss some traditional hydrogeological models which might be considered to be appropriate for this situation, and various alternative stochastic models. In particular, a separable space–time covariance model is proposed which is then deformed in time to account for the non-stationary nature of the lagged correlations between sites. Suitable covariance functions are investigated and then the models are fitted by using weighted least squares and cross-validation. Forecasting and prediction are carried out by using spatiotemporal kriging. We assess the performance of the method with one-step-ahead forecasting and make comparisons with naïve estimators. We also consider spatiotemporal prediction at a set of new sites. The new model performs favourably compared with the deterministic model and the naïve estimators, and the deformation by time shifting is worthwhile.

9.
In this paper, the semi-varying coefficient zero-inflated generalized Poisson model is discussed based on the penalized log-likelihood. All the coefficient functions are fitted by penalized splines (P-splines), and the expectation-maximization algorithm is used to derive these estimators. The estimation approach is rapid and computationally stable. Under some mild conditions, the consistency and the asymptotic normality of the resulting estimators are given. The score test statistic for the dispersion parameter is discussed based on the P-spline estimation. Both simulated and real data examples are used to illustrate our proposed methods.

10.
Clustered longitudinal data feature cross‐sectional associations within clusters, serial dependence within subjects, and associations between responses at different time points from different subjects within the same cluster. Generalized estimating equations are often used for inference with data of this sort since they do not require full specification of the response model. When data are incomplete, however, they require data to be missing completely at random unless inverse probability weights are introduced based on a model for the missing data process. The authors propose a robust approach for incomplete clustered longitudinal data using composite likelihood. Specifically, pairwise likelihood methods are described for conducting robust estimation with minimal model assumptions made. The authors also show that the resulting estimates remain valid for a wide variety of missing data problems including missing at random mechanisms and so in such cases there is no need to model the missing data process. In addition to describing the asymptotic properties of the resulting estimators, it is shown that the method performs well empirically through simulation studies for complete and incomplete data. Pairwise likelihood estimators are also compared with estimators obtained from inverse probability weighted alternating logistic regression. An application to data from the Waterloo Smoking Prevention Project is provided for illustration. The Canadian Journal of Statistics 39: 34–51; 2011 © 2010 Statistical Society of Canada

11.
The consistency of estimators in finite mixture models has been discussed under the topology of the quotient space obtained by collapsing the true parameter set into a single point. In this paper, we extend the results of Cheng and Liu (2001) to give conditions under which the maximum likelihood estimator (MLE) is strongly consistent in such a sense in finite mixture models with censored data. We also show that the fitted model tends to the true model under a weak condition as the sample size tends to infinity.

12.
In this article, we assume that the distribution of the error terms is skew t in two-way analysis of variance (ANOVA). The skew t distribution is very flexible for modeling symmetric and skew datasets, since it reduces to the well-known normal, skew normal, and Student's t distributions. We obtain the estimators of the model parameters by using the maximum likelihood (ML) and the modified maximum likelihood (MML) methodologies. We also propose new test statistics based on these estimators for testing the equality of the treatment and the block means and also the interaction effect. The efficiencies of the ML and the MML estimators and the power values of the test statistics based on them are compared with the corresponding normal theory results via a Monte Carlo simulation study. Simulation results show that the proposed methodologies are preferable. We also show that, as expected, the test statistics based on the ML estimators are more powerful than those based on the MML estimators. However, the power values of the test statistics based on the MML estimators are very close to those of the corresponding test statistics based on the ML estimators. At the end of the study, a real-life example is given to show the implementation of the proposed methodologies.

13.
A new generalized logarithmic series distribution (GLSD) with two parameters is proposed. The proposed model is flexible enough to describe short-tailed as well as long-tailed data. Some recurrence relations for its probabilities and the factorial moments are presented. These recurrence relations are utilized to obtain the minimum chi-square estimators for the parameters. Maximum likelihood estimators and some other estimators based on the first few moments and probabilities are also suggested. Asymptotic relative efficiency of some of these estimators is also obtained and compared. Two test statistics based on the minimum chi-square estimators for testing some hypotheses regarding the GLSD are proposed. The fit of the model and the application of the test statistics are exemplified by some data sets. Finally, a graphical method is suggested for differentiating between the ordinary logarithmic series distribution and the GLSD.

14.
A non-homogeneous hidden Markov model for precipitation occurrence   Cited by: 9 (self-citations: 0, citations by others: 9)
A non-homogeneous hidden Markov model is proposed for relating precipitation occurrences at multiple rain-gauge stations to broad scale atmospheric circulation patterns (the so-called 'downscaling problem'). We model a 15-year sequence of winter data from 30 rain stations in south-western Australia. The first 10 years of data are used for model development and the remaining 5 years are used for model evaluation. The fitted model accurately reproduces the observed rainfall statistics in the reserved data despite a shift in atmospheric circulation (and, consequently, rainfall) between the two periods. The fitted model also provides some useful insights into the processes driving rainfall in this region.

15.
The asymptotic distributions of squared and absolute residual autocorrelations for GARCH models estimated by M-estimators are derived. Two diagnostic tests are developed which can be used to check the adequacy of GARCH models fitted by using M-estimators. Simulation results show that the empirical sizes of both tests are close to the nominal size in most cases. The test based on absolute residual autocorrelations is found to have better power than the test based on squared residual autocorrelations. Our results reveal that there are estimators that can fit GARCH-type models better than the commonly used quasi-maximum likelihood estimator under non-normal errors. An application to a real data set is also presented.

16.
Li G, Wu TT. Statistica Sinica, 2010, 20(4): 1581-1607
In this article we study a semiparametric additive risks model (McKeague and Sasieni (1994)) for two-stage design survival data where accurate information is available only on second stage subjects, a subset of the first stage study. We derive two-stage estimators by combining data from both stages. Large sample inferences are developed. As a by-product, we also obtain asymptotic properties of the single stage estimators of McKeague and Sasieni (1994) when the semiparametric additive risks model is misspecified. The proposed two-stage estimators are shown to be asymptotically more efficient than the second stage estimators. They also demonstrate smaller bias and variance for finite samples. The developed methods are illustrated using small intestine cancer data from the SEER (Surveillance, Epidemiology, and End Results) Program.

17.
We study a group lasso estimator for the multivariate linear regression model that accounts for correlated error terms. A block coordinate descent algorithm is used to compute this estimator. We perform a simulation study with categorical data and multivariate time series data, typical settings with a natural grouping among the predictor variables. Our simulation studies show the good performance of the proposed group lasso estimator compared to alternative estimators. We illustrate the method on a time series data set of gene expressions.

18.
In this paper, a new small domain estimator for area-level data is proposed. The proposed estimator is driven by a real problem of estimating the mean price of habitation transactions at a regional level in a European country, using data collected from a longitudinal survey conducted by a national statistical office. At the desired level of inference, it is not possible to provide accurate direct estimates because the sample sizes in these domains are very small. An area-level model with a heterogeneous covariance structure of random effects assists the proposed combined estimator. This model is an extension of a model due to Fay and Herriot [5], but it integrates information across domains and over several periods of time. In addition, a modified method of estimation of variance components for time-series and cross-sectional area-level models is proposed by including the design weights. A Monte Carlo simulation, based on real data, is conducted to investigate the performance of the proposed estimators in comparison with other estimators frequently used in small area estimation problems. In particular, we compare the performance of these estimators with the estimator based on the Rao–Yu model [23]. The simulation study also assesses the performance of the modified variance component estimators in comparison with the traditional ANOVA method. Simulation results show that the proposed estimators perform better than the other estimators in terms of both precision and bias.
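For orientation, the shrinkage behind the basic Fay-Herriot area-level model can be sketched as follows. This shows only the standard single-period form with a known random-effect variance, not the heterogeneous, multi-period extension proposed in the paper; the function and argument names are hypothetical.

```python
def fay_herriot_blup(direct, sampling_var, synthetic, sigma_v2):
    """Composite estimate for one area under the basic Fay-Herriot model.

    direct:       direct survey estimate for the area
    sampling_var: sampling variance of the direct estimate (treated as known)
    synthetic:    regression-synthetic prediction x_i' beta for the area
    sigma_v2:     variance of the area random effect (assumed known here)
    """
    # Weight on the direct estimate: near 1 when the direct estimate is precise
    gamma = sigma_v2 / (sigma_v2 + sampling_var)
    return gamma * direct + (1.0 - gamma) * synthetic
```

With a small domain sample the sampling variance is large, so the weight moves toward the model-based prediction; replacing `sigma_v2` by an estimate gives the empirical (EBLUP) version.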

19.
Data from past time periods and temporal correlation are rich sources of information for estimating small area parameters at the current period. This paper investigates the use of unit-level temporal linear mixed models for estimating linear parameters. Two models are considered, with domain and domain-time random effects. The first model assumes time independence and the second one AR(1)-type time correlation. They are fitted by a Fisher-scoring algorithm that calculates the residual maximum likelihood estimators of the model parameters. Based on the introduced models, empirical best linear unbiased predictors of small area linear parameters are studied, and analytic estimators of their mean squared errors are proposed. Three simulation experiments are carried out to study the behaviour of the fitting algorithm, the small area predictors and the estimators of the mean squared error. Using data from the Spanish surveys of income and living conditions of 2004–2008, an application to the estimation of 2008 average normalized net annual incomes in Spanish provinces by sex is given.

20.