The likelihood of a generalized linear mixed model (GLMM) often involves high-dimensional integrals, which in general cannot be computed explicitly. When direct computation is not available, the method of simulated moments (MSM) is a fairly simple way to estimate the parameters of interest. In this research, we compared parametric bootstrap (PB) and nonparametric bootstrap (NPB) methods in estimating the standard errors of MSM estimators for GLMM. Simulation results show that when the group size is large, PB and NPB perform similarly; when the group size is medium, NPB performs better than PB in estimating standard errors of the mean.
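
The PB/NPB comparison above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' procedure: it assumes a Poisson model with a normal random intercept, and it swaps simulated moments for closed-form moments (which this particular model happens to have) so that the example stays short. The function names `mom_estimate`, `simulate` and `bootstrap_se` are hypothetical.

```python
import numpy as np

def mom_estimate(y):
    """Closed-form moment estimator of (beta, sigma2) for the assumed model
    y_ij | u_i ~ Poisson(exp(beta + u_i)), u_i ~ N(0, sigma2).
    y has shape (groups, group_size)."""
    n = y.shape[1]
    m1 = y.mean()  # estimates exp(beta + sigma2 / 2)
    row_sum = y.sum(axis=1)
    # Average of y_ij * y_ik over pairs j != k; estimates exp(2*beta + 2*sigma2).
    m2 = ((row_sum**2 - (y**2).sum(axis=1)) / (n * (n - 1))).mean()
    sigma2 = max(np.log(m2 / m1**2), 0.0) if m2 > 0 else 0.0
    beta = np.log(m1) - sigma2 / 2
    return np.array([beta, sigma2])

def simulate(beta, sigma2, groups, size, rng):
    """Draw clustered counts from the assumed random-intercept Poisson model."""
    u = rng.normal(0.0, np.sqrt(sigma2), size=(groups, 1))
    return rng.poisson(np.exp(beta + u), size=(groups, size))

def bootstrap_se(y, B, rng, parametric):
    """Bootstrap standard errors of (beta, sigma2)."""
    groups, size = y.shape
    beta_hat, sigma2_hat = mom_estimate(y)
    reps = []
    for _ in range(B):
        if parametric:
            # PB: simulate new data from the fitted model.
            yb = simulate(beta_hat, sigma2_hat, groups, size, rng)
        else:
            # NPB: resample whole groups with replacement.
            yb = y[rng.integers(0, groups, size=groups)]
        reps.append(mom_estimate(yb))
    return np.std(reps, axis=0, ddof=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = simulate(0.5, 0.25, groups=100, size=5, rng=rng)
    print("PB  SE:", bootstrap_se(y, 200, rng, parametric=True))
    print("NPB SE:", bootstrap_se(y, 200, rng, parametric=False))
```

Resampling whole groups keeps the within-group dependence intact without relying on the fitted model, whereas the parametric version is only as good as the assumed random-effects distribution.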

The generalized bootstrap is a parametric bootstrap method in which the underlying distribution function is estimated by fitting a generalized lambda distribution to the observed data. In this study, the generalized bootstrap is compared with the traditional parametric and non-parametric bootstrap methods in estimating quantiles at different levels, especially high quantiles. The performance of the three methods is evaluated in terms of coverage rate, average interval width and standard deviation of the width of the 95% bootstrap confidence intervals. Simulation results showed that the generalized bootstrap has overall better performance than the non-parametric bootstrap in high quantile estimation.
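
The three evaluation criteria named above (coverage rate, average width, standard deviation of width) are easy to compute once a set of intervals is available. The sketch below is an assumption-laden illustration: it uses only the non-parametric percentile bootstrap, with exponential data chosen so that the true quantile is known; fitting a generalized lambda distribution is not shown. The names `percentile_ci` and `evaluate` are hypothetical.

```python
import numpy as np

def percentile_ci(x, q, B, rng, level=0.95):
    """Non-parametric percentile bootstrap interval for the q-th quantile."""
    boot = rng.choice(x, size=(B, x.size), replace=True)
    stats = np.quantile(boot, q, axis=1)
    alpha = (1 - level) / 2
    return np.quantile(stats, [alpha, 1 - alpha])

def evaluate(true_value, intervals):
    """Coverage rate, mean width and SD of width for a set of intervals."""
    intervals = np.asarray(intervals, dtype=float)
    lo, hi = intervals[:, 0], intervals[:, 1]
    width = hi - lo
    covered = (lo <= true_value) & (true_value <= hi)
    return {
        "coverage": covered.mean(),
        "mean_width": width.mean(),
        "sd_width": width.std(ddof=1),
    }

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = 0.95
    true_q = -np.log(1 - q)  # 0.95 quantile of the Exponential(1) distribution
    cis = [percentile_ci(rng.exponential(size=100), q, 200, rng) for _ in range(100)]
    print(evaluate(true_q, cis))
```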

In survey sampling, policy decisions regarding the allocation of resources to sub‐groups of a population depend on reliable predictors of their underlying parameters. However, in some sub‐groups, called small areas due to small sample sizes relative to the population, the information needed for reliable estimation is typically not available. Consequently, data on a coarser scale are used to predict the characteristics of small areas. Mixed models are the primary tools in small area estimation (SAE) and also borrow information from alternative sources (e.g., previous surveys and administrative and census data sets). In many circumstances, small area predictors are associated with location. For instance, in the case of chronic disease or cancer, it is important for policy makers to understand spatial patterns of disease in order to determine small areas with high risk of disease and establish prevention strategies. The literature considering SAE with spatial random effects is sparse and mostly in the context of spatial linear mixed models. In this article, small area models are proposed for the class of spatial generalized linear mixed models to obtain small area predictors and corresponding second‐order unbiased mean squared prediction errors via Taylor expansion and a parametric bootstrap approach. The performance of the proposed approach is evaluated through simulation studies and application of the models to a real esophageal cancer data set from Minnesota, U.S.A. The Canadian Journal of Statistics 47: 426–437; 2019 © 2019 Statistical Society of Canada

A version of the nonparametric bootstrap that resamples entire subjects from the original data, called the case bootstrap, has been increasingly used for estimating the uncertainty of parameters in mixed‐effects models. It is usually applied to obtain more robust estimates of the parameters and more realistic confidence intervals (CIs). Alternative bootstrap methods, such as the residual bootstrap and the parametric bootstrap, which resample both random effects and residuals, have been proposed to better take into account the hierarchical structure of multi‐level and longitudinal data. However, few studies have been performed to compare these different approaches. In this study, we used simulation to evaluate bootstrap methods proposed for linear mixed‐effect models. We also compared the results obtained by maximum likelihood (ML) and restricted maximum likelihood (REML). Our simulation studies showed the good performance of the case bootstrap as well as the bootstraps of both random effects and residuals. On the other hand, the bootstrap methods that resample only the residuals and the bootstraps combining case and residuals performed poorly. REML and ML provided similar bootstrap estimates of uncertainty, but there was slightly more bias and poorer coverage rate for variance parameters with ML in the sparse design. We applied the proposed methods to a real dataset from a study investigating the natural evolution of Parkinson's disease and were able to confirm that the methods provide plausible estimates of uncertainty. Given that most real‐life datasets tend to exhibit heterogeneity in sampling schedules, the residual bootstraps would be expected to perform better than the case bootstrap. Copyright © 2013 John Wiley & Sons, Ltd.
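
As a rough sketch of the case bootstrap described above, the helper below resamples whole subjects from long-format data in which subjects may have different numbers of observations. The function name and the relabelling scheme are assumptions made for this illustration: a subject drawn twice is given two distinct new identifiers so that a mixed-model fit would treat the copies as separate subjects.

```python
import numpy as np

def case_bootstrap_indices(subject_ids, rng):
    """Resample whole subjects with replacement.

    Returns the row indices of the bootstrap sample and a fresh subject
    label for each row (repeated draws of one subject get distinct labels).
    """
    subject_ids = np.asarray(subject_ids)
    subjects = np.unique(subject_ids)
    drawn = rng.choice(subjects, size=subjects.size, replace=True)
    rows, new_ids = [], []
    for new_id, subject in enumerate(drawn):
        subject_rows = np.flatnonzero(subject_ids == subject)
        rows.append(subject_rows)  # keep every observation of the drawn subject
        new_ids.append(np.full(subject_rows.size, new_id))
    return np.concatenate(rows), np.concatenate(new_ids)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ids = np.array([1, 1, 1, 2, 2, 3])  # unbalanced sampling schedule
    rows, new_ids = case_bootstrap_indices(ids, rng)
    print(rows, new_ids)
```

The total number of rows can change from one replicate to the next when schedules are unbalanced, which is one place where heterogeneous sampling schedules affect the case bootstrap.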

It is well known that in a traditional outlier-free situation, the generalized quasi-likelihood (GQL) approach [B.C. Sutradhar, On exact quasilikelihood inference in generalized linear mixed models, Sankhya: Indian J. Statist. 66 (2004), pp. 261–289] performs very well in obtaining consistent and efficient estimates of the parameters involved in generalized linear mixed models (GLMMs). In this paper, we first examine the effect of the presence of one or more outliers on GQL estimation of the parameters in such GLMMs, especially in two important models, namely count and binary mixed models. The outliers appear to cause serious biases and hence inconsistency in the estimation. As a remedy, we then propose a robust GQL (RGQL) approach in order to obtain consistent estimates of the parameters in GLMMs in the presence of one or more outliers. An extensive simulation study is conducted to examine the consistency performance of the proposed RGQL approach.

The aim of the article is to identify the intraday seasonality in a wind speed time series. Following the traditional approach, the marginal probability law is Weibull and, consequently, we consider a seasonal Weibull law. A new estimation and decision procedure to estimate the intraday scale parameter of the seasonal Weibull law is presented. We also give statistical decision-making tools to decide whether to discard the trend parameter and to validate the seasonal model.

Longitudinal or clustered response data arise in many applications such as biostatistics, epidemiology and environmental studies. The repeated responses cannot in general be assumed to be independent. One method of analysing such data is by using the generalized estimating equations (GEE) approach. The current GEE method for estimating regression effects in longitudinal data focuses on the modelling of the working correlation matrix assuming a known variance function. However, correct choice of the correlation structure may not necessarily improve estimation efficiency for the regression parameters if the variance function is misspecified [Wang YG, Lin X. Effects of variance-function misspecification in analysis of longitudinal data. Biometrics. 2005;61:413–421]. In this connection two problems arise: finding a correct variance function and estimating the parameters of the chosen variance function. In this paper, we study the problem of estimating the parameters of the variance function assuming that the form of the variance function is known and then the effect of a misspecified variance function on the estimates of the regression parameters. We propose a GEE approach to estimate the parameters of the variance function. This estimation approach borrows the idea of Davidian and Carroll [Variance function estimation. J Amer Statist Assoc. 1987;82:1079–1091] by solving a nonlinear regression problem where residuals are regarded as the responses and the variance function is regarded as the regression function. A limited simulation study shows that the proposed method performs at least as well as the modified pseudo-likelihood approach developed by Wang and Zhao [A modified pseudolikelihood approach for analysis of longitudinal data. Biometrics. 2007;63:681–689]. Both these methods perform better than the GEE approach.

Estimation and prediction in generalized linear mixed models are often hampered by intractable high dimensional integrals. This paper provides a framework to solve this intractability, using asymptotic expansions when the number of random effects is large. To that end, we first derive a modified Laplace approximation when the number of random effects is increasing at a lower rate than the sample size. Second, we propose an approximate likelihood method based on the asymptotic expansion of the log-likelihood using the modified Laplace approximation which is maximized using a quasi-Newton algorithm. Finally, we define the second order plug-in predictive density based on a similar expansion to the plug-in predictive density and show that it is a normal density. Our simulations show that in comparison to other approximations, our method has better performance. Our methods are readily applied to non-Gaussian spatial data and as an example, the analysis of the rhizoctonia root rot data is presented.
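
To make the integral in question concrete, the sketch below applies the ordinary first-order Laplace approximation (not the modified version proposed in the paper) to the marginal log-likelihood of a single cluster and compares it with Gauss-Hermite quadrature. The Poisson model with one normal random intercept, the parameter values and the function names are all assumptions chosen for illustration.

```python
import math
import numpy as np

def log_joint(u, y, beta, sigma2):
    """log of p(y | u) * p(u) for y_j | u ~ Poisson(exp(beta + u)), u ~ N(0, sigma2)."""
    eta = beta + u
    loglik = np.sum(y * eta - np.exp(eta)) - sum(math.lgamma(v + 1) for v in y)
    return loglik - 0.5 * u**2 / sigma2 - 0.5 * math.log(2 * math.pi * sigma2)

def laplace_loglik(y, beta, sigma2, iters=50):
    """First-order Laplace approximation to log of the integral over u."""
    y = np.asarray(y, dtype=float)
    n, u = y.size, 0.0
    for _ in range(iters):  # Newton iterations for the mode of log_joint in u
        grad = y.sum() - n * math.exp(beta + u) - u / sigma2
        hess = -n * math.exp(beta + u) - 1.0 / sigma2
        u -= grad / hess
    hess = -n * math.exp(beta + u) - 1.0 / sigma2
    return log_joint(u, y, beta, sigma2) + 0.5 * math.log(2 * math.pi) - 0.5 * math.log(-hess)

def quadrature_loglik(y, beta, sigma2, nodes=40):
    """Gauss-Hermite quadrature for the same integral, used as a reference value."""
    y = np.asarray(y, dtype=float)
    x, w = np.polynomial.hermite.hermgauss(nodes)
    u = math.sqrt(2 * sigma2) * x  # change of variable for the N(0, sigma2) weight
    eta = beta + u[:, None]
    logf = (y * eta - np.exp(eta)).sum(axis=1) - sum(math.lgamma(v + 1) for v in y)
    return math.log(np.sum(w * np.exp(logf)) / math.sqrt(math.pi))

if __name__ == "__main__":
    y = [2, 0, 3, 1]
    print("Laplace   :", laplace_loglik(y, beta=0.3, sigma2=0.5))
    print("Quadrature:", quadrature_loglik(y, beta=0.3, sigma2=0.5))
```

Quadrature is practical here only because the integral is one-dimensional; with many correlated random effects, as in spatial models, the number of nodes grows exponentially with dimension, which is why Laplace-type expansions are attractive.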

The mixed model is defined. The exact posterior distribution for the fixed effect vector is obtained. The exact posterior distribution for the error variance is obtained. The exact posterior mean and variance of a Bayesian estimator for the variances of random effects are also derived. All computations are non-iterative and avoid numerical integrations.

Earlier investigations used a one-sided inequality to construct confidence regions for the variance ratios of balanced random models. In this study, confidence regions are based on a two-sided generalisation of this inequality and the results are illustrated by estimating the parameters of some elementary random models.

Categorical longitudinal data are frequently applied in a variety of fields, and are commonly fitted by generalized linear mixed models (GLMMs) and generalized estimating equations models. The cumulative logit is one of the useful link functions to deal with the problem involving repeated ordinal responses. To check the adequacy of the GLMMs with cumulative logit link function, two goodness-of-fit tests constructed by the unweighted sum of squared model residuals using numerical integration and bootstrap resampling technique are proposed. The empirical type I error rates and powers of the proposed tests are examined by simulation studies. The ordinal longitudinal studies are utilized to illustrate the application of the two proposed tests.

Under a unit-level bivariate linear mixed model, this paper introduces small area predictors of expenditure means and ratios, and derives approximations and estimators of the corresponding mean squared errors. For the considered model, the REML estimation method is implemented. Several simulation experiments, designed to analyze the behavior of the introduced fitting algorithm, predictors and mean squared error estimators, are carried out. An application to real data from the Spanish household budget survey illustrates the behavior of the proposed statistical methodology. The target is the estimation of means of food and non-food household annual expenditures and of ratios of food household expenditures by Spanish provinces.

We analyze the multivariate spatial distribution of plant species diversity, distributed across three ecologically distinct land uses: urban residential, urban non-residential, and desert. We model these data using a spatial generalized linear mixed model. Here plant species counts are assumed to be correlated within and among the spatial locations. We implement this model across the Phoenix metropolis and surrounding desert. Using a Bayesian approach, we utilized the Langevin–Hastings hybrid algorithm. Under a generalization of a spatial log-Gaussian Cox model, the log-intensities of the species count processes follow Gaussian distributions. The purely spatial components corresponding to these log-intensities are jointly modeled using a cross-convolution approach, in order to depict a valid cross-correlation structure. We observe that this approach yields non-stationarity of the model ensuing from different land use types. We obtain predictions of various measures of plant diversity including plant richness and the Shannon–Wiener diversity at observed locations. We also obtain a prediction framework for plant preferences in urban and desert plots.

In recent years much effort has been devoted to maximum likelihood estimation of generalized linear mixed models. Most of the existing methods use the EM algorithm, with various techniques in handling the intractable E-step. In this paper, a new implementation of a stochastic approximation algorithm with Markov chain Monte Carlo method is investigated. The proposed algorithm is computationally straightforward and its convergence is guaranteed. A simulation and three real data sets, including the challenging salamander data, are used to illustrate the procedure and to compare it with some existing methods. The results indicate that the proposed algorithm is an attractive alternative for problems with a large number of random effects or with high dimensional intractable integrals in the likelihood function.

The number of parameters mushrooms in a linear mixed effects (LME) model in the case of multivariate repeated measures data. Computation of these parameters is a real problem with the increase in the number of response variables or with the increase in the number of time points. The problem becomes more intricate and involved with the inclusion of additional random effects. A multivariate analysis is not possible in a small sample setting. We propose a method to estimate these many parameters in bits and pieces from baby models, by taking a subset of response variables at a time, and finally using these bits and pieces at the end to get the parameter estimates for the mother model, with all variables taken together. Applying this method one can calculate the fixed effects, the best linear unbiased predictions (BLUPs) for the random effects in the model, and also the BLUPs at each time of observation for each response variable, to monitor the effectiveness of the treatment for each subject. The proposed method is illustrated with an example of multiple response variables measured over multiple time points arising from a clinical trial in osteoporosis.

When employing generalized linear models, interest often focuses on estimation of odds ratios or relative risks. Additionally, researchers often make overall conclusions, requiring accurate estimation of a set of these quantities. Consequently, simultaneous estimation is warranted. Current simultaneous estimation methods only perform well in this setting when there are a very small number of comparisons and/or the sample size is relatively large. Additionally, the estimated quantities can have significant bias especially at small sample sizes. The proposed bounds: (1) perform well for a small or large number of comparisons, (2) exhibit improved performance over current methods for small to moderate sample sizes, (3) provide bias adjustment not reliant on asymptotics, and (4) avoid the infinite parameter estimates that can occur with maximum-likelihood estimators. Simulations demonstrate that the proposed bounds achieve the desired level of confidence at smaller sample sizes than previous methods.

An important problem in fitting local linear regression is the choice of the smoothing parameter. As the smoothing parameter becomes large, the estimator tends to a straight line, which is the least squares fit in the ordinary linear regression setting. This property may be used to assess the adequacy of a simple linear model. Motivated by Silverman's (1981) work in kernel density estimation, a suitable test statistic is the critical smoothing parameter where the estimate changes from nonlinear to linear, while linearity or nonlinearity requires a more precise judgment. We define the critical smoothing parameter through the approximate F-tests by Hastie and Tibshirani (1990). To assess the significance, the “wild bootstrap” procedure is used to replicate the data and the proportion of bootstrap samples which give a nonlinear estimate when using the critical bandwidth is obtained as the p-value. Simulation results show that the critical smoothing test is useful in detecting a wide range of alternatives.
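
The wild bootstrap step mentioned above can be sketched as follows. This is a simplified stand-in, not the critical smoothing test itself: the statistic is the drop in residual sum of squares when a quadratic term is added to a straight-line fit (an assumption replacing the local linear smoother and the critical bandwidth), and the bootstrap weights are Rademacher signs (also an assumption, since the abstract does not specify the weights). All function names are hypothetical.

```python
import numpy as np

def least_squares(X, y):
    """Return (RSS, fitted values, residuals) of an ordinary least squares fit."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    resid = y - fitted
    return resid @ resid, fitted, resid

def nonlinearity_stat(x, y):
    """Reduction in RSS from adding a quadratic term to the linear fit."""
    X1 = np.column_stack([np.ones_like(x), x])
    X2 = np.column_stack([X1, x**2])
    return least_squares(X1, y)[0] - least_squares(X2, y)[0]

def wild_bootstrap_pvalue(x, y, B, rng):
    """Wild bootstrap p-value for the null hypothesis of a straight-line mean."""
    X1 = np.column_stack([np.ones_like(x), x])
    _, fitted, resid = least_squares(X1, y)
    observed = nonlinearity_stat(x, y)
    exceed = 0
    for _ in range(B):
        signs = rng.choice([-1.0, 1.0], size=y.size)  # Rademacher weights
        # Replicate under the null: linear fit plus sign-flipped residuals,
        # which keeps each point's residual magnitude (heteroscedasticity).
        y_star = fitted + resid * signs
        exceed += nonlinearity_stat(x, y_star) >= observed
    return exceed / B

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.linspace(-1.0, 1.0, 80)
    y = 1.0 + 2.0 * x + 3.0 * x**2 + rng.normal(0.0, 0.3, size=x.size)
    print("p-value:", wild_bootstrap_pvalue(x, y, 500, rng))
```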

The purpose of this article is to obtain the jackknifed ridge predictors in linear mixed models and to examine the superiority of linear combinations of the jackknifed ridge predictors over the ridge, principal components regression, r–k class and Henderson's predictors in terms of bias, covariance matrix and mean square error criteria. Numerical analyses are considered to illustrate the findings and a simulation study is conducted to assess the performance of the jackknifed ridge predictors.

Data collected in various scientific fields are count data. One way to analyze such data is to compare the individual levels of the treatment factor using multiple comparisons. However, the measured individuals are often clustered – e.g. according to litter or rearing. This must be considered when estimating the parameters by a repeated measurement model. In addition, ignoring the overdispersion to which count data are prone leads to an increase of the type I error rate. We carry out simulation studies using several different data settings and compare different multiple contrast tests with parameter estimates from generalized estimation equations and generalized linear mixed models in order to observe coverage and rejection probabilities. We generate overdispersed, clustered count data in small samples as can be observed in many biological settings. We have found that the generalized estimation equations outperform generalized linear mixed models if the variance-sandwich estimator is correctly specified. Furthermore, generalized linear mixed models show problems with the convergence rate under certain data settings, although model implementations that are less affected exist. Finally, we use an example of genetic data to demonstrate the application of the multiple contrast test and the problems of ignoring strong overdispersion.

Patient dropout is a common problem in studies that collect repeated binary measurements. Generalized estimating equations (GEE) are often used to analyze such data. The dropout mechanism may be plausibly missing at random (MAR), i.e. unrelated to future measurements given covariates and past measurements. In this case, various authors have recommended weighted GEE with weights based on an assumed dropout model, or an imputation approach, or a doubly robust approach based on weighting and imputation. These approaches provide asymptotically unbiased inference, provided the dropout or imputation model (as appropriate) is correctly specified. Other authors have suggested that, provided the working correlation structure is correctly specified, GEE using an improved estimator of the correlation parameters (‘modified GEE’) show minimal bias. These modified GEE have not been thoroughly examined. In this paper, we study the asymptotic bias under MAR dropout of these modified GEE, the standard GEE, and also GEE using the true correlation. We demonstrate that all three methods are biased in general. The modified GEE may be preferred to the standard GEE and are subject to only minimal bias in many MAR scenarios but in others are substantially biased. Hence, we recommend the modified GEE be used with caution.

