When a two-level multilevel model (MLM) is used for repeated growth data, the individuals constitute level 2 and the successive measurements constitute level 1, which is nested within the individuals that make up level 2. The heterogeneity among individuals is represented by either the random-intercept or random-coefficient (slope) model. The variance components at level 1 involve serial effects and measurement errors under constant variance or heteroscedasticity. This study hypothesizes that missing serial effects and/or heteroscedasticity may bias the results obtained from two-level models. To illustrate this effect, we conducted two simulation studies, where the simulated data were based on the characteristics of an empirical mouse tumour data set. The results suggest that for repeated growth data with constant variance (measurement error) and misspecified serial effects (ρ > 0.3), the proportion of level-2 variation (intra-class correlation coefficient) increases with ρ and the two-level random-coefficient model is the minimum AIC (or AICc) model when compared with the fixed model, heteroscedasticity model, and random-intercept model. In addition, when the serial effect (ρ > 0.1) and heteroscedasticity are both misspecified, the two-level random-coefficient model is the minimum AIC (or AICc) model when compared with the fixed model and random-intercept model. This study demonstrates that missing serial effects and/or heteroscedasticity may indicate heterogeneity among individuals in repeated growth data (mixed or two-level MLM). This issue is critical in biomedical research.
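The quantities this abstract compares can be written down in a few lines. The sketch below is not taken from the paper: it assumes the usual definition of the intra-class correlation as the level-2 share of total variance, and the standard AIC and small-sample AICc formulas. The function and parameter names are illustrative.

```python
def intraclass_correlation(var_level2: float, var_level1: float) -> float:
    """Share of total variance attributable to level 2 (between individuals)."""
    total = var_level2 + var_level1
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_level2 / total


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * log_likelihood


def aicc(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """AIC with the usual small-sample correction term 2k(k+1)/(n-k-1)."""
    denom = n_obs - n_params - 1
    if denom <= 0:
        raise ValueError("AICc requires n_obs > n_params + 1")
    return aic(log_likelihood, n_params) + 2 * n_params * (n_params + 1) / denom


if __name__ == "__main__":
    # Hypothetical variance components and fit summaries, for illustration only.
    print(intraclass_correlation(3.0, 1.0))
    print(aic(-100.0, 3), aicc(-100.0, 3, 24))
```

Among candidate models fitted to the same data, the one with the smallest AIC (or AICc) is the "minimum AIC model" in the sense used above.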

This paper illustrates the use of multilevel statistical modelling of cross-classified data to explore interviewers' influence on survey non-response. The results suggest that the variability in whole household refusal and non-contact rates is due more to the influence of interviewers than to the influence of areas. The results from separate logistic regression models are compared with the results from multinomial models using a polytomous dependent variable (refusals, non-contacts and responses). Using the cross-classified multilevel approach allows us to estimate correlations between refusals and non-contacts, suggesting that interviewers who are good at reducing whole household refusals are also good at reducing whole household non-contacts.

A log-linear modelling approach is proposed for dealing with polytomous, unordered exposure variables in case-control epidemiological studies with matched pairs. Hypotheses concerning epidemiological parameters are shown to be expressible in terms of log-linear models for the expected frequencies of the case-by-control square concordance table representation of the matched data; relevant maximum likelihood estimates and goodness-of-fit statistics are presented. Possible extensions to account for ordered categorical risk factors and multiple controls are illustrated, and comparisons with previous work are discussed. Finally, the possibility of implementing the proposed method with GLIM is illustrated within the context of a data set already analyzed by other authors.

If the observations for fitting a polytomous logistic regression model satisfy certain normality assumptions, the maximum likelihood estimates of the regression coefficients are the discriminant function estimates. This article shows that these estimates, their unbiased counterparts, and associated test statistics for variable selection can be calculated using ordinary least squares regression techniques, thereby providing a convenient method for fitting logistic regression models in the normal case. Evidence is given indicating that the discriminant function estimates and test statistics merit wider use in nonnormal cases, especially in exploratory work on large data sets.

We discuss the use of latent variable models with observed covariates for computing response propensities for sample respondents. A response propensity score is often used to weight item and unit responders to account for item and unit non-response and to obtain adjusted means and proportions. In the context of attitude scaling, we discuss computing response propensity scores by using latent variable models for binary or nominal polytomous manifest items with covariates. Our models allow the response propensity scores to be found for several different items without refitting. They allow any pattern of missing responses for the items. If one prefers, it is possible to estimate population proportions directly from the latent variable models, so avoiding the use of propensity scores. Artificial data sets and a real data set extracted from the 1996 British Social Attitudes Survey are used to compare the various methods proposed.

The objective of this paper is to present a method which can accommodate certain types of missing data by using the quasi-likelihood function for the complete data. This method can be useful when we can make first and second moment assumptions only; in addition, it can be helpful when the EM algorithm applied to the actual likelihood becomes overly complicated. First we derive a loss function for the observed data using an exponential family density which has the same mean and variance structure as the complete data. This loss function is the counterpart of the quasi-deviance for the observed data. Then the loss function is minimized using the EM algorithm. The use of the EM algorithm guarantees a decrease in the loss function at every iteration. When the observed data can be expressed as a deterministic linear transformation of the complete data, or when data are missing completely at random, the proposed method yields consistent estimators. Examples are given for overdispersed polytomous data, linear random effects models, and linear regression with missing covariates. Simulation results for the linear regression model with missing covariates show that the proposed estimates are more efficient than estimates based on completely observed units, even when outcomes are bimodal or skewed.

Survival models assume that fates of individuals are independent, yet the robustness of this assumption has been poorly quantified. We examine how empirically derived estimates of the variance of survival rates are affected by dependency in survival probability among individuals. We used Monte Carlo simulations to generate known amounts of dependency among pairs of individuals and analyzed these data with Kaplan-Meier and Cormack-Jolly-Seber models. Dependency significantly increased these empirical variances as compared to theoretically derived estimates of variance from the same populations. Using resighting data from 168 pairs of black brant (Branta bernicla nigricans), we used a resampling procedure and program RELEASE to estimate empirical and mean theoretical variances. We estimated that the relationship between paired individuals caused the empirical variance of the survival rate to be 155% larger than the empirical variance for unpaired individuals. Monte Carlo simulations and use of this resampling strategy can provide investigators with information on how robust their data are to this common assumption of independent survival probabilities.
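The direction of the effect described here can be seen in a much simpler setting than the Kaplan-Meier and Cormack-Jolly-Seber analyses of the study. The sketch below is an assumption-laden simplification, not the authors' procedure: it treats survival over a single interval as a binary outcome with probability p, groups individuals into pairs, and lets the two members of each pair have correlated fates (correlation `pair_correlation`, a hypothetical parameter name). Under those assumptions the variance of the estimated survival rate is the independent-case variance multiplied by (1 + correlation).

```python
def variance_of_survival_rate(p: float, n_pairs: int,
                              pair_correlation: float = 0.0) -> float:
    """Variance of the observed survival proportion for 2 * n_pairs individuals.

    Assumes one survival interval, a common survival probability p, and a
    within-pair correlation of fates; different pairs are independent.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n_pairs <= 0:
        raise ValueError("n_pairs must be positive")
    n = 2 * n_pairs
    independent_variance = p * (1.0 - p) / n       # binomial variance of a proportion
    return independent_variance * (1.0 + pair_correlation)


if __name__ == "__main__":
    # Illustrative values only; 168 pairs matches the sample size quoted above.
    base = variance_of_survival_rate(0.8, 168)
    paired = variance_of_survival_rate(0.8, 168, pair_correlation=0.5)
    print(base, paired, paired / base)
```

In this one-interval model the inflation cannot exceed a factor of two, so it illustrates why ignoring dependency understates the variance rather than reproducing the magnitude reported for the brant data.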

Measures of sensitivity, predictive accuracy, and agreement are currently used to evaluate the efficiency of diagnostic tests reported on dichotomous scales. This paper presents a unified approach to the evaluation of diagnostic tests in terms of generalized indices of sensitivity, misclassification, predictive accuracy and inaccuracy, classification agreement and prediction agreement for polytomous measurement scales. It is sufficiently general to accommodate additional complications of study design factors such as multiple testing, known and unknown disease prevalence distributions, and multiple subpopulations defined by the cross-classification of independent factors. Estimation and hypothesis testing are developed within a general linear models approach to the analysis of categorical data from repeated measurement designs using weighted least squares computations. This methodology is illustrated within the context of data from a large community-based epidemiologic study of obstructive airways disease. Two diagnostic criteria for impaired lung function are compared on the basis of their generalized sensitivity and classification agreement measures. The outcomes of the tests are reported on the same three-point scale (normal, questionable, impaired) and are examined within several subpopulations determined by age and sex.

This paper addresses the problem of simultaneous variable selection and estimation in the random-intercepts model with the first-order lag response. This type of model is commonly used for analyzing longitudinal data obtained through repeated measurements on individuals over time. This model uses random effects to cover the intra-class correlation, and the first lagged response to address the serial correlation, which are two common sources of dependency in longitudinal data. We demonstrate that the conditional likelihood approach, which ignores the correlation between random effects and initial responses, can lead to biased regularized estimates. Furthermore, we demonstrate that joint modeling of initial responses and subsequent observations in the structure of dynamic random-intercepts models leads to both consistency and oracle properties of regularized estimators. We present theoretical results in both low- and high-dimensional settings and evaluate the performance of the regularized estimators by conducting simulation studies and analyzing a real dataset. Supporting information is available online.

Linear mixed models (LMM) are frequently used to analyze repeated measures data, because they offer flexibility in modelling the within-subject correlation often present in this type of data. The most popular LMM for continuous responses assumes that both the random effects and the within-subject errors are normally distributed, which can be an unrealistic assumption, obscuring important features of the variations present within and among the units (or groups). This work presents skew-normal linear mixed models (SNLMM) that relax the normality assumption by using a multivariate skew-normal distribution, which includes the normal distribution as a special case and provides robust estimation in mixed models. The MCMC scheme is derived and the results of a simulation study are provided demonstrating that standard information criteria may be used to detect departures from normality. The procedures are illustrated using a real data set from a cholesterol study.

The fiducial approach to the two components of variance random effects model developed by Venables and James (1978) is related to the Bayesian approach of Box and Tiao (1973). The operating characteristics, under repeated sampling, of the resulting interval estimators for the “within classes” variance component are investigated, and the behaviour of the two sets of intervals is found to be very similar, the coverage frequency of 95% probability intervals being approximately 91% when the “between classes” variance component is zero but rising rapidly to 95% as the between component increases. The probability intervals are shown to be shorter on average than a comparable confidence interval based upon the within classes sum of squares, and to be robust against nonnormality in the class means.

Many estimation procedures have been proposed for estimating variance components in unbalanced factorial models. A large proportion of these are based on the solution to a system of linear equations obtained from a set of quadratic forms and their expected value. This paper will present a numerical study of the small sample variance of eight variance component estimators of this type. The variances will be compared to the Bhattacharyya lower bound for unbiased estimators.

Clustered (longitudinal) count data arise in many bio-statistical practices in which a number of repeated count responses are observed on a number of individuals. The repeated observations may also represent counts over time from a number of individuals. One important problem that arises in practice is to test homogeneity within clusters (individuals) and between clusters (individuals). As data within clusters are observations of repeated responses, the count data may be correlated and/or over-dispersed. For over-dispersed count data with unknown over-dispersion parameter we derive two score tests by assuming a random intercept model within the framework of (i) the negative binomial mixed effects model and (ii) the double extended quasi-likelihood mixed effects model (Lee and Nelder, 2001). These two statistics are much simpler than a statistic derived by Jacqmin-Gadda and Commenges (1995) under the framework of the over-dispersed generalized linear model. The first statistic takes the over-dispersion more directly into the model and therefore is expected to do well when the model assumptions are satisfied and the other statistic is expected to be robust. Simulations show superior level property of the statistics derived under the negative binomial and double extended quasi-likelihood model assumptions. A data set is analyzed and a discussion is given.

Random effects models are considered for count data obtained in a cross or nested classification. The main feature of the proposed models is the use of the additive effects on the original scale in contrast to the commonly used log scale. The rationale behind this approach is given. The estimation of variance components is based on the usual mean square approach. Directly analogous results to those from the analysis of variance models for continuous data are obtained. The usual Poisson dispersion test procedure can be used not only to test for no overall random effects but also to assess the adequacy of the model. Individual variance components can be tested by using the usual F-test. To get a reliable estimate, a large number of factor levels seems to be required.
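For readers unfamiliar with the Poisson dispersion test mentioned here, a minimal sketch of the classical single-sample version follows. It is an illustration under stated assumptions, not the procedure of the paper: the counts are treated as one sample with a common mean, and the statistic is the sum of squared deviations divided by the sample mean, which is referred to a chi-squared distribution with n - 1 degrees of freedom under the Poisson hypothesis. The function name is illustrative.

```python
from typing import Sequence, Tuple


def poisson_dispersion_statistic(counts: Sequence[int]) -> Tuple[float, int]:
    """Return (statistic, degrees_of_freedom) for the index-of-dispersion test.

    statistic = sum((x_i - mean)^2) / mean, compared against chi-squared
    with len(counts) - 1 degrees of freedom.
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two counts")
    mean = sum(counts) / n
    if mean <= 0:
        raise ValueError("mean count must be positive")
    statistic = sum((x - mean) ** 2 for x in counts) / mean
    return statistic, n - 1


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(poisson_dispersion_statistic([2, 4, 6]))
```

A statistic well above its degrees of freedom points to extra-Poisson variation, which in a random effects setting is the signature of a non-zero variance component.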

Multi-level models can be used to account for clustering in data from multi-stage surveys. In some cases, the intraclass correlation may be close to zero, so that it may seem reasonable to ignore clustering and fit a single-level model. This article proposes several adaptive strategies for allowing for clustering in regression analysis of multi-stage survey data. The approach is based on testing whether the PSU-level variance component is zero. If this hypothesis is retained, then variance estimates are calculated ignoring clustering; otherwise, clustering is reflected in variance estimation. A simple simulation study is used to evaluate the various procedures.

Summary.  A common application of multilevel models is to apportion the variance in the response according to the different levels of the data. Whereas partitioning variances is straightforward in models with a continuous response variable with a normal error distribution at each level, the extension of this partitioning to models with binary responses or to proportions or counts is less obvious. We describe methodology due to Goldstein and co-workers for apportioning variance that is attributable to higher levels in multilevel binomial logistic models. This partitioning they referred to as the variance partition coefficient. We consider extending the variance partition coefficient concept to data sets when the response is a proportion and where the binomial assumption may not be appropriate owing to overdispersion in the response variable. Using the literacy data from the 1991 Indian census we estimate simple and complex variance partition coefficients at multiple levels of geography in models with significant overdispersion and thereby establish the relative importance of different geographic levels that influence educational disparities in India.
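One widely used way to obtain a variance partition coefficient for a two-level logistic model is the latent-variable formulation, in which the level-1 residual on the logit scale is taken to follow a standard logistic distribution with variance π²/3. The sketch below shows only that simple case; it is an assumption that this is the relevant variant, and it does not cover the overdispersed or multi-level extensions the abstract discusses. The function name is illustrative.

```python
import math


def vpc_logistic_latent(var_level2: float) -> float:
    """Variance partition coefficient under the latent-variable formulation.

    var_level2 is the higher-level (random intercept) variance on the logit
    scale; the level-1 variance is fixed at pi^2 / 3, the variance of the
    standard logistic distribution.
    """
    if var_level2 < 0:
        raise ValueError("variance must be non-negative")
    level1_variance = math.pi ** 2 / 3
    return var_level2 / (var_level2 + level1_variance)


if __name__ == "__main__":
    # Hypothetical random-intercept variance, for illustration only.
    print(vpc_logistic_latent(1.0))
```

Because the level-1 term is a constant, the coefficient here depends only on the higher-level variance, unlike the continuous-response case where both components are estimated.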

Edgeworth expansions are derived for conditional distributions of sufficient statistics as well as conditional maximum likelihood estimators of log odds ratios in logistic regression models assuming that the risk factors are not almost equally distanced. Expansions are given in several special cases. Similar results are obtained for models with polytomous outcomes.

Nonlinear mixed-effects models are very useful to analyze repeated measures data and are used in a variety of applications. Normal distributions for random effects and residual errors are usually assumed, but such assumptions make inferences vulnerable to the presence of outliers. In this work, we introduce an extension of a normal nonlinear mixed-effects model considering a subclass of elliptical contoured distributions for both random effects and residual errors. This elliptical subclass, the scale mixtures of normal (SMN) distributions, includes heavy-tailed multivariate distributions, such as Student-t, the contaminated normal and slash, among others, and represents an interesting alternative for accommodating outliers while maintaining the elegance and simplicity of the maximum likelihood theory. We propose an exact estimation procedure to obtain the maximum likelihood estimates of the fixed effects and variance components, using a stochastic approximation of the EM algorithm. We compare the performance of the normal and the SMN models with two real data sets.

Summary.  Penalized regression spline models afford a simple mixed model representation in which variance components control the degree of non-linearity in the smooth function estimates. This motivates the study of lack-of-fit tests based on the restricted maximum likelihood ratio statistic which tests whether variance components are 0 against the alternative of taking on positive values. For this one-sided testing problem a further complication is that the variance component belongs to the boundary of the parameter space under the null hypothesis. Conditions are obtained on the design of the regression spline models under which asymptotic distribution theory applies, and finite sample approximations to the asymptotic distribution are provided. Test statistics are studied for simple as well as multiple-regression models.
