Similar Documents
20 similar documents retrieved.
1.
Plausible values are typically used in large-scale assessment studies, in particular the Trends in International Mathematics and Science Study and the Programme for International Student Assessment. Despite their widespread use, there are still some questions regarding the use of plausible values and how such use affects statistical analyses. The aim of this paper is to demonstrate the role of plausible values in large-scale assessment surveys when multilevel modeling is used. Different user strategies concerning plausible values for multilevel models, as well as means and variances, are examined. The results show that some commonly used strategies give incorrect results, while others give reasonable estimates but incorrect standard errors. These findings are important for anyone wishing to carry out secondary analyses of large-scale assessment data, especially those interested in using multilevel models to analyze the data.
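
The strategy generally recommended in this literature is to run the analysis once per plausible value and then pool the results with Rubin's combining rules, rather than averaging the plausible values first. A minimal sketch in Python (variable names and values are illustrative, not from the paper):

```python
import numpy as np

def pool_plausible_values(estimates, variances):
    """Pool per-plausible-value results with Rubin's combining rules.

    estimates : point estimates, one per plausible value
    variances : squared standard errors, one per plausible value
    """
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)

    pooled_est = estimates.mean()                 # combined point estimate
    within = variances.mean()                     # average sampling variance
    between = estimates.var(ddof=1)               # variance across plausible values
    total_var = within + (1 + 1 / m) * between    # imputation-corrected total variance
    return pooled_est, np.sqrt(total_var)

# Example: a regression coefficient estimated separately with 5 plausible values.
est, se = pool_plausible_values([0.42, 0.45, 0.40, 0.44, 0.43],
                                [0.02 ** 2] * 5)
print(est, se)
```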

2.
Approaches that use the pseudolikelihood to perform multilevel modelling on survey data have been presented in the literature. To avoid biased estimates due to unequal selection probabilities, conditional weights can be introduced at each level. Less-biased estimators can also be obtained in a two-level linear model if the level-1 weights are scaled. In this paper, we studied several level-2 weights that can be introduced into the pseudolikelihood when the sampling design and the hierarchical structure of the multilevel model do not match. Two-level and three-level models were studied. The present work was motivated by a study that aims to estimate the contributions of lead sources to the contamination of interior floor dust in rooms within dwellings. We performed a simulation study using real data collected from a French survey to achieve our objective. We conclude that it is preferable to use unweighted analyses or, at most, conditional level-2 weights in a two-level or three-level model. We state some warnings and make some recommendations.

3.
This article evaluates two methods of approximating cluster-level and conditional sampling weights when only unconditional sampling weights are available. For estimation of a multilevel analysis that does not include all facets of a sampling design, conditional sampling weights at each stage of the model should be used, but typically only the unconditional sampling weight of the ultimate sampling unit is provided on federal publicly released datasets. Methods of approximating these conditional weights have been suggested, but there has been no study of their adequacy. This demonstration and simulation study examines the feasibility of using these weight approximations.
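
The specific approximations evaluated in the article are not reproduced here, but the general idea is to derive a cluster-level weight from the unconditional ultimate-unit weights and then back out a conditional within-cluster weight. A sketch under the assumption (mine, not the article's) that the cluster weight is approximated by the within-cluster mean of the unit weights:

```python
import pandas as pd

# Toy data: unconditional weights w_ij for students nested in schools.
df = pd.DataFrame({
    "school": [1, 1, 1, 2, 2, 3, 3, 3, 3],
    "w_unconditional": [120.0, 110.0, 130.0, 300.0, 280.0, 90.0, 95.0, 85.0, 100.0],
})

# Approximate the level-2 (school) weight by the mean unconditional weight in the
# cluster -- one of several possible approximations; the true conditional weights
# would require the school selection probabilities, which public-use files omit.
df["w_school"] = df.groupby("school")["w_unconditional"].transform("mean")

# Approximate conditional level-1 weight: unconditional weight / cluster weight.
df["w_within"] = df["w_unconditional"] / df["w_school"]
print(df)
```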

4.
The authors consider the construction of weights for Generalised M‐estimation. Such weights, when combined with appropriate score functions, afford protection from biases arising through incorrectly specified response functions, as well as from natural variation. The authors obtain minimax fixed weights of the Mallows type under the assumption that the density of the independent variables is correctly specified, and they obtain adaptive weights when this assumption is relaxed. A simulation study indicates that one can expect appreciable gains in precision when the latter weights are used and the various sources of model uncertainty are present.

5.
Multilevel modelling is sometimes used for data from complex surveys involving multistage sampling, unequal sampling probabilities and stratification. We consider generalized linear mixed models and particularly the case of dichotomous responses. A pseudolikelihood approach for accommodating inverse probability weights in multilevel models with an arbitrary number of levels is implemented by using adaptive quadrature. A sandwich estimator is used to obtain standard errors that account for stratification and clustering. When level 1 weights are used that vary between elementary units in clusters, the scaling of the weights becomes important. We point out that not only variance components but also regression coefficients can be severely biased when the response is dichotomous. The pseudolikelihood methodology is applied to complex survey data on reading proficiency from the American sample of the Program for International Student Assessment 2000 study, using the Stata program gllamm, which can estimate a wide range of multilevel and latent variable models. Performance of pseudo-maximum-likelihood estimation with different methods for handling level 1 weights is investigated in a Monte Carlo experiment. Pseudo-maximum-likelihood estimators of (conditional) regression coefficients perform well for large cluster sizes but are biased for small cluster sizes. In contrast, estimators of marginal effects perform well in both situations. We conclude that caution must be exercised in pseudo-maximum-likelihood estimation for small cluster sizes when level 1 weights are used.
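
Two level-1 weight scalings that recur in this literature are "cluster-size" scaling (scaled weights sum to the cluster sample size) and "effective-sample-size" scaling (scaled weights sum to the effective cluster size). A rough sketch, assuming the raw conditional level-1 weights for one cluster are available as an array:

```python
import numpy as np

def scale_level1_weights(w, method="cluster"):
    """Rescale level-1 weights within one cluster.

    method="cluster":   scaled weights sum to the cluster sample size n_j.
    method="effective": scaled weights sum to the effective sample size
                        (sum w)^2 / sum(w^2).
    """
    w = np.asarray(w, dtype=float)
    if method == "cluster":
        factor = len(w) / w.sum()
    elif method == "effective":
        factor = w.sum() / np.sum(w ** 2)
    else:
        raise ValueError("unknown scaling method")
    return w * factor

w_students = [1.5, 2.0, 2.5, 4.0]   # raw conditional level-1 weights in one school
print(scale_level1_weights(w_students, "cluster"))    # sums to 4
print(scale_level1_weights(w_students, "effective"))  # sums to (sum w)^2 / sum(w^2)
```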

6.
When multilevel models are estimated from survey data derived using multistage sampling, unequal selection probabilities at any stage of sampling may induce bias in standard estimators, unless the sources of the unequal probabilities are fully controlled for in the covariates. This paper proposes alternative ways of weighting the estimation of a two-level model by using the reciprocals of the selection probabilities at each stage of sampling. Consistent estimators are obtained when both the sample number of level 2 units and the sample number of level 1 units within sampled level 2 units increase. Scaling of the weights is proposed to improve the properties of the estimators and to simplify computation. Variance estimators are also proposed. In a limited simulation study the scaled weighted estimators are found to perform well, although non-negligible bias starts to arise for informative designs when the sample number of level 1 units becomes small. The variance estimators perform extremely well. The procedures are illustrated using data from the survey of psychiatric morbidity.

7.
Despite their popularity and importance, there is limited work on using finite mixture models for data arising from complex survey designs. In this work, we explored the use of finite mixture regression models when the samples were drawn using a complex survey design. In particular, we considered modelling data collected under a stratified sampling design. We developed a new design-based inference in which the sampling weights are integrated into the complete-data log-likelihood function, and the expectation–maximisation algorithm was developed accordingly. A simulation study was conducted to compare the new methodology with the usual finite mixture of regressions model; the comparison was made using the bias and variance components of mean squared error. Additionally, a simulation study was conducted to assess the ability of the Bayesian information criterion to select the optimal number of components under the proposed modelling approach. The methodology was implemented on real data with good results.
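
The core design-based modification is to multiply each unit's contribution to the complete-data log-likelihood by its sampling weight, so the E-step responsibilities are unchanged while every M-step quantity becomes a weighted sum. A compressed sketch of one EM iteration for a two-component mixture of linear regressions (my own illustration; the article's algorithm may differ in detail):

```python
import numpy as np
from scipy.stats import norm

def weighted_em_step(y, X, w, beta, sigma, pi):
    """One EM iteration for a K-component mixture of linear regressions, with each
    unit's complete-data log-likelihood contribution multiplied by its sampling
    weight w (design-based pseudolikelihood)."""
    K = len(pi)
    # E-step: responsibilities from current parameters (sampling weights do not enter).
    dens = np.column_stack([pi[k] * norm.pdf(y, X @ beta[k], sigma[k]) for k in range(K)])
    resp = dens / dens.sum(axis=1, keepdims=True)

    # M-step: every sufficient statistic is a (sampling weight x responsibility) sum.
    new_beta, new_sigma, new_pi = [], [], []
    for k in range(K):
        wk = w * resp[:, k]
        XtW = X.T * wk                                   # X' diag(wk)
        bk = np.linalg.solve(XtW @ X, XtW @ y)           # weighted least squares
        resid = y - X @ bk
        new_beta.append(bk)
        new_sigma.append(np.sqrt(np.sum(wk * resid ** 2) / wk.sum()))
        new_pi.append(wk.sum() / w.sum())                # weighted mixing proportion
    return np.array(new_beta), np.array(new_sigma), np.array(new_pi)

# Toy usage: two components, intercept-plus-slope design, unequal sampling weights.
rng = np.random.default_rng(0)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=n)])
z = rng.integers(0, 2, size=n)                           # latent component labels
y = np.where(z == 0, X @ [1.0, 2.0], X @ [-1.0, 0.5]) + rng.normal(scale=0.5, size=n)
w = rng.uniform(0.5, 3.0, size=n)                        # stand-in sampling weights

beta, sigma, pi = np.array([[0.5, 1.0], [-0.5, 0.0]]), np.array([1.0, 1.0]), np.array([0.5, 0.5])
for _ in range(50):
    beta, sigma, pi = weighted_em_step(y, X, w, beta, sigma, pi)
print(beta, sigma, pi)
```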

8.
We address the task of choosing prior weights for models that are to be used for weighted model averaging. Models that are very similar should usually be given smaller weights than models that are quite distinct. Otherwise, the importance of a model in the weighted average could be increased by augmenting the set of models with duplicates of the model or virtual duplicates of it. Similarly, the importance of a particular model feature (a certain covariate, say) could be exaggerated by including many models with that feature. Ways of forming a correlation matrix that reflects the similarity between models are suggested. Then, weighting schemes are proposed that assign prior weights to models on the basis of this matrix. The weighting schemes give smaller weights to models that are more highly correlated. Other desirable properties of a weighting scheme are identified, and we examine the extent to which these properties are held by the proposed methods. The weighting schemes are applied to real data, and prior weights, posterior weights and Bayesian model averages are determined. For these data, empirical Bayes methods were used to form the correlation matrices that yield the prior weights. Predictive variances are examined, as empirical Bayes methods can result in unrealistically small variances.

9.
In this paper, some extended Rasch models are analyzed in the presence of longitudinal measurements of a latent variable. Two main approaches, multidimensional and multilevel, are compared: we investigate the different information that can be obtained from the latent variable, and we give advice on the use of the different kinds of models. The multidimensional and multilevel approaches are illustrated with a simulation study and with a longitudinal study on the health-related quality of life in terminal cancer patients.

10.
A concept of adaptive least squares polynomials is introduced for modelling time series data. A recursion algorithm for updating the coefficients of the adaptive polynomial (of a fixed degree) is derived. This concept assumes that the weights w are such that (i) the relative importance of the data values, in terms of their weights, stays fixed, and (ii) they satisfy the update property, i.e., the polynomial does not change if a new data value is a polynomial extrapolate. Closed-form results are provided for exponential weights as a special case, as they are shown to possess the update property when used with polynomials.

The concept of adaptive polynomials is similar to the linear adaptive prediction provided by the Kalman filter or the least-mean-squares algorithm of Widrow and Hoff. Adaptive polynomials can be useful for interpolating, tracking and analyzing nonstationary data.
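
Exponential weights with a forgetting factor are the standard way to make a least-squares polynomial fit adapt to new data without refitting from scratch; the recursion coincides with recursive least squares on a monomial basis. A sketch of that idea (not the paper's exact recursion) for tracking a quadratic trend:

```python
import numpy as np

class AdaptivePolynomial:
    """Degree-d polynomial fit updated recursively with exponential forgetting.

    Ordinary recursive least squares with forgetting factor lam applied to the
    basis [1, t, t^2, ...]; older observations are down-weighted by powers of lam.
    """
    def __init__(self, degree, lam=0.95, delta=1e3):
        self.degree = degree
        self.lam = lam
        self.theta = np.zeros(degree + 1)        # polynomial coefficients
        self.P = delta * np.eye(degree + 1)      # inverse information matrix

    def update(self, t, y):
        x = np.array([t ** k for k in range(self.degree + 1)])
        Px = self.P @ x
        gain = Px / (self.lam + x @ Px)
        self.theta = self.theta + gain * (y - x @ self.theta)
        self.P = (self.P - np.outer(gain, Px)) / self.lam
        return self.theta

    def predict(self, t):
        return sum(c * t ** k for k, c in enumerate(self.theta))

# Track a noisy quadratic signal arriving one observation at a time.
rng = np.random.default_rng(0)
poly = AdaptivePolynomial(degree=2, lam=0.97)
for t in np.linspace(0.0, 10.0, 200):
    y = 1.0 + 0.5 * t + 0.05 * t ** 2 + rng.normal(scale=0.1)
    poly.update(t, y)
print(poly.theta)
```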

11.
In the social sciences, applied researchers often face a statistical dilemma when multilevel data are structured such that lower-level units are not purely clustered within higher-level units. To aid applied researchers in appropriately analyzing such data structures, this study proposes a multiple membership growth curve model (MM-GCM). The MM-GCM offers some advantages over other similar modeling approaches, including greater flexibility in modeling the intercept at the time point most desired for interpretation. A real longitudinal dataset from the field of education with a multiple membership structure, where some students changed schools over time, was used to demonstrate the application of the MM-GCM. Baseline and conditional MM-GCMs are presented, and parameter estimates were compared with two other common approaches to handling such data structures: the final-school GCM, which ignores mobility by modeling only the final school attended, and the delete-GCM, which deletes mobile students. Additionally, a simulation study was conducted to further assess the impact of ignoring mobility on parameter estimates. The results indicate that ignoring mobility results in substantial bias in model estimates, especially for cluster-level coefficients and variance components.
Keywords: HLM, growth curve model, multiple membership, mobility

12.
When using latent growth modeling (LGM), researchers often restrict the factor loadings, whereas multilevel modeling (MLM) treats time as a metric variable. However, when individually varying times of observation are present in longitudinal studies, the use of fixed, pre-specified loadings can lead to inaccurate estimation. Based on piecewise growth modeling (PGM), this simulation study showed that (i) individually varying times of observation with larger boundaries yielded worse estimates and model fit when LGM was used; (ii) estimation of the PGM was robust within MLM across all simulation conditions, whereas LGM matched the MLM estimates only when the time boundaries were ±1 month or shorter; and (iii) a larger change of slope in the piecewise model was associated with better estimation.
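
A one-knot piecewise growth model gives each phase its own slope; with individually varying observation times this only requires computing the two slope predictors from the actual measurement times. A small sketch of that time coding (the knot placement is illustrative):

```python
import numpy as np

def piecewise_time_codes(t, knot):
    """Return the two slope predictors of a one-knot piecewise growth model.

    phase1 grows until the knot and is flat afterwards; phase2 is zero before
    the knot and grows afterwards. Works with individually varying times t.
    """
    t = np.asarray(t, dtype=float)
    phase1 = np.minimum(t, knot)
    phase2 = np.maximum(t - knot, 0.0)
    return phase1, phase2

# Individually varying measurement occasions (in months) for one child, knot at 12.
t_obs = np.array([0.3, 5.8, 11.6, 18.2, 24.9])
print(piecewise_time_codes(t_obs, knot=12.0))
```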

13.
The standard methods for analyzing data arising from a 'thorough QT/QTc study' are based on multivariate normal models with a common variance structure for both drug and placebo. Such modeling assumptions may be violated, and when sample sizes are small, statistical inference can be sensitive to these stringent assumptions. This article proposes a flexible class of parametric models to address the above-mentioned limitations of the currently used models. A Bayesian methodology is used for data analysis, and models are compared using the deviance information criterion. Superior performance of the proposed models over the current models is illustrated with a real dataset obtained from a 'thorough QT/QTc study' conducted by GlaxoSmithKline (GSK).

14.
In this article, we study the approximately unbiased multilevel pseudo maximum likelihood (MPML) estimation method for general multilevel modeling with sampling weights. We conduct a simulation study to determine the effect various factors have on the estimation method. The factors included in this study are the scaling method, size of clusters, invariance of selection, informativeness of selection, intraclass correlation, and variability of the standardized weights. The scaling method indicates how the weights are normalized at each level. The invariance of selection indicates whether or not the same selection mechanism is applied across clusters. The informativeness of selection indicates how biased the selection is. We summarize our findings and recommend a multi-stage procedure based on the MPML method that can be used in practical applications.

15.
Spatial econometric models estimated on big geo-located point data face at least two problems: limited computational capacity and inefficient forecasting for new out-of-sample geo-points. This is because the spatial weights matrix W is defined for in-sample observations only, and because of the computational complexity. Machine learning models that use kriging for prediction suffer from the same limitation, so the problem has remained unsolved. The paper presents a novel methodology for estimating spatial models on big data and predicting in new locations. The approach uses bootstrap and tessellation to calibrate both the model and the space. The best bootstrapped model is selected with the PAM (Partitioning Around Medoids) algorithm by classifying the regression coefficients jointly, in a non-independent manner. Voronoi polygons for the geo-points used in the best model allow for a representative division of space. New out-of-sample points are assigned to tessellation tiles and linked to the spatial weights matrix as replacements for the original points, which makes it feasible to use the calibrated spatial models as a forecasting tool for new locations. There is no trade-off between forecast quality and computational efficiency in this approach. An empirical example illustrates a model for business locations and firms' profitability.
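
Because a Voronoi tessellation assigns any location to its nearest generating point, linking a new out-of-sample observation to a tile reduces to a nearest-neighbour lookup over the calibration points. A minimal sketch with scipy (the coordinates and the set of "seed" points are toy data, not from the paper):

```python
import numpy as np
from scipy.spatial import cKDTree

# Geo-points used by the calibrated (best bootstrapped) model: one per Voronoi tile.
rng = np.random.default_rng(1)
seed_points = rng.uniform(0.0, 100.0, size=(500, 2))   # (x, y) coordinates
tree = cKDTree(seed_points)

# New out-of-sample locations to forecast for.
new_points = rng.uniform(0.0, 100.0, size=(10, 2))

# Each new point inherits the tile (and hence the spatial-weights row and the
# calibrated coefficients) of its nearest seed point.
_, tile_index = tree.query(new_points)
print(tile_index)
```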

16.
Zero-inflated count models are increasingly employed in many fields when data exhibit "zero-inflation". In modeling road traffic crashes, they have also been shown to be useful in obtaining a better model fit when zero crash counts are over-represented. However, the general specification of zero-inflated models cannot account for the multilevel structure of crash data, which may be an important source of over-dispersion. This paper examines zero-inflated Poisson regression with site-specific random effects (REZIP), in comparison with a random-effect Poisson model and a standard zero-inflated Poisson model. A practical and flexible procedure, using Bayesian inference with a Markov chain Monte Carlo algorithm and cross-validation predictive density techniques, is applied for model calibration and suitability assessment. Using crash data from Singapore (1998–2005), the illustrative results demonstrate that the REZIP model may significantly improve the model fit and predictive performance of crash prediction models. This improvement can contribute to traffic safety management and engineering practices such as countermeasure design and the safety evaluation of traffic treatments.
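
The zero-inflated Poisson mixes a point mass at zero (probability pi) with a Poisson count (mean lam); the random-effects extension in the article adds a site-specific effect to log lam, which is not shown here. A minimal sketch of the basic ZIP log-likelihood:

```python
import numpy as np
from scipy.stats import poisson

def zip_loglik(y, lam, pi):
    """Log-likelihood of a zero-inflated Poisson model.

    P(Y = 0)        = pi + (1 - pi) * exp(-lam)
    P(Y = k), k > 0 = (1 - pi) * Poisson(k; lam)
    """
    y = np.asarray(y)
    p_zero = pi + (1.0 - pi) * np.exp(-lam)
    ll = np.where(y == 0,
                  np.log(p_zero),
                  np.log(1.0 - pi) + poisson.logpmf(y, lam))
    return ll.sum()

# Crash counts for a set of road sites, many of them zero.
counts = np.array([0, 0, 0, 1, 0, 2, 0, 0, 3, 0])
print(zip_loglik(counts, lam=1.2, pi=0.4))
```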

17.
We study nonparametric estimation of the illness-death model using left-truncated and right-censored data. The general aim is to estimate the multivariate distribution of a progressive multi-state process. Maximum likelihood estimation under censoring suffers from problems of uniqueness and consistency, so instead we review and extend methods that are based on inverse probability weighting. For univariate left-truncated and right-censored data, nonparametric maximum likelihood estimation can be considerably improved by exploiting knowledge of the truncation distribution. We examine the gain from using such knowledge for inverse probability weighting estimators in the illness-death framework. Additionally, we compare the weights that use the truncation variables with the weights that integrate them out, showing by simulation that the latter perform more stably and efficiently. We apply the methods to intensive care unit data collected under a cross-sectional design and discuss how the estimators can be easily modified for more general multi-state models.

18.
Count data with excess zeros often occur in areas such as public health, epidemiology, psychology, sociology, engineering, and agriculture. Zero-inflated Poisson (ZIP) regression and zero-inflated negative binomial (ZINB) regression are useful for modeling such data, but because of hierarchical study designs or the data collection procedure, zero-inflation and correlation may occur simultaneously, and standard ZIP or ZINB models alone may not be adequate. In this paper, multilevel ZINB regression is used to overcome both problems. Parameters are estimated with an expectation-maximization algorithm in conjunction with penalized likelihood and restricted maximum likelihood estimates for the variance components. An alternative modeling strategy, based on the ZIP distribution, is also considered. An application of the proposed model is shown for decayed, missing, and filled teeth of children aged 12 years.

19.
Interpretation of principal components is difficult because their weights (loadings, coefficients) are of various sizes. Whereas very small or very large weights give a clear indication of the importance of particular variables, weights that are neither large nor small ('grey area' weights) are problematical. This is a particular problem in the fast-moving goods industries, where a lot of multivariate panel data are collected on products. These panel data are subjected to univariate and multivariate analyses, in which principal components (PCs) are key to the interpretation of the data. Several authors have suggested alternatives to PCs, seeking simplified components such as sparse PCs. Here, components termed simple components (SCs) are sought in conjunction with the Thurstonian criteria that a component should have only a few variables highly weighted on it and that each variable should be weighted heavily on just a few components. An algorithm is presented that finds SCs efficiently. Simple components are found for panel data consisting of responses to a questionnaire on the efficacy and other features of deodorants. It is shown that five SCs can explain an amount of variation within the data comparable to that explained by the PCs, but with easier interpretation.
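
The article's simple-components algorithm is not reproduced here, but the problem it targets — "grey area" loadings — can be illustrated by contrasting raw principal-component loadings with a crude hard-thresholded version that keeps only the few largest weights per component (a naive illustration on synthetic data, not the SC method):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))        # stand-in for the multivariate panel data
X[:, 1] += 0.8 * X[:, 0]             # induce some correlated blocks of variables
X[:, 5] += 0.8 * X[:, 4]

pca = PCA(n_components=3).fit(X)
loadings = pca.components_           # rows = components, columns = variables

def hard_threshold(loadings, keep=2):
    """Keep only the `keep` largest-magnitude weights per component, zero the rest.
    A naive simplification step, not the simple-components algorithm of the article."""
    simple = np.zeros_like(loadings)
    for i, row in enumerate(loadings):
        top = np.argsort(np.abs(row))[-keep:]
        simple[i, top] = row[top]
    return simple

print(np.round(loadings, 2))
print(np.round(hard_threshold(loadings), 2))
```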

20.
Moderated multiple regression provides a useful framework for understanding moderator variables. These variables can also be examined within multilevel datasets, although the literature is not clear on the best way to assess data for significant moderating effects, particularly within a multilevel modeling framework. This study explores potential ways to test moderation at the individual level (level one) within a two-level multilevel modeling framework, with varying effect sizes, cluster sizes, and numbers of clusters. The study examines five potential methods for testing interaction effects: the Wald test, F-test, likelihood ratio test, Bayesian information criterion (BIC), and Akaike information criterion (AIC). For each method, the simulation study examines Type I error rates and power. Following the simulation study, an applied study uses real data to assess interaction effects using the same five methods. Results indicate that the Wald test, F-test, and likelihood ratio test all perform similarly in terms of Type I error rates and power. Type I error rates for the AIC are more liberal, and those for the BIC are typically more conservative. A four-step procedure for applied researchers interested in examining interaction effects in multilevel models is provided.
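
With two nested mixed models, one with and one without the level-1 interaction, the Wald test reads off the interaction coefficient's z-statistic, the likelihood ratio test compares ML log-likelihoods, and AIC/BIC compare penalized fits. A sketch (not the study's exact procedure) using statsmodels MixedLM on simulated two-level data; variable names, effect sizes, and the parameter count in the information criteria are illustrative assumptions:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

# Simulate 2-level data: 50 clusters of 20, with a level-1 interaction x*m.
rng = np.random.default_rng(0)
n_clusters, n_per = 50, 20
cluster = np.repeat(np.arange(n_clusters), n_per)
u = rng.normal(scale=0.5, size=n_clusters)[cluster]          # random intercepts
x = rng.normal(size=n_clusters * n_per)
m = rng.normal(size=n_clusters * n_per)                      # moderator
y = 0.5 + 0.3 * x + 0.2 * m + 0.15 * x * m + u + rng.normal(size=n_clusters * n_per)
df = pd.DataFrame({"y": y, "x": x, "m": m, "cluster": cluster})

# Fit with ML (reml=False) so the likelihood ratio test and AIC/BIC are comparable.
full = smf.mixedlm("y ~ x * m", df, groups=df["cluster"]).fit(reml=False)
reduced = smf.mixedlm("y ~ x + m", df, groups=df["cluster"]).fit(reml=False)

# Wald test: p-value of the interaction coefficient.
print("Wald p-value:", full.pvalues["x:m"])

# Likelihood ratio test: 1 df for the single interaction term.
lr = 2 * (full.llf - reduced.llf)
print("LRT p-value:", stats.chi2.sf(lr, df=1))

# AIC/BIC from the ML log-likelihood; k counts fixed effects plus the
# random-intercept and residual variances (an assumption about parameter counting).
def info_criteria(res, n_obs):
    k = len(res.fe_params) + 2
    return 2 * k - 2 * res.llf, np.log(n_obs) * k - 2 * res.llf

print("full AIC/BIC:   ", info_criteria(full, len(df)))
print("reduced AIC/BIC:", info_criteria(reduced, len(df)))
```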
