Similar literature
 20 similar articles found (search time: 968 ms)
1.
Although the methodology for handling ordinal and dichotomous observed variables in structural equation models (SEMs) is developing rapidly, several important issues are unresolved. One of these is the optimal test statistic to apply as a test of overall model fit. We propose a new "vanishing tetrad" test statistic for such models. We build on Bollen's (1990) simultaneous test statistic for testing multiple vanishing tetrads and on Bollen and Ting's (1993) confirmatory tetrad analysis (CTA) for hypothesis testing of model structures. These and other works on vanishing tetrads assume continuous observed variables and do not consider observed categorical variables. In this paper we present a method to test models when some or all of the observed variables are collapsed or categorical versions of underlying continuous variables. The test statistic that we provide is an alternative "overall fit" statistic for SEMs with censored, ordinal, or dichotomous observed variables. Furthermore, the vanishing tetrad test sometimes permits us to compare the fit of some models that are not nested in the traditional likelihood ratio test. We illustrate the new test statistic with examples and a small simulation experiment comparing it with two other tests of model fit for SEMs with ordinal or dichotomous endogenous variables.
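As a reading aid for the abstract above, the following minimal sketch shows what a "vanishing tetrad" is in the simplest continuous-variable case. It is not the test statistic proposed in the paper, and it does not handle categorical indicators. A tetrad is a difference of products of covariances, and under a one-factor model with four indicators all three tetrads are zero in the population. The function names, loadings, and unique variances are hypothetical values chosen for illustration.

```python
def tetrad(cov, g, h, i, j):
    """Tetrad difference: cov[g][h] * cov[i][j] - cov[g][i] * cov[h][j]."""
    return cov[g][h] * cov[i][j] - cov[g][i] * cov[h][j]


def one_factor_cov(loadings, unique_vars, factor_var=1.0):
    """Model-implied covariance matrix of a one-factor model (hypothetical helper)."""
    n = len(loadings)
    return [
        [
            loadings[a] * loadings[b] * factor_var
            + (unique_vars[a] if a == b else 0.0)
            for b in range(n)
        ]
        for a in range(n)
    ]


# Hypothetical loadings and unique variances for four indicators.
loadings = [0.9, 0.8, 0.7, 0.6]
unique_vars = [0.19, 0.36, 0.51, 0.64]
sigma = one_factor_cov(loadings, unique_vars)

# The three tetrads among four variables (indices are zero-based).
tetrads = {
    "1234": tetrad(sigma, 0, 1, 2, 3),
    "1342": tetrad(sigma, 0, 2, 3, 1),
    "1423": tetrad(sigma, 0, 3, 1, 2),
}
```

In sample data the tetrads are only approximately zero, which is why a formal test statistic is needed.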

2.
We propose using latent class analysis as an alternative to log-linear analysis for the multiple imputation of incomplete categorical data. Similar to log-linear models, latent class models can be used to describe complex association structures between the variables used in the imputation model. However, unlike log-linear models, latent class models can be used to build large imputation models containing more than a few categorical variables. To obtain imputations reflecting uncertainty about the unknown model parameters, we use a nonparametric bootstrap procedure as an alternative to the more common full Bayesian approach. The proposed multiple imputation method, which is implemented in Latent GOLD software for latent class analysis, is illustrated with two examples. In a simulated data example, we compare the new method to well-established methods such as maximum likelihood estimation with incomplete data and multiple imputation using a saturated log-linear model. This example shows that the proposed method yields unbiased parameter estimates and standard errors. The second example concerns an application using a typical social sciences data set. It contains 79 variables that are all included in the imputation model. The proposed method is especially useful for such large data sets because standard methods for dealing with missing data in categorical variables break down when the number of variables is so large.

3.
Generalized linear models (GLMs), as defined by J. A. Nelder and R. W. M. Wedderburn (1972), unify a class of regression models for categorical, discrete, and continuous response variables. As an extension of classical linear models, GLMs provide a common body of theory and methodology for some seemingly unrelated models and procedures, such as the logistic, Poisson, and probit models, that are increasingly used in family studies. This article provides an overview of the principle and the key components of GLMs, such as the exponential family of distributions, the linear predictor, and the link function. To illustrate the application of GLMs, this article uses Canadian national survey data to build an example focusing on the number of close friends among older adults. The article concludes with a discussion of the strengths and weaknesses of GLMs.
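The components named in the abstract above (linear predictor and link function) can be illustrated with a minimal sketch. This is not the article's analysis. The coefficients and covariates are hypothetical, and the sketch only shows how one linear predictor is mapped to a mean through the inverse of the log, logit, or probit link.

```python
import math


def linear_predictor(coefs, x):
    """eta = b0 + b1*x1 + ... ; x excludes the constant for the intercept."""
    return coefs[0] + sum(b * v for b, v in zip(coefs[1:], x))


# Inverse link functions map the linear predictor eta to the mean mu.
def inv_log(eta):
    """Poisson regression (log link): mu = exp(eta)."""
    return math.exp(eta)


def inv_logit(eta):
    """Logistic regression (logit link): mu = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def inv_probit(eta):
    """Probit regression: mu = standard normal CDF evaluated at eta."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


# Hypothetical coefficients: intercept, age, lives with a partner (0/1).
coefs = [1.2, -0.01, 0.3]
x = [70, 1]

eta = linear_predictor(coefs, x)   # 1.2 - 0.7 + 0.3
expected_friends = inv_log(eta)    # expected count under a Poisson GLM
```

The same `eta` could be passed to `inv_logit` or `inv_probit` for a binary response, which is the sense in which GLMs unify these models.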

4.
This paper describes and contrasts two useful ways to employ a latent class variable as a mixture variable in regression analyses of panel data with a categorical dependent variable. One way is to model unobserved heterogeneity in the trajectory, or change in the distribution, of the dependent variable. Two models that accomplish this are the latent trajectory model and latent growth curve model for a categorical dependent variable having ordered categories. Each latent class here represents a distinct trajectory of the dependent variable. The latent trajectory model introduces covariate effects on the composition of latent classes, while the latent growth curve model introduces covariate effects on both the "intercept" and the "slope" of growth in logit, which may vary among latent classes.
The other useful way is to model unobserved heterogeneity in the state dependence of the dependent variable. Two models that accomplish this are introduced for a simultaneous analysis of response probability and response stability, and the latent class variable is employed to distinguish two latent populations that differ in the stability of responses over time. One of them is the switching multinomial logit model with a time-lagged dependent variable as its separation indicator, and the other is the mover-stayer regression model.
By applying these four models to empirical data, this paper demonstrates the usefulness of these models for panel-data analyses. Example programs for specifying these models based on the LEM program are also provided.

5.
A General Class of Nonparametric Models for Ordinal Categorical Data   Total citations: 1 (self-citations: 0; citations by others: 1)
This paper presents a general class of models for ordinal categorical data that can be specified by means of linear and/or log-linear equality and/or inequality restrictions on the (conditional) probabilities of a multiway contingency table. Some special cases are models with ordered local odds ratios, models with ordered cumulative response probabilities, order-restricted row association and column association models, and models for stochastically ordered marginal distributions. A simple unidimensional Newton algorithm is proposed for obtaining the restricted maximum-likelihood estimates. In situations in which there is some kind of missing data, this algorithm can be implemented in the M step of an EM algorithm. Computation of p-values of testing statistics is performed by means of parametric bootstrapping.

6.
In many applications observations have some type of clustering, with observations within clusters tending to be correlated. A common instance of this occurs when each subject in the sample undergoes repeated measurement, in which case a cluster consists of the set of observations for the subject. One approach to modeling clustered data introduces cluster-level random effects into the model. The use of random effects in linear models for normal responses is well established. By contrast, random effects have only recently seen much use in models for categorical data. This chapter surveys a variety of potential social science applications of random effects modeling of categorical data. Applications discussed include repeated measurement for binary or ordinal responses, shrinkage to improve multiparameter estimation of a set of proportions or rates, multivariate latent variable modeling, hierarchically structured modeling, and cluster sampling. The models discussed belong to the class of generalized linear mixed models (GLMMs), an extension of ordinary linear models that permits nonnormal response variables and both fixed and random effects in the predictor term. The models are GLMMs for either binomial or Poisson response variables, although we also present extensions to multicategory (nominal or ordinal) responses. We also summarize some of the technical issues of model-fitting that complicate the fitting of GLMMs even with existing software.

7.
We propose an alternative method of conducting exploratory latent class analysis that utilizes latent class factor models, and compare it to the more traditional approach based on latent class cluster models. We show that when formulated in terms of R mutually independent, dichotomous latent factors, the LC factor model has the same number of distinct parameters as an LC cluster model with R+1 clusters. Analyses over several data sets suggest that LC factor models typically fit data better and provide results that are easier to interpret than the corresponding LC cluster models. We also introduce a new graphical "bi-plot" display for LC factor models and compare it to similar plots used in correspondence analysis and to a barycentric coordinate display for LC cluster models. New results on identification of LC models are also presented. We conclude by describing various model extensions and an approach for eliminating boundary solutions in identified and unidentified LC models, which we have implemented in a new computer program.

8.
Many proposed methods for analyzing clustered ordinal data focus on the regression model and consider the association structure within a cluster as a nuisance. However, the association structure is often of equal interest: for example, temporal association in longitudinal studies and association between responses to similar questions in a survey. We discuss the use, appropriateness, and interpretability of various latent variable and Markov models for the association structure and propose a new structure that exploits the ordinality of the response. The models are illustrated with a study concerning opinions regarding government spending and an analysis of stability and change in teenage marijuana use over time, where we reveal different behavioral patterns for boys and girls through a comprehensive investigation of individual response profiles.

9.
Effects of categorical variables in statistical models typically are reported in terms of comparison either with a reference category or with a suitably defined "mean effect," for reasons of parameter identification. A conventional presentation of estimates and standard errors, but without the full variance-covariance matrix, does not allow subsequent readers either to make inference on a comparison of interest that is not presented or to compare or combine results from different studies where the same variables but different reference levels are used. It is shown how an alternative presentation, in terms of "quasi standard errors," overcomes this problem in an economical and intuitive way. A primary application is the reporting of effects of categorical predictors, often called factors, in linear and generalized linear models, hazard models, multinomial-response models, generalized additive models, etc. Other applications include the comparison of coefficients between related regression equations (for example, log-odds ratios in a multinomial logit model) and the presentation of multipliers or "scores" in models with multiplicative interaction structure.

10.
The standard latent class model is a finite mixture of indirectly observed multinomial distributions, each of which is assumed to exhibit statistical independence. Latent class analysis has been applied in a wide variety of research contexts, including studies of mobility, educational attainment, agreement, and diagnostic accuracy, and as measurement error models in social research. One of the attractive features of the latent class model in these settings is that the parameters defining the individual multinomials are readily interpretable marginal probabilities, conditional on the unobserved latent variable(s), that are often of substantive interest. There are, however, settings where the local-independence axiom is not supported, and hence it is useful to consider some form of local dependence. In this paper we consider a family of models defined in terms of finite mixtures of multinomial models where the multinomials are parameterized in terms of a set of models for the univariate marginal distributions and for marginal associations. Local dependence is introduced through the models for marginal associations, and the standard latent class model obtains as a special case. Three examples are analyzed with the models to illustrate their utility in analyzing complex cross-classifications.
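The standard latent class model described in the abstract above can be sketched in a few lines. This covers only the local-independence special case, not the locally dependent family the paper proposes. The probability of a response pattern is a weighted sum over classes of products of item probabilities. The class weights and item probabilities below are hypothetical, and the items are taken to be binary for simplicity.

```python
from itertools import product


def pattern_probability(pattern, class_weights, item_probs):
    """P(pattern) = sum over classes c of weight[c] * prod_j P(y_j | c).

    item_probs[c][j] is the probability of a 1 on binary item j in class c;
    within a class the items are independent (local independence).
    """
    total = 0.0
    for c, weight in enumerate(class_weights):
        p = weight
        for j, y in enumerate(pattern):
            p_one = item_probs[c][j]
            p *= p_one if y == 1 else 1.0 - p_one
        total += p
    return total


# Hypothetical two-class model with three binary items.
class_weights = [0.6, 0.4]
item_probs = [
    [0.9, 0.8, 0.7],  # class 1
    [0.2, 0.3, 0.1],  # class 2
]

p_all_ones = pattern_probability((1, 1, 1), class_weights, item_probs)

# The probabilities of all 2**3 response patterns sum to one.
total_probability = sum(
    pattern_probability(pat, class_weights, item_probs)
    for pat in product((0, 1), repeat=3)
)
```

Relaxing local independence means replacing the inner product over items with a within-class model that allows associations between items.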

11.
Multilevel Latent Class Models   Total citations: 4 (self-citations: 0; citations by others: 4)
The latent class (LC) models that have been developed so far assume that observations are independent. Parametric and nonparametric random-coefficient LC models are proposed here, which will make it possible to modify this assumption. For example, the models can be used for the analysis of data collected with complex sampling designs, data with a multilevel structure, and multiple-group data for more than a few groups. An adapted EM algorithm is presented that makes maximum-likelihood estimation feasible. The new model is illustrated with examples from organizational, educational, and cross-national comparative research.

12.
An "effect display" is a graphical or tabular summary of a statistical model based on high-order terms in the model. Effect displays have previously been defined by Fox (1987, 2003) for generalized linear models (including linear models). Such displays are especially compelling for complicated models, for example those including interactions or polynomial terms. This paper extends effect displays to models commonly used for polytomous categorical response variables: the multinomial logit model and the proportional-odds logit model. Determining point estimates of effects for these models is a straightforward extension of results for the generalized linear model. Estimating sampling variation for effects on the probability scale in the multinomial and proportional-odds logit models is more challenging, however, and we use the delta method to derive approximate standard errors. Finally, we provide software for effect displays in the R statistical computing environment.

13.
Ordinal response scales with a middle category are widely used in public opinion studies, psychology, medicine, computed tomography and other fields. The usual models in the statistical literature for ordinal response variables treat the case where the scale has a natural middle category no differently from the case where the scale does not have a middle category. This paper proposes new models for the analysis of ordinal response scales with middle categories, applying these to data collected in 1993-1994 on American opinion toward the balance between environmental quality and economic prosperity. Some of the models should also be useful when the scale does not have a natural middle category. The models are easily used to address issues of concern in empirical work, for example stochastic ordering among covariate classes and asymmetry about the middle category. Log-linear models are considered in Section 2. The relationship between the normal distribution and a quadratic log-linear model with known scores, discussed in this section, is the basis for Section 3, which considers a log-nonlinear model with unknown scores estimated from the data. Section 4 shows how generalized log-linear and generalized log-nonlinear models can be used to simultaneously study whether the response is below, at, or above the midpoint, and the conditional distribution of responses above (below) the midpoint. These models are also useful when the response scale is viewed as nested and/or the response process is sequential.

14.
Many interesting sociological questions pertain to how the association between two variables depends on a third variable. In sociological applications, the third variable often pertains to countries, to subgroups of a population, or to time periods. We propose a regression-type approach that specifies that the log-odds ratios that describe the two-way association of interest are a linear function of latent scores for the third variable. Additive and multiplicative models currently in use by researchers are special cases of the regression-type model. To illustrate the utility of the regression-type approach, we apply this approach to analyze (1) data on occupational mobility in the United States, Britain, and Japan (comparing mobility in these countries) and (2) data on the association between religion and voting behavior in U.S. presidential elections from 1968 to 1992 (comparing this association in the different elections). We also introduce here graphical displays that can be used to obtain worthwhile information about goodness-of-fit and to aid in substantive interpretation.

15.
A partial order of discrete beliefs based on a generalization of item order in Guttman scaling generates a nonunidimensional collection of latent belief states that can be represented by a distributive lattice. By incorporating misclassification errors under local independence assumptions, the lattice structure is transformed into a latent class model for observed response states. We apply this model to survey responses dealing with government welfare programs and suggest that our approach can retrieve information where unidimensional and multidimensional models do not fit. The concluding section discusses directions for future work.

16.
17.
In the social sciences, logit and probit models are frequently used multivariate data analysis procedures for binary dependent variables. Both procedures can be thought of as resting on a linear model for an unobserved variable y* from which a nonlinear model for the probability of y = 1 is derived. We first show that, compared to linear models, this nonlinearity leads to problems of interpreting results from such analyses. In particular, odds ratios (exponentiated logit coefficients), often used in logistic regression, are problematic in this respect. Instead we recommend using graphical procedures and reporting (corrected) average marginal effects (AME). Based on a series of Monte Carlo simulations, we next demonstrate that the regression coefficients from logit and probit models should not be compared between nested models. Because model building in the social sciences often employs a stepwise procedure, a method allowing valid comparisons of effect sizes between models would be advantageous. Results from our simulation study show that average marginal effects and regression coefficients corrected by a method proposed by Karlson et al. (Sociological Methodology 42, 2012) lead to satisfactory results in many different scenarios. In contrast, y*-standardized coefficients are of limited utility and coefficients from a linear probability model should only be used with normally distributed variables.
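To make the quantities in the abstract above concrete, here is a minimal sketch of an uncorrected average marginal effect (AME) for one continuous predictor in a logit model, alongside the odds ratio. It does not implement the correction of Karlson et al. or the paper's simulations. The coefficients and data values are hypothetical. For a logit model the marginal effect of x at one observation is slope * p * (1 - p), and the AME averages this over the sample.

```python
import math


def logistic(eta):
    """Inverse logit: probability of y = 1 given the linear predictor eta."""
    return 1.0 / (1.0 + math.exp(-eta))


def average_marginal_effect(intercept, slope, xs):
    """Mean over observations of dP(y=1)/dx = slope * p * (1 - p)."""
    effects = []
    for x in xs:
        p = logistic(intercept + slope * x)
        effects.append(slope * p * (1.0 - p))
    return sum(effects) / len(effects)


# Hypothetical fitted coefficients and predictor values.
intercept, slope = 0.0, 1.0
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]

ame = average_marginal_effect(intercept, slope, xs)  # on the probability scale
odds_ratio = math.exp(slope)                          # on the odds scale
```

The AME is expressed in probability units and depends on where the observations lie, whereas the odds ratio is a single multiplicative constant on the odds scale.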

18.
We consider partially observed network data as defined in Handcock and Gile (2010). More specifically, we introduce an elaboration of the Bayesian data augmentation scheme of Koskinen et al. (2010) that uses the exchange algorithm (Caimo and Friel, 2011) for inference for the exponential random graph model (ERGM) where tie variables are partly observed. We illustrate the generation of posteriors and unobserved tie variables with empirical network data where 74% of the tie variables are unobserved, assuming that some standard assumptions hold. One of these assumptions is that covariates are fixed and completely observed. It is likely, however, that covariates are also only partially observed, and we propose a further extension of the data augmentation algorithm for missing attributes. We provide an illustrative example of parameter inference with nearly 30% of dyads affected by missing attributes (e.g. homophily effects). The assumption that all actors are known is another assumption that is liable to be violated so that there are “covert actors”. We briefly discuss various aspects of this problem with reference to the Sageman (2004) data set on suspected terrorists. We conclude by identifying some areas in need of further research.

19.
A model is considered for the regression analysis of multivariate binary data such as repeated-measures data (for example, panel data) or multiple-indicators with measures of some underlying characteristic such as attitude or ability (for example, surveys or tests). The model is related to the usual Rasch model, the usual latent-class model, and other familiar models such as logistic regression. In addition to a regression specification, the model includes parameters that describe heterogeneity not accounted for by the predictors. In contrast to most other approaches, a nonparametric specification of the latent mixing distribution is used, leading to a formulation based on scaled latent classes. We examine the relationship between this model and several other models, give a tractable formulation of the likelihood function and likelihood equations, present an algorithm for maximum-likelihood estimation, and analyze marginal and conditional latent structures. The approach is illustrated with longitudinal data from the German Socioeconomic Panel.

20.
This paper describes and analyzes research on the dynamics of long-term care and the policy relevance of identifying the sources of persistence in caregiving arrangements (including the effect of dynamics on parameter estimates, implications for family welfare, parent welfare, child welfare, and cost of government programs). We discuss sources and causes of observed persistence in caregiving arrangements including inertia/state dependence (confounded by unobserved heterogeneity) and costs of changing caregivers. We comment on causes of dynamics including learning/human capital accumulation; burnout; and game-playing. We suggest how to deal with endogenous geography; dynamics in discrete and continuous choices; and equilibrium issues (multiple equilibria, dynamic equilibria). We also present an overview of commonly used longitudinal data sets and evaluate their relative advantages/disadvantages. We also discuss other data issues related to noisy measures of wealth and family structure. Finally, we suggest some methods to handle econometric problems such as endogenous geography.
