Found 20 similar articles (search time: 546 ms)
Confidence regions for simple correspondence analysis allow for the identification of categories that are consistent with independence, and those that are not. This paper describes a procedure for constructing elliptical regions which takes into account the unequal weighting of each of the axes of the plot.

Preference decisions will usually depend on the characteristics of both the judges and the objects being judged. In the analysis of paired comparison data concerning European universities and students' characteristics, it is demonstrated how to incorporate subject-specific information into Bradley–Terry-type models. Using this information it is shown that preferences for universities and therefore university rankings are dramatically different for different groups of students. A log-linear representation of a generalized Bradley–Terry model is specified which allows simultaneous modelling of subject- and object-specific covariates and interactions between them. A further advantage of this approach is that standard software for fitting log-linear models, such as GLIM, can be used.
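As a rough illustration of why a Bradley–Terry model can be written in log-linear form, the sketch below uses one common parameterization, which is an assumption here and not taken from the abstract: for a pair (j, k), the expected counts of "j preferred" and "k preferred" are exp(mu + lam_j - lam_k) and exp(mu - lam_j + lam_k), and the worth of object j is exp(2 * lam_j). The function names and parameter values are hypothetical.

```python
import math


def expected_counts(mu, lam_j, lam_k):
    """Expected counts for one pair under the assumed log-linear form.

    mu is a nuisance term that fixes the number of comparisons for the pair;
    lam_j and lam_k are the object parameters (illustrative names).
    """
    m_j = math.exp(mu + lam_j - lam_k)  # expected count of "j preferred to k"
    m_k = math.exp(mu - lam_j + lam_k)  # expected count of "k preferred to j"
    return m_j, m_k


def bradley_terry_prob(lam_j, lam_k):
    """Bradley-Terry probability that j is preferred, with worth = exp(2 * lam)."""
    pi_j = math.exp(2 * lam_j)
    pi_k = math.exp(2 * lam_k)
    return pi_j / (pi_j + pi_k)


# The share of the pair's expected count that falls on "j preferred"
# equals the Bradley-Terry probability, whatever the value of mu.
m_j, m_k = expected_counts(mu=3.0, lam_j=0.3, lam_k=-0.1)
share_j = m_j / (m_j + m_k)
```

Because mu cancels in the ratio, the object parameters can be estimated by fitting a Poisson log-linear model to the paired-comparison counts, which is what makes standard log-linear software applicable.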


In this article, Bayesian estimation of the expected cell counts for log-linear models is considered. The prior specified for the log-linear parameters is used to determine a prior for the expected cell counts, by means of the family and parameters of the prior distributions. This approach is more cost-effective than working directly with cell counts because converting prior information into a prior distribution on the log-linear parameters is easier than converting it into one on the expected cell counts. While proceeding from the prior on the log-linear parameters to the prior on the expected cell counts, we faced a singularity problem in the variance matrix of the prior distribution and added a new precision parameter to solve it. A numerical example is also given to illustrate the usage of the new parameter.

One of the major objections to the standard multiple-recapture approach to population estimation is the assumption of homogeneity of individual 'capture' probabilities. Modelling individual capture heterogeneity is complicated by the fact that it shows up as a restricted form of interaction among lists in the contingency table cross-classifying list memberships for all individuals. Traditional log-linear modelling approaches to capture–recapture problems are well suited to modelling interactions among lists but ignore the special dependence structure that individual heterogeneity induces. A random-effects approach, based on the Rasch model from educational testing and introduced in this context by Darroch and co-workers and Agresti, provides one way to introduce the dependence resulting from heterogeneity into the log-linear model; however, previous efforts to combine the Rasch-like heterogeneity terms additively with the usual log-linear interaction terms suggest that a more flexible approach is required. In this paper we consider both classical multilevel approaches and fully Bayesian hierarchical approaches to modelling individual heterogeneity and list interactions. Our framework encompasses both the traditional log-linear approach and various elements from the full Rasch model. We compare these approaches on two examples, the first arising from an epidemiological study of a population of diabetics in Italy, and the second a study intended to assess the 'size' of the World Wide Web. We also explore extensions allowing for interactions between the Rasch and log-linear portions of the models in both the classical and the Bayesian contexts.

Raters commonly need to classify a sample of subjects on a categorical scale. Perfect agreement is rarely observed, partly because raters perceive the meanings of the category labels differently and partly because of factors such as intrarater variability. Usually, category indistinguishability occurs between adjacent categories. In this article, we propose a simple log-linear model combining ordinal scale information and category distinguishability between ordinal categories for modelling agreement between two raters. For the proposed model, no scores need to be assigned to the ordinal categories. An algorithm and statistical properties are provided.

Summary: Nonsymmetric correspondence analysis is a model meant for the analysis of the dependence in a two-way contingency table, and is an alternative to correspondence analysis. Correspondence analysis is based on the decomposition of Pearson's Φ²-index. Nonsymmetric correspondence analysis is based on the decomposition of Goodman–Kruskal's τ-index for predictability. In this paper, we approach nonsymmetric correspondence analysis as a statistical model based on a probability distribution. We provide algorithms for maximum likelihood and least-squares estimation with linear constraints upon model parameters. The nonsymmetric correspondence analysis model has many properties that can be useful for prediction analysis in contingency tables. Predictability measures are introduced to identify the categories of the response variable that can be best predicted, as well as the categories of the explanatory variable having the highest predictability power. We describe the interpretation of model parameters in two examples. Finally, we discuss the relations of nonsymmetric correspondence analysis with other reduced-rank models.
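For readers unfamiliar with the index mentioned above, the following is a minimal sketch of the Goodman–Kruskal τ computed from a two-way table of counts, assuming that rows hold the response categories and columns hold the explanatory categories. The function name and the layout convention are illustrative choices, not taken from the paper, and the sketch covers only the index itself, not the decomposition or the estimation algorithms.

```python
def goodman_kruskal_tau(table):
    """Goodman-Kruskal tau for predicting the row category from the column category.

    `table` is a list of rows of non-negative counts (rows = response,
    columns = explanatory variable; this layout is an assumption of the sketch).
    """
    n = sum(sum(row) for row in table)
    p = [[count / n for count in row] for row in table]  # joint proportions
    row_marg = [sum(row) for row in p]                   # response margins
    col_marg = [sum(col) for col in zip(*p)]             # explanatory margins

    # Variation in the response when the explanatory variable is ignored
    # (Gini-type measure of heterogeneity of the response margins).
    total = 1.0 - sum(r ** 2 for r in row_marg)

    # Part of that variation accounted for by knowing the column category.
    explained = sum(
        p[i][j] ** 2 / col_marg[j]
        for i in range(len(p))
        for j in range(len(col_marg))
        if col_marg[j] > 0
    ) - sum(r ** 2 for r in row_marg)

    return explained / total


tau_independent = goodman_kruskal_tau([[10, 20], [30, 60]])  # rows proportional
tau_perfect = goodman_kruskal_tau([[5, 0], [0, 7]])          # column determines row
```

The index is zero when the response distribution is the same in every column and one when the column category determines the response exactly; unlike Pearson's Φ², it treats the two variables asymmetrically.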

Correspondence analysis is a popular statistical technique used to identify graphically the presence, and structure, of association between two or more cross-classified categorical variables. Such a procedure is very useful when it is known that there is a symmetric (two-way) relationship between the variables. When such a relationship is known not to exist, non-symmetrical correspondence analysis is more appropriate as a method of establishing the source of association. This paper highlights some tools that can be used to explore the behaviour of asymmetric categorical variables. These tools consist of confidence regions, the link between non-symmetrical correspondence analysis and the analysis of variance of categorical variables, and the effect of imposing linear constraints. We also explore the application of non-symmetrical correspondence analysis to three-way contingency tables.

A family of log-linear models is proposed to describe contingency tables in which one variable can be considered as the response to the remaining variables. The proposed models take into account the ordered nature of the response categories and have a structure similar to that employed in polynomial regression. Stochastic ordering of the response distributions under the proposed models is discussed and model-reduction techniques are developed. The proposed models are applied to two data sets previously analysed in the literature.

The paper presents a partition of the Pearson chi-squared statistic for triply ordered three-way contingency tables. The partition invokes orthogonal polynomials and identifies three-way association terms as well as each combination of two-way associations. This partition provides information about the structure of each variable by identifying important bivariate and trivariate associations in terms of location (linear), dispersion (quadratic) and higher order components. The significance of each term in the partition, and each association within each term can also be determined.
The paper compares the chi-squared partition with the log-linear models of Agresti (1994) for multi-way contingency tables with ordinal categories, by generalizing the model proposed by Haberman (1974).

SUMMARY Non-completion of higher education degree courses is a considerable problem, imposing costs on the taxpayer, higher education institutions and the students who fail to complete. Closer examination of the data reveals that non-completion rates in higher education vary substantially across institutions and by subject of degree. The purpose of this paper is to investigate, within each of 13 broad subject categories, the potential determinants of inter-university variations in non-completion rates. Published data are used to compute university non-completion rates over four time periods and to construct corresponding explanatory variables which could potentially be related to non-completion rates. The explanatory variables measure the characteristics (both academic and socioeconomic) of students recruited by universities and the characteristics of the institutions themselves. The significance of the relationship between the possible explanatory variables and non-completion rates within each given subject is assessed using both weighted least-squares and weighted logit analysis. The conclusions drawn from the results of each technique are identical, and, therefore, for ease of interpretation, only the results of the weighted least-squares analysis are reported. As expected, the academic quality of student entrants is an important determinant of non-completion rates in the majority of subjects, although the magnitude of the effect varies according to subject. Variables reflecting the age and gender mix of university entrants are generally not significantly related to non-completion rates. The characteristics of institutions which are significantly related to non-completion rates in specific subjects include the staff–student ratio and the length of the degree course.

This article examines whether there are educational premiums on the quantity side of the labor market. We document four findings: (1) Trend employment patterns shifted for most educational levels post-1977; (2) the lower the level of educational attainment, the more volatile the employment ratio; (3) the volatility of employment for female high school dropouts increased over time even as the economy became less volatile; and (4) since 1984, the responses of skilled and unskilled employment to the business cycle have become more alike. This latter finding is consistent with a reduced degree of capital-skill complementarity during this period.

This article proposes a new spatial cluster detection method for longitudinal outcomes that detects neighborhoods and regions with elevated rates of disease while controlling for individual level confounders. The proposed method, CumResPerm, utilizes cumulative geographic residuals through a permutation test to detect potential clusters, which are defined as sets of administrative regions, such as a town or a group of administrative regions. Previous cluster detection methods are not able to incorporate individual level data including covariate adjustment, while still being able to define potential clusters using informative neighborhood or town boundaries. Often, it is of interest to detect such spatial clusters because individuals residing in a town may have similar environmental exposures or socioeconomic backgrounds due to administrative reasons, such as zoning laws. Therefore, these boundaries can be very informative and more relevant than arbitrary clusters such as the standard circle or square. Application of the CumResPerm method is illustrated with the Home Allergens and Asthma prospective cohort study, analyzing the relationship between area or neighborhood of residence and a repeatedly measured outcome, occurrence of wheeze in the last six months, while taking into account mobile locations.

Taguchi's statistic has long been known to be a more appropriate measure of association for ordinal variables than the Pearson chi-squared statistic. Therefore, there is some advantage in using Taguchi's statistic for performing correspondence analysis when a two-way contingency table consists of one ordinal categorical variable. This article will explore the development of correspondence analysis using a decomposition of Taguchi's statistic.

We discuss a general application of categorical data analysis to mutations along the HIV genome. We consider a multidimensional table for several positions at the same time. Due to the complexity of the multidimensional table, we may collapse it by pooling some categories. However, the association between the remaining variables may not be the same as before collapsing. We discuss the collapsibility of tables and the change in the meaning of parameters after collapsing categories. We also address this problem with a log-linear model. We present a parameterization with the consensus output as the reference cell as is appropriate to explain genomic mutations in HIV. We also consider five null hypotheses and some classical methods to address them. We illustrate methods for six positions along the HIV genome, through consideration of all triples of positions.

The risk of a child dying before completing five years of age is highest in Sub-Saharan African countries. However, child mortality rates have declined substantially in Ethiopia. For this study, data from the 2000, 2005 and 2011 Ethiopian Demographic and Health Surveys (EDHS) were used. A generalized linear mixed model with a spatial covariance structure was adopted. The model allows for spatial correlation and leads to more realistic estimates for under-five mortality risk factors. The analysis showed that the risk of under-five mortality declined over the years, although some regions showed an increase. The study highlights the need to implement better education for family planning and child care to improve the under-five mortality situation in some administrative areas.

This paper addresses the problem of analyzing a three-way contingency table that is upper-triangular, and a priori symmetric within layers. The log-linear model is modified to handle this kind of table, and maximum likelihood estimation is carried out for the modified log-linear model. This leads to an expression of the maximum likelihood estimates exclusively in terms of the observed cell counts. It is shown that this analysis is equivalent to an application of the usual log-linear model to an artificially complete table, obtained by splitting the off-diagonal cells in half within layers. This analysis is used in analyzing the results of a study done to determine the effect of the sex-linked dwarfing gene in male chickens on resistance to E. coli infection; the conclusion differs from that of a previous analysis of the same data (see Norwood and Hinkelmann 1978). It is found, in fact, that the structure of association among the two allele variables and the disease variable is somewhat more complex than previously proposed. A second example is taken from Ishii (1960). Finally, collapsibility conditions for the modified log-linear model, as well as various other sampling plans and limitations to the testing procedure, are discussed.

Since correspondence analysis appears to be sensitive to outliers, it is important to be able to evaluate the influence of the data on the results. This article deals with measuring the influence of rows and columns on the results obtained with correspondence analysis. To establish the influence of individuals on the analysis, we use the notion of the influence curve and propose a general criterion based on the mean square error to measure the sensitivity of correspondence analysis and its robustness. A numerical example is presented to illustrate the notions developed in this article.

This article presents an empirical analysis of firms' order backlogs, inventories, production, and price adjustments to unanticipated demand shocks. The data are obtained from quarterly INSEE Business Survey Tests on firms' realizations, expectations, and appraisals of various economic variables. The analysis is based on the formulation and estimation of a recursive system of conditional log-linear probability models.

With the aim of assessing the extent of the differences within the Italian educational system, the paper applies multilevel modeling to a new administrative dataset, containing detailed information for more than 500,000 students at grade 6 in the year 2011/2012, provided by the Italian Institute for the Evaluation of Educational System. Data are grouped by classes, schools and geographical areas. Different models for each area are fitted, in order to properly address the heteroscedasticity of the phenomenon. The results show that it is possible to estimate statistically significant “school effects”, i.e., the positive/negative association between attending a specific school and the student’s test score, after a case-mix adjustment. The paper’s most important message is that school effects differ in magnitude and type across the three geographical macro areas (Northern, Central and Southern Italy) and depend on specific students’ and schools’ characteristics.

The author investigates the analysis of unreplicated factorial experiments from a geometric perspective. He considers more specifically a (k + 1)‐run experiment used to estimate k orthogonal contrasts. He observes that once centered and scaled to unit length, the response vector can be viewed as a point on the unit sphere in the vector space spanned by the contrasts. In this context, a model selection procedure is equivalent to a partition of the unit sphere into regions corresponding to the different models considered. The author exploits this approach to gain useful insights into the analysis of such experiments.
