Similar articles
20 similar articles found (search time: 671 ms)
1.
Quantifying driver crash risks has been difficult because exposure data are often incompatible with crash frequency data. Induced exposure methods offer a promising alternative: a relative measure of driver crash risk can be derived solely from crash frequency data. This paper describes an application of the extended Bradley–Terry model for paired preferences to estimating driver crash risks. We estimate the crash risk for driver groups defined by driver–vehicle characteristics from log-linear models, in terms of a set of relative risk scores, using only crash frequency data. Illustrative examples using police-reported crash data from Hawaii are presented.
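The core computation behind such relative risk scores can be illustrated with the basic Bradley–Terry model, fitted by the standard MM (minorization–maximization) iteration. This is a minimal sketch, not the paper's extended model: it assumes two-vehicle crash counts where `wins[i, j]` records how often a driver from group i was judged at fault against group j (the induced-exposure reading); the matrix is a hypothetical input.

```python
import numpy as np

def bradley_terry(wins, n_iter=200):
    """Fit Bradley-Terry relative risk scores by the standard MM iteration.

    wins[i, j] = number of pairings in which group i was the at-fault
    party against group j (illustrative induced-exposure reading)."""
    k = wins.shape[0]
    p = np.ones(k)
    for _ in range(n_iter):
        W = wins.sum(axis=1)          # total "wins" (at-fault counts) per group
        n = wins + wins.T             # total pairings between groups i and j
        denom = np.array([
            sum(n[i, j] / (p[i] + p[j]) for j in range(k) if j != i)
            for i in range(k)
        ])
        p = W / denom
        p /= p.sum()                  # normalize to relative risk scores
    return p
```

With two groups and 30 at-fault crashes against 10, the scores settle at 0.75 and 0.25, i.e. the first group carries three times the relative risk.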

2.
When estimating the effect of a treatment on a count outcome in a given population, different studies use different models, resulting in non-comparable measures of treatment effect. Here we show that the marginal rate differences in these studies are comparable measures of treatment effect. We estimate the marginal rate differences by log-linear models and show that their finite-sample maximum-likelihood estimates are unbiased and highly robust to the effects of dispersing covariates on the outcome. We obtain approximate finite-sample distributions of these estimates from the asymptotic normal distribution of the estimates of the log-linear model parameters. The method is easy to apply in practice.
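One common way to make a rate difference marginal, and hence comparable across models, is standardization (g-computation): fit the log-linear (Poisson) model, then average predicted rates over the observed covariates with treatment fixed at 1 and at 0. A minimal numpy sketch under that assumption; `X` is a design matrix whose column `treat_col` is the treatment indicator. This illustrates the general idea, not the authors' exact estimator.

```python
import numpy as np

def fit_poisson(X, y, n_iter=50):
    """Poisson log-linear regression fitted by Newton-Raphson (IRLS)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        # Newton step: (X' W X)^{-1} X'(y - mu) with W = diag(mu)
        beta += np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ (y - mu))
    return beta

def marginal_rate_difference(X, y, treat_col):
    """Standardized (marginal) rate difference: mean predicted rate with
    the treatment column set to 1 minus with it set to 0."""
    beta = fit_poisson(X, y)
    X1, X0 = X.copy(), X.copy()
    X1[:, treat_col], X0[:, treat_col] = 1.0, 0.0
    return np.mean(np.exp(X1 @ beta) - np.exp(X0 @ beta))
```

In a toy data set with treated mean rate 3 and control mean rate 1, the marginal rate difference recovered this way is 2.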

3.
ABSTRACT

In this article, Bayesian estimation of the expected cell counts for log-linear models is considered. The prior specified for the log-linear parameters is used to determine a prior for the expected cell counts, by means of the family and parameters of the prior distributions. This approach is more cost-effective than working directly with cell counts, because converting prior information into a prior distribution on the log-linear parameters is easier than converting it into a prior on the expected cell counts. In proceeding from the prior on the log-linear parameters to the prior on the expected cell counts, we face a singularity problem in the variance matrix of the prior distribution, and we add a new precision parameter to solve it. A numerical example illustrates the use of the new parameter.

4.
Summary.  We develop a class of log-linear structural models that is suited to estimation of small area cross-classified counts based on survey data. This allows us to account for various association structures within the data and includes as a special case the restricted log-linear model underlying structure preserving estimation. The effect of survey design can be incorporated into estimation through the specification of an unbiased direct estimator and its associated covariance structure. We illustrate our approach by applying it to estimation of small area labour force characteristics in Norway.

5.
ABSTRACT

Log-linear models for the distribution on a contingency table are represented as the intersection of only two kinds of log-linear models: one assumes that a certain group of the variables, conditioned on all other variables, has a jointly independent distribution; the other assumes that a certain group of the variables, conditioned on all other variables, has no highest-order interaction. The subsets entering into these models are uniquely determined by the original log-linear model. This canonical representation suggests considering joint conditional independence and conditional no highest-order association as the elementary building blocks of log-linear models.

6.
One of the major objections to the standard multiple-recapture approach to population estimation is the assumption of homogeneity of individual 'capture' probabilities. Modelling individual capture heterogeneity is complicated by the fact that it shows up as a restricted form of interaction among lists in the contingency table cross-classifying list memberships for all individuals. Traditional log-linear modelling approaches to capture–recapture problems are well suited to modelling interactions among lists but ignore the special dependence structure that individual heterogeneity induces. A random-effects approach, based on the Rasch model from educational testing and introduced in this context by Darroch and co-workers and Agresti, provides one way to introduce the dependence resulting from heterogeneity into the log-linear model; however, previous efforts to combine the Rasch-like heterogeneity terms additively with the usual log-linear interaction terms suggest that a more flexible approach is required. In this paper we consider both classical multilevel approaches and fully Bayesian hierarchical approaches to modelling individual heterogeneity and list interactions. Our framework encompasses both the traditional log-linear approach and various elements from the full Rasch model. We compare these approaches on two examples, the first arising from an epidemiological study of a population of diabetics in Italy, and the second a study intended to assess the 'size' of the World Wide Web. We also explore extensions allowing for interactions between the Rasch and log-linear portions of the models in both the classical and the Bayesian contexts.
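The baseline that all of these heterogeneity models relax is the independence log-linear model, under which the unobserved cell (individuals missed by every list) is estimated from the observed cells. For two lists this reduces to the classical Lincoln–Petersen estimator; a minimal sketch, with cell labels as assumptions:

```python
def two_list_estimate(n11, n10, n01):
    """Population size from two lists under the independence log-linear
    model: n11 = on both lists, n10 = list 1 only, n01 = list 2 only.
    The unobserved cell (missed by both) is estimated as n10*n01/n11."""
    n00_hat = n10 * n01 / n11
    return n11 + n10 + n01 + n00_hat
```

For example, with 50 individuals on both lists, 30 only on the first and 20 only on the second, the estimated population size is 112 (equivalently n1*n2/n11 = 80*70/50).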

7.
As the number of random variables in categorical data increases, the number of log-linear models that can be fitted to the data increases rapidly, and various model selection methods have been developed. However, models chosen by different selection criteria often do not coincide. In this paper, we propose a comparison method to test final models that are non-nested. The statistic of Cox (1961, 1962) is applied to log-linear models for testing non-nested models, and the Kullback–Leibler measure of closeness (Pesaran 1987) is explored. For log-linear models, pseudo-estimators for the expectation and variance of Cox's statistic are not only derived but also shown to be consistent.

8.
Strict collapsibility and model collapsibility are two important concepts associated with reducing the dimension of a multidimensional contingency table without losing the relevant information. In this paper, we obtain necessary and sufficient conditions for strict collapsibility of the full model, with respect to an interaction factor or a set of interaction factors, based on the interaction parameters of the conditional/layer log-linear models. For hierarchical log-linear models, we also present necessary and sufficient conditions for the full model to be model collapsible, based on the conditional interaction parameters. We discuss both the case where a single variable is conditioned on and the case where a set of variables is. The connections between strict collapsibility and model collapsibility are also pointed out. Our results are illustrated through suitable examples, including a real-life application.

9.
In medical diagnostic testing problems, covariate-adjusted receiver operating characteristic (ROC) curves have been discussed recently for achieving the best separation between disease and control. Due to various restrictions such as cost, the availability of patients, and ethical issues, quite frequently only limited information is available. As a result, we are unlikely to have a large enough overall sample size to support reliable direct estimation of ROCs for all the underlying covariates of interest; for example, some genetic factors are less commonly observable than others. To obtain an accurate covariate-adjusted ROC estimate, novel statistical methods are needed to use the limited information effectively. It is therefore desirable to use indirect estimates that borrow strength by employing values of the variables of interest from neighbouring covariates. In this paper we discuss two semiparametric exponential tilting models, in which the density functions from different covariate levels share a common baseline density and the parameters in the exponential tilting component reflect the differences among the covariates. With the proposed models, the estimated covariate-adjusted ROC is much smoother and more efficient than the nonparametric counterpart that does not borrow information from neighbouring covariates. A simulation study and a real data application are reported. The Canadian Journal of Statistics 40: 569–587; 2012 © 2012 Statistical Society of Canada

10.
In this study, estimation of the parameters of zero-inflated count regression models, and computation of the posterior model probabilities of the log-linear models defined for each zero-inflated count regression model, are investigated from a Bayesian point of view. In addition, determination of the most suitable log-linear and regression models is investigated. Zero-inflated count regression models include the zero-inflated Poisson, zero-inflated negative binomial, and zero-inflated generalized Poisson regression models. The classical approach has some problematic points that the Bayesian approach avoids; this work points out the reasons for using the Bayesian approach and lists the advantages and disadvantages of each. As an application, a zoological data set including structural and sampling zeros is used in the presence of extra zeros. We observe that fitting a zero-inflated negative binomial regression model creates no problems at all, even though it is known to be the most problematic procedure in the classical approach. Additionally, the best-fitting model is found to be the log-linear model under the negative binomial regression model, which does not include three-way interactions of factors.
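For reference, the zero-inflated Poisson model mixes a point mass at zero (probability `pi`, the structural zeros) with an ordinary Poisson component for the sampling counts; a minimal sketch of its probability mass function:

```python
from math import exp, factorial

def zip_pmf(y, lam, pi):
    """Zero-inflated Poisson pmf: with probability pi the count is a
    structural zero; otherwise it is drawn from Poisson(lam)."""
    pois = exp(-lam) * lam ** y / factorial(y)
    return pi * (y == 0) + (1 - pi) * pois
```

Setting `pi = 0` recovers the plain Poisson pmf, and the extra mass at zero is exactly `pi * (1 - Poisson(lam) probability of 0)` relative to it.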

11.
Most methods for describing the relationship among random variables require specific probability distributions and assumptions about the random variables. Mutual information, which uses entropy to measure the dependency among random variables, needs no specific distribution or assumptions. Redundancy, an analogue of mutual information, has also been proposed. In this paper, the concepts of redundancy and mutual information are explored as applied to multi-dimensional categorical data. We find that mutual information and redundancy for categorical data can be expressed as functions of the generalized likelihood ratio statistic under several kinds of independence log-linear models. As a consequence, mutual information and redundancy can also be used to analyze contingency tables stochastically. Whereas the generalized likelihood ratio statistic for testing the goodness of fit of log-linear models is sensitive to the sample size, the redundancy for categorical data does not depend on the sample size, only on the cell probabilities.
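The link between mutual information and the likelihood ratio statistic is concrete for a two-way table: with cell probabilities estimated by observed proportions, the independence-testing statistic satisfies G² = 2n · MI, so MI is the sample-size-free version of G². A small numpy sketch:

```python
import numpy as np

def mutual_information(table):
    """Mutual information (in nats) of a two-way contingency table,
    computed from observed cell proportions."""
    p = table / table.sum()
    pr = p.sum(axis=1, keepdims=True)      # row marginals
    pc = p.sum(axis=0, keepdims=True)      # column marginals
    nz = p > 0                              # skip empty cells (0 log 0 = 0)
    return float((p[nz] * np.log(p[nz] / (pr @ pc)[nz])).sum())
```

Multiplying by 2n reproduces the usual G² = 2 Σ O log(O/E) for the independence model, which is how the abstract's identity can be checked numerically.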

12.
Several survival regression models have been developed to assess the effects of covariates on failure times. In various settings, including surveys, clinical trials and epidemiological studies, missing data often occur due to incomplete covariate data. Most existing methods for lifetime data are based on the assumption of missing at random (MAR) covariates. However, in many substantive applications, it is important to assess the sensitivity of key model inferences to the MAR assumption. The index of sensitivity to non-ignorability (ISNI) is a local sensitivity tool that measures the potential sensitivity of key model parameters to small departures from the ignorability assumption, without requiring estimation of a complicated non-ignorable model. We extend this sensitivity index to parametric survival models, to evaluate the impact of a covariate that is potentially missing not at random. The approach is applied to investigate the impact of missing tumor grade on post-surgical mortality outcomes in individuals with pancreas-head cancer in the Surveillance, Epidemiology, and End Results data set. For patients suffering from cancer, tumor grade is an important risk factor, and many individuals with pancreas-head cancer in these data have missing tumor grade information. Our ISNI analysis shows that the magnitude of effect for most covariates with a significant effect on the survival time distribution, in particular surgery and tumor grade, depends strongly on the assumed missingness mechanism for tumor grade. A simulation study is also conducted to evaluate the performance of the proposed index in detecting sensitivity of key model parameters.

13.
ABSTRACT

Background: Many exposures in epidemiological studies have nonlinear effects, and the problem is to choose an appropriate functional relationship between such exposures and the outcome. One common approach is to investigate several parametric transformations of the covariate of interest and to select a posteriori the function that fits the data best. However, such an approach may result in an inflated Type I error. Methods: Through a simulation study, we generated data from Cox models with different transformations of a single continuous covariate. We investigated the Type I error rate and the power of the likelihood ratio test (LRT) for three procedures that considered the same set of parametric dose-response functions. The first, unconditional, approach did not involve any model selection, while the second, conditional, approach was based on a posteriori selection of the parametric function. The proposed third approach was similar to the second, except that it used a corrected critical value for the LRT to ensure a correct Type I error. Results: The Type I error rate of the second approach was twice the nominal size. For simple monotone dose-response functions, the corrected test had power similar to the unconditional approach, while for non-monotone dose-response functions it had higher power. A real-life application focusing on the effect of body mass index on the risk of coronary heart disease death illustrates the advantage of the proposed approach. Conclusion: Our results confirm that selecting the functional form of the dose-response a posteriori induces Type I error inflation. The corrected procedure, which can be applied in a wide range of situations, may provide a good trade-off between Type I error and power.

14.
Parameter estimation for association and log-linear models is an important aspect of the analysis of cross-classified categorical data. Classically, iterative procedures, including Newton's method and iterative scaling, have been used to calculate the maximum likelihood estimates of these parameters. An important special case occurs when the categorical variables are ordinal; this has received considerable attention for more than 20 years, because models for such cases involve the estimation of a parameter that quantifies the linear-by-linear association and is directly linked to the natural logarithm of the common odds ratio. The past five years have seen the development of non-iterative procedures for estimating the linear-by-linear parameter of ordinal log-linear models. Such procedures have been shown to give estimates numerically equivalent to iterative maximum likelihood estimates, and they enable the researcher to avoid some of the computational difficulties that commonly arise with iterative algorithms. This paper investigates and evaluates the performance of three non-iterative procedures for estimating this parameter, by considering 14 contingency tables that have appeared in the statistical and allied literature. Estimation of the standard error of the association parameter is also considered.
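The linear-by-linear parameter has a direct interpretation: with unit-spaced row and column scores, every local (adjacent-cell) log odds ratio of the model equals the association parameter β, which is the "common odds ratio" link the abstract mentions. The sketch below builds expected counts from the model and checks this identity; the particular row and column effects are arbitrary values chosen for illustration.

```python
import numpy as np

beta = 0.5                              # linear-by-linear association parameter
u = np.arange(3)                        # unit-spaced row scores
v = np.arange(4)                        # unit-spaced column scores
a = np.array([0.1, -0.2, 0.3])          # arbitrary row effects (illustrative)
b = np.array([0.0, 0.4, -0.1, 0.2])     # arbitrary column effects (illustrative)

# expected counts under the linear-by-linear association model
m = np.exp(a[:, None] + b[None, :] + beta * np.outer(u, v))

# local log odds ratios: log( m[i,j] m[i+1,j+1] / (m[i,j+1] m[i+1,j]) )
lor = np.log(m[:-1, :-1] * m[1:, 1:] / (m[:-1, 1:] * m[1:, :-1]))
```

Algebraically, each local log odds ratio is β(u[i+1]-u[i])(v[j+1]-v[j]) = β, whatever the main effects, so `lor` is a constant array equal to β.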

15.
The common view of the history of contingency tables is that it begins in 1900 with the work of Pearson and Yule, but in fact it extends back at least into the 19th century. Moreover, it remains an active area of research today. In this paper we give an overview of this history focussing on the development of log-linear models and their estimation via the method of maximum likelihood. Roy played a crucial role in this development with two papers co-authored with his students, Mitra and Marvin Kastenbaum, at roughly the mid-point temporally in this development. Then we describe a problem that eluded Roy and his students, that of the implications of sampling zeros for the existence of maximum likelihood estimates for log-linear models. Understanding the problem of non-existence is crucial to the analysis of large sparse contingency tables. We introduce some relevant results from the application of algebraic geometry to the study of this statistical problem.

16.
This paper suggests estimators of the frequencies (Ns) or proportions (Ns/N) of N distinguishable objects contained in S categories, given various types of information. We consider information in the form of exact constraints on the Ns, sample frequencies, and frequencies of related data. The analysis uses Bayesian methods, where the prior distribution is assumed to be a function of the cross-entropy between the Ns and a reference distribution. We show the relationship between our estimator and the log-linear and logit models, and we also present a sampling experiment to compare our proposed estimator with the iterative proportional fitting estimator.
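The iterative proportional fitting estimator used as the comparator works by alternately rescaling a seed table until its margins match the target row and column totals; a minimal two-way sketch:

```python
import numpy as np

def ipf(table, row_targets, col_targets, n_iter=100):
    """Iterative proportional fitting: rescale a seed table until its
    row and column sums match the given target margins."""
    t = table.astype(float).copy()
    for _ in range(n_iter):
        t *= (row_targets / t.sum(axis=1))[:, None]   # match row margins
        t *= (col_targets / t.sum(axis=0))[None, :]   # match column margins
    return t
```

Starting from a uniform seed, IPF returns the independence table built from the target margins; a non-uniform seed instead preserves the seed's interaction (odds-ratio) structure while matching the margins.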

17.
The combination of log-linear models and correspondence analysis has long been used to decompose contingency tables and aid in their interpretation. Until now, this approach has not been applied to the education Statewide Longitudinal Data System (SLDS), which contains administrative school data at the student level. While some research has been conducted using the SLDS, its primary use is for state education administrative reporting. This article uses the combination of log-linear models and correspondence analysis to gain insight into high school dropouts in two discrete regions in Kentucky, Appalachia and non-Appalachia, as defined by the American Community Survey. The individual student records from the SLDS were categorized into one of the two regions, and a log-linear model was used to identify the interactions between the demographic characteristics and the dropout categories, push-out and pull-out. Correspondence analysis was then used to visualize the interactions with the expanded push-out categories (boredom, course selection, expulsion, failing grade, teacher conflict) and pull-out categories (employment, family problems, illness, marriage, pregnancy) to provide insight into the regional differences. In this article, we demonstrate that correspondence analysis can extend the insights gained from SLDS data and provide new perspectives on dropouts. Supplementary materials for this article are available online.
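A bare-bones version of the correspondence analysis step is a singular value decomposition of the standardized Pearson residuals; the squared singular values sum to the total inertia χ²/n, which ties the visualization back to the log-linear independence fit. A sketch of the generic technique, not the authors' pipeline:

```python
import numpy as np

def correspondence_analysis(table):
    """Simple correspondence analysis via SVD of standardized residuals.
    Returns principal row coordinates, column coordinates, and the
    singular values (whose squares sum to chi-square / n)."""
    p = table / table.sum()
    r = p.sum(axis=1)                       # row masses
    c = p.sum(axis=0)                       # column masses
    e = np.outer(r, c)                      # expected proportions
    s = (p - e) / np.sqrt(e)                # standardized residuals
    u, d, vt = np.linalg.svd(s, full_matrices=False)
    rows = (u * d) / np.sqrt(r)[:, None]    # principal row coordinates
    cols = (vt.T * d) / np.sqrt(c)[:, None] # principal column coordinates
    return rows, cols, d
```

Plotting the first two columns of `rows` and `cols` gives the usual CA map in which nearby row and column categories indicate association.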

18.
The log-linear model is a tool widely accepted for modelling discrete data given in a contingency table. Although its parameters reflect the interaction structure in the joint distribution of all variables, it does not give information about structures appearing in the margins of the table. This is in contrast to multivariate logistic parameters, recently introduced by Glonek & McCullagh (1995), which have as parameters the highest order log odds ratios derived from the joint table and from each marginal table. Glonek & McCullagh give the link between the cell probabilities and the multivariate logistic parameters, in an algebraic fashion. The present paper focuses on this link, showing that it is derived by general parameter transformations in exponential families. In particular, the connection between the natural, the expectation and the mixed parameterization in exponential families (Barndorff-Nielsen, 1978) is used; this also yields the derivatives of the likelihood equation and shows properties of the Fisher matrix. The paper emphasises the analysis of independence hypotheses in margins of a contingency table.

19.
A general methodology is presented for finding suitable Poisson log-linear models with applications to multiway contingency tables. Mixtures of multivariate normal distributions are used to model prior opinion when a subset of the regression vector is believed to be nonzero. This prior distribution is studied for two- and three-way contingency tables, in which the regression coefficients are interpretable in terms of odds ratios in the table. Efficient and accurate schemes are proposed for calculating the posterior model probabilities. The methods are illustrated for a large number of two-way simulated tables and for two three-way tables. These methods appear to be useful in selecting the best log-linear model and in estimating parameters of interest that reflect uncertainty in the true model.

20.
Abstract.  In the context of survival analysis it is possible that increasing the value of a covariate X has a beneficial effect on a failure time, but this effect is reversed when conditioning on any possible value of another covariate Y. When studying causal effects and the influence of covariates on a failure time, this state of affairs appears paradoxical and raises questions about the real effect of X. Situations of this kind may be seen as a version of Simpson's paradox. In this paper, we study this phenomenon in terms of the linear transformation model. The introduction of a time variable makes the paradox more interesting and intricate: it may hold conditionally on a certain survival time, i.e. on an event of the type {T > t} for some but not all t, and it may hold only for some range of survival times.
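The reversal can be shown with a small numeric illustration (hypothetical counts, not from the paper, and event rates rather than the paper's transformation model): within every stratum of Y, X = 1 has the lower event rate, yet marginally X = 1 has the higher rate, because X = 1 units are concentrated in the high-risk stratum.

```python
# (X, stratum of Y) -> (events, number at risk); hypothetical counts
data = {
    (1, 'low'):  (2, 20),   (1, 'high'): (40, 80),
    (0, 'low'):  (16, 80),  (0, 'high'): (12, 20),
}

def rate(events, at_risk):
    return events / at_risk

# X = 1 is better (lower event rate) within each stratum of Y ...
conditional_better = all(
    rate(*data[(1, s)]) < rate(*data[(0, s)]) for s in ('low', 'high')
)

# ... yet worse (higher event rate) when Y is ignored
marginal = {
    x: sum(data[(x, s)][0] for s in ('low', 'high'))
       / sum(data[(x, s)][1] for s in ('low', 'high'))
    for x in (0, 1)
}
```

Here the stratum rates are 0.10 vs 0.20 (low) and 0.50 vs 0.60 (high), both favouring X = 1, while the marginal rates are 0.42 vs 0.28, favouring X = 0.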

