Similar articles
 Found 20 similar articles; search time: 31 ms
1.
In the estimation of cell probabilities from a two-way contingency table, suppose that a priori the classification variables are believed independent. New empirical Bayes and Bayes estimators are proposed which shrink the observed proportions towards classical estimates under the model of independence. The estimators, based on a Dirichlet mixture class of priors, compare favorably to an estimator of Laird (1978) that is based on a normal prior on terms of a log-linear model. The methods are generalized to three-way tables.
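To make the idea of shrinking towards independence concrete, here is a minimal sketch. It assumes a single Dirichlet prior centered on the independence fit with a user-chosen prior weight, which is a simplification for illustration and not the Dirichlet mixture estimator proposed in the paper. The function name `shrink_to_independence`, the parameter `prior_weight`, and the example counts are all hypothetical.

```python
def shrink_to_independence(table, prior_weight):
    """Posterior-mean cell probabilities under a Dirichlet prior whose mean is
    the independence estimate (row proportion times column proportion).

    prior_weight is an assumed tuning constant: 0 returns the observed
    proportions, and large values pull the estimates towards independence.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    estimates = []
    for i, row in enumerate(table):
        est_row = []
        for j, n in enumerate(row):
            # Classical estimate of the cell probability under independence.
            indep = (row_totals[i] / total) * (col_totals[j] / total)
            est_row.append((n + prior_weight * indep) / (total + prior_weight))
        estimates.append(est_row)
    return estimates


counts = [[12, 3], [5, 20]]  # hypothetical 2 x 2 table
shrunk = shrink_to_independence(counts, prior_weight=10.0)
```

The estimates always sum to 1, because the independence estimates themselves sum to 1 across cells.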

2.
Relative risks are often considered preferable to odds ratios for quantifying the association between a predictor and a binary outcome. Relative risk regression is an alternative to logistic regression where the parameters are relative risks rather than odds ratios. It uses a log link binomial generalised linear model, or log‐binomial model, which requires parameter constraints to prevent probabilities from exceeding 1. This leads to numerical problems with standard approaches for finding the maximum likelihood estimate (MLE), such as Fisher scoring, and has motivated various non‐MLE approaches. In this paper we discuss the roles of the MLE and its main competitors for relative risk regression. It is argued that reliable alternatives to Fisher scoring mean that numerical issues are no longer a motivation for non‐MLE methods. Nonetheless, non‐MLE methods may be worthwhile for other reasons and we evaluate this possibility for alternatives within a class of quasi‐likelihood methods. The MLE obtained using a reliable computational method is recommended, but this approach requires bootstrapping when estimates are on the parameter space boundary. If convenience is paramount, then quasi‐likelihood estimation can be a good alternative, although parameter constraints may be violated. Sensitivity to model misspecification and outliers is also discussed along with recommendations and priorities for future research.

3.
Missing data methods, maximum likelihood estimation (MLE) and multiple imputation (MI), for longitudinal questionnaire data were investigated via simulation. Predictive mean matching (PMM) was applied at both item and scale levels, logistic regression at item level and multivariate normal imputation at scale level. We investigated a hybrid approach which is a combination of MLE and MI, i.e. scales from the imputed data are eliminated if all underlying items were originally missing. Bias and mean square error (MSE) for parameter estimates were examined. ML occasionally seemed to provide the best results in terms of bias, but hardly ever in terms of MSE. All imputation methods at the scale level and logistic regression at item level hardly ever showed the best performance. The hybrid approach is similar to or better than its original MI. The PMM-hybrid approach at item level demonstrated the best MSE for most settings and in some cases also the smallest bias.

4.
A representation of sums and differences of the form 2n log n, the lnn function, is introduced to express likelihood-ratio chi-square test statistics in contingency table analysis. This is a concise explicit form to display when partitioning chi-square statistics in accordance with hierarchical models. The lnn representation gives students insights into the construction of test statistics, and assists in relating identical forms under differing model sets. Hierarchies are presented for independence and equi-probability in two-way tables, for symmetry in correlated square tables, for independence-and-homogeneity of two-way responses across levels of a factor, and for mutual independence in three-way tables, along with relevant partitions of chi-square.
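As an illustration of the kind of identity this abstract describes, the sketch below assumes lnn(n) = 2n log n with the convention lnn(0) = 0. For a two-way table it checks that the likelihood-ratio statistic for independence equals lnn summed over cells, minus lnn summed over row totals, minus lnn summed over column totals, plus lnn of the grand total. The function names and the example counts are hypothetical and are not taken from the article.

```python
import math


def lnn(n):
    """Return 2 * n * log(n), using the convention lnn(0) = 0."""
    return 2.0 * n * math.log(n) if n > 0 else 0.0


def g2_independence_lnn(table):
    """Likelihood-ratio chi-square for independence as sums and differences of lnn terms."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return (sum(lnn(n) for row in table for n in row)
            - sum(lnn(r) for r in row_totals)
            - sum(lnn(c) for c in col_totals)
            + lnn(total))


def g2_independence_direct(table):
    """Same statistic from the textbook form 2 * sum(n_ij * log(n_ij / e_ij))."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    g2 = 0.0
    for i, row in enumerate(table):
        for j, n in enumerate(row):
            if n > 0:  # empty cells contribute nothing to the sum
                expected = row_totals[i] * col_totals[j] / total
                g2 += 2.0 * n * math.log(n / expected)
    return g2


counts = [[10, 20, 30], [20, 20, 20]]  # hypothetical 2 x 3 table
```

The two functions agree up to floating-point rounding, since log(n_ij / e_ij) expands into log n_ij, minus the logs of the row and column totals, plus the log of the grand total.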

5.
For a two-dimensional contingency table of probabilities, the concept of symmetry around the main diagonal is well defined. Statistical hypothesis tests of symmetry versus positive bias have also been explored. For tables of higher (three or more) dimensions, however, different concepts of symmetry are available. In this study, we consider statistical inference procedures of symmetry in partial tables versus various biases in three-dimensional tables. We find the maximum likelihood estimates of the cell probabilities and the asymptotic distribution of the likelihood ratio test statistic in each case. Simulation studies are used to investigate the sizes and powers of the tests. The methodologies developed are applied to real data sets.

6.
This article discusses a representation of Pearson's chi-square for independence in two-way contingency tables in terms of conditional probabilities of two categorical random variables and proposes a functional interpretation of Pearson's chi-square. This representation is suggested for use in the teaching of statistical independence between categorical variables.
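One standard way to write Pearson's chi-square through conditional probabilities is worked out below; the article's own representation may differ. The notation is an assumption made here: N is the sample size, \( \hat p_{ij} \) are the observed cell proportions, \( \hat p_{i+} \) and \( \hat p_{+j} \) are the row and column margins, and \( \hat p(i \mid j) = \hat p_{ij} / \hat p_{+j} \), \( \hat p(j \mid i) = \hat p_{ij} / \hat p_{i+} \).

```latex
\begin{aligned}
X^2 &= N \sum_{i,j} \frac{(\hat p_{ij} - \hat p_{i+}\,\hat p_{+j})^2}{\hat p_{i+}\,\hat p_{+j}} \\
    &= N \left( \sum_{i,j} \frac{\hat p_{ij}^{\,2}}{\hat p_{i+}\,\hat p_{+j}}
        - 2 \sum_{i,j} \hat p_{ij} + \sum_{i,j} \hat p_{i+}\,\hat p_{+j} \right) \\
    &= N \left( \sum_{i,j} \hat p(i \mid j)\, \hat p(j \mid i) - 1 \right)
\end{aligned}
```

The last step uses the fact that the cell proportions, and the products of the margins, each sum to 1. Under exact independence every product \( \hat p(i \mid j)\, \hat p(j \mid i) \) equals \( \hat p_{i+}\,\hat p_{+j} \), so the sum is 1 and the statistic is 0.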

7.
The paper describes a generalized iterative proportional fitting procedure that can be used for maximum likelihood estimation in a special class of the general log‐linear model. The models in this class, called relational, apply to multivariate discrete sample spaces that do not necessarily have a Cartesian product structure and may not contain an overall effect. When applied to the cell probabilities, the models without the overall effect are curved exponential families and the values of the sufficient statistics are reproduced by the MLE only up to a constant of proportionality. The paper shows that Iterative Proportional Fitting, Generalized Iterative Scaling, and Improved Iterative Scaling fail to work for such models. The algorithm proposed here is based on iterated Bregman projections. As a by‐product, estimates of the multiplicative parameters are also obtained. An implementation of the algorithm is available as an R‐package.

8.
This paper develops alternatives to maximum likelihood estimators (MLE) for logistic regression models and compares the mean squared error (MSE) of the estimators. The MLE for the vector of underlying success probabilities has low MSE only when the true probabilities are extreme (i.e., near 0 or 1). Extreme probabilities correspond to logistic regression parameter vectors which are large in norm. A competing “restricted” MLE and an empirical version of it are suggested as estimators with better performance than the MLE for central probabilities. An approximate EM-algorithm for estimating the restriction is described. As in the case of normal theory ridge estimators, the proposed estimators are shown to be formally derivable by Bayes and empirical Bayes arguments. The small sample operating characteristics of the proposed estimators are compared to the MLE via a simulation study; both the estimation of individual probabilities and of logistic parameters are considered.

9.
Summary. A method of inputting prior opinion in contingency tables is described. The method can be used to incorporate beliefs of independence or symmetry but extensions are straightforward. Logistic normal distributions that express such beliefs are used as priors of the cell probabilities and posterior estimates are derived. Empirical Bayes methods are also discussed and approximate posterior variances are provided. The methods are illustrated by a numerical example.

10.
11.
The authors study the empirical likelihood method for linear regression models. They show that when missing responses are imputed using least squares predictors, the empirical log‐likelihood ratio is asymptotically a weighted sum of chi‐square variables with unknown weights. They obtain an adjusted empirical log‐likelihood ratio which is asymptotically standard chi‐square and hence can be used to construct confidence regions. They also obtain a bootstrap empirical log‐likelihood ratio and use its distribution to approximate that of the empirical log‐likelihood ratio. A simulation study indicates that the proposed methods are comparable in terms of coverage probabilities and average lengths of confidence intervals, and perform better than a normal approximation based method.

12.
The proportional odds model (POM) is commonly used in regression analysis to predict the outcome for an ordinal response variable. The maximum likelihood estimation (MLE) approach is typically used to obtain the parameter estimates. The likelihood estimates do not exist when the number of parameters, p, is greater than the number of observations, n. The MLE also does not exist if there are no overlapping observations in the data. In a situation where the number of parameters is less than the sample size but p approaches n, the likelihood estimates may not exist, and if they exist they may have quite large standard errors. An estimation method is proposed to address the last two issues, i.e. complete separation and the case when p approaches n, but not the case when p>n. The proposed method does not use any penalty term but uses pseudo-observations to regularize the observed responses by downgrading their effect so that they become close to the underlying probabilities. The estimates can be computed easily with all commonly used statistical packages supporting the fitting of POMs with weights. Estimates are compared with MLE in a simulation study and an application to real data.

13.
We consider a sequence of contingency tables whose cell probabilities may vary randomly. The distribution of cell probabilities is modelled by a Dirichlet distribution. Bayes and empirical Bayes estimates of the log odds ratio are obtained. Emphasis is placed on estimating the risks associated with the Bayes, empirical Bayes and maximum likelihood estimates of the log odds ratio.

14.
A new technique for the detection of outliers in contingency tables is introduced, where outliers are unusual cell counts with respect to classical loglinear Poisson models. Subsets of cell counts called minimal patterns are defined, corresponding to non-singular design matrices and leading to potentially uncontaminated maximum-likelihood estimates of the model parameters and thereby the expected cell counts. A criterion to easily produce minimal patterns in the two-way case under independence is derived, based on the analysis of the positions of the chosen cells. A simulation study and a couple of real-data examples are presented to illustrate the performance of the newly developed outlier identification algorithm, and to compare it with other existing methods.

15.
Summary. The maximum likelihood estimator (MLE) for the proportional hazards model with partly interval-censored data is studied. Under appropriate regularity conditions, the MLEs of the regression parameter and the cumulative hazard function are shown to be consistent and asymptotically normal. Two methods to estimate the variance–covariance matrix of the MLE of the regression parameter are considered, based on a generalized missing information principle and on a generalized profile information procedure. Simulation studies show that both methods work well in terms of the bias and variance for samples of moderate size. An example illustrates the methods.

16.
Probability forecasting models can be estimated using weighted score functions that (by definition) capture the performance of the estimated probabilities relative to arbitrary “baseline” probability assessments, such as those produced by another model, by a bookmaker or betting market, or by a human probability assessor. Maximum likelihood estimation (MLE) is interpretable as just one such method of optimum score estimation. We find that when MLE-based probabilities are themselves treated as the baseline, forecasting models estimated by optimizing any of the proven families of power and pseudospherical economic score functions yield the very same probabilities as MLE. The finding that probabilities estimated by optimum score estimation respond to MLE-baseline probabilities by mimicking them supports reliance on MLE as the default form of optimum score estimation.

17.
For models with random effects or missing data, the likelihood function is sometimes intractable analytically but amenable to Monte Carlo approximation. To get a good approximation, the parameter value that drives the simulations should be sufficiently close to the maximum likelihood estimate (MLE) which unfortunately is unknown. Introducing a working prior distribution, we express the likelihood function as a posterior expectation and approximate it using posterior simulations. If the sample size is large, the sample information is likely to outweigh the prior specification and the posterior simulations will be concentrated around the MLE automatically, leading to good approximation of the likelihood near the MLE. For smaller samples, we propose to use the current posterior as the next prior distribution to make the posterior simulations closer to the MLE and hence improve the likelihood approximation. By using the technique of data duplication, we can simulate from the sharpened posterior distribution without actually updating the prior distribution. The suggested method works well in several test cases. A more complex example involving censored spatial data is also discussed.

18.
We consider parametric regression problems with some covariates missing at random. It is shown that the regression parameter remains identifiable under natural conditions. When the always observed covariates are discrete, we propose a semiparametric maximum likelihood method, which does not require parametric specification of the missing data mechanism or the covariate distribution. The global maximum likelihood estimator (MLE), which maximizes the likelihood over the whole parameter set, is shown to exist under simple conditions. For ease of computation, we also consider a restricted MLE which maximizes the likelihood over covariate distributions supported by the observed values. Under regularity conditions, the two MLEs are asymptotically equivalent and strongly consistent for a class of topologies on the parameter set.

19.
Missing observations often occur in cross-classified data collected during observational, clinical, and public health studies. Inappropriate treatment of missing data can reduce statistical power and give biased results. This work extends the Baker, Rosenberger and Dersimonian modeling approach to compute maximum likelihood estimates for cell counts in three-way tables with missing data, and studies the association between two dichotomous variables while controlling for a third variable in \( 2\times 2 \times K \) tables. This approach is applied to the Behavioral Risk Factor Surveillance System data. Simulation studies are used to investigate the efficiency of estimation of the common odds ratio.

20.
Classical analysis of contingency tables employs (i) fixed sample sizes and (ii) the maximum likelihood and weighted least squares approach to parameter estimation. It is well-known, however, that certain important parameters, such as the main effect and interaction parameters, can never be estimated unbiasedly when the sample size is fixed a priori. We introduce a sequential unbiased estimator for the cell probabilities subject to log linear constraints. As a simple consequence, we show how parameters such as those mentioned above may be estimated unbiasedly. Our unbiased estimator for the vector of cell probabilities is shown to be consistent in the sense of Wolfowitz (Ann. Math. Statist. (1947) 18). We give a sufficient condition on a multinomial stopping rule for the corresponding sufficient statistic to be complete. When this condition holds, we have a unique minimum variance unbiased estimator for the cell probabilities.
