1.

Bayesian analysis for incomplete multi-way contingency tables with nonignorable nonresponse

Yousung Park 《Journal of applied statistics》2010,37(9):1439-1453

We propose Bayesian methods with five types of priors to estimate cell probabilities in an incomplete multi-way contingency table under nonignorable nonresponse. In this situation, the maximum likelihood (ML) estimates often fall on the boundary of the parameter space, which makes them unstable. To deal with such a multi-way table, we present an EM algorithm which generalizes the previous algorithm used for incomplete one-way tables. Three of the five types of priors were previously introduced, while the other two are newly proposed to reflect different response patterns between respondents and nonrespondents. Data analysis and simulation studies show that, contrary to previous studies, Bayesian estimates based on the three existing priors can be worse than the ML estimates regardless of whether a boundary solution occurs. The Bayesian estimates from the two new priors are the most preferable when a boundary solution occurs. We provide an illustrative example using data from a study of the relationship between a mother's smoking and her newborn's weight.

2.

A general maximum likelihood analysis of overdispersion in generalized linear models

Murray Aitkin 《Statistics and Computing》1996,6(3):251-262

This paper presents an EM algorithm for maximum likelihood estimation in generalized linear models with overdispersion. The algorithm is initially derived as a form of Gaussian quadrature assuming a normal mixing distribution, but with only slight variation it can be used for a completely unknown mixing distribution, giving a straightforward method for the fully non-parametric ML estimation of this distribution. This is of value because the ML estimates of the GLM parameters may be sensitive to the specification of a parametric form for the mixing distribution. A listing of a GLIM4 algorithm for fitting the overdispersed binomial logit model is given in an appendix. A simple method is given for obtaining correct standard errors for parameter estimates when using the EM algorithm. Several examples are discussed.

3.

A GLM approach to step-stress accelerated life testing with interval censoring

Jinsuk Lee Rong Pan 《Journal of statistical planning and inference》2012,142(4):810-819

In this paper, we present a statistical inference procedure for the step-stress accelerated life testing (SSALT) model with a Weibull failure time distribution and interval censoring via a generalized linear model (GLM) formulation. The likelihood function of an interval-censored SSALT is in general too complicated to yield analytical results. However, by transforming the failure time to an exponential distribution and using a binomial random variable for the failure counts occurring in inspection intervals, a GLM formulation with a complementary log-log link function can be constructed. Estimates of the regression coefficients used for the Weibull scale parameter are obtained through the iteratively weighted least squares (IWLS) method, and the shape parameter is updated by direct maximum likelihood (ML) estimation. The confidence intervals for these parameters are estimated through bootstrapping. The application of the proposed GLM approach is demonstrated by an industrial example.
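
The GLM formulation described in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a single hypothetical covariate `x` (for example a stress level), binomial failure counts `y` out of `n` units per inspection interval, no offset for interval length and no shape-parameter update, and it fits only the complementary log-log regression coefficients by iteratively weighted least squares. The function name `fit_cloglog_iwls` and all data are invented for illustration.

```python
import math


def fit_cloglog_iwls(x, n, y, max_iter=100, tol=1e-12):
    """Fit P(failure) = 1 - exp(-exp(b0 + b1 * x)) to binomial counts by IWLS.

    x, n, y are hypothetical per-interval covariates, numbers at risk and
    failure counts. Illustrative sketch only, not the paper's implementation.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        sw = swx = swxx = swz = swxz = 0.0
        for xi, ni, yi in zip(x, n, y):
            eta = b0 + b1 * xi                      # linear predictor
            mu = 1.0 - math.exp(-math.exp(eta))     # inverse cloglog link
            mu = min(max(mu, 1e-10), 1.0 - 1e-10)   # keep mu inside (0, 1)
            dmu = max(math.exp(eta - math.exp(eta)), 1e-10)  # d mu / d eta
            z = eta + (yi / ni - mu) / dmu          # working response
            w = ni * dmu * dmu / (mu * (1.0 - mu))  # working weight
            sw += w
            swx += w * xi
            swxx += w * xi * xi
            swz += w * z
            swxz += w * xi * z
        # Solve the 2x2 weighted least squares normal equations.
        det = sw * swxx - swx * swx
        new_b1 = (sw * swxz - swx * swz) / det
        new_b0 = (swz - new_b1 * swx) / sw
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1


# Synthetic check: proportions generated exactly from b0 = -1.0, b1 = 0.5.
x = [0.0, 1.0, 2.0, 3.0]
n = [1000.0] * 4
y = [ni * (1.0 - math.exp(-math.exp(-1.0 + 0.5 * xi))) for xi, ni in zip(x, n)]
b0_hat, b1_hat = fit_cloglog_iwls(x, n, y)
```

Because the synthetic proportions are noise-free, the fitted coefficients should agree with the generating values up to numerical tolerance; with real inspection data they would of course carry sampling error.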

4.

A new algorithm to estimate monotone nonparametric link functions and a comparison with parametric approach

Xin Wang Vivekananda Roy Zhengyuan Zhu 《Statistics and Computing》2018,28(5):1083-1094

The generalized linear model (GLM) is a class of regression models where the means of the response variables and the linear predictors are joined through a link function. The standard GLM assumes the link function is fixed, and one can form a more flexible GLM by either estimating the link function from a parametric family of link functions or estimating it nonparametrically. In this paper, we propose a new algorithm that uses P-splines to estimate the link function nonparametrically while guaranteeing that it is monotone. This is equivalent to fitting the generalized single index model with a monotonicity constraint. We also conduct extensive simulation studies to compare our nonparametric approach for estimating the link function with various parametric approaches, including the traditional logit, probit and robit link functions, and two recently developed link functions, the generalized extreme value link and the symmetric power logit link. The simulation study shows that the link function estimated nonparametrically by our proposed algorithm performs well under a wide range of different true link functions and outperforms parametric approaches when they are misspecified. A real data example is used to illustrate the results.

5.

On selecting parametric link transformation families in generalized linear models

《Journal of statistical planning and inference》1997,61(1):125-139

The use of parametric link transformation families in generalized linear models (GLM) has been shown to improve substantially the fit of standard analyses using a fixed link in some data sets (see Czado, 1993, for example). When link and regression parameters are globally orthogonal (Cox and Reid, 1987), the variance inflation of the regression parameter estimates due to the additional estimation of the link is asymptotically zero. Parameter orthogonality also induces numerical stability, which is seen in the reduction of computation time required for the calculation of parameter estimates. This stability remains a desirable property even for inferences which are conditional on a fixed link value. Czado and Santner (1992b), for binomial error, and Czado (1992), for GLMs, have shown that only local orthogonality can be achieved in general. This paper provides conditions on the link family to extend the notion of local orthogonality at a point to orthogonality in a neighborhood asymptotically and shows that the resulting links are location and scale invariant. General concepts for the construction of such links are given, and it is shown how they relate to link families proposed in the literature. The ideas are illustrated by two examples.

6.

Simple Formula for Calculating Bias‐corrected AIC in Generalized Linear Models

Shinpei Imori Hirokazu Yanagihara Hirofumi Wakaki 《Scandinavian Journal of Statistics》2014,41(2):535-555

In real‐data analysis, deciding the best subset of variables in regression models is an important problem. Akaike's information criterion (AIC) is often used in order to select variables in many fields. When the sample size is not so large, the AIC has a non‐negligible bias that will detrimentally affect variable selection. The present paper considers a bias correction of AIC for selecting variables in the generalized linear model (GLM). The GLM can express a number of statistical models by changing the distribution and the link function, such as the normal linear regression model, the logistic regression model, and the probit model, which are currently commonly used in a number of applied fields. In the present study, we obtain a simple expression for a bias‐corrected AIC (corrected AIC, or CAIC) in GLMs. Furthermore, we provide an ‘R’ code based on our formula. A numerical study reveals that the CAIC has better performance than the AIC for variable selection.
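
For orientation, the uncorrected criterion and one classical small-sample correction can be written out. The paper's own CAIC formula for general GLMs is not reproduced here; the second expression is the well-known correction for the normal linear regression model (Sugiura; Hurvich and Tsai), shown only to illustrate what a bias correction looks like. The symbols are notation assumed for this sketch: $\hat{\theta}$ is the ML estimate, $k$ the number of estimated parameters (including the error variance in the normal case) and $n$ the sample size.

```latex
\mathrm{AIC} = -2 \log L(\hat{\theta}) + 2k,
\qquad
\mathrm{AIC}_c = \mathrm{AIC} + \frac{2k(k+1)}{n-k-1}.
```

The correction term vanishes as $n \to \infty$ with $k$ fixed, which is consistent with the abstract's remark that the bias of the AIC matters when the sample size is not large.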

7.

Choosing the link function and accounting for link uncertainty in generalized linear models using Bayes factors

Claudia Czado Adrian E. Raftery 《Statistical Papers》2006,47(3):419-442

One important component of model selection using generalized linear models (GLM) is the choice of a link function. We propose using approximate Bayes factors to assess the improvement in fit over a GLM with canonical link when a parametric link family is used. The approximate Bayes factors are calculated using the Laplace approximations given in [32], together with a reference set of prior distributions. This methodology can be used to differentiate between different parametric link families, as well as allowing one to jointly select the link family and the independent variables. This involves comparing nonnested models and so standard significance tests cannot be used. The approach also accounts explicitly for uncertainty about the link function. The methods are illustrated using parametric link families studied in [12] for two data sets involving binomial responses. The first author was supported by Sonderforschungsbereich 386 Statistische Analyse Diskreter Strukturen, and the second author by NIH Grant 1R01CA094212-01 and ONR Grant N00014-01-10745.

8.

Marginal regression models with a time to event outcome and discrete multiple source predictors

Litman HJ Horton NJ Murphy JM Laird NM 《Lifetime data analysis》2006,12(3):249-265

Information from multiple informants is frequently used to assess psychopathology. We consider marginal regression models with multiple informants as discrete predictors and a time to event outcome. We fit these models to data from the Stirling County Study; specifically, the models predict mortality from self report of psychiatric disorders and also predict mortality from physician report of psychiatric disorders. Previously, Horton et al. found little relationship between self and physician reports of psychopathology, but that the relationship of self report of psychopathology with mortality was similar to that of physician report of psychopathology with mortality. Generalized estimating equations (GEE) have been used to fit marginal models with multiple informant covariates; here we develop a maximum likelihood (ML) approach and show how it relates to the GEE approach. In a simple setting using a saturated model, the ML approach can be constructed to provide estimates that match those found using GEE. We extend the ML technique to consider multiple informant predictors with missingness and compare the method to using inverse probability weighted (IPW) GEE. Our simulation study illustrates that IPW GEE loses little efficiency compared with ML in the presence of monotone missingness. Our example data has non-monotone missingness; in this case, ML offers a modest decrease in variance compared with IPW GEE, particularly for estimating covariates in the marginal models. In more general settings, e.g., categorical predictors and piecewise exponential models, the likelihood parameters from the ML technique do not have the same interpretation as the GEE. Thus, the GEE is recommended to fit marginal models for its flexibility, ease of interpretation and comparable efficiency to ML in the presence of missing data.

9.

Measurement error in the generalised linear model

Ben Armstrong 《Communications in Statistics - Simulation and Computation》2013,42(3):529-544

This paper considers the problem of estimating the linear parameters of a Generalised Linear Model (GLM) when the explanatory variable is subject to measurement error. In this situation the induced model for dependence on the approximate explanatory variable is not usually of GLM form. However, when the distribution of measurement error is known or estimated from replicated measurements, application of the GLIM iteratively reweighted least squares algorithm with transformed data and weighting is shown to produce maximum quasi likelihood estimates in many cases. Details of this approach are given for two particular generalized linear models; simulation results illustrate the usefulness of the theory for these models.

10.

Generalized Estimation of the BLUP in Mixed-Effects Models: A Comparison with ML and REML

Ching-Ray Yu Kelly H. Zou Martin O. Carlsson Samaradasa Weerahandi 《Communications in Statistics - Simulation and Computation》2015,44(3):694-704

The Best Linear Unbiased Predictor (BLUP) in mixed models is a function of the variance components, which are estimated using maximum likelihood (ML) or restricted ML (REML) methods. Nonconvergence of the BLUP can occur due to a drawback of the standard likelihood-based approaches. In such situations, ML and REML either do not provide any BLUPs or yield BLUPs that are all equal. To overcome this drawback, we provide a generalized estimate (GE) of the BLUP that does not suffer from the problem of negative or zero variance components, and compare its performance against the ML and REML estimates of the BLUP. Simulated and published data are used to compare the BLUPs.

11.

Flexible Tweedie regression models for continuous data

Wagner Hugo Bonat Célestin C. Kokonendji 《Journal of Statistical Computation and Simulation》2017,87(11):2138-2152

Tweedie regression models (TRMs) provide a flexible family of distributions to deal with non-negative right-skewed data and can handle continuous data with probability mass at zero. Estimation and inference of TRMs based on the maximum likelihood (ML) method are challenged by the presence of an infinite sum in the probability function and non-trivial restrictions on the power parameter space. In this paper, we propose two approaches for fitting TRMs, namely quasi-likelihood (QML) and pseudo-likelihood (PML). We discuss their asymptotic properties and perform simulation studies to compare our methods with the ML method. We show that the QML method provides asymptotically efficient estimation for regression parameters. Simulation studies showed that the QML and PML approaches present estimates, standard errors and coverage rates similar to the ML method. Furthermore, the second-moment assumptions required by the QML and PML methods enable us to extend the TRMs to the class of quasi-TRMs in Wedderburn's style. This allows us to eliminate the non-trivial restriction on the power parameter space, and thus provides a flexible regression model to deal with continuous data. We provide an R implementation and illustrate the application of TRMs using three data sets.

12.

Variable selection for multivariate generalized linear models

Xiaoguang Wang Junhui Fan 《Journal of applied statistics》2014,41(2):393-406

Generalized linear models (GLMs) are widely studied to deal with complex response variables. For the analysis of categorical dependent variables with more than two response categories, multivariate GLMs are used to model the relationship between this polytomous response and a set of regressors. Traditional variable selection approaches have been proposed in the literature for the multivariate GLM with a canonical link function when the number of parameters is fixed. However, in many model selection problems, the number of parameters may be large and grow with the sample size. In this paper, we present a new selection criterion for the model with a diverging number of parameters. Under suitable conditions, the criterion is shown to be model selection consistent. A simulation study and a real data analysis are conducted to support the theoretical findings.

13.

Some results for maximum likelihood estimation of adjusted relative risks

Bernardo Borba de Andrade Joanlise Marco de Leon Andrade 《Communications in Statistics - Theory and Methods》2018,47(23):5750-5769

Maximum likelihood (ML) estimation of relative risks via log-binomial regression requires a restricted parameter space. Computation via nonlinear programming is simple to implement and has a high convergence rate. We show that the optimization problem is well posed (convex domain and convex objective) and provide a variance formula along with a methodology for obtaining standard errors and prediction intervals which account for estimates on the boundary of the parameter space. We performed simulations under several scenarios already used in the literature in order to assess the performance of ML and of two other common estimation methods.

14.

Testing Inference from Logistic Regression Models in Data with Unobserved Heterogeneity at Cluster Levels

Salma Ayis 《Communications in Statistics - Simulation and Computation》2013,42(6):1202-1211

Clustering due to unobserved heterogeneity may seriously affect inference from binary regression models. We examined the performance of the logistic and the logistic-normal models for data with such clustering. For the logistic model, the total variance of unobserved heterogeneity, rather than the level of clustering, determines the size of the bias of the maximum likelihood (ML) estimator. Incorrect specification of clustering as level 2, using the logistic-normal model, provides biased estimates of the structural and random parameters, while specifying level 1 provides unbiased estimates for the former and adequately estimates the latter. The proposed procedure appeals to many research areas.

15.

Multiple imputation for gamma outcome variable using generalized linear model

Vinay K. Gupta Gurprit Grover 《Journal of Statistical Computation and Simulation》2017,87(10):1980-1988

We used a proper multiple imputation (MI) through a Gibbs sampling approach to impute missing values of a gamma distributed outcome variable which were missing at random, using a generalized linear model (GLM) with identity link function. The missing values of the outcome variable were multiply imputed using the GLM, and the complete data sets obtained after MI were then analysed through the GLM again for estimation. We examined the performance of the proposed technique through a simulation study with data sets having four moderate and large proportions of missing values: 10%, 20%, 30% and 50%. We also applied this technique to a real-life data set and compared the results with those obtained by applying the GLM only to observed cases. The results showed that the proposed technique gave better results for moderate proportions of missing values.

16.

Revisiting the transitional dynamics of business cycle phases with mixed-frequency data

Marie Bessec 《Econometric Reviews》2019,38(7):711-732

This paper introduces a Markov-switching model in which transition probabilities depend on higher frequency indicators and their lags through polynomial weighting schemes. The MSV-MIDAS model is estimated through maximum likelihood (ML) methods with a slightly modified version of Hamilton’s filter. Monte Carlo simulations show that ML provides accurate estimates, but they suggest some caution in interpreting the tests of the parameters in the transition probabilities. We apply this new model to forecast business cycle turning points in the United States. We properly detect recessions by exploiting the link between GDP growth and higher frequency variables from financial and energy markets.

17.

Missing data mechanisms and their implications on the analysis of categorical data

Frederico Z. Poleto Julio M. Singer Carlos Daniel Paulino 《Statistics and Computing》2011,21(1):31-43

We review some issues related to the implications of different missing data mechanisms on statistical inference for contingency tables and consider simulation studies to compare the results obtained under such models to those where the units with missing data are disregarded. We confirm that although, in general, analyses under the correct missing at random and missing completely at random models are more efficient even for small sample sizes, there are exceptions where they may not improve the results obtained by ignoring the partially classified data. We show that under the missing not at random (MNAR) model, estimates on the boundary of the parameter space as well as lack of identifiability of the parameters of saturated models may be associated with undesirable asymptotic properties of maximum likelihood estimators and likelihood ratio tests; even in standard cases the bias of the estimators may be low only for very large samples. We also show that the probability of a boundary solution obtained under the correct MNAR model may be large even for large samples and that, consequently, we may not always conclude that a MNAR model is misspecified because the estimate is on the boundary of the parameter space.

18.

Properties of the beta regression model for small area estimation of proportions and application to estimation of poverty rates

Ryan Janicki 《Communications in Statistics - Theory and Methods》2020,49(9):2264-2284

Linear mixed effects models have been popular in small area estimation problems for modeling survey data when the sample size in one or more areas is too small for reliable inference. However, when the data are restricted to a bounded interval, the linear model may be inappropriate, particularly if the data are near the boundary. Nonlinear sampling models are becoming increasingly popular for small area estimation problems when the normal model is inadequate. This paper studies the use of a beta distribution as an alternative to the normal distribution as a sampling model for survey estimates of proportions which take values in (0, 1). Inference for small area proportions based on the posterior distribution of a beta regression model ensures that point estimates and credible intervals take values in (0, 1). Properties of a hierarchical Bayesian small area model with a beta sampling distribution and logistic link function are presented and compared to those of the linear mixed effect model. Propriety of the posterior distribution using certain noninformative priors is shown, and behavior of the posterior mean as a function of the sampling variance and the model variance is described. An example using 2010 Small Area Income and Poverty Estimates (SAIPE) data is given, and a numerical example studying small sample properties of the model is presented.

19.

THE ASYMPTOTIC EQUIVALENCE OF THE FISHER INFORMATION MATRICES FOR TYPE I AND TYPE II CENSORED DATA FROM LOCATION-SCALE FAMILIES

《Communications in Statistics - Theory and Methods》2013,42(10):2211-2225

Type I and Type II censored data arise frequently in controlled laboratory studies concerning time to a particular event (e.g., death of an animal or failure of a physical device). Log-location-scale distributions (e.g., Weibull, lognormal, and loglogistic) are commonly used to model the resulting data. Maximum likelihood (ML) is generally used to obtain parameter estimates when the data are censored. The Fisher information matrix can be used to obtain large-sample approximate variances and covariances of the ML estimates or to estimate these variances and covariances from data. The derivations of the Fisher information matrix proceed differently for Type I (time censoring) and Type II (failure censoring) because the number of failures is random in Type I censoring, but the length of the data collection period is random in Type II censoring. Under regularity conditions (met with the above-mentioned log-location-scale distributions), we outline the different derivations and show that the Fisher information matrices for Type I and Type II censoring are asymptotically equivalent.
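
The distinction the abstract draws between the two censoring schemes can be made concrete with a small simulation sketch. It does not compute any Fisher information; it only shows which quantity is fixed and which is random under each scheme. The function names, the Weibull parameters, the censoring time `0.8` and the failure count `5` are all hypothetical choices for illustration.

```python
import random


def type1_censor(times, t_c):
    """Time censoring: stop at the fixed time t_c; the failure count is random."""
    observed = sorted(t for t in times if t <= t_c)
    return observed, t_c  # (observed failure times, study duration)


def type2_censor(times, r):
    """Failure censoring: stop at the r-th failure; the duration is random.

    Assumes 1 <= r <= len(times).
    """
    observed = sorted(times)[:r]
    return observed, observed[-1]  # duration equals the r-th failure time


# Hypothetical lifetimes: 20 units, Weibull with scale 1.0 and shape 1.5.
rng = random.Random(0)
times = [rng.weibullvariate(1.0, 1.5) for _ in range(20)]

obs1, dur1 = type1_censor(times, 0.8)  # duration fixed, len(obs1) varies
obs2, dur2 = type2_censor(times, 5)    # len(obs2) fixed, dur2 varies
```

Rerunning with a different seed changes `len(obs1)` and `dur2` but leaves `dur1` and `len(obs2)` unchanged, which is the asymmetry that makes the two information-matrix derivations proceed differently.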

20.

Estimating large-scale general linear and seemingly unrelated regressions models after deleting observations

Stella Hadjiantoni Erricos John Kontoghiorghes 《Statistics and Computing》2017,27(2):349-361

A new numerical method to solve the downdating problem (and variants thereof), namely removing the effect of some observations from the generalized least squares (GLS) estimator of the general linear model (GLM) after it has been estimated, is extensively investigated. It is verified that the solution of the downdated least squares problem can be obtained from the estimation of an equivalent GLM, where the original model is updated with the imaginary deleted observations. This updated GLM has a non-positive-definite dispersion matrix which comprises complex covariance values, and it is proved herein to yield the same normal equations as the downdated model. Additionally, the problem of deleting observations from the seemingly unrelated regressions model is addressed, demonstrating the direct applicability of this method to other multivariate linear models. The algorithms which implement the novel downdating method make efficient use of the previous computations from the estimation of the original model. As a result, the computational cost is significantly reduced. This shows the great usability potential of the downdating method in computationally intensive problems. The downdating algorithms have been applied to real and synthetic data to illustrate their efficiency.