1.

Generalized linear models with covariate measurement error and unknown link function

Nels Johnson 《Journal of applied statistics》2017,44(5):833-852

Generalized linear models (GLMs) with error-in-covariates are useful in epidemiological research due to the ubiquity of non-normal response variables and inaccurate measurements. The link function in GLMs is chosen by the user depending on the type of response variable, frequently the canonical link function. When covariates are measured with error, incorrect inference can be made, compounded by incorrect choice of link function. In this article we propose three flexible approaches for handling error-in-covariates and estimating an unknown link simultaneously. The first approach uses a fully Bayesian (FB) hierarchical framework, treating the unobserved covariate as a latent variable to be integrated over. The second and third are approximate Bayesian approaches that use a Laplace approximation to marginalize the variables measured with error out of the likelihood. Our simulation results suggest that the FB approach is often a better choice than the approximate Bayesian approaches for adjusting for measurement error, particularly when the measurement error distribution is misspecified. These approaches are demonstrated on an application with binary response.

2.

Robust inference in an heteroscedastic measurement error model

Mário de Castro Manuel Galea 《Journal of the Korean Statistical Society》2010,39(4):439-447

In this paper we deal with robust inference in heteroscedastic measurement error models. Rather than the normal distribution, we postulate a Student t distribution for the observed variables. Maximum likelihood estimates are computed numerically. Consistent estimation of the asymptotic covariance matrices of the maximum likelihood and generalized least squares estimators is also discussed. Three test statistics are proposed for testing hypotheses of interest with the asymptotic chi-square distribution which guarantees correct asymptotic significance levels. Results of simulations and an application to a real data set are also reported.

3.

Comparison of designs for multivariate generalized linear models

S. Mukhopadhyay A.I. Khuri 《Journal of statistical planning and inference》2008

The purpose of this paper is to discuss response surface designs for multivariate generalized linear models (GLMs). Such models are considered whenever several response variables can be measured for each setting of a group of control variables, and the response variables are adequately represented by GLMs. The mean-squared error of prediction (MSEP) matrix is used to assess the quality of prediction associated with a given design. The MSEP incorporates both the prediction variance and the prediction bias, which results from using maximum likelihood estimates of the parameters of the fitted linear predictor. For a given design, quantiles of a scalar-valued function of the MSEP are obtained within a certain region of interest. The quantiles depend on the unknown parameters of the linear predictor. The dispersion of these quantiles over the space of the unknown parameters is determined and then depicted by the so-called quantile dispersion graphs. An application of the proposed methodology is presented using the special case of the bivariate binary distribution.

4.

Orthogonality of the mean and error distribution in generalized linear models

Alan Huang Paul J. Rathouz 《Communications in Statistics - Theory and Methods》2017,46(7):3290-3296

We show that the mean-model parameter is always orthogonal to the error distribution in generalized linear models. Thus, the maximum likelihood estimator of the mean-model parameter will be asymptotically efficient regardless of whether the error distribution is known completely, known up to a finite vector of parameters, or left completely unspecified, in which case the likelihood is taken to be an appropriate semiparametric likelihood. Moreover, the maximum likelihood estimator of the mean-model parameter will be asymptotically independent of the maximum likelihood estimator of the error distribution. This generalizes some well-known results for the special cases of normal, gamma, and multinomial regression models, and, perhaps more interestingly, suggests that asymptotically efficient estimation and inferences can always be obtained if the error distribution is nonparametrically estimated along with the mean. In contrast, estimation and inferences using misspecified error distributions or variance functions are generally not efficient.

5.

Kernel Density-Based Linear Regression Estimate

Weixin Yao Zhibiao Zhao 《Communications in Statistics - Theory and Methods》2013,42(24):4499-4512

For linear regression models with non-normally distributed errors, the least squares estimate (LSE) will lose some efficiency compared to the maximum likelihood estimate (MLE). In this article, we propose a kernel density-based regression estimate (KDRE) that is adaptive to the unknown error distribution. The key idea is to approximate the likelihood function by using a nonparametric kernel density estimate of the error density based on some initial parameter estimate. The proposed estimate is shown to be asymptotically as efficient as the oracle MLE, which assumes the error density is known. In addition, we propose an EM-type algorithm to maximize the estimated likelihood function and show that the KDRE can be considered as an iterated weighted least squares estimate, which provides some insight into the adaptiveness of the KDRE to the unknown error distribution. Our Monte Carlo simulation studies show that, while comparable to the traditional LSE for normal errors, the proposed estimation procedure can have a substantial efficiency gain for non-normal errors. Moreover, the efficiency gain can be achieved even for a small sample size.

6.

Regression models for binary longitudinal responses

Murray Aitkin Marco Alfó 《Statistics and Computing》1998,8(4):289-307

Some conditional models to deal with binary longitudinal responses are proposed, extending random effects models to include serial dependence of Markovian form, and hence allowing for quite general association structures between repeated observations recorded on the same individual. The presence of both these components implies a form of dependence between them, and so a complicated expression for the resulting likelihood. To handle this problem, we introduce, as a first instance, what Follmann and Wu (1995) called, in a different setting, an approximate conditional model, which represents an optimal choice for the general framework of categorical longitudinal responses. Then we define two more formally correct models for the binary case, with no assumption about the distribution of the random effect. All of the discussed models are estimated by means of an EM algorithm for nonparametric maximum likelihood. The algorithm, an adaptation of that used by Aitkin (1996) for the analysis of overdispersed generalized linear models, is initially derived as a form of Gaussian quadrature, and then extended to a completely unknown mixing distribution. A large-scale simulation study is described to explore the behaviour of the proposed approaches in a number of different situations.

7.

Restricted ridge estimator in generalized linear models: Monte Carlo simulation studies on Poisson and binomial distributed responses

Fikriye Kurtoğlu M. Revan Özkale 《Communications in Statistics - Simulation and Computation》2019,48(4):1191-1218

It is known that collinearity among the explanatory variables in generalized linear models (GLMs) inflates the variance of maximum likelihood estimators. To overcome multicollinearity in GLMs, the ordinary ridge estimator and the restricted estimator were proposed. In this study, a restricted ridge estimator is introduced by unifying the ordinary ridge estimator and the restricted estimator in GLMs, and its mean squared error (MSE) properties are discussed. The MSE comparisons are done in the context of first-order approximated estimators. The results are illustrated by a numerical example, and two simulation studies are conducted with Poisson and binomial responses.

8.

Mixtures of Linear Regression with Measurement Errors

Weixin Yao Weixing Song 《Communications in Statistics - Theory and Methods》2013,42(8):1602-1614

Existing research on mixtures of regression models is limited to directly observed predictors. The estimation of mixtures of regression for measurement error data poses challenges for statisticians. For linear regression models with measurement error data, the naive ordinary least squares method, which directly substitutes the observed surrogates for the unobserved error-prone variables, yields an inconsistent estimate for the regression coefficients. The same inconsistency also affects the naive mixtures of regression estimate, which is based on the traditional maximum likelihood estimator and simply ignores the measurement error. To solve this inconsistency, we propose to use the deconvolution method to estimate the mixture likelihood of the observed surrogates. Then our proposed estimate is found by maximizing the estimated mixture likelihood. In addition, a generalized EM algorithm is also developed to find the estimate. The simulation results demonstrate that the proposed estimation procedures work well and perform much better than the naive estimates.
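
The inconsistency of naive least squares mentioned in this abstract can be illustrated with a small simulation. The sketch below is not the authors' deconvolution estimator and involves no mixture; it assumes a single linear regression with classical additive error W = X + U, normal X and U, and a known error variance. The function name and all parameter values are hypothetical. Under these assumptions the naive slope shrinks toward zero by the reliability ratio, and dividing by that ratio (a simple method-of-moments correction) recovers the true slope.

```python
import numpy as np

def naive_and_corrected_slopes(n=200_000, beta=2.0, sigma_x=1.0, sigma_u=1.0, seed=0):
    """Simulate y = beta * x + noise, observe w = x + u, and compare slopes."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sigma_x, n)         # true predictor (unobserved in practice)
    w = x + rng.normal(0.0, sigma_u, n)     # observed surrogate with additive error
    y = beta * x + rng.normal(0.0, 0.5, n)  # response generated from the true predictor

    # Naive OLS slope: regress y on the surrogate w, ignoring the error.
    naive = np.cov(w, y)[0, 1] / np.var(w, ddof=1)

    # Reliability ratio, assumed known here (in practice it must be estimated).
    reliability = sigma_x**2 / (sigma_x**2 + sigma_u**2)
    corrected = naive / reliability
    return naive, corrected

naive, corrected = naive_and_corrected_slopes()
print(f"naive slope: {naive:.3f}, corrected slope: {corrected:.3f}")
```

With equal variances for X and U the reliability ratio is 0.5, so the naive slope settles near half the true value no matter how large the sample is.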

9.

Constrained inference for generalized linear models with incomplete covariate data

《Journal of Statistical Computation and Simulation》2012,82(4):693-710

Missing data are common in many experiments, including surveys, clinical trials, epidemiological studies, and environmental studies. Unconstrained likelihood inferences for generalized linear models (GLMs) with nonignorable missing covariates have been studied extensively in the literature. However, parameter orderings or constraints may occur naturally in practice, and thus the efficiency of a statistical method may be improved by incorporating parameter constraints into the likelihood function. In this paper, we consider constrained inference for analysing GLMs with nonignorable missing covariates under linear inequality constraints on the model parameters. Specifically, constrained maximum likelihood (ML) estimation is based on the gradient projection expectation maximization approach. Further, we investigate the asymptotic null distribution of the constrained likelihood ratio test (LRT). Simulations study the empirical properties of the constrained ML estimators and LRTs, which demonstrate improved precision of these constrained techniques. An application to contaminant levels in an environmental study is also presented.

10.

Errors-in-variables beta regression models

Jalmar M.F. Carrasco Reinaldo B. Arellano-Valle 《Journal of applied statistics》2014,41(7):1530-1547

Beta regression models provide an adequate approach for modeling continuous outcomes limited to the interval (0, 1). This paper deals with an extension of beta regression models that allows for explanatory variables to be measured with error. The structural approach, in which the covariates measured with error are assumed to be random variables, is employed. Three estimation methods are presented, namely maximum likelihood, maximum pseudo-likelihood and regression calibration. Monte Carlo simulations are used to evaluate the performance of the proposed estimators and the naïve estimator. Also, a residual analysis for beta regression models with measurement errors is proposed. The results are illustrated on a real data set.

11.

Joint regression modeling for missing categorical covariates in generalized linear models

Luis Carlos Pérez-Ruiz Gabriel Escarela 《Journal of applied statistics》2018,45(15):2741-2759

Missing covariate data are a common issue in generalized linear models (GLMs). A model-based procedure arising from properly specifying joint models for both the partially observed covariates and the corresponding missing indicator variables represents a sound and flexible methodology, which lends itself to maximum likelihood estimation as the likelihood function is available in computable form. In this paper, a novel model-based methodology is proposed for the regression analysis of GLMs when the partially observed covariates are categorical. Pair-copula constructions are used as graphical tools in order to facilitate the specification of the high-dimensional probability distributions of the underlying missingness components. The model parameters are estimated by maximizing the weighted log-likelihood function by using an EM algorithm. In order to compare the performance of the proposed methodology with other well-established approaches, which include complete-cases and multiple imputation, several simulation experiments of Binomial, Poisson and Normal regressions are carried out under both missing at random and not missing at random mechanisms. The methods are illustrated by modeling data from a stage III melanoma clinical trial. The results show that the methodology is rather robust and flexible, representing a competitive alternative to traditional techniques.

12.

Maximum likelihood estimation of polychoric correlations in r × s × t contingency tables

《Journal of Statistical Computation and Simulation》2012,82(1-2):53-67

This paper discusses the maximum likelihood estimation of the polychoric correlation coefficient based on observed frequencies of three polytomous ordinal variables. The underlying latent variables are assumed to have a standardized trivariate normal distribution. The thresholds and correlations are estimated simultaneously via the scoring algorithm. Some practical applications of the method are discussed. An example is reported to illustrate the theory and some technical details are presented in the Appendix.

13.

On generalized least squares estimation of the Weibull distribution

Richard M. Engeman Thomas J. Keefe 《Communications in Statistics - Theory and Methods》2013,42(19):2181-2193

A simple estimation procedure, based on the generalized least squares method, for the parameters of the Weibull distribution is described and investigated. Through a simulation study, this estimation technique is compared with maximum likelihood estimation, ordinary least squares estimation, and Menon's estimation procedure; this comparison is based on observed relative efficiencies (that is, the ratio of the Cramer-Rao lower bound to the observed mean squared error). Simulation results are presented for samples of size 25. Among the estimators considered in this simulation study, the generalized least squares estimator was found to be the "best" estimator for the shape parameter and a close competitor to the maximum likelihood estimator of the scale parameter.
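
As a point of reference for the least squares comparators named in this abstract, the following is a minimal sketch of ordinary (not generalized) least squares on the linearized Weibull distribution function, ln(-ln(1 - F(x))) = k ln x - k ln λ. It is not the paper's generalized least squares procedure, which would additionally account for the unequal variances and correlations of the order statistics. The function name, the use of Bernard's median-rank plotting positions, and the parameter values are assumptions made for illustration.

```python
import numpy as np

def weibull_ols(sample):
    """Estimate Weibull shape k and scale lam by OLS on the probability plot."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    f_hat = (ranks - 0.3) / (n + 0.4)   # Bernard's median-rank plotting positions (assumed choice)
    u = np.log(x)                       # regressor: ln x
    v = np.log(-np.log(1.0 - f_hat))    # response: ln(-ln(1 - F))
    slope, intercept = np.polyfit(u, v, 1)
    shape = slope                       # slope estimates k
    scale = np.exp(-intercept / slope)  # intercept = -k ln(lam)
    return shape, scale

rng = np.random.default_rng(1)
data = 3.0 * rng.weibull(1.5, size=25)  # sample of size 25, true shape 1.5 and scale 3.0
shape_hat, scale_hat = weibull_ols(data)
print(f"shape estimate: {shape_hat:.3f}, scale estimate: {scale_hat:.3f}")
```

With only 25 observations the estimates vary noticeably from sample to sample, which is why the abstract compares estimators by relative efficiency in that setting.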

14.

Student-t censored regression model: properties and inference

Reinaldo B. Arellano-Valle Luis M. Castro Graciela González-Farías Karla A. Muñoz-Gajardo 《Statistical Methods and Applications》2012,21(4):453-473

In statistical analysis, particularly in econometrics, it is usual to consider regression models where the dependent variable is censored (limited). In particular, a censoring scheme to the left of zero is considered here. In this article, an extension of the classical normal censored model is developed by considering independent disturbances with identical Student-t distribution. In the context of maximum likelihood estimation, an expression for the expected information matrix is provided, and an efficient EM-type algorithm for the estimation of the model parameters is developed. In order to know what type of variables affect the income of housewives, the results and methods are applied to a real data set. A brief review on the normal censored regression model or Tobit model is also presented.
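
For orientation, the log-likelihood of the classical normal censored (Tobit) model that this abstract extends can be sketched as follows. This covers only the normal baseline with left-censoring at zero, not the Student-t extension or the EM-type algorithm of the paper. The function names are hypothetical, `mu` stands for the linear predictor values, and censored responses are assumed to be recorded as 0.

```python
import math

def norm_logpdf(z):
    """Log density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)

def norm_cdf(z):
    """Distribution function of the standard normal distribution."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def tobit_loglik(y, mu, sigma):
    """Log-likelihood of a normal regression left-censored at zero.

    y     : observed responses (censored values recorded as 0)
    mu    : linear predictor values, one per observation
    sigma : error standard deviation (> 0)
    """
    total = 0.0
    for yi, mi in zip(y, mu):
        if yi > 0.0:
            # Uncensored: normal density of the observed response.
            total += norm_logpdf((yi - mi) / sigma) - math.log(sigma)
        else:
            # Censored: probability that the latent response is at most zero.
            total += math.log(norm_cdf(-mi / sigma))
    return total

print(tobit_loglik([0.0, 1.2, 0.0, 2.5], [-0.3, 1.0, 0.4, 2.0], 1.0))
```

Replacing the normal density and distribution function with their Student-t counterparts gives the likelihood of the heavier-tailed model discussed in the abstract.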

15.

A mixture latent variable model for modeling mixed data in heterogeneous populations and its applications

Leila Amiri Mojtaba Khazaei Mojtaba Ganjali 《AStA Advances in Statistical Analysis》2018,102(1):95-115

Latent variable models are widely used for joint modeling of mixed data including nominal, ordinal, count and continuous data. In this paper, we consider a latent variable model for jointly modeling relationships between mixed binary, count and continuous variables with some observed covariates. We assume that, given a latent variable, the mixed variables of interest are independent, and that the count and continuous variables have Poisson and normal distributions, respectively. As such data may be extracted from different subpopulations, unobserved heterogeneity has to be taken into account. A mixture distribution is considered for the distribution of the latent variable, which accounts for the heterogeneity. The generalized EM algorithm, which uses the Newton–Raphson algorithm inside the EM algorithm, is used to compute the maximum likelihood estimates of the parameters. The standard errors of the maximum likelihood estimates are computed by using the supplemented EM algorithm. Analysis of the primary biliary cirrhosis data is presented as an application of the proposed model.

16.

A new algorithm for maximum likelihood estimation in normal scale-mixture generalized autoregressive conditional heteroskedastic models

Byungtae Seo 《Journal of Statistical Computation and Simulation》2015,85(1):202-215

In this paper, we propose a new generalized autoregressive conditional heteroskedastic (GARCH) model using infinite normal scale-mixtures which can suitably avoid order selection problems in the application of finite normal scale-mixtures. We discuss its theoretical properties and develop a two-stage algorithm for the maximum likelihood estimator to estimate the mixing distribution (non-parametric maximum likelihood estimator, NPMLE) as well as the GARCH parameters (two-stage MLE). For the estimation of a mixing distribution, we employ a fast computational algorithm proposed by Wang [On fast computation of the non-parametric maximum likelihood estimate of a mixing distribution. J R Stat Soc Ser B. 2007;69:185–198] under the gradient characterization of the non-parametric mixture likelihood. The GARCH parameters are then estimated using either the expectation-maximization algorithm or a general optimization scheme. In addition, we propose a new forecasting algorithm of value-at-risk (VaR) using the two-stage MLE and the NPMLE. Through a simulation study and real data analysis, we compare the performance of the two-stage MLE with the existing ones including quasi-maximum likelihood estimator based on the standard normal density and the finite normal mixture quasi maximum estimated-likelihood estimator (cf. Lee S, Lee T. Inference for Box–Cox transformed threshold GARCH models with nuisance parameters. Scand J Stat. 2012;39:568–589) in terms of the relative efficiency and accuracy of VaR forecasting.

17.

A general maximum likelihood analysis of overdispersion in generalized linear models

Murray Aitkin 《Statistics and Computing》1996,6(3):251-262

This paper presents an EM algorithm for maximum likelihood estimation in generalized linear models with overdispersion. The algorithm is initially derived as a form of Gaussian quadrature assuming a normal mixing distribution, but with only slight variation it can be used for a completely unknown mixing distribution, giving a straightforward method for the fully non-parametric ML estimation of this distribution. This is of value because the ML estimates of the GLM parameters may be sensitive to the specification of a parametric form for the mixing distribution. A listing of a GLIM4 algorithm for fitting the overdispersed binomial logit model is given in an appendix. A simple method is given for obtaining correct standard errors for parameter estimates when using the EM algorithm. Several examples are discussed.
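
The Gaussian quadrature step that this abstract starts from can be sketched for a single observation. The example below is not the paper's EM algorithm or its GLIM4 listing; it only assumes a logit model whose linear predictor receives a normal random effect with standard deviation `sigma`, and approximates the marginal success probability by Gauss-Hermite quadrature. The function name and the number of quadrature points are illustrative choices.

```python
import numpy as np

def marginal_success_prob(eta, sigma, n_points=20):
    """Approximate E[expit(eta + sigma * Z)] for Z ~ N(0, 1) by Gauss-Hermite quadrature."""
    # Nodes and weights for integrals of the form  int f(x) exp(-x^2) dx.
    nodes, weights = np.polynomial.hermite.hermgauss(n_points)
    z = np.sqrt(2.0) * nodes                          # change of variable to a standard normal
    probs = 1.0 / (1.0 + np.exp(-(eta + sigma * z)))  # conditional success probabilities
    return float(np.sum(weights * probs) / np.sqrt(np.pi))

print(marginal_success_prob(1.0, 0.0))  # no overdispersion: plain logistic probability
print(marginal_success_prob(1.0, 2.0))  # overdispersion pulls the marginal toward 0.5
```

In the non-parametric variant described in the abstract, the fixed quadrature nodes and weights are replaced by mass points and masses that are themselves estimated.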

18.

Generalized Linear Latent Variable Modeling for Multi-Group Studies

Jens C. Eickhoff Yasuo Amemiya 《Communications in Statistics - Theory and Methods》2013,42(9-10):1991-2008

Latent variable modeling is commonly used in behavioral, social, and medical science research. The models used in such analysis relate all observed variables to latent common factors. In many applications, the observations are highly non-normal or discrete, e.g., polytomous responses or counts. The existing approaches for non-normal observations can be considered lacking in several aspects, especially for multi-group sample situations. We propose a generalized linear model approach for multi-sample latent variable analysis that can handle a broad class of non-normal and discrete observations, and that furnishes meaningful interpretation and inference in multi-group studies through maximum likelihood analysis. A Monte Carlo EM algorithm is proposed for parameter estimation. Convergence assessment and standard error estimation are addressed. Simulation studies are reported to show the usefulness of our approach. An example from a substance abuse prevention study is also presented.

19.

Inference of Seasonal Long‐memory Time Series with Measurement Error

Henghsiu Tsai Heiko Rachinger Edward M.H. Lin 《Scandinavian Journal of Statistics》2015,42(1):137-154

We consider the Whittle likelihood estimation of seasonal autoregressive fractionally integrated moving‐average models in the presence of an additional measurement error and show that the spectral maximum Whittle likelihood estimator is asymptotically normal. We illustrate by simulation that ignoring measurement errors may result in incorrect inference. Hence, it is pertinent to test for the presence of measurement errors, which we do by developing a likelihood ratio (LR) test within the framework of Whittle likelihood. We derive the non‐standard asymptotic null distribution of this LR test and the limiting distribution of the LR test under a sequence of local alternatives. Because, in practice, we do not know the order of the seasonal autoregressive fractionally integrated moving‐average model, we consider three modifications of the LR test that take model uncertainty into account. We study the finite sample properties of the size and the power of the LR test and its modifications. The efficacy of the proposed approach is illustrated by a real‐life example.

20.

The skew generalized t distribution as the scale mixture of a skew exponential power distribution and its applications in robust estimation

Olcay Arslan Ali İ. Genç 《Statistics》2013,47(5):481-498

In this paper, we consider the family of skew generalized t (SGT) distributions originally introduced by Theodossiou [P. Theodossiou, Financial data and the skewed generalized t distribution, Manage. Sci. Part 1 44 (12) (1998), pp. 1650–1661] as a skew extension of the generalized t (GT) distribution. The SGT distribution family warrants special attention, because it encompasses distributions having both heavy tails and skewness, and many of the widely used distributions such as Student's t, normal, Hansen's skew t, exponential power, and skew exponential power (SEP) distributions are included as limiting or special cases in the SGT family. We show that the SGT distribution can be obtained as the scale mixture of the SEP and generalized gamma distributions. We investigate several properties of the SGT distribution and consider the maximum likelihood estimation of the location, scale, and skewness parameters under the assumption that the shape parameters are known. We show that if the shape parameters are estimated along with the location, scale, and skewness parameters, the influence function for the maximum likelihood estimators becomes unbounded. We obtain the necessary conditions to ensure the uniqueness of the maximum likelihood estimators for the location, scale, and skewness parameters, with known shape parameters. We provide a simple iterative re-weighting algorithm to compute the maximum likelihood estimates for the location, scale, and skewness parameters and show that this simple algorithm can be identified as an EM-type algorithm. We finally present two applications of the SGT distributions in robust estimation.