Found 20 similar articles (search time: 31 ms)
Tsou (2003a) proposed a parametric procedure for making robust inference for mean regression parameters in the context of generalized linear models. This robust procedure is extended to model variance heterogeneity. The normal working model is adjusted to become asymptotically robust for inference about regression parameters of the variance function for practically all continuous response variables. The connection between the novel robust variance regression model and the estimating equations approach is also provided.

Heteroscedasticity generally exists when a linear regression model is applied to real-world problems. Accurately estimating the variance function of the error term in a heteroscedastic linear regression model is therefore of great importance for obtaining efficient estimates of the regression parameters and making valid statistical inferences. This article proposes a method for estimating the variance function of heteroscedastic linear regression models based on the variance-reduced local linear smoothing technique. Simulations and comparisons with other methods are conducted to assess the performance of the proposed method. The results demonstrate that the proposed method can accurately estimate the variance functions and therefore produce more efficient estimates of the regression parameters.

The use of graphical methods for comparing the quality of prediction throughout the design space of an experiment has been explored extensively for responses modeled with standard linear models. In this paper, fraction of design space (FDS) plots are adapted to evaluate designs for generalized linear models (GLMs). Since the quality of designs for GLMs depends on the model parameters, initial parameter estimates need to be provided by the experimenter. Consequently, an important question to consider is the design's robustness to user misspecification of the initial parameter estimates. FDS plots provide a graphical way of assessing the relative merits of different designs under a variety of types of parameter misspecification. Examples using logistic and Poisson regression models with their canonical links are used to demonstrate the benefits of the FDS plots.

A well-known problem in multiple regression is that it is possible to reject the hypothesis that all slope parameters are equal to zero, yet when applying the usual Student's t-test to the individual parameters, no significant differences are found. An alternative strategy is to estimate prediction error via the 0.632 bootstrap method for all models of interest and declare the parameters associated with the model that yields the smallest prediction error to differ from zero. The main results in this paper are that this latter strategy can have practical value versus Student's t-test, that replacing squared error with absolute error can be beneficial in some situations, and that replacing least squares with an extension of the Theil-Sen estimator can substantially increase the probability of identifying the correct model under circumstances that are described.
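The 0.632 bootstrap mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions, not the paper's procedure: it uses ordinary least squares with squared error, the out-of-bag form of the bootstrap error, and a hypothetical function name `bootstrap_632_error`.

```python
import numpy as np

def bootstrap_632_error(X, y, n_boot=200, seed=0):
    """Sketch of the 0.632 bootstrap estimate of squared prediction error
    for an OLS fit (hypothetical helper, not taken from the paper)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # design matrix with intercept

    # Apparent (training) error: fit and evaluate on the same data.
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    err_train = np.mean((y - A @ beta) ** 2)

    # Out-of-bag error: each resample predicts the points it did not draw.
    oob_sum = np.zeros(n)
    oob_cnt = np.zeros(n)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        out = np.setdiff1d(np.arange(n), idx)
        if out.size == 0:
            continue
        b, *_ = np.linalg.lstsq(A[idx], y[idx], rcond=None)
        oob_sum[out] += (y[out] - A[out] @ b) ** 2
        oob_cnt[out] += 1
    seen = oob_cnt > 0
    err_oob = np.mean(oob_sum[seen] / oob_cnt[seen])

    # Weighted combination that gives the method its name.
    return 0.368 * err_train + 0.632 * err_oob
```

Under the strategy described, one would call this function once per candidate set of predictors and keep the model with the smallest estimate.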


Structured sparsity has recently become a very popular technique for dealing with high-dimensional data. In this paper, we focus on the theoretical problems for the overlapping group structure of generalized linear models (GLMs). Although the overlapping group lasso method for GLMs has been widely applied, its theoretical properties are still unknown. Under some general conditions, we present oracle inequalities for the estimation and prediction error of the overlapping group Lasso method in the generalized linear model setting. We then apply these results to the logistic and Poisson regression models. It is shown that the results of the Lasso and group Lasso procedures for GLMs can be recovered by specifying the group structures in our proposed method. The effect of overlap and the variable selection performance of our proposed method are both studied by numerical simulations. Finally, we apply our proposed method to two gene expression data sets: the p53 data and the lung cancer data.

In this article, we propose two novel diagnostic measures for the deletion of influential observations for regression parameters in the setting of generalized linear models. The proposed diagnostic methods are capable of detecting influential observations under model misspecification, as long as the true underlying distributions have finite second moments. More specifically, it is demonstrated that the Poisson likelihood function can be properly adjusted to become asymptotically valid for practically all underlying discrete distributions. The adjusted Poisson regression model that achieves the robustness property is presented. Simulation studies and an illustration are performed to demonstrate the efficacy of the two novel diagnostic procedures.

The paper proposes a Bayesian quantile regression method for hierarchical linear models. Existing approaches to hierarchical linear quantile regression are scarce, and most of them do not take a Bayesian perspective, which is important for hierarchical models. In this paper, based on Bayesian theory and Markov chain Monte Carlo methods, we introduce asymmetric Laplace distributed errors to simulate the joint posterior distributions of population parameters and across-unit parameters and then derive their posterior quantile inferences. We run a simulation of the proposed method to examine the effects of units and quantile levels on the parameters; the method is also applied to study the relationship between Chinese rural residents' family annual income and their cultivated areas. Both the simulation and the real data analysis indicate that the method is effective and accurate.
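Asymmetric Laplace errors are commonly used in Bayesian quantile regression because maximizing that likelihood corresponds to minimizing the quantile "check" loss. The sketch below illustrates only this loss in the simplest case, recovering a sample quantile; it does not implement the hierarchical model or the MCMC sampler from the abstract, and the function names are hypothetical.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check loss: tau * u for u >= 0, (tau - 1) * u for u < 0."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def quantile_by_check_loss(y, tau):
    """Return the data point that minimizes the summed check loss.

    A minimizer of the summed check loss always exists among the
    observations, so searching over the data points is sufficient here.
    """
    y = np.asarray(y, dtype=float)
    losses = [check_loss(y - q, tau).sum() for q in y]
    return y[int(np.argmin(losses))]
```

With `tau = 0.5` the loss is half the absolute error, so the minimizer is a sample median; other values of `tau` target other quantiles.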

Regression analyses are commonly performed with doubly limited continuous dependent variables; for instance, when modeling the behavior of rates, proportions and income concentration indices. Several models are available in the literature for use with such variables, one of them being the unit gamma regression model. In all such models, parameter estimation is typically performed using the maximum likelihood method, and testing inferences on the model's parameters are usually based on the likelihood ratio test. Such a test can, however, deliver quite imprecise inferences when the sample size is small. In this paper, we propose two modified likelihood ratio test statistics for use with unit gamma regressions that deliver much more accurate inferences when the number of data points is small. Numerical (i.e. simulation) evidence is presented for both fixed dispersion and varying dispersion models, and also for tests that involve nonnested models. We also present and discuss two empirical applications.

The use of parametric link transformation families in generalized linear models (GLM) has been shown to improve substantially the fit of standard analyses using a fixed link in some data sets (see Czado, 1993, for example). When link and regression parameters are globally orthogonal (Cox and Reid, 1987), then the variance inflation of the regression parameter estimates due to the additional estimation of the link is asymptotically zero. Parameter orthogonality also induces numerical stability which is seen in the reduction of computation time required for the calculation of parameter estimates. This stability remains a desirable property even for inferences which are conditional on a fixed link value. Czado and Santner (1992b), for binomial error, and Czado (1992), for GLMs have shown that only local orthogonality can be achieved in general. This paper provides conditions on the link family to extend the notion of local orthogonality at a point to orthogonality in a neighborhood asymptotically and shows that the resulting links are location and scale invariant. General concepts for the construction of such links are given, and it is shown how they relate to link families proposed in the literature. The ideas are illustrated by two examples.

Using the marginal likelihood based on the signed ranks derived from matched pairs data, inferences are made for regression parameters. Both members of a given pair are subject to the same censoring time, while different pairs are subject to different censoring times. Censoring is independent of the response and on the right. Easily calculated logistic density scores are used to provide an approximate analysis so that inferences can be made about a regression parameter in the presence of a difference within the matched pairs. Inference for the survival times of matched skin grafts is considered.

In this paper, we propose a robust estimation procedure for a class of non-linear regression models when the covariates are contaminated with Laplace measurement error. The aim is to construct an estimation procedure for the regression parameters that is less affected by possible outliers and a heavy-tailed underlying distribution, and that reduces the bias introduced by the measurement error. Starting with the modal regression procedure developed for the measurement error-free case, a non-trivial modification is made so that the modified version can effectively correct the potential bias caused by measurement error. Large sample properties of the proposed estimate, such as the convergence rate and the asymptotic normality, are thoroughly investigated. A simulation study and a real data application are conducted to illustrate the satisfactory finite sample performance of the proposed estimation procedure.

Panel studies are statistical studies in which two or more variables are observed for two or more subjects at two or more points in time. Cross-lagged panel studies are those studies in which the variables are continuous and divide naturally into two sets, with interest in the effects or impacts of each set of variables on the other. If a regression approach is taken, a regression structure is formulated for the cross-lagged models. This structure may assume that the regression parameters are homogeneous across waves and across subpopulations. Under such assumptions the methods of multivariate regression analysis can be adapted to make inferences about the parameters. These inferences are limited to the degree that homogeneity of the parameters is supported by the data. We consider the problem of testing the hypotheses of homogeneity and the problem of making statistical inferences about the cross-effects should there be evidence against one of the homogeneity assumptions. We demonstrate the methods developed by applying them to two panel data sets.

Multicollinearity and model misspecification are frequently encountered problems in practice that produce undesirable effects on the classical ordinary least squares (OLS) regression estimator. The ridge regression estimator is an important tool for reducing the effects of multicollinearity, but it is still sensitive to misspecification of the error distribution. Although rank-based statistical inference has desirable robustness properties compared to the OLS procedures, it can be unstable in the presence of multicollinearity. This paper introduces a rank regression estimator for regression parameters and develops tests for general linear hypotheses in a multiple linear regression model. The proposed estimator and the tests have desirable robustness features against multicollinearity and misspecification of the error distribution. Asymptotic behaviours of the proposed estimator and the test statistics are investigated. Real and simulated data sets are used to demonstrate the feasibility and the performance of the estimator and the tests.
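As a point of reference for the abstract above, the classical (least-squares) ridge estimator can be sketched in a few lines. This is the ordinary ridge estimator, not the rank-based estimator the paper proposes; the function name `ridge` and the tuning parameter `k` are illustrative assumptions.

```python
import numpy as np

def ridge(X, y, k):
    """Classical ridge estimate: solve (X'X + k I) beta = X'y.

    k = 0 gives ordinary least squares; k > 0 shrinks the coefficients
    and stabilizes the solution when columns of X are nearly collinear.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)
```

With two nearly identical predictors, the OLS coefficients can be large and of opposite sign, while a small positive `k` pulls them toward a shared, more stable value.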

Missing covariate data are a common issue in generalized linear models (GLMs). A model-based procedure arising from properly specifying joint models for both the partially observed covariates and the corresponding missing indicator variables represents a sound and flexible methodology, which lends itself to maximum likelihood estimation as the likelihood function is available in computable form. In this paper, a novel model-based methodology is proposed for the regression analysis of GLMs when the partially observed covariates are categorical. Pair-copula constructions are used as graphical tools in order to facilitate the specification of the high-dimensional probability distributions of the underlying missingness components. The model parameters are estimated by maximizing the weighted log-likelihood function using an EM algorithm. In order to compare the performance of the proposed methodology with other well-established approaches, which include complete-case analysis and multiple imputation, several simulation experiments with Binomial, Poisson and Normal regressions are carried out under both missing at random and non-missing at random mechanisms. The methods are illustrated by modeling data from a stage III melanoma clinical trial. The results show that the methodology is rather robust and flexible, representing a competitive alternative to traditional techniques.

One method for achieving robustness in linear models is to assume that there is a mixture of standard and outlier observations with a different error variance for each class. For generalised linear models (GLMs) the mixture model approach is more difficult, as the error variance for many distributions has a fixed relationship to the mean. This model is extended to GLMs by redefining the classes so that the standard class is a standard GLM and the outlier class is an overdispersed GLM, achieved by including a random effect term in the linear predictor. The advantages of this method are that it can be extended to any model with a linear predictor and that outlier observations can be easily identified. Using simulation, the model is compared to an M-estimator and found to have improved bias and coverage. The method is demonstrated on three examples.

This article presents parametric bootstrap (PB) approaches for hypothesis testing and interval estimation for the regression coefficients and the variance components of panel data regression models with complete panels. The PB pivot variables are proposed based on sufficient statistics of the parameters. We also derive generalized inferences and improved generalized inferences for variance components. Some simulation results are presented to compare the performance of the PB approaches with the generalized inferences. Our studies show that the PB approaches perform satisfactorily for various sample sizes and parameter configurations, and that the performance of the PB approaches is mostly the same as that of generalized inferences with respect to the expected lengths and powers. The PB inferences have almost exact coverage probabilities and Type I error rates. Furthermore, the PB procedure can be carried out in a few simulation steps, and the derivation is easier to understand and to extend to incomplete panels. Finally, the proposed approaches are illustrated using a real data example.

Multivariate mixture regression models can be used to investigate the relationships between two or more response variables and a set of predictor variables by taking into consideration unobserved population heterogeneity. It is common to take multivariate normal distributions as mixing components, but this mixing model is sensitive to heavy-tailed errors and outliers. Although normal mixture models can approximate any distribution in principle, the number of components needed to account for heavy-tailed distributions can be very large. Mixture regression models based on the multivariate t distributions can be considered as a robust alternative approach. Missing data are inevitable in many situations and parameter estimates could be biased if the missing values are not handled properly. In this paper, we propose a multivariate t mixture regression model with missing information to model heterogeneity in regression function in the presence of outliers and missing values. Along with the robust parameter estimation, our proposed method can be used for (i) visualization of the partial correlation between response variables across latent classes and heterogeneous regressions, and (ii) outlier detection and robust clustering even under the presence of missing values. We also propose a multivariate t mixture regression model using MM-estimation with missing information that is robust to high-leverage outliers. The proposed methodologies are illustrated through simulation studies and real data analysis.

Income- or expenditure-related data sets are often nonlinear, heteroscedastic, skewed even after transformation, and contain numerous outliers. We propose a class of robust nonlinear models that treat outlying observations effectively without removing them. For this purpose, case-specific parameters and a related penalty are employed to detect and modify the outliers systematically. We show how existing nonlinear models such as smoothing splines and generalized additive models can be robustified by the case-specific parameters. Next, we extend the proposed methods to heterogeneous models by incorporating unequal weights. The details of estimating the weights are provided. Two real data sets and simulated data sets show the potential of the proposed methods when the data are nonlinear with outlying observations.

Shapes of service-time distributions in queueing network models have a great impact on the distribution of system response times. It is essential for the analysis of the response-time distribution that the modeled service-time distributions have the correct shape. Traditionally, modeling of service-time distributions is based on a parametric approach: a specific distribution is assumed and its parameters are estimated. We introduce an alternative approach based on the principles of exploratory data analysis and nonparametric data modeling. The proposed method applies nonlinear data transformation and resistant curve fitting. The method can be used in cases where the available data is a complete sample, a histogram, or the mean and a set of 5-10 quantiles. The reported results indicate that the proposed method is able to approximate the distribution of measured service times so that accurate estimates for quantiles of the response-time distribution are obtained.
