Similar literature
 Found 20 similar documents; search took 31 ms
1.
ABSTRACT

We develop splice plots as a diagnostic tool for parametric generalized linear models. Splice plots use the independence of the outcome and explanatory measures given the regression function. Plotting differences between the estimated parametric regression function and non-parametric estimates of the regression function computed in small neighborhoods of the fitted values from the parametric model can be used to assess model fit.
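
The comparison described in this abstract can be sketched roughly as follows. This is an illustrative simplification, not the authors' procedure: it orders observations by fitted value, averages the outcomes in a moving window (a crude non-parametric estimate), and subtracts the parametric fitted value. The function names, the window size, and the simulated quadratic example are all assumptions made for the illustration.

```python
import random


def splice_differences(fitted, outcomes, window=51):
    """Return (fitted value, local mean of outcomes - fitted value) pairs.

    The local mean is taken over up to `window` observations that are
    neighbours in fitted-value order (fewer near the two ends).
    """
    order = sorted(range(len(fitted)), key=lambda i: fitted[i])
    half = window // 2
    n = len(order)
    result = []
    for pos, i in enumerate(order):
        lo, hi = max(0, pos - half), min(n, pos + half + 1)
        local = [outcomes[order[j]] for j in range(lo, hi)]
        result.append((fitted[i], sum(local) / len(local) - fitted[i]))
    return result


def max_abs_difference(fitted, outcomes, window=51):
    """Largest absolute difference, a one-number summary of the plot."""
    return max(abs(d) for _, d in splice_differences(fitted, outcomes, window))


def simulate(n=2000, seed=0):
    """Hypothetical data: the true mean is x**2, with Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [x * x + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least squares for a straight line (the misspecified model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope
```

Plotting the second element of each pair against the first gives a picture in the spirit of the abstract: differences scattered around zero suggest an adequate fit, while a systematic pattern (here, for the straight-line fit to quadratic data) suggests misspecification.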

2.
In this study we investigate the problem of estimation and testing of hypotheses in multivariate linear regression models when the errors involved are assumed to be non-normally distributed. We consider the class of heavy-tailed distributions for this purpose. Although our method is applicable for any distribution in this class, we take the multivariate t-distribution for illustration. This distribution has applications in many fields of applied research such as Economics, Business, and Finance. For estimation purposes, we use the modified maximum likelihood method in order to get the so-called modified maximum likelihood estimates that are obtained in a closed form. We show that these estimates are substantially more efficient than least-squares estimates. They are also found to be robust to reasonable deviations from the assumed distribution and also to many data anomalies such as the presence of outliers in the sample. We further provide test statistics for testing the relevant hypothesis regarding the regression coefficients.

3.
We consider the generalized multivariate linear model and assume the covariance matrix of the p × 1 vector of responses on a given individual can be represented in the general linear structure form described by Anderson (1973). The effects of the use of estimates of the parameters of the covariance matrix on the generalized least squares estimator of the regression coefficients and on the prediction of a portion of a future vector, when only the first portion of the vector has been observed, are investigated. Approximations are derived for the covariance matrix of the generalized least squares estimator and for the mean square error matrix of the usual predictor, for the practical case where estimated parameters are used.

4.
Simultaneous confidence bands have been shown in the statistical literature as powerful inferential tools in univariate linear regression. While the methodology of simultaneous confidence bands for univariate linear regression has been extensively researched and well developed, no published work seems available for multivariate linear regression. This paper fills this gap by studying one particular simultaneous confidence band for multivariate linear regression. Because of the shape of the band, the word ‘tube’ is more pertinent and so will be used to replace the word ‘band’. It is shown that the construction of the tube is related to the distribution of the largest eigenvalue. A simulation‐based method is proposed to compute the 1 − α quantile of this eigenvalue. With the computation power of modern computers, the simultaneous confidence tube can be computed fast and accurately. A real‐data example is used to illustrate the method, and many potential research problems have been pointed out.

5.
For the first time, we introduce a generalized form of the exponentiated generalized gamma distribution [Cordeiro et al. The exponentiated generalized gamma distribution with application to lifetime data, J. Statist. Comput. Simul. 81 (2011), pp. 827–842.] that is the baseline for the log-exponentiated generalized gamma regression model. The new distribution can accommodate increasing, decreasing, bathtub- and unimodal-shaped hazard functions. A second advantage is that it includes classical distributions reported in the lifetime literature as special cases. We obtain explicit expressions for the moments of the baseline distribution of the new regression model. The proposed model can be applied to censored data since it includes as sub-models several widely known regression models. It therefore can be used more effectively in the analysis of survival data. We obtain maximum likelihood estimates for the model parameters by considering censored data. We show that our extended regression model is very useful by means of two applications to real data.

6.
Summary.  We introduce a flexible marginal modelling approach for statistical inference for clustered and longitudinal data under minimal assumptions. This estimated estimating equations approach is semiparametric and the proposed models are fitted by quasi-likelihood regression, where the unknown marginal means are a function of the fixed effects linear predictor with unknown smooth link, and variance–covariance is an unknown smooth function of the marginal means. We propose to estimate the nonparametric link and variance–covariance functions via smoothing methods, whereas the regression parameters are obtained via the estimated estimating equations. These are score equations that contain nonparametric function estimates. The proposed estimated estimating equations approach is motivated by its flexibility and easy implementation. Moreover, if data follow a generalized linear mixed model, with either a specified or an unspecified distribution of random effects and link function, the model proposed emerges as the corresponding marginal (population-average) version and can be used to obtain inference for the fixed effects in the underlying generalized linear mixed model, without the need to specify any other components of this generalized linear mixed model. Among marginal models, the estimated estimating equations approach provides a flexible alternative to modelling with generalized estimating equations. Applications of estimated estimating equations include diagnostics and link selection. The asymptotic distribution of the proposed estimators for the model parameters is derived, enabling statistical inference. Practical illustrations include Poisson modelling of repeated epileptic seizure counts and simulations for clustered binomial responses.

7.
The expressions for moments of order statistics from the generalized gamma distribution are derived. Coefficients to get the BLUEs of location and scale parameters in the generalized gamma distribution are computed. Some simple alternative linear unbiased estimates of location and scale parameters are also proposed and their relative efficiencies compared to the BLUEs are studied.

8.
For the first time, a new class of generalized Weibull linear models is introduced to be competitive to the well-known generalized (gamma and inverse Gaussian) linear models which are adequate for the analysis of positive continuous data. The proposed models have a constant coefficient of variation for all observations similar to the gamma models and may be suitable for a wide range of practical applications in various fields such as biology, medicine, engineering, and economics, among others. We derive a joint iterative algorithm for estimating the mean and dispersion parameters. We obtain closed form expressions in matrix notation for the second-order biases of the maximum likelihood estimates of the model parameters and define bias corrected estimates. The corrected estimates are easily obtained as vectors of regression coefficients in suitable weighted linear regressions. The practical use of the new class of models is illustrated in one application to a lung cancer data set.
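
The constant coefficient of variation mentioned in this abstract can be checked numerically for the standard two-parameter Weibull distribution, whose coefficient of variation depends only on the shape parameter. The sketch below uses the textbook shape/scale parameterization, which is an assumption here and may differ from the parameterization used in the paper; the function names are hypothetical.

```python
import math


def weibull_mean_var(shape, scale):
    """Mean and variance of a Weibull(shape, scale) random variable."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    return scale * g1, scale ** 2 * (g2 - g1 ** 2)


def weibull_cv(shape, scale):
    """Coefficient of variation: standard deviation divided by mean."""
    mean, var = weibull_mean_var(shape, scale)
    return math.sqrt(var) / mean


if __name__ == "__main__":
    # Same shape, very different scales (and hence means): the CV agrees.
    for scale in (0.1, 1.0, 100.0):
        print(scale, weibull_cv(2.5, scale))
```

Because the scale cancels in the ratio, observations whose means differ only through the scale share one coefficient of variation, as in gamma models with a common shape.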

9.
In this paper, we discuss how a regression model, with a non-continuous response variable, which allows for dependency between observations, should be estimated when observations are clustered and measurements on the subjects are repeated. The cluster sizes are assumed to be large. We find that the conventional estimation technique suggested by the literature on generalized linear mixed models (GLMM) is slow and sometimes fails due to non-convergence and lack of memory on standard PCs. We suggest estimating the random effects as fixed effects by a generalized linear model and deriving the covariance matrix from these estimates. A simulation study shows that our proposal is feasible in terms of mean-square error and computation time. We recommend that our proposal be implemented in the software of GLMM techniques so that the estimation procedure can switch between the conventional technique and our proposal, depending on the size of the clusters.

10.
11.
ABSTRACT

We propose a new semiparametric Weibull cure rate model for fitting nonlinear effects of explanatory variables on the mean, scale and cure rate parameters. The regression model is based on the generalized additive models for location, scale and shape, for which any or all distribution parameters can be modeled as parametric linear and/or nonparametric smooth functions of explanatory variables. We present methods to select additive terms, model estimation and validation, where all computational codes are presented in a simple way such that any R user can fit the new model. Biases of the parameter estimates caused by models specified erroneously are investigated through Monte Carlo simulations. We illustrate the usefulness of the new model by means of two applications to real data. We provide computational codes to fit the new regression model in the R software.

12.
We often rely on the likelihood to obtain estimates of regression parameters but it is not readily available for generalized linear mixed models (GLMMs). Inferences for the regression coefficients and the covariance parameters are key in these models. We presented alternative approaches for analyzing binary data from a hierarchical structure that do not rely on any distributional assumptions: a generalized quasi-likelihood (GQL) approach and a generalized method of moments (GMM) approach. These are alternative approaches to the typical maximum-likelihood approximation approach in Statistical Analysis System (SAS) such as Laplace approximation (LAP). We examined and compared the performance of GQL and GMM approaches with multiple random effects to the LAP approach as used in PROC GLIMMIX, SAS. The GQL approach tends to produce unbiased estimates, whereas the LAP approach can lead to highly biased estimates for certain scenarios. The GQL approach produces more accurate estimates on both the regression coefficients and the covariance parameters with smaller standard errors as compared to the GMM approach. We found that both GQL and GMM approaches are less likely to result in non-convergence as opposed to the LAP approach. A simulation study was conducted and a numerical example was presented for illustrative purposes.

13.
Multivariate mixture regression models can be used to investigate the relationships between two or more response variables and a set of predictor variables by taking into consideration unobserved population heterogeneity. It is common to take multivariate normal distributions as mixing components, but this mixing model is sensitive to heavy-tailed errors and outliers. Although normal mixture models can approximate any distribution in principle, the number of components needed to account for heavy-tailed distributions can be very large. Mixture regression models based on the multivariate t distributions can be considered as a robust alternative approach. Missing data are inevitable in many situations and parameter estimates could be biased if the missing values are not handled properly. In this paper, we propose a multivariate t mixture regression model with missing information to model heterogeneity in regression function in the presence of outliers and missing values. Along with the robust parameter estimation, our proposed method can be used for (i) visualization of the partial correlation between response variables across latent classes and heterogeneous regressions, and (ii) outlier detection and robust clustering even under the presence of missing values. We also propose a multivariate t mixture regression model using MM-estimation with missing information that is robust to high-leverage outliers. The proposed methodologies are illustrated through simulation studies and real data analysis.

14.
This paper shows that by minimizing a Chebychev norm a mixing distribution can be constructed which converges weakly to the true mixing distribution with probability one. Deely and Kruse (1968) established a similar result for the supremum norm. For both norms the constructed mixing distribution is computed by solving a linear programming problem, but this problem is considerably smaller when the Chebychev norm is used. Thus a suitable mixing distribution can be constructed from solving a linear programming problem with considerably less computational work than was previously known. To illustrate the application of this simpler procedure it is applied to derive nonparametric empirical Bayes estimates in a simulation study. Some density estimates are also illustrated.

15.
It is well-known that Ordinary Least Squares (OLS) yields inconsistent estimates if applied to a regression equation with lagged dependent variables and correlated errors. Bias expressions which appear in the literature usually assume the exogenous variables to be non-stochastic. Due to this assumption the numerical sizes of these expressions cannot be determined. Further, the analysis is mostly restricted to very simple models. In this paper the problem of calculating the asymptotic bias of OLS is generalized to stationary dynamic regression models, where the errors follow a stationary ARMA process. A general bias expression is derived and a method is introduced by which its actual size can be computed numerically.
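
The inconsistency described in this abstract can be seen in the simplest textbook case, which is far narrower than the ARMA setting the paper treats: a regression with no exogenous variables, y_t = beta * y_{t-1} + u_t, with AR(1) errors u_t = rho * u_{t-1} + e_t. In that case y_t follows an AR(2) process and the OLS slope converges to its first autocorrelation, (beta + rho) / (1 + beta * rho), rather than to beta. The sketch below is a simulation check of that special case only; the function names and parameter values are assumptions chosen for illustration.

```python
import random


def ols_slope_lagged(beta=0.5, rho=0.4, n=200_000, seed=1):
    """OLS slope (no intercept) of y_t on y_{t-1} under AR(1) errors."""
    rng = random.Random(seed)
    y_prev, u_prev = 0.0, 0.0
    sxy = sxx = 0.0
    for _ in range(n):
        u = rho * u_prev + rng.gauss(0.0, 1.0)  # serially correlated error
        y = beta * y_prev + u                   # lagged dependent variable
        sxy += y_prev * y
        sxx += y_prev * y_prev
        y_prev, u_prev = y, u
    return sxy / sxx


def probability_limit(beta, rho):
    """Large-sample limit of the OLS slope in this special case."""
    return (beta + rho) / (1.0 + beta * rho)


if __name__ == "__main__":
    print("OLS estimate:", ols_slope_lagged())
    print("limit:", probability_limit(0.5, 0.4), "true beta: 0.5")
```

With beta = 0.5 and rho = 0.4 the limit is 0.75, so the asymptotic bias in this toy case is 0.25 no matter how large the sample.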

16.
In this paper, we introduce linear modeling of canonical correlation analysis, which estimates canonical direction matrices by minimising a quadratic objective function. The linear modeling results in a class of estimators of canonical direction matrices, and an optimal class is derived in the sense described herein. The optimal class guarantees several of the following desirable advantages: first, its estimates of canonical direction matrices are asymptotically efficient; second, its test statistic for determining the number of canonical covariates always has a chi‐squared distribution asymptotically; third, it is straightforward to construct tests for variable selection. The standard canonical correlation analysis and other existing methods turn out to be suboptimal members of the class. Finally, we study the role of canonical variates as a means of dimension reduction for predictors and responses in multivariate regression. Numerical studies and data analysis are presented.

17.
In this paper, we propose a new semiparametric heteroscedastic regression model allowing for positive and negative skewness and bimodal shapes using the B-spline basis for nonlinear effects. The proposed distribution is based on the generalized additive models for location, scale and shape framework in order to model any or all parameters of the distribution using parametric linear and/or nonparametric smooth functions of explanatory variables. We motivate the new model by means of Monte Carlo simulations, which show that ignoring the skewness and bimodality of the random errors in semiparametric regression models may introduce biases in the parameter estimates and/or in the estimation of the associated variability measures. An iterative estimation process and some diagnostic methods are investigated. Applications to two real data sets are presented and the method is compared to the usual regression methods.

18.
Remove unwanted variation (RUV) is an estimation and normalization system in which the underlying correlation structure of a multivariate dataset is estimated from negative control measurements, typically gene expression values, which are assumed to stay constant across experimental conditions. In this paper we derive the weight matrix which is estimated and incorporated into the generalized least squares estimates of RUV-inverse, and show that this weight matrix estimates the average covariance matrix across negative control measurements. RUV-inverse can thus be viewed as an estimation method adjusting for an unknown experimental design. We show that for a balanced incomplete block design (BIBD), RUV-inverse recovers intra- and interblock estimates of the relevant parameters and combines them as a weighted sum just like the best linear unbiased estimator (BLUE), except that the weights are globally estimated from the negative control measurements instead of being individually optimized to each measurement as in the classical, single measurement BIBD BLUE.

19.
We introduce the log-odd Weibull regression model based on the odd Weibull distribution (Cooray, 2006). We derive some mathematical properties of the log-transformed distribution. The new regression model represents a parametric family of models that includes as sub-models some widely known regression models that can be applied to censored survival data. We employ a frequentist analysis and a parametric bootstrap for the parameters of the proposed model. We derive the appropriate matrices for assessing local influence on the parameter estimates under different perturbation schemes and present some ways to assess global influence. Further, for different parameter settings, sample sizes and censoring percentages, some simulations are performed. In addition, the empirical distribution of some modified residuals is given and compared with the standard normal distribution. These studies suggest that the residual analysis usually performed in normal linear regression models can be extended to a modified deviance residual in the proposed regression model applied to censored data. We define martingale and deviance residuals to check the model assumptions. The extended regression model is very useful for the analysis of real data.

20.
Several estimators are examined for the simple linear regression model under a controlled, experimental situation with multiple observations at each design point. The model is examined under normal and non-normal error distributions and mild heterogeneity of variances across the chosen design points. We consider the ordinary, generalized, and estimated generalized least squares estimators and several examples of M estimators. The asymptotic properties of the M estimator using the Huber ψ are presented under these conditions for the multiple regression model. A simulation study is also presented which indicates that the M estimator possesses strong robustness properties under the presence of both non-normality and mild heteroscedasticity of errors. Finally, the M estimates are compared to the least squares estimates in two examples.
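
A Huber-type M estimate of a straight line, of the kind compared with least squares in this abstract, can be sketched with iteratively reweighted least squares. This is one common way to compute such an estimate and is not taken from the paper: the tuning constant 1.345, the MAD-based scale estimate, the fixed iteration count, and the small data set with replicated design points and one gross outlier are all assumptions made for illustration.

```python
import statistics


def weighted_line(xs, ys, ws):
    """Weighted least squares fit of y = a + b * x; returns (a, b)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    slope = sxy / sxx
    return my - slope * mx, slope


def huber_line(xs, ys, c=1.345, iterations=50):
    """Huber M estimate of (a, b) by iteratively reweighted least squares."""
    a, b = weighted_line(xs, ys, [1.0] * len(xs))  # start from OLS
    for _ in range(iterations):
        res = [y - a - b * x for x, y in zip(xs, ys)]
        med = statistics.median(res)
        # Robust scale: median absolute deviation, rescaled for normal errors.
        scale = statistics.median([abs(r - med) for r in res]) / 0.6745
        if scale == 0.0:
            break
        # Huber weights: 1 inside the threshold, shrinking as 1/|r| outside.
        ws = [1.0 if abs(r) <= c * scale else c * scale / abs(r) for r in res]
        a, b = weighted_line(xs, ys, ws)
    return a, b


def demo_data():
    """Four replicates at each of x = 1..5 around y = 2 + 3x, one outlier."""
    offsets = [-0.1, 0.05, 0.1, -0.05]
    xs, ys = [], []
    for x in range(1, 6):
        for e in offsets:
            xs.append(float(x))
            ys.append(2.0 + 3.0 * x + e)
    ys[-1] += 30.0  # gross outlier at the last design point
    return xs, ys
```

On the demonstration data the single outlier pulls the least squares slope well away from 3, while the Huber fit stays close to it, which is the kind of robustness the abstract reports.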
