20 similar articles found (search time: 125 ms)
1.
《Journal of Statistical Computation and Simulation》2012,82(3):517-537
Skew scale mixtures of normal distributions are often used for statistical procedures involving asymmetric and heavy-tailed data. The main virtue of the members of this family of distributions is that they are easy to simulate from and they also supply genuine expectation-maximization (EM) algorithms for maximum likelihood estimation. In this paper, we extend the EM algorithm for linear regression models and we develop diagnostic analyses via local influence and generalized leverage, following Zhu and Lee's approach. This is because Cook's well-known approach cannot be used to obtain measures of local influence. The EM-type algorithm has been discussed with an emphasis on the skew Student-t-normal, skew slash, skew-contaminated normal and skew power-exponential distributions. Finally, results obtained for a real data set are reported, illustrating the usefulness of the proposed method.
2.
Partially linear models (PLMs) are an important tool in modelling economic and biometric data and are considered a flexible generalization of the linear model, obtained by including a nonparametric component of some covariate in the linear predictor. Usually, the error component is assumed to follow a normal distribution. However, theory and application (through simulation or experimentation) often generate a great number of data sets that are skewed. The objective of this paper is to extend the PLMs by allowing the errors to follow a skew-normal distribution [A. Azzalini, A class of distributions which includes the normal ones, Scand. J. Statist. 12 (1985), pp. 171–178], increasing the flexibility of the model. In particular, we develop the expectation-maximization (EM) algorithm for linear regression models and diagnostic analysis via local influence as well as generalized leverage, following [H. Zhu and S. Lee, Local influence for incomplete-data models, J. R. Stat. Soc. Ser. B 63 (2001), pp. 111–126]. A simulation study is also conducted to evaluate the efficiency of the EM algorithm. Finally, a suitable transformation is applied to a data set on ragweed pollen concentration in order to fit PLMs under asymmetric distributions. An illustrative comparison is performed between normal and skew-normal errors.
3.
The robust estimation and the local influence analysis for linear regression models with scale mixtures of multivariate skew-normal distributions have been developed in this article. The main virtue of considering the linear regression model under the class of scale mixtures of skew-normal distributions is that they have a nice hierarchical representation which allows an easy implementation of inference. Inspired by the expectation maximization algorithm, we have developed a local influence analysis based on the conditional expectation of the complete-data log-likelihood function, which is a measurement invariant under reparametrizations. This is because the observed data log-likelihood function associated with the proposed model is somewhat complex and with Cook's well-known approach it can be very difficult to obtain measures of the local influence. Some useful perturbation schemes are discussed. In order to examine the robust aspect of this flexible class against outlying and influential observations, some simulation studies have also been presented. Finally, a real data set has been analyzed, illustrating the usefulness of the proposed methodology.
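The hierarchical representation mentioned in abstracts 1 and 3 can be illustrated in the univariate case. The sketch below is not taken from either paper: it assumes the standard stochastic representation of the skew-normal (a half-normal latent variable plus an independent normal) and obtains a skew-t draw by dividing by the square root of a Gamma(ν/2, rate ν/2) mixing variable. The function names `rskew_normal` and `rskew_t` are hypothetical.

```python
import math
import random


def rskew_normal(lam, rng):
    """One draw from the standard skew-normal SN(lam).

    Assumed representation: Z = delta*|U0| + sqrt(1 - delta^2)*U1,
    with U0, U1 independent N(0, 1) and delta = lam / sqrt(1 + lam^2).
    """
    delta = lam / math.sqrt(1.0 + lam * lam)
    u0 = abs(rng.gauss(0.0, 1.0))  # half-normal latent variable
    u1 = rng.gauss(0.0, 1.0)
    return delta * u0 + math.sqrt(1.0 - delta * delta) * u1


def rskew_t(lam, nu, rng):
    """One draw from a skew-t, built as a scale mixture of a skew-normal."""
    # gammavariate(shape, scale): shape nu/2 and scale 2/nu give mean 1.
    u = rng.gammavariate(nu / 2.0, 2.0 / nu)
    return rskew_normal(lam, rng) / math.sqrt(u)


if __name__ == "__main__":
    rng = random.Random(0)
    lam = 3.0
    sample = [rskew_normal(lam, rng) for _ in range(50_000)]
    delta = lam / math.sqrt(1.0 + lam * lam)
    # The SN(lam) mean is delta * sqrt(2/pi); the two numbers should be close.
    print(sum(sample) / len(sample), delta * math.sqrt(2.0 / math.pi))
```

Because the latent variables are normal and gamma, each level of the hierarchy has a simple conditional distribution, which is what makes EM-type algorithms convenient for this family.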
4.
We discuss in this paper the assessment of local influence in univariate elliptical linear regression models. This class includes
all symmetric continuous distributions, such as normal, Student-t, Pearson VII, exponential power and logistic, among others.
We derive the appropriate matrices for assessing the local influence on the parameter estimates and on predictions by considering
as influence measures the likelihood displacement and a distance based on the Pearson residual. Two examples with real data
are given for illustration.
5.
《Journal of Statistical Computation and Simulation》2012,82(10):1115-1129
This paper presents a unified method for influence analysis to deal with random effects appearing in additive nonlinear regression models for repeated measurement data. The basic idea is to apply the Q-function, the conditional expectation of the complete-data log-likelihood function obtained from the EM algorithm, instead of the observed-data log-likelihood function as used in standard influence analysis. Diagnostic measures are derived based on the case-deletion approach and the local influence approach. Two real examples and a simulation study are examined to illustrate our methodology.
6.
《Journal of Statistical Computation and Simulation》2012,82(7):909-922
In this paper, we develop diagnostic methods for generalized Poisson regression (GPR) models with errors in variables based on the corrected likelihood. The one-step approximations of the estimates in the case-deletion model are given and case-deletion and local influence measures are presented. Meanwhile, based on a corrected score function, the test statistics for the significance of dispersion parameters in GPR models with measurement errors are investigated. Finally, our methodology is illustrated through numerical examples.
7.
《Journal of Statistical Computation and Simulation》2012,82(10):1101-1113
The importance of the normal distribution for fitting continuous data is well known. However, in many practical situations the data distribution departs from normality. For example, the sample skewness and the sample kurtosis may be far from 0 and 3, respectively, the values they take under a normal distribution. So, it is important to have formal tests of normality against any alternative. D'Agostino et al. [A suggestion for using powerful and informative tests of normality, Am. Statist. 44 (1990), pp. 316–321] review four procedures Z 2(g 1), Z 2(g 2), D and K 2 for testing departure from normality. The first two of these procedures are tests of normality against departure due to skewness and kurtosis, respectively. The other two tests are omnibus tests. An alternative to the normal distribution is a class of skew-normal distributions (see [A. Azzalini, A class of distributions which includes the normal ones, Scand. J. Statist. 12 (1985), pp. 171–178]). In this paper, we obtain a score test (W) and a likelihood ratio test (LR) of goodness of fit of the normal regression model against the skew-normal family of regression models. It turns out that the score test is based on the sample skewness and is of very simple form. The performance of these six procedures, in terms of size and power, is compared using simulations. The level properties of the three statistics LR, W and Z 2(g 1) are similar and close to the nominal level for moderate to large sample sizes. Also, their power properties are similar for small departure from normality due to skewness (γ1≤0.4). Of these, the score test statistic has a very simple form and is computationally much simpler than the other two statistics. The LR statistic, in general, has the highest power, although it is computationally much more complex, as it requires estimates of the parameters under the normal model as well as those under the skew-normal model.
So, the score test may be used to test for normality against small departure from normality due to skewness. Otherwise, the likelihood ratio statistic LR should be used, as it detects general departure from normality (due to both skewness and kurtosis) with, in general, the largest power.
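The sample skewness and sample kurtosis that abstract 7 compares with 0 and 3 can be computed directly from central moments. The sketch below shows only these two moment ratios, using the common definitions g1 = m3 / m2^(3/2) and b2 = m4 / m2^2. It does not reproduce the score test W or any of the other five procedures, whose exact forms are not given in the abstract, and the function names are hypothetical.

```python
def central_moment(data, k):
    """k-th sample central moment (divisor n)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** k for x in data) / n


def sample_skewness(data):
    """g1 = m3 / m2^(3/2); the population value is 0 for a normal distribution."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def sample_kurtosis(data):
    """b2 = m4 / m2^2; the population value is 3 for a normal distribution."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2


if __name__ == "__main__":
    symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
    right_skewed = [0.0, 0.0, 0.0, 0.0, 10.0]
    print(sample_skewness(symmetric), sample_kurtosis(symmetric))
    print(sample_skewness(right_skewed))
```

In a regression setting these quantities would typically be computed on residuals rather than on raw responses; that choice is an assumption here, not something stated in the abstract.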
8.
Virginia F. Flack 《Communications in Statistics - Theory and Methods》2013,42(2):755-766
Methods for flagging new points that are not similar to the original data used for developing a ridge regression equation are discussed. Using the regression equation for predictions should be avoided for these dissimilar points. An example quantifies the sample space limitations for using biased prediction when multicollinearity is present.
9.
In this paper, we extend the censored linear regression model with normal errors to Student-t errors. A simple EM-type algorithm for iteratively computing maximum-likelihood estimates of the parameters is presented. To examine the performance of the proposed model, case-deletion and local influence techniques are developed to show its robust aspect against outlying and influential observations. This is done by the analysis of the sensitivity of the EM estimates under some usual perturbation schemes in the model or data and by inspecting some proposed diagnostic graphics. The efficacy of the method is verified through the analysis of simulated data sets and modelling a real data set first analysed under normal errors. The proposed algorithm and methods are implemented in the R package CensRegMod.
10.
Ai-Xia Fan 《Communications in Statistics - Simulation and Computation》2017,46(7):5323-5339
This article investigates case-deletion influence analysis via Cook’s distance and local influence analysis via conformal normal curvature for partially linear models with response missing at random. A local influence approach is developed to assess the sensitivity of parameter and nonparametric estimators to various perturbations, such as case-weight, response variable, explanatory variable, and parameter perturbations, on the basis of semiparametric estimating equations, which are constructed using the inverse probability weighted approach rather than the likelihood function. Residuals and generalized leverage are also defined. Simulation studies and a dataset taken from the AIDS Clinical Trials are used to illustrate the proposed methods.
11.
Luis Hernando Vanegas Gauss M. Cordeiro 《Journal of Statistical Computation and Simulation》2013,83(12):2315-2338
We propose some statistical tools for diagnosing the class of generalized Weibull linear regression models [A.A. Prudente and G.M. Cordeiro, Generalized Weibull linear models, Comm. Statist. Theory Methods 39 (2010), pp. 3739–3755]. This class of models is an alternative means of analysing positive, continuous and skewed data and, due to its statistical properties, is very competitive with gamma regression models. First, we show that the Weibull model induces maximum likelihood estimators asymptotically more efficient than the gamma model. Standardized residuals are defined, and their statistical properties are examined empirically. Some measures are derived based on the case-deletion model, including the generalized Cook's distance and measures for identifying influential observations on partial F-tests. The results of a simulation study conducted to assess the behaviour of the global influence approach are also presented. Further, we perform a local influence analysis under the case-weights, response and explanatory variables perturbation schemes. The Weibull, gamma and other Weibull-type regression models are fitted to three data sets to illustrate the proposed diagnostic tools. Statistical analyses indicate that the Weibull model fitted to these data yields better fits than other common alternative models.
12.
The authors propose a robust bounded‐influence estimator for binary regression with continuous outcomes, an alternative to logistic regression when the investigator's interest focuses on the proportion of subjects who fall below or above a cut‐off value. The authors show both theoretically and empirically that in this context, the maximum likelihood estimator is sensitive to model misspecifications. They show that their robust estimator is more stable and nearly as efficient as maximum likelihood when the hypotheses are satisfied. Moreover, it leads to safer inference. The authors compare the different estimators in a simulation study and present an analysis of hypertension on Harlem survey data.
13.
Carolina Marchant Francisco José A. Cysneiros Juan F. Vivanco 《Journal of applied statistics》2016,43(15):2829-2849
Birnbaum–Saunders (BS) models are receiving considerable attention in the literature. Multivariate regression models are a useful tool in multivariate analysis, as they take into account the correlation between variables. Diagnostic analysis is an important aspect to be considered in statistical modeling. In this paper, we formulate multivariate generalized BS regression models and carry out a diagnostic analysis for these models. We consider the Mahalanobis distance as a global influence measure to detect multivariate outliers and use it for evaluating the adequacy of the distributional assumption. We also consider the local influence approach and study how a perturbation may impact on the estimation of model parameters. We implement the obtained results in the R software and illustrate them with real-world multivariate data to show their potential applications.
14.
Maria Ioneris Oliveira Michelli Barros Joelson Campos Francisco José A. Cysneiros 《Journal of applied statistics》2022,49(5):1252
In this paper, we discuss the bivariate Birnbaum-Saunders accelerated lifetime model, in which we model the dependence structure of bivariate survival data through frailty models. Specifically, we propose the bivariate Birnbaum-Saunders model with the following frailty distributions: gamma, positive stable and logarithmic series. We present a study of inference and diagnostic analysis for the proposed model; more precisely, we propose a diagnostic analysis based on local influence and residual analysis to assess the model fit and to detect influential observations. In this regard, we derive the normal curvatures of local influence under different perturbation schemes and perform simulation studies to assess the potential of residuals to detect misspecification in the systematic component and in the stochastic component of the model, and to detect outliers. Finally, we apply the methodology to a real data set on recurrence times of infections in 38 kidney patients using a portable dialysis machine. We analyze these data both assuming independence within the pairs and using the bivariate Birnbaum-Saunders accelerated lifetime model, so that we can compare the two and verify the importance of modeling dependence between the infection times associated with the same patient.
15.
Myung Genn Kim 《Communications in Statistics - Theory and Methods》2013,42(5):1271-1278
The method of local influence is generalized to multivariate regression. The perturbation scheme adopted in multivariate regression is similar in spirit to the perturbation of case-weights in the univariate regression case. The method developed here is useful for identifying influential observations in multivariate regression as an exploratory or confirmatory data analysis. An illustrative example is given to show the effectiveness of the local influence approach in multivariate regression.
16.
Víctor Leiva Shuangzhe Liu Lei Shi Francisco José A. Cysneiros 《Journal of applied statistics》2016,43(4):627-642
We propose an influence diagnostic methodology for linear regression models with stochastic restrictions and errors following elliptically contoured distributions. We study how a perturbation may impact on the mixed estimation procedure of parameters in the model. Normal curvatures and slopes for assessing influence under usual schemes are derived, including perturbations of case-weight, response variable, and explanatory variable. Simulations are conducted to evaluate the performance of the proposed methodology. An example with real-world economy data is presented as an illustration.
17.
《Journal of Statistical Computation and Simulation》2012,82(9):813-827
We introduce multicovariate-adjusted regression (MCAR), an adjustment method for regression analysis, where both the response (Y) and predictors (X_1, …, X_p) are not directly observed. The available data have been contaminated by unknown functions of a set of observable distorting covariates, Z_1, …, Z_s, in a multiplicative fashion. The proposed method substantially extends the current contaminated regression modelling capability, by allowing for multiple distorting covariate effects. MCAR is a flexible generalisation of the recently proposed covariate-adjusted regression method, an effective adjustment method in the presence of a single covariate, Z. For MCAR estimation, we establish a connection between the MCAR models and adaptive varying coefficient models. This connection leads to an adaptation of a hybrid backfitting estimation algorithm. Extensive simulations are used to study the performance and limitations of the proposed iterative estimation algorithm. In particular, the bias and mean square error of the proposed MCAR estimators are examined, relative to a baseline and a consistent benchmark estimator. The method is also illustrated with a Pima Indian diabetes data set, where the response and predictors are potentially contaminated by body mass index and triceps skin fold thickness. Both distorting covariates measure aspects of obesity, an important risk factor in type 2 diabetes.
18.
Robust regression has not had a great impact on statistical practice, although all statisticians are convinced of its importance. The procedures for robust regression currently available are complex and computer intensive. With a modification of the Gaussian paradigm, taking into consideration outliers and leverage points, we propose an iteratively weighted least squares method which gives robust fits. The procedure is illustrated by applying it to data sets which have been previously used to illustrate robust regression methods. It is hoped that this simple, effective and accessible method will find its use in statistical practice.
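The general idea of an iteratively weighted least squares fit, as named in abstract 18, can be sketched for a single predictor. The abstract does not specify its weighting scheme, so the example below substitutes the well-known Huber weights with a median-based residual scale. It is therefore an illustration of the generic technique, not the authors' procedure, and it does not address leverage points. The names `wls_line` and `irls_huber` and the tuning constant `c=1.345` are assumptions.

```python
import statistics


def wls_line(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return ybar - b * xbar, b


def irls_huber(x, y, c=1.345, max_iter=100, tol=1e-10):
    """Iteratively reweighted least squares with Huber weights (assumed scheme)."""
    a, b = wls_line(x, y, [1.0] * len(x))  # start from ordinary least squares
    for _ in range(max_iter):
        resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust scale: median absolute residual, rescaled for the normal case.
        scale = statistics.median(abs(r) for r in resid) / 0.6745
        if scale == 0.0:
            break  # at least half of the points are fitted exactly
        # Huber weight: 1 inside the threshold, c*scale/|r| outside it.
        w = [1.0 if abs(r) <= c * scale else c * scale / abs(r) for r in resid]
        a_new, b_new = wls_line(x, y, w)
        converged = abs(a_new - a) < tol and abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b


if __name__ == "__main__":
    x = [float(i) for i in range(10)]
    y = [2.0 + 3.0 * xi for xi in x]
    y[-1] = 100.0  # one gross outlier in the response
    print("OLS: ", wls_line(x, y, [1.0] * len(x)))
    print("IRLS:", irls_huber(x, y))
```

Each pass solves an ordinary weighted least squares problem, which is why such schemes are easy to implement with standard regression software.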
19.
《Journal of Statistical Computation and Simulation》2012,82(1):27-39
Linear regression models with coefficients across individual units regarded as random samples from some population are studied in this article from a Bayesian viewpoint. A prior distribution of the secondary parameters is derived following the Jeffreys rule. The posterior distribution of the primary and secondary parameters and the predictive distribution of the future value are then examined. Computations of the parameter estimates are found to be rather straightforward. Data from a performance test on pigs are analysed and discussed. We also discuss the difficulties involved in using a Lindley and Smith (1972) prior in this problem.
20.
Filidor V. Labra Aldo M. Garay Victor H. Lachos Edwin M.M. Ortega 《Journal of statistical planning and inference》2012
An extension of some standard likelihood based procedures to heteroscedastic nonlinear regression models under scale mixtures of skew-normal (SMSN) distributions is developed. This novel class of models provides a useful generalization of the heteroscedastic symmetrical nonlinear regression models (Cysneiros et al., 2010), since the random term distributions cover both symmetric as well as asymmetric and heavy-tailed distributions such as skew-t, skew-slash, skew-contaminated normal, among others. A simple EM-type algorithm for iteratively computing maximum likelihood estimates of the parameters is presented and the observed information matrix is derived analytically. In order to examine the performance of the proposed methods, some simulation studies are presented to show the robust aspect of this flexible class against outlying and influential observations and that the maximum likelihood estimates based on the EM-type algorithm do provide good asymptotic properties. Furthermore, local influence measures and the one-step approximations of the estimates in the case-deletion model are obtained. Finally, an illustration of the methodology is given considering a data set previously analyzed under the homoscedastic skew-t nonlinear regression model.