1.

Local influence analysis for regression models with scale mixtures of skew-normal distributions

C. B. Zeller F. E. Vilca-Labra 《Journal of applied statistics》2011,38(2):343-368

The robust estimation and the local influence analysis for linear regression models with scale mixtures of multivariate skew-normal distributions have been developed in this article. The main virtue of considering the linear regression model under the class of scale mixtures of skew-normal distributions is that they have a nice hierarchical representation which allows an easy implementation of inference. Inspired by the expectation maximization algorithm, we have developed a local influence analysis based on the conditional expectation of the complete-data log-likelihood function, which is a measurement invariant under reparametrizations. This is because the observed data log-likelihood function associated with the proposed model is somewhat complex and with Cook's well-known approach it can be very difficult to obtain measures of the local influence. Some useful perturbation schemes are discussed. In order to examine the robust aspect of this flexible class against outlying and influential observations, some simulation studies have also been presented. Finally, a real data set has been analyzed, illustrating the usefulness of the proposed methodology.

2.

A robust R control chart based on a two-step estimator of the process dispersion

Hamid Shahriari 《Communications in Statistics - Theory and Methods》2013,42(9):2504-2523

Control charts are frequently used tools for monitoring and controlling processes. Classical control charts are sensitive to contaminated observations that may be present in the data collected from the processes. Thus, these charts are not able to control the processes precisely when the data are contaminated. Robust control charts are those which are less sensitive to contamination. Some robust control charts for monitoring the process variability were proposed in the past which are robust to some sorts of contamination. In this paper a new robust R control chart is proposed which is less sensitive to a wide range of contaminations, i.e. general and local contaminations. Simulation studies are performed to compare the performance of the proposed control chart with some classical and robust control charts, using ARL and MSD as criteria for comparison. The simulation results show a very good performance of the proposed chart when both types of contamination exist.
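For context, the classical R chart that robust proposals of this kind are compared against can be sketched as follows. This is a minimal illustration and not the two-step estimator of the paper: it assumes the textbook limits LCL = D3 * R-bar and UCL = D4 * R-bar with the standard constants for subgroups of size 5, and the subgroup data are hypothetical. It shows how one contaminated subgroup inflates the average range and therefore the upper limit.

```python
# Standard Shewhart R-chart constants for subgroups of size n = 5.
D3, D4 = 0.0, 2.114

def r_chart_limits(subgroups):
    """Return (LCL, center line, UCL) of a classical R chart."""
    ranges = [max(g) - min(g) for g in subgroups]
    r_bar = sum(ranges) / len(ranges)  # average subgroup range
    return D3 * r_bar, r_bar, D4 * r_bar

# Hypothetical data: three clean subgroups, each with range 4.
clean = [[10, 11, 12, 13, 14], [9, 10, 11, 12, 13], [11, 12, 13, 14, 15]]
# Same data with one gross error in the last subgroup (its range becomes 39).
contaminated = clean[:2] + [[11, 12, 13, 14, 50]]

print(r_chart_limits(clean))
print(r_chart_limits(contaminated))
```

Because the limits are proportional to the average range, a single gross error widens them for every subgroup, which is the sensitivity the abstract describes.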

3.

Outlier identification and robust parameter estimation in a zero-inflated Poisson model

Jun Yang Min Xie Thong Ngee Goh 《Journal of applied statistics》2011,38(2):421-430

The zero-inflated Poisson distribution has been used in the modeling of count data in different contexts. This model tends to be influenced by outliers because of the excessive occurrence of zeroes, so outlier identification and robust parameter estimation are important for such a distribution. Some outlier identification methods are studied in this paper, and their applications and results are also presented with an example. To eliminate the effect of outliers, two robust parameter estimates are proposed based on the trimmed mean and the Winsorized mean. Simulation results show the robustness of our proposed parameter estimates.
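The two location estimators named in the abstract can be sketched as below. This is only an illustration of the trimmed mean and the Winsorized mean themselves, not the paper's zero-inflated Poisson parameter estimates; the function names, the per-tail count `k`, and the count data are assumptions made for the example.

```python
def trimmed_mean(values, k):
    """Mean after discarding the k smallest and k largest values."""
    data = sorted(values)
    if 2 * k >= len(data):
        raise ValueError("k is too large for this sample")
    kept = data[k:len(data) - k]
    return sum(kept) / len(kept)

def winsorized_mean(values, k):
    """Mean after replacing the k extreme values in each tail
    by the nearest value that is kept."""
    data = sorted(values)
    if 2 * k >= len(data):
        raise ValueError("k is too large for this sample")
    if k > 0:
        data[:k] = [data[k]] * k        # pull up the lower tail
        data[-k:] = [data[-k - 1]] * k  # pull down the upper tail
    return sum(data) / len(data)

# Hypothetical counts with many zeroes and one gross outlier (40).
counts = [0, 0, 0, 0, 1, 1, 2, 2, 3, 40]

print(sum(counts) / len(counts))   # ordinary mean
print(trimmed_mean(counts, 1))
print(winsorized_mean(counts, 1))
```

Both estimators stay close to the bulk of the data, whereas the ordinary mean is pulled toward the outlier.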

4.

Robust prediction and extrapolation designs for censored data

Xiaojian Xu 《Journal of statistical planning and inference》2009

In this paper we present the construction of robust designs for a possibly misspecified generalized linear regression model when the data are censored. The minimax designs and unbiased designs are found for maximum likelihood estimation in the context of both prediction and extrapolation problems. This paper extends preceding work of robust designs for complete data by incorporating censoring and maximum likelihood estimation. It also broadens former work of robust designs for censored data from others by considering both nonlinearity and much more arbitrary uncertainty in the fitted regression response and by dropping all restrictions on the structure of the regressors. Solutions are derived by a nonsmooth optimization technique analytically and given in full generality. A typical example in accelerated life testing is also demonstrated. We also investigate implementation schemes which are utilized to approximate a robust design having a density. Some exact designs are obtained using an optimal implementation scheme.

5.

ROBUST METHODS FOR DEPENDENT CELL PEDIGREE DATA

Ian C. Marschner 《Australian & New Zealand Journal of Statistics》1992,34(2):181-198

This paper studies a robust approach to the analysis of cell pedigree data, building on the work of Huggins & Marschner (1991) which discussed M-estimation for the so-called bifurcating autoregressive process. The study allows for incomplete observation of the pedigree, and incorporates the possibility of additive effects outliers, as discussed in the time series literature. Some properties of the proposed estimation procedure are studied, including a Monte Carlo investigation of robustness in the presence of contamination.

6.

Robust stepwise regression

C. Agostinelli 《Journal of applied statistics》2002,29(6):825-840

The selection of an appropriate subset of explanatory variables to use in a linear regression model is an important aspect of a statistical analysis. Classical stepwise regression is often used with this aim but it could be invalidated by a few outlying observations. In this paper, we introduce a robust F-test and a robust stepwise regression procedure based on weighted likelihood in order to achieve robustness against the presence of outliers. The introduced methodology is asymptotically equivalent to the classical one when no contamination is present. Some examples and simulations are presented.

7.

Robustness of designed experiments against missing data

Krishan Lal V. K. Gupta Lalmohan Bhar 《Journal of applied statistics》2001,28(1):63-79

This paper investigates the robustness of designed experiments for estimating linear functions of a subset of parameters in a general linear model against the loss of any t (≥ 1) observations. Necessary and sufficient conditions for robustness of a design under a homoscedastic model are derived. It is shown that a design robust under a homoscedastic model is also robust under a general heteroscedastic model with correlated observations. As a particular case, necessary and sufficient conditions are obtained for the robustness of block designs against the loss of data. Simple sufficient conditions are also provided for the binary block designs to be robust against the loss of data. Some classes of designs, robust up to three missing observations, are identified. A-efficiency of the residual design is evaluated for certain block designs for several patterns of two missing observations. The efficiency of the residual design has also been worked out when all the observations in any two blocks, not necessarily disjoint, are lost. The lower bound to A-efficiency has also been obtained for the loss of t observations. Finally, a general expression is obtained for the efficiency of the residual design when all the observations of m (≥ 1) disjoint blocks are lost.

8.

What can the foundations discussion contribute to data analysis? And what may be some of the future directions in robust methods and data analysis?

《Journal of statistical planning and inference》1997,57(1):7-19

9.

Consistency and Normality of M-Estimators in Partly Linear Models with Stochastic Adapted Errors

Li Yan Xia Chen 《Communications in Statistics - Theory and Methods》2013,42(9):1557-1568

The robust M-estimators for the partly linear model under stochastic adapted errors are considered. It is shown that the M-estimator of the parameter is asymptotically normal and the M-estimator of the nonparametric function achieves the optimal rate of convergence for nonparametric regression. Some known results are improved and generalized. Simulations and a real data example illustrate the proposed method.

10.

A robust Parafac model for compositional data

M. A. Di Palma P. Filzmoser M. Gallo K. Hron 《Journal of applied statistics》2018,45(8):1347-1369

Compositional data are characterized by values containing relative information, and thus the ratios between the data values are of interest for the analysis. Due to specific features of compositional data, standard statistical methods should be applied to compositions expressed in a proper coordinate system with respect to an orthonormal basis. It is discussed how three-way compositional data can be analyzed with the Parafac model. When data are contaminated by outliers, robust estimates for the Parafac model parameters should be employed. It is demonstrated how robust estimation can be done in the context of compositional data and how the results can be interpreted. A real data example from macroeconomics underlines the usefulness of this approach.

11.

Robust and diagnostic regression analyses

Anthony C Atkinson 《Communications in Statistics - Theory and Methods》2013,42(22):2559-2571

Graphical methods of diagnostic regression analysis are applied to three examples in which least squares and robust regression analyses give substantially different results. The diagnostic tools lead to the identification of data deficiencies and model inadequacies. The analyses serve as a reminder that robust regressions depend upon the linear model and upon the scale in which the response is analysed. The robust analysis may also be sensitive to gross errors in one or more explanatory variables.

12.

Robust estimation in generalized linear mixed models

Kelvin K. W. Yau & Anthony Y. C. Kuk 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2002,64(1):101-117

Generalized linear mixed models (GLMMs) are widely used to analyse non-normal response data with extra-variation, but non-robust estimators are still routinely used. We propose robust methods for maximum quasi-likelihood and residual maximum quasi-likelihood estimation to limit the influence of outlying observations in GLMMs. The estimation procedure parallels the development of robust estimation methods in linear mixed models, but with adjustments in the dependent variable and the variance component. The methods proposed are applied to three data sets and a comparison is made with the nonparametric maximum likelihood approach. When applied to a set of epileptic seizure data, the methods proposed have the desired effect of limiting the influence of outlying observations on the parameter estimates. Simulation shows that one of the residual maximum quasi-likelihood proposals has a smaller bias than those of the other estimation methods. We further discuss the equivalence of two GLMM formulations when the response variable follows an exponential family. Their extensions to robust GLMMs and their comparative advantages in modelling are described. Some possible modifications of the robust GLMM estimation methods are given to provide further flexibility for applying the method.

13.

The power of monitoring: how to make the most of a contaminated multivariate sample

Andrea Cerioli Marco Riani Anthony C. Atkinson Aldo Corbellini 《Statistical Methods and Applications》2018,27(4):559-587

Diagnostic tools must rely on robust high-breakdown methodologies to avoid distortion in the presence of contamination by outliers. However, a disadvantage of having a single, even if robust, summary of the data is that important choices concerning parameters of the robust method, such as breakdown point, have to be made prior to the analysis. The effect of such choices may be difficult to evaluate. We argue that an effective solution is to look at several pictures, and possibly at a whole movie, of the available data. This can be achieved by monitoring, over a range of parameter values, the results computed through the robust methodology of choice. We show the information gain that monitoring provides in the study of complex data structures through the analysis of multivariate datasets using different high-breakdown techniques. Our findings support the claim that the principle of monitoring is very flexible and that it can lead to robust estimators that are as efficient as possible. We also address through simulation some of the tricky inferential issues that arise from monitoring.

14.

Small sample size comparisons of tests for homogeneity of variances by Monte-Carlo

S Geng W. J Wang C Miller 《Communications in Statistics - Simulation and Computation》2013,42(4):379-389

A number of tests are available for testing the equality of several population variances. Some are claimed to be robust. We compared six of those claimed robust procedures by Monte Carlo simulated experiments, particularly for cases of small and unequal sample sizes. Our results show that the jack-knife test compares favorably with the other tests.

15.

Robust splines

Russell V. Lenth 《Communications in Statistics - Theory and Methods》2013,42(9):847-854

We consider the problem of fitting a cubic spline to data using robust regression techniques. Some important properties of splines are discussed, showing that their use as a regression model is related in principle to the concept of robustness. Methods for fitting splines and interpreting the results are outlined, and an illustrative example is given.

16.

Robust estimation of the mean vector for high-dimensional data set using robust clustering

Hamid Shahriari 《Journal of applied statistics》2015,42(6):1183-1205

The first step in statistical analysis is parameter estimation. In multivariate analysis, one of the parameters of interest is the mean vector, and it is usually assumed that the data come from a multivariate normal distribution. In this situation, the maximum likelihood estimator (MLE), that is, the sample mean vector, is the best estimator. However, when outliers exist in the data, the use of the sample mean vector will result in poor estimation, so estimators which are robust to the existence of outliers should be used. The most popular robust multivariate estimator of the mean vector is the S-estimator, which has desirable properties. However, computing this estimator requires a robust estimate of the mean vector as a starting point. Usually the minimum volume ellipsoid (MVE) is used as the starting point in computing the S-estimator. For high-dimensional data, computing the MVE takes too much time. In some cases, this time is so large that existing computers cannot perform the computation. In addition to the computation time, for high-dimensional data sets the MVE method is not precise. In this paper, a robust starting point for the S-estimator based on robust clustering is proposed which could be used for estimating the mean vector of high-dimensional data. The performance of the proposed estimator in the presence of outliers is studied and the results indicate that the proposed estimator performs precisely and much better than some of the existing robust estimators for high-dimensional data.

17.

Location and Scale Estimation with Correlation Coefficients

Rudy Gideon Adele Marie Rothan 《Communications in Statistics - Theory and Methods》2013,42(9):1561-1572

This article shows how to use any correlation coefficient to produce an estimate of location and scale. It is part of a broader system, called a correlation estimation system (CES), that uses correlation coefficients as the starting point for estimations. The method is illustrated using the well-known normal distribution. This article shows that any correlation coefficient can be used to fit a simple linear regression line to bivariate data and then the slope and intercept are estimates of standard deviation and location. Because a robust correlation will produce robust estimates, this CES can be recommended as a tool for everyday data analysis. Simulations indicate that the median with this method using a robust correlation coefficient appears to be nearly as efficient as the mean with good data and much better if there are a few errant data points. Hypothesis testing and confidence intervals are discussed for the scale parameter; both normal and Cauchy distributions are covered.
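The idea of reading location and scale off a fitted line can be sketched as follows, under several assumptions that are not taken from the article: the bivariate data are formed by pairing the sorted sample with standard normal quantiles at plotting positions (i + 0.5)/n, and Pearson's correlation is used, in which case the fitted line is the ordinary least-squares line. Pearson's coefficient is not robust; the CES described above would substitute a robust correlation coefficient, which this sketch does not attempt. The function name is hypothetical.

```python
from statistics import NormalDist, mean

def location_scale_from_line(sample):
    """Fit sorted data against standard normal quantiles by least squares.

    Returns (intercept, slope): the intercept estimates location and the
    slope estimates the standard deviation under a normal model.
    """
    y = sorted(sample)
    n = len(y)
    # Assumed plotting positions (i + 0.5) / n, strictly inside (0, 1).
    q = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    q_bar, y_bar = mean(q), mean(y)
    sxy = sum((a - q_bar) * (b - y_bar) for a, b in zip(q, y))
    sxx = sum((a - q_bar) ** 2 for a in q)
    slope = sxy / sxx
    intercept = y_bar - slope * q_bar
    return intercept, slope

# Hypothetical sample.
sample = [4.1, 5.3, 4.8, 6.0, 5.1, 4.4, 5.7, 5.0]
location, scale = location_scale_from_line(sample)
print(location, scale)
```

If the sample is an exact affine transform a + b*q of the normal quantiles with b > 0, the function returns a and b up to rounding, which gives a simple sanity check.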

18.

Estimation of the Parameters of the Birnbaum–Saunders Distribution

Steven G. From Linxiong Li 《Communications in Statistics - Theory and Methods》2013,42(12):2157-2169

Some alternative estimators to the maximum likelihood estimators of the two parameters of the Birnbaum–Saunders distribution are proposed. Most have high efficiencies as measured by root mean square error and are robust to departures from the model as well as to outliers. In addition, the proposed estimators are easy to compute. Both complete and right-censored data are discussed. Simulation studies are provided to compare the performance of the estimators.

19.

Long Memory in Foreign-Exchange Rates

Yin-Wong Cheung 《Journal of Business & Economic Statistics》2013,31(1):93-101

Using the Geweke–Porter-Hudak test, we find evidence of long memory in exchange-rate data. This implies that the empirical evidence of unit roots in exchange rates may not be robust to long-memory alternatives. Fractionally integrated autoregressive moving average (ARFIMA) models are estimated by both the time-domain exact maximum likelihood (ML) method and the frequency-domain approximate ML method. Impulse-response functions and forecasts based on these estimated ARFIMA models are evaluated to gain insight into the long-memory characteristics of exchange rates. Some tentative explanations of the long memory found in the exchange rates are discussed.

20.

The evaluation of socio-economic development of development agency regions in Turkey using classical and robust principal component analyses

Hasan Bulut Yüksel Öner 《Journal of applied statistics》2017,44(16):2936-2948

In this study, classical and robust principal component analyses are used to evaluate the socio-economic development of the regions of development agencies, which serve the purpose of decreasing development differences among regions in Turkey. Because of the large differences between the development levels of the regions, an outlier problem occurs, and hence robust statistical methods are used. Classical and robust statistical methods are also used to investigate whether there are any outliers in the data set. In classical principal component analysis, the number of observations must be larger than the number of variables; otherwise the determinant of the covariance matrix is zero. In the Robust method for Principal Component Analysis (ROBPCA), a robust approach to principal component analysis in high-dimensional data, principal components are obtained even if the number of variables is larger than the number of observations. In this paper, 26 development agencies are first evaluated with 19 variables by using principal component analysis based on classical and robust scatter matrices, and then these 26 development agencies are evaluated with 46 variables by using the ROBPCA method.