1.

Hoda Kamranfar Rahim Chinipardaz 《Communications in Statistics - Simulation and Computation》2017,46(10):7844-7854

This article is concerned with outliers in GARCH models. An iterative procedure is given for testing for the presence of any of the four common types of outliers. Since the distribution of the test statistic cannot be obtained analytically, its distributional behavior is investigated via a simulation study. The simulation study is based on estimates of the residual standard deviation (σ_ν), which are obtained using two methods: the median absolute deviation (MAD) method and the omit-one method. The proposed procedure is employed to test for the presence of outliers in the weekly light oil price indexes of Iran from 1997 to 2010.
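As a rough illustration of the MAD-based scale estimate mentioned in this abstract, the sketch below computes a robust estimate of a residual standard deviation. The function name `mad_sigma` and the consistency factor 1.4826 (which makes the MAD consistent for σ under normality) are assumptions for illustration; the abstract does not say which scaling the authors use, and the omit-one method is not shown.

```python
from statistics import median

def mad_sigma(residuals):
    """Robust scale estimate: 1.4826 * median(|r - median(r)|).

    The factor 1.4826 is the usual normal-consistency constant; it is an
    assumption here, not taken from the article.
    """
    values = list(residuals)
    center = median(values)
    return 1.4826 * median([abs(r - center) for r in values])

# A single extreme residual leaves the estimate unchanged in this example.
clean = [1, 2, 3, 4, 5]
contaminated = [1, 2, 3, 4, 100]
print(mad_sigma(clean), mad_sigma(contaminated))
```

Both calls give the same value because the median of the absolute deviations is not affected by one extreme point, which is the property that makes the MAD attractive when outliers may be present.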

2.

Nonlinear regression models for heterogeneous data with massive outliers

Yoonsuh Jung 《Journal of applied statistics》2019,46(8):1456-1477

Income- or expenditure-related data sets are often nonlinear, heteroscedastic, skewed even after transformation, and contain numerous outliers. We propose a class of robust nonlinear models that treat outlying observations effectively without removing them. For this purpose, case-specific parameters and a related penalty are employed to detect and modify the outliers systematically. We show how existing nonlinear models such as smoothing splines and generalized additive models can be robustified by the case-specific parameters. Next, we extend the proposed methods to heterogeneous models by incorporating unequal weights. The details of estimating the weights are provided. Two real data sets and simulated data sets show the potential of the proposed methods when the data are nonlinear and contain outlying observations.

3.

A note on contamination models and outliers

Jürgen Wellmann Ursula Gather 《Communications in Statistics - Theory and Methods》2013,42(8):1793-1802

In order to describe or generate so-called outliers in univariate statistical data, contamination models are often used. These models assume that k out of n independent random variables are shifted or multiplied by some constant, whereas the other observations still come i.i.d. from some common target distribution. Of course, these contaminants do not necessarily stick out as the extremes in the sample. Moreover, it is the amount and magnitude of 'contamination' which determines the number of obvious outliers. Using the concept of Davies and Gather (1993) to formalize the outlier notion, we quantify the amount of contamination needed to produce a prespecified expected number of 'genuine' outliers. In particular, we demonstrate that for samples of moderate size from a normal target distribution, a rather large shift of the contaminants is necessary to yield a certain expected number of outliers. Such an insight is of interest when designing simulation studies where outliers should occur, as well as in theoretical investigations on outliers.
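The shift-contamination setup described in this abstract can be sketched as a small simulation. The code below assumes a standard normal target distribution and an outlier region of the form |x| > z_{1-α_n/2} with α_n = 1 - (1 - α)^{1/n}, which is one common way of writing an outlier region in the sense of Davies and Gather; the function name `count_outliers`, the default α, and the seeding are illustrative assumptions, not details from the article.

```python
import random
from statistics import NormalDist

def count_outliers(n, k, shift, alpha=0.05, seed=0):
    """Draw n N(0, 1) values, shift the first k by `shift`, and count how
    many observations fall in the assumed alpha_n-outlier region of N(0, 1)."""
    rng = random.Random(seed)
    # Per-observation level chosen so that a clean sample of size n
    # contains no outlier with probability 1 - alpha.
    alpha_n = 1 - (1 - alpha) ** (1 / n)
    cutoff = NormalDist().inv_cdf(1 - alpha_n / 2)
    sample = [rng.gauss(0, 1) + (shift if i < k else 0.0) for i in range(n)]
    return sum(abs(x) > cutoff for x in sample)

# Compare a moderate shift with a large one for k = 5 contaminants out of n = 50.
for shift in (2.0, 4.0, 6.0):
    print(shift, count_outliers(50, 5, shift))
```

Averaging the count over many seeds would approximate the expected number of outliers for a given shift, which is the quantity the abstract discusses.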

4.

Interpolation, outliers and inverse autocorrelations

Daniel Piña Agustin Maravall 《Communications in Statistics - Theory and Methods》2013,42(10):3175-3186

The paper addresses the problem of estimating missing observations in an infinite realization of a linear, possibly nonstationary, stochastic process when the model is known. The general case of any possible distribution of missing observations in the time series is considered, and analytical expressions for the optimal estimators and their associated mean squared errors are obtained. These expressions involve solely the elements of the inverse or dual autocorrelation function of the series.

This optimal estimator (the conditional expectation of the missing observations given the available ones) is equal to the estimator that results from filling the missing values in the series with arbitrary numbers, treating these numbers as additive outliers, and using intervention analysis to remove the outlier effects from the invented numbers.

5.

Perfect simulation for Reed-Frost epidemic models

Philip D. O'Neill 《Statistics and Computing》2003,13(1):37-44

The Reed-Frost epidemic model is a simple stochastic process with parameter q that describes the spread of an infectious disease among a closed population. Given data on the final outcome of an epidemic, it is possible to perform Bayesian inference for q using a simple Gibbs sampler algorithm. In this paper it is illustrated that by choosing latent variables appropriately, certain monotonicity properties hold which facilitate the use of a perfect simulation algorithm. The methods are applied to real data.
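For readers unfamiliar with the Reed-Frost model, the sketch below simulates one epidemic forward in time and returns its final size. It assumes the usual chain-binomial formulation in which q is the probability that a susceptible escapes infection from one given infective during a generation; the abstract does not spell out this parameterization, and the function name `reed_frost_final_size` is hypothetical. This is plain forward simulation only, not the Gibbs sampler or the perfect simulation algorithm of the paper.

```python
import random

def reed_frost_final_size(n_susceptible, n_infective, q, rng):
    """Simulate a Reed-Frost chain and return the number of initial
    susceptibles who are ever infected.

    Assumption: each susceptible independently escapes every current
    infective with probability q, so it is infected in a generation
    with probability 1 - q ** (number of infectives).
    """
    s, i = n_susceptible, n_infective
    total_infected = 0
    while i > 0 and s > 0:
        p_infect = 1 - q ** i
        new_cases = sum(rng.random() < p_infect for _ in range(s))
        s -= new_cases
        total_infected += new_cases
        i = new_cases  # infectives are removed after one generation
    return total_infected

rng = random.Random(42)
sizes = [reed_frost_final_size(20, 1, 0.9, rng) for _ in range(5)]
print(sizes)
```

Final-size data of this kind are what the abstract refers to as "data on the final outcome of an epidemic".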

6.

Robust quasi-likelihood inference in generalized linear mixed models with outliers

《Journal of Statistical Computation and Simulation》2012,82(2):233-258

It is well known that in a traditional outlier-free situation, the generalized quasi-likelihood (GQL) approach [B.C. Sutradhar, On exact quasilikelihood inference in generalized linear mixed models, Sankhya: Indian J. Statist. 66 (2004), pp. 261–289] performs very well in obtaining consistent and efficient estimates of the parameters involved in generalized linear mixed models (GLMMs). In this paper, we first examine the effect of the presence of one or more outliers on the GQL estimation of the parameters in such GLMMs, especially in two important models, namely count and binary mixed models. The outliers appear to cause serious biases and hence inconsistency in the estimation. As a remedy, we then propose a robust GQL (RGQL) approach in order to obtain consistent estimates of the parameters in GLMMs in the presence of one or more outliers. An extensive simulation study is conducted to examine the consistency performance of the proposed RGQL approach.

7.

Maximum studentized score tests for the detection of outliers in time series regression models

《Journal of Statistical Computation and Simulation》2012,82(12):1355-1372

Efficient score tests exist, among others, for testing for the presence of additive and/or innovative outliers that result from a shifted mean of the error process under the regression model. A diagnostic technique based on the sample influence function of the autocorrelations also exists for the detection of outliers that result from shifted autocorrelations. The latter diagnostic technique is, however, not useful if the outlying observation does not affect the autocorrelation structure but is generated by an inflation in the variance of the error process under the regression model. In this paper, we develop a unified maximum studentized type test which is applicable for testing for additive and innovative outliers as well as variance-shifted outliers that may or may not affect the autocorrelation structure of the outlier-free time series observations. Since the computation of the p-values for the maximum studentized type test is not easy in general, we propose a Satterthwaite-type approximation based on suitable doubly non-central F-distributions for finding such p-values [F.E. Satterthwaite, An approximate distribution of estimates of variance components, Biometrics 2 (1946), pp. 110–114]. The approximations are evaluated through a simulation study, for example, for the detection of additive and innovative outliers as well as variance-shifted outliers that do not affect the autocorrelation structure of the outlier-free time series observations. Some simulation results on the effects of model misspecification on outlier detection are also provided.

8.

Tests for outliers in the inverse Gaussian distribution, with application to first hitting time models

《Journal of Statistical Computation and Simulation》2012,82(1):73-80

The inverse Gaussian (IG) distribution is often applied in statistical modelling, especially with lifetime data. We present tests for outlying values of the parameters (μ, λ) of this distribution when data are available from a sample of independent units and possibly with more than one event per unit. Outlier tests are constructed from likelihood ratio tests for equality of parameters. The test for an outlying value of λ is based on an F-distributed statistic that is transformed to an approximate normal statistic when there are unequal numbers of events per unit. Simulation studies are used to confirm that Bonferroni tests have accurate size and to examine the powers of the tests. The application to first hitting time models, where the IG distribution is derived from an underlying Wiener process, is described. The tests are illustrated on data concerning the strength of different lots of insulating material.

9.

Sensitivity analysis of partially linear models with response missing at random

Ai-Xia Fan 《Communications in Statistics - Simulation and Computation》2017,46(7):5323-5339

This article investigates case-deletion influence analysis via Cook's distance and local influence analysis via conformal normal curvature for partially linear models with response missing at random. A local influence approach is developed to assess the sensitivity of the parametric and nonparametric estimators to various perturbations, such as case-weight, response variable, explanatory variable, and parameter perturbations, on the basis of semiparametric estimating equations, which are constructed using the inverse probability weighted approach rather than the likelihood function. Residuals and generalized leverage are also defined. Simulation studies and a dataset taken from the AIDS Clinical Trials are used to illustrate the proposed methods.

10.

Using Wald-type estimator to combat outliers and Berkson-type uncertainties with mixture distributions in linear regression models

Yuh-Jenn Wu Li-Hsueh Cheng 《Communications in Statistics - Theory and Methods》2018,47(14):3324-3337

The impacts of outliers and Berkson-type uncertainties with additive and multiplicative errors in linear regression are investigated. The work is motivated by a common biological phenomenon in which outlying observations and Berkson-type uncertainties may be present in part of the data, causing incorrect estimates and inferences. In this article, we use a Wald-type estimator to combat these uncertainties because of its merits, including its large-sample properties, especially for asymmetric errors, as well as its simplicity, since it involves no nuisance parameters. The severity of neglecting the uncertainty effects is examined by Monte Carlo simulations and real data examples through comparison of residual-based methods with the proposed estimator.

11.

Estimation and diagnostic analysis in skew-generalized-normal regression models

Clécio S. Ferreira Reinaldo B. Arellano-Valle 《Journal of Statistical Computation and Simulation》2018,88(6):1039-1059

The skew-generalized-normal distribution [Arellano-Valle, RB, Gómez, HW, Quintana, FA. A new class of skew-normal distributions. Comm Statist Theory Methods 2004;33(7):1465–1480] is a class of asymmetric normal distributions, which contains the normal and skew-normal distributions as special cases. The main virtues of this distribution are that it is easy to simulate from and that it supplies a genuine expectation-maximization (EM) algorithm for maximum likelihood estimation. In this paper, we extend the EM algorithm to linear regression models assuming skew-generalized-normal random errors, and we develop diagnostic analyses via local influence and generalized leverage, following Zhu and Lee's approach. This is because Cook's well-known approach would be more complicated to use to obtain measures of local influence. Finally, results obtained for a real data set are reported, illustrating the usefulness of the proposed method.

12.

Identification and classification of multiple outliers, high leverage points and influential observations in linear regression

A.A.M. Nurunnabi M. Nasser A.H.M.R. Imon 《Journal of applied statistics》2016,43(3):509-525

Detection of multiple unusual observations such as outliers, high leverage points and influential observations (IOs) in regression is still a challenging task for statisticians due to the well-known masking and swamping effects. In this paper we introduce a robust influence distance that can identify multiple IOs, and propose a sixfold plotting technique based on the well-known group deletion approach to classify regular observations, outliers, high leverage points and IOs simultaneously in linear regression. Experiments on several well-known data sets and simulation studies demonstrate that the proposed algorithm performs successfully in the presence of multiple unusual observations and can avoid masking and/or swamping effects.

13.

Inference and diagnostics in skew scale mixtures of normal regression models

《Journal of Statistical Computation and Simulation》2012,82(3):517-537

Skew scale mixtures of normal distributions are often used for statistical procedures involving asymmetric and heavy-tailed data. The main virtue of the members of this family of distributions is that they are easy to simulate from and they also supply genuine expectation-maximization (EM) algorithms for maximum likelihood estimation. In this paper, we extend the EM algorithm to linear regression models and we develop diagnostic analyses via local influence and generalized leverage, following Zhu and Lee's approach. This is because Cook's well-known approach cannot be used to obtain measures of local influence. The EM-type algorithm is discussed with an emphasis on the skew Student-t-normal, skew slash, skew-contaminated normal and skew power-exponential distributions. Finally, results obtained for a real data set are reported, illustrating the usefulness of the proposed method.

14.

A high breakdown, high efficiency and bounded influence modified GM estimator based on support vector regression

Waleed Dhhan Sohel Rana Habshah Midi 《Journal of applied statistics》2017,44(4):700-714

Regression analysis aims to estimate the approximate relationship between the response variable and the explanatory variables. This can be done using classical methods such as ordinary least squares. Unfortunately, these methods are very sensitive to anomalous points, often called outliers, in the data set. The main contribution of this article is to propose a new version of the Generalized M-estimator that provides good resistance against vertical outliers and bad leverage points. The advantage of this method over existing methods is that it does not downweight the good leverage points, which increases the efficiency of the estimator. To achieve this goal, the fixed-parameters support vector regression technique is used to identify and downweight outliers and bad leverage points. The effectiveness of the proposed estimator is investigated using real and simulated data sets.

15.

A note on connectedness in fixed effects MANOVA and GMANOVA models with missing cells

Leigh W. Murray 《Communications in Statistics - Theory and Methods》2013,42(7):2527-2531

Murray and Smith (1985) and Hocking (1985) give a generalized definition and test of connectedness in the case of missing cells using the univariate cell-means model with linear restrictions on the cell-means. The test of connectedness is here extended to multivariate fixed effects models, including the usual MANOVA model with linear restrictions, the MANOVA model with double linear restrictions, and the GMANOVA model.

16.

Estimation and diagnostic for skew-normal partially linear models

Clécio S. Ferreira Gilberto A. Paula 《Journal of applied statistics》2017,44(16):3033-3053

Partially linear models (PLMs) are an important tool in modelling economic and biometric data and are considered a flexible generalization of the linear model, obtained by including a nonparametric component of some covariate in the linear predictor. Usually, the error component is assumed to follow a normal distribution. However, theory and application (through simulation or experimentation) often generate a large number of data sets that are skewed. The objective of this paper is to extend PLMs by allowing the errors to follow a skew-normal distribution [A. Azzalini, A class of distributions which includes the normal ones, Scand. J. Statist. 12 (1985), pp. 171–178], increasing the flexibility of the model. In particular, we develop the expectation-maximization (EM) algorithm for linear regression models and diagnostic analysis via local influence as well as generalized leverage, following [H. Zhu and S. Lee, Local influence for incomplete-data models, J. R. Stat. Soc. Ser. B 63 (2001), pp. 111–126]. A simulation study is also conducted to evaluate the efficiency of the EM algorithm. Finally, a suitable transformation is applied to a data set on ragweed pollen concentration in order to fit PLMs under asymmetric distributions. An illustrative comparison is performed between normal and skew-normal errors.

17.

Discrimination of AR, MA and ARMA time series models

H.T. Chan R. Chinipardaz T.F. Cox 《Communications in Statistics - Theory and Methods》2013,42(6):1247-1260

The problem of discrimination between two stationary ARMA time series models is considered, in particular AR(p), MA(p) and ARMA(1,1) models. The discriminant based on the likelihood ratio leads to a quadratic form that is generally too complicated to evaluate explicitly. The discriminant can be expressed approximately as a linear combination of independent chi-squared random variables, each with one degree of freedom, the coefficients of which are eigenvalues of cumbersome matrices. An analytical solution which gives the coefficients approximately is suggested.

18.

Some algebra and geometry for hierarchical models, applied to diagnostics

J. S. Hodges 《Journal of the Royal Statistical Society. Series B, Statistical methodology》1998,60(3):497-536

Recent advances in computing make it practical to use complex hierarchical models. However, the complexity makes it difficult to see how features of the data determine the fitted model. This paper describes an approach to diagnostics for hierarchical models, specifically linear hierarchical models with additive normal or t-errors. The key is to express hierarchical models in the form of ordinary linear models by adding artificial 'cases' to the data set corresponding to the higher levels of the hierarchy. The error term of this linear model is not homoscedastic, but its covariance structure is much simpler than that usually used in variance component or random effects models. The re-expression has several advantages. First, it is extremely general, covering dynamic linear models, random effect and mixed effect models, and pairwise difference models, among others. Second, it makes more explicit the geometry of hierarchical models, by analogy with the geometry of linear models. Third, the analogy with linear models provides a rich source of ideas for diagnostics for all the parts of hierarchical models. This paper gives diagnostics to examine candidate added variables, transformations, collinearity, case influence and residuals.

19.

Embedding latent class regression and latent class distal outcome models into cluster-weighted latent class analysis: a detailed simulation experiment

Roberto Di Mari Antonio Punzo Zsuzsa Bakk 《Australian & New Zealand Journal of Statistics》2023,65(3):213-233

Usually in latent class (LC) analysis, external predictors are taken to be cluster conditional probability predictors (LC models with external predictors), and/or score conditional probability predictors (LC regression models). In such cases, their distribution is not of interest. Class-specific distribution is of interest in the distal outcome model, when the distribution of the external variables is assumed to depend on LC membership. In this paper, we consider a more general formulation that embeds both the LC regression and the distal outcome models, as is typically done in cluster-weighted modelling. This allows us to investigate (1) whether the distribution of the external variables differs across classes and (2) whether there are significant direct effects of the external variables on the indicators, by jointly modelling the relationship between the external and the latent variables. We show the advantages of the proposed modelling approach through a set of artificial examples, an extensive simulation study and an empirical application about psychological contracts among employees and employers in Belgium and the Netherlands.

20.

Maximum-likelihood estimation and influence analysis in multivariate skew-normal reproductive dispersion mixed models for longitudinal data

Yuan Ying Zhao 《Statistics》2015,49(6):1348-1365

Various mixed models were developed to capture the features of between- and within-individual variation for longitudinal data under the normality assumption of the random effect and the within-individual random error. However, the normality assumption may be violated in some applications. To this end, this article assumes that the random effect follows a skew-normal distribution and the within-individual error is distributed as a reproductive dispersion model. An expectation conditional maximization (ECME) algorithm together with the Metropolis-Hastings (MH) algorithm within the Gibbs sampler is presented to simultaneously obtain estimates of parameters and random effects. Several diagnostic measures are developed to identify the potentially influential cases and assess the effect of minor perturbation to model assumptions via the case-deletion method and local influence analysis. To reduce the computational burden, we derive the first-order approximations to case-deletion diagnostics. Several simulation studies and a real data example are presented to illustrate the newly developed methodologies.