1.

A robust class of homoscedastic nonlinear regression models

Mohsen Maleki Zahra Barkhordar Zahra Khodadadi Darren Wraith 《Journal of Statistical Computation and Simulation》2019,89(14):2765-2781

In this paper, we examine a nonlinear regression (NLR) model with homoscedastic errors that follow a flexible class of two-piece distributions based on the scale mixtures of normal (TP-SMN) family. The objective of using this family is to develop a robust NLR model. The TP-SMN is a rich class of distributions that covers symmetric/asymmetric and light-/heavy-tailed distributions and is an alternative family to the well-known scale mixtures of skew-normal (SMSN) family studied by Branco and Dey [35]. A key feature of this study is the use of a new, suitable hierarchical representation of the family to obtain maximum-likelihood estimates of the model parameters via an EM-type algorithm. The performance of the proposed robust model is demonstrated using simulated and real datasets and is compared with that of other well-known NLR models.

2.

A skew factor analysis model based on the normal mean–variance mixture of Birnbaum–Saunders distribution

Farzane Hashemi Mehrdad Naderi Ahad Jamalizadeh Tsung-I Lin 《Journal of applied statistics》2020,47(16):3007

This paper presents a robust extension of the factor analysis model by assuming the multivariate normal mean–variance mixture of Birnbaum–Saunders distribution for the unobservable factors and errors. A computationally analytical EM-based algorithm is developed to find maximum likelihood estimates of the parameters. The asymptotic standard errors of the parameter estimates are derived under an information-based paradigm. Numerical merits of the proposed methodology are illustrated using both simulated and real datasets.

3.

Approximate maximum likelihood estimation in the presence of extra-Poisson variation

Hassan Johaadien Constantinos Goutis 《Communications in Statistics - Theory and Methods》2013,42(6):1625-1636

A two-stage hierarchical model for the analysis of discrete data with extra-Poisson variation is examined. The model consists of a Poisson distribution with a mixing lognormal distribution for the mean. A method of approximate maximum likelihood estimation of the parameters is proposed. The method uses the EM algorithm, and approximations that facilitate its implementation are derived. Approximate standard errors of the estimates are provided, and a numerical example is used to illustrate the method.
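
The extra-Poisson variation produced by this kind of hierarchy can be made concrete with a small sketch. It is not the estimation method of the paper; it only computes the marginal mean and variance of a count Y when Y | λ ~ Poisson(λ) and log λ ~ Normal(mu, sigma²). The function name and the parameter names `mu` and `sigma` are assumptions chosen for illustration.

```python
import math

def poisson_lognormal_moments(mu, sigma):
    """Marginal mean and variance of Y, where Y | lam ~ Poisson(lam)
    and log(lam) ~ Normal(mu, sigma**2). Names are illustrative only."""
    mean_lam = math.exp(mu + 0.5 * sigma ** 2)                # E[lam]
    var_lam = mean_lam ** 2 * (math.exp(sigma ** 2) - 1.0)    # Var[lam]
    # Law of total variance: Var[Y] = E[Var(Y | lam)] + Var(E[Y | lam])
    #                               = E[lam] + Var[lam]
    return mean_lam, mean_lam + var_lam

mean, var = poisson_lognormal_moments(mu=1.0, sigma=0.5)
# A variance-to-mean ratio above 1 indicates overdispersion relative to Poisson.
print(f"mean={mean:.3f}, variance={var:.3f}, ratio={var / mean:.3f}")
```

When `sigma` is 0 the mixing distribution is degenerate and the variance equals the mean, which recovers the plain Poisson case.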

4.

On influence diagnostics in elliptical multivariate regression models with equicorrelated random errors

《Statistical Methodology》2014

In this paper we discuss estimation and diagnostic procedures in elliptical multivariate regression models with equicorrelated random errors. Two procedures are proposed for the parameter estimation and the local influence curvatures are derived under some usual perturbation schemes to assess the sensitivity of the maximum likelihood estimates (MLEs). Two motivating examples preliminarily analyzed under normal errors are reanalyzed considering appropriate elliptical distributions. The local influence approach is used to compare the sensitivity of the model estimates.

5.

Generalized Tobit models: diagnostics and application in econometrics

Michelli Barros Manuel Galea Manoel Santos-Neto 《Journal of applied statistics》2018,45(1):145-167

The standard Tobit model is constructed under the assumption of a normal distribution and has been widely applied in econometrics. Atypical/extreme data have a harmful effect on the maximum likelihood estimates of the standard Tobit model parameters. We therefore need diagnostic tools to evaluate the effect of extreme data. If such data are detected, we must have available a Tobit model that is robust to them. The family of elliptically contoured distributions has the Laplace, logistic, normal and Student-t cases as some of its members. This family has been widely used to generalize models based on the normal distribution, with excellent practical results. In particular, because the Student-t distribution has an additional parameter, we can adjust for the kurtosis of the data, providing estimates that are robust against extreme data. We propose a methodology based on a generalization of the standard Tobit model with errors following elliptical distributions. Diagnostics in the Tobit model with elliptical errors are developed. We derive residuals and global/local influence methods considering several perturbation schemes. This is important because different diagnostic methods can detect different atypical data. We implement the proposed methodology in an R package. We illustrate the methodology with real-world econometric data by using the R package, which shows its potential applications. In the application presented, the Tobit model based on the Student-t distribution with a small number of degrees of freedom performs very well in reducing the influence of extreme cases on the maximum likelihood estimates. It provides new empirical evidence on the capabilities of the Student-t distribution for accommodating atypical data.
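
As a point of reference for the standard (normal-error) Tobit model that the abstract generalizes, the following minimal sketch evaluates its log-likelihood for a left-censored response with a single covariate. It does not reproduce the authors' R package or the elliptical extension; the function names, the one-covariate setup and the default censoring limit of 0 are assumptions made for illustration.

```python
import math

def normal_logpdf(z):
    """Log density of the standard normal distribution at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)

def normal_logcdf(z):
    """Log CDF of the standard normal; adequate for moderate z only,
    since the CDF underflows to 0 for very negative z."""
    return math.log(0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

def tobit_loglik(y, x, beta, sigma, limit=0.0):
    """Log-likelihood of a left-censored Tobit model (illustrative sketch):
    y* = beta[0] + beta[1] * x + sigma * e, e ~ N(0, 1), y = max(y*, limit)."""
    total = 0.0
    for yi, xi in zip(y, x):
        mu = beta[0] + beta[1] * xi
        if yi <= limit:
            # Censored case: contributes P(y* <= limit).
            total += normal_logcdf((limit - mu) / sigma)
        else:
            # Observed case: contributes the normal density of y.
            total += normal_logpdf((yi - mu) / sigma) - math.log(sigma)
    return total

print(tobit_loglik([0.0, 1.2, 2.5], [0.1, 1.0, 2.0], beta=(0.0, 1.0), sigma=1.0))
```

A heavy-tailed variant would replace the two normal helper functions with the log density and log CDF of the chosen elliptical distribution, which is the direction the abstract describes.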

6.

Regularized proportional odds models

《Journal of Statistical Computation and Simulation》2012,82(2):251-268

The proportional odds model (POM) is commonly used in regression analysis to predict the outcome for an ordinal response variable. The maximum likelihood estimation (MLE) approach is typically used to obtain the parameter estimates. The likelihood estimates do not exist when the number of parameters, p, is greater than the number of observations, n. The MLE also does not exist if there are no overlapping observations in the data. When the number of parameters is less than the sample size but p approaches n, the likelihood estimates may not exist, and if they exist they may have quite large standard errors. An estimation method is proposed to address the last two issues, i.e. complete separation and the case when p approaches n, but not the case when p>n. The proposed method does not use any penalty term but uses pseudo-observations to regularize the observed responses by downgrading their effect so that they become close to the underlying probabilities. The estimates can be computed easily with all commonly used statistical packages that support the fitting of POMs with weights. The estimates are compared with the MLE in a simulation study and in an application to real data.
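
For readers unfamiliar with the POM itself, the sketch below computes category probabilities from cumulative logits. It illustrates only the model, not the pseudo-observation method proposed in the paper. The sign convention logit P(Y ≤ j | x) = θ_j − x'β is one common choice and is an assumption here, as are the function and argument names.

```python
import math

def pom_probabilities(thresholds, beta, x):
    """Category probabilities P(Y = j | x) under a proportional odds model,
    assuming logit P(Y <= j | x) = theta_j - x'beta with increasing thresholds."""
    eta = sum(b * xi for b, xi in zip(beta, x))
    # Cumulative probabilities for the first J-1 categories, then 1 for the last.
    cum = [1.0 / (1.0 + math.exp(-(t - eta))) for t in thresholds] + [1.0]
    # Differences of consecutive cumulative probabilities give category probabilities.
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

probs = pom_probabilities(thresholds=[-1.0, 1.0], beta=[0.5], x=[2.0])
print([round(p, 4) for p in probs], "sum =", round(sum(probs), 4))
```

The same coefficient vector `beta` shifts every cumulative logit, which is what the "proportional odds" assumption means.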

7.

Consistency and Normality of M-Estimators in Partly Linear Models with Stochastic Adapted Errors

Li Yan Xia Chen 《Communications in Statistics - Theory and Methods》2013,42(9):1557-1568

Robust M-estimators for the partly linear model under stochastic adapted errors are considered. It is shown that the M-estimator of the parameter is asymptotically normal and that the M-estimator of the nonparametric function achieves the optimal rate of convergence for nonparametric regression. Some known results are improved and generalized. Simulations and a real data example are presented to illustrate the proposed method.

8.

Estimation and diagnostic for skew-normal partially linear models

Clécio S. Ferreira Gilberto A. Paula 《Journal of applied statistics》2017,44(16):3033-3053

Partially linear models (PLMs) are an important tool in modelling economic and biometric data and are considered a flexible generalization of the linear model, obtained by including a nonparametric component of some covariate in the linear predictor. Usually, the error component is assumed to follow a normal distribution. However, theory and application (through simulation or experimentation) often generate a large number of data sets that are skewed. The objective of this paper is to extend the PLMs by allowing the errors to follow a skew-normal distribution [A. Azzalini, A class of distributions which includes the normal ones, Scand. J. Statist. 12 (1985), pp. 171–178], increasing the flexibility of the model. In particular, we develop the expectation-maximization (EM) algorithm for linear regression models and diagnostic analysis via local influence as well as generalized leverage, following [H. Zhu and S. Lee, Local influence for incomplete-data models, J. R. Stat. Soc. Ser. B 63 (2001), pp. 111–126]. A simulation study is also conducted to evaluate the efficiency of the EM algorithm. Finally, a suitable transformation is applied to a data set on ragweed pollen concentration in order to fit PLMs under asymmetric distributions. An illustrative comparison is performed between normal and skew-normal errors.

9.

Robust multivariate mixture regression models with incomplete data

Hwa Kyung Lim Naveen N. Narisetty 《Journal of Statistical Computation and Simulation》2017,87(2):328-347

Multivariate mixture regression models can be used to investigate the relationships between two or more response variables and a set of predictor variables by taking into consideration unobserved population heterogeneity. It is common to take multivariate normal distributions as mixing components, but this mixing model is sensitive to heavy-tailed errors and outliers. Although normal mixture models can approximate any distribution in principle, the number of components needed to account for heavy-tailed distributions can be very large. Mixture regression models based on the multivariate t distributions can be considered as a robust alternative approach. Missing data are inevitable in many situations, and parameter estimates could be biased if the missing values are not handled properly. In this paper, we propose a multivariate t mixture regression model with missing information to model heterogeneity in the regression function in the presence of outliers and missing values. Along with robust parameter estimation, our proposed method can be used for (i) visualization of the partial correlation between response variables across latent classes and heterogeneous regressions, and (ii) outlier detection and robust clustering even in the presence of missing values. We also propose a multivariate t mixture regression model using MM-estimation with missing information that is robust to high-leverage outliers. The proposed methodologies are illustrated through simulation studies and real data analysis.

10.

Multivariate normal mean-variance mixture distribution based on Lindley distribution

Mehrdad Naderi Ahad Jamalizadeh 《Communications in Statistics - Simulation and Computation》2018,47(4):1179-1192

This article introduces a new asymmetric distribution constructed by assuming the multivariate normal mean-variance mixture model. We derive some mathematical properties of the new distribution, called the normal mean-variance mixture of the Lindley distribution. Also, a feasible maximum likelihood estimation procedure using the EM algorithm and the asymptotic standard errors of the parameter estimates are developed. The performance of the proposed distribution is illustrated by means of real datasets and simulation analysis.

11.

Mean-shift outliers model in skew scale-mixtures of normal distributions

《Journal of Statistical Computation and Simulation》2012,82(12):2346-2361

Asymmetric models have been discussed quite extensively in recent years, in situations where the normality assumption is suspect due to a lack of symmetry in the data. Techniques for assessing the quality of fit and diagnostic analysis are important for model validation. This paper presents a study of the mean-shift method for the detection of outliers in regression models under skew scale-mixtures of normal distributions. Analytical solutions for the estimators of the parameters are obtained through the use of the Expectation–Maximization algorithm. The observed information matrix for the calculation of standard errors is obtained for each distribution. Simulation studies and an application to the analysis of a data set have been carried out, showing the efficiency of the proposed method in detecting outliers.

12.

Likelihood-based inference for censored linear regression models with scale mixtures of skew-normal distributions

Thalita do Bem Mattos Victor H. Lachos 《Journal of applied statistics》2018,45(11):2039-2066

In many studies, the data collected are subject to some upper and lower detection limits. Hence, the responses are either left or right censored. A complication arises when these continuous measures present heavy tails and asymmetrical behavior simultaneously. For such data structures, we propose a robust censored linear model based on the scale mixtures of skew-normal (SMSN) distributions. The SMSN is an attractive class of asymmetrical heavy-tailed densities that includes the skew-normal, skew-t, skew-slash, skew-contaminated normal and the entire family of scale mixtures of normal (SMN) distributions as special cases. We propose a fast estimation procedure to obtain the maximum likelihood (ML) estimates of the parameters, using a stochastic approximation of the EM (SAEM) algorithm. This approach allows us to estimate the parameters of interest easily and quickly, obtaining as by-products the standard errors, predictions of unobservable values of the response and the log-likelihood function. The proposed methods are illustrated through real data applications and several simulation studies.

13.

A heteroscedastic measurement error model based on skew and heavy-tailed distributions with known error variances

Lorena Cáceres Tomaya 《Journal of Statistical Computation and Simulation》2018,88(11):2185-2200

In this paper, we study inference in a heteroscedastic measurement error model with known error variances. Instead of the normal distribution for the random components, we develop a model that assumes a skew-t distribution for the true covariate and a centred Student's t distribution for the error terms. The proposed model can accommodate skewness and heavy-tailedness in the data, while the degrees of freedom of the distributions can be different. Maximum likelihood estimates are computed via an EM-type algorithm. The behaviour of the estimators is also assessed in a simulation study. Finally, the approach is illustrated with a real data set from a methods comparison study in Analytical Chemistry.

14.

An ECM Estimation Approach for Analyzing Multivariate Skew-Normal Data with Dropout

T. Baghfalaki 《Communications in Statistics - Simulation and Computation》2013,42(10):1970-1988

In this article, an ECM algorithm is developed to obtain the maximum likelihood estimates of parameters where the multivariate skew-normal distribution is used for analyzing longitudinal skewed normal regression data with dropout. A simulation study is performed to investigate the performance of the presented algorithm. Also, the methodology is illustrated through two applications, and the results of the proposed methodology are compared with ECM under the multivariate normal assumption using the AIC and BIC criteria. Standard errors of the parameter estimates are obtained from the asymptotic observed information matrix.

15.

The generalized Gudermannian distribution: inference and volatility modelling

Emrah Altun 《Statistics》2019,53(2):364-386

In this paper, we introduce a new distribution, called the generalized Gudermannian (GG) distribution, and its skew extension for GARCH models in modelling daily Value-at-Risk (VaR). Basic structural properties of the proposed distribution are obtained, including probability density and cumulative distribution functions, moments, and stochastic representation. The maximum likelihood method is used to estimate the unknown parameters of the proposed model, and the finite-sample performance of the maximum likelihood estimates is evaluated by means of a Monte Carlo simulation study. A real data application to the Nikkei 225 index is given to demonstrate the performance of the GARCH model specified under the skew extension of the GG innovation distribution against the normal, Student's t, skew-normal, generalized error and skew generalized error distributions in terms of the accuracy of VaR forecasts. The empirical results show that the GARCH model with GG innovation distribution produces the most accurate VaR forecasts for all confidence levels.

16.

A multiple group item response theory model with centered skew-normal latent trait distributions under a Bayesian framework

Jose R.S. Santos Heleno Bolfarine 《Journal of applied statistics》2013,40(10):2129-2149

Very often, in psychometric research, as in educational assessment, it is necessary to analyze item responses from clustered respondents. The multiple group item response theory (IRT) model proposed by Bock and Zimowski [12] provides a useful framework for analyzing this type of data. In this model, the selected groups of respondents are of specific interest such that group-specific population distributions need to be defined. The usual assumption for parameter estimation in this model, which is that the latent traits are random variables following different symmetric normal distributions, has been questioned in many works found in the IRT literature. Furthermore, when this assumption does not hold, misleading inference can result. In this paper, we consider that the latent traits for each group follow different skew-normal distributions, under the centered parameterization. We call this the skew multiple group IRT model. This modeling extends the works of Azevedo et al. [4], Bazán et al. [11] and Bock and Zimowski [12] (concerning the latent trait distribution). Our approach ensures that the model is identifiable. We propose two Markov chain Monte Carlo (MCMC) algorithms for parameter estimation and compare them with respect to convergence. A simulation study was performed in order to evaluate parameter recovery for the proposed model and the selected algorithm concerning convergence issues. Results reveal that the proposed algorithm properly recovers all model parameters. Furthermore, we analyzed a real data set which presents asymmetry concerning the latent traits distribution. The results obtained by using our approach confirmed the presence of negative asymmetry for some latent trait distributions.

17.

A mixture latent variable model for modeling mixed data in heterogeneous populations and its applications

Leila Amiri Mojtaba Khazaei Mojtaba Ganjali 《AStA Advances in Statistical Analysis》2018,102(1):95-115

Latent variable models are widely used for joint modeling of mixed data including nominal, ordinal, count and continuous data. In this paper, we consider a latent variable model for jointly modeling relationships between mixed binary, count and continuous variables with some observed covariates. We assume that, given a latent variable, the mixed variables of interest are independent and that the count and continuous variables have Poisson and normal distributions, respectively. As such data may be extracted from different subpopulations, unobserved heterogeneity has to be taken into account. A mixture distribution is considered for the distribution of the latent variable, which accounts for the heterogeneity. The generalized EM algorithm, which uses the Newton–Raphson algorithm inside the EM algorithm, is used to compute the maximum likelihood estimates of the parameters. The standard errors of the maximum likelihood estimates are computed by using the supplemented EM algorithm. Analysis of the primary biliary cirrhosis data is presented as an application of the proposed model.

18.

Bayesian analysis of censored linear regression models with scale mixtures of normal distributions

Aldo M. Garay Heleno Bolfarine Celso R.B. Cabral 《Journal of applied statistics》2015,42(12):2694-2714

As is the case in many studies, the data collected are limited and an exact value is recorded only if it falls within an interval range. Hence, the responses can be either left, interval or right censored. Linear (and nonlinear) regression models are routinely used to analyze these types of data and are based on normality assumptions for the error terms. However, those analyses might not provide robust inference when the normality assumptions are questionable. In this article, we develop a Bayesian framework for censored linear regression models by replacing the Gaussian assumptions for the random errors with scale mixtures of normal (SMN) distributions. The SMN is an attractive class of symmetric heavy-tailed densities that includes the normal, Student-t, Pearson type VII, slash and the contaminated normal distributions as special cases. Using a Bayesian paradigm, an efficient Markov chain Monte Carlo algorithm is introduced to carry out posterior inference. A new hierarchical prior distribution is suggested for the degrees of freedom parameter in the Student-t distribution. The likelihood function is utilized not only to compute some Bayesian model selection measures but also to develop Bayesian case-deletion influence diagnostics based on the q-divergence measure. The proposed Bayesian methods are implemented in the R package BayesCR. The newly developed procedures are illustrated with applications using real and simulated data.
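
The statement that the Student-t distribution belongs to the SMN class can be illustrated with a short sampling sketch. It uses the standard representation Y = mu + sigma · Z / √U with Z ~ N(0, 1) and U ~ Gamma(shape ν/2, rate ν/2). This is not code from the BayesCR package; the function name and arguments are assumptions made for illustration.

```python
import random

def sample_student_t_smn(nu, mu=0.0, sigma=1.0, rng=random):
    """Draw one Student-t variate with nu degrees of freedom through its
    scale-mixture-of-normals representation (illustrative sketch)."""
    # random.gammavariate takes (shape, scale); scale 2/nu equals rate nu/2,
    # so the mixing variable U has mean 1.
    u = rng.gammavariate(nu / 2.0, 2.0 / nu)
    z = rng.gauss(0.0, 1.0)
    # Small values of u inflate the scale, which produces the heavy tails.
    return mu + sigma * z / (u ** 0.5)

rng = random.Random(0)
draws = [sample_student_t_smn(nu=10.0, rng=rng) for _ in range(20000)]
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
# For nu > 2 the theoretical variance is sigma**2 * nu / (nu - 2), i.e. 1.25 here.
print(f"sample mean={mean:.3f}, sample variance={var:.3f}")
```

Other members of the SMN class follow by changing only the distribution of the mixing variable `u`, which is why the class lends itself to hierarchical (data-augmentation) algorithms.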

19.

Influence diagnostics for censored regression models with autoregressive errors

Fernanda L. Schumacher Victor H. Lachos Filidor E. Vilca‐Labra Luis M. Castro 《Australian & New Zealand Journal of Statistics》2018,60(2):209-229

Observations collected over time are often autocorrelated rather than independent, and sometimes include observations below or above detection limits (i.e. censored values reported as less or more than a level of detection) and/or missing data. Practitioners commonly disregard censored data cases or replace these observations with some function of the limit of detection, which often results in biased estimates. Moreover, parameter estimation can be greatly affected by the presence of influential observations in the data. In this paper we derive local influence diagnostic measures for censored regression models with autoregressive errors of order p (hereafter, AR(p)‐CR models) on the basis of the Q‐function under three useful perturbation schemes. In order to account for censoring in a likelihood‐based estimation procedure for AR(p)‐CR models, we used a stochastic approximation version of the expectation‐maximisation algorithm. The accuracy of the local influence diagnostic measure in detecting influential observations is explored through the analysis of empirical studies. The proposed methods are illustrated using data, from a study of total phosphorus concentration, that contain left‐censored observations. These methods are implemented in the R package ARCensReg.

20.

Regression and correlation for 3 × 3 rotation matrices

Louis‐Paul Rivest Ted Chang 《Revue canadienne de statistique》2006,34(2):187-202

This paper investigates a regression model for orthogonal matrices introduced by Prentice (1989). It focuses on the special case of 3 × 3 rotation matrices. The model under study expresses the dependent rotation matrix V as A₁UA₂ᵗ perturbed by experimental errors, where A₁ and A₂ are unknown 3 × 3 rotation matrices and U is an explanatory 3 × 3 rotation matrix. Several specifications for the errors in this regression model are proposed. The asymptotic distributions, as the sample size n becomes large or as the experimental errors become small, of the least squares estimators for A₁ and A₂ are derived. A new algorithm for calculating the least squares estimates of A₁ and A₂ is presented. The independence model is not a submodel of Prentice's regression model, thus the independence between the U and the V sample cannot be tested when fitting Prentice's model. To overcome this difficulty, permutation tests of independence are investigated. Examples dealing with postural variations of subjects performing a drilling task and with the calibration of a camera system for motion analysis using a magnetic tracking device illustrate the methodology of this paper.
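
The noise-free part of the model, in which V is obtained from U by multiplying with A₁ on the left and the transpose of A₂ on the right, can be sketched in a few lines. The code below only builds that systematic part and checks that the result is again a rotation; it does not implement the least squares algorithm of the paper. All function names and the example angles are assumptions chosen for illustration.

```python
import math

def rot_x(angle):
    """Rotation by `angle` radians about the x-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_z(angle):
    """Rotation by `angle` radians about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3 x 3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def predicted_rotation(a1, u, a2):
    """Systematic part of the model: A1 * U * transpose(A2), without errors."""
    return matmul(matmul(a1, u), transpose(a2))

v = predicted_rotation(rot_z(0.3), rot_x(1.1), rot_z(-0.7))
# A product of rotations is a rotation, so the determinant should be 1
# up to floating-point rounding.
print(round(det3(v), 12))
```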