1.

Outlier detection using difference-based variance estimators in multiple regression

Chun Gun Park 《Communications in Statistics - Theory and Methods》2018,47(24):5986-6001

In this article, we propose an outlier detection approach in a multiple regression model using the properties of a difference-based variance estimator. This type of difference-based variance estimator was originally used to estimate error variance in a nonparametric regression model without estimating the nonparametric function. This article first employed a difference-based error variance estimator to study the outlier detection problem in a multiple regression model. Our approach uses a leave-one-out type method based on the difference-based error variance. Existing outlier detection approaches that use leave-one-out are highly affected by other outliers, while ours is not, because it does not use the regression coefficient estimator. We compared our approach with several existing methods in a simulation study, and the results suggest that our approach performs better. The advantages of our approach are demonstrated using a real data application. Our approach can be extended to the nonparametric regression model for outlier detection.
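The idea of estimating error variance without fitting the regression function can be illustrated with the classical first-order difference estimator from nonparametric regression. The sketch below is only an illustration of that general idea, not the outlier detection procedure of the article; the function name `rice_variance` and the simulated data are assumptions made for the example.

```python
import numpy as np

def rice_variance(y):
    """First-order difference-based estimate of the error variance.

    Assumes y is ordered by the design points and the mean function is
    smooth, so adjacent differences mostly cancel the mean and leave noise.
    """
    y = np.asarray(y, dtype=float)
    d = np.diff(y)  # y[i+1] - y[i]
    # Each squared difference has expectation of about 2 * sigma^2.
    return float(np.sum(d ** 2) / (2.0 * (len(y) - 1)))

# Hypothetical data: smooth signal plus noise with true variance 0.25.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 2000)
y = np.sin(2.0 * np.pi * x) + rng.normal(scale=0.5, size=x.size)
sigma2_hat = rice_variance(y)  # no regression function is fitted
```

Because the estimator never fits coefficients, a single aberrant observation affects only the two differences it takes part in, which is the property the abstract relies on.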

2.

Regression Depth with Censored and Truncated Data

《Communications in Statistics - Theory and Methods》2013,42(5):997-1008

Abstract

In this article, we consider the problem of estimating regression coefficients for a linear model with censored and truncated data based on regression depth. Any line can be given a rank using regression depth, and the deepest regression line is the line with the maximum regression depth. We propose a method to define the regression depth of a line in the presence of censoring and truncation. We show how the proposed regression performs by analyzing Stanford heart transplant data and AIDS incubation data.

3.

Regression methods for high dimensional multicollinear data

Lorna S. Aucott Paul H. Garthwaite James Currall 《Communications in Statistics - Simulation and Computation》2013,42(4):1021-1037

To compare their performance on high dimensional data, several regression methods are applied to data sets in which the number of exploratory variables greatly exceeds the sample sizes. The methods are stepwise regression, principal components regression, two forms of latent root regression, partial least squares, and a new method developed here. The data are four sample sets for which near infrared reflectance spectra have been determined and the regression methods use the spectra to estimate the concentration of various chemical constituents, the latter having been determined by standard chemical analysis. Thirty-two regression equations are estimated using each method and their performances are evaluated using validation data sets. Although it is the most widely used, stepwise regression was decidedly poorer than the other methods considered. Differences between the latter were small with partial least squares performing slightly better than other methods under all criteria examined, albeit not by a statistically significant amount.

4.

Bayesian principal component regression with data-driven component selection

Liuxia Wang 《Journal of applied statistics》2012,39(6):1177-1189

Principal component regression (PCR) has two steps: estimating the principal components and performing the regression using these components. These steps generally are performed sequentially. In PCR, a crucial issue is the selection of the principal components to be included in regression. In this paper, we build a hierarchical probabilistic PCR model with a dynamic component selection procedure. A latent variable is introduced to select promising subsets of components based upon the significance of the relationship between the response variable and principal components in the regression step. We illustrate this model using real and simulated examples. The simulations demonstrate that our approach outperforms some existing methods in terms of root mean squared error of the regression coefficient.
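The two sequential steps of classical PCR described at the start of this abstract can be sketched as follows. This is a minimal illustration of the standard two-step procedure, not the hierarchical Bayesian model the paper proposes; the function name `pcr_fit`, the fixed choice of the first `k` components, and the simulated data are assumptions made for the example.

```python
import numpy as np

def pcr_fit(X, y, k):
    """Principal component regression using the first k components."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # Step 1: principal directions from the SVD of the centered predictors.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T                      # (p, k) loadings
    Z = Xc @ V                        # (n, k) component scores
    # Step 2: least squares of the centered response on the scores.
    gamma, *_ = np.linalg.lstsq(Z, y - y_mean, rcond=None)
    beta = V @ gamma                  # coefficients on the original predictors
    intercept = y_mean - x_mean @ beta
    return intercept, beta

# Hypothetical noiseless data; with k equal to the number of predictors,
# PCR coincides with ordinary least squares.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5])
intercept, beta = pcr_fit(X, y, k=3)
```

Here the components are kept in order of explained variance, which ignores the response; choosing components by their relationship with the response is the selection issue the paper addresses.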

5.

Applying Least Absolute Deviation Regression to Regression-type Estimation of the Index of a Stable Distribution Using the Characteristic Function

J. Martin Van Zyl 《Communications in Statistics - Simulation and Computation》2015,44(9):2442-2462

Least absolute deviation regression is applied using a fixed number of points for all values of the index to estimate the index and scale parameter of the stable distribution using regression methods based on the empirical characteristic function. The recognized fixed number of points estimation procedure uses ten points in the interval zero to one, and least squares estimation. It is shown that using the more robust least absolute regression based on iteratively re-weighted least squares outperforms the least squares procedure with respect to bias and also mean square error in smaller samples.
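Least absolute deviation regression via iteratively re-weighted least squares, as mentioned in this abstract, can be sketched in general form as below. This is an assumed generic implementation, not the paper's estimator for stable distributions; the function name `lad_irls`, the floor `eps` on the residuals, and the toy data are illustrative choices.

```python
import numpy as np

def lad_irls(X, y, n_iter=100, eps=1e-8):
    """Least absolute deviation fit via iteratively reweighted least squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # start from least squares
    for _ in range(n_iter):
        r = y - X @ beta
        # Weights 1/|r| turn the weighted squared loss into an absolute loss;
        # eps guards against division by zero for exactly fitted points.
        w = 1.0 / np.maximum(np.abs(r), eps)
        sw = np.sqrt(w)
        beta_new, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        converged = np.max(np.abs(beta_new - beta)) < 1e-10
        beta = beta_new
        if converged:
            break
    return beta

# Hypothetical data: an exact line y = 1 + 2x with one gross outlier.
x = np.linspace(0.0, 1.0, 21)
y = 1.0 + 2.0 * x
y[10] += 10.0
X = np.column_stack([np.ones_like(x), x])
beta_lad = lad_irls(X, y)
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
```

In this toy example the least squares intercept is pulled toward the outlier, while the LAD fit stays close to the underlying line.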

6.

Dimension Reduction in Regressions through Weighted Variance Estimation

Li-Ping Zhu Ya-Ni Yang Li-Xing Zhu 《Communications in Statistics - Theory and Methods》2013,42(11):1929-1944

Because sliced inverse regression (SIR) using the conditional mean of the inverse regression fails to recover the central subspace when the inverse regression mean degenerates, sliced average variance estimation (SAVE) using the conditional variance was proposed in the sufficient dimension reduction literature. However, the efficacy of SAVE depends heavily upon the number of slices. In the present article, we introduce a class of weighted variance estimation (WVE), which, similar to SAVE and simple contour regression (SCR), uses the conditional variance of the inverse regression to recover the central subspace. The strong consistency and the asymptotic normality of the kernel estimation of WVE are established under mild regularity conditions. Finite sample studies are carried out for comparison with existing methods, and an application to a real data set is presented for illustration.

7.

Confidence intervals for the regression coefficient in a simple regression model with a balanced two-fold nested error structure

Dong Joon Park 《Communications in Statistics - Theory and Methods》2013,42(17):5053-5065

ABSTRACT

In applications using a simple regression model with a balanced two-fold nested error structure, interest focuses on inferences concerning the regression coefficient. This article derives exact and approximate confidence intervals on the regression coefficient in the simple regression model with a balanced two-fold nested error structure. Eleven methods are considered for constructing the confidence intervals on the regression coefficient. Computer simulation is performed to compare the proposed confidence intervals. Recommendations are suggested for selecting an appropriate method.

8.

Bayesian inference for bivariate generalized linear models in diagnosing renal arterial obstruction

Mehmet A. Cengiz 《Statistical Methodology》2005,2(3):168-174

Generalized linear models are well-established generalizations of the linear models used for regression and analysis of variance. They allow flexible mean structures and general distributions, other than the linear link and normal response assumed in regression. Further enhancements using ideas from multivariate analysis improve power and precision by modelling dependencies between response variables. This paper focuses on the specific case of regression models for bivariate Bernoulli responses and investigates their analysis using a Bayesian approach. The important problem of renal arterial obstruction is considered, as a medical application of these models.

9.

Semiparametric multiple kernel estimators and model diagnostics for count regression functions

Lamia Djerroud Tristan Senga Kiessé Smail Adjabi 《Communications in Statistics - Theory and Methods》2020,49(9):2131-2157

Abstract

This study concerns semiparametric approaches to estimate discrete multivariate count regression functions. The semiparametric approaches investigated consist of combining discrete multivariate nonparametric kernel and parametric estimations such that (i) prior knowledge of the conditional distribution of the model response may be incorporated and (ii) the bias of the traditional nonparametric Nadaraya-Watson kernel regression estimator may be reduced. We are specifically interested in the combination of the two estimation approaches and in some asymptotic properties of the resulting estimators. Asymptotic normality results are shown for the nonparametric correction terms of the parametric start function of the estimators. The performance of the discrete semiparametric multivariate kernel estimators studied is illustrated using simulations and real count data. In addition, diagnostic checks are performed to test the adequacy of the parametric start model to the true discrete regression model. Finally, using discrete semiparametric multivariate kernel estimators provides a bias reduction when the parametric multivariate regression model used as the start regression function belongs to a neighborhood of the true regression model.

10.

Beta Regression for Modelling Rates and Proportions (total citations: 9; self-citations: 0; citations by others: 9)

Silvia Ferrari Francisco Cribari-Neto 《Journal of applied statistics》2004,31(7):799-815

This paper proposes a regression model where the response is beta distributed using a parameterization of the beta law that is indexed by mean and dispersion parameters. The proposed model is useful for situations where the variable of interest is continuous and restricted to the interval (0, 1) and is related to other variables through a regression structure. The regression parameters of the beta regression model are interpretable in terms of the mean of the response and, when the logit link is used, of an odds ratio, unlike the parameters of a linear regression that employs a transformed response. Estimation is performed by maximum likelihood. We provide closed-form expressions for the score function, for Fisher's information matrix and its inverse. Hypothesis testing is performed using approximations obtained from the asymptotic normality of the maximum likelihood estimator. Some diagnostic measures are introduced. Finally, practical applications that employ real data are presented and discussed.
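For orientation, a beta law indexed by its mean can be written as below. The abstract does not give the formula, so the notation is an assumption: $\mu$ denotes the mean, $\phi$ a precision parameter (larger $\phi$ means smaller dispersion), $x_i$ a covariate vector, and $\beta$ the regression coefficients under the logit link.

```latex
f(y;\mu,\phi) = \frac{\Gamma(\phi)}{\Gamma(\mu\phi)\,\Gamma\bigl((1-\mu)\phi\bigr)}\,
                y^{\mu\phi-1}(1-y)^{(1-\mu)\phi-1}, \qquad 0<y<1,

\mathrm{E}(y) = \mu, \qquad
\mathrm{Var}(y) = \frac{\mu(1-\mu)}{1+\phi},

\log\frac{\mu_i}{1-\mu_i} = x_i^{\top}\beta .
```

This corresponds to the usual $\mathrm{Beta}(a,b)$ density with $a=\mu\phi$ and $b=(1-\mu)\phi$, so that $a+b=\phi$ and the mean $a/(a+b)$ equals $\mu$.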

11.

Errors-in-variables regression using estimated latent variables

Shigeru Iwata 《Econometric Reviews》1992,11(2):195-200

This note considers a method for estimating regression parameters from data containing measurement errors using some natural estimates of the unobserved explanatory variables. It is shown that the resulting estimator is consistent not only in the usual linear regression model but also in the probit model and regression models with censorship or truncation. However, it fails to be consistent in nonlinear regression models except for special cases.

12.

Regression Kink With an Unknown Threshold (total citations: 1; self-citations: 0; citations by others: 1)

Bruce E. Hansen 《Journal of Business & Economic Statistics》2017,35(2):228-240

This article explores estimation and inference in a regression kink model with an unknown threshold. A regression kink model (or continuous threshold model) is a threshold regression constrained to be everywhere continuous with a kink at an unknown threshold. We present methods for estimation, to test for the presence of the threshold, for inference on the regression parameters, and for inference on the regression function. A novel finding is that inference on the regression function is nonstandard since the regression function is a nondifferentiable function of the parameters. We apply recently developed methods for inference on nondifferentiable functions. The theory is illustrated by an application to the growth and debt problem introduced by Reinhart and Rogoff, using their long-span time-series for the United States.

13.

Robust estimation for functional coefficient regression models with spatial data

Qingguo Tang 《Statistics》2013,47(2):388-404

A global smoothing procedure is developed using B-spline function approximation for estimating the unknown functions of a functional coefficient regression model with spatial data. A general formulation is used to treat mean regression, median regression, quantile regression and robust mean regression in one setting. The global convergence rates of the estimators of unknown coefficient functions are established. Various applications of the main results, including estimating conditional quantile coefficient functions and robustifying the mean regression coefficient functions are given. Finite sample properties of our procedures are studied through Monte Carlo simulations. A housing data example is used to illustrate the proposed methodology.

14.

Bayesian Semiparametric Modelling in Quantile Regression

ATHANASIOS KOTTAS MILOVAN KRNJAJIĆ 《Scandinavian Journal of Statistics》2009,36(2):297-319

Abstract. We propose a Bayesian semiparametric methodology for quantile regression modelling. In particular, working with parametric quantile regression functions, we develop Dirichlet process mixture models for the error distribution in an additive quantile regression formulation. The proposed non‐parametric prior probability models allow the shape of the error density to adapt to the data and thus provide more reliable predictive inference than models based on parametric error distributions. We consider extensions to quantile regression for data sets that include censored observations. Moreover, we employ dependent Dirichlet processes to develop quantile regression models that allow the error distribution to change non‐parametrically with the covariates. Posterior inference is implemented using Markov chain Monte Carlo methods. We assess and compare the performance of our models using both simulated and real data sets.

15.

Bias Reduction in Logistic Regression with Missing Responses When the Missing Data Mechanism is Nonignorable

《The American statistician》2012,66(4):340-349

ABSTRACT

In logistic regression with nonignorable missing responses, Ibrahim and Lipsitz proposed a method for estimating regression parameters. It is known that the regression estimates obtained by using this method are biased when the sample size is small. Also, another complexity arises when the iterative estimation process encounters separation in estimating regression coefficients. In this article, we propose a method to improve the estimation of regression coefficients. In our likelihood-based method, we penalize the likelihood by multiplying it by a noninformative Jeffreys prior as a penalty term. The proposed method reduces bias and is able to handle the issue of separation. Simulation results show substantial bias reduction for the proposed method as compared to the existing method. Analyses using real world data also support the simulation findings. An R package called brlrmr is developed implementing the proposed method and the Ibrahim and Lipsitz method.

16.

Errors-in-variables regression using estimated latent variables

Shigeru Iwata 《Econometric Reviews》2013,32(2):195-200

This note considers a method for estimating regression parameters from data containing measurement errors using some natural estimates of the unobserved explanatory variables. It is shown that the resulting estimator is consistent not only in the usual linear regression model but also in the probit model and regression models with censorship or truncation. However, it fails to be consistent in nonlinear regression models except for special cases.

17.

Back propagation neural networks and multiple regressions in the case of heteroskedasticity

Chinmoy Paul Gajendra K. Vishwakarma 《Communications in Statistics - Simulation and Computation》2017,46(9):6772-6789

This paper compares the performance of regression analysis and a clustering-based neural network approach when the data deviate from the homoscedasticity assumption of regression. Heteroskedasticity is a problem that arises in linear regression due to unequal error variances. One of the methods to deal with heteroskedasticity in classical regression theory is weighted least-squares regression (WLS). In order to deal with the problem of heteroskedasticity, a backpropagation neural network is applied. In this context, an algorithm is proposed which is based on robust estimates of the location and dispersion matrix and helps in preserving the error assumption of the linear regression. Analysis is carried out with appropriate designs using simulated data and the results are presented.

18.

Robust Tests in Semiparametric Partly Linear Models

ANA BIANCO GRACIELA BOENTE ELENA MARTÍNEZ 《Scandinavian Journal of Statistics》2006,33(3):435-450

Abstract. This paper focuses on the problem of testing the null hypothesis that the regression parameter equals a fixed value under a semiparametric partly linear regression model by using a three-step robust estimate for the regression parameter and the regression function. Two families of test statistics are considered and their asymptotic distributions are studied under the null hypothesis and under contiguous alternatives. A Monte Carlo study is performed to compare the finite sample behaviour of the proposed tests with the classical one.

19.

Variance Estimation in Spatial Regression Using a Non-parametric Semivariogram Based on Residuals

Hyon-Jung Kim Dennis D. Boos 《Scandinavian Journal of Statistics》2004,31(3):387-401

Abstract. The empirical semivariogram of residuals from a regression model with stationary errors may be used to estimate the covariance structure of the underlying process. For prediction (kriging) the bias of the semivariogram estimate induced by using residuals instead of errors has only a minor effect because the bias is small for small lags. However, for estimating the variance of estimated regression coefficients and of predictions, the bias due to using residuals can be quite substantial. Thus we propose a method for reducing this bias. The adjusted empirical semivariogram is then isotonized and made conditionally negative-definite and used to estimate the variance of estimated regression coefficients in a general estimating equations setup. Simulation results for least squares and robust regression show that the proposed method works well in linear models with stationary correlated errors.

20.

Dimensionality reduction approach to multivariate prediction

Giovanni M. Merola Bovas Abraham 《Revue canadienne de statistique》2001,29(2):191-200

The authors consider dimensionality reduction methods used for prediction, such as reduced rank regression, principal component regression and partial least squares. They show how it is possible to obtain intermediate solutions by estimating simultaneously the latent variables for the predictors and for the responses. They obtain a continuum of solutions that goes from reduced rank regression to principal component regression via maximum likelihood and least squares estimation. Different solutions are compared using simulated and real data.