The varying coefficient model (VCM) is an important generalization of the linear regression model. Many existing estimation procedures for the VCM are built on L2 loss, which is popular for its mathematical beauty but is not robust to non-normal errors and outliers. In this paper, we address both the robustness and the efficiency of estimation and variable selection for the VCM by using a convex combination of L1 and L2 loss instead of quadratic loss alone. Using the local linear modeling method, the asymptotic normality of the estimator is derived and a useful selection method is proposed for the weight in the composite L1 and L2 loss. The variable selection procedure is then obtained by combining local kernel smoothing with the adaptive group LASSO. With appropriate selection of the tuning parameters by the Bayesian information criterion (BIC), the theoretical properties of the new procedure, including consistency in variable selection and the oracle property in estimation, are established. The finite sample performance of the new method is investigated through simulation studies and the analysis of body fat data. Numerical studies show that the new method is better than, or at least as good as, the least squares-based method in terms of both robustness and efficiency for variable selection.

The purpose of this paper is to present a parallel implementation of multiple linear regression. We discuss the multiple linear regression model. Traditionally parallelism has been used for either speed up or redundancy (hence reliability). With stochastic data, by clever parsing and algorithm development, it is possible to achieve both speed and reliability enhancement. We demonstrate this with multiple linear regression.

While Bayesian analogues of lasso regression have become popular, comparatively little has been said about formal treatments of model uncertainty in such settings. This paper describes methods that can be used to evaluate the posterior distribution over the space of all possible regression models for Bayesian lasso regression. Access to the model space posterior distribution is necessary if model-averaged inference (e.g., model-averaged prediction and calculation of posterior variable inclusion probabilities) is desired. The key element of all such inference is the ability to evaluate the marginal likelihood of the data under a given regression model, which has so far proved difficult for the Bayesian lasso. This paper describes how the marginal likelihood can be accurately computed when the number of predictors in the model is not too large, allowing for model space enumeration when the total number of possible predictors is modest. In cases where the total number of possible predictors is large, a simple Markov chain Monte Carlo approach for sampling the model space posterior is provided. This Gibbs sampling approach is similar in spirit to the stochastic search variable selection methods that have become one of the main tools for addressing Bayesian regression model uncertainty, and the adaptation of these methods to the Bayesian lasso is shown to be straightforward.

Lifetime Data Analysis - Assuming Cox’s regression model, we consider a penalized full likelihood approach to conduct variable selection under nested case–control (NCC) sampling....

The Jackknife-after-bootstrap (JaB) technique, originally developed by Efron [Jackknife-after-bootstrap standard errors and influence functions, J. R. Stat. Soc. 54 (1992), pp. 83–127], has been proposed as an approach to improve the detection of influential observations in linear regression models by Martin and Roberts [Jackknife-after-bootstrap regression influence diagnostics, J. Nonparametr. Stat. 22 (2010), pp. 257–269. doi: 10.1080/10485250903287906] and Beyaztas and Alin [Jackknife-after-bootstrap method for detection of influential observations in linear regression model, Comm. Statist. Simulation Comput. 42 (2013), pp. 1256–1267. doi: 10.1080/03610918.2012.661908]. The method is based on the use of percentile-method confidence intervals to provide improved cut-off values for several single case-deletion influence measures. In order to improve JaB, we propose using robust versions of the bias-corrected and accelerated (BCa) bootstrap confidence intervals of Efron [Better bootstrap confidence intervals, J. Amer. Statist. Assoc. 82 (1987), pp. 171–185. doi: 10.1080/01621459.1987.10478410]. In this study, the performances of the robust BCa–JaB and conventional JaB methods are compared for the DFFITS, Welsch's distance and modified Cook's distance influence diagnostics. Comparisons are based on both real data examples and a simulation study. Our results reveal that, under a variety of scenarios, our proposed method provides more accurate and reliable results, and it is more robust to masking effects.

We propose a penalized minimum φ-divergence estimator for parameter estimation and variable selection in logistic regression. Using an appropriate penalty function, we show that the penalized φ-divergence estimator has the oracle property. With probability tending to 1, the penalized φ-divergence estimator identifies the true model and estimates the nonzero coefficients as efficiently as if the sparsity of the true model were known in advance. The advantage of the penalized φ-divergence estimator is that it estimates the nonzero parameters more efficiently than the penalized maximum likelihood estimator when the sample size is small and is equivalent to it when the sample size is large. Numerical simulations confirm our findings.

In this note, we propose a new method for selecting the bandwidth parameter in non-parametric regression. While standard criteria, such as cross-validation, are based on the true regression curve about which we know little, we propose a criterion which focuses on the true errors about which assumptions may be made. Our proposal is to choose the bandwidth for which the residuals are as uncorrelated as possible. We use the Box-Pierce statistic as the objective to be minimized. In doing so, the behaviour of our residuals will be close to that of the true errors under the hypothesis of independent errors. A simulation study shows that our method succeeds in capturing the main features of the regression curve, in the sense that the number of turning-points of the curve is correctly estimated most of the time.
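The bandwidth criterion described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the Nadaraya-Watson smoother with a Gaussian kernel, the candidate grid, and the number of lags `m` are all choices made here for the example, and the function names are hypothetical.

```python
import numpy as np

def nw_fit(x, y, h):
    """Nadaraya-Watson fit with a Gaussian kernel, evaluated at the design points.
    (Assumed smoother; the abstract does not say which one is used.)"""
    d = (x[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * d ** 2)
    return (w @ y) / w.sum(axis=1)

def box_pierce(res, m):
    """Box-Pierce statistic Q = n * sum_{k=1..m} r_k^2, where r_k is the
    lag-k sample autocorrelation of the residuals."""
    n = len(res)
    r = res - res.mean()
    denom = np.sum(r ** 2)
    acf = np.array([np.sum(r[k:] * r[:-k]) / denom for k in range(1, m + 1)])
    return n * np.sum(acf ** 2)

def select_bandwidth(x, y, grid, m=5):
    """Return the bandwidth in `grid` whose residuals have the smallest
    Box-Pierce statistic, together with all the scores."""
    order = np.argsort(x)            # residuals must follow the order of x
    x, y = x[order], y[order]
    scores = [box_pierce(y - nw_fit(x, y, h), m) for h in grid]
    return grid[int(np.argmin(scores))], scores
```

The grid should exclude bandwidths so small that the smoother interpolates the data, since residuals that are numerically zero make the autocorrelations meaningless.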

The discussion on the use and misuse of p-values in 2016 by the American Statistical Association was a timely assertion that statistical concepts should be properly used in science. Some researchers, especially economists, who adopt significance testing and p-values to report their results may have felt confused by the statement, leading to misinterpretations of it. In this study, we aim to re-examine the accuracy of the p-value and introduce an alternative way of testing hypotheses. We conduct a simulation study to investigate the reliability of the p-value. Apart from investigating the performance of the p-value, we also present two existing approaches, Minimum Bayes Factors (MBFs) and belief functions, as replacements for the p-value. Results from the simulation study confirm that the p-value is unreliable in some cases and that the proposed approaches appear useful as substitute tools in statistical inference. Moreover, our results show that the plausibility approach is more accurate for making decisions about the null hypothesis than the traditionally used p-values when the null hypothesis is true. However, the MBFs of Edwards et al. [Bayesian statistical inference for psychological research. Psychol. Rev. 70(3) (1963), pp. 193–242]; Vovk [A logic of probability, with application to the foundations of statistics. J. Royal Statistical Soc. Series B (Methodological) 55 (1993), pp. 317–351] and Sellke et al. [Calibration of p values for testing precise null hypotheses. Am. Stat. 55(1) (2001), pp. 62–71] provide more reliable results compared to all other methods when the null hypothesis is false. KEYWORDS: Ban of P-value, Minimum Bayes Factors, belief functions
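To make the idea of a minimum Bayes factor concrete, the sketch below computes two commonly quoted calibrations of a p-value. It is a minimal illustration, not the study's code: the bound -e·p·ln(p) for p < 1/e is the calibration of Sellke et al., and the bound exp(-z²/2) for a two-sided normal test is the one usually attributed to Edwards et al.; the function names are hypothetical, and the abstract does not state the exact forms used in the simulations.

```python
import math
from statistics import NormalDist

def mbf_sellke(p):
    """Lower bound -e * p * ln(p) on the Bayes factor in favour of the null,
    valid for p < 1/e; the bound is 1 (no evidence) otherwise."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.e * p * math.log(p) if p < 1.0 / math.e else 1.0

def mbf_normal(p):
    """Lower bound exp(-z^2 / 2) for a two-sided test of a normal mean,
    where z is the z-score matching the two-sided p-value."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    z = NormalDist().inv_cdf(1.0 - p / 2.0)
    return math.exp(-0.5 * z * z)
```

For p = 0.05 these give roughly 0.41 and 0.15, respectively, so the data favour the alternative by at most a factor of about 2.5 or 7, which is far weaker evidence than "1 in 20" suggests.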


This article studies the consistency of a local density regression model under a supremum Hellinger distance. Such a model applies a piecewise structure in which a mixture of Dirichlet process (MDP) model is assigned as the fixed density on each piece. The piecewise construction is a straightforward way to establish sup–Hellinger consistency in a regression setting. A specific piecewise density example is presented in a simulation study.

The orthogonalization of undesigned experiments is introduced to increase the statistical precision of the estimated regression coefficients. The goals are to minimize the covariance and the bias of the least squares estimator for estimating the path of steepest ascent (SA) that leads users toward the neighbourhood of the optimum response. An orthogonal design is established to decrease the inverse of the determinant of X′X and the angle between the true and the estimated SA paths. For the orthogonalization of an undesigned matrix, our proposed solution is built on the modified Gram–Schmidt strategy, which is related to the process of Gaussian elimination. The proposed solution offers an orthogonal basis, in full working accuracy, for the space spanned by the columns of the original matrix.
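For readers unfamiliar with the modified Gram–Schmidt strategy mentioned above, a minimal sketch of the textbook algorithm is given here. It is the standard column-wise procedure, not necessarily the variant proposed in the paper, and the function name is hypothetical.

```python
import numpy as np

def modified_gram_schmidt(X):
    """Textbook modified Gram-Schmidt: factor X (n x p, full column rank)
    as X = Q R with orthonormal columns in Q and upper-triangular R."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Q = X.copy()
    R = np.zeros((p, p))
    for j in range(p):
        R[j, j] = np.linalg.norm(Q[:, j])
        if R[j, j] == 0.0:
            raise ValueError("columns of X are linearly dependent")
        Q[:, j] /= R[j, j]
        # Remove the component along q_j from every remaining column
        # immediately; this is what distinguishes the modified variant
        # from classical Gram-Schmidt and improves numerical stability.
        for k in range(j + 1, p):
            R[j, k] = Q[:, j] @ Q[:, k]
            Q[:, k] -= R[j, k] * Q[:, j]
    return Q, R
```

The columns of `Q` span the same space as the columns of `X`, so regressing on `Q` gives an orthogonal design for the same fitted values.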

A generalization of Zellner's balanced loss function is proposed according to the unified theory of least squares under a general Gauss–Markoff model. The admissibility of linear estimators is investigated under the balanced loss function. Necessary and sufficient conditions for linear estimators to be admissible in the classes of homogeneous and nonhomogeneous linear estimators, respectively, are obtained.

In this paper a test for model selection is proposed which extends the usual goodness-of-fit test in several ways. First, it is assumed that the underlying distribution H depends on a covariate value in a fixed design setting. Secondly, instead of one parametric class we consider two competing classes, one of which may contain the underlying distribution. The test allows one to select whichever of two equally treated model classes fits the underlying distribution better. Various measures are available to define the distance between distributions; here the Cramér-von Mises distance has been chosen. The null hypothesis that both parametric classes have the same distance to the underlying distribution H can be checked by means of a test statistic, the asymptotic properties of which are shown under a set of suitable conditions. The performance of the test is demonstrated by Monte Carlo simulations. Finally, the procedure is applied to a data set from an endurance test on electric motors.

The purpose of this paper is to develop a new linear regression model for count data, namely the generalized-Poisson Lindley (GPL) linear model. The GPL linear model is obtained by applying the generalized linear model framework to the GPL distribution. The model parameters are estimated by maximum likelihood. We use the GPL linear model to fit two real data sets and compare it with the Poisson, negative binomial (NB) and Poisson-weighted exponential (P-WE) models for count data. It is found that the GPL linear model can fit over-dispersed count data, and that it has the highest log-likelihood and the smallest AIC and BIC values. Consequently, the linear regression model based on the GPL distribution is a valuable alternative to the Poisson, NB, and P-WE models.

We develop a Bayesian analysis for the class of Birnbaum–Saunders nonlinear regression models introduced by Lemonte and Cordeiro (Comput Stat Data Anal 53:4441–4452, 2009). This regression model, which is based on the Birnbaum–Saunders distribution (Birnbaum and Saunders in J Appl Probab 6:319–327, 1969a), has been used successfully to model fatigue failure times. We consider a Bayesian analysis under a normal-gamma prior. Due to the complexity of the model, Markov chain Monte Carlo methods are used to develop a Bayesian procedure for the considered model. We describe tools for model determination, which include the conditional predictive ordinate, the logarithm of the pseudo-marginal likelihood and the pseudo-Bayes factor. Additionally, case deletion influence diagnostics are developed for the joint posterior distribution based on the Kullback–Leibler divergence. Two empirical applications are considered in order to illustrate the developed procedures.

S. Khan 《Statistical Papers》1994,35(1):127-138
A β-expectation tolerance region is constructed for the multivariate regression model with heteroscedastic errors that follow a multivariate Student-t distribution with an unknown number of degrees of freedom. The β-expectation tolerance region obtained in this paper is optimal in the sense of having minimum enclosure among all such tolerance regions that guarantee coverage of any preassigned proportion, namely β×100 percent, of the future responses from the model.

A leading multivariate extension of the univariate quantiles is the so-called “spatial” or “geometric” notion, for which sample versions are highly robust and conveniently satisfy a Bahadur–Kiefer representation. Another extension of univariate quantiles has been to univariate U-quantiles, on the basis of which, for example, the well-known Hodges–Lehmann location estimator has a natural formulation. Generalizing both extensions, we introduce multivariate spatial U-quantiles and develop a corresponding Bahadur–Kiefer representation. New statistics based on spatial U-quantiles are presented for nonparametric estimation of multiple regression coefficients, extending the classical Theil–Sen nonparametric simple linear regression slope estimator, and for robust estimation of multivariate dispersion. Some other applications are mentioned as well.
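As background for the extension described above, the classical Theil–Sen estimator for simple linear regression is the median of all pairwise slopes. The sketch below shows only that classical univariate estimator, under the assumption that pairs with tied x-values are skipped; it is not the multivariate spatial U-quantile statistic of the paper, and the function name is hypothetical.

```python
from itertools import combinations
from statistics import median

def theil_sen_slope(x, y):
    """Classical Theil-Sen slope: the median of (y_j - y_i) / (x_j - x_i)
    over all pairs i < j with distinct x-values."""
    slopes = [
        (yj - yi) / (xj - xi)
        for (xi, yi), (xj, yj) in combinations(zip(x, y), 2)
        if xj != xi
    ]
    if not slopes:
        raise ValueError("need at least two distinct x-values")
    return median(slopes)
```

Because the estimate is a median over pairs, a single gross outlier in the response changes only a minority of the pairwise slopes and leaves the estimate unaffected, which is the robustness property the spatial U-quantile approach carries over to multiple regression.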

A class of estimators that are invariant with respect to the selection of a base population is developed for estimating the hazard rates in multiple populations. The class generalizes the estimators of Begun and Reid (J. Amer. Statist. Assoc. 78 (1983) 337) and includes the estimator of Mantel and Haenszel (J. Natl. Cancer Inst. 22 (1959) 719) as a special case. The estimators have explicit forms, and it is shown that their asymptotic covariance matrices are smaller than those of the Begun–Reid estimators when the number of populations is greater than two. A Monte Carlo simulation indicates that the estimators are slightly more efficient than the Cox partial likelihood estimator (Biometrika 62 (2) (1975) 269) for small and medium sample sizes. An example is presented to illustrate the estimators.

In this article, the restricted rk class estimator and the restricted rd class estimator are introduced, which generalize the rk class estimator of Baye and Parker [Combining ridge and principal component regression: A money demand illustration, Commun. Stat. Theory Methods 13(2) (1984), pp. 197–205] and the rd class estimator of Kaçıranlar and Sakallıoğlu [Combining the Liu estimator and the principal component regression estimator, Commun. Stat. Theory Methods 30(12) (2001), pp. 2699–2705], respectively. For the two cases in which the restrictions are true and not true, the superiority of the restricted rk class estimator and the restricted rd class estimator over the restricted ridge regression estimator of Sarkar [A new estimator combining the ridge regression and the restricted least squares methods of estimation, Commun. Stat. Theory Methods 21 (1992), pp. 1987–2000] and the restricted Liu estimator of Kaçıranlar et al. [A new biased estimator in linear regression and a detailed analysis of the widely analysed dataset on Portland cement, Sankhya - Indian J. Stat. 61B(3) (1999), pp. 443–459] is discussed with respect to the mean squared error matrix criterion. Furthermore, a Monte Carlo evaluation of the estimators is given to illustrate some of the theoretical results.

