Similar literature
Found 20 similar articles (search time: 31 ms)
1.
Maximum likelihood estimation is investigated in the context of linear regression models under partial independence restrictions. These restrictions express a kind of completeness of a set of predictors Z, in the sense that Z is sufficient to explain the dependencies between an outcome Y and predictors X: ℒ(Y|Z, X) = ℒ(Y|Z), where ℒ(·|·) stands for the conditional distribution. From a practical point of view, this model is particularly interesting in a double sampling scheme where Y and Z are measured together on a first sample and Z and X on a second, separate sample. In that case, estimation procedures are close to those developed in the study of double-regression by Engel & Walstra (1991) and Causeur & Dhorne (1998). Properties of the estimators are derived in a small sample framework and in an asymptotic one, and the procedure is illustrated by an example from the food industry context.

2.
It has been found that, for a variety of probability distributions, there is a surprising linear relation between mode, mean, and median. In this article, the relation between mode, mean, and median regression functions is assumed to follow a simple parametric model. We propose a semiparametric conditional mode (mode regression) estimation for an unknown (unimodal) conditional distribution function in the context of a regression model, so that any m-step-ahead mean and median forecasts can then be substituted into the resultant model to deliver an m-step-ahead mode prediction. In the semiparametric model, least squares estimators (LSEs) for the model parameters and the simultaneous estimation of the unknown mean and median regression functions by the local linear kernel method are combined to make inference about the parametric and nonparametric components of the proposed model. The asymptotic normality of these estimators is derived, and the asymptotic distribution of the parameter estimates is also given and is shown to follow usual parametric rates in spite of the presence of the nonparametric component in the model. These results are applied to obtain a data-based test for the dependence of mode regression on mean and median regression under a regression model.

3.
One of the standard variable selection procedures in multiple linear regression is to use a penalisation technique in least-squares (LS) analysis. In this setting, many different types of penalties have been introduced to achieve variable selection. It is well known that LS analysis is sensitive to outliers, and consequently outliers can present serious problems for the classical variable selection procedures. Since rank-based procedures have desirable robustness properties compared to LS procedures, we propose a rank-based adaptive lasso-type penalised regression estimator and a corresponding variable selection procedure for linear regression models. The proposed estimator and variable selection procedure are robust against outliers in both response and predictor space. Furthermore, since rank regression can yield unstable estimators in the presence of multicollinearity, in order to provide inference that is robust against multicollinearity, we adjust the penalty term in the adaptive lasso function by incorporating the standard errors of the rank estimator. The theoretical properties of the proposed procedures are established and their performances are investigated by means of simulations. Finally, the estimator and variable selection procedure are applied to the Plasma Beta-Carotene Level data set.

4.
Non-parametric regression models have been well studied, including estimation of the conditional mean function, the conditional variance function and the distribution function of errors. In addition, empirical likelihood methods have been proposed to construct confidence intervals for the conditional mean and variance. Motivated by applications in risk management, we propose an empirical likelihood method for constructing a confidence interval for the pth conditional value-at-risk based on the non-parametric regression model. A simulation study shows the advantages of the proposed method.

5.
In multiple linear regression analysis, the ridge regression estimator and the Liu estimator are often used to address multicollinearity. Besides multicollinearity, outliers are also a problem in multiple linear regression analysis. We propose new biased estimators based on the least trimmed squares (LTS) ridge estimator and the LTS Liu estimator for the case in which both outliers and multicollinearity are present. For this purpose, a simulation study is conducted in order to see the difference between the robust ridge estimator and the robust Liu estimator in terms of their effectiveness, measured by the mean square error. In our simulations, the behavior of the new biased estimators is examined for three types of outliers: X-space outliers, Y-space outliers, and X- and Y-space outliers. The results for a number of different illustrative cases are presented. This paper also provides the results for the robust ridge regression and robust Liu estimators based on a real-life data set combining the problems of multicollinearity and outliers.
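For orientation, the classical (non-robust) ridge and Liu estimators that the abstract builds on can be sketched as follows. This is only an illustration under stated assumptions: the function names `ridge` and `liu` are hypothetical, the biasing parameters `k` and `d` are chosen by the caller, and the LTS-based robust versions proposed in the article are not reproduced here.

```python
import numpy as np

def ridge(X, y, k):
    """Classical ridge estimator (X'X + kI)^{-1} X'y; k = 0 gives OLS."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

def liu(X, y, d):
    """Classical Liu estimator (X'X + I)^{-1} (X'X + dI) beta_OLS; d = 1 gives OLS."""
    p = X.shape[1]
    XtX = X.T @ X
    beta_ols = np.linalg.solve(XtX, X.T @ y)
    return np.linalg.solve(XtX + np.eye(p), (XtX + d * np.eye(p)) @ beta_ols)
```

Both estimators shrink the least-squares solution; a robust variant would, roughly speaking, replace the least-squares ingredients with quantities computed from an LTS fit.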

6.
Nonparametric regression is a standard statistical tool with increased importance in the Big Data era. Boundary points pose additional difficulties but local polynomial regression can be used to alleviate them. Local linear regression, for example, is easy to implement and performs quite well both at interior and boundary points. Estimating the conditional distribution function and/or the quantile function at a given regressor point is immediate via standard kernel methods but problems ensue if local linear methods are to be used. In particular, the distribution function estimator is not guaranteed to be monotone increasing, and the quantile curves can "cross." In the article at hand, a simple method of correcting the local linear distribution estimator for monotonicity is proposed, and its good performance is demonstrated via simulations and real data examples. Supplementary materials for this article are available online.
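The monotonicity problem can be seen in a small sketch of a local linear conditional distribution estimator. The Gaussian kernel, the function names, and the bandwidth `h` are assumptions made for illustration. The correction shown (a running maximum followed by clipping to [0, 1]) is a generic stand-in, not the method proposed in the article.

```python
import numpy as np

def local_linear_weights(x_obs, x0, h):
    """Weights w_i such that the local linear fit at x0 equals sum(w_i * y_i)."""
    u = x_obs - x0
    k = np.exp(-0.5 * (u / h) ** 2)   # Gaussian kernel (an assumption)
    s1 = np.sum(k * u)
    s2 = np.sum(k * u ** 2)
    w = k * (s2 - u * s1)             # individual weights can be negative
    return w / np.sum(w)

def local_linear_cdf(x_obs, y_obs, x0, y_grid, h):
    """Local linear estimate of F(y | x0) on y_grid; not guaranteed monotone."""
    w = local_linear_weights(x_obs, x0, h)
    indicators = (y_obs[None, :] <= y_grid[:, None]).astype(float)
    return indicators @ w

def monotone_correct(cdf_values):
    """Generic fix: enforce a non-decreasing curve, then clip to [0, 1]."""
    return np.clip(np.maximum.accumulate(cdf_values), 0.0, 1.0)
```

Because some weights are negative, especially near the boundary, the raw estimate can dip below zero or decrease in y, which is the defect the abstract describes.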

7.
In this work, we develop a method of adaptive non-parametric estimation, based on 'warped' kernels. The aim is to estimate a real-valued function s from a sample of random couples (X,Y). We deal with transformed data (Φ(X),Y), with Φ a one-to-one function, to build a collection of kernel estimators. The data-driven bandwidth selection is performed with a method inspired by Goldenshluger and Lepski (Ann. Statist., 39, 2011, 1608). The method handles various problems such as additive and multiplicative regression, conditional density estimation, hazard rate estimation based on randomly right-censored data, and cumulative distribution function estimation from current-status data. The interest is threefold. First, the squared-bias/variance trade-off is automatically realized. Next, non-asymptotic risk bounds are derived. Lastly, the estimator is easily computed, thanks to its simple expression: a short simulation study is presented.

8.
In this article, we develop a local M-estimator for the conditional variance in heteroscedastic regression models. The estimator is based on the local linear smoothing technique and the M-estimation technique, and it is shown to be not only asymptotically equivalent to the local linear estimator but also robust. The consistency and asymptotic normality of the local M-estimator for the conditional variance in heteroscedastic regression models are obtained under mild conditions. The simulation studies demonstrate that the proposed estimator is robust in practice.

9.
In this article, we propose a semi-parametric mode regression for a nonlinear model. We use an expectation-maximization algorithm to estimate the regression coefficients of modal nonlinear regression. We also establish asymptotic properties for the proposed estimator under assumptions on the error density. We investigate its performance through a simulation study.

10.
Suppose one estimates the coefficient β2 in E[Y] = β0 + β1 X1 + β2 X2 by stagewise regression. That is, first the model E[Y] ≈ β0 + β1 X1 is fit using simple linear regression, followed by a simple linear regression of the residuals from this model on X2 to yield the estimator β̂2. The ratio of the squared t statistic for the estimate b2 from multiple regression to the squared t statistic for β̂2 is greater than or equal to 1.0 and is shown to be a convenient function of correlation coefficients among Y, X1, and X2. Examination of stagewise regression can provide useful insights when introducing concepts of multiple regression.
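The comparison can be tried numerically with the sketch below. The helper names are hypothetical, and one convention is an assumption rather than something stated in the abstract: both t statistics use n - 3 residual degrees of freedom, so that the ratio of squared t statistics is at least 1 for any data set. In this setup the stagewise slope equals the multiple-regression slope multiplied by 1 - r², where r is the sample correlation between X1 and X2.

```python
import numpy as np

def last_coef_and_t(X, y, df):
    """OLS of y on X; return the last coefficient, its t statistic, and residuals.

    df is the residual degrees of freedom used in the variance estimate.
    """
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / df
    cov = s2 * np.linalg.inv(X.T @ X)
    return beta[-1], beta[-1] / np.sqrt(cov[-1, -1]), resid

def compare_stagewise(x1, x2, y):
    """Return (b2 from multiple regression, stagewise slope, ratio of squared t)."""
    n = len(y)
    ones = np.ones(n)
    b2, t_mult, _ = last_coef_and_t(np.column_stack([ones, x1, x2]), y, n - 3)
    # Stage 1: regress Y on X1 and keep the residuals.
    _, _, e = last_coef_and_t(np.column_stack([ones, x1]), y, n - 2)
    # Stage 2: regress those residuals on X2 (df = n - 3 is an assumption).
    b2_stage, t_stage, _ = last_coef_and_t(np.column_stack([ones, x2]), e, n - 3)
    return b2, b2_stage, (t_mult / t_stage) ** 2
```

Calling `compare_stagewise` on simulated data with correlated predictors shows the stagewise slope shrinking toward zero as the correlation between X1 and X2 grows.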

11.
This paper considers quantile regression for a wide class of time series models including autoregressive and moving average (ARMA) models with asymmetric generalized autoregressive conditional heteroscedasticity errors. The classical mean-variance models are reinterpreted as conditional location-scale models so that the quantile regression method can be naturally geared into the considered models. The consistency and asymptotic normality of the quantile regression estimator are established in location-scale time series models under mild conditions. In the application of this result to ARMA-generalized autoregressive conditional heteroscedasticity models, more primitive conditions are deduced to obtain the asymptotic properties. For illustration, a simulation study and a real data analysis are provided.

12.
Coefficient estimation in linear regression models with missing data is routinely carried out in the mean regression framework. However, the mean regression theory breaks down if the error variance is infinite. In addition, correct specification of the likelihood function for existing imputation approaches is often challenging in practice, especially for skewed data. In this paper, we develop a novel composite quantile regression and a weighted quantile average estimation procedure for parameter estimation in linear regression models when some responses are missing at random. Instead of imputing the missing response by randomly drawing from its conditional distribution, we propose to impute both missing and observed responses by their estimated conditional quantiles given the observed data and to use the parametrically estimated propensity scores to weight the check functions that define a regression parameter. Both estimation procedures are resistant to heavy-tailed errors or outliers in the response and can achieve good robustness and efficiency. Moreover, we propose adaptive penalization methods to simultaneously select significant variables and estimate unknown parameters. Asymptotic properties of the proposed estimators are carefully investigated. An efficient algorithm is developed for fast implementation of the proposed methodologies. We also discuss a model selection criterion, which is based on an ICQ-type statistic, to select the penalty parameters. The performance of the proposed methods is illustrated via simulated and real data sets.
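The check function mentioned above is the standard quantile regression loss ρ_τ(u) = u(τ - 1{u < 0}). The sketch below only illustrates that loss in the simplest, intercept-only case. The function names are hypothetical, and the propensity-score weighting and imputation steps of the article are not implemented.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check function rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def constant_quantile_fit(y, tau):
    """Minimise the summed check loss over constants, searching the data points."""
    y = np.asarray(y, dtype=float)
    losses = [check_loss(y - c, tau).sum() for c in y]
    return y[int(np.argmin(losses))]
```

With τ = 0.5 the minimiser is a sample median, which is why estimators built on this loss remain usable when the error variance is infinite.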

13.
The problem of estimating an unknown density function has been widely studied. In this article, we present a convolution estimator for the density of the responses in a nonlinear heterogeneous regression model. The rate of convergence for the mean square error of the convolution estimator is of order n⁻¹ under certain regularity conditions. This is faster than the rate for the kernel density method. We derive explicit expressions for the asymptotic variance and the bias of the new estimator, and further a data-driven bandwidth selector is proposed. We conduct simulation experiments to check the finite sample properties, and the convolution estimator performs substantially better than the kernel density estimator for well-behaved noise densities.

14.
Parametrically guided non-parametric regression is an appealing method that can reduce the bias of a non-parametric regression function estimator without increasing the variance. In this paper, we adapt this method to the censored data case using an unbiased transformation of the data and a local linear fit. The asymptotic properties of the proposed estimator are established, and its performance is evaluated via finite sample simulations.

15.
Recently, the least absolute deviations (LAD) estimator for median regression models with doubly censored data was proposed and the asymptotic normality of the estimator was established. However, direct inference on the regression parameter vector is problematic, because the asymptotic covariance matrices are difficult to estimate reliably since they involve conditional densities of error terms. In this article, three methods, which are based on bootstrap, random weighting, and empirical likelihood, respectively, and do not require density estimation, are proposed for making inference for the doubly censored median regression models. Simulations are also done to assess the performance of the proposed methods.

16.
The demand for reliable statistics in subpopulations, when only reduced sample sizes are available, has promoted the development of small area estimation methods. In particular, an approach that is now widely used is based on the seminal work by Battese et al. [An error-components model for prediction of county crop areas using survey and satellite data, J. Am. Statist. Assoc. 83 (1988), pp. 28–36] that uses linear mixed models (MM). We investigate alternatives when a linear MM does not hold because, on one side, linearity may not be assumed and/or, on the other, normality of the random effects may not be assumed. In particular, Opsomer et al. [Nonparametric small area estimation using penalized spline regression, J. R. Statist. Soc. Ser. B 70 (2008), pp. 265–283] propose an estimator that extends the linear MM approach to the case in which a linear relationship may not be assumed using penalized splines regression. From a very different perspective, Chambers and Tzavidis [M-quantile models for small area estimation, Biometrika 93 (2006), pp. 255–268] have recently proposed an approach for small-area estimation that is based on M-quantile (MQ) regression. This allows for models robust to outliers and to distributional assumptions on the errors and the area effects. However, when the functional form of the relationship between the qth MQ and the covariates is not linear, it can lead to biased estimates of the small area parameters. Pratesi et al. [Semiparametric M-quantile regression for estimating the proportion of acidic lakes in 8-digit HUCs of the Northeastern US, Environmetrics 19(7) (2008), pp. 687–701] apply an extended version of this approach for the estimation of the small area distribution function using a non-parametric specification of the conditional MQ of the response variable given the covariates [M. Pratesi, M.G. Ranalli, and N. Salvati, Nonparametric m-quantile regression using penalized splines, J. Nonparametric Stat. 21 (2009), pp. 287–304]. We will derive the small area estimator of the mean under this model, together with its mean-squared error estimator, and compare its performance to the other estimators via simulations on both real and simulated data.

17.
In multiple linear regression analysis each lower-dimensional subspace L of a known linear subspace M of ℝⁿ corresponds to a nonempty subset of the columns of the regressor matrix. For a fixed subspace L, the C_p statistic is an unbiased estimator of the mean square error if the projection of the response vector onto L is used to estimate the expected response. In this article, we consider two truncated versions of the C_p statistic that can also be used to estimate this mean square error. The C_p statistic and its truncated versions are compared in two example data sets, illustrating that use of the truncated versions may result in models different from those selected by standard C_p.
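As a point of reference, the classical Mallows form of the statistic, C_p = RSS_L / σ̂² - n + 2 dim(L), can be computed as in the sketch below. The function names are hypothetical, σ̂² is taken from the full model, and both design matrices are assumed to have full column rank. The article's scaling may differ, and its truncated versions are not shown.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares after projecting y onto the column space of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def mallows_cp(X_sub, X_full, y):
    """Classical Mallows C_p for the submodel X_sub, with sigma^2 from X_full."""
    n, p_full = X_full.shape
    sigma2 = rss(X_full, y) / (n - p_full)
    return rss(X_sub, y) / sigma2 - n + 2 * X_sub.shape[1]
```

For the full model itself the statistic reduces to the number of columns, which gives a quick sanity check when comparing candidate subsets.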

18.
We discuss the case of the multivariate linear model Y = XB + E with Y an (n × p) matrix, and so on, when there are missing observations in the Y matrix in a so-called nested pattern. We propose an analysis that arises by incorporating the predictive density of the missing observations in determining the posterior distribution of B, and its mean and variance matrix. This involves matric-T variables. The resulting analysis is illustrated with some Canadian economic data.

19.
Consider the nonparametric heteroscedastic regression model Y = m(X) + σ(X)ε, where m(·) is an unknown conditional mean function and σ(·) is an unknown conditional scale function. In this paper, the limit distribution of the quantile estimate for the scale function σ(X) is derived. Since the limit distribution depends on the unknown density of the errors, an empirical likelihood ratio statistic based on the quantile estimator is proposed. This statistic is used to construct confidence intervals for the variance function. Under certain regularity conditions, it is shown that the quantile estimate of the scale function converges to a Brownian motion and the empirical likelihood ratio statistic converges to a chi-squared random variable. Simulation results demonstrate the superiority of the proposed method over the least squares procedure when the underlying errors have heavy tails.

20.
We introduce multicovariate-adjusted regression (MCAR), an adjustment method for regression analysis, where both the response (Y) and predictors (X1, …, Xp) are not directly observed. The available data have been contaminated by unknown functions of a set of observable distorting covariates, Z1, …, Zs, in a multiplicative fashion. The proposed method substantially extends the current contaminated regression modelling capability, by allowing for multiple distorting covariate effects. MCAR is a flexible generalisation of the recently proposed covariate-adjusted regression method, an effective adjustment method in the presence of a single covariate, Z. For MCAR estimation, we establish a connection between the MCAR models and adaptive varying coefficient models. This connection leads to an adaptation of a hybrid backfitting estimation algorithm. Extensive simulations are used to study the performance and limitations of the proposed iterative estimation algorithm. In particular, the bias and mean square error of the proposed MCAR estimators are examined, relative to a baseline and a consistent benchmark estimator. The method is also illustrated with a Pima Indian diabetes data set, where the response and predictors are potentially contaminated by body mass index and triceps skin fold thickness. Both distorting covariates measure aspects of obesity, an important risk factor in type 2 diabetes.
