Similar literature (20 results)
1.
In the context of the partially linear semiparametric model examined by Robinson (1988), we show that the root-n-consistent estimation results established using kernel and series methods can also be obtained using the k-nearest-neighbor (k-nn) method.
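
To make the setting of this abstract concrete, the following is a minimal sketch of a double-residual estimator for a partially linear model y = beta * x + g(z) + error, with the conditional means E[y | z] and E[x | z] estimated by k-nn averages. The function names, the scalar x and z, the simulated data, and the choice of k are all assumptions made for illustration; the paper's own estimator and conditions may differ.

```python
import math
import random


def knn_mean(z, values, k):
    """Estimate E[values | z] at each sample point by averaging the k nearest neighbours in z."""
    n = len(z)
    fitted = []
    for i in range(n):
        # Indices of the k points closest to z[i] (the point itself is included).
        nearest = sorted(range(n), key=lambda j: abs(z[j] - z[i]))[:k]
        fitted.append(sum(values[j] for j in nearest) / k)
    return fitted


def partially_linear_beta(y, x, z, k):
    """Double-residual estimate of beta in y = beta * x + g(z) + error (scalar x, scalar z)."""
    y_res = [yi - mi for yi, mi in zip(y, knn_mean(z, y, k))]
    x_res = [xi - mi for xi, mi in zip(x, knn_mean(z, x, k))]
    # Least squares of the y-residuals on the x-residuals, without an intercept.
    return sum(a * b for a, b in zip(x_res, y_res)) / sum(a * a for a in x_res)


def simulate(n=300, beta=2.0, seed=0):
    """Hypothetical data: g(z) = z**2, and x depends on z through sin(2*pi*z)."""
    rng = random.Random(seed)
    z = [rng.random() for _ in range(n)]
    x = [math.sin(2 * math.pi * zi) + rng.gauss(0, 1) for zi in z]
    y = [beta * xi + zi ** 2 + rng.gauss(0, 0.5) for xi, zi in zip(x, z)]
    return y, x, z
```

Calling `partially_linear_beta(*simulate(), k=15)` should return a value near the simulated slope of 2; the nonparametric part g(z) is removed by the residual step and never estimated directly.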

2.
This article primarily aims to put forward the linearized restricted ridge regression (LRRR) estimator in linear regression models. Two types of LRRR estimators are investigated under the PRESS criterion, and the optimal LRRR estimators and the optimal restricted generalized ridge regression estimator are obtained. We apply the results to the Hald data and conclude with a simulation study using the method of McDonald and Galarneau.

3.
The demand for reliable statistics in subpopulations, when only reduced sample sizes are available, has promoted the development of small area estimation methods. In particular, an approach that is now widely used is based on the seminal work by Battese et al. [An error-components model for prediction of county crop areas using survey and satellite data, J. Am. Statist. Assoc. 83 (1988), pp. 28–36] that uses linear mixed models (MM). We investigate alternatives when a linear MM does not hold because, on one side, linearity may not be assumed and/or, on the other, normality of the random effects may not be assumed. In particular, Opsomer et al. [Nonparametric small area estimation using penalized spline regression, J. R. Statist. Soc. Ser. B 70 (2008), pp. 265–283] propose an estimator that extends the linear MM approach to the case in which a linear relationship may not be assumed using penalized splines regression. From a very different perspective, Chambers and Tzavidis [M-quantile models for small area estimation, Biometrika 93 (2006), pp. 255–268] have recently proposed an approach for small-area estimation that is based on M-quantile (MQ) regression. This allows for models robust to outliers and to distributional assumptions on the errors and the area effects. However, when the functional form of the relationship between the qth MQ and the covariates is not linear, it can lead to biased estimates of the small area parameters. Pratesi et al. [Semiparametric M-quantile regression for estimating the proportion of acidic lakes in 8-digit HUCs of the Northeastern US, Environmetrics 19(7) (2008), pp. 687–701] apply an extended version of this approach for the estimation of the small area distribution function using a non-parametric specification of the conditional MQ of the response variable given the covariates [M. Pratesi, M.G. Ranalli, and N. Salvati, Nonparametric m-quantile regression using penalized splines, J. Nonparametric Stat. 21 (2009), pp. 287–304]. 
We derive the small area estimator of the mean under this model, together with its mean-squared error estimator, and compare its performance to the other estimators via simulations on both real and simulated data.

4.
We introduce a fully model-based approach to studying functional relationships between a multivariate circular-dependent variable and several circular covariates, enabling inference regarding all model parameters and related prediction. Two multiple circular regression models are presented for this approach. First, for a univariate circular-dependent variable, we propose the least circular mean-square error (LCMSE) estimation method, and asymptotic properties of the LCMSE estimators and inferential methods are developed and illustrated. Second, using a simulation study, we provide some practical suggestions for model selection between the two models. An illustrative example is given using a real data set from a protein structure prediction problem. Finally, a straightforward extension to the case with a multivariate-dependent circular variable is provided.

5.
We present a Bayesian analysis of a piecewise linear model constructed by using basis functions which generalizes the univariate linear spline to higher dimensions. Prior distributions are adopted on both the number and the locations of the splines, which leads to a model averaging approach to prediction with predictive distributions that take into account model uncertainty. Conditioning on the data produces a Bayes local linear model with distributions on both predictions and local linear parameters. The method is spatially adaptive and covariate selection is achieved by using splines of lower dimension than the data.

6.
This paper considers the problem of estimating a cumulative distribution function (cdf), when it is known a priori to dominate a known cdf. The estimator considered is obtained by adjusting the empirical cdf using the prior information. This adjusted estimator is shown to be consistent, its limiting distribution is found, and its mean squared error (MSE) is shown to be smaller than the MSE of the empirical cdf. Its asymptotic efficiency (compared to the empirical cdf) is also found.

7.
To compare their performance on high dimensional data, several regression methods are applied to data sets in which the number of explanatory variables greatly exceeds the sample size. The methods are stepwise regression, principal components regression, two forms of latent root regression, partial least squares, and a new method developed here. The data are four sample sets for which near infrared reflectance spectra have been determined, and the regression methods use the spectra to estimate the concentration of various chemical constituents, the latter having been determined by standard chemical analysis. Thirty-two regression equations are estimated using each method and their performances are evaluated using validation data sets. Although it is the most widely used, stepwise regression was decidedly poorer than the other methods considered. Differences between the latter were small, with partial least squares performing slightly better than other methods under all criteria examined, albeit not by a statistically significant amount.

8.
The objective of this study is to provide a comparative assessment for researchers who deal with the challenges of analyzing count data, and to examine the factors associated with daily cigarette consumption among young people in Turkey. We fitted Poisson (P), negative binomial (NB), zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), Poisson hurdle (PH) and negative binomial hurdle (NBH) regressions to cigarette consumption count data from the 2014 Turkey Health Survey. Our results showed that the ZINB and NBH models should be preferred. We also found that gender, employment and tobacco use at home are more effective factors for smokers and nonsmokers in the 15–24 age group in Turkey.

9.
Quantile regression (QR), proposed by Koenker and Bassett [Regression quantiles, Econometrica 46(1) (1978), pp. 33–50], is a statistical technique that estimates conditional quantiles. It has been widely studied and applied in economics. Meinshausen [Quantile regression forests, J. Mach. Learn. Res. 7 (2006), pp. 983–999] proposed quantile regression forests (QRF), a non-parametric method based on random forests. QRF performs well in terms of prediction accuracy, but it struggles with noisy data sets. This motivates us to propose a multi-step QR tree method using GUIDE (Generalized, Unbiased, Interaction Detection and Estimation), developed by Loh [Regression trees with unbiased variable selection and interaction detection, Statist. Sinica 12 (2002), pp. 361–386]. Our simulation study shows that the multi-step QR tree performs better than a single tree or QRF, especially on data sets with many irrelevant variables.

10.
Several variations of monotone nonparametric regression have been developed over the past 30 years. One approach is to first apply nonparametric regression to data and then monotone smooth the initial estimates to “iron out” violations of the assumed order. Here, such estimators are considered, where local polynomial regression is first used, followed by either least squares isotonic regression or a monotone method using simple averages. The primary focus of this work is to evaluate different types of confidence intervals for these monotone nonparametric regression estimators through Monte Carlo simulation. Most of the confidence intervals use bootstrap or jackknife procedures. Estimation of a response variable as a function of two continuous predictor variables is considered, where the estimation is performed at the observed values of the predictors (instead of on a grid). The methods are then applied to data on workers at plants that use beryllium metal who have developed chronic beryllium disease.

11.
Value at Risk (VaR) forecasts can be produced from conditional autoregressive VaR models, estimated using quantile regression. Quantile modeling avoids a distributional assumption, and allows the dynamics of the quantiles to differ for each probability level. However, by focusing on a quantile, these models provide no information regarding expected shortfall (ES), which is the expectation of the exceedances beyond the quantile. We introduce a method for predicting ES corresponding to VaR forecasts produced by quantile regression models. It is well known that quantile regression is equivalent to maximum likelihood based on an asymmetric Laplace (AL) density. We allow the density's scale to be time-varying, and show that it can be used to estimate conditional ES. This enables a joint model of conditional VaR and ES to be estimated by maximizing an AL log-likelihood. Although this estimation framework uses an AL density, it does not rely on an assumption for the returns distribution. We also use the AL log-likelihood for forecast evaluation, and show that it is strictly consistent for the joint evaluation of VaR and ES. An empirical illustration is provided using stock index data. Supplementary materials for this article are available online.
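
The equivalence mentioned in this abstract can be checked numerically in the simplest case of a constant quantile and a fixed scale. The sketch below is only an illustration under those assumptions: the function names, the toy data, and the fixed `sigma` are hypothetical, and the time-varying scale and the ES link from the paper are not modelled. For a fixed scale, the AL negative log-likelihood is a constant plus the check (pinball) loss divided by the scale, so both objectives share the same minimizer.

```python
import math


def pinball_loss(y, q, p):
    """Quantile-regression check loss at probability level p, summed over the sample."""
    return sum((yi - q) * (p - (1.0 if yi < q else 0.0)) for yi in y)


def al_neg_loglik(y, q, p, sigma):
    """Negative log-likelihood of an asymmetric Laplace density with location q and scale sigma."""
    n = len(y)
    # Density: p * (1 - p) / sigma * exp(-check_loss(y - q) / sigma).
    return -n * math.log(p * (1 - p) / sigma) + pinball_loss(y, q, p) / sigma


def argmin_over(candidates, objective):
    """Return the candidate value with the smallest objective."""
    return min(candidates, key=objective)


# Toy sample (hypothetical); candidate locations are the observed values.
sample = [3.1, -0.4, 1.7, 0.2, 5.0, -2.3, 0.9]
level = 0.25
q_pinball = argmin_over(sample, lambda q: pinball_loss(sample, q, level))
q_al = argmin_over(sample, lambda q: al_neg_loglik(sample, q, level, sigma=2.0))
```

Here `q_pinball` and `q_al` coincide, because for a fixed scale the two objectives differ only by an additive constant and a positive factor.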

12.
Fisher's linear discriminant analysis (FLDA) is known as a method to find a discriminative feature space for multi-class classification. As a theory extending FLDA to an ultimate nonlinear form, optimal nonlinear discriminant analysis (ONDA) has been proposed. ONDA indicates that the best theoretical nonlinear map for maximizing Fisher's discriminant criterion is formulated by using the Bayesian a posteriori probabilities. In addition, the theory proves that FLDA is equivalent to ONDA when the Bayesian a posteriori probabilities are approximated by linear regression (LR). Due to some limitations of the linear model, there is room to modify FLDA by using stronger approximation/estimation methods. For the purpose of probability estimation, multinomial logistic regression (MLR) is more suitable than LR. Along this line, we develop a nonlinear discriminant analysis (NDA) in which the posterior probabilities in ONDA are estimated by MLR. We also develop a way to introduce sparseness into discriminant analysis. By applying L1 or L2 regularization to LR or MLR, we can incorporate sparseness in FLDA and our NDA to increase generalization performance. The performance of these methods is evaluated by benchmark experiments using last_exam17 standard datasets and a face classification experiment.

13.
The present investigation was undertaken to study the gillnet catch efficiency of sardines in the coastal waters of Sri Lanka using commercial catch and effort data. Commercial catch and effort data of the small mesh gillnet fishery were collected in five fisheries districts during the period May 1999–August 2002. Gillnet catch efficiency of sardines was investigated by developing predictive models of catch rates using data on commercial fisheries and environmental variables. Three statistical techniques [multiple linear regression, generalized additive model and regression tree model (RTM)] were employed to predict the catch rates of trenched sardine Amblygaster sirm (the key target species of the small mesh gillnet fishery) and other sardines (Sardinella longiceps, S. gibbosa, S. albella and S. sindensis). The data collection programme was conducted for another six months and the models were tested on the new data. RTMs were found to be the strongest in terms of reliability and accuracy of the predictions. The two operational characteristics used here for model formulation (i.e. depth of fishing and number of gillnet pieces used per fishing operation) were more useful as predictor variables than the environmental variables. The study revealed that catch rates of A. sirm increased rapidly with sea depth up to around 32 m.

14.
This article applies and investigates a number of logistic ridge regression (RR) parameters that are estimable by using the maximum likelihood (ML) method. By conducting an extensive Monte Carlo study, the performances of ML and logistic RR are investigated in the presence of multicollinearity and under different conditions. The simulation study evaluates a number of methods of estimating the RR parameter k that have recently been developed for use in linear regression analysis. The results from the simulation study show that there is at least one RR estimator that has a lower mean squared error (MSE) than the ML method in all of the evaluated situations.

15.
A simple method based on sliced inverse regression (SIR) is proposed to explore an effective dimension reduction (EDR) vector for the single index model. We avoid the principal component analysis step of the original SIR by using two sample mean vectors in two slices of the response variable and their difference vector. The theory becomes simpler, the method is equivalent to multiple linear regression with a dichotomized response, and the estimator can be expressed in closed form, although the objective function might be an unknown nonlinear function. It can be applied when the number of covariates is large, and it requires no matrix operation or iterative calculation.
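
The two-slice idea in this abstract can be sketched as follows: split the sample at the median of the response, average the covariates within each half, and take the difference of the two mean vectors as the estimated direction. This is a minimal sketch under stated assumptions, not the paper's estimator: the function names and the simulated single index model are hypothetical, and the covariates are assumed to be standardized and uncorrelated, so that the raw difference needs no further scaling.

```python
import math
import random


def two_slice_direction(X, y):
    """Unit vector along the difference of covariate means between the upper and lower halves of y."""
    d = len(X[0])
    cut = sorted(y)[len(y) // 2]  # split the response at (approximately) its median
    upper = [row for row, yi in zip(X, y) if yi >= cut]
    lower = [row for row, yi in zip(X, y) if yi < cut]
    mean_upper = [sum(row[j] for row in upper) / len(upper) for j in range(d)]
    mean_lower = [sum(row[j] for row in lower) / len(lower) for j in range(d)]
    diff = [a - b for a, b in zip(mean_upper, mean_lower)]
    norm = math.sqrt(sum(v * v for v in diff))
    return [v / norm for v in diff]


def simulate_single_index(n=2000, seed=1):
    """Hypothetical data: y = exp(beta'x) + noise with independent standard normal covariates."""
    rng = random.Random(seed)
    beta = [0.6, 0.8, 0.0, 0.0]
    X = [[rng.gauss(0, 1) for _ in beta] for _ in range(n)]
    y = [math.exp(sum(b * v for b, v in zip(beta, row))) + rng.gauss(0, 0.1) for row in X]
    return X, y, beta
```

On data from `simulate_single_index()`, the returned direction should be close to the unit vector `beta`, even though the exponential link is never used by the estimator. Only sample means and a sort are computed, which matches the abstract's remark that no matrix operation or iteration is needed.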

16.
In the classical regression model Y_i = h(x_i) + ε_i, i = 1, …, n, Cheng (1984) introduced linear combinations of regression quantiles as a new class of estimators for the unknown regression function h(x). The asymptotic properties studied in Cheng (1984) are reconsidered. We obtain a sharper strong consistency rate and we improve on the conditions for asymptotic normality by proving a new result on the remainder term in the Bahadur representation for regression quantiles.

17.
Let X be a d-variate random vector that is completely observed, and let Y be a random variable that is subject to right censoring and left truncation. For arbitrary functions φ we consider expectations of the form E[φ(X, Y)], which appear in many statistical problems, and we estimate these expectations by using a product-limit estimator for censored and truncated data, extended to the context where covariates are present. An almost sure representation for these estimators is obtained, with a remainder term that is of a certain negligible order, uniformly over a class of φ-functions. This uniformity is important for the application to goodness-of-fit testing in regression and to inference for the regression depth, which we consider in more detail.

18.
Ranked-set sampling (RSS) and judgment post-stratification (JPS) use ranking information to obtain more efficient inference than is possible using simple random sampling. Both methods were developed with subjective, judgment-based rankings in mind, but the idea of ranking using a covariate has received a lot of attention. We provide evidence here that when rankings are done using a covariate, the standard RSS and JPS mean estimators no longer make efficient use of the available information. We first show that when rankings are done using a covariate, the standard nonparametric mean estimators in JPS and unbalanced RSS are inadmissible under squared error loss. We then show that when rankings are done using a covariate, nonparametric regression techniques yield mean estimators that tend to be significantly more efficient than the standard RSS and JPS mean estimators. We conclude that the standard estimators are best reserved for settings where only subjective, judgment-based rankings are available.

19.
This paper studies robust estimation of a multivariate regression model using kernel weighted local linear regression. A robust estimation procedure is proposed for estimating the regression function and its partial derivatives. The proposed estimators are jointly asymptotically normal and attain the optimal nonparametric convergence rate. One-step approximations to the robust estimators are introduced to reduce the computational burden. The one-step local M-estimators are shown to achieve the same efficiency as the fully iterative local M-estimators as long as the initial estimators are good enough. The proposed estimators inherit the excellent edge-effect behavior of the local polynomial methods in the univariate case and at the same time overcome the disadvantages of the local least-squares based smoothers. Simulations are conducted to demonstrate the performance of the proposed estimators. Real data sets are analyzed to illustrate the practical utility of the proposed methodology. This work was supported by the National Natural Science Foundation of China (Grant No. 10471006).

20.
Regression tends to give very unstable and unreliable regression weights when predictors are highly collinear. Several methods have been proposed to counter this problem. A subset of these do so by finding components that summarize the information in the predictors and the criterion variables. The present paper compares six such methods (two of which are almost completely new) to ordinary regression: partial least squares (PLS), principal component regression (PCR), principal covariates regression, reduced rank regression, and two variants of what is called power regression. The comparison is mainly done by means of a series of simulation studies, in which data are constructed in various ways, with different degrees of collinearity and noise, and the methods are compared in terms of their capability of recovering the population regression weights, as well as their prediction quality for the complete population. It turns out that recovery of regression weights in situations with collinearity is often very poor for all methods, unless the regression weights lie in the subspace spanned by the first few principal components of the predictor variables. In those cases, PLS and PCR typically give the best recoveries of regression weights. The picture is inconclusive, however, because, especially in the study with more realistic simulated data, PLS and PCR gave the poorest recoveries of regression weights in conditions with relatively low noise and collinearity. It seems that PLS and PCR are particularly indicated in cases with much collinearity, whereas in other cases it is better to use ordinary regression. As far as prediction is concerned, it suffers far less from collinearity than recovery of the regression weights does.
