Similar literature
Found 20 similar articles (search time: 31 ms)
1.
The existence and properties of optimal bandwidths for multivariate local linear regression are established, using either a scalar bandwidth for all regressors or a diagonal bandwidth vector that has a different bandwidth for each regressor. Both involve functionals of the derivatives of the unknown multivariate regression function. Estimating these functionals is difficult primarily because they contain multivariate derivatives. In this paper, an estimator of the multivariate second derivative is obtained via local cubic regression with most cross-terms left out. This estimator has the optimal rate of convergence but is simpler and uses much less computing time than the full local estimator. Using this as a pilot estimator, we obtain plug-in formulae for the optimal bandwidth, both scalar and diagonal, for multivariate local linear regression. As a simpler alternative, we also provide rule-of-thumb bandwidth selectors. All these bandwidths have satisfactory performance in our simulation study.

2.
This work treats non-parametric estimation of multivariate probability mass functions, using multivariate discrete associated kernels. We propose a Bayesian local approach to select the matrix of bandwidths considering the multivariate Dirac Discrete Uniform and the product of binomial kernels, and treating the bandwidths as a diagonal matrix of parameters with some prior distribution. The performances of this approach and the cross-validation method are compared using simulations and real count data sets. The obtained results show that the Bayes local method performs better than cross-validation in terms of integrated squared error.

3.
A great deal of research has focused on improving the bias properties of kernel estimators. One proposal involves removing the restriction of non-negativity on the kernel to construct “higher-order” kernels that eliminate additional terms in the Taylor series expansion of the bias. This paper considers an alternative that uses a local approach to bandwidth selection not only to reduce the bias, but to eliminate it entirely. These so-called “zero-bias bandwidths” are shown to exist for univariate and multivariate kernel density estimation as well as kernel regression. Implications of the existence of such bandwidths are discussed. An estimation strategy is presented, and the extent of the reduction or elimination of bias in practice is studied through simulation and example.

4.
The first step in statistical analysis is parameter estimation. In multivariate analysis, one of the parameters of interest is the mean vector. It is usually assumed that the data come from a multivariate normal distribution. In this situation, the maximum likelihood estimator (MLE), that is, the sample mean vector, is the best estimator. However, when outliers exist in the data, the sample mean vector gives poor estimates, so estimators that are robust to outliers should be used. The most popular robust multivariate estimator of the mean vector is the S-estimator, which has desirable properties. However, computing this estimator requires a robust estimate of the mean vector as a starting point. Usually the minimum volume ellipsoid (MVE) is used as the starting point in computing the S-estimator. For high-dimensional data, computing the MVE takes too much time. In some cases, this time is so large that existing computers cannot perform the computation. In addition to the computation time, the MVE method is not precise for high-dimensional data sets. In this paper, a robust starting point for the S-estimator based on robust clustering is proposed, which can be used for estimating the mean vector of high-dimensional data. The performance of the proposed estimator in the presence of outliers is studied, and the results indicate that the proposed estimator performs precisely and much better than some of the existing robust estimators for high-dimensional data.

5.
An affine equivariant estimate of multivariate location based on an adaptive transformation and retransformation approach is studied. The work is primarily motivated by earlier work on different versions of the multivariate median and their properties. We explore an issue related to efficiency and equivariance that was originally raised by Bickel and subsequently investigated by Brown and Hettmansperger. Our estimate has better asymptotic performance than the vector of co-ordinatewise medians when the variables are substantially correlated. The finite sample performance of the estimate is investigated by using Monte Carlo simulations. Some examples are presented to demonstrate the effect of the adaptive transformation–retransformation strategy in the construction of multivariate location estimates for real data.
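As background for the equivariance issue mentioned in this abstract, the following minimal sketch illustrates the well-known fact that the vector of co-ordinatewise medians is not affine equivariant: applying a linear map before or after taking the medians can give different answers. The toy points, the shear matrix, and the helper names `coordinatewise_median` and `apply_linear` are assumptions made for illustration; this is not the estimator studied in the paper.

```python
from statistics import median

def coordinatewise_median(points):
    """Return the vector of per-coordinate medians of a list of 2-D points."""
    xs, ys = zip(*points)
    return (median(xs), median(ys))

def apply_linear(matrix, point):
    """Multiply a 2x2 matrix (given as a tuple of rows) by a 2-D point."""
    (a, b), (c, d) = matrix
    x, y = point
    return (a * x + b * y, c * x + d * y)

points = [(0, 0), (1, 0), (0, 1)]   # toy data, chosen for illustration
shear = ((1, 1), (0, 1))            # an invertible linear map (hypothetical)

# Take medians first, then transform the result.
median_then_map = apply_linear(shear, coordinatewise_median(points))

# Transform every point first, then take medians.
map_then_median = coordinatewise_median([apply_linear(shear, p) for p in points])

# An affine equivariant estimate would make these two results equal.
print(median_then_map, map_then_median)
```

For an affine equivariant location estimate the two results would coincide for every invertible linear map; here they differ, which is the kind of deficiency a transformation and retransformation approach is meant to address.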

6.
We examine a simple estimator for the multivariate moving average model based on vector autoregressive approximation. In finite samples the estimator has a bias which is low where roots of the characteristic equation are well away from the unit circle, and more substantial where one or more roots have modulus near unity. We show that the representation estimated by this multivariate technique is consistent and asymptotically invertible. This estimator has significant computational advantages over Maximum Likelihood, and more importantly may be more robust than ML to mis-specification of the vector moving average model. The estimation method is applied to a VMA model of wholesale and retail inventories, using Canadian data on inventory investment, and allows us to examine the propagation of shocks between the two classes of inventory.

7.
Factor analysis of multivariate spatial data is considered. A systematic approach for modeling the underlying structure of potentially irregularly spaced, geo-referenced vector observations is proposed. Statistical inference procedures for selecting the number of factors and for model building are discussed. We derive a condition under which a simple and practical inference procedure is valid without specifying the form of distributions and factor covariance functions. The multivariate prediction problem is also discussed, and a procedure combining the latent variable modeling and a measurement-error-free kriging technique is introduced. Simulation results and an example using agricultural data are presented.

8.
Motivated by the need to develop meaningful empirical approximations to a 'typical' data value, we introduce methods for density and mode estimation when data are in the form of random curves. Our approach is based on finite dimensional approximations via generalized Fourier expansions on an empirically chosen basis. The mode estimation problem is reduced to a problem of kernel-type multivariate estimation from vector data and is solved using a new recursive algorithm for finding the empirical mode. The algorithm may be used as an aid to the identification of clusters in a set of data curves. Bootstrap methods are employed to select the bandwidth.

9.
In this paper we use Monte Carlo simulation to compare the effectiveness of five multivariate quality control methods for controlling the mean vector in a multivariate process: Hotelling's T², the Multivariate Shewhart Chart, Discriminant Analysis, the Decomposition Method, and the Multivariate Ridge Residual Chart (developed by the authors). P-dimensional multivariate normal data are generated using different covariance structures. Shifts of various sizes are induced in the mean vector and the resulting Average Run Length (ARL) is computed. The effectiveness of each method with regard to ARL is discussed.
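The kind of ARL simulation described in this abstract can be sketched for the simplest case. The example below assumes a bivariate process with known identity covariance and in-control mean zero, monitored one observation at a time, so that Hotelling's T² reduces to the squared length of the observation. The false-alarm rate `alpha`, the shift `(1, 1)`, the number of runs, and all function names are hypothetical choices for illustration, not the settings used in the paper.

```python
import math
import random

def run_length(shift, limit, rng, max_steps=100_000):
    """Number of observations until T^2 = x'x first exceeds the control limit."""
    for t in range(1, max_steps + 1):
        x1 = rng.gauss(shift[0], 1.0)
        x2 = rng.gauss(shift[1], 1.0)
        if x1 * x1 + x2 * x2 > limit:
            return t
    return max_steps  # safety cap; practically never reached with these settings

def average_run_length(shift, alpha=0.05, n_runs=2000, seed=0):
    """Monte Carlo estimate of the ARL for a given mean shift."""
    rng = random.Random(seed)
    # With p = 2 and known covariance, the in-control T^2 is chi-square with
    # 2 degrees of freedom, so P(T^2 > h) = exp(-h / 2) and the limit is explicit.
    limit = -2.0 * math.log(alpha)
    return sum(run_length(shift, limit, rng) for _ in range(n_runs)) / n_runs

arl_in_control = average_run_length((0.0, 0.0))   # theory: 1 / alpha = 20
arl_shifted = average_run_length((1.0, 1.0))      # shift in both components
print(arl_in_control, arl_shifted)
```

In control, the run length is geometric with success probability `alpha`, so the simulated ARL should be close to `1 / alpha`; a shift in the mean vector shortens it. Comparing charts then amounts to comparing their out-of-control ARLs at a common in-control ARL.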

10.
Genstat is a general statistical language for data analysis. The facilities for multivariate and cluster analysis within the language are described as well as the many vector and matrix operations which can be used to form multivariate analysis programs. The contents of the standard macro library relevant to multivariate analysis are also discussed.

11.
The class of single-index models (SIMs) has become an important tool for nonparametric regression analysis. As with any other nonparametric regression model, the selection of the bandwidth plays an important role in inference for SIMs. However, most results in the literature either take the bandwidths as externally given, or require impractical assumptions or very restrictive conditions for data-driven bandwidths. We examine the asymptotic properties of a popular bandwidth selection method based on cross-validation that is completely data-driven, under much weaker conditions than those assumed in the literature. We also show that the same bandwidth that is optimal for estimating the index vector can be used for nearly optimal error variance estimation through the method of varying cross-validation. A simulation study is presented to demonstrate the finite sample performance of the proposed procedures, based on which we recommend a simple 2-step procedure for bandwidth selection, index vector estimation, and error variance estimation.

12.
Arjun K. Gupta, J. Tang. Statistics, 2013, 47(4): 301-309
It is well known that many data, such as financial or demographic data, exhibit asymmetric distributions. In recent years, researchers have concentrated their efforts on modeling this asymmetry. The skew normal model is one such model that is skew and yet possesses many properties of the normal model. In this paper, a new multivariate skew model is proposed, along with its statistical properties. It includes the multivariate normal distribution and the multivariate skew normal distribution as special cases. The quadratic form of this random vector follows a χ² distribution. The roles of the parameters in the model are investigated using contour plots of bivariate densities.

13.
Most problems related to environmental studies are innately multivariate. In fact, more than one variable is usually measured at each spatial location. In multivariate geostatistical data analysis, where we intend to predict the value of a random vector at a new site that has no data, the cokriging method is used as the best linear unbiased predictor. In lattice data analysis, where probability modeling of the data is almost exclusively of concern, only the auto-Gaussian model has been used for continuous multivariate data. For discrete multivariate data, little work has been carried out. In this paper, an auto-multinomial model is suggested for analyzing multivariate discrete lattice data. The proposed method is illustrated by a real example of air pollution in Tehran, Iran.

14.
In this article, a new family of multivariate skew slash distributions is defined. According to the definition, a stochastic representation of the multivariate skew slash distribution is derived. The first four moments and measures of skewness and kurtosis of a random vector with the multivariate skew slash distribution are obtained. The distribution of quadratic forms for the multivariate skew slash distribution and the noncentral skew slash χ² distribution are studied. Maximum likelihood inference and a real data illustration are discussed. Finally, a potential extension of the multivariate skew slash distribution is discussed.

15.
The vector correlation coefficient and other measures of association play a very important role in statistics and especially in multivariate analysis. In this paper a new measure of association is proposed and its upper bound is presented by using a matrix trace Wielandt inequality. Also given are relevant results involving Wishart matrices widely used in multivariate analysis, and especially a new alternative for the relative gain of the covariance adjusted estimator of a vector of parameters.

16.
Likelihood cross-validation for kernel density estimation is known to be sensitive to extreme observations and heavy-tailed distributions. We propose a robust likelihood-based cross-validation method to select bandwidths in multivariate density estimations. We derive this bandwidth selector within the framework of robust maximum likelihood estimation. This method establishes a smooth transition from likelihood cross-validation for nonextreme observations to least squares cross-validation for extreme observations, thereby combining the efficiency of likelihood cross-validation and the robustness of least-squares cross-validation. We also suggest a simple rule to select the transition threshold. We demonstrate the finite sample performance and practical usefulness of the proposed method via Monte Carlo simulations and a real data application on Chinese air pollution.
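For orientation, ordinary (non-robust) likelihood cross-validation, the baseline that this abstract sets out to robustify, can be sketched in one dimension as follows. The Gaussian kernel, the simulated sample, the bandwidth grid, and the function names are assumptions made for illustration; the robust selector proposed in the paper is not reproduced here.

```python
import math
import random

def loo_log_likelihood(data, h):
    """Leave-one-out log-likelihood of a Gaussian-kernel density estimate."""
    n = len(data)
    norm = 1.0 / ((n - 1) * h * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i, xi in enumerate(data):
        # Kernel mass placed at xi by every other observation.
        s = sum(math.exp(-0.5 * ((xi - xj) / h) ** 2)
                for j, xj in enumerate(data) if j != i)
        # Guard against log(0) when an isolated point receives no kernel mass.
        total += math.log(max(s * norm, 1e-300))
    return total

def select_bandwidth(data, grid):
    """Pick the grid bandwidth that maximizes the leave-one-out log-likelihood."""
    return max(grid, key=lambda h: loo_log_likelihood(data, h))

rng = random.Random(1)
sample = [rng.gauss(0.0, 1.0) for _ in range(100)]   # simulated data
grid = [0.05 * k for k in range(1, 41)]              # candidate bandwidths 0.05 to 2.0
h_lcv = select_bandwidth(sample, grid)
print(h_lcv)
```

The sensitivity noted in the abstract is visible in the criterion itself: an extreme observation far from all others contributes a very large negative log term unless the bandwidth is large, so a few such points can pull the selected bandwidth upward.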

17.
Amparo Baíllo. Statistics, 2013, 47(6): 553-569
This work deals with estimating the vector of means of certain characteristics of small areas. In this context, a unit level multivariate model with correlated sampling errors is considered. An approximation is obtained for the mean-squared and cross-product errors of the empirical best linear unbiased predictors of the means, when model parameters are estimated either by maximum likelihood (ML) or by restricted ML. This approach has been implemented in a Monte Carlo study using social and labour data from the Spanish Labour Force Survey.

18.
A new estimator of the regression parameters is introduced in a multivariate multiple-regression model in which both the vector of explanatory variables and the vector of response variables are assumed to be random. The affine equivariant estimate matrix is constructed using the sign covariance matrix (SCM) where the sign concept is based on Oja's criterion function. The influence function and asymptotic theory are developed to consider robustness and limiting efficiencies of the SCM regression estimate. The estimate is shown to be consistent with a limiting multinormal distribution. The influence function, as a function of the length of the contamination vector, is shown to be linear in elliptic cases; for the least squares (LS) estimate it is quadratic. The asymptotic relative efficiencies with respect to the LS estimate are given in the multivariate normal as well as the t-distribution cases. The SCM regression estimate is highly efficient in the multivariate normal case and, for heavy-tailed distributions, it performs better than the LS estimate. Simulations are used to consider finite sample efficiencies with similar results. The theory is illustrated with an example.

19.
Robust nonparametric smoothers have proved effective at preserving edges in image denoising. As an extension, they should be capable of estimating multivariate surfaces containing discontinuities on the basis of a random spatial sampling. A crucial problem is the design of their coefficients, in particular those of the kernels which concern robustness. In this paper it is shown that bandwidths which regard smoothness can be consistently estimated, whereas those which concern robustness cannot be estimated with plug-in and cross-validation criteria. Heuristic and graphical methods are proposed for their selection and their efficacy is demonstrated in simulation experiments.

20.
Strategies for improving fixed non-negative kernel estimators have focused on reducing the bias, either by employing higher-order kernels or by adjusting the bandwidth locally. Intuitively, bandwidths in the tails should be relatively larger in order to reduce wiggles since there is less data available in the tails. We show that in regions where the density function is convex, it is theoretically possible to find local bandwidths such that the pointwise bias is exactly zero. The corresponding pointwise mean squared error converges at the parametric rate of O(n^{-1}) rather than the slower O(n^{-4/5}). These so-called zero-bias bandwidths are constant and are usually orders of magnitude larger than the optimal locally adaptive bandwidths predicted by asymptotic mean squared error analysis. We describe data-based algorithms for estimating zero-bias bandwidths over intervals where the density is convex. We find that our particular density estimator attains the usual O(n^{-4/5}) rate. However, we demonstrate that the algorithms can provide significant improvement in mean squared error, often clearly visually superior curves, and a new operating point in the usual bias-variance tradeoff.
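The two rates quoted in this abstract follow from the standard textbook asymptotics for a kernel density estimate with a second-order kernel, sketched below. The symbols are assumptions introduced here rather than notation from the paper: $K$ is the kernel, $h$ the bandwidth, $f$ the density, $\mu_2(K) = \int u^2 K(u)\,du$, and $R(K) = \int K(u)^2\,du$.

```latex
\begin{align*}
\operatorname{Bias}\{\hat f_h(x)\} &= \tfrac{1}{2} h^2 \mu_2(K) f''(x) + o(h^2),
&
\operatorname{Var}\{\hat f_h(x)\} &= \frac{R(K) f(x)}{n h} + o\!\left(\frac{1}{n h}\right), \\[4pt]
\operatorname{AMSE}(h) &= \tfrac{1}{4} h^4 \mu_2(K)^2 f''(x)^2 + \frac{R(K) f(x)}{n h},
&
h_{\mathrm{opt}} &\propto n^{-1/5}
\;\Longrightarrow\;
\operatorname{MSE} = O(n^{-4/5}).
\end{align*}
```

If instead a fixed bandwidth makes the pointwise bias exactly zero, the mean squared error equals the variance alone, which for a bandwidth that does not shrink with $n$ is of order $1/n$; this is the parametric rate $O(n^{-1})$ mentioned above.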
