Similar Articles
20 similar articles found.
1.
2.
The problem of density estimation arises naturally in many contexts. In this paper, we consider the approach of using a piecewise constant function to approximate the underlying density. We present a new density estimation method based on random forests built on the Bayesian Sequential Partition (BSP) method (Lu, L., H. Jiang, and W. H. Wong. 2013. Multivariate density estimation by Bayesian sequential partitioning. Journal of the American Statistical Association 108(504):1402–1410). Extensive simulations are carried out with comparison to the kernel density estimation method, the BSP method, and four local kernel density estimation methods. The experimental results show that the new method provides accurate and reliable density estimates, even at the boundary, especially for i.i.d. data. In addition, the likelihood of the out-of-bag density estimates, a byproduct of the training process, is an effective hyperparameter selection criterion.

3.
Although Hartigan (1975) had already put forward the idea of connecting the identification of subpopulations with regions of high density of the underlying probability distribution, the development of methods for cluster analysis has largely shifted in other directions, for computational convenience. Current computational resources allow us to reconsider this formulation and to develop clustering techniques aimed directly at identifying local modes of the density. Given a set of observations, a nonparametric estimate of the underlying density function is constructed, and subsets of points with high density are formed through suitable manipulation of the associated Delaunay triangulation. The method is illustrated with some numerical examples.
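As a concrete illustration of this idea (a minimal sketch, not the authors' procedure: the density threshold and the rule for pruning the triangulation are assumptions):

```python
# Sketch: keep Delaunay edges whose endpoints both lie in a high-density
# region, then take connected components as clusters. The 60th-percentile
# density threshold is an illustrative assumption.
import numpy as np
from scipy.spatial import Delaunay
from scipy.stats import gaussian_kde
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])

dens = gaussian_kde(X.T)(X.T)          # density estimate at each point
high = dens > np.quantile(dens, 0.6)   # points in the high-density region

tri = Delaunay(X)
edges = set()                          # collect the triangulation's edges
for simplex in tri.simplices:
    for i in range(3):
        a, b = sorted((simplex[i], simplex[(i + 1) % 3]))
        edges.add((a, b))

# Keep only edges joining two high-density points.
kept = [(a, b) for a, b in edges if high[a] and high[b]]
n = len(X)
adj = coo_matrix((np.ones(len(kept)),
                  ([a for a, b in kept], [b for a, b in kept])),
                 shape=(n, n))
n_comp, labels = connected_components(adj, directed=False)
# Low-density points remain singletons; clusters are the components
# containing at least one high-density point.
```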

4.
In this paper, the simultaneous estimation of the precision parameters of k normal distributions is considered under the squared loss function in a decision-theoretic framework. Several classes of minimax estimators are derived by using the chi-square identity, and generalized Bayes minimax estimators are developed within these classes. It is also shown that the improvement on the unbiased estimators is characterized by a superharmonic function. This corresponds to Stein's [1981. Estimation of the mean of a multivariate normal distribution. Ann. Statist. 9, 1135–1151] result in simultaneous estimation of normal means.
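The abstract leaves the sampling model implicit; a minimal statement of the standard setup such an argument starts from (an assumption, for orientation):

```latex
% Assumed setup (not spelled out in the abstract): independent statistics
% S_i ~ sigma_i^2 * chi^2_{n_i}, precisions eta_i = 1/sigma_i^2, and loss
\[
  L(\delta,\eta) = \sum_{i=1}^{k}\left(\delta_i - \frac{1}{\sigma_i^2}\right)^{2}.
\]
% Since E[1/\chi^2_n] = 1/(n-2) for n > 2, the unbiased estimator that the
% minimax classes improve upon is \delta_i^{\mathrm{UB}} = (n_i - 2)/S_i.
```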

5.
In this paper, we extend Choi and Hall's [Data sharpening as a prelude to density estimation. Biometrika. 1999;86(4):941–947] data sharpening algorithm for kernel density estimation to interval-censored data. Data sharpening has several advantages, including bias and mean integrated squared error (MISE) reduction as well as increased robustness to bandwidth misspecification. Several interval metrics are explored for use with the kernel function in the data sharpening transformation. A simulation study based on randomly generated data is conducted to assess and compare the performance of each interval metric. It is found that sharpening reduces the bias, often with little effect on the variance, thus maintaining or reducing the overall MISE. Applications involving time to onset of HIV and running distances subject to measurement error are used for illustration.

6.
This paper derives EM and generalized EM (GEM) algorithms for calculating least absolute deviations (LAD) estimates of the parameters of linear and nonlinear regression models. It shows that Schlossmacher's iteratively reweighted least squares algorithm for calculating LAD estimates (E. J. Schlossmacher, Journal of the American Statistical Association 68:857–859, 1973) is an EM algorithm. A GEM algorithm for computing LAD estimates of the parameters of nonlinear regression models is also provided and is applied in some examples.
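A minimal sketch of the Schlossmacher-style iteration for the linear case: each step solves a weighted least squares problem with weights 1/|residual|. The epsilon floor guarding against zero residuals is an implementation assumption, not part of the original algorithm.

```python
# Sketch: iteratively reweighted least squares for linear LAD regression.
import numpy as np

def lad_irls(X, y, n_iter=50, eps=1e-8):
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # OLS starting value
    for _ in range(n_iter):
        r = y - X @ beta
        w = 1.0 / np.maximum(np.abs(r), eps)      # IRLS weights 1/|residual|
        WX = X * w[:, None]                       # solve (X'WX) beta = X'Wy
        beta = np.linalg.solve(X.T @ WX, X.T @ (w * y))
    return beta

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
y = X @ np.array([1.0, 2.0]) + 0.1 * rng.standard_cauchy(200)
print(lad_irls(X, y))   # close to [1, 2] despite heavy-tailed noise
```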

7.
This paper presents a new method to estimate the quantiles of generic statistics by combining the concept of random weighting with importance resampling. The method converts the problem of quantile estimation into a dual problem of tail probability estimation. Random weighting theory is established to calculate the optimal resampling weights for estimating tail probabilities via sequential variance minimization. Subsequently, the quantile estimate is constructed using the obtained optimal resampling weights. Experimental results on real and simulated data sets demonstrate that the proposed random weighting method can effectively estimate the quantiles of generic statistics.

8.
This paper presents a new random weighting-based adaptive importance resampling method to estimate the sampling distribution of a statistic. A random weighting-based cross-entropy procedure is developed to iteratively calculate the optimal resampling probability weights by minimizing the Kullback–Leibler distance between the optimal importance resampling distribution and a family of parameterized distributions. Subsequently, the random weighting estimate of the sampling distribution is constructed from the obtained optimal importance resampling distribution. The convergence of the proposed method is rigorously proved. Simulation and experimental results demonstrate that the proposed method can effectively estimate the sampling distribution of a statistic.

9.
In this paper, we consider the problem of estimating f(0), the value of a density f at the left endpoint 0. Nonparametric estimation of f(0) is rather formidable due to the boundary effects that occur in nonparametric curve estimation. It is well known that the usual kernel density estimates require modification when estimating the density near the endpoints of the support. Here we investigate the local polynomial smoothing technique as an alternative method for this problem. We observe that our density estimator also possesses desirable properties such as automatic adaptability to boundary effects near endpoints. We also obtain an 'optimal kernel' for estimating the density at endpoints as the solution of a variational problem. Two bandwidth variation schemes are discussed and investigated in a Monte Carlo study.
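A small demonstration of the boundary effect the paper addresses (illustrative only; this is the uncorrected kernel estimator, not the local polynomial estimator): near the endpoint 0, roughly half of the kernel mass spills outside the support, so the density is underestimated by about a factor of two.

```python
# Boundary effect demo: for Exp(1) data the true density at the endpoint
# is f(0) = 1, but an unmodified Gaussian kernel estimator returns ~0.5.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(2)
x = rng.exponential(size=5000)
fhat = gaussian_kde(x)
print(fhat(0.0))   # about 0.5 rather than the true value 1.0
```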

10.
Shipping and shipping services are a key industry of great importance to the economy of Cyprus and the wider European Union. Assessment, management and future steering of the industry, and its associated economy, are carried out by a range of organisations and are of direct interest to a number of stakeholders. This article presents an analysis of shipping credit flow data: an important and archetypal series whose analysis is hampered by rapid changes of variance. Our analysis uses the recently developed data-driven Haar–Fisz transformation, which enables accurate trend estimation and successful prediction in such situations. Our trend estimation is augmented by bootstrap confidence bands, new in this context. The good performance of the data-driven Haar–Fisz transform contrasts with the poor performance exhibited by popular and established variance stabilisation alternatives: the Box–Cox, logarithm and square root transformations.
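The article's transform is data-driven; the sketch below is the classical Poisson Haar–Fisz transform on which it builds (an assumption for illustration: the data-driven version estimates the variance-mean relationship instead of assuming Poisson behaviour).

```python
# Sketch: classical Haar-Fisz variance-stabilising transform for a count
# series of dyadic length.
import numpy as np

def haar_fisz(v):
    J = int(np.log2(len(v)))        # len(v) must be a power of two
    s = v.astype(float)
    details = []
    for _ in range(J):              # Haar smooth/detail decomposition
        sm = (s[0::2] + s[1::2]) / 2.0
        d = (s[0::2] - s[1::2]) / 2.0
        # Fisz step: studentise each detail by the local mean.
        f = np.divide(d, np.sqrt(sm), out=np.zeros_like(d), where=sm > 0)
        details.append(f)
        s = sm
    u = s                           # reconstruct with stabilised details
    for f in reversed(details):
        nxt = np.empty(2 * len(u))
        nxt[0::2] = u + f
        nxt[1::2] = u - f
        u = nxt
    return u

x = np.random.default_rng(3).poisson(lam=20.0, size=256)
print(haar_fisz(x).var())   # variance no longer scales with the mean
```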

11.
This paper investigates the problem of parameter estimation in statistical models when the observations are intervals assumed to be related to underlying crisp realizations of a random sample. The proposed approach relies on extending the likelihood function to the interval setting. A maximum likelihood estimate of the parameter of interest may then be defined as a crisp value maximizing the generalized likelihood function. Using the expectation-maximization (EM) algorithm to solve this maximization problem yields the so-called interval-valued EM algorithm (IEM), which makes it possible to solve a wide range of statistical problems involving interval-valued data. To show the performance of IEM, two classical problems are illustrated: univariate normal mean and variance estimation from interval-valued samples, and multiple linear/nonlinear regression with crisp inputs and interval output.

12.
We show that the maximum likelihood estimators (MLEs) of the fixed effects and within-cluster correlation are consistent in a heteroscedastic nested-error regression (HNER) model with completely unknown within-cluster variances under mild conditions. The result implies that the empirical best linear unbiased prediction (EBLUP) method for small area estimation is valid in such a case. We also show that ignoring the heteroscedasticity can lead to inconsistent estimation of the within-cluster correlation and inferior predictive performance. A jackknife measure of uncertainty for the EBLUP is developed under the HNER model. Simulation studies are carried out to investigate the finite-sample performance of the EBLUP and MLE under the HNER model, with comparisons to those under the nested-error regression model in various situations, as well as that of the jackknife measure of uncertainty. The well-known Iowa crops data are used for illustration. The Canadian Journal of Statistics 40: 588–603; 2012 © 2012 Statistical Society of Canada
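The abstract does not write out the model; one parameterization consistent with a single within-cluster correlation rho is the following (an assumption, for orientation):

```latex
% Heteroscedastic nested-error regression, cluster i = 1..m, unit j = 1..n_i:
\[
  y_{ij} = x_{ij}^{\top}\beta + v_i + e_{ij}, \qquad
  v_i \sim N\!\left(0, \rho\,\sigma_i^2\right), \qquad
  e_{ij} \sim N\!\left(0, (1-\rho)\,\sigma_i^2\right),
\]
% with all effects independent, so that Corr(y_{ij}, y_{ik}) = rho for
% j != k while the cluster variances sigma_i^2 remain completely unknown.
```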

13.
In this paper, we present an algorithm for clustering based on univariate kernel density estimation, named ClusterKDE. It consists of an iterative procedure in which, at each step, a new cluster is obtained by minimizing a smooth kernel function. Although in our applications we have used the univariate Gaussian kernel, any smooth kernel function can be used. The proposed algorithm has the advantage of not requiring the number of clusters a priori. Furthermore, the ClusterKDE algorithm is very simple, easy to implement, well defined, and stops in a finite number of steps; that is, it always converges regardless of the initial point. We also illustrate our findings with numerical experiments obtained by implementing the algorithm in Matlab and applying it to practical problems. The results indicate that the ClusterKDE algorithm is competitive and fast when compared with the well-known Clusterdata and K-means algorithms used by Matlab for clustering data.
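Not the ClusterKDE iteration itself, but a minimal sketch of the underlying idea of clustering univariate data from a Gaussian KDE, here by cutting at the density antimodes (the grid resolution and cut rule are assumptions):

```python
# Sketch: univariate clustering from a Gaussian KDE by cutting at the
# local minima (antimodes) of the estimated density.
import numpy as np
from scipy.stats import gaussian_kde
from scipy.signal import argrelmin

rng = np.random.default_rng(4)
x = np.concatenate([rng.normal(-3, 1, 150), rng.normal(3, 1, 150)])

grid = np.linspace(x.min(), x.max(), 512)
dens = gaussian_kde(x)(grid)
cuts = grid[argrelmin(dens)[0]]      # antimodes act as cluster boundaries
labels = np.searchsorted(cuts, x)    # cluster index for each observation
print(len(cuts) + 1, "clusters")     # number found, not fixed a priori
```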

14.
In the present paper, we derive lower bounds for the risk of nonparametric empirical Bayes estimators. In order to attain the optimal convergence rate, we propose a generalization of the linear empirical Bayes estimation method which takes advantage of the flexibility of wavelet techniques. We represent the empirical Bayes estimator as a wavelet series expansion and estimate the coefficients by minimizing the prior risk of the estimator. As a result, estimation of the wavelet coefficients requires the solution of a well-posed, low-dimensional sparse system of linear equations. The dimension of the system depends on the size of the wavelet support and the smoothness of the Bayes estimator. An adaptive choice of the resolution level is carried out using the method of Lepski et al. (1997). The method is computationally efficient and provides asymptotically optimal adaptive EB estimators. The theory is supplemented by numerous examples.

15.
We consider asymptotic properties of the maximum likelihood and related estimators in a clustered logistic joinpoint model with an unknown joinpoint. Sufficient conditions are given for the consistency of confidence bounds produced by the parametric bootstrap; one of the required conditions is that the true location of the joinpoint is not at one of the observation times. A simulation study is presented to illustrate the lack of consistency of the bootstrap confidence bounds when the joinpoint is at an observation time. A removal algorithm is presented which corrects this problem, but at the price of an increased mean squared error. Finally, the methods are applied to data on yearly cancer mortality in the US for individuals aged 65 and over.

16.
In a multinomial model, the sample space is partitioned into a disjoint union of cells. The partition is usually immutable during sampling of the cell counts. In this paper, we extend the multinomial model to the incomplete multinomial model by relaxing the constant partition assumption, allowing the cells to vary and the counts collected from non-disjoint cells to be modeled in an integrated manner for inference on the common underlying probability. The incomplete multinomial likelihood is parameterized by the complete-cell probabilities from the most refined partition. Its sufficient statistics include the variable-cell formation, observed as an indicator matrix, and all cell counts. With externally imposed structures on the cell formation process, it reduces to special models including the Bradley–Terry model, the Plackett–Luce model, etc. Since the conventional method, which solves for the zeros of the score functions, is unfruitful, we develop a new approach that establishes a simpler set of estimating equations for the maximum likelihood estimate (MLE): we seek the simultaneous maximization of all multiplicative components of the likelihood by fitting each component into an inequality. As a consequence, our estimation amounts to solving the system of equality-attainment conditions for these inequalities. The resulting MLE equations are simple and immediately invite a fixed-point iteration algorithm, referred to as the weaver algorithm. The weaver algorithm is short and amenable to parallel implementation. We also derive the asymptotic covariance of the MLE, verify the main results with simulations, and compare the weaver algorithm with an MM/EM algorithm based on fitting a Plackett–Luce model to a benchmark data set.
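The weaver update itself is not given in the abstract; for flavor, here is a sketch of the classical MM fixed-point iteration for the Bradley–Terry special case the abstract names (Zermelo's algorithm), which has the same fixed-point character:

```python
# Sketch: MM fixed-point iteration for the Bradley-Terry special case.
# The weaver algorithm generalizes this style of update; its exact form
# is not given in the abstract.
import numpy as np

def bradley_terry_mm(wins, n_iter=200):
    """wins[i, j] = number of times item i beat item j."""
    comps = wins + wins.T              # comparisons between i and j
    w = wins.sum(axis=1)               # total wins of each item
    p = np.ones(wins.shape[0]) / wins.shape[0]
    for _ in range(n_iter):
        denom = (comps / (p[:, None] + p[None, :])).sum(axis=1)
        p = w / denom                   # fixed-point update
        p /= p.sum()                    # fix the scale
    return p

wins = np.array([[0, 3, 5],
                 [1, 0, 4],
                 [0, 2, 0]])
print(bradley_terry_mm(wins))          # MLE of the worth parameters
```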

17.
In this paper we propose a computationally efficient algorithm to estimate the parameters of a 2-D sinusoidal model in the presence of stationary noise. The estimators obtained by the proposed algorithm are consistent and asymptotically equivalent to the least squares estimators. Monte Carlo simulations are performed for different sample sizes, and it is observed that the performance of the proposed method is quite satisfactory and equivalent to that of the least squares estimators. The main advantage of the proposed method is that the estimators can be obtained in a finite number of iterations; in fact, it is shown that, starting from the average of periodogram estimators, the proposed algorithm converges in only three steps. One synthesized texture data set and one original texture data set are analyzed using the proposed algorithm for illustration.

18.
Multivariate shrinkage estimation of small area means and proportions
The familiar (univariate) shrinkage estimator of a small area mean or proportion combines information from the small area and a national survey. We define a multivariate shrinkage estimator which also combines information across subpopulations and outcome variables. The superiority of multivariate shrinkage over univariate shrinkage, and of univariate shrinkage over the unbiased (sample) means, is illustrated with examples of estimating local area rates of economic activity in subpopulations defined by ethnicity, age and sex. The examples use the sample of anonymized records of individuals from the 1991 UK census. The method requires no distributional assumptions but relies on the appropriateness of the quadratic loss function. The implementation of the method involves minimal computing outlay. Multivariate shrinkage is particularly effective when the area-level means are highly correlated and the sample means of one or a few components have small sampling and between-area variances. Estimates for subpopulations based on small samples can be greatly improved by incorporating information from subpopulations with larger sample sizes.
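The univariate estimator the abstract starts from can be written as a convex combination of the area and national means (the standard composite form; the description of the weight is a general statement, not the paper's exact formula):

```latex
% Univariate (composite) shrinkage for area d:
\[
  \hat{\theta}_d = (1 - w_d)\,\bar{y}_d + w_d\,\bar{y},
\]
% where the weight w_d grows as the sampling variance of \bar{y}_d grows
% relative to the between-area variance. The multivariate version replaces
% the scalar weight by a matrix, borrowing strength across subpopulations
% and outcome variables as well.
```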

19.
Conventionally, a ridge parameter is estimated as a function of the regression parameters based on ordinary least squares. In this article, we propose an iterative procedure instead of the one-step, conventional ridge method. Additionally, we construct an indicator that measures the potential degree of improvement in mean squared error when ridge estimates are employed. Simulations show that our methods are appropriate for a wide class of nonlinear models, including generalized linear models and proportional hazards (PH) regressions. The method is applied to a PH regression with highly collinear covariates in a cancer recurrence study.
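The abstract does not give the update; as one concrete instance of such an iterative procedure, the sketch below iterates the familiar Hoerl–Kennard choice k = p·σ̂²/‖β̂‖² to a fixed point (the functional form is an assumption; the article's procedure may differ in detail):

```python
# Sketch: iterating a Hoerl-Kennard-type ridge parameter to a fixed point,
# instead of stopping at the one-step OLS-based estimate.
import numpy as np

def iterative_ridge(X, y, n_iter=20):
    n, p = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]        # OLS start
    sigma2 = np.sum((y - X @ beta) ** 2) / (n - p)      # residual variance
    k = p * sigma2 / np.sum(beta ** 2)                  # one-step choice
    for _ in range(n_iter):
        beta = np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)
        k = p * sigma2 / np.sum(beta ** 2)              # re-estimate k
    return beta, k

rng = np.random.default_rng(5)
z = rng.normal(size=(100, 1))
X = z + 0.01 * rng.normal(size=(100, 3))   # highly collinear design
y = X @ np.array([1.0, 1.0, 1.0]) + rng.normal(size=100)
print(iterative_ridge(X, y))
```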

20.
Serfling and Xiao [A contribution to multivariate L-moments: L-comoment matrices. J Multivariate Anal. 2007;98:1765–1781] extended L-moment theory to the multivariate setting. In the present paper, we focus on two-dimensional random vectors to establish a link between bivariate L-moments (BLM) and the underlying bivariate copula functions. This connection provides a new estimate of the dependence parameters of bivariate statistical data. An extensive simulation study is carried out to compare estimators based on the BLM, maximum likelihood, minimum distance and a rank-based approximate Z-estimation. The results show that, as the sample size increases, BLM-based estimation performs better as far as bias and computation time are concerned. Moreover, its root mean squared error is quite reasonable and, in general, less sensitive to outliers than those of the methods cited above. Further, the proposed BLM method is an easy-to-use tool for the estimation of multiparameter copula models. A generalization of the BLM estimation method to the multivariate case is discussed.
