Similar Documents
20 similar documents found.
1.
Mutual information (the Kullback–Leibler divergence between the joint distribution and the product of its marginals) can be viewed as a measure of multivariate association in a random vector. The definition incorporates the joint density as well as the marginal densities. We will focus on a representation of mutual information in terms of copula densities that is thus independent of the marginal distributions. This representation yields a different approach to estimating mutual information than the original definition does, as only the copula density has to be estimated. We review analytical properties and examples for selected distributions and discuss methods of nonparametric estimation of copula densities and hence of the mutual information from a sample. Based on a simulation study, we compare the performance of these estimators with respect to bias, standard deviation, and the root mean squared error. The Gaussian and Frank copulas are considered as examples.
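The copula representation gives MI = E[log c(U, V)], so a plug-in estimator only needs the copula density. A minimal sketch of that idea (not necessarily the estimators studied in the paper): rank-transform the sample to pseudo-observations, fit a kernel density on the normal-scores scale to sidestep boundary bias on the unit square, and average the log copula density. The Gaussian case has the closed-form value -0.5·log(1 - ρ²) for checking.

```python
import numpy as np
from scipy.stats import gaussian_kde, norm, rankdata

rng = np.random.default_rng(0)

# Bivariate Gaussian sample with correlation rho; true MI = -0.5*log(1 - rho^2).
rho, n = 0.6, 2000
x = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n)

# Pseudo-observations: ranks rescaled to (0, 1), making the estimate
# independent of the marginals, as the copula representation requires.
u = rankdata(x[:, 0]) / (n + 1)
v = rankdata(x[:, 1]) / (n + 1)

# Map to normal scores to avoid the boundary bias of a KDE on the unit square.
s = norm.ppf(np.column_stack([u, v]))
kde_joint = gaussian_kde(s.T)

# Copula density at the pseudo-observations via change of variables:
# c(u, v) = g(s1, s2) / (phi(s1) * phi(s2)), with g the KDE on normal scores.
log_c = np.log(kde_joint(s.T)) - norm.logpdf(s).sum(axis=1)

mi_hat = log_c.mean()  # MI = E[log c(U, V)] under the copula
print(f"estimated MI: {mi_hat:.3f}, true MI: {-0.5*np.log(1-rho**2):.3f}")
```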

2.
CORRECTING FOR KURTOSIS IN DENSITY ESTIMATION
Using a global window width kernel estimator to estimate an approximately symmetric probability density with high kurtosis usually leads to poor estimation because good estimation of the peak of the distribution leads to unsatisfactory estimation of the tails and vice versa. The technique proposed corrects for kurtosis via a transformation of the data before using a global window width kernel estimator. The transformation depends on a “generalised smoothing parameter” consisting of two real-valued parameters and a window width parameter which can be selected either by a simple graphical method or, for a completely data-driven implementation, by minimising an estimate of mean integrated squared error. Examples of real and simulated data demonstrate the effectiveness of this approach, which appears suitable for a wide range of symmetric, unimodal densities. Its performance is similar to ordinary kernel estimation in situations where the latter is effective, e.g. Gaussian densities. For densities like the Cauchy where ordinary kernel estimation is not satisfactory, our methodology offers a substantial improvement.
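A minimal sketch of the transform-then-estimate idea: map the data through a tail-shortening transformation t, apply an ordinary global window width KDE on the transformed scale, and back-transform with the Jacobian, f̂(x) = ĝ(t(x))·|t′(x)|. The arcsinh transform used here is an illustrative stand-in, not the paper's two-parameter family.

```python
import numpy as np
from scipy.stats import gaussian_kde, cauchy

rng = np.random.default_rng(1)
x = cauchy.rvs(size=1000, random_state=rng)

# Transform the data to tame the heavy tails, estimate the density on the
# transformed scale, then back-transform with the usual Jacobian factor:
#   f_hat(x) = g_hat(t(x)) * |t'(x)|.
t = np.arcsinh                              # illustrative stand-in transform
dt = lambda z: 1.0 / np.sqrt(1.0 + z**2)    # derivative of arcsinh

g_hat = gaussian_kde(t(x))

grid = np.linspace(-10, 10, 401)
f_transformed = g_hat(t(grid)) * dt(grid)   # transformation estimator
f_ordinary = gaussian_kde(x)(grid)          # ordinary global-bandwidth KDE

# Compare both to the true Cauchy density in the tails.
mask = np.abs(grid) > 4
err_tr = np.mean((f_transformed[mask] - cauchy.pdf(grid[mask]))**2)
err_or = np.mean((f_ordinary[mask] - cauchy.pdf(grid[mask]))**2)
print(f"tail MSE, transformation KDE: {err_tr:.2e}, ordinary KDE: {err_or:.2e}")
```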

3.
We will pursue a Bayesian nonparametric approach in the hierarchical mixture modelling of lifetime data in two situations: density estimation, when the distribution is a mixture of parametric densities with a nonparametric mixing measure, and accelerated failure time (AFT) regression modelling, when the same type of mixture is used for the distribution of the error term. The Dirichlet process is a popular choice for the mixing measure, yielding a Dirichlet process mixture model for the error; as an alternative, we also allow the mixing measure to be equal to a normalized inverse-Gaussian prior, built from normalized inverse-Gaussian finite dimensional distributions, as recently proposed in the literature. Markov chain Monte Carlo techniques will be used to estimate the predictive distribution of the survival time, along with the posterior distribution of the regression parameters. A comparison between the two models will be carried out on the grounds of their predictive power and their ability to identify the number of components in a given mixture density.

4.
It is well established that bandwidths exist that can yield an unbiased nonparametric kernel density estimate at points in particular regions (e.g. convex regions) of the underlying density. These zero-bias bandwidths have superior theoretical properties, including a 1/n convergence rate of the mean squared error. However, the explicit functional form of the zero-bias bandwidth has remained elusive. It is difficult to estimate these bandwidths and virtually impossible to achieve the higher-order rate in practice. This paper addresses these issues by taking a fundamentally different approach to the asymptotics of the kernel density estimator to derive a functional approximation to the zero-bias bandwidth. It develops a simple approximation algorithm that focuses on estimating these zero-bias bandwidths in the tails of densities, where the convexity conditions favourable to the existence of zero-bias bandwidths are more natural. The estimated bandwidths yield density estimates with mean squared error that is O(n^(-4/5)), the same rate as the mean squared error of density estimates with other choices of local bandwidths. Simulation studies and an illustrative example with air pollution data show that these estimated zero-bias bandwidths outperform other global and local bandwidth estimators in estimating points in the tails of densities.

5.
This paper considers a class of densities formed by taking the product of nonnegative polynomials and normal densities. These densities provide a rich class of distributions that can be used in modelling when faced with non-normal characteristics such as skewness and multimodality. In this paper we address inferential and computational issues arising in the practical implementation of this parametric family in the context of the linear model. Exact results are recorded for the conditional analysis of location-scale models and an importance sampling algorithm is developed for the implementation of a conditional analysis for the general linear model when using polynomial-normal distributions for the error.

6.
A new semiparametric method for density deconvolution is proposed, based on a model in which only the ratio of the unconvoluted to convoluted densities is specified parametrically. Deconvolution results from reweighting the terms in a standard kernel density estimator, where the weights are defined by the parametric density ratio. We propose that in practice, the density ratio be modelled on the log-scale as a cubic spline with a fixed number of knots. Parameter estimation is based on maximization of a type of semiparametric likelihood. The resulting asymptotic properties for our deconvolution estimator mirror the convergence rates in standard density estimation without measurement error when attention is restricted to our semiparametric class of densities. Furthermore, numerical studies indicate that for practical sample sizes our weighted kernel estimator can provide better results than the classical non-parametric kernel estimator for a range of densities outside the specified semiparametric class.

7.
In this article, we propose a flexible parametric (FP) approach for adjusting for covariate measurement errors in regression that can accommodate replicated measurements on the surrogate (mismeasured) version of the unobserved true covariate on all the study subjects or on a sub-sample of the study subjects as error assessment data. We utilize the general framework of the FP approach proposed by Hossain and Gustafson in 2009 for adjusting for covariate measurement errors in regression. The FP approach is then compared with the existing non-parametric approaches when error assessment data are available on the entire sample of the study subjects (complete error assessment data), considering covariate measurement error in a multiple logistic regression model. We also developed the FP approach when error assessment data are available on a sub-sample of the study subjects (partial error assessment data) and investigated its performance using both simulated and real life data. Simulation results reveal that, in comparable situations, the FP approach performs as well as or better than the competing non-parametric approaches in eliminating the bias that arises in the estimated regression parameters due to covariate measurement errors. It also results in better efficiency of the estimated parameters. Finally, the FP approach is found to perform adequately well in terms of bias correction, confidence coverage, and in achieving appropriate statistical power under partial error assessment data.

8.
It is well known that adaptive sequential nonparametric estimation of differentiable functions with assigned mean integrated squared error and minimax expected stopping time is impossible. In other words, no sequential estimator can compete with an oracle estimator that knows how many derivatives an estimated curve has. Differentiable functions are typical in probability density and regression models but not in spectral density models, where considered functions are typically smoother. This paper shows that for a large class of spectral densities, which includes spectral densities of classical autoregressive moving average processes, an adaptive minimax sequential estimation with assigned mean integrated squared error is possible. Furthermore, a two-stage sequential procedure is proposed, which is minimax and adaptive to smoothness of an underlying spectral density.

9.
The Wehrly–Johnson family of bivariate circular distributions is by far the most general one currently available for modelling data on the torus. It allows complete freedom in the specification of the marginal circular densities as well as the binding circular density which regulates any dependence that might exist between them. We propose a parametric bootstrap approach for testing the goodness-of-fit of Wehrly–Johnson distributions when the forms of their marginal and binding densities are assumed known. The approach admits the use of any test for toroidal uniformity, and we consider versions of it incorporating three such tests. Simulation is used to illustrate the operating characteristics of the approach when the underlying distribution is assumed to be bivariate wrapped Cauchy. An analysis of wind direction data recorded at a Texan weather station illustrates the use of the proposed goodness-of-fit testing procedure.
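A heavily simplified univariate sketch of the parametric-bootstrap machinery: fit a wrapped Cauchy with fixed location, probability-integral-transform the angles, and use a Rayleigh-type mean resultant length as a stand-in for the toroidal uniformity tests considered in the paper. The bivariate Wehrly–Johnson construction itself is not implemented here.

```python
import numpy as np
from scipy.stats import wrapcauchy
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)

def fit_c(theta):
    # ML fit of the wrapped Cauchy concentration c (location fixed at 0).
    nll = lambda c: -np.sum(wrapcauchy.logpdf(theta, c))
    return minimize_scalar(nll, bounds=(1e-3, 0.999), method="bounded").x

def rayleigh_stat(u):
    # Mean resultant length of the angles 2*pi*u: values far from 0 flag
    # departure from circular uniformity.
    return np.abs(np.exp(2j * np.pi * u).mean())

def gof_pvalue(theta, n_boot=300):
    c_hat = fit_c(theta)
    # Probability integral transform: uniform on the circle under H0.
    t_obs = rayleigh_stat(wrapcauchy.cdf(theta, c_hat))
    t_boot = np.empty(n_boot)
    for b in range(n_boot):
        # Parametric bootstrap: simulate from the fitted null model,
        # refit, and recompute the statistic.
        theta_b = wrapcauchy.rvs(c_hat, size=len(theta), random_state=rng)
        t_boot[b] = rayleigh_stat(wrapcauchy.cdf(theta_b, fit_c(theta_b)))
    return (1 + np.sum(t_boot >= t_obs)) / (1 + n_boot)

theta = wrapcauchy.rvs(0.5, size=200, random_state=rng)
print(f"bootstrap p-value under H0: {gof_pvalue(theta):.3f}")
```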

10.
This paper presents a method for Bayesian inference for the regression parameters in a linear model with independent and identically distributed errors that does not require the specification of a parametric family of densities for the error distribution. This method first selects a nonparametric kernel density estimate of the error distribution which is unimodal and based on the least-squares residuals. Once the error distribution is selected, the Metropolis algorithm is used to obtain the marginal posterior distribution of the regression parameters. The methodology is illustrated with data sets, and its performance relative to standard Bayesian techniques is evaluated using simulation results.
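A minimal sketch of the two-step procedure under a flat prior: estimate the error density by a KDE of the least-squares residuals (without the unimodality constraint used in the paper), then run random-walk Metropolis on the regression coefficients with that fixed density as the likelihood.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(3)

# Simulated linear model with non-normal (Laplace) errors.
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.laplace(scale=0.7, size=n)

# Step 1: kernel density estimate of the error distribution from the
# least-squares residuals.
beta_ls = np.linalg.lstsq(X, y, rcond=None)[0]
f_err = gaussian_kde(y - X @ beta_ls)

def log_lik(beta):
    # Likelihood of the residuals under the fixed, estimated error density.
    return np.sum(np.log(np.maximum(f_err(y - X @ beta), 1e-300)))

# Step 2: random-walk Metropolis over the regression coefficients
# (flat prior, so the posterior is proportional to the likelihood).
beta, ll = beta_ls.copy(), log_lik(beta_ls)
draws = []
for _ in range(5000):
    prop = beta + rng.normal(scale=0.05, size=2)
    ll_prop = log_lik(prop)
    if np.log(rng.uniform()) < ll_prop - ll:
        beta, ll = prop, ll_prop
    draws.append(beta)

print("posterior means:", np.mean(draws[1000:], axis=0))
```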

11.
We consider the problem of estimating a compactly supported density taking a Bayesian nonparametric approach. We define a Dirichlet mixture prior that, while selecting piecewise constant densities, has full support on the Hellinger metric space of all commonly dominated probability measures on a known bounded interval. We derive pointwise rates of convergence for the posterior expected density by studying the speed at which the posterior mass accumulates on shrinking Hellinger neighbourhoods of the sampling density. If the data are sampled from a strictly positive, α-Hölderian density, with α ∈ (0,1], then the optimal convergence rate n^(-α/(2α+1)) is obtained up to a logarithmic factor. Smoothing histograms by polygons, a continuous piecewise linear estimator is obtained that, for twice continuously differentiable, strictly positive densities satisfying boundary conditions, attains a rate comparable up to a logarithmic factor to the convergence rate n^(-4/5) for integrated mean squared error of kernel type density estimators.

12.
Multivariate density estimation plays an important role in investigating the mechanism of high-dimensional data. This article describes a nonparametric Bayesian approach to the estimation of multivariate densities. A general procedure is proposed for constructing Feller priors for multivariate densities and their theoretical properties as nonparametric priors are established. A blocked Gibbs sampling algorithm is devised to sample from the posterior of the multivariate density. A simulation study is conducted to evaluate the performance of the procedure.

13.
There are many approaches to the estimation of spectral density. Among parametric approaches, different divergences have been proposed for fitting a given parametric family of spectral densities; nonparametric approaches are also common when no model for the process can be specified. In this paper, we develop a local Whittle likelihood approach based on a general score function, special cases of which cover a wide range of applications. We establish the asymptotics of our general local Whittle estimator and present a comparison with other estimators. Additionally, for a special case, we construct the one-step-ahead predictor based on the form of the score function, and we show that it has a smaller prediction error than the classical exponentially weighted linear predictor. The numerical studies provided illustrate some interesting features of our local Whittle estimator.
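For orientation, a minimal sketch of the ordinary (global) Whittle likelihood fit that the paper's local, general-score approach extends, applied to an AR(1) process: minimize Σ_j [log f(λ_j; θ) + I(λ_j)/f(λ_j; θ)] over the Fourier frequencies. Nothing here implements the paper's generalization.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)

# Simulate an AR(1): x_t = phi * x_{t-1} + e_t.
n, phi_true, sigma = 1024, 0.6, 1.0
e = rng.normal(scale=sigma, size=n)
x = np.empty(n)
x[0] = e[0]
for t in range(1, n):
    x[t] = phi_true * x[t - 1] + e[t]

# Periodogram at the Fourier frequencies.
j = np.arange(1, (n - 1) // 2 + 1)
lam = 2 * np.pi * j / n
I = np.abs(np.fft.fft(x)[j]) ** 2 / (2 * np.pi * n)

def f_ar1(lam, phi, sig2):
    # Spectral density of an AR(1) process.
    return sig2 / (2 * np.pi * (1 - 2 * phi * np.cos(lam) + phi ** 2))

def whittle_nll(par):
    phi, log_sig2 = par
    f = f_ar1(lam, phi, np.exp(log_sig2))
    return np.sum(np.log(f) + I / f)

res = minimize(whittle_nll, x0=[0.0, 0.0], method="Nelder-Mead")
print(f"phi_hat = {res.x[0]:.3f} (true {phi_true}), "
      f"sigma2_hat = {np.exp(res.x[1]):.3f}")
```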

14.
The expansion, in standard form, consists of some 66 terms involving polynomials in a normal deviate, and cumulants and cumulant products to order ten. An assumed order of magnitude reduces these terms to eight groups. Sign patterns in the terms are not obvious. We take a number of Pearson densities and assess from the expansions a set of standard percentiles (1%, 5%, 95%, 99%). Validity of the assessments is pivoted on two alternative models: (i) the Bowman-Shenton algorithm for percentage points of Pearson densities, (ii) the 4-moment Johnson translation model. This approach has wide application since the models have proved to be remarkably reliable when compared, and also when compared with simulation assessments. A brief account is given of acceleration of convergence for the series, but there seems to be no analogue of the Padé or Levin algorithms.

The Cornish-Fisher application to the Fisher z-statistic is studied and the cumulants defined in general. Irwin's expression for the density of means from Pearson Type II is recalled. There is an error in the Cornish-Fisher treatment of the z-statistic, but this is one which has its source in the write-up. Again, the Irwin density in the general case has a factor missing.
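For concreteness, the familiar low-order truncation of the Cornish-Fisher expansion (four cumulants, versus the order-ten, 66-term expansion treated above), checked against a gamma density, a member of the Pearson Type III family:

```python
import numpy as np
from scipy.stats import norm, gamma

def cornish_fisher(p, mean, var, skew, ekurt):
    """Low-order Cornish-Fisher quantile approximation from the first
    four cumulants (a truncation of the full expansion in the text)."""
    z = norm.ppf(p)
    w = (z
         + (z**2 - 1) * skew / 6
         + (z**3 - 3*z) * ekurt / 24
         - (2*z**3 - 5*z) * skew**2 / 36)
    return mean + np.sqrt(var) * w

# Check against gamma(shape=4): mean a, var a, skew 2/sqrt(a), ex. kurt. 6/a.
a = 4.0
for p in (0.01, 0.05, 0.95, 0.99):
    approx = cornish_fisher(p, a, a, 2 / np.sqrt(a), 6 / a)
    print(f"p={p}: CF {approx:.3f} vs exact {gamma.ppf(p, a):.3f}")
```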

15.
We consider a random effects quantile regression analysis of clustered data and propose a semiparametric approach using empirical likelihood. The random regression coefficients are assumed independent with a common mean, following parametrically specified distributions. The common mean corresponds to the population-average effects of explanatory variables on the conditional quantile of interest, while the random coefficients represent cluster-specific deviations in the covariate effects. We formulate the estimation of the random coefficients as an estimating equations problem and use empirical likelihood to incorporate the parametric likelihood of the random coefficients. This yields a likelihood-like statistical criterion function, which we show is asymptotically concave in a neighborhood of the true parameter value, motivating its maximizer as a natural estimator. We use Markov chain Monte Carlo (MCMC) samplers in the Bayesian framework and propose the resulting quasi-posterior mean as an estimator. We show that the proposed estimator of the population-level parameter is asymptotically normal and that the estimators of the random coefficients are shrunk toward the population-level parameter in the first-order asymptotic sense. These asymptotic results do not require Gaussian random effects, and the empirical likelihood based likelihood-like criterion function is free of parameters related to the error densities. This makes the proposed approach both flexible and computationally simple. We illustrate the methodology with two real data examples.

16.
In earlier work (Gelfand and Smith, 1990, and Gelfand et al., 1990) a sampling-based approach using the Gibbs sampler was offered as a means for developing marginal posterior densities for a wide range of Bayesian problems, several of which were previously inaccessible. Our purpose here is two-fold. First, we flesh out the implementation of this approach for the calculation of arbitrary expectations of interest. Secondly, we offer a comparison with perhaps the most prominent approach for calculating posterior expectations, analytic approximation involving application of the Laplace method. Several illustrative examples are discussed as well. Clear advantages for the sampling-based approach emerge.
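A minimal sketch of the sampling-based approach for a normal model with semi-conjugate priors: alternate draws from the two full conditionals, then read arbitrary posterior expectations off the Gibbs output.

```python
import numpy as np

rng = np.random.default_rng(5)
y = rng.normal(loc=3.0, scale=2.0, size=50)
n, ybar = len(y), y.mean()

# Semi-conjugate priors: mu ~ N(m0, 1/t0), tau ~ Gamma(a0, rate=b0).
m0, t0, a0, b0 = 0.0, 0.01, 1.0, 1.0

mu, tau = ybar, 1.0
mus, taus = [], []
for it in range(6000):
    # Full conditional for the precision tau given mu.
    tau = rng.gamma(a0 + n / 2, 1.0 / (b0 + 0.5 * np.sum((y - mu) ** 2)))
    # Full conditional for the mean mu given tau.
    prec = t0 + n * tau
    mu = rng.normal((t0 * m0 + n * tau * ybar) / prec, np.sqrt(1.0 / prec))
    mus.append(mu)
    taus.append(tau)

# Posterior expectations as ergodic averages after burn-in.
burn = 1000
print(f"E[mu | y] ~ {np.mean(mus[burn:]):.3f}, "
      f"E[sigma | y] ~ {np.mean(1 / np.sqrt(taus[burn:])):.3f}")
```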

17.
Simon's two-stage designs are widely used in clinical trials to assess the activity of a new treatment. In practice, it is often the case that the second stage sample size differs from the planned one. For this reason, the critical value for the second stage is no longer valid for statistical inference. Existing approaches for making statistical inference are either based on asymptotic methods or not optimal. We propose an approach to maximize the power of a study while maintaining the type I error rate, where the type I error rate and power are calculated exactly from binomial distributions. The critical values of the proposed approach are numerically searched by an intelligent algorithm over the complete parameter space. The proposed approach is guaranteed to be at least as powerful as the conditional power approach, which is valid but not optimal. The power gain of the proposed approach can be substantial as compared to the conditional power approach. We apply the proposed approach to a real Phase II clinical trial.
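The exact binomial calculation that underlies this kind of approach, sketched for the classical optimal design reported for p0 = 0.1 versus p1 = 0.3 (r1/n1 = 1/10, r/n = 5/29). The paper's contribution is to re-search the stage-two critical value when the attained n2 differs from the plan; this sketch only shows the exact error-rate computation it builds on.

```python
from scipy.stats import binom

def reject_prob(p, n1, r1, n2, r):
    """Probability of declaring activity in a two-stage design:
    continue past stage 1 if responses x1 > r1, and reject H0 at the
    end if total responses exceed r. Computed exactly from binomials."""
    prob = 0.0
    for x1 in range(r1 + 1, n1 + 1):
        # P(stage-1 count = x1) * P(stage-2 count pushes total past r)
        prob += binom.pmf(x1, n1, p) * binom.sf(r - x1, n2, p)
    return prob

# Simon's optimal design for p0 = 0.1 vs p1 = 0.3: r1/n1 = 1/10, r/n = 5/29.
n1, r1, n, r = 10, 1, 29, 5
n2 = n - n1
print(f"type I error at p0=0.1: {reject_prob(0.10, n1, r1, n2, r):.4f}")
print(f"power at p1=0.3:        {reject_prob(0.30, n1, r1, n2, r):.4f}")
```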

18.
Sequential regression multiple imputation has emerged as a popular approach for handling incomplete data with complex features. In this approach, imputations for each missing variable are produced from a regression model using the other variables as predictors, in a cyclic manner. A normality assumption is frequently imposed on the error distributions in the conditional regression models for continuous variables, even though it rarely holds in real scenarios. We use a simulation study to investigate the performance of several sequential regression imputation methods when the error distribution is flat or heavy-tailed. The methods evaluated include sequential normal imputation and several of its extensions which adjust for non-normal error terms. The results show that all methods perform well for estimating the marginal mean and proportion, as well as the regression coefficient, when the error distribution is flat or moderately heavy-tailed. When the error distribution is strongly heavy-tailed, all methods retain their good performance for the mean, and the adjusted methods perform robustly for the proportion; but all methods can perform poorly for the regression coefficient because they cannot accommodate the extreme values well. We caution against the mechanical use of sequential regression imputation without model checking and diagnostics.
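A minimal two-variable sketch of the sequential (chained) regression idea under the normality assumption discussed above: cycle between the two conditional regressions, redrawing each block of missing values from the fitted normal predictive. Proper multiple imputation would additionally draw the regression parameters from their posterior and run several independent chains; this is a single simplified chain.

```python
import numpy as np

rng = np.random.default_rng(6)

# Two correlated variables, each partially missing.
n = 500
x1 = rng.normal(size=n)
x2 = 1.0 + 0.8 * x1 + rng.normal(scale=0.5, size=n)
x1m = np.where(rng.uniform(size=n) < 0.3, np.nan, x1)
x2m = np.where(rng.uniform(size=n) < 0.3, np.nan, x2)

def draw(y, x, miss, rng):
    """Regress y on x using rows where y was observed, then redraw the
    missing y's from the fitted normal predictive (mean + residual noise)."""
    X = np.column_stack([np.ones(n), x])
    beta = np.linalg.lstsq(X[~miss], y[~miss], rcond=None)[0]
    sigma = np.std(y[~miss] - X[~miss] @ beta)
    y = y.copy()
    y[miss] = X[miss] @ beta + rng.normal(scale=sigma, size=miss.sum())
    return y

m1, m2 = np.isnan(x1m), np.isnan(x2m)
y1 = np.where(m1, np.nanmean(x1m), x1m)   # crude starting fill
y2 = np.where(m2, np.nanmean(x2m), x2m)
for _ in range(10):                        # cycle through the variables
    y1 = draw(y1, y2, m1, rng)
    y2 = draw(y2, y1, m2, rng)

print(f"true mean of x2: {x2.mean():.3f}, after imputation: {y2.mean():.3f}")
```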

19.
We derive a class of higher-order kernels for estimation of densities and their derivatives, which can be viewed as an extension of the second-order Gaussian kernel. These kernels have some attractive properties such as smoothness, manageable convolution formulae, and Fourier transforms. One important application is the higher-order extension of exact calculations of the mean integrated squared error. The proposed kernels also have the advantage of simplifying computations of common window-width selection algorithms such as least-squares cross-validation. Efficiency calculations indicate that the Gaussian-based kernels perform almost as well as the optimal polynomial kernels when the order of the derivative being estimated is low.
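A minimal sketch using the standard fourth-order Gaussian-based kernel K(u) = (3 - u²)φ(u)/2, a representative member of this type of family (not necessarily the paper's exact construction), compared with the ordinary second-order Gaussian kernel at a common bandwidth:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)
x = rng.normal(size=500)

def kde(data, grid, h, kernel):
    # Generic kernel density estimator evaluated on a grid.
    u = (grid[:, None] - data[None, :]) / h
    return kernel(u).mean(axis=1) / h

phi = norm.pdf

def k4(u):
    # Fourth-order Gaussian-based kernel: 0.5 * (3 - u^2) * phi(u).
    # Integrates to 1 with vanishing second moment, so bias is O(h^4).
    return 0.5 * (3 - u**2) * phi(u)

grid = np.linspace(-4, 4, 201)
h = 0.5
f2 = kde(x, grid, h, phi)   # standard second-order Gaussian kernel
f4 = kde(x, grid, h, k4)    # higher-order version: lower bias, can dip below 0

true = norm.pdf(grid)
print(f"grid MSE, order 2: {np.mean((f2 - true)**2):.6f}, "
      f"order 4: {np.mean((f4 - true)**2):.6f}")
```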

20.
The bias of maximum likelihood estimators of the standard deviation of the response in location/scale regression models is considered. Results are obtained for a very wide family of densities for the response variable. These are used to propose point estimators with improved mean square error properties and to demonstrate the importance of bias correction in statistical inference when samples are moderately small.
