1.
In conditional logspline modelling, the logarithm of the conditional density function, log f(y|x), is modelled by using polynomial splines and their tensor products. The parameters of the model (coefficients of the spline functions) are estimated by maximizing the conditional log-likelihood function. The resulting estimate is a density function (positive and integrating to one) and is twice continuously differentiable. The estimate is used further to obtain estimates of regression and quantile functions in a natural way. An automatic procedure for selecting the number of knots and knot locations based on minimizing a variant of the AIC is developed. An example with real data is given. Finally, extensions and further applications of conditional logspline models are discussed.
2.
We present a local density estimator based on first-order statistics. To estimate the density at a point, x, the original sample is divided into subsets and the average minimum sample distance to x over all such subsets is used to define the density estimate at x. The tuning parameter is thus the number of subsets instead of the typical bandwidth of kernel or histogram-based density estimators. The proposed method is similar to nearest-neighbor density estimators but it provides smoother estimates. We derive the asymptotic distribution of this minimum sample distance statistic to study globally optimal values for the number and size of the subsets. Simulations are used to illustrate and compare the convergence properties of the estimator. The results show that the method provides good estimates of a wide variety of densities without changes of the tuning parameter, and that it offers competitive convergence performance.
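The subset-and-minimum-distance idea in this abstract can be sketched roughly as follows. The abstract does not state the estimator's formula, so this sketch assumes the one-dimensional large-subset approximation E[min distance] ≈ 1 / (2 m f(x)) for subsets of size m; the function name `min_distance_density` and the uniform test sample are hypothetical and are not taken from the paper.

```python
import random

def min_distance_density(sample, x, n_subsets, rng=None):
    """Estimate a 1-D density at x from average minimum distances (hypothetical helper)."""
    rng = rng or random.Random(0)
    data = list(sample)
    rng.shuffle(data)
    m = len(data) // n_subsets  # points per subset
    if m < 1:
        raise ValueError("n_subsets must not exceed the sample size")
    # Minimum distance from x to the points of each disjoint subset.
    min_dists = [
        min(abs(v - x) for v in data[i * m:(i + 1) * m])
        for i in range(n_subsets)
    ]
    mean_min = sum(min_dists) / n_subsets
    # Assumed approximation: E[min distance] ~ 1 / (2 * m * f(x)) in one dimension.
    return 1.0 / (2.0 * m * mean_min)

# Illustrative data: a Uniform(0, 1) sample, whose true density at 0.5 is 1.
rng = random.Random(1)
uniform_sample = [rng.random() for _ in range(40000)]
estimate = min_distance_density(uniform_sample, 0.5, n_subsets=800)
```

Here the number of subsets plays the role of the tuning parameter: more subsets mean smaller subsets and a more local, noisier estimate.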
3.
It is well known that the traditional Pearson correlation in many cases fails to capture non-linear dependence structures in bivariate data. Other scalar measures capable of capturing non-linear dependence exist. A common disadvantage of such measures, however, is that they cannot distinguish between negative and positive dependence, and typically the alternative hypothesis of the accompanying test of independence is simply “dependence”. This paper discusses how a newly developed local dependence measure, the local Gaussian correlation, can be used to construct local and global tests of independence. A global measure of dependence is constructed by aggregating local Gaussian correlation on subsets of \(\mathbb{R}^{2}\), and an accompanying test of independence is proposed. Choice of bandwidth is based on likelihood cross-validation. Properties of this measure and asymptotics of the corresponding estimate are discussed. A bootstrap version of the test is implemented and tried out on both real and simulated data. The performance of the proposed test is compared to the Brownian distance covariance test. Finally, when the hypothesis of independence is rejected, local independence tests are used to investigate the cause of the rejection.
4.
We establish weak and strong posterior consistency of Gaussian process priors studied by Lenk [1988. The logistic normal distribution for Bayesian, nonparametric, predictive densities. J. Amer. Statist. Assoc. 83 (402), 509–516] for density estimation. Weak consistency is related to the support of a Gaussian process in the sup-norm topology which is explicitly identified for many covariance kernels. In fact we show that this support is the space of all continuous functions when the usual covariance kernels are chosen and an appropriate prior is used on the smoothing parameters of the covariance kernel. We then show that a large class of Gaussian process priors achieve weak as well as strong posterior consistency (under some regularity conditions) at true densities that are either continuous or piecewise continuous.
5.
《Journal of statistical planning and inference》2006,136(3):839-859
In this paper, we provide a large bandwidth analysis for a class of local likelihood methods. This work complements the small bandwidth analysis of Park et al. (Ann. Statist. 30 (2002) 1480). Our treatment is more general than the large bandwidth analysis of Eguchi and Copas (J. Roy. Statist. Soc. B 60 (1998) 709). We provide a higher-order asymptotic analysis for the risk of the local likelihood density estimator, from which a direct comparison between various versions of local likelihood can be made. The present work, being combined with the small bandwidth results of Park et al. (2002), gives an optimal size of the bandwidth which depends on the degree of departure of the underlying density from the proposed parametric model.
6.
M. Di Marzio S. Fensore A. Panzera C. C. Taylor 《Journal of Statistical Computation and Simulation》2016,86(13):2560-2572
Local likelihood has been mainly developed from an asymptotic point of view, with little attention to finite sample size issues. The present paper provides simulation evidence of how likelihood density estimation performs in practice from two points of view. First, we explore the impact of the normalization step of the final estimate; second, we show the effectiveness of higher order fits in identifying modes present in the population when small sample sizes are available. We refer to circular data; nevertheless, it is easily seen that our findings extend straightforwardly to the Euclidean setting, where they appear to be somewhat new.
7.
《Journal of statistical planning and inference》1999,77(1):37-50
The problem of selecting the bandwidth for optimal kernel density estimation at a point is considered. A class of local bandwidth selectors which minimize smoothed bootstrap estimates of mean-squared error in density estimation is introduced. It is proved that the bandwidth selectors in the class achieve optimal relative rates of convergence, dependent upon the local smoothness of the target density. Practical implementation of the bandwidth selection methodology is discussed. The use of Gaussian-based kernels to facilitate computation of the smoothed bootstrap estimate of mean-squared error is proposed. The performance of the bandwidth selectors is investigated empirically.
8.
Sonia Petrone 《Revue canadienne de statistique》1999,27(1):105-126
We propose a Bayesian nonparametric procedure for density estimation, for data in a closed, bounded interval, say [0,1]. To this aim, we use a prior based on Bernstein polynomials. This corresponds to expressing the density of the data as a mixture of given beta densities, with random weights and a random number of components. The density estimate is then obtained as the corresponding predictive density function. Comparison with classical and Bayesian kernel estimates is provided. The proposed procedure is illustrated in an example; an MCMC algorithm for approximating the estimate is also discussed.
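To make the "mixture of given beta densities" concrete, the sketch below evaluates a Bernstein polynomial density for one fixed set of weights, using Beta(j, k - j + 1) components for j = 1, …, k. In the abstract the weights and the number of components are random under the prior; the fixed weights and the function name `bernstein_density` are illustrative assumptions only, and no MCMC is attempted.

```python
import math

def bernstein_density(x, weights):
    """Mixture of Beta(j, k - j + 1) densities on [0, 1] with the given weights."""
    k = len(weights)
    total = 0.0
    for j, w in enumerate(weights, start=1):
        # Beta(j, k - j + 1) density: k * C(k-1, j-1) * x^(j-1) * (1 - x)^(k-j)
        beta_pdf = k * math.comb(k - 1, j - 1) * x ** (j - 1) * (1.0 - x) ** (k - j)
        total += w * beta_pdf
    return total

# Equal weights reproduce the uniform density on [0, 1].
flat = bernstein_density(0.3, [0.25, 0.25, 0.25, 0.25])
# Weights that increase with j push mass towards 1.
skewed = bernstein_density(0.9, [0.1, 0.2, 0.3, 0.4])
```

Because every component is a proper density, any non-negative weights that sum to one give a proper density on [0, 1].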
9.
Gaussian proposal density using moment matching in SMC methods
In this article we introduce a new Gaussian proposal distribution to be used in conjunction with the sequential Monte Carlo (SMC) method for solving non-linear filtering problems. The proposal, in line with the recent trend, incorporates the current observation. The introduced proposal is characterized by the exact moments obtained from the dynamical system. This is in contrast with recent works where the moments are approximated either numerically or by linearizing the observation model. We show further that the newly introduced proposal performs better than other similar proposal functions which also incorporate both state and observations.
This work was supported by a research grant from THALES Nederland BV.
10.
A new procedure is proposed for deriving variable bandwidths in univariate kernel density estimation, based upon likelihood cross-validation and an analysis of a Bayesian graphical model. The procedure admits bandwidth selection which is flexible in terms of the amount of smoothing required. In addition, the basic model can be extended to incorporate local smoothing of the density estimate. The method is shown to perform well in both theoretical and practical situations, and we compare our method with those of Abramson (The Annals of Statistics 10: 1217–1223) and Sain and Scott (Journal of the American Statistical Association 91: 1525–1534). In particular, we note that in certain cases, the Sain and Scott method performs poorly even with relatively large sample sizes. We compare various bandwidth selection methods using standard mean integrated square error criteria to assess the quality of the density estimates. We study situations where the underlying density is assumed both known and unknown, and note that in practice, our method performs well when sample sizes are small. In addition, we also apply the methods to real data, and again we believe our methods perform at least as well as existing methods.
11.
Ethan Anderes 《Journal of statistical planning and inference》2011,141(3):1183-1193
We investigate the problem of estimating a smooth invertible transformation f when observing independent samples \(X_1,\ldots,X_n \sim P\circ f\) where P is a known measure. We focus on the two-dimensional case where P and f are defined on \(\mathbb{R}^{2}\). We present a flexible class of smooth invertible transformations in two dimensions with variational equations for optimizing over the classes, then study the problem of estimating the transformation f by penalized maximum likelihood estimation. We apply our methodology to the case when \(P\circ f\) has a density with respect to Lebesgue measure on \(\mathbb{R}^{2}\) and demonstrate improvements over kernel density estimation on three examples.
12.
Patrick Marsh 《Communications in Statistics - Theory and Methods》2013,42(3):332-339
Conditional information measures the information in a sample for an interest parameter in the presence of a nuisance parameter. In the context of Gaussian likelihoods this paper first derives conditions under which a projection of the data may reduce conditional information to zero. These are then applied in the context of time series regressions, and inference on a covariance parameter, such as with either autoregressive or moving average errors. It is shown that regressing out very common regressors, such as a linear trend or dummy variable, can imply that conditional information is zero in the case of non-stationary autoregressions or non-invertible moving averages, respectively.
13.
In this paper, we introduce a new nonparametric estimation procedure of the conditional density of a scalar response variable given a random variable taking values in a semi-metric space. Under some general conditions, we establish both the pointwise and the uniform almost-complete consistencies with convergence rates of the conditional density estimator related to this estimation procedure. Moreover, we give some particular cases of our results which can also be considered as novel in the finite-dimensional setting. Notice also that the results of this paper are used to derive some asymptotic properties of the local linear estimator of the conditional mode.
14.
When estimating loss distributions in insurance, large and small losses are usually split because it is difficult to find a simple parametric model that fits all claim sizes. This approach involves determining the threshold level between large and small losses. In this article, a unified approach to the estimation of loss distributions is presented. We propose an estimator obtained by transforming the data set with a modification of the Champernowne cdf and then estimating the density of the transformed data by use of the classical kernel density estimator. We investigate the asymptotic bias and variance of the proposed estimator. In a simulation study, the proposed method shows a good performance. We also present two applications dealing with claims costs in insurance.
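The transform-then-smooth idea in this abstract can be sketched as below. The abstract does not give the form of the modified Champernowne cdf, so the three-parameter form T(x) = ((x + c)^α − c^α) / ((x + c)^α + (M + c)^α − 2c^α) used here is an assumption, as are the Gaussian kernel, the absence of any boundary correction, the parameter values, and the made-up claim sizes.

```python
import math

def champernowne_cdf(x, alpha, M, c):
    """Assumed modified Champernowne cdf on x >= 0 (form not given in the abstract)."""
    a, b, k = (x + c) ** alpha, (M + c) ** alpha, c ** alpha
    return (a - k) / (a + b - 2.0 * k)

def champernowne_pdf(x, alpha, M, c):
    """Derivative of champernowne_cdf with respect to x."""
    a, b, k = (x + c) ** alpha, (M + c) ** alpha, c ** alpha
    return alpha * (x + c) ** (alpha - 1.0) * (b - k) / (a + b - 2.0 * k) ** 2

def transformed_kde(losses, x, bandwidth, alpha, M, c):
    """Density estimate at x: Gaussian KDE on the transformed scale, then back-transform."""
    u = champernowne_cdf(x, alpha, M, c)
    total = 0.0
    for loss in losses:
        z = (u - champernowne_cdf(loss, alpha, M, c)) / bandwidth
        total += math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    kde_on_unit_interval = total / (len(losses) * bandwidth)
    # Change of variables: f(x) = g(T(x)) * T'(x).
    return kde_on_unit_interval * champernowne_pdf(x, alpha, M, c)

losses = [0.3, 0.5, 0.8, 1.0, 1.2, 1.9, 2.5, 4.0, 7.5, 20.0]  # made-up claim sizes
density_at_median = transformed_kde(losses, 1.5, bandwidth=0.1, alpha=1.5, M=1.5, c=0.0)
```

Under the assumed form, T(M) = 0.5, so M acts as the median of the transformation, and a single bandwidth on the unit interval corresponds to heavier smoothing in the tail of the original scale.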
15.
Michael T. Fahey Christopher W. Thane Gemma D. Bramwell W. Andy Coward 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2007,170(1):149-166
Summary. Free-living individuals have multifaceted diets and consume foods in numerous combinations. In epidemiological studies it is desirable to characterize individual diets not only in terms of the quantity of individual dietary components but also in terms of dietary patterns. We describe the conditional Gaussian mixture model for dietary pattern analysis and show how it can be adapted to take account of important characteristics of self-reported dietary data. We illustrate this approach with an analysis of the 2000–2001 National Diet and Nutrition Survey of adults. The results strongly favoured a mixture model solution allowing clusters to vary in shape and size, over the standard approach that has been used previously to find dietary patterns.
16.
This article focuses on the conditional density of a scalar response variable given a random variable taking values in a semimetric space. The local linear estimators of the conditional density and its derivative are considered. It is assumed that the observations form a stationary α-mixing sequence. Under some regularity conditions, the joint asymptotic normality of the estimators of the conditional density and its derivative is established. The result confirms the prospect in Rachdi et al. (2014) and can be applied in time-series analysis to make predictions and build confidence intervals. The finite-sample behavior of the estimator is investigated by simulations as well.
17.
Russell B. Millar 《Statistics and Computing》2018,28(2):375-385
The predictive loss of Bayesian models can be estimated using a sample from the full-data posterior by evaluating the Watanabe-Akaike information criterion (WAIC) or using an importance sampling (ISCVL) approximation to leave-one-out cross-validation loss. With hierarchical models the loss can be specified at different levels of the hierarchy, and in the published literature, it is routine for these estimators to use the conditional likelihood provided by the lowest level of model hierarchy. However, the regularity conditions underlying these estimators may not hold at this level, and the behaviour of conditional-level WAIC as an estimator of conditional-level predictive loss must be determined on a case-by-case basis. Conditional-level ISCVL does not target conditional-level predictive loss and instead is an estimator of marginal-level predictive loss. Using examples for analysis of over-dispersed count data, it is shown that conditional-level WAIC does not provide a reliable estimator of its target loss, and simulations show that it can favour the incorrect model. Moreover, conditional-level ISCVL is numerically unstable compared to marginal-level ISCVL. It is recommended that WAIC and ISCVL be evaluated using the marginalized likelihood where practicable and that the reliability of these estimators always be checked using appropriate diagnostics.
18.
Markov random fields (MRFs) express spatial dependence through conditional distributions, although their stochastic behavior is defined by their joint distribution. These joint distributions are typically difficult to obtain in closed form, the problem being a normalizing constant that is a function of unknown parameters. The Gaussian MRF (or conditional autoregressive model) is one case where the normalizing constant is available in closed form; however, when sample sizes are moderate to large (thousands to tens of thousands), and beyond, its computation can be problematic. Because the conditional autoregressive (CAR) model is often used for spatial-data modeling, we develop likelihood-inference methodology for this model in situations where the sample size is too large for its normalizing constant to be computed directly. In particular, we use simulation methodology to obtain maximum likelihood estimators of mean, variance, and spatial-dependence parameters (including their asymptotic variances and covariances) of CAR models.
19.
Probability density estimation via an infinite Gaussian mixture model: application to statistical process monitoring
Tao Chen Julian Morris Elaine Martin 《Journal of the Royal Statistical Society. Series C, Applied statistics》2006,55(5):699-715
Summary. The primary goal of multivariate statistical process performance monitoring is to identify deviations from normal operation within a manufacturing process. The basis of the monitoring schemes is historical data that have been collected when the process is running under normal operating conditions. These data are then used to establish confidence bounds to detect the onset of process deviations. In contrast with the traditional approaches that are based on the Gaussian assumption, this paper proposes the application of the infinite Gaussian mixture model (GMM) for the calculation of the confidence bounds, thereby relaxing the previous restrictive assumption. The infinite GMM is a special case of Dirichlet process mixtures and is introduced as the limit of the finite GMM, i.e. when the number of mixtures tends to ∞. On the basis of the estimation of the probability density function, via the infinite GMM, the confidence bounds are calculated by using the bootstrap algorithm. The methodology proposed is demonstrated through its application to a simulated continuous chemical process, and a batch semiconductor manufacturing process.
20.
The durations between market activities such as trades and quotes provide useful information on the underlying assets while analyzing financial time series. In this article, we propose a stochastic conditional duration model based on the inverse Gaussian distribution. The non-monotonic nature of the failure rate of the inverse Gaussian distribution makes it suitable for modeling the durations in financial time series. The parameters of the proposed model are estimated by an efficient importance sampling method. A simulation experiment is conducted to check the performance of the estimators. These estimates are used to compute estimated hazard functions and to compare with the empirical hazard functions. Finally, a real data analysis is provided to illustrate the practical utility of the models.
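The non-monotonic failure rate mentioned in this abstract can be illustrated with the standard inverse Gaussian density and cdf, with mean μ and shape λ. This is only a sketch of the hazard shape for a single distribution; it does not implement the stochastic conditional duration model or the importance sampling estimator, and the parameter values μ = λ = 1 and the function names are illustrative assumptions.

```python
import math

def _phi(z):
    """Standard normal cdf."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def ig_pdf(x, mu, lam):
    """Inverse Gaussian density with mean mu and shape lam, for x > 0."""
    return math.sqrt(lam / (2.0 * math.pi * x ** 3)) * math.exp(
        -lam * (x - mu) ** 2 / (2.0 * mu ** 2 * x)
    )

def ig_cdf(x, mu, lam):
    """Inverse Gaussian cdf expressed through the standard normal cdf."""
    r = math.sqrt(lam / x)
    return _phi(r * (x / mu - 1.0)) + math.exp(2.0 * lam / mu) * _phi(-r * (x / mu + 1.0))

def ig_hazard(x, mu, lam):
    """Failure rate f(x) / (1 - F(x))."""
    return ig_pdf(x, mu, lam) / (1.0 - ig_cdf(x, mu, lam))

# The hazard rises from near zero, peaks, then declines for long durations.
hazards = [ig_hazard(x, 1.0, 1.0) for x in (0.05, 1.0, 10.0)]
```

For very large x the survival term 1 − F(x) is computed by subtraction and loses precision, so a production version would evaluate the survival function directly.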