Similar literature
Found 20 similar documents (search time: 281 ms)
1.
The performance of nine different nonparametric regression estimates is empirically compared on ten different real datasets. The number of data points in the real datasets varies between 7,900 and 18,000, and each dataset contains between 5 and 20 variables. The nonparametric regression estimates include kernel, partitioning, nearest neighbor, additive spline, neural network, penalized smoothing splines, local linear kernel, regression trees, and random forests estimates. The main result is a table containing the empirical L2 risks of all nine nonparametric regression estimates on the evaluation part of the different datasets. The neural networks and random forests are the two estimates performing best. The datasets are publicly available, so that any new regression estimate can be easily compared with all nine estimates considered in this article by just applying it to the publicly available data and by computing its empirical L2 risks on the evaluation part of the datasets.
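
The comparison protocol in this abstract can be sketched in a few lines. This is a minimal illustration, assuming the empirical L2 risk is the mean squared prediction error on the evaluation part; the function name `empirical_l2_risk` and the toy data are hypothetical and are not taken from the article.

```python
from typing import Callable, Sequence, Tuple


def empirical_l2_risk(
    estimate: Callable[[Tuple[float, ...]], float],
    xs: Sequence[Tuple[float, ...]],
    ys: Sequence[float],
) -> float:
    """Mean squared prediction error of `estimate` on held-out (x, y) pairs."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    return sum((estimate(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical evaluation part of a dataset (not from the article).
eval_x = [(0.0,), (1.0,), (2.0,)]
eval_y = [0.0, 1.0, 4.0]


def linear_fit(x: Tuple[float, ...]) -> float:
    """A stand-in for any fitted regression estimate."""
    return 2.0 * x[0] - 0.5


print(empirical_l2_risk(linear_fit, eval_x, eval_y))
```

Any new regression estimate exposed as a callable could be scored the same way on the evaluation part of each dataset.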

2.
Estimating multivariate location and scatter with both affine equivariance and positive breakdown has always been difficult. A well-known estimator which satisfies both properties is the Minimum Volume Ellipsoid Estimator (MVE). Computing the exact MVE is often not feasible, so one usually resorts to an approximate algorithm. In the regression setup, algorithms for positive-breakdown estimators like Least Median of Squares typically recompute the intercept at each step, to improve the result. This approach is called intercept adjustment. In this paper we show that a similar technique, called location adjustment, can be applied to the MVE. For this purpose we use the Minimum Volume Ball (MVB), in order to lower the MVE objective function. An exact algorithm for calculating the MVB is presented. As an alternative to MVB location adjustment we propose L1 location adjustment, which does not necessarily lower the MVE objective function but yields more efficient estimates for the location part. Simulations compare the two types of location adjustment. We also obtain the maxbias curves of L1 and the MVB in the multivariate setting, revealing the superiority of L1.

3.
In this paper, the maximum likelihood estimates of the parameters for the M/Er/1 queueing model are derived when the queue size at each departure point is observed. A numerical example is generated by simulating a finite Markov chain to illustrate the methodology for estimating the parameters with variable Erlang service time distribution. The problem of hypothesis testing and simultaneous confidence regions for the parameters is also investigated.

4.
In this paper, the regression model with a nonnegativity constraint on the dependent variable is considered. Under weak conditions, L1 estimates of the regression coefficients are shown to be consistent.

5.
Hypothesis tests of performance measures for an M/Ek/1 queueing system are considered. With pivotal models deduced from sufficient statistics for the unknown parameters, a generalized p-value approach to deriving tests about parametric functions is proposed. The focus is on derivation of the p-values of hypothesis tests for five popular performance measures of the system in the steady state. Given a sample T, let p(T) be the p-value we developed. We derive a closed-form expression to show that, for small samples, the probability P(p(T) ≤ γ) is approximately equal to γ, for 0 ≤ γ ≤ 1.

6.
To summarize a set of data by a distribution function in Johnson's translation system, we use a least-squares approach to parameter estimation wherein we seek to minimize the distance between the vector of "uniformized" order statistics and the corresponding vector of expected values. We use the software package FITTRI to apply this technique to three problems arising respectively in medicine, applied statistics, and civil engineering. Compared to traditional methods of distribution fitting based on moment matching, percentile matching, L1 estimation, and L∞ estimation, the least-squares technique is seen to yield fits of similar accuracy and to converge more rapidly and reliably to a set of acceptable parameter estimates.

7.
Let $(\mathcal{X}, \mathcal{A})$ be a space with a σ-field, $M = \{P_s;\ s \in \Theta\}$ a family of probability measures on $\mathcal{A}$, Θ arbitrary, $X_1, \ldots, X_n$ independently and identically distributed $P_\theta$ random variables. Metrize Θ with the L1 distance between measures, and assume identifiability. Minimum-distance estimators are constructed that relate rates of convergence with Vapnik-Cervonenkis exponents when M is "regular". An alternative construction of estimates is offered via Kolmogorov's chain argument.

8.
Under proper conditions, two independent tests of the null hypothesis of homogeneity of means are provided by a set of sample averages. One test, with tail probability P1, relates to the variation between the sample averages, while the other, with tail probability P2, relates to the concordance of the rankings of the sample averages with the anticipated rankings under an alternative hypothesis. The quantity G = P1 P2 is considered as the combined test statistic and, except for the discreteness in the null distribution of P2, would correspond to the Fisher statistic for combining probabilities. Illustration is made, for the case of four means, on how to get critical values of G or critical values of P1 for each possible value of P2, taking discreteness into account. Alternative measures of concordance considered are Spearman's ρ and Kendall's τ. The concept results, in the case of two averages, in assigning two-thirds of the test size to the concordant tail, one-third to the discordant tail.
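
The link between G = P1 P2 and the Fisher statistic can be checked numerically. The sketch below assumes both tail probabilities are continuous and uniform on (0, 1) under the null hypothesis, so it deliberately ignores the discreteness of P2 that the abstract treats; the function names are hypothetical.

```python
import math


def product_tail_probability(g: float) -> float:
    """P(U1 * U2 <= g) for independent Uniform(0, 1) variables U1 and U2."""
    if not 0.0 < g <= 1.0:
        raise ValueError("g must lie in (0, 1]")
    return g * (1.0 - math.log(g))


def fisher_statistic(p1: float, p2: float) -> float:
    """Fisher's combined statistic -2 * (ln p1 + ln p2)."""
    return -2.0 * (math.log(p1) + math.log(p2))


def chi2_sf_4df(x: float) -> float:
    """Survival function of the chi-square distribution with 4 degrees of freedom."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


p1, p2 = 0.1, 0.2
g = p1 * p2
# Both routes give the same combined tail probability in the continuous case.
print(product_tail_probability(g))
print(chi2_sf_4df(fisher_statistic(p1, p2)))
```

In the continuous case a small G is significant exactly when the Fisher statistic is large, which is why the two computations agree.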

9.
The Dirichlet process prior allows flexible nonparametric mixture modeling. The number of mixture components is not specified in advance and can grow as new data arrive. However, analyses based on the Dirichlet process prior are sensitive to the choice of the parameters, including an infinite-dimensional distributional parameter G0. Most previous applications have either fixed G0 as a member of a parametric family or treated G0 in a Bayesian fashion, using parametric prior specifications. In contrast, we have developed an adaptive nonparametric method for constructing smooth estimates of G0. We combine this method with a technique for estimating α, the other Dirichlet process parameter, that is inspired by an existing characterization of its maximum-likelihood estimator. Together, these estimation procedures yield a flexible empirical Bayes treatment of Dirichlet process mixtures. Such a treatment is useful in situations where smooth point estimates of G0 are of intrinsic interest, or where the structure of G0 cannot be conveniently modeled with the usual parametric prior families. Analysis of simulated and real-world datasets illustrates the robustness of this approach.

10.
We developed robust estimators that minimize a weighted L1 norm for the first-order bifurcating autoregressive model. When all of the weights are fixed, our estimate is an L1 estimate that is robust against outlying points in the response space and more efficient than the least squares estimate for heavy-tailed error distributions. When the weights are random and depend on the points in the factor space, the weighted L1 estimate is robust against outlying points in the factor space. Simulated and artificial examples are presented. The behavior of the proposed estimate is modeled through a Monte Carlo study.

11.
In this paper, we consider the problem of combining a number of opinions which have been expressed as probability measures $P_1, \ldots, P_n$ over some space. It is shown that a pooling formula which has the marginalization property of McConway (1981) must be of the form $T = \sum_{i=1}^{n} w_i P_i + \left(1 - \sum_{i=1}^{n} w_i\right) Q$, where $Q$ is an arbitrary measure and $w_1, \ldots, w_n \in [-1, 1]$ are weights such that $\left|\sum_{j \in J} w_j\right| \le 1$ for every subset $J$ of $\{1, \ldots, n\}$. If, in addition, $T$ is required to preserve the independence of arbitrary events $A$ and $B$ whenever these events are independent under each $P_i$, then either $T = P_i$ for some $1 \le i \le n$ or $T = Q$, in which case $Q$ takes values in $\{0, 1\}$.
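
The pooling formula can be made concrete on a finite outcome space. The sketch below is only an illustration under that assumption: measures are represented as dictionaries from outcomes to probabilities, and the names `weights_admissible` and `pool` are hypothetical. It does not check that the pooled values are nonnegative, which can fail when negative weights are used.

```python
from itertools import combinations
from typing import Dict, Hashable, Sequence

Measure = Dict[Hashable, float]


def weights_admissible(weights: Sequence[float], tol: float = 1e-12) -> bool:
    """Check |sum of w_j over J| <= 1 for every non-empty subset J (brute force)."""
    return all(
        abs(sum(subset)) <= 1.0 + tol
        for size in range(1, len(weights) + 1)
        for subset in combinations(weights, size)
    )


def pool(weights: Sequence[float], opinions: Sequence[Measure], q: Measure) -> Measure:
    """Return T = sum_i w_i * P_i + (1 - sum_i w_i) * Q, outcome by outcome."""
    if len(weights) != len(opinions):
        raise ValueError("need exactly one weight per opinion")
    if not weights_admissible(weights):
        raise ValueError("weights violate the subset-sum condition")
    residual = 1.0 - sum(weights)
    return {
        outcome: sum(w * p[outcome] for w, p in zip(weights, opinions))
        + residual * q[outcome]
        for outcome in q
    }


# Hypothetical opinions on a two-point space.
p1 = {"a": 0.2, "b": 0.8}
p2 = {"a": 0.6, "b": 0.4}
q = {"a": 0.5, "b": 0.5}
print(pool([0.5, 0.3], [p1, p2], q))
```

The brute-force subset check is exponential in n and is meant only for small examples.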

12.
We propose the L1 distance between the distribution of a binned data sample and a probability distribution from which it is hypothetically drawn as a statistic for testing agreement between the data and a model. We study the distribution of this distance for N-element samples drawn from k bins of equal probability and derive asymptotic formulae for the mean and dispersion of L1 in the large-N limit. We argue that the L1 distance is asymptotically normally distributed, with the mean and dispersion being accurately reproduced by asymptotic formulae even for moderately large values of N and k.
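
For k bins of equal probability, the statistic can be computed as below. This is a sketch under an assumed normalization, namely the sum over bins of the absolute difference between the observed relative frequency and 1/k; the paper may scale the distance differently, and the function name is hypothetical.

```python
from collections import Counter
from typing import Sequence


def l1_distance_equal_bins(bin_indices: Sequence[int], k: int) -> float:
    """L1 distance between observed bin frequencies and the uniform law on k bins."""
    n = len(bin_indices)
    if n == 0 or k <= 0:
        raise ValueError("need a non-empty sample and a positive number of bins")
    if any(not 0 <= b < k for b in bin_indices):
        raise ValueError("bin indices must lie in range(k)")
    counts = Counter(bin_indices)
    return sum(abs(counts.get(b, 0) / n - 1.0 / k) for b in range(k))


# Hypothetical sample of N = 4 observations in k = 4 bins.
sample = [0, 0, 1, 3]
print(l1_distance_equal_bins(sample, k=4))
```

A value near zero indicates close agreement with the equal-probability model; how large a value counts as significant depends on the asymptotic mean and dispersion derived in the paper.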

13.
Let $X_1, X_2, \ldots, X_n$ be independent exponential random variables such that $X_i$ has hazard rate $\lambda$ for $i = 1, \ldots, p$ and $X_j$ has hazard rate $\lambda^*$ for $j = p + 1, \ldots, n$, where $1 \le p < n$. Denote by $D_{i:n}(\lambda, \lambda^*) = X_{i:n} - X_{i-1:n}$ the $i$th spacing of the order statistics $X_{1:n} \le X_{2:n} \le \cdots \le X_{n:n}$, $i = 1, \ldots, n$, where $X_{0:n} \equiv 0$. It is shown that the spacings $(D_{1:n}, D_{2:n}, \ldots, D_{n:n})$ are MTP$_2$, strengthening one result of Khaledi and Kochar (2000), and that $(D_{1:n}(\lambda_2, \lambda^*), \ldots, D_{n:n}(\lambda_2, \lambda^*)) \le_{\mathrm{lr}} (D_{1:n}(\lambda_1, \lambda^*), \ldots, D_{n:n}(\lambda_1, \lambda^*))$ for $\lambda_1 \le \lambda^* \le \lambda_2$, where $\le_{\mathrm{lr}}$ denotes the multivariate likelihood ratio order. A counterexample is also given to show that this comparison result is in general not true for $\lambda^* < \lambda_1 < \lambda_2$.

14.
Let f be an unknown, possibly multimodal density on $\mathbb{R}^d$ and let $X_1, X_2, \ldots$ be a sequence of independent random vectors with density f. Several recursive estimates of the mode of f are proposed, and sufficient conditions ensuring their weak and strong consistency are established.

15.
In this article, we obtain a Stein operator for the sum of n independent random variables (rvs), which is shown to be a perturbation of the negative binomial (NB) operator. Comparing this operator with the NB operator, we derive error bounds for the total variation distance by matching parameters. A three-parameter approximation for such a sum is also considered and is shown to improve the existing bounds in the literature. Finally, an application of our results to a function of waiting time for (k1, k2)-events is given.

16.
In healthcare studies, count data sets measured with covariates often exhibit heterogeneity and contain extreme values. To analyse such count data sets, we use a finite mixture of regression model framework and investigate a robust estimation approach, called the L2E [D.W. Scott, On fitting and adapting of density estimates, Comput. Sci. Stat. 30 (1998), pp. 124–133], to estimate the parameters. The L2E is based on an integrated L2 distance between parametric conditional and true conditional mass functions. In addition to studying the theoretical properties of the L2E estimator, we compare the performance of L2E with the maximum likelihood (ML) estimator and a minimum Hellinger distance (MHD) estimator via Monte Carlo simulations for correctly specified and gross-error contaminated mixture of Poisson regression models. These show that the L2E is a viable robust alternative to the ML and MHD estimators. More importantly, we use the L2E to perform a comprehensive analysis of a Western Australia hospital inpatient obstetrical length of stay (LOS) (in days) data set that contains extreme values. It is shown that the L2E provides a two-component Poisson mixture regression fit to the LOS data which is better than those based on the ML and MHD estimators. The L2E fit identifies admission type as a significant covariate that profiles the predominant subpopulation of normal-stayers as planned patients and the small subpopulation of long-stayers as emergency patients.

17.
Spatial linear processes $\{X_s,\ s \in T\}$, where T is a triangular lattice in $\mathbb{R}^2$, are considered. Special attention is given to the class of spatial moving-average processes. Precisely, for each site $s \in T$, the variable $X_s$ is defined as a linear combination of real-valued random shocks located at the vertices of regular concentric hexagons centered at s. For Gaussian random shocks, the process is also Gaussian, and estimates of its parameters are obtained by maximizing the exact likelihood. For non-Gaussian random shocks, the exact likelihood is difficult to obtain; however, the Gaussian likelihood is still used, giving the pseudo-Gaussian likelihood estimates. The behaviour of these estimates is analyzed through the study of asymptotic properties and some simulation experiments based on an isotropic model defined with one coefficient.

18.
An important statistical problem is to construct a confidence set for some functional $T(P)$ of some unknown probability distribution $P$. Typically, this involves approximating the sampling distribution $J_n(P)$ of some pivot based on a sample of size n from $P$. A bootstrap procedure is to estimate $J_n(P)$ by $J_n(\hat{P}_n)$, where $\hat{P}_n$ is the empirical measure based on a sample of size n from $P$. Typically, one has that $J_n(P)$ and $J_n(\hat{P}_n)$ are close in an appropriate sense. Two questions are addressed in this note. Are $J_n(P)$ and $J_n(\hat{P}_n)$ uniformly close as $P$ varies as well? If so, do confidence statements about $T(P)$ possess a corresponding uniformity property? In the case $T(P) = P$, the answer to the first question is yes; the answer to the second is no. However, bootstrap confidence statements about $T(P)$ can be made uniform over a restricted, though large, class of $P$. Similar results apply to other functionals $T(P)$.
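
The bootstrap step described in this abstract, estimating the sampling distribution of a pivot by resampling from the empirical measure, can be sketched as follows. The choice of root, the difference between the statistic on a resample and on the original sample, and the basic bootstrap interval built from it are assumptions made for illustration; the function names are hypothetical and the note's own pivot may differ.

```python
import random
import statistics
from typing import Callable, List, Sequence, Tuple


def bootstrap_roots(
    sample: Sequence[float],
    statistic: Callable[[Sequence[float]], float],
    n_resamples: int = 2000,
    seed: int = 0,
) -> List[float]:
    """Sorted Monte Carlo draws of statistic(resample) - statistic(sample)."""
    rng = random.Random(seed)
    n = len(sample)
    theta_hat = statistic(sample)
    roots = []
    for _ in range(n_resamples):
        # Draw n points with replacement from the empirical measure.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        roots.append(statistic(resample) - theta_hat)
    return sorted(roots)


def basic_interval(
    sample: Sequence[float],
    statistic: Callable[[Sequence[float]], float],
    alpha: float = 0.05,
) -> Tuple[float, float]:
    """Basic bootstrap interval: theta_hat minus upper and lower root quantiles."""
    roots = bootstrap_roots(sample, statistic)
    lower_root = roots[int(alpha / 2 * len(roots))]
    upper_root = roots[max(int((1 - alpha / 2) * len(roots)) - 1, 0)]
    theta_hat = statistic(sample)
    return theta_hat - upper_root, theta_hat - lower_root


# Hypothetical data; the functional here is the mean.
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]
print(basic_interval(data, statistics.fmean))
```

The sketch approximates the bootstrap distribution by Monte Carlo and says nothing about the uniformity questions the note addresses.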

19.
In this paper, we consider the linear autoregressive model with varying coefficients θn∈[0,1). When θn tends to the unit root, the moderate deviation principle for the empirical covariance is discussed, and as statistical applications, we provide the moderate deviation estimates of the least squares and the Yule–Walker estimators of the parameter θn.

20.
We consider automatic data-driven density, regression and autoregression estimates, based on any random bandwidth selector $\hat{h}_T$. We show that in a first-order asymptotic approximation they behave as well as the related estimates obtained with the "optimal" bandwidth $h_T$ as long as $\hat{h}_T / h_T \to 1$ in probability. The results are obtained for dependent observations; some of them are also new for independent observations.
