Similar Documents
20 similar documents found.
1.
Qingguo Tang, Statistics, 2013, 47(5): 389–404
The varying coefficient model is a useful extension of linear models and has many advantages in practical use. To estimate the unknown functions in the model, kernel-type local linear least-squares (L2) estimation methods have been proposed by several authors. When the data contain outliers or come from a population with a heavy-tailed distribution, L1-estimation should yield better estimators. In this article, we present the local linear L1-estimation method and derive the asymptotic distributions of the L1-estimators. The simulation results for two examples, with outliers and a heavy-tailed distribution respectively, show that the L1-estimators outperform the L2-estimators.
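A minimal sketch of the local linear L1 idea, applied for simplicity to a plain regression function rather than the full varying coefficient model (the Gaussian kernel, bandwidth, and simulated heavy-tailed data are illustrative assumptions): at each target point a kernel-weighted least-absolute-deviations fit is solved as a small linear program.

import numpy as np
from scipy.optimize import linprog

def weighted_lad(X, y, w):
    # Minimize sum_i w_i |y_i - x_i'beta| via an LP, splitting residuals as u - v with u, v >= 0.
    # Row scaling works because sum w_i |y_i - x_i'b| = sum |(w_i y_i) - (w_i x_i)'b|.
    n, p = X.shape
    Xw, yw = X * w[:, None], y * w
    c = np.concatenate([np.zeros(p), np.ones(2 * n)])          # objective: sum(u + v)
    A_eq = np.hstack([Xw, np.eye(n), -np.eye(n)])              # Xw beta + u - v = yw
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=yw, bounds=bounds, method="highs")
    return res.x[:p]

def local_linear_l1(t, y, t0, h):
    # Local linear LAD fit at t0 with a Gaussian kernel of bandwidth h.
    w = np.exp(-0.5 * ((t - t0) / h) ** 2)
    X = np.column_stack([np.ones_like(t), t - t0])             # local intercept and slope
    return weighted_lad(X, y, w)[0]                            # estimate of m(t0)

rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0, 1, 200))
y = np.sin(2 * np.pi * t) + rng.standard_t(df=1.5, size=200)   # heavy-tailed noise
grid = np.linspace(0.05, 0.95, 10)
print(np.round([local_linear_l1(t, y, t0, h=0.08) for t0 in grid], 2))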

2.
When the data contain outliers or come from a population with a heavy-tailed distribution, as is very often the case for spatiotemporal data, estimation methods based on least squares (L2) will not perform well and more robust estimation methods are required. In this article, we propose local linear estimation for spatiotemporal models based on least absolute deviation (L1) and derive the asymptotic distributions of the L1-estimators under some mild conditions imposed on the spatiotemporal process. The simulation results for two examples, with outliers and a heavy-tailed distribution respectively, show that the L1-estimators perform better than the L2-estimators.

3.
It is an elementary fact that the size of an orthogonal array of strength t on k factors must be a multiple of a certain number, say Lt, that depends on the orders of the factors. Thus Lt is a lower bound on the size of arrays of strength t on those factors, and is no larger than Lk, the size of the complete factorial design. We investigate the relationship between the numbers Lt, and two questions in particular: For what t is Lt < Lk? And when Lt = Lk, is the complete factorial design the only array of that size and strength t? Arrays are assumed to be mixed-level.

We refer to an array of size less than Lk as a proper fraction. Guided by our main result, we construct a variety of mixed-level proper fractions of strength k − 1 that also satisfy a certain group-theoretic condition.
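The divisibility bound is easy to compute: in an array of strength t every level combination of any t factors occurs equally often, so the run size must be a multiple of the product of those t factor orders, and hence of the least common multiple of all such products. A small sketch (the mixed-level factor orders are an illustrative assumption, and reading Lt as exactly this lcm is my interpretation of the abstract):

from itertools import combinations
from math import lcm, prod

def L(t, levels):
    # Lower bound on the run size of a strength-t array on factors with the given orders:
    # the lcm of the products of the orders over every t-subset of factors.
    return lcm(*(prod(c) for c in combinations(levels, t)))

levels = (2, 3, 3, 4)                     # assumed mixed-level example
for t in range(1, len(levels) + 1):
    print(t, L(t, levels))                # L(k, levels) equals prod(levels), the complete factorial size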

4.
Given an unknown function (e.g. a probability density, a regression function, …) f and a constant c, the problem of estimating the level set L(c) = {f ≥ c} is considered. This problem is tackled in a very general framework, which allows f to be defined on a metric space that need not be a Euclidean space. Such a degree of generality is motivated by practical considerations and, in fact, an example with astronomical data is analyzed where the domain of f is the unit sphere. A plug-in approach is followed; that is, L(c) is estimated by Ln(c) = {fn ≥ c}, where fn is an estimator of f. Two results are obtained concerning consistency and convergence rates, with respect to the Hausdorff metric, of the boundaries ∂Ln(c) towards ∂L(c). Also, the consistency of Ln(c) to L(c) is shown, under mild conditions, with respect to the L1 distance. Special attention is paid to the particular case of spherical data.
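A minimal one-dimensional sketch of the plug-in construction (the level c, the kernel estimator, the sample, and the evaluation grid are all arbitrary choices for illustration):

import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
x = rng.normal(0, 1, 500)

f_n = gaussian_kde(x)                    # estimator f_n of the unknown density f
c = 0.15                                 # level of interest (assumed)
grid = np.linspace(-4, 4, 801)
inside = f_n(grid) >= c                  # plug-in estimate L_n(c) = {f_n >= c} on the grid

# Report the boundary of the estimated level set as the grid points where membership changes.
edges = np.flatnonzero(np.diff(inside.astype(int)))
print("approximate boundary of L_n(c):", np.round(grid[edges], 2))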

5.
We propose the L1 distance between the distribution of a binned data sample and a probability distribution from which it is hypothetically drawn as a statistic for testing agreement between the data and a model. We study the distribution of this distance for N-element samples drawn from k bins of equal probability and derive asymptotic formulae for the mean and dispersion of L1 in the large-N limit. We argue that the L1 distance is asymptotically normally distributed, with the mean and dispersion being accurately reproduced by asymptotic formulae even for moderately large values of N and k.
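A Monte Carlo sketch of the statistic; the paper's asymptotic formulae for the mean and dispersion are not reproduced here, so the null distribution is simply simulated (k, N, and the replication count are assumptions):

import numpy as np

rng = np.random.default_rng(2)
k, N, reps = 20, 1000, 5000
p = np.full(k, 1.0 / k)                  # hypothesized model: k bins of equal probability

def l1_stat(counts, p, N):
    # L1 distance between the binned empirical distribution and the model.
    return np.abs(counts / N - p).sum()

null = np.array([l1_stat(rng.multinomial(N, p), p, N) for _ in range(reps)])
print("simulated null mean and sd:", null.mean().round(4), null.std().round(4))

# Testing an observed sample against the model via the simulated null distribution.
q = np.linspace(1, 2, k)
q /= q.sum()                             # a mildly misspecified alternative (assumed)
stat = l1_stat(rng.multinomial(N, q), p, N)
print("observed L1 =", round(stat, 4), " approximate p-value =", (null >= stat).mean())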

6.
Let π1, …, πk be k (≥ 2) independent populations, where πi denotes the uniform distribution over the interval (0, θi) and θi > 0 (i = 1, …, k) is an unknown scale parameter. The population associated with the largest scale parameter is called the best population. For selecting the best population, we use a selection rule based on the natural estimators of θi, i = 1, …, k, for the case of unequal sample sizes. Consider the problem of estimating the scale parameter θL of the selected uniform population when sample sizes are unequal and the loss is measured by the squared log error (SLE) loss function. We derive the uniformly minimum risk unbiased (UMRU) estimator of θL under the SLE loss function, and two natural estimators of θL are also studied. For k = 2, we derive a sufficient condition for inadmissibility of an estimator of θL. Using this condition, we conclude that the UMRU estimator and the natural estimator are inadmissible. Finally, the risk functions of various competing estimators of θL are compared through simulation.

7.
We develop robust estimators that minimize a weighted L1 norm for the first-order bifurcating autoregressive model. When all of the weights are fixed, our estimate is an L1 estimate that is robust against outlying points in the response space and more efficient than the least-squares estimate for heavy-tailed error distributions. When the weights are random and depend on the points in the factor space, the weighted L1 estimate is robust against outlying points in the factor space. Simulated and artificial examples are presented, and the behavior of the proposed estimate is examined through a Monte Carlo study.

8.
A. Berlinet, Statistics, 2013, 47(5): 479–495
This paper deals with a special adaptive estimation problem, namely how one can select, for each set of i.i.d. data X1, …, Xn, the better of two given estimates of the data-generating probability density. Such a problem was studied by Devroye and Lugosi [Combinatorial Methods in Density Estimation, Springer, Berlin, 2001], who proposed a feasible suboptimal selection (called the Scheffé selection) as an alternative to the optimal but nonfeasible selection which minimizes the L1-error. In many typical situations, the L1-error of the Scheffé selection was shown to tend to zero for n → ∞ as fast as the L1-error of the optimal estimate. This asymptotic result was based on an inequality between the total variation errors of the Scheffé and optimal selections. The present paper extends this inequality to the class of φ-divergence errors containing the L1-error as a special case. The first extension compares the φ-divergence errors of the mentioned Scheffé and optimal selections of Devroye and Lugosi. The second extension deals with a class of generalized Scheffé selections adapted to the φ-divergence error criteria and reducing to the classical Scheffé selection for the L1-criterion. It compares the φ-divergence errors of these feasible selections and the optimal nonfeasible selections minimizing the φ-divergence errors. Both extensions are motivated and illustrated by examples.
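The following sketch shows the Scheffé selection as described by Devroye and Lugosi, as I read it: form the Scheffé set A = {f1 > f2} and pick the candidate whose probability of A is closer to its empirical frequency. The candidate densities, the sample, and the grid-based integration are illustrative assumptions.

import numpy as np
from scipy.stats import norm, cauchy

def scheffe_select(f1, f2, data, grid):
    # Choose between densities f1 and f2 via the Scheffe set A = {f1 > f2}.
    A = f1(grid) > f2(grid)
    dx = grid[1] - grid[0]
    p1 = f1(grid)[A].sum() * dx                      # integral of f1 over A
    p2 = f2(grid)[A].sum() * dx                      # integral of f2 over A
    idx = np.clip(np.searchsorted(grid, data), 0, len(grid) - 1)
    emp = A[idx].mean()                              # empirical frequency of A
    return 1 if abs(p1 - emp) <= abs(p2 - emp) else 2

rng = np.random.default_rng(3)
data = rng.normal(0, 1, 400)                         # data actually drawn from N(0, 1)
grid = np.linspace(-10, 10, 4001)
print("selected candidate:", scheffe_select(norm(0, 1).pdf, cauchy(0, 1).pdf, data, grid))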

9.
Data-based choice of the bandwidth is an important problem in kernel density estimation. The pseudo-likelihood and the least-squares cross-validation bandwidth selectors are well known, but widely criticized in the literature. For heavy-tailed distributions, the L1 distance between the pseudo-likelihood-based estimator and the density does not seem to converge in probability to zero with increasing sample size. Even for normal-tailed densities, the rate of L1 convergence is disappointingly slow. In this article, we report an interesting finding that, with minor modifications, both cross-validation methods can be implemented effectively, even for heavy-tailed densities. For both these estimators, the L1 distance (from the density) is shown to converge completely to zero irrespective of the tail of the density. The expected L1 distance also goes to zero. These results hold even in the presence of a strong mixing-type dependence. Monte Carlo simulations and analysis of the Old Faithful geyser data suggest that, if implemented appropriately, and contrary to the traditional belief, the cross-validation estimators compare well with the sophisticated plug-in and bootstrap-based estimators.
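As a point of reference, a minimal sketch of plain least-squares cross-validation for a Gaussian-kernel density estimate; the sample and the bandwidth grid are assumptions, and the article's modifications for heavy tails are not implemented.

import numpy as np

def lscv_score(x, h):
    # Least-squares CV criterion: int fhat^2 - (2/n) sum_i fhat_{-i}(x_i), Gaussian kernel.
    n = len(x)
    d = x[:, None] - x[None, :]
    # int fhat^2 = (1 / (n^2 h)) * sum_{i,j} (K*K)(d_ij / h), with (K*K) the N(0, 2) density.
    term1 = np.exp(-d ** 2 / (4 * h ** 2)).sum() / (n ** 2 * h * np.sqrt(4 * np.pi))
    K = np.exp(-d ** 2 / (2 * h ** 2)) / np.sqrt(2 * np.pi)
    loo = (K.sum(axis=1) - K.diagonal()) / ((n - 1) * h)   # leave-one-out fhat_{-i}(x_i)
    return term1 - 2 * loo.mean()

rng = np.random.default_rng(4)
x = rng.standard_t(df=3, size=300)                         # moderately heavy-tailed sample
hs = np.linspace(0.05, 1.0, 60)
print("LSCV bandwidth:", round(hs[int(np.argmin([lscv_score(x, h) for h in hs]))], 3))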

10.
Linear functions of order statistics (“L-estimates”) of the form Tn = Σi cni X(i), with weights cni generated by a function J, are investigated under jackknifing. This paper proves that, under suitable conditions on the function J, the jackknifed version of the L-estimate Tn has the same limit distribution as Tn. It is also shown that the jackknife estimate of the asymptotic variance of n1/2Tn is consistent. Furthermore, the Berry–Esseen rate associated with asymptotic normality, and a law of the iterated logarithm, are characterized for a class of jackknife L-estimates.
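A small sketch of jackknifing a familiar L-estimate, the trimmed mean; the trimming proportion and the simulated data are illustrative assumptions.

import numpy as np
from scipy.stats import trim_mean

def jackknife_variance(x, stat):
    # Delete-one jackknife estimate of the variance of a statistic.
    n = len(x)
    loo = np.array([stat(np.delete(x, i)) for i in range(n)])
    return (n - 1) / n * ((loo - loo.mean()) ** 2).sum()

rng = np.random.default_rng(5)
x = rng.standard_t(df=2, size=100)                      # heavy-tailed sample
t_n = trim_mean(x, 0.1)                                 # 10% trimmed mean, an L-estimate
v_jack = jackknife_variance(x, lambda z: trim_mean(z, 0.1))
print("trimmed mean:", round(float(t_n), 3), " jackknife variance:", round(float(v_jack), 5))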

11.
The varying coefficient model (VCM) is an important generalization of the linear regression model, and many existing estimation procedures for the VCM are built on the L2 loss, which is popular for its mathematical beauty but is not robust to non-normal errors and outliers. In this paper, we address both the robustness and the efficiency of estimation and variable selection for the VCM by using a convex combination of the L1 and L2 losses instead of the quadratic loss alone. Using the local linear modeling method, the asymptotic normality of the estimator is derived, and a useful method is proposed for selecting the weight in the composite L1 and L2 loss. The variable selection procedure is then given by combining local kernel smoothing with the adaptive group LASSO. With appropriate selection of the tuning parameters by the Bayesian information criterion (BIC), the theoretical properties of the new procedure, including consistency in variable selection and the oracle property in estimation, are established. The finite sample performance of the new method is investigated through simulation studies and the analysis of body fat data. Numerical studies show that the new method performs at least as well as, and is often better than, the least-squares-based method in terms of both robustness and efficiency of variable selection.
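To make the combined loss concrete in a stripped-down (constant-coefficient) linear model, the sketch below minimizes w·Σ|ri| + (1 − w)·Σri² numerically; the weight w and the data are assumptions, and the adaptive group LASSO penalty of the paper is omitted.

import numpy as np
from scipy.optimize import minimize

def combined_loss(beta, X, y, w):
    # Convex combination of the L1 and L2 losses on the residuals.
    r = y - X @ beta
    return w * np.abs(r).sum() + (1 - w) * (r ** 2).sum()

rng = np.random.default_rng(6)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.standard_t(df=1.5, size=n)  # heavy-tailed errors

w = 0.7                                                  # weight on the L1 part (assumed)
fit = minimize(combined_loss, x0=np.zeros(3), args=(X, y, w), method="Nelder-Mead")
print("combined L1/L2 estimate:", np.round(fit.x, 3))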

12.
A number of efficient computer codes are available for the simple linear L1 regression problem, and several of them can be made more efficient by utilizing the least-squares solution. In fact, a couple of available computer programs already do so.

We report the results of a computational study comparing several openly available computer programs for solving the simple linear L1 regression problem, with and without computing and utilizing a least-squares solution.
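One common way to exploit the least-squares solution is as a starting value. The sketch below approximates the L1 fit by iteratively reweighted least squares warm-started at the OLS estimate; this particular scheme and its safeguard on small residuals are my assumptions, not the programs compared in the study.

import numpy as np

def lad_irls(X, y, n_iter=50, eps=1e-6):
    # Approximate L1 regression by IRLS, starting from the least-squares fit.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - X @ beta
        w = 1.0 / np.maximum(np.abs(r), eps)             # weights 1/|r_i|, safeguarded near zero
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    return beta

rng = np.random.default_rng(7)
x = rng.uniform(0, 10, 100)
X = np.column_stack([np.ones_like(x), x])
y = 2 + 0.5 * x + rng.standard_t(df=1.5, size=100)       # heavy-tailed noise
print("approximate L1 fit (intercept, slope):", np.round(lad_irls(X, y), 3))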

13.
In healthcare studies, count data sets measured with covariates often exhibit heterogeneity and contain extreme values. To analyse such count data sets, we use a finite mixture of regressions framework and investigate a robust estimation approach, called the L2E [D.W. Scott, On fitting and adapting of density estimates, Comput. Sci. Stat. 30 (1998), pp. 124–133], to estimate the parameters. The L2E is based on an integrated L2 distance between the parametric conditional mass function and the true conditional mass function. In addition to studying the theoretical properties of the L2E estimator, we compare the performance of the L2E with the maximum likelihood (ML) estimator and a minimum Hellinger distance (MHD) estimator via Monte Carlo simulations for correctly specified and gross-error contaminated mixture of Poisson regression models. These comparisons show that the L2E is a viable robust alternative to the ML and MHD estimators. More importantly, we use the L2E to perform a comprehensive analysis of Western Australian hospital inpatient obstetrical length-of-stay (LOS, in days) data that contain extreme values. It is shown that the L2E provides a two-component Poisson mixture regression fit to the LOS data that is better than those based on the ML and MHD estimators. The L2E fit identifies admission type as a significant covariate that profiles the predominant subpopulation of normal-stayers as planned patients and the small subpopulation of long-stayers as emergency patients.
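The L2E criterion itself is short to state. The sketch below applies it to a plain Poisson model without covariates; the mixture-of-regressions setting of the paper is not implemented, and the contamination level is an assumption.

import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import poisson

def l2e_criterion(lam, y, y_max=200):
    # Discrete L2E criterion: sum_y p_lam(y)^2 - (2/n) sum_i p_lam(y_i).
    pmf = poisson.pmf(np.arange(y_max), lam)
    return (pmf ** 2).sum() - 2 * poisson.pmf(y, lam).mean()

rng = np.random.default_rng(8)
y = rng.poisson(3.0, 300)
y[:15] = rng.poisson(40.0, 15)                           # 5% gross-error contamination

ml = y.mean()                                            # ML estimate, pulled up by the outliers
l2e = minimize_scalar(l2e_criterion, bounds=(0.1, 50), args=(y,), method="bounded").x
print("ML estimate:", round(float(ml), 2), " L2E estimate:", round(float(l2e), 2))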

14.
The L1- and L2-errors of the histogram estimate of a density f from a sample X1, X2, …, Xn using a cubic partition are shown to be asymptotically normal without any unnecessary conditions imposed on the density f. The asymptotic variances are shown to depend on f only through the corresponding norm of f. From this follows the asymptotic null distribution of a goodness-of-fit test based on the total variation distance, introduced by Györfi and van der Meulen (1991). This note uses the idea of partial inversion for obtaining characteristic functions of conditional distributions, which goes back at least to Bartlett (1938).

15.
This paper is concerned with the application of artificial neural networks (ANNs) to a practical, difficult and high-dimensional classification problem, discrimination between selected underwater sounds. The application provides for a particular comparison of the relative performance of time-delay as opposed to fully connected network architectures, in the analysis of temporal data. More originally, suggestions are given for adapting the conventional backpropagation algorithm to give greater robustness to misclassification errors in the training examples, a particular problem with underwater sound data and one which may arise in other realistic applications of ANNs. An informal comparison is made between the generalisation performance of various architectures in classifying real dolphin sounds when networks are trained using the conventional least squares minimisation norm, L2, that of least absolute deviation, L1, and that of the Huber criterion, which involves a mixture of both L1 and L2. The results suggest that L1 and Huber may provide performance gains. In order to evaluate these robust adjustments more formally under controlled conditions, an experiment is then conducted using simulated dolphin sounds with known levels of random noise and misclassification error. Here, the results are more ambiguous and significant interactions are indicated which raise issues for future research.
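A toy illustration of the three training criteria on a single linear unit (not the time-delay architecture or the dolphin data); writing out the gradient of each per-example loss makes the swap from L2 to L1 or Huber explicit.

import numpy as np

def loss_grad(r, kind, delta=1.0):
    # Derivative of the per-example loss with respect to the residual r = y - yhat.
    if kind == "L2":
        return r
    if kind == "L1":
        return np.sign(r)
    # Huber: quadratic for |r| <= delta, linear beyond, so large errors get bounded influence.
    return np.where(np.abs(r) <= delta, r, delta * np.sign(r))

def train(X, y, kind, lr=0.01, epochs=500):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        r = y - X @ w
        w += lr * X.T @ loss_grad(r, kind) / len(y)       # gradient step on the chosen loss
    return w

rng = np.random.default_rng(9)
X = np.column_stack([np.ones(300), rng.normal(size=(300, 2))])
y = X @ np.array([1.0, 2.0, -1.0])
y[:30] += rng.normal(25, 5, 30)                           # 10% grossly mislabelled targets
for kind in ("L2", "L1", "Huber"):
    print(kind, np.round(train(X, y, kind), 2))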

16.
The bootstrap variance estimate is widely used in semiparametric inferences. However, its theoretical validity is a well-known open problem. In this paper, we provide a first theoretical study on the bootstrap moment estimates in semiparametric models. Specifically, we establish the bootstrap moment consistency of the Euclidean parameter, which immediately implies the consistency of the t-type bootstrap confidence set. It is worth pointing out that the only additional cost to achieve the bootstrap moment consistency, in contrast with the distribution consistency, is to simply strengthen the L1 maximal inequality condition required in the latter to the Lp maximal inequality condition for p ≥ 1. The general Lp multiplier inequality developed in this paper is also of independent interest. These general conclusions hold for the bootstrap methods with exchangeable bootstrap weights, for example, non-parametric bootstrap and Bayesian bootstrap. Our general theory is illustrated in the celebrated Cox regression model.

17.
Let the p × p matrix S have a Wishart distribution with parameter matrix Σ and n degrees of freedom. We consider here the problem of estimating the precision matrix Σ−1 under the loss functions L1(σ) = tr(σΣ) − log|σΣ| − p and L2(σ) = tr(σΣ − I)2, where σ denotes an estimator of Σ−1. James-Stein-type estimators have been derived for an arbitrary p. We also obtain an orthogonal invariant and a diagonal invariant minimax estimator under both loss functions. A Monte Carlo simulation study indicates that the risk improvement of the orthogonal invariant estimators over the James-Stein-type estimators, the Haff (1979) estimator, and the “testimator” given by Sinha and Ghosh (1987) is substantial.

18.
Estimating multivariate location and scatter with both affine equivariance and positive breakdown has always been difficult. A well-known estimator which satisfies both properties is the Minimum Volume Ellipsoid Estimator (MVE). Computing the exact MVE is often not feasible, so one usually resorts to an approximate algorithm. In the regression setup, algorithms for positive-breakdown estimators like Least Median of Squares typically recompute the intercept at each step, to improve the result. This approach is called intercept adjustment. In this paper we show that a similar technique, called location adjustment, can be applied to the MVE. For this purpose we use the Minimum Volume Ball (MVB), in order to lower the MVE objective function. An exact algorithm for calculating the MVB is presented. As an alternative to MVB location adjustment we propose L1 location adjustment, which does not necessarily lower the MVE objective function but yields more efficient estimates for the location part. Simulations compare the two types of location adjustment. We also obtain the maxbias curves of L1 and the MVB in the multivariate setting, revealing the superiority of L1.
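A sketch of the L1 location estimate (the spatial median) that underlies L1 location adjustment, computed with the classical Weiszfeld iteration; the data, tolerance, and starting value are assumptions, and the MVE/MVB machinery itself is not shown.

import numpy as np

def spatial_median(X, tol=1e-8, max_iter=500):
    # L1 (spatial) median: the point minimizing the sum of Euclidean distances to the rows of X.
    m = X.mean(axis=0)                                   # start from the coordinate-wise mean
    for _ in range(max_iter):
        d = np.maximum(np.linalg.norm(X - m, axis=1), 1e-12)
        w = 1.0 / d
        m_new = (w[:, None] * X).sum(axis=0) / w.sum()   # Weiszfeld update
        if np.linalg.norm(m_new - m) < tol:
            return m_new
        m = m_new
    return m

rng = np.random.default_rng(10)
X = rng.normal(0, 1, size=(200, 3))
X[:40] += 10                                             # 20% outlying cluster
print("mean:          ", np.round(X.mean(axis=0), 2))
print("spatial median:", np.round(spatial_median(X), 2))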

19.
The L1-type regularization provides a useful tool for variable selection in high-dimensional regression modeling. Various algorithms have been proposed to solve optimization problems for L1-type regularization; in particular, the coordinate descent algorithm has been shown to be effective in sparse regression modeling. Although the algorithm shows a remarkable performance in solving optimization problems for L1-type regularization, it suffers from outliers, since the procedure is based on the inner product of the predictor variables and partial residuals obtained in a non-robust manner. To overcome this drawback, we propose a robust coordinate descent algorithm, focusing especially on high-dimensional regression modeling based on the principal components space. We show that the proposed robust algorithm converges to the minimum value of its objective function. Monte Carlo experiments and real data analysis are conducted to examine the efficiency of the proposed robust algorithm. We observe that our robust coordinate descent algorithm performs effectively for high-dimensional regression modeling even in the presence of outliers.
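For reference, a plain (non-robust) coordinate descent for the lasso with soft-thresholding; the paper's robust variant replaces the inner-product/partial-residual step with a robust counterpart, which is not reproduced here, and the data scaling and penalty level are assumptions.

import numpy as np

def soft_threshold(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    # Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1 with standardized columns.
    n, p = X.shape
    beta = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0) / n
    r = y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * beta[j]                       # partial residual excluding coordinate j
            z = X[:, j] @ r / n                          # non-robust inner-product step
            beta[j] = soft_threshold(z, lam) / col_ss[j]
            r -= X[:, j] * beta[j]
    return beta

rng = np.random.default_rng(11)
n, p = 100, 20
X = rng.normal(size=(n, p))
X = (X - X.mean(0)) / X.std(0)
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + rng.normal(size=n)
print(np.round(lasso_cd(X, y, lam=0.1), 2))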

20.
Nonparametric regression techniques such as spline smoothing and local fitting depend implicitly on a parametric model. For instance, the cubic smoothing spline estimate of a regression function μ based on observations (ti, Yi) is the minimizer of Σ{Yi − μ(ti)}2 + λ∫(μ″)2. Since ∫(μ″)2 is zero when μ is a line, the cubic smoothing spline estimate favors the parametric model μ(t) = α0 + α1t. Here the authors consider replacing ∫(μ″)2 with the more general expression ∫(Lμ)2, where L is a linear differential operator with possibly nonconstant coefficients. The resulting estimate of μ performs well, particularly if Lμ is small. They present an O(n) algorithm for the computation of the estimate. This algorithm is applicable to a wide class of L's. They also suggest a method for the estimation of L. They study their estimates via simulation and apply them to several data sets.
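A discrete finite-difference analogue of the idea (not the authors' O(n) algorithm): penalize ||Dμ||², where D approximates a chosen differential operator, and solve the resulting ridge-type system on the observation grid. The operator order, penalty level, and data are assumptions, and only a constant-coefficient operator is used here.

import numpy as np

def diff_matrix(n, order):
    # Finite-difference approximation of the order-th derivative on an equispaced grid.
    D = np.eye(n)
    for _ in range(order):
        D = np.diff(D, axis=0)
    return D

def penalized_smoother(y, lam, order=2):
    # Minimize ||y - mu||^2 + lam * ||D mu||^2; order = 2 mimics the cubic smoothing spline
    # penalty, whose null space is the line mu(t) = a0 + a1 t.
    n = len(y)
    D = diff_matrix(n, order)
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

rng = np.random.default_rng(12)
t = np.linspace(0, 1, 200)
y = np.sin(2 * np.pi * t) + 0.3 * rng.normal(size=200)
mu_hat = penalized_smoother(y, lam=50.0, order=2)
print("residual sd:", round(float((y - mu_hat).std()), 3))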
