1.

Location adjustment for the minimum volume ellipsoid estimator

Christophe Croux Gentiane Haesbroeck Peter J. Rousseeuw 《Statistics and Computing》2002,12(3):191-200

Estimating multivariate location and scatter with both affine equivariance and positive breakdown has always been difficult. A well-known estimator which satisfies both properties is the Minimum Volume Ellipsoid Estimator (MVE). Computing the exact MVE is often not feasible, so one usually resorts to an approximate algorithm. In the regression setup, algorithms for positive-breakdown estimators like Least Median of Squares typically recompute the intercept at each step, to improve the result. This approach is called intercept adjustment. In this paper we show that a similar technique, called location adjustment, can be applied to the MVE. For this purpose we use the Minimum Volume Ball (MVB), in order to lower the MVE objective function. An exact algorithm for calculating the MVB is presented. As an alternative to MVB location adjustment we propose L₁ location adjustment, which does not necessarily lower the MVE objective function but yields more efficient estimates for the location part. Simulations compare the two types of location adjustment. We also obtain the maxbias curves of L₁ and the MVB in the multivariate setting, revealing the superiority of L₁.

2.

Remarks on the L1 distance in statistical data analysis

Robert J. Budzyński Witold Kondracki 《Communications in Statistics - Theory and Methods》2017,46(19):9355-9363

We propose the L₁ distance between the distribution of a binned data sample and a probability distribution from which it is hypothetically drawn as a statistic for testing agreement between the data and a model. We study the distribution of this distance for N-element samples drawn from k bins of equal probability and derive asymptotic formulae for the mean and dispersion of L₁ in the large-N limit. We argue that the L₁ distance is asymptotically normally distributed, with the mean and dispersion being accurately reproduced by asymptotic formulae even for moderately large values of N and k.
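As an illustration of the statistic described in this abstract, the following minimal sketch computes the L₁ distance between the empirical bin frequencies and the model probabilities, and estimates its mean and dispersion by Monte Carlo for N-element samples from k equiprobable bins. The function names, the default number of replications, and the seed are assumptions made for this example; the article's asymptotic formulae are not reproduced here.

```python
import numpy as np

def l1_distance(counts, probs):
    """L1 distance between empirical bin frequencies and model probabilities."""
    counts = np.asarray(counts, dtype=float)
    probs = np.asarray(probs, dtype=float)
    return float(np.abs(counts / counts.sum() - probs).sum())

def simulate_l1(n_samples, n_bins, n_reps=2000, seed=0):
    """Monte Carlo mean and standard deviation of the L1 distance for
    N-element samples drawn from k bins of equal probability."""
    rng = np.random.default_rng(seed)
    probs = np.full(n_bins, 1.0 / n_bins)
    # One row of bin counts per simulated sample.
    counts = rng.multinomial(n_samples, probs, size=n_reps)
    dists = np.abs(counts / n_samples - probs).sum(axis=1)
    return float(dists.mean()), float(dists.std(ddof=1))
```

A call such as simulate_l1(1000, 20) gives simulated values that could be set against asymptotic approximations of the kind the abstract mentions.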

3.

Least-squares estimation of distribution functions in Johnson's translation system

《Journal of Statistical Computation and Simulation》2012,82(4):271-297

To summarize a set of data by a distribution function in Johnson's translation system, we use a least-squares approach to parameter estimation wherein we seek to minimize the distance between the vector of "uniformized" order statistics and the corresponding vector of expected values. We use the software package FITTRI to apply this technique to three problems arising respectively in medicine, applied statistics, and civil engineering. Compared to traditional methods of distribution fitting based on moment matching, percentile matching, L₁ estimation, and L∞ estimation, the least-squares technique is seen to yield fits of similar accuracy and to converge more rapidly and reliably to a set of acceptable parameter estimates.

4.

Using a Truncated C_p Statistic for Variable Selection in Multiple Linear Regression

D. W. Uys S. J. Steel 《Communications in Statistics - Simulation and Computation》2013,42(2):420-432

In multiple linear regression analysis each lower-dimensional subspace L of a known linear subspace M of ℝⁿ corresponds to a nonempty subset of the columns of the regressor matrix. For a fixed subspace L, the C_p statistic is an unbiased estimator of the mean square error if the projection of the response vector onto L is used to estimate the expected response. In this article, we consider two truncated versions of the C_p statistic that can also be used to estimate this mean square error. The C_p statistic and its truncated versions are compared in two example data sets, illustrating that use of the truncated versions may result in models different from those selected by standard C_p.
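For orientation, here is a minimal sketch of the standard (untruncated) C_p statistic in its textbook Mallows form, C_p = RSS_p / σ̂² − n + 2p, where σ̂² is estimated from the full model. The scaling used in the article may differ, the truncated versions are not shown, and the function names are assumptions made for this example.

```python
import numpy as np

def residual_sum_of_squares(X, y):
    """RSS after projecting y onto the column space of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def mallows_cp(X_full, y, columns):
    """Textbook Mallows C_p for the submodel that uses the given columns.

    Assumes X_full has full column rank and more rows than columns, so the
    full-model variance estimate RSS_full / (n - k) is well defined.
    """
    n, k = X_full.shape
    sigma2_hat = residual_sum_of_squares(X_full, y) / (n - k)
    rss_sub = residual_sum_of_squares(X_full[:, columns], y)
    return rss_sub / sigma2_hat - n + 2 * len(columns)
```

With this definition the full model always has C_p equal to its number of columns, which gives a quick sanity check when comparing subsets.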

5.

On the performance of L2E estimation in modelling heterogeneous count responses with extreme values

《Journal of Statistical Computation and Simulation》2012,82(3):564-581

In healthcare studies, count data sets measured with covariates often exhibit heterogeneity and contain extreme values. To analyse such count data sets, we use a finite mixture of regression model framework and investigate a robust estimation approach, called the L₂E [D.W. Scott, On fitting and adapting of density estimates, Comput. Sci. Stat. 30 (1998), pp. 124–133], to estimate the parameters. The L₂E is based on an integrated L₂ distance between parametric conditional and true conditional mass functions. In addition to studying the theoretical properties of the L₂E estimator, we compare the performance of L₂E with the maximum likelihood (ML) estimator and a minimum Hellinger distance (MHD) estimator via Monte Carlo simulations for correctly specified and gross-error contaminated mixture of Poisson regression models. These show that the L₂E is a viable robust alternative to the ML and MHD estimators. More importantly, we use the L₂E to perform a comprehensive analysis of a Western Australia hospital inpatient obstetrical length of stay (LOS) (in days) data that contains extreme values. It is shown that the L₂E provides a two-component Poisson mixture regression fit to the LOS data which is better than those based on the ML and MHD estimators. The L₂E fit identifies admission type as a significant covariate that profiles the predominant subpopulation of normal-stayers as planned patients and the small subpopulation of long-stayers as emergency patients.

6.

Robustness of weighted L^p-depth and L^p-median

Yijun Zuo 《Allgemeines Statistisches Archiv》2004,88(2):215-234

Summary: L^p-norm weighted depth functions are introduced and the local and global robustness of these weighted L^p-depth functions and their induced multivariate medians are investigated via influence function and finite sample breakdown point. To study the global robustness of depth functions, a notion of finite sample breakdown point is introduced. The weighted L^p-depth functions turn out to have the same low breakdown point as some other popular depth functions. Their influence functions are also unbounded. On the other hand, the weighted L^p-depth induced medians are globally robust with the highest possible breakdown point for any reasonable estimator. The weighted L^p-medians are also locally robust with bounded influence functions for suitable weight functions. Unlike other existing depth functions and multivariate medians, the weighted L^p depth and medians are easy to calculate in high dimensions. The price for this advantage is the lack of affine invariance and equivariance of the weighted L^p depth and medians, respectively. The author thanks the referees for their very insightful and constructive comments and suggestions which led to corrections and substantial improvements. Supported in part by NSF Grants DMS-0071976 and DMS-0134628.

7.

Comparison of computer programs for simple linear L₁ regression

《Journal of Statistical Computation and Simulation》2012,82(1-2):63-68

A number of efficient computer codes are available for the simple linear L₁ regression problem. However, a number of these codes can be made more efficient by utilizing the least squares solution. In fact, a couple of available computer programs already do so.

We report the results of a computational study comparing several openly available computer programs for solving the simple linear L₁ regression problem with and without computing and utilizing a least squares solution.
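To make the problem concrete, the sketch below solves simple linear L₁ regression by brute force. It relies on the known property that, when the x values are not all equal, some optimal L₁ line passes through at least two data points. This is not one of the programs compared in the study, it does not use a least squares start, and its O(n³) cost makes it suitable only for small illustrations; the function name is an assumption made for this example.

```python
from itertools import combinations

import numpy as np

def l1_line_fit(x, y):
    """Brute-force simple linear L1 fit: try every line through two points.

    Returns (intercept, slope, loss), or None if all x values coincide.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # vertical line, not a regression line
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        loss = float(np.abs(y - intercept - slope * x).sum())
        if best is None or loss < best[2]:
            best = (float(intercept), float(slope), loss)
    return best
```

For the toy data x = 0, 1, 2, 3, 4 and y = 0, 1, 2, 3, 10, the fit returns intercept 0 and slope 1 with total absolute error 6, so the single outlying response does not pull the line away from the other four points.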

8.

Order statistics from trivariate normal and t_ν-distributions in terms of generalized skew-normal and skew-t_ν distributions

A. Jamalizadeh N. Balakrishnan 《Journal of statistical planning and inference》2009,139(11):3799

We consider here a generalization of the skew-normal distribution, GSN(λ₁,λ₂,ρ), defined through a standard bivariate normal distribution with correlation ρ, which is a special case of the unified multivariate skew-normal distribution studied recently by Arellano-Valle and Azzalini [2006. On the unification of families of skew-normal distributions. Scand. J. Statist. 33, 561–574]. We then present some simple and useful properties of this distribution and also derive its moment generating function in an explicit form. Next, we show that distributions of order statistics from the trivariate normal distribution are mixtures of these generalized skew-normal distributions; thence, using the established properties of the generalized skew-normal distribution, we derive the moment generating functions of order statistics, and also present expressions for means and variances of these order statistics. Next, we introduce a generalized skew-t_ν distribution, which is a special case of the unified multivariate skew-elliptical distribution presented by Arellano-Valle and Azzalini [2006. On the unification of families of skew-normal distributions. Scand. J. Statist. 33, 561–574] and is in fact a three-parameter generalization of Azzalini and Capitanio's [2003. Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t distribution. J. Roy. Statist. Soc. Ser. B 65, 367–389] univariate skew-t_ν form. We then use the relationship between the generalized skew-normal and skew-t_ν distributions to discuss some properties of generalized skew-t_ν as well as distributions of order statistics from bivariate and trivariate t_ν distributions. We show that these distributions of order statistics are indeed mixtures of generalized skew-t_ν distributions, and then use this property to derive explicit expressions for means and variances of these order statistics.

9.

Statistical analysis of process capability indices with measurement errors: The case of C_p

Silvano Bordignon Michele Scagliarini 《Statistical Methods and Applications》2001,10(1-3):273-285

Process capability indices (PCIs) have been widely used in manufacturing industries to provide a quantitative measure of process potential and performance. While some efforts have been dedicated in the literature to the statistical properties of PCI estimators, scarce attention has been given to the evaluation of these properties when sample data are affected by measurement errors. In this work we deal with the problem of the effects of measurement errors on the performance of PCIs. The analysis is illustrated with reference to C_p, i.e. the simplest and most common measure suggested to evaluate process capability. The authors would like to thank two anonymous referees for their comments and suggestions that were useful in the preparation and improvement of this paper. This work was partially supported by a MURST research grant.

10.

Bootstrap inference in local polynomial regression of time series

Maria Lucia Parrella Cosimo Vitale 《Statistical Methods and Applications》2007,16(1):117-139

In this paper we consider the inferential aspect of the nonparametric estimation of a conditional function , where X_{t,m} represents the vector containing the m conditioning lagged values of the series. Here is an arbitrary measurable function. The local polynomial estimator of order p is used for the estimation of the function g, and of its partial derivatives up to a total order p. We consider α-mixing processes, and we propose the use of a particular resampling method, the local polynomial bootstrap, for the approximation of the sampling distribution of the estimator. After analyzing the consistency of the proposed method, we present a simulation study which gives evidence of its finite sample behaviour.

11.

On perturbations of Stein operator

A. N. Kumar N. S. Upadhye 《Communications in Statistics - Theory and Methods》2017,46(18):9284-9302

In this article, we obtain a Stein operator for the sum of n independent random variables (rvs) which is shown to be a perturbation of the negative binomial (NB) operator. Comparing the operator with the NB operator, we derive the error bounds for total variation distance by matching parameters. Also, a three-parameter approximation for such a sum is considered and is shown to improve the existing bounds in the literature. Finally, an application of our results to a function of waiting time for (k₁, k₂)-events is given.

12.

Estimation for a scale parameter with known coefficient of variation

Koji Kanefuji Kosei Iwase 《Statistical Papers》1998,39(4):377-388

A loss function proposed by Wasan (1970) is well-fitted for a measure of inaccuracy for an estimator of a scale parameter of a distribution defined on R⁺ = (0, ∞). We refer to this loss function as the K-loss function. A relationship between the K-loss and squared error loss functions is discussed. An optimal estimator for a scale parameter with known coefficient of variation under the K-loss function is presented.

13.

Weighted L1-estimates for the First-order Bifurcating Autoregressive Model

Tamer M. Elbayoumi Jeff Terpstra 《Communications in Statistics - Simulation and Computation》2016,45(8):2991-3013

We developed robust estimators that minimize a weighted L₁ norm for the first-order bifurcating autoregressive model. When all of the weights are fixed, our estimate is an L₁ estimate that is robust against outlying points in the response space and more efficient than the least squares estimate for heavy-tailed error distributions. When the weights are random and depend on the points in the factor space, the weighted L₁ estimate is robust against outlying points in the factor space. Simulated and artificial examples are presented. The behavior of the proposed estimate is modeled through a Monte Carlo study.

14.

Robust Coordinate Descent Algorithm Robust Solution Path for High-dimensional Sparse Regression Modeling

H. Park S. Konishi 《Communications in Statistics - Simulation and Computation》2016,45(1):115-129

The L₁-type regularization provides a useful tool for variable selection in high-dimensional regression modeling. Various algorithms have been proposed to solve optimization problems for L₁-type regularization. In particular, the coordinate descent algorithm has been shown to be effective in sparse regression modeling. Although the algorithm performs remarkably well in solving optimization problems for L₁-type regularization, it suffers from outliers, since the procedure is based on the inner product of predictor variables and partial residuals obtained in a non-robust manner. To overcome this drawback, we propose a robust coordinate descent algorithm, especially focusing on the high-dimensional regression modeling based on the principal components space. We show that the proposed robust algorithm converges to the minimum value of its objective function. Monte Carlo experiments and real data analysis are conducted to examine the efficiency of the proposed robust algorithm. We observe that our robust coordinate descent algorithm performs effectively for high-dimensional regression modeling even in the presence of outliers.
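The update that the abstract describes as non-robust can be sketched as follows. This is the standard cyclic coordinate descent for the lasso, not the robust algorithm proposed in the article: each coefficient is updated by soft-thresholding the inner product of a predictor with the partial residuals, which is exactly the quantity that outliers can distort. The objective scaling, the fixed number of sweeps, and the function names are assumptions made for this example.

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator S(z, lam) = sign(z) * max(|z| - lam, 0)."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    Assumes no column of X is identically zero.
    """
    n, p = X.shape
    b = np.zeros(p)
    col_scale = (X ** 2).sum(axis=0) / n
    for _ in range(n_sweeps):
        for j in range(p):
            # Partial residuals: the fit with predictor j left out.
            partial_resid = y - X @ b + X[:, j] * b[j]
            # Non-robust inner product of predictor j and the partial residuals.
            rho = X[:, j] @ partial_resid / n
            b[j] = soft_threshold(rho, lam) / col_scale[j]
    return b
```

A single grossly outlying response can dominate rho for every predictor it touches, which is the sensitivity a robust variant aims to remove.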

15.

Minimum variance unbiased estimation of stress–strength reliability under bivariate normal and its comparisons

Parimal Hor 《Communications in Statistics - Simulation and Computation》2017,46(3):2447-2456

In many industrial and natural phenomena, we need the probability that one component is smaller than another. Under a stress–strength model, this probability is the reliability of an item. In the independent setup, there are several approaches to estimating this reliability. Here, estimation is considered in the dependent case. Under a bivariate normal setup, the uniformly minimum variance unbiased estimator (UMVUE) is obtained. It is compared with the available estimator based on the maximum likelihood estimate (MLE) in terms of mean square error (MSE) and bias, and also by computing the L₁ distance between their distribution functions. Based on these comparisons and numerical computations, the UMVUE appears to perform well.

16.

Choosing a robustness tuning parameter

《Journal of Statistical Computation and Simulation》2012,82(7):581-588

A novel method is proposed for choosing the tuning parameter associated with a family of robust estimators. It consists of minimising estimated mean squared error, an approach that requires pilot estimation of model parameters. The method is explored for the family of minimum distance estimators proposed by [Basu, A., Harris, I.R., Hjort, N.L. and Jones, M.C., 1998, Robust and efficient estimation by minimising a density power divergence. Biometrika, 85, 549–559.] Our preference in that context is for a version of the method using the L₂ distance estimator [Scott, D.W., 2001, Parametric statistical modeling by minimum integrated squared error. Technometrics, 43, 274–285.] as pilot estimator.

17.

A new test for the mean vector in large dimension and small samples

Junguang Zhao 《Communications in Statistics - Simulation and Computation》2017,46(8):6115-6128

In this article, we consider the problem of testing the mean vector in the multivariate normal distribution, where the dimension p is greater than the sample size N. We propose a new test T_Block and obtain its asymptotic distribution. We also compare the proposed test with two other tests. The simulation results suggest that the performance of the new test is comparable to that of the two existing tests, and under some circumstances it may have higher power. Therefore, the new statistic can be employed in practice as an alternative choice.

18.

Probability generating function of GPED₂

A. Bazargan-Lari 《Statistical Papers》2007,48(3):459-466

The probability generating function of a random variable which has Generalized Polya Eggenberger Distribution of the second kind (GPED₂) is obtained. The probability density function of the range R, in random sampling from a uniform distribution on (k, l) and exponential distribution with parameter λ is obtained, when the sample size is a random variable from GPED₂. The results of Bazargan-Lari (2004) follow as special cases.

19.

Empirical Comparison of Nonparametric Regression Estimates on Real Data

Daniel Jones Michael Kohler Alexander Richter 《Communications in Statistics - Simulation and Computation》2016,45(7):2309-2319

The performance of nine different nonparametric regression estimates is empirically compared on ten different real datasets. The number of data points in the real datasets varies between 7,900 and 18,000, where each real dataset contains between 5 and 20 variables. The nonparametric regression estimates include kernel, partitioning, nearest neighbor, additive spline, neural network, penalized smoothing splines, local linear kernel, regression trees, and random forests estimates. The main result is a table containing the empirical L₂ risks of all nine nonparametric regression estimates on the evaluation part of the different datasets. The neural networks and random forests are the two estimates performing best. The datasets are publicly available, so that any new regression estimate can be easily compared with all nine estimates considered in this article by just applying it to the publicly available data and by computing its empirical L₂ risks on the evaluation part of the datasets.
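The comparison protocol in this abstract amounts to fitting an estimate on one part of a dataset and averaging its squared prediction error on the evaluation part. The minimal sketch below shows this for a plain k-nearest-neighbour estimate. The train/evaluation split, the choice of k, and the function names are assumptions made for this example, and the dense distance matrix would need chunking for datasets of the size mentioned above.

```python
import numpy as np

def knn_estimate(X_train, y_train, k):
    """Return a k-nearest-neighbour regression estimate as a callable."""
    def predict(X_new):
        # Pairwise squared Euclidean distances, shape (n_new, n_train).
        d2 = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
        nearest = np.argsort(d2, axis=1)[:, :k]
        return y_train[nearest].mean(axis=1)
    return predict

def empirical_l2_risk(estimate, X_eval, y_eval):
    """Average squared prediction error on the evaluation part."""
    return float(np.mean((estimate(X_eval) - y_eval) ** 2))
```

Any other regression estimate that maps an array of covariates to predictions can be passed to empirical_l2_risk in the same way, which is how a new method would be compared on the same evaluation data.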

20.

On the strong Kotz approximation of Dirichlet random vectors

Enkelejd Hashorva Samuel Kotz 《Statistics》2013,47(4):393-408

Let (X₁, X₂) be a bivariate L_p-norm generalized symmetrized Dirichlet (LpGSD) random vector with parameters α₁, α₂. If p = α₁ = α₂ = 2, then (X₁, X₂) is a spherical random vector. The estimation of the conditional distribution of Z_u* := X₂ | X₁ > u for u large is of some interest in statistical applications. When (X₁, X₂) is a spherical random vector with associated random radius in the Gumbel max-domain of attraction, the distribution of Z_u* can be approximated by a Gaussian distribution. Surprisingly, the same Gaussian approximation holds also for Z_u := X₂ | X₁ = u. In this paper, we are interested in conditional limit results in terms of convergence of the density functions considering a d-dimensional LpGSD random vector. Stating our results for the bivariate setup, we show that the density function of Z_u* and Z_u can be approximated by the density function of a Kotz type I LpGSD distribution, provided that the associated random radius has distribution function in the Gumbel max-domain of attraction. Further, we present two applications concerning the asymptotic behaviour of concomitants of order statistics of bivariate Dirichlet samples and the estimation of the conditional quantile function.