1.

Comparison of Generalized Lambda Distribution (GLD) and Response Modeling Methodology (RMM) as General Platforms for Distribution Fitting

Haim Shore 《Communications in Statistics - Theory and Methods》2013,42(15):2805-2819

Distribution fitting is widely practiced in all branches of engineering and applied science, yet only a few studies have examined the relative capability of various parameter-rich families of distributions to represent a wide spectrum of diversely shaped distributions. In this article, two such families of distributions, Generalized Lambda Distribution (GLD) and Response Modeling Methodology (RMM), are compared. For a sample of some commonly used distributions, each family is fitted to each distribution, using two methods: fitting by minimization of the L₂ norm (minimizing density function distance) and nonlinear regression applied to a sample of exact quantile values (minimizing quantile function distance). The resultant goodness-of-fit is assessed by four criteria: the optimized value of the L₂ norm, and three additional criteria, relating to quantile function matching. Results show that RMM is uniformly better than GLD. An additional study includes Shore's quantile function (QF) and again RMM is the best performer, followed by Shore's QF and then GLD.

2.

Weighted L₁-estimates for the First-order Bifurcating Autoregressive Model

Tamer M. Elbayoumi Jeff Terpstra 《Communications in Statistics - Simulation and Computation》2016,45(8):2991-3013

We developed robust estimators that minimize a weighted L₁ norm for the first-order bifurcating autoregressive model. When all of the weights are fixed, our estimate is an L₁ estimate that is robust against outlying points in the response space and more efficient than the least squares estimate for heavy-tailed error distributions. When the weights are random and depend on the points in the factor space, the weighted L₁ estimate is robust against outlying points in the factor space. Simulated and artificial examples are presented. The behavior of the proposed estimate is modeled through a Monte Carlo study.

3.

Order statistics from trivariate normal and t_ν-distributions in terms of generalized skew-normal and skew-t_ν distributions (total citations: 1; self-citations: 0; citations by others: 1)

A. Jamalizadeh N. Balakrishnan 《Journal of statistical planning and inference》2009,139(11):3799

We consider here a generalization of the skew-normal distribution, GSN(λ₁,λ₂,ρ), defined through a standard bivariate normal distribution with correlation ρ, which is a special case of the unified multivariate skew-normal distribution studied recently by Arellano-Valle and Azzalini [2006. On the unification of families of skew-normal distributions. Scand. J. Statist. 33, 561–574]. We then present some simple and useful properties of this distribution and also derive its moment generating function in an explicit form. Next, we show that distributions of order statistics from the trivariate normal distribution are mixtures of these generalized skew-normal distributions; thence, using the established properties of the generalized skew-normal distribution, we derive the moment generating functions of order statistics, and also present expressions for means and variances of these order statistics. Next, we introduce a generalized skew-t_ν distribution, which is a special case of the unified multivariate skew-elliptical distribution presented by Arellano-Valle and Azzalini [2006. On the unification of families of skew-normal distributions. Scand. J. Statist. 33, 561–574] and is in fact a three-parameter generalization of Azzalini and Capitanio's [2003. Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t distribution. J. Roy. Statist. Soc. Ser. B 65, 367–389] univariate skew-t_ν form. We then use the relationship between the generalized skew-normal and skew-t_ν distributions to discuss some properties of generalized skew-t_ν as well as distributions of order statistics from bivariate and trivariate t_ν distributions. We show that these distributions of order statistics are indeed mixtures of generalized skew-t_ν distributions, and then use this property to derive explicit expressions for means and variances of these order statistics.

4.

On the performance of L₂E estimation in modelling heterogeneous count responses with extreme values

《Journal of Statistical Computation and Simulation》2012,82(3):564-581

In healthcare studies, count data sets measured with covariates often exhibit heterogeneity and contain extreme values. To analyse such count data sets, we use a finite mixture of regression model framework and investigate a robust estimation approach, called the L₂E [D.W. Scott, On fitting and adapting of density estimates, Comput. Sci. Stat. 30 (1998), pp. 124–133], to estimate the parameters. The L₂E is based on an integrated L₂ distance between parametric conditional and true conditional mass functions. In addition to studying the theoretical properties of the L₂E estimator, we compare the performance of L₂E with the maximum likelihood (ML) estimator and a minimum Hellinger distance (MHD) estimator via Monte Carlo simulations for correctly specified and gross-error contaminated mixture of Poisson regression models. These show that the L₂E is a viable robust alternative to the ML and MHD estimators. More importantly, we use the L₂E to perform a comprehensive analysis of a Western Australia hospital inpatient obstetrical length of stay (LOS) (in days) data that contains extreme values. It is shown that the L₂E provides a two-component Poisson mixture regression fit to the LOS data which is better than those based on the ML and MHD estimators. The L₂E fit identifies admission type as a significant covariate that profiles the predominant subpopulation of normal-stayers as planned patients and the small subpopulation of long-stayers as emergency patients.

5.

Least-squares estimation of distribution functions in Johnson's translation system

《Journal of Statistical Computation and Simulation》2012,82(4):271-297

To summarize a set of data by a distribution function in Johnson's translation system, we use a least-squares approach to parameter estimation wherein we seek to minimize the distance between the vector of "uniformized" order statistics and the corresponding vector of expected values. We use the software package FITTRI to apply this technique to three problems arising respectively in medicine, applied statistics, and civil engineering. Compared to traditional methods of distribution fitting based on moment matching, percentile matching, L₁ estimation, and L_∞ estimation, the least-squares technique is seen to yield fits of similar accuracy and to converge more rapidly and reliably to a set of acceptable parameter estimates.

6.

On least absolute values estimation

J. E. Gentle W. J. Kennedy V. A. Sposito 《Communications in Statistics - Theory and Methods》2013,42(9):839-845

The resistance of least absolute values (L₁) estimators to outliers and their robustness to heavy-tailed distributions make these estimators useful alternatives to the usual least squares estimators. The recent development of efficient algorithms for L₁ estimation in linear models has permitted their use in practical data analysis. Although in general the L₁ estimators are not unique, there are a number of properties they all share. The set of all L₁ estimators for a given model and data set can be characterized as the convex hull of some extreme estimators. Properties of the extreme estimators and of the L₁-estimate set are considered.
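The non-uniqueness mentioned in this abstract can be seen in the simplest special case, an intercept-only (location) model, where the L₁ estimate is a median. The sketch below is an illustration under that assumption, not the algorithm of the paper; the function names `sum_abs_dev` and `l1_location_set` are hypothetical.

```python
def sum_abs_dev(data, m):
    """L1 criterion for a location model: sum of absolute deviations from m."""
    return sum(abs(x - m) for x in data)


def l1_location_set(data):
    """Return the two extreme L1 location estimates (lo, hi).

    Every m in the closed interval [lo, hi] minimizes sum_abs_dev, so the
    solution set is the convex hull of the two extreme estimates. For an odd
    sample size, lo == hi and the estimate is unique.
    """
    s = sorted(data)
    n = len(s)
    return s[(n - 1) // 2], s[n // 2]


data = [1, 2, 4, 10]
lo, hi = l1_location_set(data)  # (2, 4)
# The criterion is flat on [2, 4] and larger outside it.
print(sum_abs_dev(data, lo), sum_abs_dev(data, 3), sum_abs_dev(data, hi))  # 11 11 11
print(sum_abs_dev(data, 5))  # 13
```

With an even number of observations the criterion is constant between the two middle order statistics, which is why a whole interval of estimates is optimal.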

7.

On the asymptotic normality of the L₁- and L₂-errors in histogram density estimation

Jan Beirlant László Györfi Gábor Lugosi 《Revue canadienne de statistique》1994,22(3):309-318

The L₁- and L₂-errors of the histogram estimate of a density f from a sample X₁, X₂, …, Xₙ using a cubic partition are shown to be asymptotically normal without any unnecessary conditions imposed on the density f. The asymptotic variances are shown to depend on f only through the corresponding norm of f. From this follows the asymptotic null distribution of a goodness-of-fit test based on the total variation distance, introduced by Györfi and van der Meulen (1991). This note uses the idea of partial inversion for obtaining characteristic functions of conditional distributions, which goes back at least to Bartlett (1938).

8.

The minimum L₂ distance estimator for Poisson mixture models

Ian R. Harris Shuyi Shen 《Journal of statistical planning and inference》2011,141(3):1088-1101

A robust estimator is developed for Poisson mixture models with a known number of components. The proposed estimator minimizes the L₂ distance between a sample of data and the model. When the component distributions are completely known, the estimators for the mixing proportions are in closed form. When the parameters for the component Poisson distributions are unknown, numerical methods are needed to calculate the estimators. Compared to the minimum Hellinger distance estimator, the minimum L₂ estimator can be less robust to extreme outliers, and often more robust to moderate outliers.
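To make the closed-form case concrete, here is a minimal sketch for two known Poisson components. It assumes the L₂ distance is the sum over k of (d(k) − π f₁(k) − (1 − π) f₂(k))², where d is the empirical relative frequency; setting the derivative in π to zero gives π = Σ(d − f₂)(f₁ − f₂) / Σ(f₁ − f₂)². The function names, the truncation point `support_max`, and the clipping to [0, 1] are assumptions made for illustration, not details taken from the paper.

```python
import math
from collections import Counter


def poisson_pmf(k, lam):
    """Poisson probability mass at k, computed on the log scale for stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def relative_frequencies(sample):
    """Empirical relative frequency of each observed count."""
    n = len(sample)
    return {k: c / n for k, c in Counter(sample).items()}


def l2_mixing_proportion(freq, lam1, lam2, support_max=100):
    """Closed-form minimizer of the L2 distance over the mixing proportion.

    freq maps counts to relative frequencies; lam1 and lam2 are the known
    component means. The sums are truncated at support_max (an assumption),
    so observations above it are ignored.
    """
    num = 0.0
    den = 0.0
    for k in range(support_max + 1):
        d = freq.get(k, 0.0)
        f1 = poisson_pmf(k, lam1)
        f2 = poisson_pmf(k, lam2)
        num += (d - f2) * (f1 - f2)
        den += (f1 - f2) ** 2
    if den == 0.0:
        raise ValueError("components are identical; proportion not identifiable")
    # Clip to the valid range for a proportion (illustrative choice).
    return min(1.0, max(0.0, num / den))


freq = relative_frequencies([0, 1, 1, 2, 3, 8, 9, 9, 10, 12])
print(l2_mixing_proportion(freq, lam1=2.0, lam2=9.0))
```

When the relative frequencies equal an exact mixture of the two components, the formula returns the true proportion up to floating-point error, which is a quick sanity check on the derivation.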

9.

Asymmetric generalizations of symmetric univariate probability distributions obtained through quantile splicing

Brenda V. Mac’Oduol Paul J. van Staden Robert A. R. King 《Communications in Statistics - Theory and Methods》2020,49(18):4413-4429

Balakrishnan et al. proposed a two-piece skew logistic distribution by making use of the cumulative distribution function (CDF) of half distributions as the building block, to give rise to an asymmetric family of two-piece distributions, through the inclusion of a single shape parameter. This paper proposes the construction of asymmetric families of two-piece distributions by making use of quantile functions of symmetric distributions as building blocks. This proposition will enable the derivation of a general formula for the L-moments of two-piece distributions. Examples will be presented, where the logistic, normal, Student’s t(2) and hyperbolic secant distributions are considered.

10.

Comparison of computer programs for simple linear L₁ regression

《Journal of Statistical Computation and Simulation》2012,82(1-2):63-68

A number of efficient computer codes are available for the simple linear L₁ regression problem. However, a number of these codes can be made more efficient by utilizing the least squares solution. In fact, a couple of available computer programs already do so.

We report the results of a computational study comparing several openly available computer programs for solving the simple linear L₁ regression problem with and without computing and utilizing a least squares solution.

11.

$\mathcal{L}_p$ loss functions: a robust Bayesian approach

J. P. Arias-Nicolás J. Martín A. Suárez-Llorens 《Statistical Papers》2009,50(3):501-509

In Bayesian inference, the Bayes estimator is the alternative with the minimum expected loss. In most cases, the loss function measures the distance between the alternative and the parameter. Therefore, any distance can lead to a loss function. Among the best known distance functions is the L_p distance, where the choice of the value p may be difficult and arbitrary. This paper examines robust models where the loss function is modelled by the L_p family. Our solution concept is the non-dominated alternative. We characterize the non-dominated set by having the posterior distribution function satisfy a particular asymmetry property. We also include an example to illustrate the methodology described.

12.

Using a Truncated C_p Statistic for Variable Selection in Multiple Linear Regression

D. W. Uys S. J. Steel 《Communications in Statistics - Simulation and Computation》2013,42(2):420-432

In multiple linear regression analysis each lower-dimensional subspace L of a known linear subspace M of ℝⁿ corresponds to a nonempty subset of the columns of the regressor matrix. For a fixed subspace L, the C_p statistic is an unbiased estimator of the mean square error if the projection of the response vector onto L is used to estimate the expected response. In this article, we consider two truncated versions of the C_p statistic that can also be used to estimate this mean square error. The C_p statistic and its truncated versions are compared in two example data sets, illustrating that use of the truncated versions may result in models different from those selected by standard C_p.

13.

Choosing a robustness tuning parameter

《Journal of Statistical Computation and Simulation》2012,82(7):581-588

A novel method is proposed for choosing the tuning parameter associated with a family of robust estimators. It consists of minimising estimated mean squared error, an approach that requires pilot estimation of model parameters. The method is explored for the family of minimum distance estimators proposed by [Basu, A., Harris, I.R., Hjort, N.L. and Jones, M.C., 1998, Robust and efficient estimation by minimising a density power divergence. Biometrika, 85, 549–559]. Our preference in that context is for a version of the method using the L₂ distance estimator [Scott, D.W., 2001, Parametric statistical modeling by minimum integrated squared error. Technometrics, 43, 274–285] as pilot estimator.

14.

On the efficiency of using the sample kurtosis in selecting optimal l_p estimators

V. A. Sposito M. L. Hand Bradley Skarpness 《Communications in Statistics - Simulation and Computation》2013,42(3):265-272

This paper examines the efficiency of the sample kurtosis in obtaining L_p estimates as estimates of central tendency for symmetric distributions. Moreover, guidelines are established for determining an optimal value of p based on the kurtosis of the error distribution.

15.

Lower bound of average centered L₂-discrepancy for U-type designs

Xue Yang Gui-Jun Yang 《Communications in Statistics - Theory and Methods》2019,48(4):995-1008

Uniform designs are widely used in various scientific investigations and industrial applications. By considering all possible level permutations of the factors, a connection between average centered L₂-discrepancy and generalized wordlength pattern for asymmetrical fractional factorial designs is derived. Moreover, we present new lower bounds to the average centered L₂-discrepancy for symmetrical and asymmetrical U-type designs. For illustration of the theoretical results, the lower bounds for symmetrical and asymmetrical U-type designs are tabulated, and numerical results indicate that our lower bounds behave well and can be recommended for use in practice.

16.

A stochastic representation for the l_p-norm symmetric distribution and its applications

Jiajuan Liang Kai Wang Ng Guoliang Tian 《Communications in Statistics - Simulation and Computation》2017,46(8):6011-6018

The family of l_p-norm symmetric distributions was proposed by Yue and Ma and is a natural generalization to the family of l₁-norm symmetric distributions studied by Fang et al. In this article, we propose a stochastic representation for the l_p-norm symmetric distribution for any constant p > 0. The stochastic representation is expressed through independent and identically distributed uniform U(0, 1) random variables. It is illustrated that the stochastic representation can be applied to statistical simulation and uniform experimental design.

17.

The ₃F₂ with Complex Parameters as Generating Function of Discrete Distribution

J. Rodríguez Avi M. J. Olmo Jiménez A. Conde Sánchez A. J. Sáez Castillo 《Communications in Statistics - Theory and Methods》2013,42(19):3009-3022

A new discrete family of probability distributions that are generated by the ₃F₂ function with complex parameters is presented. Some of the properties of this new family are studied as well as methods of estimation for its parameters. It affords considerable flexibility of shape which turns the distribution into an appropriate candidate for modeling data that cannot be adequately fitted by classical families with fewer parameters. Finally, three examples in the fields of Agriculture and Education are included in order to show the versatility and utility of this distribution.

18.

Remarks on the L₁ distance in statistical data analysis

Robert J. Budzyński Witold Kondracki 《Communications in Statistics - Theory and Methods》2017,46(19):9355-9363

We propose the L₁ distance between the distribution of a binned data sample and a probability distribution from which it is hypothetically drawn as a statistic for testing agreement between the data and a model. We study the distribution of this distance for N-element samples drawn from k bins of equal probability and derive asymptotic formulae for the mean and dispersion of L₁ in the large-N limit. We argue that the L₁ distance is asymptotically normally distributed, with the mean and dispersion being accurately reproduced by asymptotic formulae even for moderately large values of N and k.
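As a rough illustration of the statistic described in this abstract, the sketch below computes an L₁ distance between binned relative frequencies and model bin probabilities, defaulting to k bins of equal probability. The normalization (relative frequencies rather than raw counts) and the function name `l1_distance` are assumptions; the paper may scale the statistic differently.

```python
def l1_distance(counts, probs=None):
    """Sum of |observed relative frequency - model probability| over bins.

    counts: number of sample points that fell in each bin.
    probs:  model probability of each bin; defaults to equal probability
            1/k for each of the k bins (an illustrative default).
    """
    n = sum(counts)
    k = len(counts)
    if n == 0:
        raise ValueError("empty sample")
    if probs is None:
        probs = [1.0 / k] * k
    if len(probs) != k:
        raise ValueError("counts and probs must have the same length")
    return sum(abs(c / n - p) for c, p in zip(counts, probs))


print(l1_distance([5, 5, 5, 5]))  # 0.0: the sample matches the model exactly
print(l1_distance([3, 1]))        # 0.5
print(l1_distance([10, 0]))       # 1.0: all mass in one of two equal bins
```

Under this normalization the statistic lies between 0 and 2, and it equals twice the total variation distance between the two discrete distributions.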

19.

Local Linear Estimation for Spatiotemporal Models Based on Least Absolute Deviation

Hongxia Wang Jinguan Lin Jinde Wang 《Communications in Statistics - Theory and Methods》2013,42(7):1508-1522

When the data contain outliers or come from populations with heavy-tailed distributions, as is often the case with spatiotemporal data, estimation methods based on the least-squares (L₂) method do not perform well. More robust estimation methods are required. In this article, we propose the local linear estimation for spatiotemporal models based on least absolute deviation (L₁) and derive the asymptotic distributions of the L₁-estimators under some mild conditions imposed on the spatiotemporal process. The simulation results for two examples, with outliers and heavy-tailed distribution, respectively, show that the L₁-estimators perform better than the L₂-estimators.

20.

Empirical Comparison of Nonparametric Regression Estimates on Real Data

Daniel Jones Michael Kohler Alexander Richter 《Communications in Statistics - Simulation and Computation》2016,45(7):2309-2319

The performance of nine different nonparametric regression estimates is empirically compared on ten different real datasets. The number of data points in the real datasets varies between 7,900 and 18,000, where each real dataset contains between 5 and 20 variables. The nonparametric regression estimates include kernel, partitioning, nearest neighbor, additive spline, neural network, penalized smoothing splines, local linear kernel, regression trees, and random forests estimates. The main result is a table containing the empirical L₂ risks of all nine nonparametric regression estimates on the evaluation part of the different datasets. The neural networks and random forests are the two estimates performing best. The datasets are publicly available, so that any new regression estimate can be easily compared with all nine estimates considered in this article by just applying it to the publicly available data and by computing its empirical L₂ risks on the evaluation part of the datasets.