1.

ROBUST RIDGE REGRESSION BASED ON AN M-ESTIMATOR

MERVYN J. SILVAPULLE 《Australian & New Zealand Journal of Statistics》1991,33(3):319-333

Consider the linear regression model y = β₀1 + Xβ + ε in the usual notation. It is argued that the class of ordinary ridge estimators obtained by shrinking the least squares estimator by the matrix (X'X + kI)⁻¹X'X is sensitive to outliers in the y-variable. To overcome this problem, we propose a new class of ridge-type M-estimators, obtained by shrinking an M-estimator (instead of the least squares estimator) by the same matrix. Since the optimal value of the ridge parameter k is unknown, we suggest a procedure for choosing it adaptively. In a reasonably large scale simulation study with a particular M-estimator, we found that if the conditions are such that the M-estimator is more efficient than the least squares estimator then the corresponding ridge-type M-estimator proposed here is better, in terms of a Mean Squared Error criterion, than the ordinary ridge estimator with k chosen suitably. An example illustrates that the estimators proposed here are less sensitive to outliers in the y-variable than ordinary ridge estimators.
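
The shrinkage step described in this abstract can be illustrated with a minimal sketch for two predictors. It only applies the matrix (X'X + kI)⁻¹X'X to a given coefficient vector; the function name `ridge_shrink_2x2`, the example matrix, and the vector `beta_m` (standing in for an M-estimate computed elsewhere) are hypothetical and not taken from the paper, and the adaptive choice of k is not shown.

```python
def ridge_shrink_2x2(xtx, beta_hat, k):
    """Return (X'X + kI)^{-1} X'X beta_hat for a 2x2 matrix X'X."""
    (a, b), (c, d) = xtx
    # v = X'X beta_hat
    v0 = a * beta_hat[0] + b * beta_hat[1]
    v1 = c * beta_hat[0] + d * beta_hat[1]
    # Invert the 2x2 matrix X'X + kI explicitly.
    a_k, d_k = a + k, d + k
    det = a_k * d_k - b * c
    return [(d_k * v0 - b * v1) / det, (a_k * v1 - c * v0) / det]


# Hypothetical inputs: strongly correlated predictors and a placeholder
# M-estimate (in practice beta_m would come from a robust fit).
xtx = [[1.0, 0.9], [0.9, 1.0]]
beta_m = [2.0, -1.0]

for k in (0.0, 0.1, 1.0):
    print(k, ridge_shrink_2x2(xtx, beta_m, k))
```

With k = 0 the input vector is returned unchanged; larger k pulls the coefficients toward zero, most strongly along directions where X'X has small eigenvalues.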

2.

Robust Estimation for Parameters of the Extended Burr Type III Distribution

Yeliz Mert Kantar Vural Yildirim 《Communications in Statistics - Simulation and Computation》2015,44(7):1901-1930

We consider various robust estimators for the extended Burr Type III (EBIII) distribution for complete data with outliers. The considered robust estimators are M-estimators, least absolute deviations, Theil, Siegel's repeated median, least trimmed squares, and least median of squares. Before applying these estimators to the EBIII, we adapt the quantiles method to the estimation of the shape parameter k of the EBIII. The simulation results show that the considered robust estimators generally outperform the existing estimation approaches for data with upper outliers, with certain of them retaining a relatively high degree of efficiency for small sample sizes.

3.

Rank‐based estimate of four‐parameter logistic model

Kimberly S. Crimin Joseph W. McKean Thomas J. Vidmar 《Pharmaceutical statistics》2012,11(3):214-221

During drug development, the calculation of inhibitory concentration that results in a response of 50% (IC₅₀) is performed thousands of times every day. The nonlinear model most often used to perform this calculation is a four‐parameter logistic, suitably parameterized to estimate the IC₅₀ directly. When performing these calculations in a high‐throughput mode, each and every curve cannot be studied in detail, and outliers in the responses are a common problem. A robust estimation procedure to perform this calculation is desirable. In this paper, a rank‐based estimate of the four‐parameter logistic model that is analogous to least squares is proposed. The rank‐based estimate is based on the Wilcoxon norm. The robust procedure is illustrated with several examples from the pharmaceutical industry. When no outliers are present in the data, the robust estimate of IC₅₀ is comparable with the least squares estimate, and when outliers are present in the data, the robust estimate is more accurate. A robust goodness‐of‐fit test is also proposed. To investigate the impact of outliers on the traditional and robust estimates, a small simulation study was conducted. Copyright © 2012 John Wiley & Sons, Ltd.
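
As a rough illustration of the two ingredients named in this abstract, the sketch below shows one common parameterization of the four-parameter logistic curve with IC₅₀ as an explicit parameter, together with a dispersion function based on standard Wilcoxon scores. The parameter names (`bottom`, `top`, `ic50`, `hill`) and both function names are assumptions for illustration; the paper's exact parameterization and fitting algorithm may differ, and no optimizer is included.

```python
import math


def four_pl(x, bottom, top, ic50, hill):
    """Four-parameter logistic response at concentration x (assumed form).

    At x == ic50 the response is halfway between bottom and top.
    """
    return bottom + (top - bottom) / (1.0 + (x / ic50) ** hill)


def wilcoxon_dispersion(residuals):
    """Sum of score(rank(e_i)) * e_i with Wilcoxon scores.

    Scores are sqrt(12) * (rank / (n + 1) - 1/2); ties are broken by
    sort order here, which is a simplification.
    """
    n = len(residuals)
    order = sorted(range(n), key=lambda i: residuals[i])
    total = 0.0
    for rank, i in enumerate(order, start=1):
        score = math.sqrt(12.0) * (rank / (n + 1) - 0.5)
        total += score * residuals[i]
    return total


# Hypothetical curve: response falls from 95 to 5 with IC50 = 2.0.
print(four_pl(2.0, bottom=5.0, top=95.0, ic50=2.0, hill=1.5))

# The dispersion grows linearly in a single large residual, whereas a
# sum of squares grows quadratically.
print(wilcoxon_dispersion([-0.3, 0.1, 0.2, 10.0]))
print(sum(e * e for e in [-0.3, 0.1, 0.2, 10.0]))
```

A rank-based fit would minimize `wilcoxon_dispersion` of the residuals over the curve parameters in place of the sum of squared residuals.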

4.

Robust two parameter ridge M-estimator for linear regression

Hasan Ertaş Selma Toker Selahattin Kaçıranlar 《Journal of applied statistics》2015,42(7):1490-1502

Multicollinearity and outliers in the data set produce undesirable effects on the ordinary least squares estimator. Therefore, robust two parameter ridge estimation based on the M-estimator (ME) is introduced to deal with multicollinearity and outliers in the y-direction. The proposed estimator outperforms the ME, the two parameter ridge estimator and the robust ridge M-estimator according to the mean square error criterion. Moreover, a numerical example and a Monte Carlo simulation experiment are presented.

5.

Superiority of the r–k Class Estimator Over Some Estimators In A Linear Model

Gülesen Üstündaǧ Şiray Sadullah Sakallıoǧlu 《Communications in Statistics - Theory and Methods》2013,42(15):2819-2832

In regression analysis, to overcome the problem of multicollinearity, the r–k class estimator is proposed as an alternative to the ordinary least squares estimator; it is a general estimator that includes the ordinary ridge regression estimator, the principal components regression estimator and the ordinary least squares estimator. In this article, we derive the necessary and sufficient conditions for the superiority of the r–k class estimator over each of these estimators under the Mahalanobis loss function by the average loss criterion. Then, we compare these estimators with each other using the same criterion. We also suggest a test to verify whether these conditions are indeed satisfied. Finally, a numerical example and a Monte Carlo simulation are presented to illustrate the theoretical results.

6.

Asymptotic normal tests for integration in panels with cross-dependent units

Uwe Hassler Matei Demetrescu Adina I. Tarcolea 《AStA Advances in Statistical Analysis》2011,95(2):187-204

The asymptotically normal, regression-based LM integration test is adapted for panels with correlated units. The N different units may be integrated of different (fractional) orders under the null hypothesis. The paper first reviews conditions under which the test statistic is asymptotically (as T→∞) normal in a single unit. Then we adopt the framework of seemingly unrelated regression [SUR] for cross-correlated panels, and discuss a panel test statistic based on the feasible generalized least squares [GLS] estimator, which follows a χ²(N) distribution. Third, a more powerful statistic is obtained by working under the assumption of equal deviations from the respective null in all units. Fourth, feasible GLS requires inversion of sample covariance matrices typically imposing T>N; in addition we discuss alternative covariance matrix estimators for T<N. The usefulness of our results is assessed in Monte Carlo experimentation.

7.

Bonferroni-type Plug-in Procedure Controlling Generalized Familywise Error Rate

Li Wang 《Communications in Statistics - Theory and Methods》2013,42(14):3042-3055

Consider the multiple hypotheses testing problem controlling the generalized familywise error rate k-FWER, the probability of at least k false rejections. We propose a plug-in procedure based on the estimation of the number of true null hypotheses. Under the independence assumption of the p-values corresponding to the true null hypotheses, we first introduce the least favorable configuration (LFC) of k-FWER for the Bonferroni-type plug-in procedure, then we construct a plug-in k-FWER-controlled procedure based on the LFC. For dependent p-values, we establish the asymptotic k-FWER control under some mild conditions. Simulation studies suggest great improvement over the generalized Bonferroni test and the generalized Holm test.
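
To make the idea of a Bonferroni-type plug-in rule concrete, here is a minimal sketch. The generalized Bonferroni rule rejects a hypothesis when its p-value is at most kα/m; the plug-in variant below replaces m with an estimate of the number of true nulls. The estimator used (a Storey-type count of p-values above a cutoff `lam`) is an assumption chosen for illustration and is not claimed to be the estimator or the LFC-based calibration used in the paper. All function names are hypothetical.

```python
def generalized_bonferroni(pvalues, alpha=0.05, k=1):
    """Reject H_i when p_i <= k * alpha / m (generalized Bonferroni)."""
    m = len(pvalues)
    threshold = k * alpha / m
    return [p <= threshold for p in pvalues]


def estimate_true_nulls(pvalues, lam=0.5):
    """Storey-type estimate of the number of true nulls (assumed choice)."""
    m = len(pvalues)
    count_above = sum(1 for p in pvalues if p > lam)
    return min(m, (count_above + 1) / (1.0 - lam))


def plugin_bonferroni(pvalues, alpha=0.05, k=1, lam=0.5):
    """Same rule as above, with m replaced by the estimated null count."""
    m0_hat = estimate_true_nulls(pvalues, lam)
    threshold = k * alpha / m0_hat
    return [p <= threshold for p in pvalues]


# Hypothetical p-values for ten hypotheses.
pvals = [0.001, 0.004, 0.012, 0.018, 0.03, 0.2, 0.45, 0.6, 0.75, 0.9]
print(sum(generalized_bonferroni(pvals, alpha=0.05, k=2)))
print(sum(plugin_bonferroni(pvals, alpha=0.05, k=2)))
```

Because the estimated number of true nulls never exceeds m, the plug-in threshold is at least as large as the generalized Bonferroni threshold, which is where any gain in power comes from.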

8.

Concomitants of multivariate order statistics from multivariate elliptical distributions

Roohollah Roozegar Ahad Jamalizadeh Alireza Nematollahi 《Communications in Statistics - Theory and Methods》2013,42(3):722-738

In this article, we consider a (k + 1)n-dimensional elliptically contoured random vector (X₁^T, X₂^T, …, X_k^T, Z^T)^T = (X₁₁, …, X_1n, …, X_k1, …, X_kn, Z₁, …, Z_n)^T and derive the distribution of the concomitant of multivariate order statistics arising from X₁, X₂, …, X_k. Specifically, we derive a mixture representation for the concomitant of bivariate order statistics. The joint distribution of the concomitant of bivariate order statistics is also obtained. Finally, the usefulness of our result is illustrated with a real-life data set.

9.

Comparing several exponential populations with more than one control

Parminder Singh Asheber Abebe 《Statistical Methods and Applications》2009,18(3):359-374

Suppose there are k₁ (k₁ ≥ 1) test treatments that we wish to compare with k₂ (k₂ ≥ 1) control treatments. Assume that the observations from the ith test treatment and the jth control treatment follow two-parameter exponential distributions with a common scale parameter θ and treatment-specific location parameters, i = 1, …, k₁; j = 1, …, k₂. In this paper, simultaneous one-sided and two-sided confidence intervals are proposed for all k₁k₂ differences between the test treatment and control treatment location parameters, and the required critical points are provided. Discussions of multiple comparisons of all test treatments with the best control treatment and an optimal sample size allocation are given. Finally, it is shown that the critical points obtained can be used to construct simultaneous confidence intervals for Pareto distribution location parameters.

10.

Detecting Outliers in Gamma Distribution

M. Jabbari Nooghabi H. Jabbari Nooghabi P. Nasiri 《Communications in Statistics - Theory and Methods》2013,42(4):698-706

Zerbet and Nikulin presented the new statistic Z_k for detecting outliers in the exponential distribution. They also compared this statistic with Dixon's statistic D_k. In this article, we extend this approach to the gamma distribution and compare the result with Dixon's statistic. The results show that the test based on the statistic Z_k is more powerful than the test based on Dixon's statistic.

11.

A note on contamination models and outliers

Jürgen Wellmann Ursula Gather 《Communications in Statistics - Theory and Methods》2013,42(8):1793-1802

In order to describe or generate so-called outliers in univariate statistical data, contamination models are often used. These models assume that k out of n independent random variables are shifted or multiplied by some constant, whereas the other observations still come i.i.d. from some common target distribution. Of course, these contaminants do not necessarily stick out as the extremes in the sample. Moreover, it is the amount and magnitude of ‘contamination’ which determines the number of obvious outliers. Using the concept of Davies and Gather (1993) to formalize the outlier notion, we quantify the amount of contamination needed to produce a prespecified expected number of ‘genuine’ outliers. In particular, we demonstrate that for samples of moderate size from a normal target distribution a rather large shift of the contaminants is necessary to yield a certain expected number of outliers. Such an insight is of interest when designing simulation studies where outliers should occur as well as in theoretical investigations on outliers.
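
The relationship between the size of the shift and the expected number of outliers can be sketched numerically for a standard normal target. The sketch assumes an outlier region of the form |x| > z, with z the 1 − α_n/2 normal quantile and α_n = 1 − (1 − α)^(1/n), which is one common way of making the region depend on the sample size; whether this matches the paper's exact setup is an assumption. Only the k shifted observations are counted, and both function names are hypothetical.

```python
from statistics import NormalDist


def outlier_cutoff(n, alpha=0.05):
    """Cutoff z such that |x| > z is the assumed outlier region for N(0, 1)."""
    alpha_n = 1.0 - (1.0 - alpha) ** (1.0 / n)
    return NormalDist().inv_cdf(1.0 - alpha_n / 2.0)


def expected_outliers(n, k, shift, alpha=0.05):
    """Expected number of the k contaminants N(shift, 1) beyond the cutoff."""
    z = outlier_cutoff(n, alpha)
    std = NormalDist()
    # P(|Z + shift| > z) for Z ~ N(0, 1)
    p_outlying = (1.0 - std.cdf(z - shift)) + std.cdf(-z - shift)
    return k * p_outlying


# Hypothetical design: n = 20 observations, k = 2 of them shifted.
for shift in (1.0, 2.0, 3.0, 4.0, 5.0):
    print(shift, round(expected_outliers(20, 2, shift), 3))
```

Under these assumptions a contaminant shifted by exactly the cutoff has only about a one-in-two chance of landing in the outlier region, which is consistent with the abstract's point that a rather large shift is needed.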

12.

Identifying Variables Contributing to Outliers in Phase I

Robert L. Mason Youn-Min Chou John C. Young 《Communications in Statistics - Theory and Methods》2013,42(7):1103-1118

When a process is monitored with a T² control chart in a Phase II setting, the MYT decomposition is a valuable diagnostic tool for interpreting signals in terms of the process variables. The decomposition splits a signaling T² statistic into independent components that can be associated with either individual variables or groups of variables. Since these components are T² statistics with known distributions, they can be used to determine which of the process variable(s) contribute to the signal. However, this procedure cannot be applied directly to Phase I since the distributions of the individual components are unknown. In this article, we develop the MYT decomposition procedure for a Phase I operation, when monitoring a random sample of individual observations and identifying outliers. We use a relationship between the T² statistic in Phase I with the corresponding T² statistic resulting when an observation is omitted from this sample to derive the distributions of these components and demonstrate the Phase I application of the MYT decomposition.

13.

Heine process as a q-analog of the Poisson process—waiting and interarrival times

Andreas Kyriakoussis 《Communications in Statistics - Theory and Methods》2017,46(8):4088-4102

In this study, we introduce the Heine process, {X_q(t), t > 0}, 0 < q < 1, where the random variable X_q(t), for every t > 0, represents the number of events (occurrences or arrivals) during a time interval (0, t]. The Heine process is introduced as a q-analog of the basic Poisson process. We also prove that the distribution of the waiting time W_{ν, q}, ν ≥ 1, up to the νth arrival, is a q-Erlang distribution and that the interarrival times T_{k, q} = W_{k, q} − W_{k − 1, q}, k = 1, 2, …, ν, with W_{0, q} = 0, are independent and equidistributed with a q-Exponential distribution.

14.

Influence measure for the L1 regression

Silvia N. Elian Carmen D.S. André Subhash C. Narula 《Communications in Statistics - Theory and Methods》2013,42(4):837-849

Because outliers and leverage observations unduly affect the least squares regression, the identification of influential observations is considered an important and integral part of the analysis. However, very few techniques have been developed for the residual analysis and diagnostics for the minimum sum of absolute errors, L₁ regression. Although the L₁ regression is more resistant to outliers than the least squares regression, it appears that outliers (leverage) in the predictor variables may affect it. In this paper, our objective is to develop an influence measure for the L₁ regression based on the likelihood displacement function. We illustrate the proposed influence measure with examples.

15.

The roles of prior catchability assumptions and sample features on Bayes estimates of the number of classes in a population

Olga S. Yoshida José G. Leite Heleno Bolfarine 《Communications in Statistics - Theory and Methods》2013,42(8):1895-1904

The object of this paper is to explain the role played by the catchability and sampling in the Bayesian estimation of k, the unknown number of classes in a multinomial population. It is shown that the posterior distribution of k increases as the capture probabilities of the classes become more unequal, and that the posterior distribution of k increases with the number of classes observed in the sample and decreases with the sample size. Moreover, it is shown that the posterior mean of k is consistent.

16.

On the Sampling Distributions of the Estimated Process Loss Indices with Asymmetric Tolerances

Y. C. Chang W. L. Pearn Chien-Wei Wu 《Communications in Statistics - Simulation and Computation》2013,42(6):1153-1170

The inverse Gaussian distribution provides a flexible model for analyzing positive, right-skewed data. The generalized variable test for equality of several inverse Gaussian means with unknown and arbitrary variances has satisfactory Type-I error rate when the number of samples (k) is small (Tian, 2006). However, the Type-I error rate tends to be inflated when k goes up. In this article, we propose a parametric bootstrap (PB) approach for this problem. Simulation results show that the proposed test performs very satisfactorily regardless of the number of samples and sample sizes. This method is illustrated by an example.

17.

On adaptive linear regression

Arnab Maity Michael Sherman 《Journal of applied statistics》2008,35(12):1409-1422

Ordinary least squares (OLS) is omnipresent in regression modeling. Occasionally, least absolute deviations (LAD) or other methods are used as an alternative when there are outliers. Although some data adaptive estimators have been proposed, they are typically difficult to implement. In this paper, we propose an easy to compute adaptive estimator which is simply a linear combination of OLS and LAD. We demonstrate large sample normality of our estimator and show that its performance is close to best for both light-tailed (e.g. normal and uniform) and heavy-tailed (e.g. double exponential and t₃) error distributions. We demonstrate this through three simulation studies and illustrate our method on state public expenditures and luteinizing hormone data sets. We conclude that our method is general and easy to use, and that it gives good efficiency across a wide range of error distributions.

18.

Penalized weighted composite quantile regression in the linear regression model with heavy-tailed autocorrelated errors

《Journal of the Korean Statistical Society》2014,43(4):531-543

In this paper, a penalized weighted composite quantile regression estimation procedure is proposed to estimate unknown regression parameters and autoregression coefficients in the linear regression model with heavy-tailed autoregressive errors. Under some conditions, we show that the proposed estimator possesses the oracle properties. In addition, we introduce an iterative algorithm to solve the proposed optimization problem, and use a data-driven method to choose the tuning parameters. Simulation studies demonstrate that the proposed new estimation method is robust and works much better than the least squares based method when there are outliers in the dataset or the autoregressive error distribution is heavy-tailed. Moreover, the proposed estimator works comparably to the least squares based estimator when there are no outliers and the error is normal. Finally, we apply the proposed methodology to analyze the electricity demand dataset.

19.

Improved estimators for the selected location parameters

P. Vellaisamy Abraham P. Punnen 《Statistical Papers》2002,43(2):291-299

Let π_i, i = 1, 2, ..., k, be k independent exponential populations with different unknown location parameters θ_i, i = 1, 2, ..., k, and common known scale parameter σ. Let Y_i denote the smallest observation based on a random sample of size n from the i-th population. Suppose a subset of the given k populations is selected using the subset selection procedure according to which the population π_i is selected iff Y_i ≥ Y₍₁₎ − d, where Y₍₁₎ is the largest of the Y_i's and d is some suitable constant. The estimation of the location parameters associated with the selected populations is considered for the squared error loss. It is observed that the natural estimator dominates the unbiased estimator. It is also shown that the natural estimator itself is inadmissible and a class of improved estimators that dominate the natural estimator is obtained. The improved estimators are consistent and their risks are shown to be O(kn⁻²). As a special case, we obtain the corresponding results for the estimation of θ₍₁₎, the parameter associated with Y₍₁₎. Received: January 6, 1998; revised version: July 11, 2000

20.

Lower percentage points of Hartley's extremal quotient statistic and their applications

J. J. Bau Hubert J. Chen Shun-Yi Chen 《Communications in Statistics - Simulation and Computation》2013,42(2):443-465

Consider k (> 2) independent populations π₁, …, π_k such that observations obtained from π_i are independent and normally distributed with unknown mean µ_i and unknown variance θ_i, i = 1, …, k. In this paper, we provide lower percentage points of Hartley's extremal quotient statistic for testing an interval hypothesis H₀: θ_[k]/θ_[1] > δ vs. H_a: θ_[k]/θ_[1] ≤ δ, where δ ≥ 1 is a predetermined constant and θ_[k] (θ_[1]) is the max (min) of θ₁, …, θ_k. The least favorable configuration (LFC) for the test under H₀ is determined in order to obtain the lower percentage points. These percentage points can also be used to construct an upper confidence bound for θ_[k]/θ_[1].