Similar literature
Found 20 similar documents (search time: 46 ms)
1.
Consider the linear regression model $y = \beta_0 \mathbf{1} + X\beta + \varepsilon$ in the usual notation. It is argued that the class of ordinary ridge estimators, obtained by shrinking the least squares estimator by the matrix $(X'X + kI)^{-1}X'X$, is sensitive to outliers in the y-variable. To overcome this problem, we propose a new class of ridge-type M-estimators, obtained by shrinking an M-estimator (instead of the least squares estimator) by the same matrix. Since the optimal value of the ridge parameter k is unknown, we suggest a procedure for choosing it adaptively. In a reasonably large-scale simulation study with a particular M-estimator, we found that if the conditions are such that the M-estimator is more efficient than the least squares estimator, then the corresponding ridge-type M-estimator proposed here is better, in terms of a mean squared error criterion, than the ordinary ridge estimator with k chosen suitably. An example illustrates that the estimators proposed here are less sensitive to outliers in the y-variable than ordinary ridge estimators.
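
As a reading aid, the two estimator classes described in this abstract can be written side by side. This is a sketch of the verbal definition only: the symbols $\hat{\beta}_{LS}$, $\hat{\beta}_{M}$, $\hat{\beta}_{R}(k)$ and $\hat{\beta}_{RM}(k)$ are labels assumed here, not notation taken from the paper, and the treatment of the intercept is left out.

```latex
% Ordinary ridge estimator: the least squares estimator shrunk by (X'X + kI)^{-1} X'X.
% Because X'X \hat{\beta}_{LS} = X'y, this is the familiar form (X'X + kI)^{-1} X'y.
\hat{\beta}_{R}(k) = (X'X + kI)^{-1} X'X \, \hat{\beta}_{LS} = (X'X + kI)^{-1} X'y

% Ridge-type M-estimator: the same shrinkage matrix applied to an M-estimator.
\hat{\beta}_{RM}(k) = (X'X + kI)^{-1} X'X \, \hat{\beta}_{M}
```

With $k = 0$ both expressions reduce to the unshrunk estimators, so any robustness of the second class comes entirely from $\hat{\beta}_{M}$.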

2.
We consider various robust estimators for the extended Burr Type III (EBIII) distribution for complete data with outliers. The robust estimators considered are M-estimators, least absolute deviations, Theil, Siegel's repeated median, least trimmed squares, and least median of squares. Before applying these estimators to the EBIII, we adapt the quantiles method to the estimation of the shape parameter k of the EBIII. The simulation results show that the robust estimators generally outperform the existing estimation approaches for data with upper outliers, with some of them retaining a relatively high degree of efficiency for small sample sizes.

3.
During drug development, the calculation of the inhibitory concentration that results in a response of 50% (IC50) is performed thousands of times every day. The nonlinear model most often used to perform this calculation is a four-parameter logistic, suitably parameterized to estimate the IC50 directly. When these calculations are performed in a high-throughput mode, not every curve can be studied in detail, and outliers in the responses are a common problem. A robust estimation procedure for this calculation is desirable. In this paper, a rank-based estimate of the four-parameter logistic model that is analogous to least squares is proposed. The rank-based estimate is based on the Wilcoxon norm. The robust procedure is illustrated with several examples from the pharmaceutical industry. When no outliers are present in the data, the robust estimate of IC50 is comparable with the least squares estimate, and when outliers are present in the data, the robust estimate is more accurate. A robust goodness-of-fit test is also proposed. To investigate the impact of outliers on the traditional and robust estimates, a small simulation study was conducted. Copyright © 2012 John Wiley & Sons, Ltd.
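
To make "parameterized to estimate the IC50 directly" concrete, here is a minimal sketch of one common four-parameter logistic form in which the IC50 is itself a parameter. The abstract does not state the parameterization, so the function name `four_pl` and the parameter names `bottom`, `top`, `ic50` and `hill` are assumptions for illustration; the rank-based (Wilcoxon) fitting step is not shown.

```python
def four_pl(conc, bottom, top, ic50, hill):
    """Response at concentration `conc` under a four-parameter logistic curve.

    Assumed (hypothetical) parameterization, not taken from the paper:
        response = bottom + (top - bottom) / (1 + (conc / ic50) ** hill)
    so that the response at conc == ic50 is halfway between bottom and top.
    """
    if conc <= 0 or ic50 <= 0:
        raise ValueError("conc and ic50 must be positive")
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)


# At the IC50 the curve sits at the midpoint of the two asymptotes.
midpoint = four_pl(2.0, bottom=0.0, top=100.0, ic50=2.0, hill=1.5)
```

Because `ic50` appears directly in the formula, a fitting routine that estimates the four parameters returns the IC50 without any back-transformation.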

4.
Multicollinearity and outliers in the data set produce undesirable effects on the ordinary least squares estimator. Therefore, robust two-parameter ridge estimation based on the M-estimator (ME) is introduced to deal with multicollinearity and outliers in the y-direction. The proposed estimator outperforms the ME, the two-parameter ridge estimator and the robust ridge M-estimator according to the mean square error criterion. Moreover, a numerical example and a Monte Carlo simulation experiment are presented.

5.
In regression analysis, to overcome the problem of multicollinearity, the r-k class estimator is proposed as an alternative to the ordinary least squares estimator. It is a general estimator that includes the ordinary ridge regression estimator, the principal components regression estimator and the ordinary least squares estimator. In this article, we derive the necessary and sufficient conditions for the superiority of the r-k class estimator over each of these estimators under the Mahalanobis loss function by the average loss criterion. Then, we compare these estimators with each other using the same criterion. We also suggest a test to verify whether these conditions are indeed satisfied. Finally, a numerical example and a Monte Carlo simulation are presented to illustrate the theoretical results.

6.
The asymptotically normal, regression-based LM integration test is adapted for panels with correlated units. The N different units may be integrated of different (fractional) orders under the null hypothesis. The paper first reviews conditions under which the test statistic is asymptotically (as T→∞) normal in a single unit. Then we adopt the framework of seemingly unrelated regression [SUR] for cross-correlated panels, and discuss a panel test statistic based on the feasible generalized least squares [GLS] estimator, which follows a $\chi^2(N)$ distribution. Third, a more powerful statistic is obtained by working under the assumption of equal deviations from the respective null in all units. Fourth, feasible GLS requires inversion of sample covariance matrices, typically imposing T>N; in addition we discuss alternative covariance matrix estimators for T<N. The usefulness of our results is assessed in Monte Carlo experimentation.

7.
Consider the multiple hypotheses testing problem of controlling the generalized familywise error rate k-FWER, the probability of at least k false rejections. We propose a plug-in procedure based on the estimation of the number of true null hypotheses. Under the independence assumption of the p-values corresponding to the true null hypotheses, we first introduce the least favorable configuration (LFC) of k-FWER for the Bonferroni-type plug-in procedure, then we construct a plug-in k-FWER-controlled procedure based on the LFC. For dependent p-values, we establish asymptotic k-FWER control under some mild conditions. Simulation studies suggest great improvement over the generalized Bonferroni test and the generalized Holm test.
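
For orientation, the generalized Bonferroni test used as a benchmark in this abstract rejects every hypothesis whose p-value is at most kα/m, where m is the number of hypotheses. The sketch below implements that rule and, as an assumption about what a plug-in variant could look like, lets a caller replace m by an estimate of the number of true nulls. The function name and the `m0_hat` argument are hypothetical, and this is not the LFC-based procedure constructed in the paper.

```python
def generalized_bonferroni(p_values, k, alpha, m0_hat=None):
    """Indices rejected by the single-step rule p_i <= k * alpha / denom.

    denom is the number of hypotheses m (generalized Bonferroni), or a
    caller-supplied estimate m0_hat of the number of true nulls (an
    illustrative plug-in variant; an assumption, not the paper's procedure).
    """
    m = len(p_values)
    if not 1 <= k <= m:
        raise ValueError("k must be between 1 and the number of hypotheses")
    denom = m if m0_hat is None else m0_hat
    if denom <= 0:
        raise ValueError("the denominator must be positive")
    threshold = k * alpha / denom
    return [i for i, p in enumerate(p_values) if p <= threshold]


p = [0.001, 0.02, 0.04, 0.5]
plain = generalized_bonferroni(p, k=2, alpha=0.05)              # threshold 0.025
plug_in = generalized_bonferroni(p, k=2, alpha=0.05, m0_hat=2)  # threshold 0.05
```

A smaller denominator gives a larger threshold, which is why a good estimate of the number of true nulls can increase power.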

8.
In this article, we consider a (k + 1)n-dimensional elliptically contoured random vector $(X_1^T, X_2^T, \ldots, X_k^T, Z^T)^T = (X_{11}, \ldots, X_{1n}, \ldots, X_{k1}, \ldots, X_{kn}, Z_1, \ldots, Z_n)^T$ and derive the distribution of concomitants of multivariate order statistics arising from $X_1, X_2, \ldots, X_k$. In particular, we derive a mixture representation for concomitants of bivariate order statistics. The joint distribution of the concomitants of bivariate order statistics is also obtained. Finally, the usefulness of our result is illustrated with a real-life data set.

9.
Suppose there are $k_1$ ($k_1 \ge 1$) test treatments that we wish to compare with $k_2$ ($k_2 \ge 1$) control treatments. Assume that the observations from the ith test treatment and the jth control treatment follow two-parameter exponential distributions with a common scale parameter θ and treatment-specific location parameters, $i = 1, \ldots, k_1$; $j = 1, \ldots, k_2$. In this paper, simultaneous one-sided and two-sided confidence intervals are proposed for all $k_1 k_2$ differences between the test treatment and control treatment location parameters, and the required critical points are provided. Multiple comparisons of all test treatments with the best control treatment and an optimal sample size allocation are discussed. Finally, it is shown that the critical points obtained can be used to construct simultaneous confidence intervals for Pareto distribution location parameters.

10.
Zerbet and Nikulin presented the new statistic $Z_k$ for detecting outliers in the exponential distribution. They also compared this statistic with Dixon's statistic $D_k$. In this article, we extend this approach to the gamma distribution and compare the result with Dixon's statistic. The results show that the test based on the statistic $Z_k$ is more powerful than the test based on Dixon's statistic.

11.
In order to describe or generate so-called outliers in univariate statistical data, contamination models are often used. These models assume that k out of n independent random variables are shifted or multiplied by some constant, whereas the other observations still come i.i.d. from some common target distribution. Of course, these contaminants do not necessarily stick out as the extremes in the sample. Moreover, it is the amount and magnitude of 'contamination' which determines the number of obvious outliers. Using the concept of Davies and Gather (1993) to formalize the outlier notion, we quantify the amount of contamination needed to produce a prespecified expected number of 'genuine' outliers. In particular, we demonstrate that for samples of moderate size from a normal target distribution a rather large shift of the contaminants is necessary to yield a certain expected number of outliers. Such an insight is of interest when designing simulation studies where outliers should occur, as well as in theoretical investigations on outliers.
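
The shift-contamination setup described here is easy to simulate. The sketch below draws n standard normal values, shifts the first k by a constant, and counts the observations falling outside a fixed two-sided normal cutoff. The cutoff $|x| > z_{1-\alpha/2}$ is a simplified stand-in for the outlier region of Davies and Gather and should be read as an assumption, as should the function names and the choice of a standard normal target.

```python
import random
from statistics import NormalDist


def contaminated_sample(n, k, shift, rng):
    """n standard-normal draws; the first k are shifted by `shift` (contaminants)."""
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    clean = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [x + shift if i < k else x for i, x in enumerate(clean)]


def count_alpha_outliers(sample, alpha):
    """Number of points with |x| above the N(0, 1) quantile z_{1 - alpha/2}.

    Assumption: a fixed-alpha region for a standard normal target, used here
    only as a simple proxy for a formal outlier definition.
    """
    cutoff = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return sum(1 for x in sample if abs(x) > cutoff)


rng = random.Random(0)  # seeded so that repeated runs give the same sample
sample = contaminated_sample(n=20, k=3, shift=2.0, rng=rng)
n_flagged = count_alpha_outliers(sample, alpha=0.05)
```

Averaging `n_flagged` over many seeds for different values of `shift` gives a rough picture of how large a shift is needed before the contaminants are actually flagged.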

12.
When a process is monitored with a $T^2$ control chart in a Phase II setting, the MYT decomposition is a valuable diagnostic tool for interpreting signals in terms of the process variables. The decomposition splits a signaling $T^2$ statistic into independent components that can be associated with either individual variables or groups of variables. Since these components are $T^2$ statistics with known distributions, they can be used to determine which of the process variable(s) contribute to the signal. However, this procedure cannot be applied directly to Phase I since the distributions of the individual components are unknown. In this article, we develop the MYT decomposition procedure for a Phase I operation, when monitoring a random sample of individual observations and identifying outliers. We use a relationship between the $T^2$ statistic in Phase I and the corresponding $T^2$ statistic obtained when an observation is omitted from this sample to derive the distributions of these components and demonstrate the Phase I application of the MYT decomposition.

13.
In this study, we introduce the Heine process, $\{X_q(t), t > 0\}$, $0 < q < 1$, where the random variable $X_q(t)$, for every $t > 0$, represents the number of events (occurrences or arrivals) during a time interval $(0, t]$. The Heine process is introduced as a q-analog of the basic Poisson process. Also, in this study, we prove that the distribution of the waiting time $W_{\nu,q}$, $\nu \ge 1$, up to the νth arrival, is a q-Erlang distribution and that the interarrival times $T_{k,q} = W_{k,q} - W_{k-1,q}$, $k = 1, 2, \ldots, \nu$, with $W_{0,q} = 0$, are independent and equidistributed with a q-Exponential distribution.

14.
Because outliers and leverage observations unduly affect least squares regression, the identification of influential observations is considered an important and integral part of the analysis. However, very few techniques have been developed for residual analysis and diagnostics for minimum sum of absolute errors ($L_1$) regression. Although $L_1$ regression is more resistant to outliers than least squares regression, it appears that outliers (leverage) in the predictor variables may affect it. In this paper, our objective is to develop an influence measure for $L_1$ regression based on the likelihood displacement function. We illustrate the proposed influence measure with examples.

15.
The object of this paper is to explain the roles played by catchability and sampling in the Bayesian estimation of k, the unknown number of classes in a multinomial population. It is shown that the posterior distribution of k increases as the capture probabilities of the classes become more unequal, and that the posterior distribution of k increases with the number of classes observed in the sample and decreases with the sample size. Moreover, it is shown that the posterior mean of k is consistent.

16.
The inverse Gaussian distribution provides a flexible model for analyzing positive, right-skewed data. The generalized variable test for equality of several inverse Gaussian means with unknown and arbitrary variances has satisfactory Type-I error rate when the number of samples (k) is small (Tian, 2006). However, the Type-I error rate tends to be inflated when k goes up. In this article, we propose a parametric bootstrap (PB) approach for this problem. Simulation results show that the proposed test performs very satisfactorily regardless of the number of samples and sample sizes. This method is illustrated by an example.

17.
Ordinary least squares (OLS) is omnipresent in regression modeling. Occasionally, least absolute deviations (LAD) or other methods are used as an alternative when there are outliers. Although some data-adaptive estimators have been proposed, they are typically difficult to implement. In this paper, we propose an easy-to-compute adaptive estimator which is simply a linear combination of OLS and LAD. We demonstrate large-sample normality of our estimator and show that its performance is close to best for both light-tailed (e.g. normal and uniform) and heavy-tailed (e.g. double exponential and $t_3$) error distributions. We demonstrate this through three simulation studies and illustrate our method on state public expenditures and luteinizing hormone data sets. We conclude that our method is general and easy to use, and gives good efficiency across a wide range of error distributions.

18.
In this paper, a penalized weighted composite quantile regression estimation procedure is proposed to estimate unknown regression parameters and autoregression coefficients in the linear regression model with heavy-tailed autoregressive errors. Under some conditions, we show that the proposed estimator possesses the oracle properties. In addition, we introduce an iterative algorithm to solve the proposed optimization problem, and use a data-driven method to choose the tuning parameters. Simulation studies demonstrate that the proposed estimation method is robust and works much better than the least squares based method when there are outliers in the dataset or the autoregressive errors follow heavy-tailed distributions. Moreover, the proposed estimator works comparably to the least squares based estimator when there are no outliers and the error is normal. Finally, we apply the proposed methodology to analyze the electricity demand dataset.

19.
Let $\pi_i$, $i = 1, 2, \ldots, k$, be k independent exponential populations with different unknown location parameters $\theta_i$, $i = 1, 2, \ldots, k$, and common known scale parameter σ. Let $Y_i$ denote the smallest observation based on a random sample of size n from the i-th population. Suppose a subset of the given k populations is selected using the subset selection procedure according to which the population $\pi_i$ is selected iff $Y_i \ge Y_{(1)} - d$, where $Y_{(1)}$ is the largest of the $Y_i$'s and d is some suitable constant. The estimation of the location parameters associated with the selected populations is considered for the squared error loss. It is observed that the natural estimator dominates the unbiased estimator. It is also shown that the natural estimator itself is inadmissible, and a class of improved estimators that dominate the natural estimator is obtained. The improved estimators are consistent and their risks are shown to be $O(kn^{-2})$. As a special case, we obtain the corresponding results for the estimation of $\theta_{(1)}$, the parameter associated with $Y_{(1)}$. Received: January 6, 1998; revised version: July 11, 2000

20.
Consider $k\ (>2)$ independent populations $\pi_1, \ldots, \pi_k$ such that observations obtained from $\pi_i$ are independent and normally distributed with unknown mean $\mu_i$ and unknown variance $\theta_i$, $i = 1, \ldots, k$. In this paper, we provide lower percentage points of Hartley's extremal quotient statistic for testing an interval hypothesis $H_0: \theta_{[k]}/\theta_{[1]} > \delta$ vs. $H_a: \theta_{[k]}/\theta_{[1]} \le \delta$, where $\delta \ge 1$ is a predetermined constant and $\theta_{[k]}$ ($\theta_{[1]}$) is the max (min) of $\theta_1, \ldots, \theta_k$. The least favorable configuration (LFC) for the test under $H_0$ is determined in order to obtain the lower percentage points. These percentage points can also be used to construct an upper confidence bound for $\theta_{[k]}/\theta_{[1]}$.
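
Hartley's extremal quotient is the ratio of the largest to the smallest sample variance across the k groups. A minimal sketch of the statistic itself follows; the function name is an assumption, and the percentage points and the interval-hypothesis test from the paper are not reproduced.

```python
from statistics import variance


def extremal_quotient(samples):
    """Largest sample variance divided by the smallest, across the groups."""
    variances = [variance(group) for group in samples]
    smallest = min(variances)
    if smallest == 0:
        raise ValueError("a group with zero variance makes the ratio undefined")
    return max(variances) / smallest


# Three groups with sample variances 1, 4 and 9.
ratio = extremal_quotient([[1, 2, 3], [2, 4, 6], [0, 3, 6]])
```

In the testing problem above, small values of this ratio are evidence for $H_a$ (the variances are close to equal), which is why lower rather than upper percentage points are needed.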
