
1.

Another Cautionary Note about R²: Its Use in Weighted Least-Squares Regression Analysis

John B. Willett Judith D. Singer 《The American statistician》2013,67(3):236-238

A recent article in this journal presented a variety of expressions for the coefficient of determination (R²) and demonstrated that these expressions were generally not equivalent. The article discussed potential pitfalls in interpreting the R² statistic in ordinary least-squares regression analysis. The current article extends this discussion to the case in which regression models are fit by weighted least squares and points out an additional pitfall that awaits the unwary data analyst. We show that unthinking reliance on the R² statistic can lead to an overly optimistic interpretation of the proportion of variance accounted for in the regression. We propose a modification of the estimator and demonstrate its utility by example.

2.

Convergence criteria for interval-valued inequality indices

Ignacio Cascos-Fernández Miguel López-Díaz Maria Ángeles Gil 《Statistics》2013,47(1):59-66

In this paper we introduce an interval-valued inequality index for random intervals based on a convex function. We show that if this function does not grow faster than x^p, then the inequality index is continuous on the space of random intervals with finite p-th moment. A bound for the distance between the inequality indices of two random intervals is also constructed. An example is presented to motivate and illustrate the developments in this paper.

3.

On perturbations of Stein operator

A. N. Kumar N. S. Upadhye 《Communications in Statistics - Theory and Methods》2017,46(18):9284-9302

In this article, we obtain a Stein operator for the sum of n independent random variables (rvs), which is shown to be a perturbation of the negative binomial (NB) operator. Comparing this operator with the NB operator, we derive error bounds for the total variation distance by matching parameters. A three-parameter approximation for such a sum is also considered and is shown to improve the existing bounds in the literature. Finally, an application of our results to a function of waiting time for (k₁, k₂)-events is given.

4.

Kernel Method Starting with Half-Normal Detection Function for Line Transect Density Estimation

Omar Eidous 《Communications in Statistics - Theory and Methods》2013,42(14):2366-2378

In this article, we introduce the nonparametric kernel method starting with a half-normal detection function for line transect sampling. The new method improves the bias from O(h²), as the smoothing parameter h → 0, to O(h³) and in some cases to O(h⁴). Properties of the proposed estimator are derived and an expression for the asymptotic mean square error (AMSE) of the estimator is given. Minimization of the AMSE leads to an explicit formula for an optimal choice of the smoothing parameter. Small-sample properties of the estimator are investigated and compared with the traditional kernel estimator by simulation. Numerical results show that improvements over the traditional kernel estimator often can be realized even when the true detection function is far from the half-normal detection function.

5.

Minimum-Distance Estimator for Stable Exponent

Zhaozhi Fan 《Communications in Statistics - Theory and Methods》2013,42(4):511-528

Assume that X₁, X₂, …, X_n is a sequence of i.i.d. random variables with an α-stable distribution (α ∈ (0,2], the stable exponent, is the unknown parameter). We construct minimum distance estimators for α by minimizing the Kolmogorov distance or the Cramér–von Mises distance between the empirical distribution function G_n and a class of distributions defined based on the sum-preserving property of stable random variables. The minimum distance estimators can also be obtained by minimizing a U-statistic estimate of an empirical distribution function involving the stable exponent. They share the same invariance property with the maximum likelihood estimates. In this article, we prove the strong consistency and the asymptotic normality of the minimum distance estimators. A simulation study shows that the new estimators are competitive with the existing ones and perform very close even to the maximum likelihood estimator.

6.

Slow convergence of the Gibbs sampler

Claude Bélisle 《Revue canadienne de statistique》1998,26(4):629-641

We consider the Gibbs sampler as a tool for generating an absolutely continuous probability measure μ on R^d. When an appropriate irreducibility condition is satisfied, the Gibbs Markov chain (X_n; n ≥ 0) converges in total variation to its target distribution μ. Sufficient conditions for geometric convergence have been given by various authors. Here we illustrate, by means of simple examples, how slow the convergence can be. In particular, we show that given a sequence of positive numbers decreasing to zero, say (b_n; n ≥ 1), one can construct an absolutely continuous probability measure μ on R^d which is such that the total variation distance between μ and the distribution of X_n converges to 0 at a rate slower than that of the sequence (b_n; n ≥ 1). This can even be done in such a way that μ is the uniform distribution over a bounded connected open subset of R^d. Our results extend to hit-and-run samplers with direction distributions having supports with symmetric gaps.

7.

Characterization of Balanced Fractional 3^m Factorial Designs of Resolution III

Hiromu Yumiba Yoshifumi Hyodo 《Communications in Statistics - Theory and Methods》2013,42(11):2074-2080

We consider a fractional 3^m factorial design derived from a simple array (SA), which is a balanced array of full strength, where the non-negligible factorial effects are the general mean and the linear and quadratic components of the main effect, and m ≥ 2. In this article, we give a necessary and sufficient condition for an SA to be a balanced fractional 3^m factorial design of resolution III. Such a design is characterized by the suffixes of indices of an SA.

8.

A Pitman measure of similarity in k-means for clustering heavy-tailed data

Arman Reybod Javad Etminan Adel Mohammadpour 《Communications in Statistics - Simulation and Computation》2019,48(6):1595-1605

One of the most popular algorithms for partitioning data into k clusters is the k-means clustering algorithm. Since this method relies on some basic conditions, such as the existence of the mean and a finite variance, it is unsuitable for data whose variances are infinite, such as data with a heavy-tailed distribution. The Pitman Measure of Closeness (PMC) is a criterion that shows how close an estimator is to its parameter relative to another estimator. In this article, using PMC and based on k-means clustering, a new distance and clustering algorithm is developed for heavy-tailed data.

9.

On Properties of Reversed Mean Residual Life Order for Weighted Distributions

S. Izadkhah A. H. Rezaei Roknabadi G. R. Mohtashami Borzadaran 《Communications in Statistics - Theory and Methods》2013,42(5):838-851

In this article, we first provide an equivalent condition for comparing X and X^w (the weighted version of X with weight function w(·)) according to the reversed mean residual life order. We then provide conditions under which the reversed mean residual life order is preserved by weighted distributions. To this end, we obtain several independent results. Finally, the problem of preservation of the increasing reversed mean residual life class under weighting is investigated as well. Some examples are also given to illustrate the results.

10.

AN ASYMPTOTIC EXPANSION OF THE EXPECTATION OF THE ESTIMATED ERROR RATE IN DISCRIMINANT ANALYSIS

G. J. McLachlan 《Australian & New Zealand Journal of Statistics》1973,15(3):210-214

When a sample discriminant function is computed, it is desired to estimate the error rate using this function. This is often done by computing G(-D/2), where G is the cumulative normal distribution and D² is the estimated Mahalanobis' distance. In this paper an asymptotic expansion of the expectation of G(-D/2) is derived and is compared with existing Monte Carlo estimates. The asymptotic bias of G(-D/2) is derived also and the well-known practical result that G(-D/2) gives too favourable an estimate of the true error rate
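The plug-in estimate G(-D/2) mentioned in this abstract can be illustrated with a minimal sketch. It assumes that G is the standard normal CDF and that the estimated squared Mahalanobis distance D² has already been computed elsewhere; the function names `standard_normal_cdf` and `plug_in_error_rate` are hypothetical and do not come from the paper.

```python
import math


def standard_normal_cdf(x: float) -> float:
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def plug_in_error_rate(d_squared: float) -> float:
    """Return G(-D/2) for a given estimated squared Mahalanobis distance D^2.

    Illustrative helper only: it evaluates the plug-in quantity named in the
    abstract and applies no bias correction.
    """
    if d_squared < 0:
        raise ValueError("D^2 must be non-negative")
    d = math.sqrt(d_squared)
    return standard_normal_cdf(-d / 2.0)


if __name__ == "__main__":
    # With D^2 = 4 (so D = 2) the plug-in estimate is G(-1).
    print(plug_in_error_rate(4.0))
```

When D² = 0 the two groups are indistinguishable and the sketch returns 0.5; larger values of D² give smaller estimated error rates.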

11.

Improved Score Tests in Symmetric Linear Regression Models

Miguel A. Uribe-Opazo Gauss M. Cordeiro 《Communications in Statistics - Theory and Methods》2013,42(2):261-276

The class of symmetric linear regression models has the normal linear regression model as a special case and includes several models that assume that the errors follow a symmetric distribution with longer-than-normal tails. An important member of this class is the t linear regression model, which is commonly used as an alternative to the usual normal regression model when the data contain extreme or outlying observations. In this article, we develop second-order asymptotic theory for score tests in this class of models. We obtain Bartlett-corrected score statistics for testing hypotheses on the regression and the dispersion parameters. The corrected statistics have chi-squared distributions with errors of order O(n^{-3/2}), n being the sample size. The corrections represent an improvement over the corresponding original Rao's score statistics, which are chi-squared distributed up to errors of order O(n^{-1}). Simulation results show that the corrected score tests perform much better than their uncorrected counterparts in samples of small or moderate size.

12.

On Estimation of Hurst Parameter Under Noisy Observations

Guangying Liu Bing-Yi Jing 《Journal of Business & Economic Statistics》2018,36(3):483-492

It is widely accepted that some financial data exhibit long memory or long dependence, and that the observed data usually contain noise. In the continuous-time setting, the fractional Brownian motion B^H and its extensions are an important class of models for characterizing the long or short memory of data, and the Hurst parameter H is an index describing the degree of dependence. In this article, we estimate the Hurst parameter of a discretely sampled fractional integral process corrupted by noise. We use the preaverage method to diminish the impact of noise, employ the filter method to exclude the strong dependence and obtain smoothed data, and then estimate the Hurst parameter from the smoothed data. Asymptotic properties of the estimator, such as consistency and asymptotic normality, are established. Simulations for evaluating the performance of the estimator are conducted. Supplementary materials for this article are available online.

13.

Ideal bootstrap estimation of expected prediction error for k-nearest neighbor classifiers: Applications for classification and error assessment

Brian M. Steele David A. Patterson 《Statistics and Computing》2000,10(4):349-355

Euclidean distance k-nearest neighbor (k-NN) classifiers are simple nonparametric classification rules. Bootstrap methods, widely used for estimating the expected prediction error of classification rules, are motivated by the objective of calculating the ideal bootstrap estimate of expected prediction error. In practice, bootstrap methods use Monte Carlo resampling to estimate the ideal bootstrap estimate because exact calculation is generally intractable. In this article, we present analytical formulae for exact calculation of the ideal bootstrap estimate of expected prediction error for k-NN classifiers and propose a new weighted k-NN classifier based on resampling ideas. The resampling-weighted k-NN classifier replaces the k-NN posterior probability estimates by their expectations under resampling and predicts an unclassified covariate as belonging to the group with the largest resampling expectation. A simulation study and an application involving remotely sensed data show that the resampling-weighted k-NN classifier compares favorably to unweighted and distance-weighted k-NN classifiers.

14.

Cross-validation Revisited

Santanu Dutta 《Communications in Statistics - Simulation and Computation》2016,45(2):472-490

Data-based choice of the bandwidth is an important problem in kernel density estimation. The pseudo-likelihood and the least-squares cross-validation bandwidth selectors are well known, but widely criticized in the literature. For heavy-tailed distributions, the L₁ distance between the pseudo-likelihood-based estimator and the density does not seem to converge in probability to zero with increasing sample size. Even for normal-tailed densities, the rate of L₁ convergence is disappointingly slow. In this article, we report an interesting finding that with minor modifications both the cross-validation methods can be implemented effectively, even for heavy-tailed densities. For both these estimators, the L₁ distance (from the density) is shown to converge completely to zero irrespective of the tail of the density. The expected L₁ distance also goes to zero. These results hold even in the presence of a strongly mixing-type dependence. Monte Carlo simulations and analysis of the Old Faithful geyser data suggest that if implemented appropriately, contrary to the traditional belief, the cross-validation estimators compare well with the sophisticated plug-in and bootstrap-based estimators.

15.

Non Autonomous Semilinear Stochastic Evolution Equations

Xi-Liang Fan 《Communications in Statistics - Theory and Methods》2013,42(9):1806-1818

In this article, we first give a version with continuous paths for the stochastic convolution ∫₀^t U(t, s)φ(s) dW(s) driven by a Wiener process W in a Hilbert space under weaker conditions. Based on the Picard approximation and the factorization method, we prove the existence, uniqueness and regularity of mild solutions for non-autonomous semilinear stochastic evolution equations with more general assumptions on the coefficients. As an application, we obtain the Feller property of the associated semigroup.

16.

Existence Conditions for Balanced Fractional 2^m Factorial Designs of Resolution 2ℓ + 1 Derived from Simple Arrays

Y. Hyodo M. Kuwada 《Communications in Statistics - Theory and Methods》2013,42(12):2564-2570

We consider a fractional 2^m factorial design derived from a simple array (SA) such that the (ℓ + 1)-factor and higher-order interactions are negligible, where 2ℓ ≤ m. The purpose of this article is to give a necessary and sufficient condition for an SA to be a balanced fractional 2^m factorial design of resolution 2ℓ + 1. Such a design is concretely characterized by the suffixes of the indices of an SA.

17.

An Efficient Method to Generate Data and Compute Exact P-Values in Goodness-of-Fit Testing

David Magis 《Communications in Statistics - Simulation and Computation》2013,42(4):805-815

In this article, we use a characterization of the set of sample counts that do not match the null hypothesis of the goodness-of-fit test. Two direct applications arise: first, to instantaneously generate data sets whose corresponding asymptotic P-values belong to a certain pre-defined range; and second, to compute exact P-values for this test in an efficient way. We present both applications before illustrating them by analyzing a couple of data sets. The method's efficiency is also assessed by means of simulations. We focus on Pearson's X² statistic, but the case of the likelihood-ratio statistic is also discussed.

18.

Uniformly strong consistency and Berry-Esseen bound of frequency polygons for α-mixing samples

Guo-Dong Xing Shan-Chao Yang Xiaohu Li 《Communications in Statistics - Simulation and Computation》2019,48(2):416-430

In this article, the frequency polygon investigated by Scott is studied as a nonparametric estimator for α-mixing samples. By some known exponent and moment inequalities, we obtain the uniformly strong consistency and Berry-Esseen bound of the estimator. The present results relax the relevant conditions used by Carbon et al. Furthermore, the convergence rate of the uniformly asymptotic normality is derived, which is O(n^{-1/11}) under the given conditions.

19.

Identifying Variables Contributing to Outliers in Phase I

Robert L. Mason Youn-Min Chou John C. Young 《Communications in Statistics - Theory and Methods》2013,42(7):1103-1118

When a process is monitored with a T² control chart in a Phase II setting, the MYT decomposition is a valuable diagnostic tool for interpreting signals in terms of the process variables. The decomposition splits a signaling T² statistic into independent components that can be associated with either individual variables or groups of variables. Since these components are T² statistics with known distributions, they can be used to determine which of the process variable(s) contribute to the signal. However, this procedure cannot be applied directly to Phase I since the distributions of the individual components are unknown. In this article, we develop the MYT decomposition procedure for a Phase I operation, when monitoring a random sample of individual observations and identifying outliers. We use a relationship between the T² statistic in Phase I with the corresponding T² statistic resulting when an observation is omitted from this sample to derive the distributions of these components and demonstrate the Phase I application of the MYT decomposition.

20.

Convergence Rate of Strong Consistency of the Maximum Likelihood Estimator in Exponential Family Nonlinear Models

Tian Xia Shun-Fang Wang Xue-Ren Wang 《Communications in Statistics - Theory and Methods》2013,42(1):103-115

This article proposes some regularity conditions. On the basis of these conditions, we show the strong consistency of the maximum likelihood estimator (MLE) in exponential family nonlinear models (EFNM) and give its convergence rate. In an important case, we obtain the convergence rate O(n^{-1/2}(log log n)^{1/2}), the same rate as that in the Law of the Iterated Logarithm (LIL) for iid partial sums, which therefore cannot be improved.