
1.

Using principal components to test normality of high-dimensional data

Rashid Mansoor 《Communications in Statistics - Simulation and Computation》2017,46(5):3396-3405

Many multivariate statistical procedures are based on the assumption of normality, and different approaches have been proposed for testing this assumption. The vast majority of these tests, however, are designed exclusively for cases where the sample size n is larger than the dimension of the variable p, and the null distributions of their test statistics are usually derived under the asymptotic case where p is fixed and n increases. In this article, a test that utilizes principal components to test for nonnormality is proposed for cases where p/n → c. The power and size of the test are examined through Monte Carlo simulations, and it is argued that the test remains well behaved and consistent against most nonnormal distributions under this type of asymptotics.

2.

Testing homogeneity of several covariance matrices and multi-sample sphericity for high-dimensional data under non-normality

M. Rauf Ahmad 《Communications in Statistics - Theory and Methods》2017,46(8):3738-3753

A test for homogeneity of g ≥ 2 covariance matrices is presented when the dimension, p, may exceed the sample size, n_i, i = 1, …, g, and the populations may not be normal. Under some mild assumptions on the covariance matrices, the asymptotic distribution of the test is shown to be normal when n_i, p → ∞. Under the null hypothesis, the test is extended to the case where the common covariance matrix has a specified structure, including sphericity. The theory of U-statistics is employed in constructing the tests and deriving their limits. Simulations are used to show the accuracy of the tests.

3.

Central limit theorems for functionals of large sample covariance matrix and mean vector in matrix‐variate location mixture of normal distributions

Taras Bodnar Stepan Mazur Nestor Parolya 《Scandinavian Journal of Statistics》2019,46(2):636-660

In this paper, we consider the asymptotic distributions of functionals of the sample covariance matrix and the sample mean vector obtained under the assumption that the matrix of observations has a matrix‐variate location mixture of normal distributions. The central limit theorem is derived for the product of the sample covariance matrix and the sample mean vector. Moreover, we consider the product of the inverse sample covariance matrix and the mean vector for which the central limit theorem is established as well. All results are obtained under the large‐dimensional asymptotic regime, where the dimension p and the sample size n approach infinity such that p/n → c ∈ [0, +∞) when the sample covariance matrix does not need to be invertible and p/n → c ∈ [0, 1) otherwise.

4.

Nonparametric estimation of 100(1 − p)% expected shortfall: p → 0 as sample size is increased

Santanu Dutta Suparna Biswas 《Communications in Statistics - Simulation and Computation》2018,47(2):338-352

Expected shortfall (ES) is a well-known measure of extreme loss associated with a risky asset or portfolio. For any 0 < p < 1, the 100(1 − p) percent ES is defined as the mean of the conditional loss distribution, given the event that the loss exceeds the (1 − p)th quantile of the marginal loss distribution. Estimation of ES based on asset return data is an important problem in finance. Several nonparametric estimators of the expected shortfall are available in the literature. Using Monte Carlo simulations, we compare the accuracy of these estimators under the condition that p → 0 as n → ∞ for several asset return time series models, where n is the sample size. Not much seems to be known regarding the properties of the ES estimators under this condition. For p close to zero, the ES measures an extreme loss in the right tail of the loss distribution of the asset or portfolio. Our simulations and real-data analysis provide insight into the effect of varying p with n on the performance of nonparametric ES estimators.
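
The definition above can be illustrated with a minimal sketch. It assumes the simplest empirical estimator, the average of the largest ⌊np⌋ observed losses; this is only one common nonparametric choice and is not necessarily among the estimators compared in the article. The function name `empirical_es` and the sample data are hypothetical.

```python
import math


def empirical_es(losses, p):
    """Empirical 100(1 - p)% expected shortfall.

    Assumption: ES is estimated by averaging the largest floor(n * p)
    losses (at least one observation is always used).
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    xs = sorted(losses, reverse=True)  # largest losses first
    if not xs:
        raise ValueError("losses must be non-empty")
    # Small tolerance guards against floating-point error in n * p.
    k = max(1, math.floor(len(xs) * p + 1e-9))
    return sum(xs[:k]) / k


# Ten hypothetical losses; with p = 0.2 the two largest are averaged.
example_losses = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
es_80 = empirical_es(example_losses, 0.2)  # (10 + 9) / 2 = 9.5
```

As p shrinks with n fixed, fewer observations enter the average, which is one way to see why the regime p → 0 as n → ∞ is delicate.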

5.

On random walks on affine group

Darning Xu 《Communications in Statistics - Theory and Methods》2013,42(8):2925-2942

Diaconis' presumption that the number of steps required to get close to uniform for a random walk on the affine group A_p is c(p)p² with c(p) → ∞ is verified. We also discuss the random number generation associated with the random walk on the affine group. The number of steps needed to force the generated number to become random is improved. A modified version of the Diaconis-Shahshahani upper bound lemma is given and applied.

6.

Moments of Order Statistics from Weibull Distribution in the Presence of Multiple Outliers

Khalaf S. Sultan Mohamed E. Moshref 《Communications in Statistics - Theory and Methods》2014,43(10-12):2214-2226

In this article, we derive exact expressions for the single and product moments of order statistics from the Weibull distribution under the contamination model. We assume that X₁, X₂, …, X_{n − p} are independent with density function f(x), while the remaining p observations (outliers) X_{n − p + 1}, …, X_n are independent with density function g(x), a modified version of f(x) in which the location and/or scale parameters have been shifted in value. Next, we investigate the effect of the outliers on the BLUE of the scale parameter. Finally, we deduce some special cases.

7.

Geometric ergodicity of nonlinear autoregressive models with changing conditional variances

Min Chen Gemai Chen 《Revue canadienne de statistique》2000,28(3):605-614

The authors give easy‐to‐check sufficient conditions for the geometric ergodicity and the finiteness of the moments of a random process x_t = φ(x_{t−1}, …, x_{t−p}) + ε_t σ(x_{t−1}, …, x_{t−q}), in which φ: R^p → R, σ: R^q → R and (ε_t) is a sequence of independent and identically distributed random variables. They deduce strong mixing properties for this class of nonlinear autoregressive models with changing conditional variances, which includes, among others, the ARCH(p), the AR(p)‐ARCH(p), and the double‐threshold autoregressive models.

8.

The Probability Distribution for the Number of Successes in Independent Trials

Philip J. Boland 《Communications in Statistics - Theory and Methods》2013,42(7):1327-1331

The number of successes S in n independent trials is one of the classic distributions in statistics. When the probability of success p is constant, S has the binomial distribution with parameters n and p. Distributional properties of S in the homogeneous case are well known, but they are considerably harder to establish when p varies. A brief historical review is undertaken to highlight some of the more important results obtained over the past 50 years. In spite of the progress made, there are still some important open problems concerning stochastic comparisons for S that deserve to be solved!
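
To make the heterogeneous case concrete, the following sketch computes the exact distribution of S when each trial has its own success probability, by convolving one Bernoulli trial at a time. It is an illustration only and is not taken from the article; the function name `success_count_pmf` is hypothetical.

```python
def success_count_pmf(probs):
    """Exact pmf of the number of successes in independent trials.

    probs[i] is the success probability of trial i. Returns a list
    pmf with pmf[k] = P(S = k) for k = 0, ..., len(probs).
    """
    pmf = [1.0]  # with zero trials, S = 0 with probability one
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this trial fails
            nxt[k + 1] += mass * p      # this trial succeeds
        pmf = nxt
    return pmf


# Constant p recovers the binomial distribution: Bin(2, 0.5).
homogeneous = success_count_pmf([0.5, 0.5])    # [0.25, 0.5, 0.25]
# Varying p with the same mean number of successes.
heterogeneous = success_count_pmf([0.2, 0.8])
```

In this small example both cases have mean 1, but the heterogeneous case puts more mass on S = 1, in line with the classical fact that varying p reduces the variance of S for a fixed mean.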

9.

Sample quantiles and additive statistics: Information, sufficiency, estimation

《Journal of statistical planning and inference》1996,52(1):93-108

The order of the increase in the Fisher information measure contained in a finite number k of additive statistics or sample quantiles, constructed from a sample of size n, as n → ∞, is investigated. It is shown that the Fisher information in additive statistics increases asymptotically in a manner linear with respect to n, if 2 + δ moments of additive statistics exist for some δ > 0. If this condition does not hold, the order of increase in this information is non-linear and the information may even decrease. The problem of asymptotic sufficiency of sample quantiles is investigated and some linear analogues of maximum likelihood equations are constructed.

10.

Robust variable selection for generalized linear models with a diverging number of parameters

Chaohui Guo Hu Yang 《Communications in Statistics - Theory and Methods》2017,46(6):2967-2981


11.

Order statistics from non-identical right-truncated Lomax random variables with applications

Aaron Childs N. Balakrishnan Mohamed Moshref 《Statistical Papers》2001,42(2):187-206

In this paper, we derive some recurrence relations for the single and the product moments of order statistics from n independent and non-identically distributed Lomax and right-truncated Lomax random variables. These recurrence relations are simple in nature and could be used systematically in order to compute all the single and product moments of all order statistics in a simple recursive manner. The results for order statistics from the multiple-outlier model (with a slippage of p observations) are deduced as special cases. We then apply these results by examining the robustness of censored BLUEs to the presence of multiple outliers. Received: November 30, 1998; revised version: March 8, 2000

12.

Statistical Inference for High‐Dimensional Global Minimum Variance Portfolios

Konstantin Glombek 《Scandinavian Journal of Statistics》2014,41(4):845-865

Many studies demonstrate that inference for the parameters arising in portfolio optimization often fails. The recent literature shows that this phenomenon is mainly due to a high‐dimensional asset universe. Typically, such a universe refers to the asymptotics that the sample size n + 1 and the sample dimension d both go to infinity while d ∕ n → c ∈ (0,1). In this paper, we analyze the estimators for the excess returns’ mean and variance, the weights and the Sharpe ratio of the global minimum variance portfolio under these asymptotics concerning consistency and asymptotic distribution. Problems for stating hypotheses in high dimension are also discussed. The applicability of the results is demonstrated by an empirical study. Copyright © 2014 John Wiley & Sons, Ltd.

13.

APPROXIMATION OF EXCEEDANCE PROCESSES IN LARGE POPULATIONS

《Stochastic Models》2013,29(2):147-156

We consider a population of n individuals. Each of these individuals generates a discrete time branching stochastic process. We study the number of ancestors S(n,t) whose offspring at time t exceeds level θ(t), where θ(t) is some positive valued function. It is proved that S(n,t) may be approximated as t → ∞ and n → ∞ by some stochastic processes with independent increments.


14.

Some small-sample properties of some recently proposed multivariate outlier detection techniques

《Journal of Statistical Computation and Simulation》2012,82(8):701-712

Recently, several new robust multivariate estimators of location and scatter have been proposed that provide new and improved methods for detecting multivariate outliers. But for small sample sizes, there are no results on how these new multivariate outlier detection techniques compare in terms of p_n, their outside rate per observation (the expected proportion of points declared outliers) under normality. And there are no results comparing their ability to detect truly unusual points based on the model that generated the data. Moreover, there are no results comparing these methods to two fairly new techniques that do not rely on some robust covariance matrix. It is found that for an approach based on the orthogonal Gnanadesikan–Kettenring estimator, p_n can be very unsatisfactory with small sample sizes, but a simple modification gives much more satisfactory results. Similar problems were found when using the median ball algorithm, but a modification proved to be unsatisfactory. The translated-biweights (TBS) estimator generally performs well with a sample size of n ≥ 20 and when dealing with p-variate data where p ≤ 5. But with p = 8 it can be unsatisfactory, even with n = 200. A projection method as well as the minimum generalized variance method generally perform best, but with p ≤ 5, conditions where the TBS method is preferable are described. In terms of detecting truly unusual points, the methods can differ substantially depending on where the outliers happen to be, the number of outliers present, and the correlations among the variables.

15.

Outlier detection with Mahalanobis square distance: incorporating small sample correction factor

Meltem Ekiz O. Ufuk Ekiz 《Journal of applied statistics》2017,44(13):2444-2457

Mahalanobis square distances (MSDs) based on robust estimators improve outlier detection performance in multivariate data. However, the unbiasedness of robust estimators is not guaranteed when the sample size is small, and this reduces their performance in outlier detection. In this study, we propose a framework that uses MSDs with an incorporated small sample correction factor (c) and show its impact on performance when the sample size is small. This is achieved by using two prototypes, the minimum covariance determinant estimator and S-estimators with bi-weight and t-biweight functions. The results from simulations show that, when c is incorporated into the model, the distribution of MSDs for non-extreme observations is more likely to fit a chi-square distribution with p degrees of freedom, and the MSDs of the extreme observations an F distribution. Without c, however, the distributions deviate significantly from the chi-square and F distributions observed for the case with incorporated c. These results are even more prominent for S-estimators. We present seven distinct comparison methods with robust estimators and various cut-off values and test their outlier detection performance with simulated data. We also present an application of some of these methods to real data.

16.

Tests for high-dimensional covariance matrices using the theory of U-statistics

M. Rauf Ahmad D. von Rosen 《Journal of Statistical Computation and Simulation》2015,85(13):2619-2631

Test statistics for sphericity and identity of the covariance matrix are presented, when the data are multivariate normal and the dimension, p, can exceed the sample size, n. Under certain mild conditions mainly on the traces of the unknown covariance matrix, and using the asymptotic theory of U-statistics, the test statistics are shown to follow an approximate normal distribution for large p, also when p ≫ n. The accuracy of the statistics is shown through simulation results, particularly emphasizing the case when p can be much larger than n. A real data set is used to illustrate the application of the proposed test statistics.

17.

Variable selection in partial linear regression with functional covariate

G. Aneiros F. Ferraty P. Vieu 《Statistics》2015,49(6):1322-1347

The problem of variable selection is considered in high-dimensional partial linear regression under a model allowing for a possibly functional variable. The procedure studied is that of nonconcave-penalized least squares. The existence of a √n/s_n-consistent estimator for the vector of p_n linear parameters in the model is shown, even when p_n tends to ∞ as the sample size n increases (s_n denotes the number of influential variables). An oracle property is also obtained for the variable selection method, and the nonparametric rate of convergence is stated for the estimator of the nonlinear functional component of the model. Finally, a simulation study illustrates the finite sample size performance of our procedure.

18.

Likelihood Ratio Tests for High‐Dimensional Normal Distributions


Tiefeng Jiang Yongcheng Qi 《Scandinavian Journal of Statistics》2015,42(4):988-1009

In their recent work, Jiang and Yang studied six classical Likelihood Ratio Test statistics under high‐dimensional setting. Assuming that a random sample of size n is observed from a p‐dimensional normal population, they derive the central limit theorems (CLTs) when p and n are proportional to each other, which are different from the classical chi‐square limits as n goes to infinity, while p remains fixed. In this paper, by developing a new tool, we prove that the mentioned six CLTs hold in a more applicable setting: p goes to infinity, and p can be very close to n. This is an almost sufficient and necessary condition for the CLTs. Simulations of histograms, comparisons on sizes and powers with those in the classical chi‐square approximations and discussions are presented afterwards.

19.

Robust Confidence Intervals for the Bernoulli Parameter

Wheyming Tina Song Chia-Jung Chang Sin-Long Liu 《Communications in Statistics - Theory and Methods》2013,42(19):3544-3560

Despite the simplicity of the Bernoulli process, developing good confidence interval procedures for its parameter, the probability of success p, is deceptively difficult. The binary data yield a discrete number of successes from a discrete number of trials, n. This discreteness results in actual coverage probabilities that oscillate with n for fixed values of p (and with p for fixed n). Moreover, this oscillation necessitates a large sample size to guarantee a good coverage probability when p is close to 0 or 1.

It is well known that the Wilson procedure is superior to many existing procedures because it is less sensitive to p than the other procedures and is therefore less costly. The procedures proposed in this article work as well as the Wilson procedure when 0.1 ≤ p ≤ 0.9, and are even less sensitive (i.e., more robust) than the Wilson procedure when p is close to 0 or 1. Specifically, when the nominal coverage probability is 0.95, the Wilson procedure requires a sample size of 1,021 to guarantee that the coverage probabilities stay above 0.92 for any 0.001 ≤ min{p, 1 − p} < 0.01. By contrast, our procedures guarantee the same coverage probabilities but only need a sample size of 177, without increasing either the expected interval width or the standard deviation of the interval width.
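
The oscillating coverage described above can be checked numerically. The sketch below implements the standard Wilson score interval and sums binomial probabilities to obtain its exact coverage for a given n and p. It covers only the Wilson baseline, not the procedures proposed in the article; the function names and the default z value (the 0.975 standard normal quantile) are assumptions made for illustration.

```python
import math

Z_95 = 1.959963984540054  # assumed: 0.975 quantile of the standard normal


def wilson_interval(x, n, z=Z_95):
    """Wilson score interval for x successes in n Bernoulli trials."""
    phat = x / n
    denom = 1.0 + z * z / n
    center = (phat + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(phat * (1.0 - phat) / n + z * z / (4.0 * n * n)) / denom
    return center - half, center + half


def wilson_coverage(n, p, z=Z_95):
    """Exact probability that the Wilson interval contains the true p."""
    total = 0.0
    for x in range(n + 1):
        lo, hi = wilson_interval(x, n, z)
        if lo <= p <= hi:
            # Binomial probability of observing exactly x successes.
            total += math.comb(n, x) * p**x * (1.0 - p) ** (n - x)
    return total


# Coverage for a few sample sizes at a small p; values vary non-monotonically in n.
coverages = {n: wilson_coverage(n, 0.005) for n in (50, 100, 200, 400)}
```

Evaluating `wilson_coverage` over a grid of n for fixed p (or over p for fixed n) traces out the oscillation that drives the sample-size requirements quoted in the abstract.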

20.

Regularized proportional odds models

《Journal of Statistical Computation and Simulation》2012,82(2):251-268

The proportional odds model (POM) is commonly used in regression analysis to predict the outcome for an ordinal response variable. The maximum likelihood estimation (MLE) approach is typically used to obtain the parameter estimates. The likelihood estimates do not exist when the number of parameters, p, is greater than the number of observations, n. The MLE also does not exist if there are no overlapping observations in the data. In a situation where the number of parameters is less than the sample size but p is approaching n, the likelihood estimates may not exist, and if they exist they may have quite large standard errors. An estimation method is proposed to address the last two issues, i.e. complete separation and the case when p approaches n, but not the case when p > n. The proposed method does not use any penalty term but uses pseudo-observations to regularize the observed responses by downgrading their effect so that they become close to the underlying probabilities. The estimates can be computed easily with all commonly used statistical packages supporting the fitting of POMs with weights. Estimates are compared with the MLE in a simulation study and an application to real data.