Found 20 similar articles (search time: 31 ms)
For a multivariate linear model, Wilks' likelihood ratio test (LRT) is one of the cornerstone tools. However, computing its quantiles under the null or the alternative hypothesis requires complex analytic approximations and, more importantly, these distributional approximations are feasible only for a moderate dimension of the dependent variable, say p≤20. On the other hand, assuming that the data dimension p and the number q of regression variables are fixed while the sample size n grows, several asymptotic approximations have been proposed in the literature for Wilks' Λ, including the widely used chi-square approximation. In this paper, we consider necessary modifications to Wilks' test in a high-dimensional context, specifically assuming a high data dimension p and a large sample size n. Based on recent random matrix theory, the correction we propose to Wilks' test is asymptotically Gaussian under the null hypothesis, and simulations demonstrate that the corrected LRT has very satisfactory size and power, certainly in the large p and large n context, but also for moderately large data dimensions such as p=30 or p=50. As a byproduct, we give a reason why the standard chi-square approximation fails for high-dimensional data. We also introduce a new procedure for the classical multiple-sample significance test in multivariate analysis of variance that is valid for high-dimensional data.

The estimation of extreme conditional quantiles is an important issue in different scientific disciplines. Up to now, the extreme value literature has focused mainly on estimation procedures based on independent and identically distributed samples. Our contribution is a two-step procedure for estimating extreme conditional quantiles. In a first step, nonextreme conditional quantiles are estimated nonparametrically using a local version of the regression quantile methodology of [Koenker, R. and Bassett, G. (1978). Regression quantiles. Econometrica, 46, 33–50.]. Next, these nonparametric quantile estimates are used as analogues of univariate order statistics in procedures for extreme quantile estimation. The performance of the method is evaluated for both heavy-tailed distributions and distributions with a finite right endpoint using a small-sample simulation study. A bootstrap procedure is developed to guide the selection of an optimal local bandwidth. Finally, the procedure is illustrated in two case studies.


In many statistical applications, estimation of population quantiles is desired. In this study, a log–flip–robust (LFR) approach is proposed to estimate, specifically, lower-end quantiles (those below the median) from a continuous, positive, right-skewed distribution. Characteristics of common right-skewed distributions suggest that a logarithm transformation (L) followed by flipping the lower half of the sample (F) allows for the estimation of the lower-end quantile using robust methods (R) based on symmetric populations. Simulations show that this approach is superior in many cases to current methods, while not suffering from the sample size restrictions of other approaches.
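The three LFR steps can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' estimator: the abstract does not say which robust method is used in step R, so the sketch assumes a median/MAD fit with a normal quantile on the log scale, and the function name `lfr_lower_quantile` is hypothetical.

```python
import math
import statistics


def lfr_lower_quantile(sample, p):
    """Rough log-flip-robust style estimate of a lower-end quantile (0 < p < 0.5).

    Assumption: step R is a median/MAD fit with a normal quantile on the
    log scale; the abstract does not specify the robust method.
    """
    if not 0.0 < p < 0.5:
        raise ValueError("p must be a lower-end probability in (0, 0.5)")
    if any(x <= 0 for x in sample):
        raise ValueError("the sample must be strictly positive")

    # L: logarithm transformation.
    logs = sorted(math.log(x) for x in sample)
    center = statistics.median(logs)

    # F: keep the lower half and reflect it about the median, which gives a
    # sample that is symmetric about `center` by construction.
    lower = [y for y in logs if y <= center]
    flipped = lower + [2.0 * center - y for y in lower if y < center]

    # R: robust location/scale on the symmetric sample (assumed: median and MAD).
    mad = statistics.median([abs(y - center) for y in flipped])
    scale = 1.4826 * mad  # MAD rescaled to be consistent for a normal sigma

    z = statistics.NormalDist().inv_cdf(p)
    return math.exp(center + z * scale)
```

Because only the lower half of the log-sample enters the scale estimate, a heavy or contaminated upper tail has no influence on the fitted lower-end quantile, which is the intuition the abstract describes.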

Quantile regression models are a powerful tool for studying different points of the conditional distribution of univariate response variables. Their multivariate extension, though, is not straightforward, starting with the definition of multivariate quantiles. We propose here a flexible Bayesian quantile regression model for a multivariate response variable, in which we are able to define a structured additive framework for all predictor variables. We build on previous ideas that use a directional approach to define the quantiles of a response variable with multiple outputs, and we define noncrossing quantiles in every directional quantile model. We define a Markov chain Monte Carlo (MCMC) procedure for model estimation, where the noncrossing property is obtained by using a Gaussian process design to model the correlation between several quantile regression models. We illustrate the results of these models using two datasets: one on dimensions of inequality in the population, such as income and health; the second on scores of students in the Brazilian High School National Exam, considering three dimensions for the response variable.


This paper presents a new method to estimate the quantiles of generic statistics by combining the concept of random weighting with importance resampling. This method converts the problem of quantile estimation into a dual problem of tail probability estimation. Random weighting theories are established to calculate the optimal resampling weights for the estimation of tail probabilities via sequential variance minimization. Subsequently, the quantile estimate is constructed using the obtained optimal resampling weights. Experimental results on real and simulated data sets demonstrate that the proposed random weighting method can effectively estimate the quantiles of generic statistics.

With the advent of modern technology, manufacturing processes have become very sophisticated; a single quality characteristic can no longer reflect a product's quality. In order to establish performance measures for evaluating the capability of a multivariate manufacturing process, several new multivariate capability (NMC) indices, such as NMC_p and NMC_pm, have been developed over the past few years. However, the sample size determination for multivariate process capability indices has not been thoroughly considered in previous studies. Generally, the larger the sample size, the more accurate an estimation will be. However, too large a sample size may result in excessive costs. Hence, the trade-off between sample size and precision in estimation is a critical issue. In this paper, the lower confidence limits of the NMC_p and NMC_pm indices are used to determine the appropriate sample size. Moreover, a procedure for conducting the multivariate process capability study is provided. Finally, two numerical examples are given to demonstrate that the proper determination of sample size for multivariate process indices can achieve a good balance between sampling costs and estimation precision.

The robust principal components analysis (RPCA) introduced by Campbell (Applied Statistics 1980, 29, 231–237) provides, in addition to robust versions of the usual output of a principal components analysis, weights for the contribution of each point to the robust estimation of each component. Low weights may thus be used to indicate outliers. The present simulation study provides critical values for testing the kth smallest weight in the RPCA of a sample of n p-dimensional vectors, under the null hypothesis of a multivariate normal distribution. The cases p=2(2)10, 15, 20 for n=20, 30, 40, 50, 75, 100, subject to n≥p/2, are examined, with k≤√n.

Every random q-vector with finite moments generates a set of orthonormal polynomials. These are generated from the basis functions $x^n = x_1^{n_1} \cdots x_q^{n_q}$ using Gram–Schmidt orthogonalization. One can cycle through these basis functions in any number of ways. Here, we give results using minimum cycling. The polynomials look simpler when centered about the mean of X, and take a still simpler form when X is symmetric about zero. This leads to an extension of the multivariate Hermite polynomial for a general random vector symmetric about zero. As an example, the results are applied to the multivariate normal distribution.
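As a minimal sketch of the construction, the following applies Gram–Schmidt orthogonalization to the monomials 1, x, x², … under the inner product ⟨p, q⟩ = E[p(X) q(X)]. It assumes the univariate case q = 1 and a standard normal X, so the "minimum cycling" ordering of the multivariate basis is not illustrated; the helper names are hypothetical.

```python
import math


def normal_moment(k):
    """E[Z^k] for a standard normal Z: 0 for odd k, (k-1)!! for even k."""
    if k % 2 == 1:
        return 0.0
    return float(math.prod(range(1, k, 2)))  # empty product is 1 when k == 0


def inner(p, q, moment):
    """<p, q> = E[p(X) q(X)] for coefficient lists ordered lowest degree first."""
    return sum(a * b * moment(i + j)
               for i, a in enumerate(p)
               for j, b in enumerate(q))


def orthonormal_polynomials(max_degree, moment):
    """Gram-Schmidt on 1, x, ..., x^max_degree using the moments of X."""
    basis = []
    for n in range(max_degree + 1):
        v = [0.0] * n + [1.0]  # coefficients of the monomial x^n
        for e in basis:
            c = inner(v, e, moment)  # projection coefficient onto e
            padded = e + [0.0] * (len(v) - len(e))
            v = [a - c * b for a, b in zip(v, padded)]
        norm = math.sqrt(inner(v, v, moment))
        basis.append([a / norm for a in v])
    return basis
```

For the standard normal, the result is the normalized Hermite polynomials; for example, the degree-2 polynomial is (x² − 1)/√2. Supplying a different `moment` function gives the orthonormal polynomials of another distribution with finite moments.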

Serfling and Xiao [A contribution to multivariate L-moments, L-comoment matrices. J Multivariate Anal. 2007;98:1765–1781] extended the L-moment theory to the multivariate setting. In the present paper, we focus on two-dimensional random vectors to establish a link between the bivariate L-moments (BLM) and the underlying bivariate copula functions. This connection provides a new estimate of the dependence parameters of bivariate statistical data. An extensive simulation study is carried out to compare estimators based on the BLM, the maximum likelihood, the minimum distance and a rank approximate Z-estimation. The results show that, as the sample size increases, BLM-based estimation performs better as far as bias and computation time are concerned. Moreover, its root-mean-squared error is quite reasonable and in general less sensitive to outliers than those of the other methods. Further, the proposed BLM method is an easy-to-use tool for the estimation of multiparameter copula models. A generalization of the BLM estimation method to the multivariate case is discussed.

Simultaneous estimation of scale parameters is considered in mixture distributions under squared-error loss. A general class of estimators is obtained which dominates the componentwise best multiple estimators and the moment estimators. As special cases, improved estimators are obtained for the multivariate t-distribution and the p-variate Lomax distribution.

Multivariate mixture regression models can be used to investigate the relationships between two or more response variables and a set of predictor variables by taking into consideration unobserved population heterogeneity. It is common to take multivariate normal distributions as mixing components, but this mixing model is sensitive to heavy-tailed errors and outliers. Although normal mixture models can approximate any distribution in principle, the number of components needed to account for heavy-tailed distributions can be very large. Mixture regression models based on multivariate t distributions can be considered as a robust alternative. Missing data are inevitable in many situations, and parameter estimates could be biased if the missing values are not handled properly. In this paper, we propose a multivariate t mixture regression model with missing information to model heterogeneity in the regression function in the presence of outliers and missing values. Along with robust parameter estimation, our proposed method can be used for (i) visualization of the partial correlation between response variables across latent classes and heterogeneous regressions, and (ii) outlier detection and robust clustering even in the presence of missing values. We also propose a multivariate t mixture regression model using MM-estimation with missing information that is robust to high-leverage outliers. The proposed methodologies are illustrated through simulation studies and real data analysis.

Linear mixed models are widely used when multiple correlated measurements are made on each unit of interest. In many applications, the units may form several distinct clusters, and such heterogeneity can be more appropriately modelled by a finite mixture linear mixed model. The classical estimation approach, in which both the random effects and the error parts are assumed to follow normal distributions, is sensitive to outliers, and failure to accommodate outliers may greatly jeopardize the model estimation and inference. We propose a new mixture linear mixed model using the multivariate t distribution. For each mixture component, we assume the response and the random effects jointly follow a multivariate t distribution, to conveniently robustify the estimation procedure. An efficient expectation conditional maximization algorithm is developed for conducting maximum likelihood estimation. The degrees of freedom parameters of the t distributions are chosen data-adaptively to achieve a flexible trade-off between estimation robustness and efficiency. Simulation studies and an application to lung growth longitudinal data showcase the efficacy of the proposed approach.

Papers on the analysis of means (ANOM) have been circulating in the quality control literature for decades, routinely describing it as a statistical stand-alone concept. We therefore clarify that ANOM should rather be regarded as a special case of a much more universal approach known as multiple contrast tests (MCTs). Perceiving ANOM as a grand-mean-type MCT paves the way for implementing it in the open-source software R. We give a brief tutorial on how to exploit R's versatility and introduce the R package ANOM for drawing the familiar decision charts. Beyond that, we illustrate two practical aspects of data analysis with ANOM: firstly, we compare the merits and drawbacks of ANOM-type MCTs and the ANOVA F-test and assess their respective statistical powers, and secondly, we show that the benefit of using critical values from multivariate t-distributions for ANOM instead of simple Bonferroni quantiles is oftentimes negligible.


Several approximations of copulas have been proposed in the literature. By using empirical versions of checker-type copula approximations, we propose nonparametric estimators of the copula. Under some conditions, the proposed estimators are copulas, and their main advantage is that they can be sampled from easily. One possible application is the estimation of quantiles of sums of dependent random variables from a small sample of the multivariate law and full knowledge of the marginal laws. We show that estimates may be improved by easily including in the approximated copula some additional information, for example on the law of a sub-vector. Our approach is illustrated by numerical examples.

This article is devoted to the study of tail index estimation based on i.i.d. multivariate observations drawn from a standard heavy-tailed distribution, that is, one whose Pareto-like marginals share the same tail index. A multivariate central limit theorem for a random vector whose components correspond to (possibly dependent) Hill estimators of the common tail index α is established under mild conditions. We introduce the concept of a (standard) heavy-tailed random vector of tail index α and show how this limit result can be used to build an estimator of α with small asymptotic mean squared error, through a proper convex linear combination of the coordinates. Beyond asymptotic results, simulation experiments illustrating the relevance of the proposed approach are also presented.
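For orientation, the classical Hill estimator and a convex combination of coordinate-wise estimates can be sketched as below. This is only an illustration: the equal default weights are an assumption made here, whereas the abstract refers to a "proper" combination chosen to reduce asymptotic mean squared error, which is not reproduced. The function names are hypothetical.

```python
import math


def hill_tail_index(sample, k):
    """Hill estimate of the tail index alpha from the k largest observations.

    The sample is assumed to be strictly positive.
    """
    if not 0 < k < len(sample):
        raise ValueError("k must satisfy 0 < k < len(sample)")
    ordered = sorted(sample, reverse=True)  # X_(1) >= X_(2) >= ...
    threshold = math.log(ordered[k])        # log of the (k+1)-th largest value
    gamma = sum(math.log(x) - threshold for x in ordered[:k]) / k
    return 1.0 / gamma                      # alpha = 1 / gamma


def combined_tail_index(columns, k, weights=None):
    """Convex combination of the Hill estimates of each coordinate.

    Assumption: equal weights by default; the optimal weights of the
    article are not implemented here.
    """
    estimates = [hill_tail_index(col, k) for col in columns]
    if weights is None:
        weights = [1.0 / len(estimates)] * len(estimates)
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * a for w, a in zip(weights, estimates))
```

Here `columns` holds one list of observations per coordinate, and `k` is the number of upper order statistics used in every coordinate; in practice the choice of `k` drives the usual bias-variance trade-off of the Hill estimator.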


In survival or reliability data analysis, it is often useful to estimate the quantiles of the lifetime distribution, such as the median time to failure. Different nonparametric methods can construct confidence intervals for the quantiles of the lifetime distributions, some of which are implemented in commonly used statistical software packages. We here investigate the performance of different interval estimation procedures under a variety of settings with different censoring schemes. Our main objectives in this paper are to (i) evaluate the performance of confidence intervals based on the transformation approach commonly used in statistical software, (ii) introduce a new density-estimation-based approach to obtain confidence intervals for survival quantiles, and (iii) compare it with the transformation approach. We provide a comprehensive comparative study and offer some useful practical recommendations based on our results. Some numerical examples are presented to illustrate the methodologies developed.


This article considers the estimation of a distribution function $F_X(x)$ based on a random sample $X_1, X_2, \ldots, X_n$ when the sample is suspected to come from a close-by distribution $F_0(x)$. The new estimators, namely the preliminary test estimator (PTE) and the Stein-type estimator (SE), are defined and compared with the "empirical distribution function" (edf) under local departure. In this case, we show that Stein-type estimators are superior to the edf, and that the PTE is superior to the edf when the true distribution is close to $F_0(x)$. As a by-product, similar estimators are proposed for population quantiles.


We propose parametric inference for quantile event times with adjustment for covariates on competing risks data. We develop parametric quantile inference using parametric regression modeling of the cumulative incidence function from the cause-specific hazard and direct approaches. Maximum likelihood inference is developed for estimation of the cumulative incidence function and quantiles, and we construct parametric confidence intervals for the quantiles. Simulation studies show that the proposed methods perform well. We illustrate the methods using early-stage breast cancer data.


In this note, we use multivariate subordination to introduce a multivariate extension of the generalized asymmetric Laplace motion. The class introduced provides a unified framework for several multivariate extensions of the popular variance gamma process. We also show that the associated time-one distribution extends the multivariate generalized asymmetric Laplace distributions proposed in the statistical literature.

This article considers an approach to estimating and testing a new Kronecker product covariance structure for three-level multivariate data (multiple time points (p), multiple sites (u), and multiple response variables (q)). Testing such a covariance structure is potentially important for high-dimensional multi-level multivariate data. The hypothesis testing procedure developed in this article can test not only the hypothesis for three-level multivariate data but also, as special cases, many different hypotheses for two-level multivariate data, such as blocked compound symmetry. The tests are implemented with two real data sets.
