1.

Testing linear hypotheses in high-dimensional regressions

Zhidong Bai Dandan Jiang Jian-feng Yao 《Statistics》2013,47(6):1207-1223

For a multivariate linear model, Wilks' likelihood ratio test (LRT) constitutes one of the cornerstone tools. However, the computation of its quantiles under the null or the alternative hypothesis requires complex analytic approximations, and more importantly, these distributional approximations are feasible only for moderate dimension of the dependent variable, say p≤20. On the other hand, assuming that the data dimension p as well as the number q of regression variables are fixed while the sample size n grows, several asymptotic approximations have been proposed in the literature for Wilks' Λ, including the widely used chi-square approximation. In this paper, we consider necessary modifications to Wilks' test in a high-dimensional context, specifically assuming a high data dimension p and a large sample size n. Based on recent random matrix theory, the correction we propose to Wilks' test is asymptotically Gaussian under the null hypothesis, and simulations demonstrate that the corrected LRT has very satisfactory size and power, not only in the large p and large n context but also for moderately large data dimensions such as p=30 or p=50. As a byproduct, we give a reason explaining why the standard chi-square approximation fails for high-dimensional data. We also introduce a new procedure for the classical multiple sample significance test in multivariate analysis of variance which is valid for high-dimensional data.

2.

Outlier detection in contingency tables using decomposable graphical models

Mads Lindskou Poul Svante Eriksen Torben Tvedebrink 《Scandinavian Journal of Statistics》2020,47(2):347-360

For high-dimensional data, it is a tedious task to determine anomalies such as outliers. We present a novel outlier detection method for high-dimensional contingency tables. We use the class of decomposable graphical models to model the relationship among the variables of interest, which can be depicted by an undirected graph called the interaction graph. Given an interaction graph, we derive a closed-form expression of the likelihood ratio test (LRT) statistic and an exact distribution for efficient simulation of the test statistic. An observation is declared an outlier if it deviates significantly from the approximated distribution of the test statistic under the null hypothesis. We demonstrate the use of the LRT outlier detection framework on genetic data modeled by Chow–Liu trees.

3.

Two-sample high-dimensional empirical likelihood

Jianglin Fang Xuewen Lu 《Communications in Statistics - Theory and Methods》2017,46(13):6323-6335

In this paper, we apply empirical likelihood (EL) to two-sample problems with growing high dimensionality. Our results are demonstrated by constructing confidence regions for the difference of the means of two p-dimensional samples and for the difference between the coefficients of two p-dimensional sample linear models. We show that the EL-based estimator is efficient. That is, as p → ∞ for high-dimensional data, the limit distribution of the EL ratio statistic for the difference of the means of two samples, and for the difference between the coefficients of a two-sample linear model, is asymptotically normal. Furthermore, EL gives an efficient estimator for regression coefficients in linear models, and can be as efficient as a parametric approach. The performance of the proposed method is illustrated via numerical simulations.

4.

A high-dimensional likelihood ratio test for circular symmetric covariance structure

Linqi Yi 《Communications in Statistics - Theory and Methods》2018,47(6):1392-1402

5.

Change-Point Detection in Two-Phase Regression with Inequality Constraints on the Regression Parameters

K. Nosek 《Communications in Statistics - Theory and Methods》2014,43(5):932-946

Two-phase regression models with inequality constraints on the regression coefficients and with a small number of measurements are considered. A new test for the presence of a change-point, based on the likelihood ratio in a linear model with inequality constraints, is proposed. Numerical approximations to the powers against various alternatives are given and compared with the powers of the likelihood ratio test in the two-phase regression models without inequality constraints, the backwards CUSUM test, and the k-linear-r-ahead recursive residuals tests. Performance of related likelihood based estimators of the change-point is briefly studied in a Monte Carlo experiment.

6.

Robust Coordinate Descent Algorithm Robust Solution Path for High-dimensional Sparse Regression Modeling

H. Park S. Konishi 《Communications in Statistics - Simulation and Computation》2016,45(1):115-129

L₁-type regularization provides a useful tool for variable selection in high-dimensional regression modeling. Various algorithms have been proposed to solve optimization problems for L₁-type regularization. In particular, the coordinate descent algorithm has been shown to be effective in sparse regression modeling. Although the algorithm performs remarkably well in solving optimization problems for L₁-type regularization, it suffers from outliers, since the procedure is based on the inner product of predictor variables and partial residuals obtained in a non-robust manner. To overcome this drawback, we propose a robust coordinate descent algorithm, focusing especially on high-dimensional regression modeling based on the principal components space. We show that the proposed robust algorithm converges to the minimum value of its objective function. Monte Carlo experiments and real data analysis are conducted to examine the efficiency of the proposed robust algorithm. We observe that our robust coordinate descent algorithm performs effectively in high-dimensional regression modeling even in the presence of outliers.
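
For orientation, the standard (non-robust) coordinate descent update for an L₁-penalized least-squares problem can be sketched as follows. This is not the authors' robust algorithm. The objective (1/2n)‖y − Xβ‖² + λ‖β‖₁, the function names `soft_threshold` and `lasso_cd`, and the toy data are all assumptions made for illustration. The sketch highlights the inner product between each predictor and the partial residuals, which is the step the abstract identifies as sensitive to outliers.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator S(z, lam) used by L1-penalized updates."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Plain coordinate descent for (1/2n)||y - X b||^2 + lam * ||b||_1.

    X is a list of rows, y a list of responses. Hypothetical helper for
    illustration only; no intercept, no standardization, no robustness.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, with beta = 0 at the start
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Inner product of predictor j with the partial residual
            # (the residual with predictor j's contribution added back).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Toy data with orthogonal columns: the small second coefficient is
# shrunk exactly to zero, the first is shrunk towards zero.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
beta_hat = lasso_cd(X, y, lam=0.1)
```

Because `rho` is an ordinary average over all observations, a single extreme response can move it arbitrarily far, which is the weakness a robust variant would need to address.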

7.

Estimating individual-level interaction effects in multilevel models: a Monte Carlo simulation study with application

Julie Ann Lorah 《Journal of Applied Statistics》2018,45(12):2238-2255

Moderated multiple regression provides a useful framework for understanding moderator variables. These variables can also be examined within multilevel datasets, although the literature is not clear on the best way to assess data for significant moderating effects, particularly within a multilevel modeling framework. This study explores potential ways to test moderation at the individual level (level one) within a 2-level multilevel modeling framework, with varying effect sizes, cluster sizes, and numbers of clusters. The study examines five potential methods for testing interaction effects: the Wald test, F-test, likelihood ratio test, Bayesian information criterion (BIC), and Akaike information criterion (AIC). For each method, the simulation study examines Type I error rates and power. Following the simulation study, an applied study uses real data to assess interaction effects using the same five methods. Results indicate that the Wald test, F-test, and likelihood ratio test all perform similarly in terms of Type I error rates and power. Type I error rates for the AIC are more liberal, and for the BIC typically more conservative. A four-step procedure for applied researchers interested in examining interaction effects in multi-level models is provided.
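
To make the comparison of criteria concrete, here is a minimal sketch of how a likelihood ratio test, AIC and BIC are computed for two nested Gaussian models. It deliberately uses a single-level regression with one predictor, not the multilevel interaction models studied in the paper. The function names `gaussian_loglik` and `compare_nested`, the parameter counts, and the toy data are assumptions for illustration.

```python
import math


def gaussian_loglik(rss, n):
    """Maximized Gaussian log-likelihood with sigma^2 estimated as rss / n."""
    return -0.5 * n * (math.log(2.0 * math.pi * rss / n) + 1.0)


def compare_nested(x, y):
    """Compare an intercept-only model with intercept + slope.

    Returns the LR statistic, its chi-square(1) p-value, and AIC/BIC for
    both models. Hypothetical helper; assumes the fit is not perfect.
    """
    n = len(y)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    rss_null = sum((yi - ybar) ** 2 for yi in y)  # intercept only
    rss_full = rss_null - sxy ** 2 / sxx          # intercept + slope
    ll_null = gaussian_loglik(rss_null, n)
    ll_full = gaussian_loglik(rss_full, n)
    k_null, k_full = 2, 3                         # counts include sigma^2
    lr = 2.0 * (ll_full - ll_null)
    return {
        "lr": lr,
        # Upper tail of chi-square with 1 df: P(Z^2 > lr) = erfc(sqrt(lr/2)).
        "p_value": math.erfc(math.sqrt(lr / 2.0)),
        "aic_null": 2 * k_null - 2 * ll_null,
        "aic_full": 2 * k_full - 2 * ll_full,
        "bic_null": k_null * math.log(n) - 2 * ll_null,
        "bic_full": k_full * math.log(n) - 2 * ll_full,
    }


result = compare_nested([1, 2, 3, 4, 5, 6], [1.1, 1.9, 3.2, 3.8, 5.1, 6.2])
```

With one extra parameter, AIC prefers the larger model whenever the LR statistic exceeds 2, while BIC requires it to exceed log(n). For n above about 7.4 the BIC threshold is the higher one, which is consistent with the abstract's finding that AIC is more liberal and BIC typically more conservative.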

8.

Variable selection in the high-dimensional continuous generalized linear model with current status data

Guo-Liang Tian Lixin Song 《Journal of Applied Statistics》2014,41(3):467-483

In survival studies, current status data are frequently encountered when some individuals in a study are not successively observed. This paper considers the problem of simultaneous variable selection and parameter estimation in the high-dimensional continuous generalized linear model with current status data. We apply the penalized likelihood procedure with the smoothly clipped absolute deviation penalty to select significant variables and estimate the corresponding regression coefficients. With a proper choice of tuning parameters, the resulting estimator is shown to be a √(n/p_n)-consistent estimator under some mild conditions. In addition, we show that the resulting estimator has the same asymptotic distribution as the estimator obtained when the true model is known. The finite sample behavior of the proposed estimator is evaluated through simulation studies and a real example.

9.

On Small Sample Properties of the Wald, LR and LM Tests in a Linear Model with AR(1) Errors

Hideo Kozumi 《Communications in Statistics - Simulation and Computation》2013,42(4):1361-1375

When the error terms are autocorrelated, the conventional t-tests for individual regression coefficients lead to over-rejection of the null hypothesis. We examine, by Monte Carlo experiments, the small sample properties of the unrestricted estimator of ρ and of the estimator of ρ restricted by the null hypothesis. We compare the small sample properties of the Wald, likelihood ratio and Lagrange multiplier test statistics for individual regression coefficients. It is shown that when the null hypothesis is true, the unrestricted estimator of ρ is biased. It is also shown that the Lagrange multiplier test using the maximum likelihood estimator of ρ performs better than the Wald and likelihood ratio tests.

10.

Small-sample likelihood inference in extreme-value regression models

《Journal of Statistical Computation and Simulation》2012,82(3):582-595

We deal with a general class of extreme-value regression models introduced by Barreto-Souza and Vasconcellos [Bias and skewness in a general extreme-value regression model, Comput. Statist. Data Anal. 55 (2011), pp. 1379–1393]. Our goal is to derive an adjusted likelihood ratio statistic that is approximately distributed as χ² with a high degree of accuracy. Although the adjusted statistic requires more computational effort than its unadjusted counterpart, it is shown that the adjustment term has a simple compact form that can be easily implemented in standard statistical software. Further, we compare the finite-sample performance of the three classical tests (likelihood ratio, Wald, and score), the gradient test that has been recently proposed by Terrell [The gradient statistic, Comput. Sci. Stat. 34 (2002), pp. 206–215], and the adjusted likelihood ratio test obtained in this article. Our simulations favour the latter. Applications of our results are presented.

11.

Empirical likelihood inference for a semiparametric hazards regression model

Wei Chen Dehui Wang 《Communications in Statistics - Theory and Methods》2013,42(11):3236-3248

We investigated the empirical likelihood inference approach under a general class of semiparametric hazards regression models with survival data subject to right-censoring. An empirical likelihood ratio for the full 2p regression parameters involved in the model is obtained. We showed that it converged weakly to a random variable which could be written as a weighted sum of 2p independent chi-squared variables with one degree of freedom. Using this, we could construct a confidence region for parameters. We also suggested an adjusted version for the preceding statistic, whose limit followed a standard chi-squared distribution with 2p degrees of freedom.

12.

A variable selection proposal for multiple linear regression analysis

《Journal of Statistical Computation and Simulation》2012,82(12):2095-2105

Variable selection in multiple linear regression models is considered. It is shown that for the special case of orthogonal predictor variables, an adaptive pre-test-type procedure proposed by Venter and Steel [Simultaneous selection and estimation for the some zeros family of normal models, J. Statist. Comput. Simul. 45 (1993), pp. 129–146] is almost equivalent to least angle regression, proposed by Efron et al. [Least angle regression, Ann. Stat. 32 (2004), pp. 407–499]. A new adaptive pre-test-type procedure is proposed, which extends the procedure of Venter and Steel to the general non-orthogonal case in a multiple linear regression analysis. This new procedure is based on a likelihood ratio test where the critical value is determined data-dependently. A practical illustration and results from a simulation study are presented.

13.

C472. The expected value of a conditional variance: An upper bound

《Journal of Statistical Computation and Simulation》2012,82(8):609-612

The article derives Bartlett corrections for improving the chi-square approximation to the likelihood ratio statistics in a class of symmetric nonlinear regression models. This is a wide class of models which encompasses the t model and several other symmetric distributions with longer-than normal tails. In this paper we present, in matrix notation, Bartlett corrections to likelihood ratio statistics in nonlinear regression models with errors that follow a symmetric distribution. We generalize the results obtained by Ferrari, S. L. P. and Arellano-Valle, R. B. (1996). Modified likelihood ratio and score tests in linear regression models using the t distribution. Braz. J. Prob. Statist., 10, 15–33, who considered a t distribution for the errors, and by Ferrari, S. L. P. and Uribe-Opazo, M. A. (2001). Corrected likelihood ratio tests in a class of symmetric linear regression models. Braz. J. Prob. Statist., 15, 49–67, who considered a symmetric linear regression model. The formulae derived are simple enough to be used analytically to obtain several Bartlett corrections in a variety of important models. We also present simulation results comparing the sizes and powers of the usual likelihood ratio tests and their Bartlett corrected versions.

14.

A lack-of-fit test for parametric zero-inflated Poisson models

《Journal of Statistical Computation and Simulation》2012,82(9):1081-1098

Count data often contain many zeros. In parametric regression analysis of zero-inflated count data, the effect of a covariate of interest is typically modelled via a linear predictor. This approach imposes a restrictive, and potentially questionable, functional form on the relation between the independent and dependent variables. To address the noted restrictions, a flexible parametric procedure is employed to model the covariate effect as a linear combination of fixed-knot cubic basis splines or B-splines. The semiparametric zero-inflated Poisson regression model is fitted by maximizing the likelihood function through an expectation–maximization algorithm. The smooth estimate of the functional form of the covariate effect can enhance modelling flexibility. Within this modelling framework, a log-likelihood ratio test is used to assess the adequacy of the covariate function. Simulation results show that the proposed test has excellent power in detecting the lack of fit of a linear predictor. A real-life data set is used to illustrate the practicality of the methodology.

15.

Bootstrap versus traditional hypothesis testing procedures for coefficients in least absolute value regression

《Journal of Statistical Computation and Simulation》2012,82(8):665-675

Several approaches to hypothesis testing for coefficients in least absolute value regression are compared using a Monte Carlo simulation: likelihood ratio test, Lagrange multiplier test, and three versions of the bootstrap hypothesis test. Factors considered that might influence test performance include the disturbance distribution, the type of independent variable, and the sample size. Overall, the likelihood ratio and the bootstrap tests perform best, with the likelihood ratio test being marginally more powerful. Least absolute value tests are also compared to the standard t test and three versions of the bootstrapped t test for least squares regression.

16.

On testing inference in beta regressions

《Journal of Statistical Computation and Simulation》2012,82(1):186-203

This article deals with testing inference in the class of beta regression models with varying dispersion. We focus on inference in small samples. We perform a numerical analysis in order to evaluate the sizes and powers of different tests. We consider the likelihood ratio test, two adjusted likelihood ratio tests proposed by Ferrari and Pinheiro [Improved likelihood inference in beta regression, J. Stat. Comput. Simul. 81 (2011), pp. 431–443], the score test, the Wald test and bootstrap versions of the likelihood ratio, score and Wald tests. We perform tests on the parameters that index the mean submodel and also on the parameters in the linear predictor of the precision submodel. Overall, the numerical evidence favours the bootstrap tests. It is also shown that the score test is considerably less size-distorted than the likelihood ratio and Wald tests. An application that uses real (not simulated) data is presented and discussed.

17.

Powerful Parametric Tests Based on Sum‐Functions of Spacings

Magnus Ekström 《Scandinavian Journal of Statistics》2013,40(4):886-898

Assume that we have a sequence of n independent and identically distributed random variables with a continuous distribution function F, which is specified up to a few unknown parameters. In this paper, tests based on sum‐functions of sample spacings are proposed, and large sample theory of the tests is presented under simple null hypotheses as well as under close alternatives. Tests, which are optimal within this class, are constructed, and it is noted that these tests have properties that closely parallel those of the likelihood ratio test in regular parametric models. Some examples are given, which show that the proposed tests work also in situations where the likelihood ratio test breaks down. Extensions to more general hypotheses are discussed.

18.

Modified likelihood ratio statistics for inflated beta regressions

《Journal of Statistical Computation and Simulation》2012,82(5):982-998

The class of inflated beta regression models generalizes that of beta regressions [S.L.P. Ferrari and F. Cribari-Neto, Beta regression for modelling rates and proportions, J. Appl. Stat. 31 (2004), pp. 799–815] by incorporating a discrete component that allows practitioners to model data on rates and proportions with observations that equal an interval limit. For instance, one can model responses that assume values in (0, 1]. The likelihood ratio test tends to be quite oversized (liberal, anticonservative) in inflated beta regressions estimated with a small number of observations. Indeed, our numerical results show that its null rejection rate can be almost twice the nominal level. It is thus important to develop alternative testing strategies. This paper develops small-sample adjustments to the likelihood ratio and signed likelihood ratio test statistics in inflated beta regression models. The adjustments do not require orthogonality between the parameters of interest and the nuisance parameters and are fairly simple since they only require first- and second-order log-likelihood cumulants. Simulation results show that the modified likelihood ratio tests deliver much more accurate inference in small samples. An empirical application is presented and discussed.

19.

A test for the linearity of the nonparametric part of a semiparametric logistic regression model

Chin-Shang Li 《Journal of Applied Statistics》2016,43(3):461-475

A semiparametric logistic regression model is proposed in which its nonparametric component is approximated with fixed-knot cubic B-splines. To assess the linearity of the nonparametric component, we construct a penalized likelihood ratio test statistic. When the number of knots is fixed, the null distribution of the test statistic is shown to be asymptotically the distribution of a linear combination of independent chi-squared random variables, each with one degree of freedom. We set the asymptotic null expectation of this test statistic equal to a value to determine the smoothing parameter value. Monte Carlo experiments are conducted to investigate the performance of the proposed test. Its practical use is illustrated with a real-life example.

20.

R² Measures Based on Wald and Likelihood Ratio Joint Significance Tests

Lonnie Magee 《The American Statistician》2013,67(3):250-253

Two methods are suggested for generating R² measures for a wide class of models. These measures are linked to the R² of the standard linear regression model through Wald and likelihood ratio statistics for testing the joint significance of the explanatory variables. Some currently used R²'s are shown to be special cases of these methods.
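
The link to the linear-model R² can be checked numerically. The abstract does not state the formulas, so this sketch assumes the commonly used forms R²_LR = 1 − exp(−LR/n) and R²_W = W/(n + W), with LR and W computed under Gaussian errors and the maximum likelihood variance estimate. The function name `r2_measures` and the toy data are hypothetical. Under these assumptions both measures reduce algebraically to 1 − RSS₁/RSS₀ in ordinary least squares.

```python
import math


def r2_measures(x, y):
    """Return (R2 from OLS, R2 from the LR statistic, R2 from the Wald statistic).

    Simple regression of y on one predictor x with an intercept; the null
    model is intercept-only. Assumes the fit is not perfect (rss_full > 0).
    """
    n = len(y)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    rss_null = sum((yi - ybar) ** 2 for yi in y)
    rss_full = rss_null - sxy ** 2 / sxx
    # Joint significance statistics under Gaussian errors (ML variance).
    lr = n * math.log(rss_null / rss_full)
    wald = n * (rss_null - rss_full) / rss_full
    r2_ols = 1.0 - rss_full / rss_null
    r2_lr = 1.0 - math.exp(-lr / n)   # assumed LR-based form
    r2_wald = wald / (n + wald)       # assumed Wald-based form
    return r2_ols, r2_lr, r2_wald


r2_ols, r2_lr, r2_wald = r2_measures([1, 2, 3, 4, 5, 6],
                                     [1.1, 1.9, 3.2, 3.8, 5.1, 6.2])
```

The appeal of such definitions is that LR and Wald statistics exist for many models without a residual sum of squares, so the same formulas give an R²-like quantity outside the linear regression setting.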