1.

Decomposition of the main effects and interaction term by using orthogonal polynomials in multiple non symmetrical correspondence analysis

Antonello D'Ambra Pietro Amenta Anna Crisci 《Communications in Statistics - Theory and Methods》2017,46(20):10179-10188

Multiple non-symmetric correspondence analysis (MNSCA) is a useful technique for analyzing a two-way contingency table. In more complex cases, there is more than one predictor variable. In this paper, the MNSCA, along with the decomposition of the Gray–Williams Tau index into main effects and an interaction term, is used to analyze a contingency table with two categorical predictor variables and an ordinal response variable. The Multiple-Tau index is a measure of association that contains both the main effects and the interaction term. The main effects represent the change in the response variable due to a change in the levels/categories of the predictor variables, considering the effects of their addition, while the interaction effect represents the combined effect of the categorical predictor variables on the ordinal response variable. Moreover, for ordinal scale variables, we propose a further decomposition in order to check the existence of power components by using Emerson's orthogonal polynomials.

2.

Visualizing main effects and interaction in multiple non-symmetric correspondence analysis

Luigi D'Ambra Antonello D'Ambra 《Journal of applied statistics》2012,39(10):2165-2175

Non-symmetric correspondence analysis (NSCA) is a useful technique for analysing a two-way contingency table. Frequently, there is more than one predictor variable; in this paper, we consider two categorical variables as predictor variables and one response variable. Interaction represents the joint effects of the predictor variables on the response variable. When interaction is present, the interpretation of the main effects is incomplete or misleading. To separate the main effects and the interaction term, we introduce a method that, starting from the coordinates of multiple NSCA and using a two-way analysis of variance without interaction, allows a better interpretation of the impact of the predictor variables on the response variable. The proposed method has been applied to a well-known three-way contingency table proposed by Bockenholt and Bockenholt in which they cross-classify subjects by a person's attitude towards abortion, number of years of education and religion. We analyse the case where the variables education and religion influence a person's attitude towards abortion.

3.

The construction of a partial least-squares biplot

Opeoluwa F. Oyedele Sugnet Lubbe 《Journal of applied statistics》2015,42(11):2449-2460

Biplots are useful tools to explore the relationship among variables. In this paper, the specific regression relationship between a set of predictors X and a set of response variables Y by means of partial least-squares (PLS) regression is represented. The PLS biplot provides a single graphical representation of the samples together with the predictor and response variables, as well as their interrelationships in terms of the matrix of regression coefficients.

4.

Extension of biplot methodology to multivariate regression analysis

Opeoluwa F. Oyedele 《Journal of applied statistics》2021,48(10):1816

At the core of multivariate statistics is the investigation of relationships between different sets of variables, more precisely the inter-variable relationships and the causal relationships. The latter is a regression problem, where one set of variables is referred to as the response variables and the other set of variables as the predictor variables. In this situation, the effect of the predictors on the response variables is revealed through the regression coefficients. Results from the resulting regression analysis can be viewed graphically using the biplot. The consequential biplot provides a single graphical representation of the samples together with the predictor variables and response variables. In addition, their effect in terms of the regression coefficients can be visualized, although sub-optimally, in the said biplot. KEYWORDS: Biplot, regression analysis, multivariate regression, rank approximation

5.

Robust group-Lasso for functional regression model

Jasdeep Pannu Nedret Billor 《Communications in Statistics - Simulation and Computation》2017,46(5):3356-3374

In this article, we consider the problem of selecting functional variables using L1 regularization in a functional linear regression model with a scalar response and functional predictors, in the presence of outliers. Since the LASSO is a special case of penalized least-squares regression with an L1 penalty function, it suffers from heavy-tailed errors and/or outliers in the data. Recently, Least Absolute Deviation (LAD) and the LASSO methods have been combined (the LAD-LASSO regression method) to carry out robust parameter estimation and variable selection simultaneously for a multiple linear regression model. However, variable selection of the functional predictors based on LASSO fails since multiple parameters exist for a functional predictor. Therefore, group LASSO is used for selecting functional predictors since group LASSO selects grouped variables rather than individual variables. In this study, we propose a robust functional predictor selection method, the LAD-group LASSO, for a functional linear regression model with a scalar response and functional predictors. We illustrate the performance of the LAD-group LASSO on both simulated and real data.

6.

Average Collapsibility of Distribution Dependence and Quantile Regression Coefficients

P. VELLAISAMY 《Scandinavian Journal of Statistics》2012,39(1):153-165

The Yule–Simpson paradox notes that an association between random variables can be reversed when averaged over a background variable. Cox and Wermuth introduced the concept of distribution dependence between two random variables X and Y, and gave two dependence conditions, each of which guarantees that reversal of qualitatively similar conditional dependences cannot occur after marginalizing over the background variable. Ma, Xie and Geng studied the uniform collapsibility of distribution dependence over a background variable W, under a stronger homogeneity condition. Collapsibility ensures that associations are the same for conditional and marginal models. In this article, we use the notion of average collapsibility, which requires only that the conditional effects average over the background variable to the corresponding marginal effect, and investigate its conditions for distribution dependence and for quantile regression coefficients.

7.

Multiply robust inference for statistical interactions

Vansteelandt S Vanderweele TJ Robins JM 《Journal of the American Statistical Association》2008,103(484):1693-1704

A primary focus of an increasing number of scientific studies is to determine whether two exposures interact in the effect that they produce on an outcome of interest. Interaction is commonly assessed by fitting regression models in which the linear predictor includes the product between those exposures. When the main interest lies in the interaction, this approach is not entirely satisfactory because it is prone to (possibly severe) bias when the main exposure effects or the association between outcome and extraneous factors are misspecified. In this article, we therefore consider conditional mean models with identity or log link which postulate the statistical interaction in terms of a finite-dimensional parameter, but which are otherwise unspecified. We show that estimation of the interaction parameter is often not feasible in this model because it would require nonparametric estimation of auxiliary conditional expectations given high-dimensional variables. We thus consider 'multiply robust estimation' under a union model that assumes at least one of several working submodels holds. Our approach is novel in that it makes use of information on the joint distribution of the exposures conditional on the extraneous factors in making inferences about the interaction parameter of interest. In the special case of a randomized trial or a family-based genetic study in which the joint exposure distribution is known by design or by Mendelian inheritance, the resulting multiply robust procedure leads to asymptotically distribution-free tests of the null hypothesis of no interaction on an additive scale. We illustrate the methods via simulation and the analysis of a randomized follow-up study.

8.

A simulation based method for assessing the statistical significance of logistic regression models after common variable selection procedures

Tristan R. Grogan David A. Elashoff 《Communications in Statistics - Simulation and Computation》2017,46(9):7180-7193

Classification models can demonstrate apparent prediction accuracy even when there is no underlying relationship between the predictors and the response. Variable selection procedures can lead to false positive variable selections and overestimation of true model performance. A simulation study was conducted using logistic regression with forward stepwise, best subsets, and LASSO variable selection methods with varying total sample sizes (20, 50, 100, 200) and numbers of random noise predictor variables (3, 5, 10, 15, 20, 50). Using our critical values can help reduce needless follow-up on variables having no true association with the outcome.
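
As a rough illustration of the first claim in the abstract above, the following Python sketch fits a logistic regression to pure-noise predictors and reports the in-sample (apparent) accuracy. It is not the authors' simulation design: it omits the variable-selection step, and the sample size (50), the number of noise predictors (10), the ridge penalty and the function names are all assumptions chosen for the example.

```python
import numpy as np

def fit_logistic(X, y, ridge=1.0, steps=25):
    # Newton-Raphson for ridge-penalised logistic regression.
    # The ridge term (an assumption of this sketch, applied to every
    # coefficient including the intercept) keeps small-sample fits finite.
    beta = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - p) - ridge * beta
        hess = (X * (p * (1.0 - p))[:, None]).T @ X + ridge * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def apparent_accuracy(n, n_noise, rng):
    # Predictors and response are generated independently of each other.
    X = np.column_stack([np.ones(n), rng.normal(size=(n, n_noise))])
    y = rng.integers(0, 2, size=n).astype(float)
    beta = fit_logistic(X, y)
    predicted = (X @ beta > 0).astype(float)
    # Accuracy is measured on the same data used for fitting.
    return np.mean(predicted == y)

rng = np.random.default_rng(0)
accuracies = [apparent_accuracy(50, 10, rng) for _ in range(200)]
mean_accuracy = float(np.mean(accuracies))
print(f"mean apparent accuracy on pure noise: {mean_accuracy:.3f}")
```

Because the response is generated independently of the predictors, accuracy on fresh data would be about 0.5, so any apparent accuracy above that level in this sketch reflects overfitting.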

9.

Partial functional linear regression

Hyejin Shin 《Journal of statistical planning and inference》2009

It is frequently the case that a response will be related to both a vector of finite length and a function-valued random variable as predictor variables. In this paper, we propose new estimators for the parameters of a partial functional linear model which explores the relationship between a scalar response variable and mixed-type predictors. Asymptotic properties of the proposed estimators are established and finite sample behavior is studied through a small simulation experiment.

10.

Testing the difference between two independent regression models

Mohammad Reza Mahmoudi Marziyeh Mahmoudi Elaheh Nahavandi 《Communications in Statistics - Theory and Methods》2013,42(21):6284-6289

In some situations, for example in biology or psychology studies, we wish to determine whether the linear relationship between the response variable and the predictor variables differs in two populations. The analysis of covariance (ANCOVA) or, equivalently, the partial F-test is the commonly used method. In this study, the asymptotic distribution for the difference between two independent regression coefficients was established. The proposed method was used to derive the asymptotic confidence set for the difference between coefficients and hypothesis testing for the equality of the two regression models. Then a simulation study was conducted to compare the proposed method with the partial F method. The performance of the new method was comparable with that of the partial F method.

11.

Seemingly unrelated regression tree

Jaeoh Kim 《Journal of applied statistics》2019,46(7):1177-1195

Nonparametric seemingly unrelated regression provides a powerful alternative to parametric seemingly unrelated regression for relaxing the linearity assumption. The existing methods are limited, particularly with sharp changes in the relationship between the predictor variables and the corresponding response variable. We propose a new nonparametric method for seemingly unrelated regression, which adopts a tree-structured regression framework, has satisfactory prediction accuracy and interpretability, places no restriction on the inclusion of categorical variables, and is less vulnerable to the curse of dimensionality. Moreover, an important feature is constructing a unified tree-structured model for multivariate data, even though the predictor variables corresponding to the response variable are entirely different. This unified model can offer revelatory insights such as underlying economic meaning. We propose the key factors of tree-structured regression, which are an impurity function detecting complex nonlinear relationships between the predictor variables and the response variable, split rule selection with negligible selection bias, and tree size determination solving underfitting and overfitting problems. We demonstrate our proposed method using simulated data and illustrate it using data from the Korea stock exchange sector indices.

12.

Direction dependence in a regression line

Yadolah Dodge Valentin Rousson 《Communications in Statistics - Theory and Methods》2013,42(9-10):1957-1972

In this paper, we derive some simple formulae to express the association between two random variables in the case of a linear relationship. One of these representations, the cube of the correlation coefficient, is given as the ratio of the skewness of the response variable to that of the explanatory variable. This result, along with other expressions of the correlation coefficient presented in this paper, has implications for choosing the response variable in linear regression modelling.
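
The skewness identity mentioned in the abstract above can be checked numerically. The Python sketch below simulates a linear model with a skewed explanatory variable and compares the cube of the sample correlation with the ratio of sample skewnesses. The exponential predictor, the normal (symmetric) error independent of the predictor, the coefficients and the sample size are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def skewness(v):
    # Moment-based skewness: third central moment over sd cubed.
    c = v - v.mean()
    return np.mean(c**3) / np.mean(c**2) ** 1.5

rng = np.random.default_rng(0)
n = 200_000

# Skewed explanatory variable and a symmetric, independent error term
# (both distributional choices are assumptions of this sketch).
x = rng.exponential(scale=1.0, size=n)
eps = rng.normal(loc=0.0, scale=1.0, size=n)
y = 2.0 + 1.0 * x + eps

rho = np.corrcoef(x, y)[0, 1]
ratio = skewness(y) / skewness(x)

print(f"rho^3          = {rho**3:.4f}")
print(f"skew(y)/skew(x) = {ratio:.4f}")
```

Under these assumptions the third central moment of the response is the slope cubed times that of the predictor, because the symmetric error contributes nothing, which is why the two printed quantities should agree up to sampling error.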

13.

Review: Reversed low-rank ANOVA model for transforming high dimensional genetic data into low dimension

Yoonsuh Jung Jianhua Hu 《Journal of the Korean Statistical Society》2019,48(2):169-178

A general modeling procedure for analyzing genetic data is reviewed. We review an ANOVA-type model that can handle both continuous and discrete genetic variables in one modeling framework. Unlike regression-type models, which typically set the phenotype variable as a response, this ANOVA model treats the phenotype variable as an explanatory variable. By treating the phenotype variable in this reversed way, the usual high-dimensional problem is turned into a low-dimensional one. Instead, the ANOVA model always includes an interaction term between the genetic locations and the phenotype variable to find potential associations between them. The interaction term is designed to be low rank, as a product of bilinear terms, so that the required number of parameters is kept to a manageable degree. We compare the performance of the reviewed ANOVA model with other popular methods via microarray and SNP data sets.

14.

Testing the equality of two independent regression models

Mohammad Reza Mahmoudi Mohsen Maleki Abbas Pak 《Communications in Statistics - Theory and Methods》2018,47(12):2919-2926

In some situations, for example in agriculture, biology, hydrology, and psychology, researchers wish to determine whether the relationship between the response variable and the predictor variables differs in two populations. In other words, we are interested in comparing two regression models for two independent datasets. In this work, we use parametric and nonparametric methods to establish hypothesis testing for the equality of two independent regression models. A simulation study is then provided to investigate the performance of the proposed method.

15.

Linear Transformations of Linear Mixed-Effects Models

Christopher H. Morrell Jay D. Pearson Larry J. Brant 《The American statistician》2013,67(4):338-343

A number of articles have discussed the way lower order polynomial and interaction terms should be handled in linear regression models. Only if all lower order terms are included in the model will the regression model be invariant with respect to coding transformations of the variables. If lower order terms are omitted, the regression model will not be well formulated. In this paper, we extend this work to examine the implications of the ordering of variables in the linear mixed-effects model. We demonstrate how linear transformations of the variables affect the model and tests of significance of fixed effects in the model. We show how the transformations modify the random effects in the model, as well as their covariance matrix and the value of the restricted log-likelihood. We suggest a variable selection strategy for the linear mixed-effects model.
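
The invariance property described at the start of the abstract above can be illustrated for ordinary least squares (not for the mixed-effects extension the paper studies). The Python sketch below shifts one predictor by a constant and compares fitted values before and after the shift, once with all lower order terms present and once with a main effect omitted. The simulated data, coefficients, shift of 10 and helper name `fitted` are assumptions for the example.

```python
import numpy as np

def fitted(columns, y):
    # Ordinary least-squares fitted values with an intercept column.
    X = np.column_stack([np.ones_like(y)] + columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ beta

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 - 0.8 * x2 + 0.6 * x1 * x2 + rng.normal(scale=0.3, size=n)

# Coding transformation: shift x1 by a constant.
x1s = x1 + 10.0

# Model with all lower order terms: intercept, x1, x2, x1*x2.
full_before = fitted([x1, x2, x1 * x2], y)
full_after = fitted([x1s, x2, x1s * x2], y)

# Model that omits the x2 main effect: intercept, x1, x1*x2.
reduced_before = fitted([x1, x1 * x2], y)
reduced_after = fitted([x1s, x1s * x2], y)

full_gap = float(np.max(np.abs(full_before - full_after)))
reduced_gap = float(np.max(np.abs(reduced_before - reduced_after)))
print(f"max change in fitted values, full model:    {full_gap:.2e}")
print(f"max change in fitted values, reduced model: {reduced_gap:.2e}")
```

With all lower order terms included, the shifted design spans the same column space as the original, so the fit is unchanged up to rounding. With the main effect omitted, the shifted interaction column brings in a multiple of x2, so the fitted model changes.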

16.

Statistical analysis of rock-burst events in underground mines and excavations to present reasonable data-driven predictors

Sajjad Afraei Sayyed Hasan Madani 《Journal of Statistical Computation and Simulation》2017,87(17):3336-3376

Rock bursts are sudden and violent failures of surrounding rock masses in underground mines and excavations. In this paper, a database consisting of 188 case histories was collected. Each case history contains some of the predictor variables ‘overburden thickness, maximum tangential stress, uniaxial compressive strength of rock, tensile strength of rock, stress ratio, brittleness ratio and elastic energy index’ and one of the four defined classes for the dependent variable ‘rock burst intensity’. A strategy, including ‘outlier detection and substitution, normality evaluation, deduction of distribution functions, estimation of mean and mean variation ranges, evaluation of mean-equality and distribution function-equality hypotheses, correlation analysis and factor analysis for in-review variables’, was implemented. The strategy led to the conclusion that some predictor variables with available case histories make no contribution to rock burst prediction. These inferences were in accordance with the results of regression techniques for qualitative dependent variables. Besides, many predictor variable arrangements were incompatible with factor analysis. In the case of compatible arrangements, the variation of the predictor variables cannot be considerably reflected. Application of nonlinear principal component analysis using auto-associative neural networks also did not lead to representative components. Therefore, only the significant predictor variables can be used to design new classifiers.

17.

Instrumental variable estimation in ordinal probit models with mismeasured predictors

Jing Guan Hongjian Cheng Kenneth A. Bollen D. Roland Thomas Liqun Wang 《Revue canadienne de statistique》2019,47(4):653-667

Researchers in the medical, health, and social sciences routinely encounter ordinal variables such as self‐reports of health or happiness. When modelling ordinal outcome variables, it is common to have covariates, for example, attitudes, family income, retrospective variables, measured with error. As is well known, ignoring even random error in covariates can bias coefficients and hence prejudice the estimates of effects. We propose an instrumental variable approach to the estimation of a probit model with an ordinal response and mismeasured predictor variables. We obtain likelihood‐based and method of moments estimators that are consistent and asymptotically normally distributed under general conditions. These estimators are easy to compute, perform well and are robust against the normality assumption for the measurement errors in our simulation studies. The proposed method is applied to both simulated and real data. The Canadian Journal of Statistics 47: 653–667; 2019 © 2019 Statistical Society of Canada

18.

Model selection for logistic regression via association rules analysis

Pannapa Changpetch Dennis K.J. Lin 《Journal of Statistical Computation and Simulation》2013,83(8):1415-1428

Interaction is very common in reality, but has received little attention in the logistic regression literature. This is especially true for higher-order interactions. In conventional logistic regression, interactions are typically ignored. We propose a model selection procedure that implements an association rules analysis. We do this by (1) exploring the combinations of input variables which have a significant impact on the response (via association rules analysis); (2) selecting the potential (low- and high-order) interactions; (3) converting these potential interactions into new dummy variables; and (4) performing variable selection among all the input variables and the newly created dummy variables (interactions) to build up the optimal logistic regression model. Our model selection procedure establishes the optimal combination of main effects and potential interactions. The comparisons are made through thorough simulations. It is shown that the proposed method outperforms the existing methods in all cases. A real-life example is discussed in detail to demonstrate the proposed method.

19.

A unified approach to estimation of nonlinear mixed effects and Berkson measurement error models

Liqun Wang 《Revue canadienne de statistique》2007,35(2):233-248

Mixed effects models and Berkson measurement error models are widely used. They share features which the author uses to develop a unified estimation framework. He deals with models in which the random effects (or measurement errors) have a general parametric distribution, whereas the random regression coefficients (or unobserved predictor variables) and error terms have nonparametric distributions. He proposes a second-order least squares estimator and a simulation-based estimator based on the first two moments of the conditional response variable given the observed covariates. He shows that both estimators are consistent and asymptotically normally distributed under fairly general conditions. The author also reports Monte Carlo simulation studies showing that the proposed estimators perform satisfactorily for relatively small sample sizes. Compared to the likelihood approach, the proposed methods are computationally feasible and do not rely on the normality assumption for random effects or other variables in the model.

20.

A Graphical Tool for Interpreting Regression Coefficients of Trinomial Logit Models

Flavio Santi Maria Michela Dickson Giuseppe Espa 《The American statistician》2019,73(2):200-207

Multinomial logit (also termed multi-logit) models permit the analysis of the statistical relation between a categorical response variable and a set of explicative variables (called covariates or regressors). Although multinomial logit is widely used in both the social and economic sciences, the interpretation of regression coefficients may be tricky, as the effect of covariates on the probability distribution of the response variable is nonconstant and difficult to quantify. The ternary plots illustrated in this article aim at facilitating the interpretation of regression coefficients and permit the effect of covariates (considered either singly or jointly) on the probability distribution of the dependent variable to be quantified. Ternary plots can be drawn both for ordered and for unordered categorical dependent variables, when the number of possible outcomes equals three (trinomial response variable); these plots make it possible not only to represent the covariate effects over the whole parameter space of the dependent variable but also to compare the covariate effects of any given individual profile. The method is illustrated and discussed through analysis of a dataset concerning the transition of master’s graduates of the University of Trento (Italy) from university to employment.