1.
2.
Carles M. Cuadras 《The American statistician》2013,67(4):256-258
This article provides a method of interpreting a surprising inequality in multiple linear regression: the squared multiple correlation can be greater than the sum of the simple squared correlations between the response variable and each of the predictor variables. The interpretation is obtained via principal component analysis by studying the influence of some components with small variance on the response variable. One example is used as an illustration and some conclusions are derived.
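The inequality can be checked numerically in the two-predictor case. The sketch below does not reproduce the article's principal-component argument; it only uses the standard closed form for the squared multiple correlation in terms of the three pairwise correlations. The function name and the correlation values are hypothetical, chosen so that the second predictor is uncorrelated with the response but strongly correlated with the first predictor.

```python
def r2_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation of y on x1 and x2, from pairwise correlations."""
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)


# Hypothetical values (assumption): corr(y, x1), corr(y, x2), corr(x1, x2).
r_y1, r_y2, r_12 = 0.5, 0.0, 0.8

r2 = r2_two_predictors(r_y1, r_y2, r_12)
sum_simple = r_y1**2 + r_y2**2

print(f"R^2 = {r2:.3f}, sum of simple r^2 = {sum_simple:.3f}")
# prints: R^2 = 0.694, sum of simple r^2 = 0.250
```

Here the second predictor explains nothing on its own, yet adding it raises the squared multiple correlation well above the sum of the two simple squared correlations.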
3.
A method is described for determining the sample size required for a simultaneous confidence statement of specified precision about the parameters of a multinomial population. The method is based on a simultaneous confidence interval procedure due to Goodman, and the results are compared with those obtained by separately considering each cell of the multinomial population as a binomial.
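The comparison described in this abstract can be sketched as follows. This is one common reading of a Goodman-type sample-size rule, not necessarily the article's exact procedure: the normal quantile is taken at level alpha/(2k) for k cells (equivalently, the upper alpha/k quantile of a chi-square with one degree of freedom), and the largest requirement over the cells is used. The function names, the planning proportions, and the half-width are all assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist


def multinomial_sample_size(props, half_width, alpha=0.05):
    """Sample size so that all k cell intervals hold simultaneously (Bonferroni-type)."""
    k = len(props)
    z = NormalDist().inv_cdf(1 - alpha / (2 * k))
    b = z * z  # upper alpha/k quantile of chi-square with 1 d.f.
    return ceil(max(b * p * (1 - p) / half_width**2 for p in props))


def binomial_sample_size(props, half_width, alpha=0.05):
    """Sample size when each cell is treated separately as a binomial."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(max(z * z * p * (1 - p) / half_width**2 for p in props))


# Hypothetical planning values (assumption): three cells, half-width 0.05.
props = [0.5, 0.3, 0.2]
n_simultaneous = multinomial_sample_size(props, half_width=0.05)
n_separate = binomial_sample_size(props, half_width=0.05)
print(n_simultaneous, n_separate)
```

The simultaneous requirement is never smaller than the cell-by-cell one, because the quantile is taken further into the tail as the number of cells grows.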
4.
A recent article in this journal presented a variety of expressions for the coefficient of determination (R²) and demonstrated that these expressions were generally not equivalent. The article discussed potential pitfalls in interpreting the R² statistic in ordinary least-squares regression analysis. The current article extends this discussion to the case in which regression models are fit by weighted least squares and points out an additional pitfall that awaits the unwary data analyst. We show that unthinking reliance on the R² statistic can lead to an overly optimistic interpretation of the proportion of variance accounted for in the regression. We propose a modification of the estimator and demonstrate its utility by example.
5.
A regression approach to principal component analysis is presented in this note. We provide an alternative interpretation of principal components that illustrates the relation between the extra sum of squares in regression analysis and the eigenvalues associated with the principal components.
6.
A versatile graphical tool, the BLiP plot, was developed for displaying one-dimensional data. The basic building blocks are boxes, lines, and points. Like many standard one-dimensional distribution plots, the BLiP plot is capable of displaying individual data values in points or lines and grouped information in lines or boxes. In addition, the BLiP plot includes many new features such as variable-width plots and several choices of point patterns. The main advantage of the BLiP plot is that it provides users with basic graphical elements in a friendly and flexible environment so that users can, according to their needs, construct anything from a simple, standard plot to a complex, customized plot to best present their data.
7.
Multinomial logit (also termed multi-logit) models permit the analysis of the statistical relation between a categorical response variable and a set of explanatory variables (called covariates or regressors). Although multinomial logit is widely used in both the social and economic sciences, the interpretation of regression coefficients may be tricky, as the effect of covariates on the probability distribution of the response variable is nonconstant and difficult to quantify. The ternary plots illustrated in this article aim at facilitating the interpretation of regression coefficients and permit the effect of covariates (considered either singly or jointly) on the probability distribution of the dependent variable to be quantified. Ternary plots can be drawn both for ordered and for unordered categorical dependent variables, when the number of possible outcomes equals three (trinomial response variable); these plots make it possible not only to represent the covariate effects over the whole parameter space of the dependent variable but also to compare the covariate effects for any given individual profile. The method is illustrated and discussed through analysis of a dataset concerning the transition of master’s graduates of the University of Trento (Italy) from university to employment.
8.
The behavior of the sample coefficient of determination is examined for some arrangements of independent variable values in a simple linear regression with normally distributed error terms. Numerical values of means and standard deviations are presented that provide some insight into the influence of range and arrangement of independent variable values and sample size on the size of the sample coefficient of determination. Some asymptotic results are given.
9.
《Communications in Statistics - Theory and Methods》2013,42(10):2409-2422
A simple method based on sliced inverse regression (SIR) is proposed to explore an effective dimension reduction (EDR) vector for the single index model. We avoid the principal component analysis step of the original SIR by using two sample mean vectors in two slices of the response variable and their difference vector. The theory becomes simpler, the method is equivalent to multiple linear regression with a dichotomized response, and the estimator can be expressed in closed form, although the objective function might be an unknown nonlinear function. It can be applied when the number of covariates is large, and it requires no matrix operation or iterative calculation.
10.
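This paper proposes a moving-block empirical likelihood estimation method for fixed-effects panel linear regression models in which the idiosyncratic error term exhibits serial correlation of arbitrary form, and establishes its large-sample properties. Simulation studies show that the method applies both when the form of serial correlation in the idiosyncratic error term is known and when it is unknown, and that it is more effective than the methods proposed by Baltagi and Li (1994) and Gonçalves (2011). The method is applied in an empirical analysis of the relationship between CO2 emissions and the level of urbanization. The results show that the level of urbanization has a significant effect on CO2 emissions and that the effect differs across stages of urbanization.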
11.
David A. Freedman 《Journal of Business & Economic Statistics》2013,31(1):131-133
This note describes a situation in which a simple mathematical model helped solve an important practical problem: how to price water fairly. It is intended as an example, rather than as a mathematical contribution to control theory.
12.
In this article, we present a straightforward Bonferroni approach for determining sample size for estimating the mean vector of a multivariate population under two scenarios: (1) a pre-specified overall confidence level is desired; and (2) a pre-specified confidence level needs to be guaranteed for each individual variable. It is demonstrated that correlation between variables helps reduce the sample size. The formula to calculate the reduced sample size is derived. A binormal example is presented to illustrate the effect of correlation on sample size reduction for various values of the correlation coefficient.
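The two scenarios can be sketched with the plain Bonferroni bound. This is a minimal illustration under stated assumptions, not the article's derivation: standard deviations are treated as known, a normal approximation is used, and the correlation-based reduction the abstract mentions is not implemented. The function name, standard deviations, and half-widths are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def mean_vector_sample_size(sigmas, half_widths, alpha=0.05, overall=True):
    """Sample size for confidence intervals on each component of a mean vector.

    overall=True  : scenario (1), Bonferroni-adjusted so the joint level is >= 1 - alpha.
    overall=False : scenario (2), level 1 - alpha for each variable separately.
    """
    p = len(sigmas)
    tail = alpha / (2 * p) if overall else alpha / 2
    z = NormalDist().inv_cdf(1 - tail)
    return ceil(max((z * s / d) ** 2 for s, d in zip(sigmas, half_widths)))


# Hypothetical planning values (assumption): two variables.
sigmas = [10.0, 4.0]
half_widths = [2.0, 1.0]

n_overall = mean_vector_sample_size(sigmas, half_widths, overall=True)
n_each = mean_vector_sample_size(sigmas, half_widths, overall=False)
print(n_overall, n_each)  # prints: 126 97
```

The Bonferroni bound ignores dependence between the variables, which is why it is conservative and why accounting for correlation, as the abstract describes, can lower the required sample size.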
13.
Goodness-of-fit statistics for general multiple-linear-regression equations are reviewed for the case of replicated responses. A modification of the coefficient of determination is recommended. This statistic has 1.0 as its achievable upper bound and has the coefficient of determination as a special case. It indicates more effectively how close a general-linear-regression equation is to the best possible one and is particularly useful when the purpose is to ascertain whether higher-order terms of a given set of explanatory variables are required. Other goodness-of-fit statistics that take into account the variation within replicated responses are reviewed. An illustrative example is presented.
14.
Eva Elvers, Carl Erik Särndal, Jan H. Wretman, Göran Örnberg 《Revue canadienne de statistique》1985,13(3):185-199
In most surveys, inference for domains poses a difficult problem because of data shortage. This paper presents a probability sampling theory approach to some common types of statistical analysis for domains of a surveyed population. Simple and multiple regression analysis, and analysis of ratios are considered. Two new methods are constructed and explored which can improve substantially over the common method based on sample-weighted sums of squares and products. These new methods use auxiliary variables whose importance depends on the extent to which they succeed in explaining certain patterns in the regression residuals. The theoretical conclusions are supported by empirical results from Monte Carlo experiments.
15.
《Journal of Statistical Computation and Simulation》2012,82(5):379-390
A basic graphical approach for checking normality is the Q-Q plot, which compares sample quantiles against the population quantiles. In univariate analysis, the probability plot correlation coefficient test for normality has been studied extensively. We consider testing multivariate normality by using the correlation coefficient of the Q-Q plot. When multivariate normality holds, the sample squared distance should follow a chi-square distribution for large samples, and the plot should resemble a straight line. A correlation coefficient test can be constructed by using the pairs of points in the probability plot. When the correlation coefficient test does not reject the null hypothesis, the sample data may come from a multivariate normal distribution or from some other distribution. We therefore use the following two steps to test multivariate normality. First, we check multivariate normality by using the probability plot correlation coefficient test. If the test does not reject the null hypothesis, we then test symmetry of the distribution and determine whether multivariate normality holds. This test procedure is called the combination test. The size and power of this test are studied, and it is found that the combination test is, in general, more powerful than other tests for multivariate normality.
16.
Paul R. Rosenbaum 《The American statistician》2013,67(4):265-266
The history of the analysis of unbalanced factorial designs is traced from Yates's original papers (Yates 1933, 1934) to the beginning of the computational revolution in the 1960s. Emphasis is placed on putting the methods proposed during this period in perspective in view of our present understanding.
17.
Sujit K. Sahu, Dipak K. Dey, Helen Aslanidou, Debajyoti Sinha 《Lifetime data analysis》1997,3(2):123-137
Frequently in the analysis of survival data, survival times within the same group are correlated due to unobserved covariates. One way these covariates can be included in the model is as frailties. These frailty random block effects generate dependency between the survival times of the individuals, which are conditionally independent given the frailty. Using a conditional proportional hazards model, in conjunction with the frailty, a whole new family of models is introduced. With a gamma frailty model, the issue is often to find an appropriate model for the baseline hazard function. In this paper a flexible baseline hazard model based on a correlated prior process is proposed and is compared with a standard Weibull model. Several model diagnostic methods are developed and model comparison is made using recently developed Bayesian model selection criteria. The above methodologies are applied to the McGilchrist and Aisbett (1991) kidney infection data and the analysis is performed using Markov Chain Monte Carlo methods. This revised version was published online in July 2006 with corrections to the Cover Date.
18.
Tsung-Shan Tsou 《Communications in Statistics - Theory and Methods》2013,42(9):1350-1360
A parametric robust test is proposed for comparing several coefficients of variation. This test is derived by properly correcting the normal likelihood function according to the technique suggested by Royall and Tsou. The proposed test statistic is asymptotically valid for general random variables, as long as their underlying distributions have finite fourth moments. Simulation studies and real data analyses are provided to demonstrate the effectiveness of the novel robust procedure.
19.
In trying to establish the relationship between a yearly fisheries recruitment series and meteorological or oceanographic variables such as air pressure or sea surface temperature, we are often faced with the situation where the number of regressors exceeds the number of observations. In this paper we use the techniques of penalized least squares and principal-components regression to determine whether air pressure over the North Atlantic can be used to predict two North Atlantic cod recruitment series. The results suggest that penalized least squares can be very effective in these situations.
20.
For mixed regression models, we define a variance decomposition including three terms: explained individual variance, unexplained individual variance, and noise variance. In contrast to traditional variance decomposition, it is thus the unexplained, not the explained, variance that is split. It gives rise to a coefficient of individual determination (CID) defined as the estimated fraction of explained individual variance. We argue that in many applications CID is a valuable complement to R², since it excludes noise variance (which can never be explained) and thus has one as a natural upper bound.