Similar articles
 Found 20 similar articles; search took 203 ms
1.
Adjusted variable plots are useful in linear regression for outlier detection and for qualitative evaluation of the fit of a model. In this paper, we extend adjusted variable plots to Cox's proportional hazards model for possibly censored survival data. We propose three different plots: a risk level adjusted variable (RLAV) plot in which each observation in each risk set appears, a subject level adjusted variable (SLAV) plot in which each subject is represented by one point, and an event level adjusted variable (ELAV) plot in which the entire risk set at each failure event is represented by a single point. The latter two plots are derived from the RLAV by combining multiple points. In each plot, the regression coefficient and standard error from a Cox proportional hazards regression are obtained by a simple linear regression through the origin fit to the coordinates of the pictured points. The plots are illustrated with a reanalysis of a dataset of 65 patients with multiple myeloma.

2.
Quantile-quantile plots are most commonly used to compare the shapes of distributions, but they may also be used in conjunction with partial orders on distributions to compare the level and dispersion of distributions that have different shapes. We discuss several easily recognized patterns in quantile-quantile plots that suffice to demonstrate that one distribution is smaller than another in terms of each of several partial orders. We illustrate with financial applications, proposing a quantile plot for comparing the risks and returns of portfolios of investments. As competing portfolios have distributions that differ in level, dispersion, and shape, it is not sufficient to compare portfolios using measures of location and dispersion, such as expected returns and variances; however, quantile plots, with suitable scaling, do aid in such comparisons. In two plots, we compare specific portfolios to the stock market as a whole, finding these portfolios to have higher returns, greater risks or dispersion, and thicker tails than their greater dispersion alone would justify. Nonetheless, investors in these risky portfolios are more than adequately compensated for the risks undertaken.

3.
This paper examines the effect of randomisation restrictions, either to satisfy conditions for a balanced incomplete block design or to attain a higher level of partial neighbour balance, on the average variance of pairwise treatment contrasts under a neighbour model discussed by Gleeson & Cullis (1987). Results suggest that smaller average pairwise variances can be obtained by ignoring requirements for incomplete block designs and concentrating on achieving a higher level of partial neighbour balance. Field layout of the design, although often determined by practical constraints, e.g. size, shape of site, minimum plot size and experimental husbandry, may markedly affect average pairwise variance. For the one-dimensional (row-wise) neighbour model considered here, investigation of three different layouts suggests that for a rectangular array of plots, smaller average pairwise variances can generally be obtained from layouts with fewer rows and more plots per row.

4.
We propose several diagnostic methods for checking the adequacy of marginal regression models for analyzing correlated binary data. We use a parametric marginal model based on latent variables and derive the projection (hat) matrix, Cook's distance, various residuals and Mahalanobis distance between the observed binary responses and the estimated probabilities for a cluster. Emphasized are several graphical methods including the simulated Q-Q plot, the half-normal probability plot with a simulated envelope, and the partial residual plot. The methods are illustrated with a real-life example.

5.
Data‐analytic tools for models other than the normal linear regression model are relatively rare. Here we develop plots and diagnostic statistics for nonconstant variance for the random‐effects model (REM). REMs for longitudinal data include both within‐ and between‐subject variances. A basic assumption is that the two variance terms are constant across subjects. However, we often find that these variances are functions of covariates, and the data set has what we call explainable heterogeneity, which needs to be allowed for in the model. We characterize several types of heterogeneity of variance in REMs and develop three diagnostic tests using the score statistic: one for each of the two variance terms, and the third for a form of multivariate nonconstant variance. For each test we present an adjusted residual plot which can identify cases that are unusually influential on the outcome of the test.

6.
A normal quantile-quantile (QQ) plot is an important diagnostic for checking the assumption of normality. Though useful, these plots confuse students in my introductory statistics classes. A water-filling analogy, however, intuitively conveys the underlying concept. This analogy characterizes a QQ plot as a parametric plot of the water levels in two gradually filling vases. Each vase takes its shape from a probability distribution or sample. If the vases share a common shape, then the water levels match throughout the filling, and the QQ plot traces a diagonal line. An R package qqvases provides an interactive animation of this process and is suitable for classroom use.
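
As a rough illustration of what a normal QQ plot pairs up, the sketch below computes the plotted coordinates in Python using only the standard library. It is not the qqvases package described above; the function name `normal_qq_points`, the sample values, and the plotting positions (i - 0.5)/n are assumptions chosen for the example.

```python
from statistics import NormalDist

def normal_qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal QQ plot."""
    ordered = sorted(sample)
    n = len(ordered)
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n are one common convention (assumed here).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

# Hypothetical sample, for illustration only.
points = normal_qq_points([2.1, -0.3, 0.8, 1.5, -1.2, 0.1, 0.4])
for theo, obs in points:
    print(f"{theo:7.3f}  {obs:6.2f}")
```

If the sample shares the shape of the normal distribution, the printed pairs fall close to a straight line when plotted against each other.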

7.
Inference for the general linear model makes several assumptions, including independence of errors, normality, and homogeneity of variance. Departure from the latter two of these assumptions may indicate the need for data transformation or removal of outlying observations. Informal procedures such as diagnostic plots of residuals are frequently used to assess the validity of these assumptions or to identify possible outliers. A simulation-based approach is proposed, which facilitates the interpretation of various diagnostic plots by adding simultaneous tolerance bounds. Several tests exist for normality or homoscedasticity in simple random samples. These tests are often applied to residuals from a linear model fit. The resulting procedures are approximate in that correlation among residuals is ignored. The simulation-based approach accounts for the correlation structure of residuals in the linear model and allows simultaneously checking for possible outliers, non-normality, and heteroscedasticity, and it does not rely on formal testing.

[Supplementary materials are available for this article. Go to the publisher's online edition of Communications in Statistics—Simulation and Computation® for the following three supplemental resources: a Word file containing figures illustrating the mode of operation for the bisectional algorithm, QQ-plots, and a residual plot for the mussels data.]

8.
A versatile graphical tool, the BLiP plot, was developed for displaying one-dimensional data. The basic building blocks are boxes, lines, and points. Like many standard one-dimensional distribution plots, the BLiP plot is capable of displaying individual data values in points or lines and grouped information in lines or boxes. In addition, the BLiP plot includes many new features such as variable-width plots and several choices of point patterns. The main advantage of the BLiP plot is that it provides users with basic graphical elements in a friendly and flexible environment so that users can, according to their needs, construct anything from a simple, standard plot to a complex, customized plot to best present their data.

9.
The heterogeneity of error variance often causes a huge interpretive problem in linear regression analysis. Before taking any remedial measures we first need to detect this problem. A large number of diagnostic plots are now available in the literature for detecting heteroscedasticity of error variances. Among them the ‘residuals’ and ‘fits’ (R–F) plot is very popular and commonly used. In the R–F plot residuals are plotted against the fitted responses, where both these components are obtained using the ordinary least squares (OLS) method. It is now evident that the OLS fits and residuals suffer a huge setback in the presence of unusual observations and hence the R–F plot may not exhibit the real scenario. The deletion residuals based on a data set free from all unusual cases should estimate the true errors in a better way than the OLS residuals. In this paper we propose ‘deletion residuals’ and the ‘deletion fits’ (DR–DF) plot for the detection of the heterogeneity of error variances in a linear regression model to get a more convincing and reliable graphical display. Examples show that this plot locates unusual observations more clearly than the R–F plot. The advantage of using deletion residuals in the detection of heteroscedasticity of error variance is investigated through Monte Carlo simulations under a variety of situations.

10.
Icicle Plots: Better Displays for Hierarchical Clustering   Cited by: 1 (self-citations: 0; citations by others: 1)
An icicle plot is a method for presenting a hierarchical clustering. Compared with other methods of presentation, it is far easier in an icicle plot to read off which objects belong to which clusters, and which objects join or drop out from a cluster as we move up and down the levels of the hierarchy, though these benefits only appear when enough objects are being clustered. Icicle plots are described, and their benefits are illustrated using a clustering of 48 objects.

11.
Therneau et al. (1990) used martingale residual plots to study the threshold effect of some covariates in a proportional hazards regression model for survival data subject to right censoring. We show that the maximum partial likelihood estimate provides an asymptotically consistent estimator for the unknown threshold. This procedure is illustrated by applying it to a data set from a cohort of patients with B-lineage leukemia treated at St. Jude Children's Research Hospital.

12.
Two diagnostic plots for selecting explanatory variables are introduced to assess the accuracy of a generalized beta-linear model. The added variable plot is developed to examine the need for adding a new explanatory variable to the model. The constructed variable plot is developed to identify the nonlinearity of the explanatory variable in the model. The two diagnostic procedures are also useful for detecting unusual observations that may strongly affect the regression. Simulation studies and analysis of two practical examples are conducted to illustrate the performance of the proposed plots.

13.
A new family of statistics is proposed to test for the presence of serial correlation in linear regression models. The tests are based on partial sums of lagged cross-products of regression residuals that define a class of interesting Gaussian processes. These processes are characterized in terms of regressor functions, the serial-correlation structure, the distribution of the noise process, and the order of the lag of the cross-products of residuals. It is shown that these four factors affect the lagged residual processes independently. Large-sample distributional results are presented for test statistics under the null hypothesis of no serial correlation or for alternatives from a range of interesting hypotheses. Some indication of the circumstances in which the asymptotic results apply in finite-sample situations, and of those in which they should be applied with some caution, is obtained through a simulation study. Tables of selected quantiles of the proposed tests are also given. The tests are illustrated with two examples taken from the empirical literature. It is also proposed that plots of lagged residual processes be used as diagnostic tools to gain insight into the correlation structure of residuals derived from regression fits.

14.
An added variable plot is a commonly used plot in regression diagnostics. The rationale for this plot is to provide information about the addition of a further explanatory variable to the model. In addition, an added variable plot is most often used for detecting high leverage points and influential data. So far as we know, this type of plot involves the least squares residuals which, we suspect, could produce a confusing picture when a group of unusual cases is present in the data. In this situation, added variable plots may not only fail to detect the unusual cases but also may fail to focus on the need for adding a further regressor to the model. We suggest that residuals from deletion should be more convincing and reliable in this type of plot. The usefulness of an added variable plot based on residuals from deletion is investigated through a few examples and a Monte Carlo simulation experiment in a variety of situations.
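
For orientation, the sketch below builds the coordinates of a standard least-squares added variable plot in Python, which is the baseline the abstract contrasts with and not the deletion-residual version it proposes. The helper `residuals_on` and the noise-free data are hypothetical, chosen so that the through-origin slope of the plotted points can be checked against the known coefficient of the added variable.

```python
def residuals_on(x, v):
    """Residuals from a least-squares fit of v on an intercept and x."""
    n = len(x)
    mean_x, mean_v = sum(x) / n, sum(v) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxv = sum((xi - mean_x) * (vi - mean_v) for xi, vi in zip(x, v))
    slope = sxv / sxx
    return [vi - mean_v - slope * (xi - mean_x) for xi, vi in zip(x, v)]

# Hypothetical data: y depends on x1 (already in the model) and x2 (candidate).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 8.0]
y = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(x1, x2)]

# Added variable plot: residuals of y given x1 against residuals of x2 given x1.
e_y = residuals_on(x1, y)
e_x2 = residuals_on(x1, x2)

# Slope of the least-squares line through the origin for the plotted points.
av_slope = sum(u * v for u, v in zip(e_x2, e_y)) / sum(u * u for u in e_x2)
print(round(av_slope, 6))
```

Because the example data contain no noise, the slope recovers the coefficient 3 used to generate `y`; with real data it would equal the least-squares coefficient of `x2` in the full model.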

15.
Formulas for plotting probability and techniques for subjectively drawing lines on probability plots are reviewed. A method is presented for plotting data and drawing an objective line on the probability plot to obtain a test of the distributional assumption.

16.
Residual plots are a standard tool for assessing model fit. When some outcome data are censored, standard residual plots become less appropriate. Here, we develop a new procedure for producing residual plots for linear regression models where some or all of the outcome data are censored. We implement two approaches for incorporating parameter uncertainty. We illustrate our methodology by examining the model fit for an analysis of bacterial load data from a trial for chronic obstructive pulmonary disease. Simulated datasets show that the method can be used when the outcome data consist of a variety of types of censoring.

17.
This note provides a new explanation for Tukey's definition of “(inner) fences” for box-and-whiskers plots. The starting point is explicit bounds for the sample mean based only on the box plot. Starting from these bounds we define a dataset to contain outside values if at least one of the latter bounds is outside of the box. This leads to a new, yet simple definition of fences. They are symmetric around the box if, and only if, the median is in the middle of the box. In that case, the new definition coincides with Tukey's rule of 1.5 times the interquartile range. To avoid instabilities for small (sub-) samples we propose to complement the box-and-whiskers plot of the original data with a box-and-whiskers plot of Walsh means.
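
The two ingredients named in this abstract, Tukey's 1.5 × IQR rule and Walsh means, can be sketched in Python as follows. This shows only the classical rule, not the note's new fence definition; the function names, the data, and the "inclusive" quartile convention are assumptions (other quartile conventions give slightly different fences).

```python
from itertools import combinations_with_replacement
from statistics import quantiles

def tukey_fences(data, k=1.5):
    """Classical inner fences: k times the interquartile range beyond the box."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def walsh_means(data):
    """All pairwise averages (x_i + x_j) / 2 with i <= j."""
    return [(a + b) / 2 for a, b in combinations_with_replacement(data, 2)]

# Hypothetical data with one clearly outlying value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
low, high = tukey_fences(data)
outside = [x for x in data if x < low or x > high]
print(low, high, outside)       # fences and the values beyond them
print(len(walsh_means(data)))   # n * (n + 1) / 2 Walsh means
```

A box plot of `walsh_means(data)` could then be drawn next to the box plot of `data`, as the note suggests for small samples.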

18.
Waterfall plots are used to describe changes in tumor size observed in clinical studies. They are frequently used to illustrate the overall drug response in oncology clinical trials because of their simple representation of results. Unfortunately, this visual display suffers from a number of limitations including (1) potential misguidance by masking the time dynamics of tumor size, (2) ambiguous labelling of the y‐axis, and (3) low data‐to‐ink ratio. We offer some alternatives to address these shortcomings and recommend moving away from waterfall plots in favor of plots showing the individual time profiles of sum of lesion diameters (according to RECIST). The spider plot presents the individual changes in tumor measurements over time relative to baseline tumor burden. Baseline tumor size is a well‐known confounding factor of drug effect which has to be accounted for when analyzing data in early clinical trials. While spider plots conveniently correct for baseline tumor size, they cannot be presented in isolation. Indeed, percentage change from baseline has suboptimal statistical properties (including skewed distribution) and can be overly optimistic in favor of drug efficacy. We argue that plots of raw data (referred to as spaghetti plots) should always accompany spider plots to provide an equipoised illustration of the drug effect on lesion diameters.
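
To make the difference between the displays concrete, the Python sketch below derives the values each plot would show from the same series of sums of lesion diameters. The patient identifiers and measurements are invented for illustration, and taking the best post-baseline percentage change as the waterfall bar height is an assumption about a common convention, not something stated in the abstract.

```python
def percent_change_from_baseline(sums):
    """Percentage change of each visit's sum of diameters relative to the first visit."""
    baseline = sums[0]
    return [100.0 * (s - baseline) / baseline for s in sums]

# Hypothetical sums of lesion diameters (mm) per visit; first value is baseline.
raw = {"P1": [50.0, 40.0, 35.0, 45.0], "P2": [20.0, 22.0, 26.0]}

# Spaghetti plot: the raw series themselves, one line per patient.
spaghetti = raw

# Spider plot: percentage change from baseline over time, one line per patient.
spider = {pid: percent_change_from_baseline(s) for pid, s in raw.items()}

# Waterfall plot: one bar per patient (best post-baseline change), sorted descending.
waterfall = sorted(
    ((pid, min(changes[1:])) for pid, changes in spider.items()),
    key=lambda item: item[1],
    reverse=True,
)
print(spider)
print(waterfall)
```

The waterfall summary keeps a single number per patient, so the regrowth visible in the last visit of the hypothetical patient P1 is lost, which is the masking of time dynamics the abstract describes.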

19.
Subgroup analyses are a routine part of clinical trials to investigate whether treatment effects are homogeneous across the study population. Graphical approaches play a key role in subgroup analyses to visualise effect sizes of subgroups, to aid the identification of groups that respond differentially, and to communicate the results to a wider audience. Many existing approaches do not capture the core information and are prone to lead to a misinterpretation of the subgroup effects. In this work, we critically appraise existing visualisation techniques, propose useful extensions to increase their utility and attempt to develop an effective visualisation approach. We focus on forest plots, UpSet plots, Galbraith plots, subpopulation treatment effect pattern plots, and contour plots, and comment on other approaches whose utility is more limited. We illustrate the methods using data from a prostate cancer study.

20.
Normal probability plots for a simple random sample and normal probability plots for residuals from linear regression are not treated differently in statistical textbooks. In the statistical literature, 1 − α simultaneous probability intervals for augmenting a normal probability plot for a simple random sample are available. The first purpose of this article is to demonstrate that the tests associated with the 1 − α simultaneous probability intervals for a simple random sample may have a size substantially different from α when applied to the residuals from linear regression. This leads to the second purpose of this article: construction of four normal probability plot-based tests for residuals, which have size α exactly. We then compare the powers of these four graphical tests and a non-graphical test for residuals in order to assess the power performances of the graphical tests and to identify the ones that have better power. Finally, an example is provided to illustrate the methods.
