Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
Residual plots are a standard tool for assessing model fit. When some outcome data are censored, standard residual plots become less appropriate. Here, we develop a new procedure for producing residual plots for linear regression models where some or all of the outcome data are censored. We implement two approaches for incorporating parameter uncertainty. We illustrate our methodology by examining the model fit for an analysis of bacterial load data from a trial for chronic obstructive pulmonary disease. Simulated datasets show that the method can be used when the outcome data consist of a variety of types of censoring.

2.
Summary. We present an approach to the construction of clusters of life course trajectories and use it to obtain ideal types of trajectories that can be interpreted and analysed meaningfully. We represent life courses as sequences on a monthly timescale and apply optimal matching analysis to compute dissimilarities between individuals. We introduce a new divisive clustering algorithm which has features in common with both Ward's agglomerative algorithm and classification and regression trees. We analyse British Household Panel Survey data on the employment and family trajectories of women. Our method produces clusters of sequences for which it is straightforward to determine who belongs to each cluster, making it easier to interpret the relative importance of life course factors in distinguishing subgroups of the population. Moreover, our method gives guidance on selecting the number of clusters.

3.
An index plot of Cook's statistic is frequently used to highlight influential observations. In this article we illustrate how enhanced higher dimensional plots of Cook's statistic can provide further useful information about sets of influential observations. We provide examples using normal and generalized linear models.

4.
The term 'representation bias' is used to describe the disparities that exist between treatment effects estimated from field experiments and those effects that would be seen if treatments were used in the field. In this paper we are specifically concerned with representation bias caused by disease inoculum travelling between plots, or out of the experimental area altogether. The scope for such bias is maximized in the case of airborne spread diseases. This paper extends the work of Deardon et al. (2004), using simulation methods to explore the relationship between design and representation bias. In doing so, we illustrate the importance of plot size and spacing, as well as treatment-to-plot allocation. We examine a novel class of designs, incomplete column designs, to develop an understanding of the mechanisms behind representation bias. We also introduce general methods of designing field trials, which can be used to limit representation bias by carefully controlling treatment-to-block allocation in both incomplete column and incomplete randomized block designs. Finally, we show how the commonly used practice of sampling from the centres of plots, rather than entire plots, can also help to control representation bias.

5.
Waterfall plots are used to describe changes in tumor size observed in clinical studies. They are frequently used to illustrate the overall drug response in oncology clinical trials because of their simple representation of results. Unfortunately, this visual display suffers from a number of limitations including (1) potential misguidance by masking the time dynamics of tumor size, (2) ambiguous labelling of the y-axis, and (3) low data-to-ink ratio. We offer some alternatives to address these shortcomings and recommend moving away from waterfall plots in favor of plots showing the individual time profiles of the sum of lesion diameters (according to RECIST). The spider plot presents the individual changes in tumor measurements over time relative to baseline tumor burden. Baseline tumor size is a well-known confounding factor of drug effect which has to be accounted for when analyzing data in early clinical trials. While spider plots conveniently correct for baseline tumor size, they cannot be presented in isolation. Indeed, percentage change from baseline has suboptimal statistical properties (including a skewed distribution) and can be overly optimistic in favor of drug efficacy. We argue that plots of raw data (referred to as spaghetti plots) should always accompany spider plots to provide an equipoised illustration of the drug effect on lesion diameters.

6.
We propose a flexible method to approximate the subjective cumulative distribution function of an economic agent about the future realization of a continuous random variable. The method can closely approximate a wide variety of distributions while maintaining weak assumptions on the shape of distribution functions. We show how moments and quantiles of general functions of the random variable can be computed analytically and/or numerically. We illustrate the method by revisiting the determinants of income expectations in the United States. A Monte Carlo analysis suggests that a quantile-based flexible approach can be used to successfully deal with censoring and possible rounding levels present in the data. Finally, our analysis suggests that the performance of our flexible approach matches that of a correctly specified parametric approach and is clearly better than that of a misspecified parametric approach.

7.
We describe an approach, termed reified analysis, for linking the behaviour of mathematical models with inferences about the physical systems which the models represent. We describe the logical basis for the approach, based on coherent assessment of the implications of deficiencies in the mathematical model. We show how the statistical analysis may be carried out by specifying stochastic relationships between the model that we have, improved versions of the model that we might construct, and the system itself. We illustrate our approach with an example concerning the potential shutdown of the thermohaline circulation in the Atlantic Ocean.

8.
In sequential pattern analysis, the frequency of patterns is evaluated by the support. Although the support can be computed efficiently from large databases, we show that it cannot be compared between different databases, since it is influenced by the actual sequence length distribution. Models for this sequence length distribution are surveyed. One of these models, the Good distribution, appears to be sufficiently flexible for practice. It is used to exemplify an approach for adjusting the relative support such that the resulting adjusted support values are more comparable between different databases. We illustrate our findings with texts from the bilingual FinDe corpus.

9.
The forward search is a method of robust data analysis in which outlier-free subsets of the data of increasing size are used in model fitting; the data are then ordered by closeness to the model. Here the forward search, with many random starts, is used to cluster multivariate data. These random starts lead to the diagnostic identification of tentative clusters. Application of the forward search to the proposed individual clusters leads to the establishment of cluster membership through the identification of non-cluster members as outlying. The method requires no prior information on the number of clusters and does not seek to classify all observations. These properties are illustrated by the analysis of 200 six-dimensional observations on Swiss banknotes. The importance of linked plots and brushing in elucidating data structures is illustrated. We also provide an automatic method for determining cluster centres and compare the behaviour of our method with model-based clustering. In a simulated example with eight clusters our method provides more stable and accurate solutions than model-based clustering. We consider the computational requirements of both procedures.

10.
The complex Watson distribution is an important simple distribution on the complex sphere which is used in statistical shape analysis. We describe the density, obtain the integrating constant and provide large sample approximations. Maximum likelihood estimation and hypothesis testing procedures for one and two samples are described. The particular connection with shape analysis is discussed and we consider an application examining shape differences between normal and schizophrenic brains. We make some observations about Bayesian shape inference and finally we describe a more general rotationally symmetric family of distributions.

11.
We demonstrate how Bayes linear methods, based on partial prior specifications, bring us quickly to the heart of otherwise complex problems, giving us natural and systematic tools for evaluating our analyses which are not readily available in the usual Bayes formalism. We illustrate the approach using an example concerning problems of prediction in a large brewery. We describe the computer language [B/D] (an acronym for beliefs adjusted by data), which implements the approach. [B/D] incorporates a natural graphical representation of the analysis, providing a powerful way of thinking about the process of knowledge formulation and criticism which is also accessible to non-technical users.

12.
In recent years there has been a rapid growth in the amount of DNA being sequenced and in its availability through genetic databases. Statistical techniques which identify structure within these sequences can be of considerable assistance to molecular biologists, particularly when they incorporate the discrete nature of changes caused by evolutionary processes. This paper focuses on the detection of homogeneous segments within heterogeneous DNA sequences. In particular, we study an intron from the chimpanzee α-fetoprotein gene; this protein plays an important role in the embryonic development of mammals. We present a Bayesian solution to this segmentation problem using a hidden Markov model implemented by Markov chain Monte Carlo methods. We consider the important practical problem of specifying informative prior knowledge about sequences of this type. Two Gibbs sampling algorithms are contrasted and the sensitivity of the analysis to the prior specification is investigated. Model selection and possible ways to overcome the label switching problem are also addressed. Our analysis of intron 7 identifies three distinct homogeneous segment types, two of which occur in more than one region, and one of which is reversible.

13.
Bayesian analysis of dynamic magnetic resonance breast images   (cited 2 times: 0 self-citations, 2 by others)
Summary. We describe an integrated methodology for analysing dynamic magnetic resonance images of the breast. The problems that motivate this methodology arise from a collaborative study with a tumour institute. The methods are developed within the Bayesian framework and comprise image restoration and classification steps. Two different approaches are proposed for the restoration. Bayesian inference is performed by means of Markov chain Monte Carlo algorithms. We make use of a Metropolis algorithm with a specially chosen proposal distribution that performs better than more commonly used proposals. The classification step is based on a few attribute images yielded by the restoration step that describe the essential features of the contrast agent variation over time. Procedures for hyperparameter estimation are provided, thus making our method automatic. The results show the potential of the methodology to extract useful information from acquired dynamic magnetic resonance imaging data about tumour morphology and internal pathophysiological features.

14.
In a previous paper the authors proposed a simple method to extend results about almost sure convergence for weighted sums of real random variables to the case of Banach-valued random elements. The method arises from the extension of Skorohod's Representation Theorem for weakly convergent sequences due to Blackwell and Dubins, applied to the general framework of weakly equivalent tight sequences of probability measures. This provides a scheme which permits us to handle separately a problem that behaves like the Glivenko-Cantelli Theorem and a question on uniform integrability which generally is reduced to the real valued version of the general problem to be solved.

In this paper we prove that Wasserstein's metrics can play the same role as Skorohod's Representation Theorem in the preceding scheme. We also show that our method can be applied to obtain results with respect to various summability methods (Abel, Euler, …) even in the case in which the ‘weights’ are linear operators.

15.
We give a critical synopsis of classical and recent tests for univariate normality, our emphasis being on procedures which are consistent against all alternatives. The power performance of some selected tests (Anderson-Darling, Shapiro-Wilk, Shapiro-Francia, Epps-Pulley) is assessed in a simulation study. Numerical results are illuminated by plots of isodynes, i.e., lines of constant estimated power, for the Johnson system of distributions.

16.
ON BOOTSTRAP HYPOTHESIS TESTING   (cited 2 times: 0 self-citations, 2 by others)
We describe methods for constructing bootstrap hypothesis tests, illustrating our approach using analysis of variance. The importance of pivotalness is discussed. Pivotal statistics usually result in improved accuracy of level. We note that hypothesis tests and confidence intervals call for different methods of resampling, so as to ensure that accurate critical point estimates are obtained in the former case even when data fail to comply with the null hypothesis. Our main points are illustrated by a simulation study and application to three real data sets.
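
The two points made in this abstract, resampling in a way that respects the null hypothesis and using a pivotal statistic, can be sketched for a case simpler than the paper's analysis-of-variance setting. The Python sketch below is an illustration only, not the authors' procedure: it assumes a one-sample test of H0: mean = mu0, and the function names `t_stat` and `bootstrap_mean_test` and the data are hypothetical.

```python
import math
import random
import statistics


def t_stat(sample, mu):
    """Studentized (approximately pivotal) statistic for the mean."""
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.fmean(sample) - mu) / se


def bootstrap_mean_test(data, mu0, n_boot=2000, seed=0):
    """Two-sided bootstrap p-value for H0: mean == mu0 (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(data)
    t_obs = abs(t_stat(data, mu0))
    # Shift the data so that the population we resample from satisfies H0,
    # even when the observed data do not.
    xbar = statistics.fmean(data)
    null_data = [x - xbar + mu0 for x in data]
    exceed = 0
    for _ in range(n_boot):
        resample = rng.choices(null_data, k=n)
        if statistics.stdev(resample) == 0.0:
            exceed += 1  # degenerate resample: count it as extreme (conservative)
            continue
        if abs(t_stat(resample, mu0)) >= t_obs:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_boot + 1)


data = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.7, 6.1, 5.5, 5.9]
p_near = bootstrap_mean_test(data, mu0=5.5)  # null value close to the sample mean
p_far = bootstrap_mean_test(data, mu0=3.0)   # null value far from the sample mean
```

Resampling from the shifted values, rather than from the raw data, is what keeps the estimated critical points meaningful when the data do not satisfy the null hypothesis; a confidence interval would instead resample the unshifted data.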

17.
We describe estimation, learning, and prediction in a treatment-response model with two outcomes. The introduction of potential outcomes in this model introduces four cross-regime correlation parameters that are not contained in the likelihood for the observed data and thus are not identified. Despite this inescapable identification problem, we build upon the results of Koop and Poirier (1997) to describe how learning takes place about the four nonidentified correlations through the imposed positive definiteness of the covariance matrix. We then derive bivariate distributions associated with commonly estimated “treatment parameters” (including the Average Treatment Effect and effect of Treatment on the Treated), and use the learning that takes place about the nonidentified correlations to calculate these densities. We illustrate our points in several generated data experiments and apply our methods to estimate the joint impact of child labor on achievement scores in language and mathematics.

18.
We explore the application of dynamic graphics to the exploratory analysis of spatial data. We introduce a number of new tools and illustrate their use with prototype software, developed at Trinity College, Dublin. These tools are used to examine local variability (anomalies) through plots of the data that display its marginal and multivariate distributions, through interactive smoothers, and through plots motivated by the spatial auto-covariance ideas implicit in the variogram. We regard these as alternative and linked views of the data. We conclude that the most important single view of the data is the Map View: All other views must be cross-referred to this, and the software must encourage this. The view can be enriched by overlaying other pertinent spatial information. We draw attention to the possibilities of one-many linking, and to the use of line-objects to link pairs of data points. We draw attention to the parallels with work on Geographical Information Systems.

19.
Chaotic systems are characterized by sensitivity to initial conditions, and this property can be measured by global Lyapunov exponents, which are measures of the average divergence rate of initially close trajectories. Wolff (1992) introduced local Lyapunov exponents and used them to obtain two diagnostic plots for differentiating between stochastic and deterministic time series. We extend the definition of the local Lyapunov exponent and the diagnostic plots to accommodate time series that arise from bivariate maps and investigate the behaviour of the local Lyapunov exponents and the corresponding diagnostic plots for some dynamical systems and stochastic time series. We consider the application of these diagnostic plots to some heart rate variability data.
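
The notion of a global Lyapunov exponent as an average divergence rate can be made concrete with a small numerical sketch. The Python code below assumes a univariate logistic map, which is chosen here purely for illustration and is not taken from the paper (which concerns local exponents and bivariate maps); the function name `logistic_lyapunov` and its default parameters are hypothetical.

```python
import math


def logistic_lyapunov(r, x0=0.2, n_iter=10000, burn_in=1000):
    """Estimate the global Lyapunov exponent of the map x -> r*x*(1-x).

    The estimate is the average of log|f'(x_t)| along one orbit,
    where f'(x) = r*(1 - 2x).
    """
    x = x0
    for _ in range(burn_in):  # discard the initial transient
        x = r * x * (1.0 - x)
    total, count = 0.0, 0
    for _ in range(n_iter):
        slope = abs(r * (1.0 - 2.0 * x))
        if slope > 0.0:  # skip x = 0.5, where log|f'| is undefined
            total += math.log(slope)
            count += 1
        x = r * x * (1.0 - x)
    return total / count


chaotic = logistic_lyapunov(4.0)  # positive: nearby orbits diverge
stable = logistic_lyapunov(2.5)   # negative: orbits settle on a fixed point
```

A positive estimate indicates sensitivity to initial conditions, while a negative one indicates convergence; for r = 4 the theoretical value is ln 2, and for r = 2.5 it is ln 0.5. Averaging over a short window from a chosen starting point, rather than over the whole orbit, would give a local version of the same quantity.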

20.
Summary. We define residuals for point process models fitted to spatial point pattern data, and we propose diagnostic plots based on them. The residuals apply to any point process model that has a conditional intensity; the model may exhibit spatial heterogeneity, interpoint interaction and dependence on spatial covariates. Some existing ad hoc methods for model checking (quadrat counts, scan statistic, kernel smoothed intensity and Berman's diagnostic) are recovered as special cases. Diagnostic tools are developed systematically, by using an analogy between our spatial residuals and the usual residuals for (non-spatial) generalized linear models. The conditional intensity λ plays the role of the mean response. This makes it possible to adapt existing knowledge about model validation for generalized linear models to the spatial point process context, giving recommendations for diagnostic plots. A plot of smoothed residuals against spatial location, or against a spatial covariate, is effective in diagnosing spatial trend or covariate effects. Q–Q plots of the residuals are effective in diagnosing interpoint interaction.
