Similar literature
Found 20 similar documents (search time: 62 ms)
1.
Computing location depth and regression depth in higher dimensions   (Total citations: 3; self-citations: 0; citations by others: 3)
The location depth (Tukey 1975) of a point $\theta$ relative to a p-dimensional data set Z of size n is defined as the smallest number of data points in a closed halfspace with boundary through $\theta$. For bivariate data, it can be computed in $O(n \log n)$ time (Rousseeuw and Ruts 1996). In this paper we construct an exact algorithm to compute the location depth in three dimensions in $O(n^2 \log n)$ time. We also give an approximate algorithm to compute the location depth in p dimensions in $O(mp^3 + mpn)$ time, where m is the number of p-subsets used. Recently, Rousseeuw and Hubert (1996) defined the depth of a regression fit. The depth of a hyperplane with coefficients $(\theta_1, \ldots, \theta_p)$ is the smallest number of residuals that need to change sign to make $(\theta_1, \ldots, \theta_p)$ a nonfit. For bivariate data (p=2) this depth can be computed in $O(n \log n)$ time as well. We construct an algorithm to compute the regression depth of a plane relative to a three-dimensional data set in $O(n^2 \log n)$ time, and another that deals with p=4 in $O(n^3 \log n)$ time. For data sets with large n and/or p we propose an approximate algorithm that computes the depth of a regression fit in $O(mp^3 + mpn + mn \log n)$ time. For all of these algorithms, actual implementations are made available.
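To make the definition of location depth concrete, here is a minimal Python sketch for the bivariate case. It is not the exact or approximate algorithm of the paper: instead of p-subsets it scans a fixed set of equally spaced directions, so in general it returns only an upper bound on the true depth. The function name `approx_location_depth_2d` and the parameter `n_directions` are illustrative assumptions.

```python
import math


def approx_location_depth_2d(theta, points, n_directions=360):
    """Upper bound on the halfspace (location) depth of theta in 2D.

    For each of n_directions unit vectors u, count the data points z lying
    in the closed halfplane {z : u . (z - theta) >= 0}, whose boundary
    passes through theta, and keep the smallest count seen.
    """
    best = len(points)
    for k in range(n_directions):
        angle = 2.0 * math.pi * k / n_directions
        ux, uy = math.cos(angle), math.sin(angle)
        count = sum(
            1
            for (zx, zy) in points
            if ux * (zx - theta[0]) + uy * (zy - theta[1]) >= 0.0
        )
        best = min(best, count)
    return best
```

For the four corners of a square, the centre has depth 2, while a point outside the convex hull has depth 0. A finer direction grid tightens the bound at a cost that grows linearly in `n_directions`.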

2.
Summary: We describe depth-based graphical displays that show the interdependence of multivariate distributions. The plots involve one-dimensional curves or bivariate scatterplots, so they are easier to interpret than correlation matrices. The correlation curve, modelled on the scale curve of Liu et al. (1999), compares the volume of the observed central regions with the volume under independence. The correlation DD-plot is the scatterplot of depth values under a reference distribution against depth values under independence. The area of the plot gives a measure of distance from independence. Correlation curve and DD-plot require an independence model as a baseline: besides classical parametric specifications, a nonparametric estimator, derived from the randomization principle, is used. Combining data depth and the notion of quadrant dependence, quadrant correlation trajectories are obtained which allow simultaneous representation of subsets of variables. The properties of the plots for the multivariate normal distribution are investigated. Some real data examples are presented. *This work was completed with the support of Ca' Foscari University.

3.
We propose exploratory, easily implemented methods for diagnosing the appropriateness of an underlying copula model for bivariate failure time data, allowing censoring in either or both failure times. It is found that the proposed approach effectively distinguishes gamma from positive stable copula models when the sample is moderately large or the association is strong. Data from the Women's Health and Aging Study (WHAS, Guralnik et al., The Women's Health and Aging Study: Health and Social Characteristics of Older Women with Disability. National Institute on Aging: Bethesda, Maryland, 1995) are analyzed to demonstrate the proposed diagnostic methodology. The positive stable model gives a better overall fit to these data than the gamma frailty model, but it tends to underestimate association at the later time points. The finding is consistent with recent theory differentiating catastrophic from progressive disability onset in older adults. The proposed methods supply an interpretable quantity for copula diagnosis. We hope that they will usefully inform practitioners as to the reasonableness of their modeling choices.

4.
Let X, T, Y be random vectors such that the distribution of Y conditional on covariates partitioned into the vectors X = x and T = t is given by $f(y; x, \phi)$, where $\phi = (\theta, \lambda(t))$. Here $\theta$ is a parameter vector and $\lambda(t)$ is a smooth, real-valued function of t. The joint distribution of X and T is assumed to be independent of $\theta$ and $\lambda$. This semiparametric model is called conditionally parametric because the conditional distribution $f(y; x, \phi)$ of Y given X = x, T = t is parameterized by a finite dimensional parameter $\phi = (\theta, \lambda(t))$. Severini and Wong (1992. Annals of Statistics 20: 1768-1802) show how to estimate $\theta$ and $\lambda(\cdot)$ using generalized profile likelihoods, and they also provide a review of the literature on generalized profile likelihoods. Under specified regularity conditions, they derive an asymptotically efficient estimator of $\theta$ and a uniformly consistent estimator of $\lambda(\cdot)$. The purpose of this paper is to provide a short tutorial for this method of estimation under a likelihood-based model, reviewing results from Stein (1956. Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, University of California Press, Berkeley, pp. 187-196), Severini (1987. Ph.D. Thesis, The University of Chicago, Department of Statistics, Chicago, Illinois), and Severini and Wong (op. cit.).

5.
Comparison of observed mortality with known, background, or standard rates has taken place for several hundred years. With the development of regression models for survival data, an increasing interest has arisen in individualizing the standardisation using covariates of each individual. Also, account sometimes needs to be taken of random variation in the standard group. Emphasizing uses of the Cox regression model, this paper surveys a number of critical choices and pitfalls in this area. The methods are illustrated by comparing survival of liver patients after transplantation with survival after conservative treatment.

6.
The paper presents non-standard methods in evolutionary computation and discusses their applicability to various optimization problems. These methods maintain populations of individuals with nonlinear chromosomal structure and use genetic operators enhanced by problem-specific knowledge.

7.
In this paper, we reconsider the well-known oblique Procrustes problem where the usual least-squares objective function is replaced by a more robust discrepancy measure, based on the $\ell_1$ norm or smooth approximations of it. We propose two approaches to the solution of this problem. One approach is based on convex analysis and uses the structure of the problem to permit a solution to the $\ell_1$ norm problem. An alternative approach is to smooth the problem by working with smooth approximations to the $\ell_1$ norm, and this leads to a solution process based on the solution of ordinary differential equations on manifolds. The general weighted Procrustes problem (both orthogonal and oblique) can also be solved by the latter approach. Numerical examples illustrating the algorithms that have been developed are reported and analyzed.

8.
Implementing partial least squares   (Total citations: 2; self-citations: 0; citations by others: 2)
Partial least squares (PLS) regression has been proposed as an alternative regression technique to more traditional approaches such as principal components regression and ridge regression. A number of algorithms have appeared in the literature which have been shown to be equivalent. Someone wishing to implement PLS regression in a programming language or within a statistical package must choose which algorithm to use. We investigate the implementation of univariate PLS algorithms within FORTRAN and the Matlab (1993) and Splus (1992) environments, comparing theoretical measures of execution speed based on flop counts with their observed execution times. We also comment on the ease with which the algorithms may be implemented in the different environments. Finally, we investigate the merits of using the orthogonal invariance of PLS regression to improve the algorithms.

9.
In this largely expository article, we highlight the significance of various types of dimension for obtaining uniform convergence results in probability theory and we demonstrate how these results lead to certain notions of generalization for classes of binary-valued and real-valued functions. We also present new results on the generalization ability of certain types of artificial neural networks with real output.

10.
Edgoose, T.; Allison, L. Statistics and Computing, 1999, 9(4): 269-278
General purpose unsupervised classification programs have typically assumed independence between observations in the data they analyse. In this paper we report on an extension to the MML classifier Snob which enables the program to take advantage of some of the extra information implicit in ordered datasets (such as time-series). Specifically, the data is modelled as if it were generated from a first order Markov process with as many states as there are classes of observation. The state of such a process at any point in the sequence determines the class from which the corresponding observation is generated. Such a model is commonly referred to as a Hidden Markov Model. The MML calculation for the expected length of a near optimal two-part message stating a specific model of this type and a dataset given this model is presented. Such an estimate enables us to fairly compare models which differ in the number of classes they specify, which in turn can guide a robust unsupervised search of the model space. The new program, tSnob, is tested against both synthetic data and a large real world dataset and is found to make unbiased estimates of model parameters and to conduct an effective search of the extended model space.

11.
Principal curves revisited   (Total citations: 15; self-citations: 0; citations by others: 15)
A principal curve (Hastie and Stuetzle, 1989) is a smooth curve passing through the middle of a distribution or data cloud, and is a generalization of linear principal components. We give an alternative definition of a principal curve, based on a mixture model. Estimation is carried out through an EM algorithm. Some comparisons are made to the Hastie-Stuetzle definition.

12.
Multi-layer perceptrons (MLPs), a common type of artificial neural networks (ANNs), are widely used in computer science and engineering for object recognition, discrimination and classification, and have more recently found use in process monitoring and control. Training such networks is not a straightforward optimisation problem, and we examine features of these networks which contribute to the optimisation difficulty. Although the original perceptron, developed in the late 1950s (Rosenblatt 1958, Widrow and Hoff 1960), had a binary output from each node, this was not compatible with back-propagation and similar training methods for the MLP. Hence the output of each node (and the final network output) was made a differentiable function of the network inputs. We reformulate the MLP model with the original perceptron in mind so that each node in the hidden layers can be considered as a latent (that is, unobserved) Bernoulli random variable. This maintains the property of binary output from the nodes, and with an imposed logistic regression of the hidden layer nodes on the inputs, the expected output of our model is identical to the MLP output with a logistic sigmoid activation function (for the case of one hidden layer). We examine the usual MLP objective function (the sum of squares) and show its multi-modal form and the corresponding optimisation difficulty. We also construct the likelihood for the reformulated latent variable model and maximise it by standard finite mixture ML methods using an EM algorithm, which provides stable ML estimates from random starting positions without the need for regularisation or cross-validation. Over-fitting of the number of nodes does not affect this stability. This algorithm is closely related to the EM algorithm of Jordan and Jacobs (1994) for the Mixture of Experts model. We conclude with some general comments on the relation between the MLP and latent variable models.

13.
Each cell of a two-dimensional lattice is painted one of $\kappa$ colors, arranged in a color wheel. The colors advance ($k$ to $k+1 \bmod \kappa$) either automatically or by contact with at least a threshold number of successor colors in a prescribed local neighborhood. Discrete-time parallel systems of this sort in which color 0 updates by contact and the rest update automatically are called Greenberg-Hastings (GH) rules. A system in which all colors update by contact is called a cyclic cellular automaton (CCA). Started from appropriate initial conditions, these models generate periodic traveling waves. Started from random configurations, the same rules exhibit complex self-organization, typically characterized by nucleation of locally periodic ram's horns or spirals. Corresponding random processes give rise to a variety of forest fire equilibria that display large-scale stochastic wave fronts. This paper describes a framework, theoretically based, but relying on extensive interactive computer graphics experimentation, for investigation of the complex dynamics shared by excitable media in a broad spectrum of scientific contexts. By focusing on simple mathematical prototypes we hope to obtain a better understanding of the basic organizational principles underlying spatially distributed oscillating systems.
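The contact rule for a cyclic cellular automaton can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the neighborhood is taken to be the four nearest cells with wrap-around edges, and the names `cca_step`, `n_colors` and `threshold` are hypothetical, not taken from the paper.

```python
def cca_step(grid, n_colors, threshold=1):
    """One synchronous update of a cyclic cellular automaton.

    A cell with color k advances to (k + 1) mod n_colors when at least
    `threshold` of its four nearest neighbours (toroidal wrap-around, an
    assumption of this sketch) already show that successor color.
    """
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]  # all cells read the old state
    for r in range(rows):
        for c in range(cols):
            successor = (grid[r][c] + 1) % n_colors
            neighbours = (
                grid[(r - 1) % rows][c],
                grid[(r + 1) % rows][c],
                grid[r][(c - 1) % cols],
                grid[r][(c + 1) % cols],
            )
            if sum(1 for v in neighbours if v == successor) >= threshold:
                new_grid[r][c] = successor
    return new_grid
```

Starting from a 3 x 3 grid of color 0 with a single cell of color 1 in the middle, one step turns the four adjacent cells to color 1 and leaves the rest unchanged. A Greenberg-Hastings variant would apply the contact test only to color 0 and advance every other color unconditionally.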

14.
When simulating a dynamical system, the computation is actually of a spatially discretized system, because finite machine arithmetic replaces continuum state space. For chaotic dynamical systems, the discretized simulations often have collapsing effects, to a fixed point or to short cycles. Statistical properties of these phenomena can be modelled with random mappings with an absorbing centre. The model gives results which are very much in line with computational experiments. The effects are discussed with special reference to the family of mappings $f_\ell(x) = 1 - |1 - 2x|^\ell$, $x \in [0, 1]$, $1 < \ell < \infty$. Computer experiments show close agreement with predictions of the model.

15.
We present a new test for the presence of a normal mixture distribution, based on the posterior Bayes factor of Aitkin (1991). The new test has slightly lower power than the likelihood ratio test. It does not require the computation of the MLEs of the parameters or a search for multiple maxima, but requires computations based on classification likelihood assignments of observations to mixture components.

16.
We introduce a simple combinatorial scheme for systematically running through a complete enumeration of sample reuse procedures such as the bootstrap, Hartigan's subsets, and various permutation tests. The scheme is based on Gray codes which give tours through various spaces, changing only one or two points at a time. We use updating algorithms to avoid recomputing statistics and achieve substantial speedups. Several practical examples and computer codes are given.
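The idea of touring all subsets while changing one point at a time can be illustrated with the standard binary reflected Gray code. The sketch below is an assumption-laden example, not the authors' code: it covers only the subset case, uses a running sum as the statistic, and the name `gray_code_subset_sums` is hypothetical.

```python
def gray_code_subset_sums(values):
    """Yield (mask, subset_sum) for every subset of `values`.

    Subsets are visited in binary reflected Gray code order, so each step
    adds or removes exactly one element and the sum is updated in O(1)
    instead of being recomputed from scratch.
    """
    n = len(values)
    mask = 0   # bit j set means values[j] is in the current subset
    total = 0
    yield mask, total
    for k in range(1, 1 << n):
        # The k-th Gray code differs from the previous one in the
        # position of the lowest set bit of k.
        j = (k & -k).bit_length() - 1
        mask ^= 1 << j
        if mask & (1 << j):
            total += values[j]   # element j just entered the subset
        else:
            total -= values[j]   # element j just left the subset
        yield mask, total
```

The same pattern applies to any statistic that admits a cheap update when a single observation enters or leaves the sample.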

17.
I present a new Markov chain sampling method appropriate for distributions with isolated modes. Like the recently developed method of simulated tempering, the tempered transition method uses a series of distributions that interpolate between the distribution of interest and a distribution for which sampling is easier. The new method has the advantage that it does not require approximate values for the normalizing constants of these distributions, which are needed for simulated tempering, and can be tedious to estimate. Simulated tempering performs a random walk along the series of distributions used. In contrast, the tempered transitions of the new method move systematically from the desired distribution, to the easily sampled distribution, and back to the desired distribution. This systematic movement avoids the inefficiency of a random walk, an advantage that is unfortunately cancelled by an increase in the number of interpolating distributions required. Because of this, the sampling efficiency of the tempered transition method in simple problems is similar to that of simulated tempering. On more complex distributions, however, simulated tempering and tempered transitions may perform differently. Which is better depends on the ways in which the interpolating distributions are deceptive.

18.
In the exponential regression model, Bayesian inference concerning the non-linear regression parameter $\rho$ has proved extremely difficult. In particular, standard improper diffuse priors for the usual parameters lead to an improper posterior for the non-linear regression parameter. In a recent paper Ye and Berger (1991) applied the reference prior approach of Bernardo (1979) and Berger and Bernardo (1989), yielding a proper informative prior for $\rho$. This prior depends on the values of the explanatory variable, goes to 0 as $\rho$ goes to 1, and depends on the specification of a hierarchical ordering of importance of the parameters. This paper explains the failure of the uniform prior to give a proper posterior: the reason is the appearance of the determinant of the information matrix in the posterior density for $\rho$. We apply the posterior Bayes factor approach of Aitkin (1991) to this problem; in this approach we integrate out nuisance parameters with respect to their conditional posterior density given the parameter of interest. The resulting integrated likelihood for $\rho$ requires only the standard diffuse prior for all the parameters, and is unaffected by orderings of importance of the parameters. Computation of the likelihood for $\rho$ is extremely simple. The approach is applied to the three examples discussed by Ye and Berger and the likelihoods compared with their posterior densities.

19.
Summary: Data depth is a concept that measures the centrality of a point in a given data cloud $x_1, x_2, \ldots, x_n$ or in a multivariate distribution $P_X$ on $\mathbb{R}^d$. Every depth defines a family of so-called trimmed regions. The $\alpha$-trimmed region is given by the set of points that have a depth of at least $\alpha$. Data depth has been used to define multivariate measures of location and dispersion as well as multivariate dispersion orders. If the depth of a point can be represented as the minimum of the depths with respect to all unidimensional projections, we say that the depth satisfies the (weak) projection property. Many depths which have been proposed in the literature can be shown to satisfy the weak projection property. A depth is said to satisfy the strong projection property if for every $\alpha$ the unidimensional projection of the $\alpha$-trimmed region equals the $\alpha$-trimmed region of the projected distribution. After a short introduction to the general concept of data depth we formally define the weak and the strong projection property and give necessary and sufficient criteria for the projection property to hold. We further show that the projection property facilitates the construction of depths from univariate trimmed regions. We discuss some of the depths proposed in the literature which possess the projection property and define a general class of projection depths, which are constructed from univariate trimmed regions by using the above method. Finally, algorithmic aspects of projection depths are discussed. We describe an algorithm which enables the approximate computation of depths that satisfy the projection property.

20.
The generalized odds-rate class of regression models for time to event data is indexed by a non-negative constant $\rho$ and assumes that $g_\rho(S(t|Z)) = \alpha(t) + \beta' Z$, where $g_\rho(s) = \log(\rho^{-1}(s^{-\rho} - 1))$ for $\rho > 0$, $g_0(s) = \log(-\log s)$, $S(t|Z)$ is the survival function of the time to event for an individual with $q \times 1$ covariate vector $Z$, $\beta$ is a $q \times 1$ vector of unknown regression parameters, and $\alpha(t)$ is some arbitrary increasing function of $t$. When $\rho = 0$, this model is equivalent to the proportional hazards model and when $\rho = 1$, this model reduces to the proportional odds model. In the presence of right censoring, we construct estimators for $\beta$ and $\exp(\alpha(t))$ and show that they are consistent and asymptotically normal. In addition, we show that the estimator for $\beta$ is semiparametric efficient in the sense that it attains the semiparametric variance bound.
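The two special cases of the link function can be checked numerically. The sketch below assumes the link has the form log((s^(-rho) - 1) / rho) for rho > 0 and log(-log s) for rho = 0, with rho denoting the non-negative index of the class; the function name `odds_rate_link` is hypothetical.

```python
import math


def odds_rate_link(s, rho):
    """Link g_rho(s) of the generalized odds-rate class, for 0 < s < 1.

    Assumed form: log((s**(-rho) - 1) / rho) for rho > 0, and the limiting
    complementary log-log link log(-log(s)) for rho == 0.
    """
    if not 0.0 < s < 1.0:
        raise ValueError("s must lie strictly between 0 and 1")
    if rho < 0.0:
        raise ValueError("rho must be non-negative")
    if rho == 0.0:
        return math.log(-math.log(s))
    # s**(-rho) - 1 == expm1(-rho * log(s)); expm1 stays accurate
    # when rho is close to zero.
    return math.log(math.expm1(-rho * math.log(s)) / rho)
```

With rho = 1 the function returns the log odds of failure, log((1 - s) / s), which is the proportional odds case. As rho shrinks toward 0 the value approaches log(-log s), the proportional hazards case.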
