1.

Computation of exact confidence intervals from discrete data using studentized test statistics

Paul Kabaila 《Statistics and Computing》2005,15(1):71-78

A new area of research interest is the computation of exact confidence limits or intervals for a scalar parameter of interest $\theta$ from discrete data by inverting a hypothesis test based on a studentized test statistic. See, for example, Chan and Zhang (1999), Agresti and Min (2001) and Agresti (2003), who deal with a difference of binomial probabilities, and Agresti and Min (2002), who deal with an odds ratio. However, neither (1) a detailed analysis of the computational issues involved nor (2) a reliable method of computation that deals effectively with these issues is currently available. In this paper we solve these two problems for a very broad class of discrete data models. We suppose that the distribution of the data is determined by $(\theta, \psi)$, where $\psi$ is a nuisance parameter vector. We also consider six different studentized test statistics. Our contributions to (1) are as follows. We show that the P-value resulting from the hypothesis test, considered as a function of the null-hypothesized value of $\theta$, has both jump and drop discontinuities. Numerical examples are used to demonstrate that these discontinuities lead to the failure of simple-minded approaches to the computation of the confidence limit or interval. We also provide a new method for efficiently computing the set of all possible locations of these discontinuities. Our contribution to (2) is to provide a new and reliable method of computing the confidence limit or interval, based on the knowledge of this set.

2.

Computing location depth and regression depth in higher dimensions (total citations: 3; self-citations: 0; citations by others: 3)

Peter J. Rousseeuw Anja Struyf 《Statistics and Computing》1998,8(3):193-203

The location depth (Tukey 1975) of a point $\theta$ relative to a p-dimensional data set Z of size n is defined as the smallest number of data points in a closed halfspace with boundary through $\theta$. For bivariate data, it can be computed in $O(n \log n)$ time (Rousseeuw and Ruts 1996). In this paper we construct an exact algorithm to compute the location depth in three dimensions in $O(n^2 \log n)$ time. We also give an approximate algorithm to compute the location depth in p dimensions in $O(mp^3 + mpn)$ time, where m is the number of p-subsets used. Recently, Rousseeuw and Hubert (1996) defined the depth of a regression fit. The depth of a hyperplane with coefficients $(\theta_1, \ldots, \theta_p)$ is the smallest number of residuals that need to change sign to make $(\theta_1, \ldots, \theta_p)$ a nonfit. For bivariate data (p = 2) this depth can be computed in $O(n \log n)$ time as well. We construct an algorithm to compute the regression depth of a plane relative to a three-dimensional data set in $O(n^2 \log n)$ time, and another that deals with p = 4 in $O(n^3 \log n)$ time. For data sets with large n and/or p we propose an approximate algorithm that computes the depth of a regression fit in $O(mp^3 + mpn + mn \log n)$ time. For all of these algorithms, actual implementations are made available.
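
The definition of location depth in the bivariate case can be illustrated with a brute-force sketch. This is not the exact or the p-subset-based approximate algorithm of the paper: it simply scans an evenly spaced grid of directions, so it returns an upper bound on the depth. The function name `approx_location_depth` and the parameter `n_directions` are hypothetical.

```python
import math

def approx_location_depth(theta, points, n_directions=360):
    """Upper bound on the location depth of `theta` relative to 2-D `points`.

    For each direction u on an evenly spaced grid, count the data points z
    in the closed halfplane {z : u . (z - theta) >= 0} and keep the smallest
    count. The exact depth minimises over all directions, so a grid search
    can only overestimate it.
    """
    tx, ty = theta
    best = len(points)
    for k in range(n_directions):
        angle = 2.0 * math.pi * k / n_directions
        ux, uy = math.cos(angle), math.sin(angle)
        count = sum(1 for (zx, zy) in points
                    if ux * (zx - tx) + uy * (zy - ty) >= 0.0)
        best = min(best, count)
    return best

# Hypothetical toy data: the four corners of a square.
square = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
print(approx_location_depth((0.0, 0.0), square))  # centre of the square
print(approx_location_depth((5.0, 5.0), square))  # far outside the data
```

The cost is proportional to `n_directions` times n, which is why exact algorithms that sort the data angles around the point are preferred when n is large.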

3.

Interpolation models with multiple hyperparameters

DAVID J. C. MACKAY RYO TAKEUCHI 《Statistics and Computing》1998,8(1):15-23

A traditional interpolation model is characterized by the choice of regularizer applied to the interpolant, and the choice of noise model. Typically, the regularizer has a single regularization constant $\alpha$, and the noise model has a single parameter $\beta$. The ratio $\alpha/\beta$ alone is responsible for determining globally all these attributes of the interpolant: its complexity, flexibility, smoothness, characteristic scale length, and characteristic amplitude. We suggest that interpolation models should be able to capture more than just one flavour of simplicity and complexity. We describe Bayesian models in which the interpolant has a smoothness that varies spatially. We emphasize the importance, in practical implementation, of the concept of conditional convexity when designing models with many hyperparameters. We apply the new models to the interpolation of neuronal spike data and demonstrate a substantial improvement in generalization error.

4.

Historical controls and modern survival analysis

Niels Keiding 《Lifetime data analysis》1995,1(1):19-25

Comparison of observed mortality with known, background, or standard rates has taken place for several hundred years. With the development of regression models for survival data, there has been increasing interest in individualizing the standardisation using covariates of each individual. Also, account sometimes needs to be taken of random variation in the standard group. Emphasizing uses of the Cox regression model, this paper surveys a number of critical choices and pitfalls in this area. The methods are illustrated by comparing survival of liver patients after transplantation with survival after conservative treatment.

5.

Simple boundary correction for kernel density estimation (total citations: 8; self-citations: 0; citations by others: 8)

M. C. Jones 《Statistics and Computing》1993,3(3):135-146

If a probability density function has bounded support, kernel density estimates often overspill the boundaries and are consequently especially biased at and near these edges. In this paper, we consider the alleviation of this boundary problem. A simple unified framework is provided which covers a number of straightforward methods and allows for their comparison: generalized jackknifing generates a variety of simple boundary kernel formulae. A well-known method of Rice (1984) is a special case. A popular linear correction method is another: it has close connections with the boundary properties of local linear fitting (Fan and Gijbels, 1992). Links with the optimal boundary kernels of Müller (1991) are investigated. Novel boundary kernels involving kernel derivatives and generalized reflection arise too. In comparisons, various generalized jackknifing methods perform rather similarly, so this, together with its existing popularity, makes linear correction as good a method as any. In an as yet unsuccessful attempt to improve on generalized jackknifing, a variety of alternative approaches is considered. A further contribution is to consider generalized jackknife boundary correction for density derivative estimation. En route to all this, a natural analogue of local polynomial regression for density estimation is defined and discussed.
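
The overspill problem described above can be seen numerically. The sketch below is not the generalized jackknife or the linear correction studied in the paper; it uses plain reflection about the boundary, a simpler textbook correction, only to show how much probability mass an uncorrected estimate loses at a boundary at 0. The sample, the bandwidth and all function names are hypothetical.

```python
import math

def gaussian_kernel(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde_plain(x, data, h):
    """Ordinary kernel density estimate; leaks mass below the boundary at 0."""
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (len(data) * h)

def kde_reflected(x, data, h):
    """Reflection estimate on [0, inf): each data point is mirrored about 0."""
    if x < 0.0:
        return 0.0
    return sum(gaussian_kernel((x - xi) / h) + gaussian_kernel((x + xi) / h)
               for xi in data) / (len(data) * h)

def integrate(f, a, b, steps=5000):
    """Composite trapezoidal rule on [a, b]."""
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * width)
    return total * width

data = [0.1, 0.3, 0.5, 1.2]   # hypothetical sample supported on [0, inf)
h = 0.3                        # hypothetical bandwidth
mass_plain = integrate(lambda x: kde_plain(x, data, h), 0.0, 6.0)
mass_reflected = integrate(lambda x: kde_reflected(x, data, h), 0.0, 6.0)
print(mass_plain, mass_reflected)
```

Reflection restores the total mass on the support, but it forces a zero slope at the boundary, which is one reason boundary kernels and linear corrections of the kind compared in the paper are of interest.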

6.

Threshold-range scaling of excitable cellular automata

Robert Fisch Janko Gravner David Griffeath 《Statistics and Computing》1991,1(1):23-39

Each cell of a two-dimensional lattice is painted one of $\kappa$ colors, arranged in a color wheel. The colors advance (k to k+1 mod $\kappa$) either automatically or by contact with at least a threshold number of successor colors in a prescribed local neighborhood. Discrete-time parallel systems of this sort in which color 0 updates by contact and the rest update automatically are called Greenberg-Hastings (GH) rules. A system in which all colors update by contact is called a cyclic cellular automaton (CCA). Started from appropriate initial conditions, these models generate periodic traveling waves. Started from random configurations the same rules exhibit complex self-organization, typically characterized by nucleation of locally periodic ram's horns or spirals. Corresponding random processes give rise to a variety of forest fire equilibria that display large-scale stochastic wave fronts. This paper describes a framework, theoretically based, but relying on extensive interactive computer graphics experimentation, for investigation of the complex dynamics shared by excitable media in a broad spectrum of scientific contexts. By focusing on simple mathematical prototypes we hope to obtain a better understanding of the basic organizational principles underlying spatially distributed oscillating systems.
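
The two update rules described above can be sketched in a few lines. The following is a minimal illustration under stated assumptions rather than the authors' software: it assumes a range-1 von Neumann neighborhood (four nearest cells) on a grid with wrap-around edges, and the names `step`, `n_colors`, `threshold` and `rule` are hypothetical.

```python
def step(grid, n_colors, threshold, rule="cca"):
    """One synchronous update on a torus with the 4-cell von Neumann neighbourhood.

    rule="cca": every colour advances only by contact.
    rule="gh":  colour 0 advances by contact, all other colours automatically.
    """
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            colour = grid[r][c]
            successor = (colour + 1) % n_colors
            if rule == "gh" and colour != 0:
                new_grid[r][c] = successor      # automatic advance
                continue
            neighbours = (
                grid[(r - 1) % rows][c], grid[(r + 1) % rows][c],
                grid[r][(c - 1) % cols], grid[r][(c + 1) % cols],
            )
            contacts = sum(1 for n in neighbours if n == successor)
            if contacts >= threshold:
                new_grid[r][c] = successor      # advance by contact
    return new_grid

# Hypothetical 3x3 start: a single cell of colour 1 in a sea of colour 0.
start = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]
for row in step(start, n_colors=3, threshold=1, rule="cca"):
    print(row)
```

Larger neighborhoods and thresholds, which are the subject of the threshold-range scaling in the title, would replace the fixed four-cell tuple with a loop over all offsets within the chosen range.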

7.

Posterior Bayes factor analysis for an exponential regression model

Murray Aitkin 《Statistics and Computing》1993,3(1):17-22

In the exponential regression model, Bayesian inference concerning the non-linear regression parameter has proved extremely difficult. In particular, standard improper diffuse priors for the usual parameters lead to an improper posterior for the non-linear regression parameter. In a recent paper Ye and Berger (1991) applied the reference prior approach of Bernardo (1979) and Berger and Bernardo (1989), yielding a proper informative prior for this parameter. This prior depends on the values of the explanatory variable, goes to 0 as the parameter goes to 1, and depends on the specification of a hierarchical ordering of importance of the parameters. This paper explains the failure of the uniform prior to give a proper posterior: the reason is the appearance of the determinant of the information matrix in the posterior density for the non-linear parameter. We apply the posterior Bayes factor approach of Aitkin (1991) to this problem; in this approach we integrate out nuisance parameters with respect to their conditional posterior density given the parameter of interest. The resulting integrated likelihood for the non-linear parameter requires only the standard diffuse prior for all the parameters, and is unaffected by orderings of importance of the parameters. Computation of this likelihood is extremely simple. The approach is applied to the three examples discussed by Berger and Ye and the likelihoods compared with their posterior densities.

8.

Applications of a general propagation algorithm for probabilistic expert systems (total citations: 4; self-citations: 0; citations by others: 4)

A. P. Dawid 《Statistics and Computing》1992,2(1):25-36

A probabilistic expert system provides a graphical representation of a joint probability distribution which can be used to simplify and localize calculations. Jensen et al. (1990) introduced a flow-propagation algorithm for calculating marginal and conditional distributions in such a system. This paper analyses that algorithm in detail, and shows how it can be modified to perform other tasks, including maximization of the joint density and simultaneous fast retraction of evidence entered on several variables.

9.

Sampling based approach for one-hit and multi-hit models in quantal bioassay

CHU HUI-MAY KUO LYNN 《Statistics and Computing》1997,7(3):183-192

Bayesian methods for estimating the dose response curves with the one-hit model, the gamma multi-hit model, and their modified versions with Abbott's correction are studied. The Gibbs sampling approach with data augmentation and with the Metropolis algorithm is employed to compute the Bayes estimates of the potency curves. In addition, estimation of the relative additional risk and the virtually safe dose is studied. Model selection based on conditional predictive ordinates from cross-validated data is developed.

10.

An explanation of generalized profile likelihoods

Joan G. Staniswalis Peter F. Thall 《Statistics and Computing》2001,11(4):293-298

Let X, T, Y be random vectors such that the distribution of Y conditional on covariates partitioned into the vectors X = x and T = t is given by $f(y; x, \phi)$, where $\phi = (\theta, \eta(t))$. Here $\theta$ is a parameter vector and $\eta(t)$ is a smooth, real-valued function of t. The joint distribution of X and T is assumed to be independent of $\theta$ and $\eta$. This semiparametric model is called conditionally parametric because the conditional distribution $f(y; x, \phi)$ of Y given X = x, T = t is parameterized by a finite dimensional parameter $\phi = (\theta, \eta(t))$. Severini and Wong (1992. Annals of Statistics 20: 1768–1802) show how to estimate $\theta$ and $\eta(\cdot)$ using generalized profile likelihoods, and they also provide a review of the literature on generalized profile likelihoods. Under specified regularity conditions, they derive an asymptotically efficient estimator of $\theta$ and a uniformly consistent estimator of $\eta(\cdot)$. The purpose of this paper is to provide a short tutorial for this method of estimation under a likelihood-based model, reviewing results from Stein (1956. Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, University of California Press, Berkeley, pp. 187–196), Severini (1987. Ph.D. Thesis, The University of Chicago, Department of Statistics, Chicago, Illinois), and Severini and Wong (op. cit.).

11.

Sampling from multimodal distributions using tempered transitions

Radford M. Neal 《Statistics and Computing》1996,6(4):353-366

I present a new Markov chain sampling method appropriate for distributions with isolated modes. Like the recently developed method of simulated tempering, the tempered transition method uses a series of distributions that interpolate between the distribution of interest and a distribution for which sampling is easier. The new method has the advantage that it does not require approximate values for the normalizing constants of these distributions, which are needed for simulated tempering, and can be tedious to estimate. Simulated tempering performs a random walk along the series of distributions used. In contrast, the tempered transitions of the new method move systematically from the desired distribution, to the easily-sampled distribution, and back to the desired distribution. This systematic movement avoids the inefficiency of a random walk, an advantage that is unfortunately cancelled by an increase in the number of interpolating distributions required. Because of this, the sampling efficiency of the tempered transition method in simple problems is similar to that of simulated tempering. On more complex distributions, however, simulated tempering and tempered transitions may perform differently. Which is better depends on the ways in which the interpolating distributions are deceptive.

12.

Generating random numbers of prescribed distribution using physical sources

Daniel Neuenschwander Hansmartin Zeuner 《Statistics and Computing》2003,13(1):5-11

When constructing uniform random numbers in [0, 1] from the output of a physical device, usually n independent and unbiased bits $B_j$ are extracted and combined into the machine number $Y = \sum_{j=1}^{n} 2^{-j} B_j$. In order to reduce the number of data used to build one real number, we observe that for independent and exponentially distributed random variables $X_n$ (which arise, for example, as waiting times between two consecutive impulses of a Geiger counter) the variable $U_n := X_{2n-1}/(X_{2n-1} + X_{2n})$ is uniform in [0, 1]. In practical applications $X_n$ can only be measured up to a given precision (in terms of the expectation of the $X_n$); it is shown that the distribution function obtained by calculating $U_n$ from these measurements differs from the uniform by less than half of that precision. We compare this deviation with the error resulting from the use of biased bits $B_j$ with $P\{B_j = 1\} = 1/2 + \varepsilon$ (where $\varepsilon \in \,]-1/2, 1/2[$) in the construction of Y above. The influence of a bias is given by the estimate that in the p-total variation norm $\|Q\|^{TV}_p = (\sum_{\omega} |Q(\omega)|^p)^{1/p}$ ($p \ge 1$) we have $\|P_Y - P^0_Y\|^{TV}_p \le (c_n \cdot \varepsilon)^{1/p}$ with $c_n \to p$ for $n \to \infty$. For the distribution function, $\|F_Y - F^0_Y\| \le 2(1 - 2^{-n})|\varepsilon|$ holds.
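
The claim that the ratio of one exponential waiting time to the sum of two is uniform on [0, 1] is easy to check by simulation. The sketch below replaces the physical source with Python's pseudo-random exponential generator, so it only illustrates the transformation, not the measurement-precision analysis of the paper. The rate value and the function names are hypothetical.

```python
import random

def uniform_from_waiting_times(x_odd, x_even):
    """Map two independent exponential waiting times to a value in [0, 1]."""
    return x_odd / (x_odd + x_even)

def ks_distance_to_uniform(sample):
    """Kolmogorov-Smirnov distance between the sample and Uniform[0, 1]."""
    ordered = sorted(sample)
    n = len(ordered)
    return max(max((i + 1) / n - u, u - i / n) for i, u in enumerate(ordered))

rng = random.Random(0)          # fixed seed so the run is reproducible
rate = 3.0                      # hypothetical pulse rate; the ratio is scale-free
sample = [uniform_from_waiting_times(rng.expovariate(rate), rng.expovariate(rate))
          for _ in range(20000)]
print(sum(sample) / len(sample), ks_distance_to_uniform(sample))
```

Because the ratio does not depend on the common rate of the two waiting times, the intensity of the source does not need to be known, only that it is the same for both measurements.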

13.

A Diagnostic for Association in Bivariate Survival Models

Chen MC Bandeen-Roche K 《Lifetime data analysis》2005,11(2):245-264

We propose exploratory, easily implemented methods for diagnosing the appropriateness of an underlying copula model for bivariate failure time data, allowing censoring in either or both failure times. It is found that the proposed approach effectively distinguishes gamma from positive stable copula models when the sample is moderately large or the association is strong. Data from the Women's Health and Aging Study (WHAS, Guralnik et al., The Women's Health and Aging Study: Health and Social Characteristics of Older Women with Disability. National Institute on Aging: Bethesda, Maryland, 1995) are analyzed to demonstrate the proposed diagnostic methodology. The positive stable model gives a better overall fit to these data than the gamma frailty model, but it tends to underestimate association at the later time points. The finding is consistent with recent theory differentiating catastrophic from progressive disability onset in older adults. The proposed methods supply an interpretable quantity for copula diagnosis. We hope that they will usefully inform practitioners as to the reasonableness of their modeling choices.

14.

Semiparametric Efficient Estimation in the Generalized Odds-Rate Class of Regression Models for Right-Censored Time-to-Event Data

Scharfstein Daniel O. Tsiatis Anastasios A. Gilbert Peter B. 《Lifetime data analysis》1998,4(4):355-391

The generalized odds-rate class of regression models for time to event data is indexed by a non-negative constant $\rho$ and assumes that $g_\rho(S(t|Z)) = \alpha(t) + \beta' Z$, where $g_\rho(s) = \log(\rho^{-1}(s^{-\rho} - 1))$ for $\rho > 0$, $g_0(s) = \log(-\log s)$, $S(t|Z)$ is the survival function of the time to event for an individual with $q \times 1$ covariate vector Z, $\beta$ is a $q \times 1$ vector of unknown regression parameters, and $\alpha(t)$ is some arbitrary increasing function of t. When $\rho = 0$, this model is equivalent to the proportional hazards model and when $\rho = 1$, this model reduces to the proportional odds model. In the presence of right censoring, we construct estimators for $\beta$ and $\exp(\alpha(t))$ and show that they are consistent and asymptotically normal. In addition, we show that the estimator for $\beta$ is semiparametric efficient in the sense that it attains the semiparametric variance bound.
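
The two special cases named in the abstract can be checked directly. The short derivation below writes $\rho$ for the non-negative index, $\alpha(t)$ for the increasing function and $\beta$ for the regression vector. These symbol names, and the exact form of the transform for $\rho > 0$, are assumptions adopted here because the extracted text lost the original symbols; the form used is the one whose limit as $\rho \to 0$ matches the stated $g_0(s) = \log(-\log s)$.

```latex
% Assumed notation: \rho (index), \alpha(t) (baseline function), \beta (coefficients).
\begin{align*}
g_\rho(s) &= \log\!\left(\frac{s^{-\rho} - 1}{\rho}\right) \quad (\rho > 0),
\qquad g_0(s) = \log(-\log s). \\[4pt]
\text{Limit } \rho \to 0:\quad
  & s^{-\rho} = e^{-\rho \log s} = 1 - \rho \log s + O(\rho^2)
  \;\Longrightarrow\; \frac{s^{-\rho} - 1}{\rho} \to -\log s,
  \text{ so } g_\rho(s) \to g_0(s). \\
  & \log\{-\log S(t \mid Z)\} = \alpha(t) + \beta' Z
  \;\Longleftrightarrow\; -\log S(t \mid Z) = e^{\alpha(t)}\, e^{\beta' Z}
  \quad \text{(proportional hazards)}. \\[4pt]
\text{Case } \rho = 1:\quad
  & g_1(s) = \log\frac{1 - s}{s}
  \;\Longrightarrow\; \frac{1 - S(t \mid Z)}{S(t \mid Z)} = e^{\alpha(t)}\, e^{\beta' Z}
  \quad \text{(proportional odds)}.
\end{align*}
```

In the first case the cumulative hazard $-\log S(t \mid Z)$ factorizes into a baseline term and a covariate term; in the second the odds of failure by time t do, which is what the names of the two models refer to.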

15.

Non-standard methods in evolutionary computation

Zbigniew Michalewicz 《Statistics and Computing》1994,4(2):141-155

The paper presents non-standard methods in evolutionary computation and discusses their applicability to various optimization problems. These methods maintain populations of individuals with nonlinear chromosomal structure and use genetic operators enhanced by the problem specific knowledge.

16.

Preisindikatoren für Wohnimmobilien in Deutschland [Price indicators for residential property in Germany]

Hans-Albert Leifer 《Allgemeines Statistisches Archiv》2004,88(4):435-450

Summary: Asset prices in general and property prices in particular have gained increasing importance in recent years. Whereas in the late 1980s (after the stock market crash in autumn 1987) and in the last decade these prices mainly came under the heading of asset-price inflation/deflation, the focus has recently shifted to sustainable and viable financial systems. The notes primarily explain why the Bundesbank is involved in this area of price statistics, when this involvement began and what underlying data the Bundesbank uses. At the same time, they not only indicate the large degree of uncertainty in the reported data but also highlight the second-best nature of the calculations.

*Paper presented at the 9th conference "Messen der Teuerung" (Measuring Inflation), 17-18 June 2004, Marburg. The author expresses his personal opinion, which does not necessarily coincide with that of the Deutsche Bundesbank.

17.

Coaching variables for regression and classification

ROBERT TIBSHIRANI GEOFFREY HINTON 《Statistics and Computing》1998,8(1):25-33

In a regression or classification setting where we wish to predict Y from $x_1, x_2, \ldots, x_p$, we suppose that an additional set of coaching variables $z_1, z_2, \ldots, z_m$ is available in our training sample. These might be variables that are difficult to measure, and they will not be available when we predict Y from $x_1, x_2, \ldots, x_p$ in the future. We consider two methods of making use of the coaching variables in order to improve the prediction of Y from $x_1, x_2, \ldots, x_p$. The relative merits of these approaches are discussed and compared in a number of examples.

18.

The Statistics of Simulating Chaos

Phil Diamond Alexei Pokrovskii 《Statistics and Computing》2001,11(3):217-228

When simulating a dynamical system, the computation is actually of a spatially discretized system, because finite machine arithmetic replaces continuum state space. For chaotic dynamical systems, the discretized simulations often have collapsing effects, to a fixed point or to short cycles. Statistical properties of these phenomena can be modelled with random mappings with an absorbing centre. The model gives results which are very much in line with computational experiments. The effects are discussed with special reference to the family of mappings $f_\ell(x) = 1 - |1 - 2x|^{\ell}$, $x \in [0, 1]$, $1 < \ell < \infty$. Computer experiments show close agreement with predictions of the model.
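
The collapse to a fixed point can be reproduced with a very small experiment. The sketch below uses the plain tent map x ↦ 1 − |1 − 2x| in IEEE double precision, chosen here only because its behaviour in binary floating point is easy to reason about; it is an illustration of the collapsing effect, not one of the experiments reported in the paper. The function names are hypothetical.

```python
def tent(x):
    """Tent map on [0, 1], evaluated in IEEE double precision."""
    return 1.0 - abs(1.0 - 2.0 * x)

def iterate(x, n_steps):
    """Return the orbit x, f(x), f(f(x)), ... of length n_steps + 1."""
    orbit = [x]
    for _ in range(n_steps):
        x = tent(x)
        orbit.append(x)
    return orbit

orbit = iterate(0.1, 100)
first_zero = orbit.index(0.0)   # raises ValueError if the orbit never hits 0
print(first_zero, orbit[-1])
```

In exact arithmetic almost every starting point has a chaotic orbit, but each floating-point step doubles a number with a finite binary expansion, so the trailing bits are shifted out and every computed orbit ends at the fixed point 0 after a few dozen steps.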

19.

Principal points of univariate continuous distributions

Alice Zoppè 《Statistics and Computing》1995,5(2):127-132

The K principal points of a p-variate random variable X are defined as those points $\xi_1, \ldots, \xi_K$ which minimize the expected squared distance of X from the nearest of the $\xi_k$. This paper reviews some of the theory of principal points and presents a method of determining principal points of univariate continuous distributions. The method is applied to the uniform distribution, to the normal distribution and to the exponential distribution.
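
For the uniform distribution the definition can be explored with a Lloyd-type fixed-point iteration. This is a generic sketch and not necessarily the method presented in the paper: it alternates between cutting [0, 1] at the midpoints between neighbouring points and moving each point to the mean of its cell. The function names and the starting configuration are hypothetical.

```python
def lloyd_uniform(points, n_iter=200):
    """Lloyd-type fixed-point iteration for the Uniform[0, 1] distribution.

    Each pass (1) cuts [0, 1] at the midpoints between neighbouring points and
    (2) moves every point to the mean of the uniform distribution on its cell,
    which is simply the cell's midpoint.
    """
    pts = sorted(points)
    for _ in range(n_iter):
        cuts = [0.0] + [(a + b) / 2.0 for a, b in zip(pts, pts[1:])] + [1.0]
        pts = [(lo + hi) / 2.0 for lo, hi in zip(cuts, cuts[1:])]
    return pts

def expected_squared_distance(points, grid=20000):
    """Midpoint-rule estimate of E[min_k (X - xi_k)^2] for X ~ Uniform[0, 1]."""
    total = 0.0
    for i in range(grid):
        x = (i + 0.5) / grid
        total += min((x - p) ** 2 for p in points)
    return total / grid

start = [0.1, 0.2, 0.9]                  # arbitrary starting configuration
result = lloyd_uniform(start)
print(result, expected_squared_distance(result))
```

For Uniform[0, 1] the iteration settles on the equally spaced points (2k − 1)/(2K), k = 1, ..., K, with expected squared distance 1/(12K²); for other distributions the cell means in step (2) would have to be computed from the density.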

20.

Data depths satisfying the projection property

Rainer Dyckerhoff 《Allgemeines Statistisches Archiv》2004,88(2):163-190

Summary: Data depth is a concept that measures the centrality of a point in a given data cloud $x_1, x_2, \ldots, x_n$ or in a multivariate distribution $P^X$ on $\mathbb{R}^d$. Every depth defines a family of so-called trimmed regions. The $\alpha$-trimmed region is given by the set of points that have a depth of at least $\alpha$. Data depth has been used to define multivariate measures of location and dispersion as well as multivariate dispersion orders. If the depth of a point can be represented as the minimum of the depths with respect to all unidimensional projections, we say that the depth satisfies the (weak) projection property. Many depths which have been proposed in the literature can be shown to satisfy the weak projection property. A depth is said to satisfy the strong projection property if for every $\alpha$ the unidimensional projection of the $\alpha$-trimmed region equals the $\alpha$-trimmed region of the projected distribution. After a short introduction into the general concept of data depth we formally define the weak and the strong projection property and give necessary and sufficient criteria for the projection property to hold. We further show that the projection property facilitates the construction of depths from univariate trimmed regions. We discuss some of the depths proposed in the literature which possess the projection property and define a general class of projection depths, which are constructed from univariate trimmed regions by using the above method. Finally, algorithmic aspects of projection depths are discussed. We describe an algorithm which enables the approximate computation of depths that satisfy the projection property.