Similar documents
Found 20 similar documents (search time: 31 ms)
1.
The paper presents non-standard methods in evolutionary computation and discusses their applicability to various optimization problems. These methods maintain populations of individuals with nonlinear chromosomal structure and use genetic operators enhanced by problem-specific knowledge.

2.
A probabilistic expert system provides a graphical representation of a joint probability distribution which can be used to simplify and localize calculations. Jensen et al. (1990) introduced a flow-propagation algorithm for calculating marginal and conditional distributions in such a system. This paper analyses that algorithm in detail, and shows how it can be modified to perform other tasks, including maximization of the joint density and simultaneous fast retraction of evidence entered on several variables.

3.
Summary: We describe depth-based graphical displays that show the interdependence of multivariate distributions. The plots involve one-dimensional curves or bivariate scatterplots, so they are easier to interpret than correlation matrices. The correlation curve, modelled on the scale curve of Liu et al. (1999), compares the volume of the observed central regions with the volume under independence. The correlation DD-plot is the scatterplot of depth values under a reference distribution against depth values under independence. The area of the plot gives a measure of distance from independence. The correlation curve and DD-plot require an independence model as a baseline: besides classical parametric specifications, a nonparametric estimator derived from the randomization principle is used. Combining data depth and the notion of quadrant dependence, quadrant correlation trajectories are obtained which allow simultaneous representation of subsets of variables. The properties of the plots for the multivariate normal distribution are investigated. Some real-data examples are illustrated. *This work was completed with the support of Ca' Foscari University.
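As a rough illustration of the DD-plot idea, the following minimal Python sketch uses Mahalanobis depth as an easy-to-compute stand-in for the data depths used in the paper (the sample, the 0.8 correlation, and the mean-absolute-gap summary are all invented for illustration): the reference distribution is the fitted joint normal, and the independence baseline is the same normal with the off-diagonal covariance zeroed.

```python
import numpy as np

rng = np.random.default_rng(1)

def mahalanobis_depth(X, mean, cov):
    """Depth D(x) = 1 / (1 + (x - mu)' S^{-1} (x - mu))."""
    diff = X - mean
    md2 = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(cov), diff)
    return 1.0 / (1.0 + md2)

# correlated bivariate sample
X = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=500)

mean = X.mean(0)
cov_full = np.cov(X.T)                  # reference: fitted joint distribution
cov_indep = np.diag(np.diag(cov_full))  # baseline: independence (correlation zeroed)

d_ref = mahalanobis_depth(X, mean, cov_full)
d_ind = mahalanobis_depth(X, mean, cov_indep)

# the DD-plot is the scatter of (d_ind, d_ref); departure of the cloud from
# the 45-degree line indicates dependence; mean absolute gap is a crude proxy
# for the plot-area measure mentioned in the abstract
print("mean |d_ref - d_ind| =", np.abs(d_ref - d_ind).mean())
```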

4.
A traditional interpolation model is characterized by the choice of regularizer applied to the interpolant, and the choice of noise model. Typically, the regularizer has a single regularization constant α, and the noise model has a single parameter β. The ratio α/β alone is responsible for determining globally all these attributes of the interpolant: its complexity, flexibility, smoothness, characteristic scale length, and characteristic amplitude. We suggest that interpolation models should be able to capture more than just one flavour of simplicity and complexity. We describe Bayesian models in which the interpolant has a smoothness that varies spatially. We emphasize the importance, in practical implementation, of the concept of conditional convexity when designing models with many hyperparameters. We apply the new models to the interpolation of neuronal spike data and demonstrate a substantial improvement in generalization error.
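To see why only the ratio α/β matters in the traditional setup, here is a minimal sketch (a generic penalized least-squares interpolant with made-up data, not the spatially varying models of the paper): the posterior-mean weights solve (βΦᵀΦ + αI)w = βΦᵀy, so α and β enter only through α/β.

```python
import numpy as np

rng = np.random.default_rng(2)

# noisy samples of a smooth function
x = np.sort(rng.uniform(0, 1, 40))
y = np.sin(2 * np.pi * x) + 0.2 * rng.normal(size=x.size)

# Gaussian radial-basis design matrix (centers and width are arbitrary choices)
centers = np.linspace(0, 1, 25)
Phi = np.exp(-0.5 * ((x[:, None] - centers[None, :]) / 0.05) ** 2)

def interpolant(alpha, beta):
    # posterior-mean weights: only the ratio alpha/beta enters the solve
    A = Phi.T @ Phi + (alpha / beta) * np.eye(centers.size)
    return Phi @ np.linalg.solve(A, Phi.T @ y)

# small ratio: wiggly, near-interpolating fit; large ratio: smooth, flat fit
for ratio in (1e-6, 1e-1, 1e2):
    resid = y - interpolant(alpha=ratio, beta=1.0)
    print(f"alpha/beta = {ratio:g}: RMS residual = {np.sqrt((resid**2).mean()):.3f}")
```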

5.
Chu, Hui-May and Kuo, Lynn. Statistics and Computing, 1997, 7(3): 183-192
Bayesian methods for estimating dose-response curves with the one-hit model, the gamma multi-hit model, and their modified versions with Abbott's correction are studied. The Gibbs sampling approach with data augmentation and with the Metropolis algorithm is employed to compute the Bayes estimates of the potency curves. In addition, estimation of the relative additional risk and the virtually safe dose is studied. Model selection based on conditional predictive ordinates from cross-validated data is developed.
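A minimal random-walk Metropolis sketch for the one-hit model, P(response | dose d) = 1 − exp(−λd), with invented dose-response counts and an Exp(1) prior on λ (the paper's Gibbs-with-data-augmentation scheme and Abbott's correction are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(3)

# dose groups (all numbers invented): dose, animals per group, responders
dose = np.array([0.5, 1.0, 2.0, 4.0])
n    = np.array([50, 50, 50, 50])
r    = np.array([6, 11, 21, 35])

def log_post(lam):
    # one-hit model P(d) = 1 - exp(-lam * d), binomial likelihood, Exp(1) prior
    if lam <= 0:
        return -np.inf
    p = 1.0 - np.exp(-lam * dose)
    return (r * np.log(p) + (n - r) * np.log1p(-p)).sum() - lam

# random-walk Metropolis on lambda
lam, lp, samples = 0.5, log_post(0.5), []
for _ in range(20000):
    prop = lam + 0.1 * rng.normal()
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        lam, lp = prop, lp_prop
    samples.append(lam)

post = np.array(samples[5000:])              # discard burn-in
print("posterior mean of lambda:", post.mean())
# virtually safe dose for added risk 1e-4: solve 1 - exp(-lam * d) = 1e-4
print("VSD estimate:", (-np.log1p(-1e-4) / post).mean())
```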

6.
The generalized odds-rate class of regression models for time-to-event data is indexed by a non-negative constant ρ and assumes that gρ(S(t|Z)) = α(t) + β′Z, where gρ(s) = log((s^(−ρ) − 1)/ρ) for ρ > 0, g0(s) = log(−log s), S(t|Z) is the survival function of the time to event for an individual with q×1 covariate vector Z, β is a q×1 vector of unknown regression parameters, and α(t) is some arbitrary increasing function of t. When ρ = 0, this model is equivalent to the proportional hazards model, and when ρ = 1 it reduces to the proportional odds model. In the presence of right censoring, we construct estimators for β and exp(α(t)) and show that they are consistent and asymptotically normal. In addition, we show that the estimator for β is semiparametric efficient in the sense that it attains the semiparametric variance bound.
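The link function is easy to check numerically. The sketch below verifies that g1(s) = log((1 − s)/s) is the log-odds of failure, giving the proportional odds model, and that ρ → 0 recovers the complementary log-log link underlying the proportional hazards model:

```python
import numpy as np

def g(s, rho):
    """Generalized odds-rate link: g_rho(s) = log((s**-rho - 1)/rho) for rho > 0;
    g_0(s) = log(-log(s))."""
    if rho == 0:
        return np.log(-np.log(s))
    return np.log((s ** (-rho) - 1.0) / rho)

s = np.linspace(0.05, 0.95, 5)
print(g(s, 1.0))            # equals log((1-s)/s): log-odds -> proportional odds
print(np.log((1 - s) / s))
print(g(s, 1e-8))           # rho -> 0 recovers the complementary log-log link
print(np.log(-np.log(s)))   # i.e. the proportional hazards case g_0
```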

7.
Principal curves revisited
A principal curve (Hastie and Stuetzle, 1989) is a smooth curve passing through the middle of a distribution or data cloud, and is a generalization of linear principal components. We give an alternative definition of a principal curve, based on a mixture model. Estimation is carried out through an EM algorithm. Some comparisons are made to the Hastie-Stuetzle definition.
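The following sketch is a deliberately simplified fixed-grid EM in the spirit of the mixture formulation (fixed noise variance, uniform latent weights, and a crude neighbour-averaging pass standing in for a penalized smoothing M-step); it is not Tibshirani's exact algorithm, and all constants are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(9)

# noisy observations scattered around a half-circle
t = rng.uniform(0.0, np.pi, 200)
Y = np.c_[np.cos(t), np.sin(t)] + 0.08 * rng.normal(size=(200, 2))

K, sigma2 = 25, 0.05            # latent nodes along the curve, fixed noise variance

# initialise the nodes along the first principal component
mu = Y.mean(0)
v = np.linalg.svd(Y - mu, full_matrices=False)[2][0]
s = (Y - mu) @ v
f = mu + np.linspace(s.min(), s.max(), K)[:, None] * v          # K x 2 curve nodes

for _ in range(100):
    # E-step: responsibility of each latent node for each observation
    d2 = ((Y[:, None, :] - f[None, :, :]) ** 2).sum(-1)         # n x K
    R = np.exp(-0.5 * d2 / sigma2)
    R /= R.sum(1, keepdims=True)
    # M-step: weighted means, then a smoothing pass so the nodes stay a curve
    f_new = (R.T @ Y) / (R.sum(0)[:, None] + 1e-12)
    f_new[1:-1] = (f_new[:-2] + f_new[1:-1] + f_new[2:]) / 3.0
    f = f_new

d2 = ((Y[:, None, :] - f[None, :, :]) ** 2).sum(-1)
print("mean squared distance to fitted curve:", d2.min(axis=1).mean())
```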

8.
Let X, T, Y be random vectors such that the distribution of Y conditional on covariates partitioned into the vectors X = x and T = t is given by f(y; x, θ), where θ = (β, λ(t)). Here β is a parameter vector and λ(t) is a smooth, real-valued function of t. The joint distribution of X and T is assumed to be independent of β and λ. This semiparametric model is called conditionally parametric because the conditional distribution f(y; x, θ) of Y given X = x, T = t is parameterized by a finite-dimensional parameter θ = (β, λ(t)). Severini and Wong (1992, Annals of Statistics 20: 1768-1802) show how to estimate β and λ(·) using generalized profile likelihoods, and they also provide a review of the literature on generalized profile likelihoods. Under specified regularity conditions, they derive an asymptotically efficient estimator of β and a uniformly consistent estimator of λ(·). The purpose of this paper is to provide a short tutorial for this method of estimation under a likelihood-based model, reviewing results from Stein (1956, Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, University of California Press, Berkeley, pp. 187-196), Severini (1987, Ph.D. thesis, The University of Chicago, Department of Statistics, Chicago, Illinois), and Severini and Wong (op. cit.).

9.
Comparison of observed mortality with known, background, or standard rates has taken place for several hundred years. With the development of regression models for survival data, increasing interest has arisen in individualizing the standardisation using covariates of each individual. Account sometimes also needs to be taken of random variation in the standard group. Emphasizing uses of the Cox regression model, this paper surveys a number of critical choices and pitfalls in this area. The methods are illustrated by comparing survival of liver patients after transplantation with survival after conservative treatment.

10.
Edgoose, T. and Allison, L. Statistics and Computing, 1999, 9(4): 269-278
General-purpose unsupervised classification programs have typically assumed independence between observations in the data they analyse. In this paper we report on an extension to the MML classifier Snob which enables the program to take advantage of some of the extra information implicit in ordered datasets (such as time series). Specifically, the data are modelled as if generated from a first-order Markov process with as many states as there are classes of observation. The state of such a process at any point in the sequence determines the class from which the corresponding observation is generated. Such a model is commonly referred to as a Hidden Markov Model. The MML calculation is presented for the expected length of a near-optimal two-part message stating a specific model of this type and a dataset given this model. Such an estimate enables us to fairly compare models which differ in the number of classes they specify, which in turn can guide a robust unsupervised search of the model space. The new program, tSnob, is tested against both synthetic data and a large real-world dataset, and is found to make unbiased estimates of model parameters and to conduct an effective search of the extended model space.

11.
Each cell of a two-dimensional lattice is painted one of κ colors, arranged in a color wheel. The colors advance (from k to k+1 mod κ) either automatically or by contact with at least a threshold number of successor colors in a prescribed local neighborhood. Discrete-time parallel systems of this sort in which color 0 updates by contact and the rest update automatically are called Greenberg-Hastings (GH) rules. A system in which all colors update by contact is called a cyclic cellular automaton (CCA). Started from appropriate initial conditions, these models generate periodic traveling waves. Started from random configurations, the same rules exhibit complex self-organization, typically characterized by nucleation of locally periodic ram's horns or spirals. Corresponding random processes give rise to a variety of forest-fire equilibria that display large-scale stochastic wave fronts. This paper describes a theoretically based framework, relying on extensive interactive computer-graphics experimentation, for investigation of the complex dynamics shared by excitable media in a broad spectrum of scientific contexts. By focusing on simple mathematical prototypes we hope to obtain a better understanding of the basic organizational principles underlying spatially distributed oscillating systems.
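A minimal numpy implementation of the GH rule on a torus (periodic boundaries via np.roll; the number of colors κ, the contact threshold, and the lattice size are arbitrary choices, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(4)

kappa, threshold, steps = 8, 2, 60        # colors, contact threshold, iterations
grid = rng.integers(0, kappa, size=(100, 100))

for _ in range(steps):
    succ = (grid + 1) % kappa             # each cell's successor color
    # count von Neumann neighbours already holding that successor color
    nbr_hits = sum(
        (np.roll(grid, shift, axis) == succ).astype(int)
        for shift, axis in [(1, 0), (-1, 0), (1, 1), (-1, 1)]
    )
    auto = grid != 0                                  # GH: colors 1..kappa-1 advance automatically
    contact = (grid == 0) & (nbr_hits >= threshold)   # color 0 advances only by contact
    grid = np.where(auto | contact, succ, grid)

print("color histogram after", steps, "steps:",
      np.bincount(grid.ravel(), minlength=kappa))
```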

12.
Summary: The next German census will be an Administrative Record Census. Data about persons from several administrative registers will be merged. Object identification has to be applied, since no unique identification number exists in the registers. We present a two-step procedure. We briefly discuss questions such as the correctness and completeness of the Administrative Record Census. Then we focus on the object identification problem, which can be perceived as a special classification problem. Pairs of records are to be classified as matched or not matched. To achieve computational efficiency, a preselection technique is applied to the pairs. Our approach is illustrated with a database containing a large set of consumer addresses. *This work was partially supported by the Berlin-Brandenburg Graduate School in Distributed Information Systems (DFG grant no. GRK 316). The authors thank Michael Fürnrohr for previewing the paper. We would also like to thank an anonymous reviewer for helpful comments.
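A toy sketch of the preselection idea: candidate pairs are generated only within blocks sharing a crude key (here postcode plus first letter of the name, an invented key, not the one used for the census), and each pair is then classified by a string-similarity threshold standing in for a trained classifier:

```python
from difflib import SequenceMatcher
from itertools import combinations

# two tiny mock registers: (name, postcode); all records invented
reg_a = [("Anna Schmidt", "10115"), ("Jan Meyer", "80331"), ("Eva Braun", "10115")]
reg_b = [("Ana Schmidt", "10115"), ("Jan Maier", "80331"), ("Olaf Krug", "50667")]

def block_key(rec):
    # preselection: only records sharing postcode + first letter are compared
    name, plz = rec
    return (plz, name[0])

# build candidate pairs from matching blocks instead of the full cross product
blocks = {}
for src, recs in (("a", reg_a), ("b", reg_b)):
    for rec in recs:
        blocks.setdefault(block_key(rec), []).append((src, rec))

for key, members in blocks.items():
    for (s1, r1), (s2, r2) in combinations(members, 2):
        if s1 == s2:
            continue                      # only cross-register pairs are of interest
        sim = SequenceMatcher(None, r1[0], r2[0]).ratio()
        label = "match" if sim >= 0.8 else "non-match"
        print(r1[0], "<->", r2[0], f"sim={sim:.2f}", label)
```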

13.
In this largely expository article, we highlight the significance of various types of dimension for obtaining uniform convergence results in probability theory and we demonstrate how these results lead to certain notions of generalization for classes of binary-valued and real-valued functions. We also present new results on the generalization ability of certain types of artificial neural networks with real output.

14.
In this paper, we reconsider the well-known oblique Procrustes problem, where the usual least-squares objective function is replaced by a more robust discrepancy measure based on the ℓ1 norm or smooth approximations of it. We propose two approaches to the solution of this problem. One approach is based on convex analysis and uses the structure of the problem to permit a solution to the ℓ1-norm problem. An alternative approach is to smooth the problem by working with smooth approximations to the ℓ1 norm, and this leads to a solution process based on the solution of ordinary differential equations on manifolds. The general weighted Procrustes problem (both orthogonal and oblique) can also be solved by the latter approach. Numerical examples illustrating the algorithms developed are reported and analyzed.
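The smoothing approach can be sketched as follows on synthetic data: each column of the oblique transformation is estimated separately by minimizing the smooth surrogate √(r² + ε²) for |r|, with the unit-length (oblique) constraint enforced by parametrization. This uses a generic BFGS optimizer rather than the ODE-on-manifolds solver of the paper, so it is only a rough stand-in:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)

# synthetic oblique Procrustes instance: B ~ A Q_true + heavy-tailed noise
A = rng.normal(size=(30, 4))
Q_true = rng.normal(size=(4, 3))
Q_true /= np.linalg.norm(Q_true, axis=0)            # oblique: unit-length columns
B = A @ Q_true + 0.05 * rng.laplace(size=(30, 3))

eps = 1e-4
def smooth_l1(v, b):
    q = v / np.linalg.norm(v)                       # ||q|| = 1 by parametrisation
    r = A @ q - b
    return np.sqrt(r ** 2 + eps ** 2).sum()         # smooth surrogate for the l1 norm

Q_hat = np.empty_like(Q_true)
for j in range(B.shape[1]):                         # the problem separates by column
    res = minimize(smooth_l1, rng.normal(size=4), args=(B[:, j],), method="BFGS")
    Q_hat[:, j] = res.x / np.linalg.norm(res.x)

print("max column error:", np.abs(Q_hat - Q_true).max())
```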

15.
A new area of research interest is the computation of exact confidence limits or intervals for a scalar parameter of interest θ from discrete data by inverting a hypothesis test based on a studentized test statistic. See, for example, Chan and Zhang (1999), Agresti and Min (2001) and Agresti (2003), who deal with a difference of binomial probabilities, and Agresti and Min (2002), who deal with an odds ratio. However, neither (1) a detailed analysis of the computational issues involved nor (2) a reliable method of computation that deals effectively with these issues is currently available. In this paper we solve these two problems for a very broad class of discrete data models. We suppose that the distribution of the data is determined by (θ, ψ), where ψ is a nuisance parameter vector. We also consider six different studentized test statistics. Our contributions to (1) are as follows. We show that the P-value resulting from the hypothesis test, considered as a function of the null-hypothesized value of θ, has both jump and drop discontinuities. Numerical examples are used to demonstrate that these discontinuities lead to the failure of simple-minded approaches to the computation of the confidence limit or interval. We also provide a new method for efficiently computing the set of all possible locations of these discontinuities. Our contribution to (2) is to provide a new and reliable method of computing the confidence limit or interval, based on knowledge of this set.

16.
In this paper we analyze the relationship between the distribution of firm size and stochastic processes of growth. Three main models have been suggested, by Gibrat (1931), Kalecki (1945) and Champernowne (1973). The first two lead to the lognormal distribution and the last to the Pareto distribution. We fitted lognormal and Pareto distributions to two Italian sectors: ICT and mechanical. For ICT we found that the lognormal distribution must be rejected, while the Pareto distribution fits reasonably well to the largest 30% of companies. For the mechanical sector we cannot reject the lognormal distribution. Furthermore, we perform some experiments to corroborate the theoretical models. By means of transition matrices we found that ICT shows features very close to Gibrat's and Champernowne's models, while Kalecki's model fits the mechanical sector strongly. JEL Classification: L00, L25, D21. Correspondence to: Luigi Grossi. This research was partially supported by grants from Ministero dell'Istruzione, dell'Università e della Ricerca (MIUR). Despite being the result of joint work, Sects. 1, 4, 8 and 10 should be attributed to Ganugi, Sects. 3, 6 and 7 to Grossi, and Sects. 2, 5 and 9 to Crosato.
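A sketch of the two fits on synthetic "firm sizes" (the Italian sector data are not reproduced here, and the distributional parameters are invented): lognormal by MLE on log sizes, and a Pareto tail fitted to the largest 30% of firms by the Hill-type MLE, each checked with a Kolmogorov-Smirnov statistic:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
size = np.exp(rng.normal(2.0, 1.2, 5000))          # stand-in for firm sizes

# lognormal MLE: mean and sd of log sizes
mu, sd = np.log(size).mean(), np.log(size).std()
ks_ln = stats.kstest(size, lambda x: stats.lognorm.cdf(x, s=sd, scale=np.exp(mu)))

# Pareto fitted to the upper 30% of firms (Hill-type MLE for the tail index)
tail = np.sort(size)[int(0.7 * size.size):]
x_m = tail.min()
alpha = tail.size / np.log(tail / x_m).sum()
ks_pa = stats.kstest(tail, lambda x: 1 - (x_m / x) ** alpha)

print(f"lognormal KS = {ks_ln.statistic:.3f}, "
      f"Pareto tail KS = {ks_pa.statistic:.3f}, alpha = {alpha:.2f}")
```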

17.
I present a new Markov chain sampling method appropriate for distributions with isolated modes. Like the recently developed method of simulated tempering, the tempered transition method uses a series of distributions that interpolate between the distribution of interest and a distribution for which sampling is easier. The new method has the advantage that it does not require approximate values for the normalizing constants of these distributions, which are needed for simulated tempering, and can be tedious to estimate. Simulated tempering performs a random walk along the series of distributions used. In contrast, the tempered transitions of the new method move systematically from the desired distribution, to the easily-sampled distribution, and back to the desired distribution. This systematic movement avoids the inefficiency of a random walk, an advantage that is unfortunately cancelled by an increase in the number of interpolating distributions required. Because of this, the sampling efficiency of the tempered transition method in simple problems is similar to that of simulated tempering. On more complex distributions, however, simulated tempering and tempered transitions may perform differently. Which is better depends on the ways in which the interpolating distributions are deceptive.
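A minimal sketch of a tempered transition for a well-separated bimodal density, using geometric tempering p^β (the β ladder, step sizes, and target are arbitrary choices): the log acceptance weight accumulates the between-level density ratios on the way up and back down, so only the unnormalized density is ever needed, which is the point of the method.

```python
import numpy as np

rng = np.random.default_rng(7)

def log_p(x):
    # unnormalized bimodal target: mixture of N(-4, 0.3^2) and N(4, 0.3^2)
    return np.logaddexp(-0.5 * ((x + 4) / 0.3) ** 2,
                        -0.5 * ((x - 4) / 0.3) ** 2)

betas = np.linspace(1.0, 0.02, 15)      # betas[0] = 1 is the target; hotter as i grows

def metropolis(x, beta, step=1.5):
    # one random-walk update leaving p(x)**beta invariant
    prop = x + step * rng.normal()
    return prop if np.log(rng.uniform()) < beta * (log_p(prop) - log_p(x)) else x

def tempered_transition(x):
    logw = 0.0
    for i in range(1, len(betas)):               # up: reweight to level i, update there
        logw += (betas[i] - betas[i - 1]) * log_p(x)
        x = metropolis(x, betas[i])
    for i in range(len(betas) - 1, 0, -1):       # down: update at level i, then reweight
        x = metropolis(x, betas[i])
        logw += (betas[i - 1] - betas[i]) * log_p(x)
    return x, logw

x, chain = -4.0, []
for _ in range(2000):
    cand, logw = tempered_transition(x)
    if np.log(rng.uniform()) < logw:             # no normalizing constants required
        x = cand
    chain.append(x)

print("fraction of samples near +4:", np.mean(np.array(chain) > 0))
```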

18.
The posterior distribution of the likelihood is used to interpret the evidential meaning of P-values, posterior Bayes factors and Akaike's information criterion when comparing point null hypotheses with composite alternatives. Asymptotic arguments lead to simple re-calibrations of these criteria in terms of posterior tail probabilities of the likelihood ratio. (Prior) Bayes factors cannot be calibrated in this way as they are model-specific.

19.
Computing location depth and regression depth in higher dimensions
The location depth (Tukey 1975) of a point θ relative to a p-dimensional data set Z of size n is defined as the smallest number of data points in a closed halfspace with boundary through θ. For bivariate data, it can be computed in O(n log n) time (Rousseeuw and Ruts 1996). In this paper we construct an exact algorithm to compute the location depth in three dimensions in O(n² log n) time. We also give an approximate algorithm to compute the location depth in p dimensions in O(mp³ + mpn) time, where m is the number of p-subsets used. Recently, Rousseeuw and Hubert (1996) defined the depth of a regression fit. The depth of a hyperplane with coefficients (θ₁,...,θₚ) is the smallest number of residuals that need to change sign to make (θ₁,...,θₚ) a nonfit. For bivariate data (p = 2) this depth can be computed in O(n log n) time as well. We construct an algorithm to compute the regression depth of a plane relative to a three-dimensional data set in O(n² log n) time, and another that deals with p = 4 in O(n³ log n) time. For data sets with large n and/or p we propose an approximate algorithm that computes the depth of a regression fit in O(mp³ + mpn + mn log n) time. For all of these algorithms, actual implementations are made available.
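An approximate location-depth computation is easy to sketch: minimize, over many sampled directions u, the number of points falling in the closed halfspace {z : (z − θ)·u ≥ 0}. This direction-sampling approximation only upper-bounds the depth and is not the exact O(n² log n) algorithm of the paper:

```python
import numpy as np

rng = np.random.default_rng(8)

def location_depth(theta, Z, n_dir=2000):
    """Approximate Tukey (halfspace) depth of theta in the cloud Z (n x p):
    minimum over sampled unit directions u of #{i : (z_i - theta) . u >= 0}."""
    p = Z.shape[1]
    U = rng.normal(size=(n_dir, p))
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    counts = ((Z - theta) @ U.T >= 0).sum(axis=0)   # points per halfspace
    return counts.min()

Z = rng.normal(size=(500, 3))
print("depth of the mean:  ", location_depth(Z.mean(0), Z))             # deep point
print("depth of an outlier:", location_depth(np.array([5., 5., 5.]), Z))  # shallow
```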

20.
We discuss a new framework for text understanding. Three major design decisions characterize this approach. First, we take the problem of text understanding to be a particular case of the general problem of abductive inference. Second, we use probability theory to handle the uncertainty which arises in this abductive inference process. Finally, all aspects of natural language processing are treated in the same framework, allowing us to integrate syntactic, semantic and pragmatic constraints. In order to apply probability theory to this problem, we have developed a probabilistic model of text understanding. To make it practical to use this model, we have devised a way of incrementally constructing and evaluating belief networks. We have written a program, wimp3, to experiment with this framework. To evaluate this program, we have developed a simple single-blind testing method.
