
1.

Semiparametric Efficient Estimation in the Generalized Odds-Rate Class of Regression Models for Right-Censored Time-to-Event Data

Scharfstein Daniel O. Tsiatis Anastasios A. Gilbert Peter B. 《Lifetime data analysis》1998,4(4):355-391

The generalized odds-rate class of regression models for time to event data is indexed by a non-negative constant $\rho$ and assumes that $g_\rho(S(t \mid Z)) = \alpha(t) + \beta^{\top} Z$, where $g_\rho(s) = \log(\rho^{-1}(s^{-\rho} - 1))$ for $\rho > 0$, $g_0(s) = \log(-\log s)$, $S(t \mid Z)$ is the survival function of the time to event for an individual with $q \times 1$ covariate vector $Z$, $\beta$ is a $q \times 1$ vector of unknown regression parameters, and $\alpha(t)$ is some arbitrary increasing function of $t$. When $\rho = 0$, this model is equivalent to the proportional hazards model, and when $\rho = 1$, it reduces to the proportional odds model. In the presence of right censoring, we construct estimators for $\beta$ and $\exp(\alpha(t))$ and show that they are consistent and asymptotically normal. In addition, we show that the estimator for $\beta$ is semiparametric efficient in the sense that it attains the semiparametric variance bound.
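The link function in this abstract can be made concrete with a small sketch. It assumes the link has the form g_rho(s) = log((s^(-rho) - 1) / rho) for rho > 0 and g_0(s) = log(-log s), which is a reconstruction from the garbled formula; the function name `odds_rate_link` is hypothetical and not taken from the paper. The sketch only evaluates the link and is not the authors' estimation procedure.

```python
import math


def odds_rate_link(s: float, rho: float) -> float:
    """Evaluate the (assumed) generalized odds-rate link at survival probability s.

    rho == 0 gives the complementary log-log link (proportional hazards);
    rho == 1 gives the log odds of failure (proportional odds).
    """
    if not 0.0 < s < 1.0:
        raise ValueError("s must lie strictly between 0 and 1")
    if rho < 0.0:
        raise ValueError("rho must be non-negative")
    if rho == 0.0:
        return math.log(-math.log(s))
    return math.log((s ** (-rho) - 1.0) / rho)


if __name__ == "__main__":
    s = 0.8
    # rho = 1: log((1 - s) / s), the log odds of having failed by time t.
    print(odds_rate_link(s, 1.0), math.log((1.0 - s) / s))
    # Small rho approaches the rho = 0 (proportional hazards) link.
    print(odds_rate_link(s, 1e-6), odds_rate_link(s, 0.0))
```

Under this form the rho = 0 case is the limit of the rho > 0 case, since (s^(-rho) - 1) / rho tends to -log s as rho tends to 0, which is why a single index covers both named models.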

2.

Interpolation models with multiple hyperparameters

DAVID J. C. MACKAY RYO TAKEUCHI 《Statistics and Computing》1998,8(1):15-23

A traditional interpolation model is characterized by the choice of regularizer applied to the interpolant, and the choice of noise model. Typically, the regularizer has a single regularization constant $\alpha$, and the noise model has a single parameter $\beta$. The ratio $\alpha/\beta$ alone is responsible for determining globally all these attributes of the interpolant: its complexity, flexibility, smoothness, characteristic scale length, and characteristic amplitude. We suggest that interpolation models should be able to capture more than just one flavour of simplicity and complexity. We describe Bayesian models in which the interpolant has a smoothness that varies spatially. We emphasize the importance, in practical implementation, of the concept of conditional convexity when designing models with many hyperparameters. We apply the new models to the interpolation of neuronal spike data and demonstrate a substantial improvement in generalization error.

3.

Computation of exact confidence intervals from discrete data using studentized test statistics

Paul Kabaila 《Statistics and Computing》2005,15(1):71-78

A new area of research interest is the computation of exact confidence limits or intervals for a scalar parameter of interest from discrete data by inverting a hypothesis test based on a studentized test statistic. See, for example, Chan and Zhang (1999), Agresti and Min (2001) and Agresti (2003), who deal with a difference of binomial probabilities, and Agresti and Min (2002), who deal with an odds ratio. However, neither (1) a detailed analysis of the computational issues involved nor (2) a reliable method of computation that deals effectively with these issues is currently available. In this paper we solve these two problems for a very broad class of discrete data models. We suppose that the distribution of the data is determined by $(\theta, \psi)$, where $\theta$ is the scalar parameter of interest and $\psi$ is a nuisance parameter vector. We also consider six different studentized test statistics. Our contributions to (1) are as follows. We show that the P-value resulting from the hypothesis test, considered as a function of the null-hypothesized value of $\theta$, has both jump and drop discontinuities. Numerical examples are used to demonstrate that these discontinuities lead to the failure of simple-minded approaches to the computation of the confidence limit or interval. We also provide a new method for efficiently computing the set of all possible locations of these discontinuities. Our contribution to (2) is to provide a new and reliable method of computing the confidence limit or interval, based on the knowledge of this set.

4.

Convergence assessment techniques for Markov chain Monte Carlo

BROOKS STEPHEN P. ROBERTS GARETH O. 《Statistics and Computing》1998,8(4):319-335

MCMC methods have effectively revolutionised the field of Bayesian statistics over the past few years. Such methods provide invaluable tools to overcome problems with analytic intractability inherent in adopting the Bayesian approach to statistical modelling. However, any inference based upon MCMC output relies critically upon the assumption that the Markov chain being simulated has achieved a steady state or converged. Many techniques have been developed for trying to determine whether or not a particular Markov chain has converged, and this paper aims to review these methods with an emphasis on the mathematics underpinning these techniques, in an attempt to summarise the current state-of-play for convergence assessment techniques and to motivate directions for future research in this area.

5.

Sampling based approach for one-hit and multi-hit models in quantal bioassay

CHU HUI-MAY KUO LYNN 《Statistics and Computing》1997,7(3):183-192

Bayesian methods for estimating the dose response curves with the one-hit model, the gamma multi-hit model, and their modified versions with Abbott's correction are studied. The Gibbs sampling approach with data augmentation and with the Metropolis algorithm is employed to compute the Bayes estimates of the potency curves. In addition, estimation of the relative additional risk and the virtually safe dose is studied. Model selection based on conditional predictive ordinates from cross-validated data is developed.

6.

Corrected p-values for tests based on estimated nuisance parameters

Martin Crowder 《Statistics and Computing》2001,11(4):359-365

In some situations the asymptotic distribution of a random function $T_n(\theta)$ that depends on a nuisance parameter $\theta$ is tractable when $\theta$ has known value. In that case it can be used as a test statistic, if suitably constructed, for some hypothesis. However, in practice, $\theta$ often needs to be replaced by an estimator $S_n$. In this paper general results are given concerning the asymptotic distribution of $T_n(S_n)$ that include special cases previously dealt with. In particular, some situations are covered where the usual likelihood theory is nonregular and extreme values are employed to construct estimators and test statistics.

7.

Preisindikatoren für Wohnimmobilien in Deutschland (Price indicators for residential property in Germany)

Hans-Albert Leifer 《Allgemeines Statistisches Archiv》2004,88(4):435-450

Summary: Asset prices in general and property prices in particular have gained increasing importance in recent years. Whereas in the late 1980s (after the stock market crash in autumn 1987) and in the last decade these prices mainly came under the heading of asset-price inflation/deflation, the focus has recently shifted to sustainable and viable financial systems. The notes primarily explain why the Bundesbank is involved in this area of price statistics, when this involvement began and what underlying data the Bundesbank uses. At the same time, they not only indicate the large degree of uncertainty in the reported data but also highlight the second-best nature of the calculations.

*Paper presented at the 9th conference "Messen der Teuerung" (Measuring Inflation), 17-18 June 2004, Marburg. The author expresses his personal opinion, which does not necessarily coincide with that of the Deutsche Bundesbank.

8.

Analysis of fracture toughness data

Martin Crowder 《Lifetime data analysis》1995,1(1):59-71

In studies of the fracture toughness of irradiated weld metal, specimens are subjected to an increasing load. The test on any one specimen might be terminated by choice or because the specimen ruptures. Prior to termination, ductile tearing might or might not have occurred. The situation is thus basically one of competing risks, with different types of termination, but there are additional features. The major purpose of statistical analysis is to estimate probabilities concerning the values of toughness and crack length. The analysis has been based on a model developed for the joint survivor function of these quantities.

9.

An explanation of generalized profile likelihoods

Joan G. Staniswalis Peter F. Thall 《Statistics and Computing》2001,11(4):293-298

Let X, T, Y be random vectors such that the distribution of Y conditional on covariates partitioned into the vectors X = x and T = t is given by $f(y; x, \phi)$, where $\phi = (\theta, \eta(t))$. Here $\theta$ is a parameter vector and $\eta(t)$ is a smooth, real-valued function of t. The joint distribution of X and T is assumed to be independent of $\theta$ and $\eta$. This semiparametric model is called conditionally parametric because the conditional distribution $f(y; x, \phi)$ of Y given X = x, T = t is parameterized by a finite dimensional parameter $\phi = (\theta, \eta(t))$. Severini and Wong (1992. Annals of Statistics 20: 1768–1802) show how to estimate $\theta$ and $\eta(\cdot)$ using generalized profile likelihoods, and they also provide a review of the literature on generalized profile likelihoods. Under specified regularity conditions, they derive an asymptotically efficient estimator of $\theta$ and a uniformly consistent estimator of $\eta(\cdot)$. The purpose of this paper is to provide a short tutorial for this method of estimation under a likelihood-based model, reviewing results from Stein (1956. Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, University of California Press, Berkeley, pp. 187–196), Severini (1987. Ph.D. Thesis, The University of Chicago, Department of Statistics, Chicago, Illinois), and Severini and Wong (op. cit.).

10.

Bump hunting in high-dimensional data

Friedman Jerome H. Fisher Nicholas I. 《Statistics and Computing》1999,9(2):123-143

Many data analytic questions can be formulated as (noisy) optimization problems. They explicitly or implicitly involve finding simultaneous combinations of values for a set of (input) variables that imply unusually large (or small) values of another designated (output) variable. Specifically, one seeks a set of subregions of the input variable space within which the value of the output variable is considerably larger (or smaller) than its average value over the entire input domain. In addition it is usually desired that these regions be describable in an interpretable form involving simple statements (rules) concerning the input values. This paper presents a procedure directed towards this goal based on the notion of patient rule induction. This patient strategy is contrasted with the greedy ones used by most rule induction methods, and semi-greedy ones used by some partitioning tree techniques such as CART. Applications involving scientific and commercial data bases are presented.

11.

Information preserving statistical obfuscation

Burridge Jim 《Statistics and Computing》2003,13(4):321-327

The problem of limiting the disclosure of information gathered on a set of companies or individuals (the respondents) is considered, the aim being to provide useful information while preserving confidentiality of sensitive information. The paper proposes a method which explicitly preserves certain information contained in the data. The data are assumed to consist of two sets of information on each respondent: public data and specific survey data. It is assumed in this paper that both sets of data are liable to be released for a subset of respondents. However, the public data will be altered in some way to preserve confidentiality whereas the specific survey data are to be disclosed without alteration. The paper proposes a model-based approach to this problem by utilizing the information contained in the sufficient statistics obtained from fitting a model to the public data by conditioning on the survey data. Deterministic and stochastic variants of the method are considered.

12.

Foundational issues concerning the analysis of censored data

Richard E. Barlow Peisung Tsai 《Lifetime data analysis》1995,1(1):27-34

The common approach to analyzing censored data utilizes competing risk models; a class of distributions is first chosen and then the sufficient statistics are identified. An operational Bayesian approach (Barlow 1993) for analyzing censored data would require a somewhat different methodology. In this approach, we first determine potentially observable parameters of interest. We then determine the data summaries (sufficient statistics) for these parameters. Tsai (1994) suggests that the observed sample frequency is sufficient for predicting the population frequency. Invariant probability measures (likelihoods), conditional on the parameters of interest, are then derived based on the principle of sufficiency and the principle of insufficient reason. Research partially supported by the Army Research Office (DAAL03-91-G-0046) grant to the University of California at Berkeley.

13.

Some recent developments for regression analysis of multivariate failure time data

Kung-Yee Liang Steven G. Self Karen J. Bandeen-Roche Scott L. Zeger 《Lifetime data analysis》1995,1(4):403-415

Cox's seminal 1972 paper on regression methods for possibly censored failure time data popularized the use of time to an event as a primary response in prospective studies. But one key assumption of this and other regression methods is that observations are independent of one another. In many problems, failure times are clustered into small groups where outcomes within a group are correlated. Examples include failure times for two eyes from one person or for members of the same family. This paper presents a survey of models for multivariate failure time data. Two distinct classes of models are considered: frailty and marginal models. In a frailty model, the correlation is assumed to derive from latent variables (frailties) common to observations from the same cluster. Regression models are formulated for the conditional failure time distribution given the frailties. Alternatively, marginal models describe the marginal failure time distribution of each response while separately modelling the association among responses from the same cluster. We focus on recent extensions of the proportional hazards model for multivariate failure time data. Model formulation, parameter interpretation and estimation procedures are considered.

14.

A Diagnostic for Association in Bivariate Survival Models

Chen MC Bandeen-Roche K 《Lifetime data analysis》2005,11(2):245-264

We propose exploratory, easily implemented methods for diagnosing the appropriateness of an underlying copula model for bivariate failure time data, allowing censoring in either or both failure times. It is found that the proposed approach effectively distinguishes gamma from positive stable copula models when the sample is moderately large or the association is strong. Data from the Women's Health and Aging Study (WHAS, Guralnik et al., The Women's Health and Aging Study: Health and Social Characteristics of Older Women with Disability. National Institute on Aging: Bethesda, Maryland, 1995) are analyzed to demonstrate the proposed diagnostic methodology. The positive stable model gives a better overall fit to these data than the gamma frailty model, but it tends to underestimate association at the later time points. The finding is consistent with recent theory differentiating catastrophic from progressive disability onset in older adults. The proposed methods supply an interpretable quantity for copula diagnosis. We hope that they will usefully inform practitioners as to the reasonableness of their modeling choices.

15.

Procrustes shape analysis of triangulations of a two coloured point pattern

Mohammed Reza Faghihi Charles C. Taylor Ian L. Dryden 《Statistics and Computing》1999,9(1):43-53

Consider a set of points in the plane with Gaussian perturbations about a regular mean configuration in which a Delaunay triangulation of the mean of the process is comprised of equilateral triangles of the same size. The points are labelled at random as black or white with variances of the perturbations possibly dependent on the colour. By investigating triangle subsets (with four sets of possible colour labels for the vertices) in detail we propose various test statistics based on a Procrustes shape analysis. A simulation study is carried out to investigate the relative merits and the adequacy of the approximations used in the distributional results, as well as a comparison with simulation methods based on nearest-neighbour distances. The methodology is applied to an investigation of regularity in human muscle fibre cross-sections.

16.

Generating random numbers of prescribed distribution using physical sources

Daniel Neuenschwander Hansmartin Zeuner 《Statistics and Computing》2003,13(1):5-11

When constructing uniform random numbers in [0, 1] from the output of a physical device, usually n independent and unbiased bits $B_j$ are extracted and combined into the machine number $Y := \sum_{j=1}^{n} 2^{-j} B_j$. In order to reduce the number of data used to build one real number, we observe that for independent and exponentially distributed random variables $X_n$ (which arise for example as waiting times between two consecutive impulses of a Geiger counter) the variable $U_n := X_{2n-1}/(X_{2n-1} + X_{2n})$ is uniform in [0, 1]. In the practical application $X_n$ can only be measured up to a given precision $\vartheta$ (in terms of the expectation of the $X_n$); it is shown that the distribution function obtained by calculating $U_n$ from these measurements differs from the uniform by less than $\vartheta/2$. We compare this deviation with the error resulting from the use of biased bits $B_j$ with $P\{B_j = 1\} = \tfrac{1}{2} + \varepsilon$ (where $\varepsilon \in\ ]-\tfrac{1}{2}, \tfrac{1}{2}[$) in the construction of $Y$ above. The influence of a bias is given by the estimate that in the p-total variation norm $\|Q\|^{TV}_p = \left(\sum_{\omega} |Q(\omega)|^p\right)^{1/p}$ ($p \ge 1$) we have $\|P^{\varepsilon}_Y - P^{0}_Y\|^{TV}_p \le (c_n \cdot \varepsilon)^{1/p}$ with $c_n \to p$ for $n \to \infty$. For the distribution function, $\|F^{\varepsilon}_Y - F^{0}_Y\| \le 2(1 - 2^{-n})|\varepsilon|$ holds.
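The two constructions described in this abstract can be sketched in a few lines. The sketch assumes the machine number is the usual binary expansion Y = sum over j of 2^(-j) B_j (the formula is missing from the listing) and replaces the physical source with simulated exponential waiting times from Python's `random.expovariate`. All function names are hypothetical and not from the paper.

```python
import random


def machine_number(bits):
    """Combine bits B_1..B_n into Y = sum_j 2^(-j) * B_j (assumed construction)."""
    return sum(b * 2.0 ** (-j) for j, b in enumerate(bits, start=1))


def uniform_from_waiting_times(x_odd, x_even):
    """Return U = X_{2n-1} / (X_{2n-1} + X_{2n}) for two waiting times."""
    return x_odd / (x_odd + x_even)


def simulate_uniforms(n, rate=1.0, seed=0):
    """Draw n values of U from simulated i.i.d. exponential waiting times."""
    rng = random.Random(seed)  # stand-in for the physical device
    return [
        uniform_from_waiting_times(rng.expovariate(rate), rng.expovariate(rate))
        for _ in range(n)
    ]


if __name__ == "__main__":
    print(machine_number([1, 0, 1]))  # 1/2 + 1/8
    us = simulate_uniforms(20000)
    print(sum(us) / len(us))  # should be close to 0.5
```

Each uniform value here consumes two waiting times rather than n bits, which is the saving the abstract refers to. The ratio does not depend on the rate of the exponential distribution, so the source intensity need not be known.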

17.

Statistical modelling of artificial neural networks using the multi-layer perceptron

Murray Aitkin Rob Foxall 《Statistics and Computing》2003,13(3):227-239

Multi-layer perceptrons (MLPs), a common type of artificial neural networks (ANNs), are widely used in computer science and engineering for object recognition, discrimination and classification, and have more recently found use in process monitoring and control. Training such networks is not a straightforward optimisation problem, and we examine features of these networks which contribute to the optimisation difficulty. Although the original perceptron, developed in the late 1950s (Rosenblatt 1958, Widrow and Hoff 1960), had a binary output from each node, this was not compatible with back-propagation and similar training methods for the MLP. Hence the output of each node (and the final network output) was made a differentiable function of the network inputs. We reformulate the MLP model with the original perceptron in mind so that each node in the hidden layers can be considered as a latent (that is, unobserved) Bernoulli random variable. This maintains the property of binary output from the nodes, and with an imposed logistic regression of the hidden layer nodes on the inputs, the expected output of our model is identical to the MLP output with a logistic sigmoid activation function (for the case of one hidden layer). We examine the usual MLP objective function, the sum of squares, and show its multi-modal form and the corresponding optimisation difficulty. We also construct the likelihood for the reformulated latent variable model and maximise it by standard finite mixture ML methods using an EM algorithm, which provides stable ML estimates from random starting positions without the need for regularisation or cross-validation. Over-fitting of the number of nodes does not affect this stability. This algorithm is closely related to the EM algorithm of Jordan and Jacobs (1994) for the Mixture of Experts model. We conclude with some general comments on the relation between the MLP and latent variable models.

18.

Calculating the density and distribution function for the singly and doubly noncentral F

Ronald W. Butler Marc S. Paolella 《Statistics and Computing》2002,12(1):9-16

Simple, closed form saddlepoint approximations for the distribution and density of the singly and doubly noncentral F distributions are presented. Their overwhelming accuracy is demonstrated numerically using a variety of parameter values. The approximations are shown to be uniform in the right tail and the associated limiting relative error is derived. Difficulties associated with some algorithms used for exact computation of the singly noncentral F are noted.

19.

The Statistics of Simulating Chaos

Phil Diamond Alexei Pokrovskii 《Statistics and Computing》2001,11(3):217-228

When simulating a dynamical system, the computation is actually of a spatially discretized system, because finite machine arithmetic replaces continuum state space. For chaotic dynamical systems, the discretized simulations often have collapsing effects, to a fixed point or to short cycles. Statistical properties of these phenomena can be modelled with random mappings with an absorbing centre. The model gives results which are very much in line with computational experiments. The effects are discussed with special reference to the family of mappings $f_\ell(x) = 1 - |1 - 2x|^{\ell}$, $x \in [0, 1]$, $1 < \ell < \infty$. Computer experiments show close agreement with predictions of the model.
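The collapsing effect can be observed directly with a toy experiment. The sketch below assumes the family has the form f(x) = 1 - |1 - 2x|^ell (a reconstruction; ell = 2 gives the logistic map 4x(1 - x)) and models finite machine arithmetic by rounding to a grid of n + 1 equally spaced points. Both the grid model and the function names are illustrative assumptions, not the paper's random-mapping model.

```python
def f(x, ell=2.0):
    """Assumed map f_ell(x) = 1 - |1 - 2x|**ell on [0, 1]."""
    return 1.0 - abs(1.0 - 2.0 * x) ** ell


def discretized_step(k, n, ell=2.0):
    """Apply f to the grid point k/n and round back to the nearest grid index."""
    return round(f(k / n, ell) * n)


def transient_and_cycle(k0, n, ell=2.0):
    """Follow the orbit of grid index k0 until a state repeats.

    Returns (transient length, cycle length). Because the grid is finite,
    every orbit must eventually enter a cycle.
    """
    first_seen = {}
    k, step = k0, 0
    while k not in first_seen:
        first_seen[k] = step
        k = discretized_step(k, n, ell)
        step += 1
    return first_seen[k], step - first_seen[k]


if __name__ == "__main__":
    n = 10_000
    for k0 in (1, 1234, 3333, n // 2):
        print(k0, transient_and_cycle(k0, n))
```

On a grid with n + 1 states the transient plus cycle length can never exceed n + 1, whereas a truly chaotic orbit on the continuum never repeats; the abstract's point is that the observed cycles are typically much shorter than this bound.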

20.

Fitting Weibull duration models with random effects

Carl Morris Cindy Christiansen 《Lifetime data analysis》1995,1(4):347-359

Duration time models often should include correlated failure times, due to clustered data. These random effects hierarchical models sometimes are called frailty models when used for survival analyses. The data analyzed here involve such correlations because patient level outcomes (the times until graft failure following kidney transplantation) are observed, but patients are clustered in different transplant centers. We describe fitting such models by combining two kinds of software, one for parametric survival regression models, and the other for doing Poisson regression in a hierarchical setting. The latter is implemented by using PRIMM (Poisson Regression and Interactive Multilevel Modeling) methods and software (Christiansen & Morris, 1994a). An illustrative example for profiling data is included with k = 11 kidney transplant centers and N = 412 patients.