Similar Documents
20 similar documents found.
1.
A new area of research interest is the computation of exact confidence limits or intervals for a scalar parameter of interest from discrete data by inverting a hypothesis test based on a studentized test statistic. See, for example, Chan and Zhang (1999), Agresti and Min (2001) and Agresti (2003), who deal with a difference of binomial probabilities, and Agresti and Min (2002), who deal with an odds ratio. However, neither (1) a detailed analysis of the computational issues involved nor (2) a reliable method of computation that deals effectively with these issues is currently available. In this paper we solve these two problems for a very broad class of discrete data models. We suppose that the distribution of the data is determined by (θ, ψ), where ψ is a nuisance parameter vector. We also consider six different studentized test statistics. Our contributions to (1) are as follows. We show that the P-value resulting from the hypothesis test, considered as a function of the null-hypothesized value of θ, has both jump and drop discontinuities. Numerical examples are used to demonstrate that these discontinuities lead to the failure of simple-minded approaches to the computation of the confidence limit or interval. We also provide a new method for efficiently computing the set of all possible locations of these discontinuities. Our contribution to (2) is to provide a new and reliable method of computing the confidence limit or interval, based on the knowledge of this set.
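To make the inversion idea concrete, here is a minimal Python sketch, not the authors' algorithm: an exact one-sided P-value for a difference of binomial proportions, based on a Wald-type studentized statistic, with the nuisance parameter handled by a grid supremum, followed by a naive grid inversion. Function names, data values and grid sizes are all illustrative assumptions; the naive inversion is exactly the kind of simple-minded approach that the P-value's discontinuities can defeat.

```python
import numpy as np
from scipy.stats import binom

def t_stat(y1, n1, y2, n2, delta0):
    # Wald-type studentized statistic for H0: p1 - p2 = delta0
    p1, p2 = y1 / n1, y2 / n2
    se = np.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2 - delta0) / np.maximum(se, 1e-12)

def exact_pvalue(x1, n1, x2, n2, delta0, grid=101):
    # Exact one-sided P-value: supremum over the nuisance parameter p2 of
    # P(T >= t_obs), with p1 = p2 + delta0, taken over a finite grid of p2.
    y1 = np.arange(n1 + 1)[:, None]      # all possible outcomes, sample 1
    y2 = np.arange(n2 + 1)[None, :]      # all possible outcomes, sample 2
    reject = t_stat(y1, n1, y2, n2, delta0) >= t_stat(x1, n1, x2, n2, delta0)
    best = 0.0
    for p2 in np.linspace(max(0.0, -delta0), min(1.0, 1.0 - delta0), grid):
        pmf = binom.pmf(y1, n1, p2 + delta0) * binom.pmf(y2, n2, p2)
        best = max(best, float(pmf[reject].sum()))
    return best

# Naive inversion: take the smallest delta0 on a grid that is not rejected
# as the lower 95% confidence limit (illustrative data).
x1, n1, x2, n2 = 7, 12, 2, 10
deltas = np.linspace(-0.99, 0.99, 199)
accepted = [d for d in deltas if exact_pvalue(x1, n1, x2, n2, d) > 0.05]
print("approximate lower 95% confidence limit:", min(accepted))
```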

2.
When simulating a dynamical system, the computation is actually of a spatially discretized system, because finite machine arithmetic replaces the continuum state space. For chaotic dynamical systems, the discretized simulations often have collapsing effects, to a fixed point or to short cycles. Statistical properties of these phenomena can be modelled with random mappings with an absorbing centre. The model gives results which are very much in line with computational experiments. The effects are discussed with special reference to the family of mappings f_ℓ(x) = 1 - |1 - 2x|^ℓ, x ∈ [0,1], 1 < ℓ < 2. Computer experiments show close agreement with predictions of the model.
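The collapse is easy to reproduce numerically. A minimal sketch, assuming a uniform grid of N + 1 states standing in for machine arithmetic; the grid size, exponent and number of starting points are illustrative choices, not the paper's experiments.

```python
import numpy as np

def cycle_lengths(ell=1.5, N=10_000, n_starts=200, seed=0):
    # Iterate the discretized map on states k = 0..N (k represents k/N)
    # and record the period of the cycle each random orbit collapses into.
    rng = np.random.default_rng(seed)
    lengths = []
    for _ in range(n_starts):
        x = int(rng.integers(0, N + 1))
        seen, t = {}, 0
        while x not in seen:
            seen[x] = t
            x = round(N * (1 - abs(1 - 2 * (x / N)) ** ell))  # discretized map
            t += 1
        lengths.append(t - seen[x])   # period of the absorbing cycle
    return np.array(lengths)

lengths = cycle_lengths()
print("fraction collapsing to a fixed point:", np.mean(lengths == 1))
print("mean absorbing cycle length:", lengths.mean())
```

Note that the grid point 0 is an exact fixed point of the discretized map, and so plays the role of the absorbing centre in the random mapping model.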

3.
In some situations the asymptotic distribution of a random function T_n(ψ) that depends on a nuisance parameter ψ is tractable when ψ has a known value. In that case it can be used as a test statistic, if suitably constructed, for some hypothesis. However, in practice, ψ often needs to be replaced by an estimator S_n. In this paper general results are given concerning the asymptotic distribution of T_n(S_n) that include special cases previously dealt with. In particular, some situations are covered where the usual likelihood theory is nonregular and extreme values are employed to construct estimators and test statistics.

4.
Jerome H. Friedman and Nicholas I. Fisher
Many data analytic questions can be formulated as (noisy) optimization problems. They explicitly or implicitly involve finding simultaneous combinations of values for a set of (input) variables that imply unusually large (or small) values of another designated (output) variable. Specifically, one seeks a set of subregions of the input variable space within which the value of the output variable is considerably larger (or smaller) than its average value over the entire input domain. In addition, it is usually desired that these regions be describable in an interpretable form involving simple statements (rules) concerning the input values. This paper presents a procedure directed towards this goal based on the notion of patient rule induction. This patient strategy is contrasted with the greedy ones used by most rule induction methods, and the semi-greedy ones used by some partitioning tree techniques such as CART. Applications involving scientific and commercial databases are presented.

5.
The K principal points of a p-variate random variable X are defined as those points ξ_1, ..., ξ_K which minimize the expected squared distance of X from the nearest of the ξ_k. This paper reviews some of the theory of principal points and presents a method of determining principal points of univariate continuous distributions. The method is applied to the uniform distribution, to the normal distribution and to the exponential distribution.
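For the normal distribution the defining minimization can be carried out with a simple fixed-point scheme. A minimal sketch, assuming a standard normal X and a Lloyd-type iteration (not necessarily the paper's method): each point moves to the conditional mean of its Voronoi cell, whose boundaries are the midpoints of adjacent points.

```python
import numpy as np
from scipy.stats import norm

def normal_principal_points(K, iters=500):
    # Fixed-point iteration: cell boundaries are midpoints of adjacent points;
    # each point is updated to E[X | X in its cell] for X ~ N(0, 1), using
    # E[X | a < X < b] = (pdf(a) - pdf(b)) / (cdf(b) - cdf(a)).
    xi = np.linspace(-1.0, 1.0, K)
    for _ in range(iters):
        b = np.concatenate(([-np.inf], (xi[:-1] + xi[1:]) / 2, [np.inf]))
        num = norm.pdf(b[:-1]) - norm.pdf(b[1:])
        den = norm.cdf(b[1:]) - norm.cdf(b[:-1])
        xi = num / den
    return xi

print(normal_principal_points(2))   # approx [-0.7979, 0.7979]
```

For K = 2 this recovers the known principal points ±(2/π)^{1/2} of the standard normal.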

6.
In a regression or classification setting where we wish to predict Y from x1, x2, ..., xp, we suppose that an additional set of coaching variables z1, z2, ..., zm is available in our training sample. These might be variables that are difficult to measure, and they will not be available when we predict Y from x1, x2, ..., xp in the future. We consider two methods of making use of the coaching variables in order to improve the prediction of Y from x1, x2, ..., xp. The relative merits of these approaches are discussed and compared in a number of examples.

7.
Multi-layer perceptrons (MLPs), a common type of artificial neural networks (ANNs), are widely used in computer science and engineering for object recognition, discrimination and classification, and have more recently found use in process monitoring and control. Training such networks is not a straightforward optimisation problem, and we examine features of these networks which contribute to the optimisation difficulty.

Although the original perceptron, developed in the late 1950s (Rosenblatt 1958, Widrow and Hoff 1960), had a binary output from each node, this was not compatible with back-propagation and similar training methods for the MLP. Hence the output of each node (and the final network output) was made a differentiable function of the network inputs. We reformulate the MLP model with the original perceptron in mind, so that each node in the hidden layers can be considered as a latent (that is, unobserved) Bernoulli random variable. This maintains the property of binary output from the nodes, and with an imposed logistic regression of the hidden layer nodes on the inputs, the expected output of our model is identical to the MLP output with a logistic sigmoid activation function (for the case of one hidden layer).

We examine the usual MLP objective function, the sum of squares, and show its multi-modal form and the corresponding optimisation difficulty. We also construct the likelihood for the reformulated latent variable model and maximise it by standard finite mixture ML methods using an EM algorithm, which provides stable ML estimates from random starting positions without the need for regularisation or cross-validation. Over-fitting of the number of nodes does not affect this stability. This algorithm is closely related to the EM algorithm of Jordan and Jacobs (1994) for the Mixture of Experts model.

We conclude with some general comments on the relation between the MLP and latent variable models.
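The claimed equality of expected outputs is easy to check by simulation in the one-hidden-layer case. A toy sketch, under the simplifying assumption of a linear output node so that the expectation passes through; weights, dimensions and sample size are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1 / (1 + np.exp(-z))

W = rng.normal(size=(3, 4))        # hidden-layer weights: 3 nodes, 4 inputs
v = rng.normal(size=3)             # output-layer weights
x = rng.normal(size=4)             # one input vector

p = sigmoid(W @ x)                 # P(H_j = 1 | x): logistic regression per node
mlp_output = v @ p                 # deterministic MLP with sigmoid hidden layer
H = rng.random((100_000, 3)) < p   # Monte Carlo draws of the latent Bernoullis
print(mlp_output, (H.astype(float) @ v).mean())   # agree up to Monte Carlo error
```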

8.
Hougaard's (1986) bivariate Weibull distribution with positive stable frailties is applied to matched pairs survival data when either or both components of the pair may be censored and covariate vectors may be of arbitrary fixed length. When there is no censoring, we quantify the corresponding gain in Fisher information over a fixed-effects analysis. With the appropriate parameterization, the results take a simple algebraic form. An alternative marginal (independence working model) approach to estimation is also considered. This method ignores the correlation between the two survival times in the derivation of the estimator, but provides a valid estimate of standard error. It is shown that when both the correlation between the two survival times and the ratio of the within-pair variability to the between-pair variability of the covariates are high, the fixed-effects analysis captures most of the information about the regression coefficient but the independence working model does badly. When the correlation is low, and/or most of the variability of the covariates occurs between pairs, the reverse is true. The random effects model is applied to data on skin grafts, and on loss of visual acuity among diabetics. In conclusion, some extensions of the methods are indicated and placed in the wider context of Generalized Estimating Equation methodology.

9.
In the exponential regression model, Bayesian inference concerning the non-linear regression parameter γ has proved extremely difficult. In particular, standard improper diffuse priors for the usual parameters lead to an improper posterior for the non-linear regression parameter. In a recent paper Ye and Berger (1991) applied the reference prior approach of Bernardo (1979) and Berger and Bernardo (1989), yielding a proper informative prior for γ. This prior depends on the values of the explanatory variable, goes to 0 as γ goes to 1, and depends on the specification of a hierarchical ordering of importance of the parameters.

This paper explains the failure of the uniform prior to give a proper posterior: the reason is the appearance of the determinant of the information matrix in the posterior density for γ. We apply the posterior Bayes factor approach of Aitkin (1991) to this problem; in this approach we integrate out nuisance parameters with respect to their conditional posterior density given the parameter of interest. The resulting integrated likelihood for γ requires only the standard diffuse prior for all the parameters, and is unaffected by orderings of importance of the parameters. Computation of the likelihood for γ is extremely simple. The approach is applied to the three examples discussed by Ye and Berger, and the likelihoods are compared with their posterior densities.

10.
In testing product reliability, there is often a critical cutoff level that determines whether a specimen is classified as failed. One consequence is that the number of degradation data collected varies from specimen to specimen. The information of random sample size should be included in the model, and our study shows that it can be influential in estimating model parameters. Two-stage least squares (LS) and maximum modified likelihood (MML) estimation, which both assume fixed sample sizes, are commonly used for estimating parameters in the repeated measurements models typically applied to degradation data. However, the LS estimate is not consistent in the case of random sample sizes. This article derives the likelihood for the random sample size model and suggests using maximum likelihood (ML) for parameter estimation. Our simulation studies show that ML estimates have smaller biases and variances compared to the LS and MML estimates. All estimation methods can be greatly improved if the number of specimens increases from 5 to 10. A data set from a semiconductor application is used to illustrate our methods.

11.
The common approach to analyzing censored data utilizes competing risk models; a class of distributions is first chosen and then the sufficient statistics are identified! An operational Bayesian approach (Barlow 1993) for analyzing censored data would require a somewhat different methodology. In this approach, we first determine potentially observable parameters of interest. We then determine the data summaries (sufficient statistics) for these parameters. Tsai (1994) suggests that the observed sample frequency is sufficient for predicting the population frequency. Invariant probability measures (likelihoods), conditional on the parameters of interest, are then derived based on the principle of sufficiency and the principle of insufficient reason. (Research partially supported by an Army Research Office grant, DAAL03-91-G-0046, to the University of California at Berkeley.)

12.
Over the last few years many studies have been carried out in Italy to identify reliable small area labour force indicators. Considering the rotated sample design of the Italian Labour Force Survey, the aim of this work is to derive a small area estimator which borrows strength from individual temporal correlation, as well as from related areas. Two small area estimators are derived as extensions of an estimation strategy proposed by Fuller (1990) for partial overlap samples. A simulation study is carried out to evaluate the gain in efficiency provided by our solutions. Results obtained for different levels of autocorrelation between repeated measurements on the same outcome and different population settings show that these estimators are always more reliable than the traditional composite one, and in some circumstances they are extremely advantageous. (The present paper is financially supported by the Murst-Cofin (2001) project "L'utilizzo di informazioni di tipo amministrativo nella stima per piccole aree e per sottoinsiemi della popolazione", National Coordinator Prof. Carlo Filippucci.)

13.
Consider a set of points in the plane with Gaussian perturbations about a regular mean configuration in which a Delaunay triangulation of the mean of the process is composed of equilateral triangles of the same size. The points are labelled at random as black or white, with variances of the perturbations possibly dependent on the colour. By investigating triangle subsets (with four sets of possible colour labels for the vertices) in detail, we propose various test statistics based on a Procrustes shape analysis. A simulation study is carried out to investigate the relative merits and the adequacy of the approximations used in the distributional results, as well as a comparison with simulation methods based on nearest-neighbour distances. The methodology is applied to an investigation of regularity in human muscle fibre cross-sections.

14.
The problem of limiting the disclosure of information gathered on a set of companies or individuals (the respondents) is considered, the aim being to provide useful information while preserving the confidentiality of sensitive information. The paper proposes a method which explicitly preserves certain information contained in the data. The data are assumed to consist of two sets of information on each respondent: public data and specific survey data. It is assumed in this paper that both sets of data are liable to be released for a subset of respondents. However, the public data will be altered in some way to preserve confidentiality, whereas the specific survey data are to be disclosed without alteration. The paper proposes a model-based approach to this problem, utilizing the information contained in the sufficient statistics obtained from fitting a model to the public data by conditioning on the survey data. Deterministic and stochastic variants of the method are considered.

15.
A traditional interpolation model is characterized by the choice of regularizer applied to the interpolant, and the choice of noise model. Typically, the regularizer has a single regularization constant α, and the noise model has a single parameter β. The ratio α/β alone is responsible for determining globally all these attributes of the interpolant: its complexity, flexibility, smoothness, characteristic scale length, and characteristic amplitude. We suggest that interpolation models should be able to capture more than just one flavour of simplicity and complexity. We describe Bayesian models in which the interpolant has a smoothness that varies spatially. We emphasize the importance, in practical implementation, of the concept of conditional convexity when designing models with many hyperparameters. We apply the new models to the interpolation of neuronal spike data and demonstrate a substantial improvement in generalization error.

16.
In studies of the fracture toughness of irradiated weld metal, specimens are subjected to an increasing load. The test on any one specimen might be terminated by choice or because the specimen ruptures. Prior to termination, ductile tearing might or might not have occurred. The situation is thus basically one of competing risks, with different types of termination, but there are additional features. The major purpose of statistical analysis is to estimate probabilities concerning the values of toughness and crack length. The analysis has been based on a model developed for the joint survivor function of these quantities.

17.
Convergence assessment techniques for Markov chain Monte Carlo
MCMC methods have effectively revolutionised the field of Bayesian statistics over the past few years. Such methods provide invaluable tools to overcome problems with analytic intractability inherent in adopting the Bayesian approach to statistical modelling.

However, any inference based upon MCMC output relies critically upon the assumption that the Markov chain being simulated has achieved a steady state, or converged. Many techniques have been developed for trying to determine whether or not a particular Markov chain has converged, and this paper aims to review these methods with an emphasis on the mathematics underpinning them, in an attempt to summarise the current state of play for convergence assessment techniques and to motivate directions for future research in this area.
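As a flavour of the methods such a review covers, here is a minimal sketch of one widely used diagnostic, the Gelman-Rubin potential scale reduction factor; the array layout and the synthetic chains are illustrative assumptions.

```python
import numpy as np

def gelman_rubin(chains):
    # `chains` is an (m, n) array: m parallel chains of length n for one scalar.
    m, n = chains.shape
    W = chains.var(axis=1, ddof=1).mean()      # mean within-chain variance
    B = n * chains.mean(axis=1).var(ddof=1)    # between-chain variance
    var_hat = (n - 1) / n * W + B / n          # pooled variance estimate
    return np.sqrt(var_hat / W)                # values near 1 suggest convergence

rng = np.random.default_rng(0)
good = rng.normal(size=(4, 1000))              # four well-mixed chains
bad = good + np.arange(4)[:, None]             # four chains stuck at different levels
print(gelman_rubin(good), gelman_rubin(bad))   # roughly 1.0 vs. much larger than 1
```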

18.
Let X, T, Y be random vectors such that the distribution of Y conditional on covariates partitioned into the vectors X = x and T = t is given by f(y; x, γ), where γ = (θ, λ(t)). Here θ is a parameter vector and λ(t) is a smooth, real-valued function of t. The joint distribution of X and T is assumed to be independent of θ and λ. This semiparametric model is called conditionally parametric because the conditional distribution f(y; x, γ) of Y given X = x, T = t is parameterized by a finite-dimensional parameter γ = (θ, λ(t)). Severini and Wong (1992, Annals of Statistics 20: 1768–1802) show how to estimate θ and λ(·) using generalized profile likelihoods, and they also provide a review of the literature on generalized profile likelihoods. Under specified regularity conditions, they derive an asymptotically efficient estimator of θ and a uniformly consistent estimator of λ(·). The purpose of this paper is to provide a short tutorial for this method of estimation under a likelihood-based model, reviewing results from Stein (1956, Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, University of California Press, Berkeley, pp. 187–196), Severini (1987, Ph.D. thesis, The University of Chicago, Department of Statistics, Chicago, Illinois), and Severini and Wong (op. cit.).

19.
The generalized odds-rate class of regression models for time-to-event data is indexed by a non-negative constant ρ and assumes that g_ρ(S(t|Z)) = α(t) + β′Z, where g_ρ(s) = log(ρ^{-1}(s^{-ρ} - 1)) for ρ > 0, g_0(s) = log(-log s), S(t|Z) is the survival function of the time to event for an individual with q×1 covariate vector Z, β is a q×1 vector of unknown regression parameters, and α(t) is some arbitrary increasing function of t. When ρ = 0, this model is equivalent to the proportional hazards model, and when ρ = 1, it reduces to the proportional odds model. In the presence of right censoring, we construct estimators for β and exp(α(t)) and show that they are consistent and asymptotically normal. In addition, we show that the estimator for β is semiparametric efficient in the sense that it attains the semiparametric variance bound.

20.
Given a set of possible models for variables X and a set of possible parameters for each model, the Bayesian estimate of the probability distribution for X given observed data is obtained by averaging over the possible models and their parameters. An often-used approximation for this estimate is obtained by selecting a single model and averaging over its parameters. The approximation is useful because it is computationally efficient, and because it provides a model that facilitates understanding of the domain. A common criterion for model selection is the posterior probability of the model. Another criterion for model selection, proposed by San Martini and Spezzaferri (1984), is the predictive performance of a model for the next observation to be seen. From the standpoint of domain understanding, both criteria are useful, because one identifies the model that is most likely, whereas the other identifies the model that is the best predictor of the next observation. To highlight the difference, we refer to the posterior-probability and alternative criteria as the scientific criterion (SC) and engineering criterion (EC), respectively. When we are interested in predicting the next observation, the model-averaged estimate is at least as good as that produced by EC, which itself is at least as good as the estimate produced by SC. We show experimentally that, for Bayesian-network models containing discrete variables only, the predictive performance of the model average can be significantly better than those of single models selected by either criterion, and that differences between models selected by the two criteria can be substantial.
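A toy conjugate example, mine rather than the paper's, makes the three quantities concrete: the posterior model probability behind SC, each model's predictive for the next observation (the quantity behind EC), and the model-averaged predictive.

```python
from math import comb

# Two models for n coin flips with k heads, with equal prior probabilities:
# M1 fixes p = 0.5; M2 puts a uniform prior on p.
n, k = 10, 8
ml1 = comb(n, k) * 0.5**n   # marginal likelihood under M1
ml2 = 1 / (n + 1)           # under M2: integral of C(n,k) p^k (1-p)^(n-k) dp
post1 = ml1 / (ml1 + ml2)   # posterior probability of M1 (what SC ranks by)
pred1 = 0.5                 # P(next flip is heads | M1)
pred2 = (k + 1) / (n + 2)   # P(next flip is heads | M2, data): Laplace's rule
averaged = post1 * pred1 + (1 - post1) * pred2   # model-averaged predictive
print(f"P(M1|data)={post1:.3f}, pred(M1)={pred1:.3f}, "
      f"pred(M2)={pred2:.3f}, averaged={averaged:.3f}")
```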
