Galton's first work on regression probably led him to think of it as a unidirectional, genetic process, which he called “reversion.” A subsequent experiment on family heights made him realize that the phenomenon was symmetric and nongenetic. Galton then abandoned “reversion” in favor of “regression.” Final confirmation was provided through Dickson's mathematical analysis and Galton's examination of height data on brothers.

The tabled significance values of the Kolmogorov-Smirnov goodness-of-fit statistic determined for continuous underlying distributions are conservative for applications involving discrete underlying distributions. Conover (1972) proposed an efficient method for computing the exact significance level of the Kolmogorov-Smirnov test for discrete distributions; however, he warned against its use for large sample sizes because “the calculations become too difficult.”

In this work we explore the relationship between sample size and the computational effectiveness of Conover's formulas, where “computational effectiveness” is taken to mean the accuracy attained with a fixed precision of machine arithmetic. The nature of the difficulties in calculations is pointed out. It is indicated that, despite these difficulties, Conover's method of computing the Kolmogorov-Smirnov significance level for discrete distributions can still be a useful tool for a wide range of sample sizes.

The change from the z of “Student's” 1908 paper to the t of present-day statistical theory and practice is traced and documented. It is shown that the change was brought about by R.A. Fisher's extension of “Student's” approach to a broader class of problems, in response to a direct appeal from “Student” for a solution to one of these problems.

The promising methodology of “Statistical Learning Theory” for the estimation of multimodal distributions is thoroughly studied. The “tail” is estimated through Hill's, UH and moment methods. The threshold value is determined by nonparametric bootstrap and the minimum mean square error criterion. Further, the “body” is estimated by the nonparametric structural risk minimization method of the empirical distribution function under the regression set-up. As an illustration, rainfall data for the meteorological subdivision of Orissa, India during the period 1871–2006 are used. It is shown that Hill's method performs best for the tail density. Finally, the combined estimated “body” and “tail” of the multimodal distribution is shown to capture the multimodality present in the data.
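
As a rough illustration of the tail step, the sketch below implements the textbook form of Hill's estimator for the tail index. It is not the paper's procedure: the function name `hill_estimator` is hypothetical, and the number of upper order statistics `k` is passed in by hand, whereas the abstract selects the threshold by bootstrap and a minimum mean square error criterion.

```python
import math

def hill_estimator(sample, k):
    """Hill estimate of the tail index xi = 1/alpha.

    Uses the k largest observations relative to the (k+1)-th largest,
    which acts as the threshold. `k` is chosen by the caller (assumption).
    """
    xs = sorted(sample, reverse=True)      # descending order statistics
    if not 1 <= k < len(xs):
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    threshold = xs[k]                      # (k+1)-th largest value
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    # Mean log-excess of the k largest values over the threshold.
    return sum(math.log(xs[i] / threshold) for i in range(k)) / k
```

A larger estimate indicates a heavier tail; in practice the estimate is usually inspected over a range of `k` values before one is trusted.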

A regular supply of applicants to Queen's University in Kingston, Ontario is provided by 65 high schools. Each high school can be characterized by a series of grading standards which change from year to year. To aid admissions decisions, it is desirable to forecast the current year's grading standards for all 65 high schools using grading standards estimated from past years' data. We develop and apply a Bayesian break-point time-series model that generates forecasts which involve smoothing across time for each school and smoothing across schools. “Break point” refers to a point in time which divides the past into the “old past” and the “recent past” where the yearly observations in the recent past are exchangeable with the observations in the year to be forecast. We show that this model works fairly well when applied to 11 years of Queen's University data. The model can be applied to other data sets with the parallel time-series structure and short history, and can be extended in several ways to more complicated structures.

This paper extends the result of Padgett (1981) and gives a Bayes estimate of the reliability function of the two-parameter inverse Gaussian distribution using Jeffreys' non-informative joint prior and a squared error loss function. A numerical example is given. Based on a Monte Carlo simulation, the Bayes estimator of reliability is compared with its maximum likelihood counterpart.

The test of variance components of possibly correlated random effects in generalized linear mixed models (GLMMs) can be used to examine whether heterogeneous effects exist. The Bayesian test with Bayes factors offers a flexible method. In this article, we focus on the performance of Bayesian tests under three reference priors and a conjugate prior: an approximate uniform shrinkage prior, a modified approximate Jeffreys' prior, a half-normal unit information prior and a Wishart prior. To compute Bayes factors, we propose a hybrid approximation approach combining a simulated version of Laplace's method and importance sampling techniques to test the variance components in GLMMs.

The confidence interval of the Kaplan–Meier estimate of the survival probability at a fixed time point is often constructed by the Greenwood formula. This normal approximation-based method can be viewed as a Wald-type confidence interval for a binomial proportion, the survival probability, using the “effective” sample size defined by Cutler and Ederer. The Wald-type binomial confidence interval has been shown to perform poorly compared with other methods. We choose three methods of binomial confidence intervals for the construction of confidence intervals for the survival probability: Wilson's method, Agresti–Coull's method, and the higher-order asymptotic likelihood method. The methods of “effective” sample size proposed by Peto et al. and Dorey and Korn are also considered. The Greenwood formula is far from satisfactory, while confidence intervals based on the three methods of binomial proportion using Cutler and Ederer's “effective” sample size have much better performance.
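
To make the contrast between the Wald-type and Wilson intervals concrete, here is a minimal sketch of both for a proportion estimated on an “effective” sample size. The function names and the `n_eff` argument are hypothetical. The effective sample size is simply passed in, and no particular definition (Cutler and Ederer, Peto et al., or Dorey and Korn) is implemented.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def wald_interval(p_hat, n_eff, z=Z_95):
    """Wald (normal approximation) interval; may leave [0, 1] or collapse to a point."""
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n_eff)
    return p_hat - half, p_hat + half

def wilson_interval(p_hat, n_eff, z=Z_95):
    """Wilson score interval for a proportion p_hat on n_eff (effective) trials."""
    if n_eff <= 0:
        raise ValueError("n_eff must be positive")
    z2 = z * z
    denom = 1.0 + z2 / n_eff
    center = (p_hat + z2 / (2.0 * n_eff)) / denom
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n_eff
                         + z2 / (4.0 * n_eff ** 2)) / denom
    # Clip only to guard against floating-point overshoot at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)
```

When the estimated survival probability is 1 (no events yet), the Wald interval has zero width, while the Wilson interval still has a lower limit below 1. This is one way to see why the Wald-type construction behaves poorly near the boundary.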

The two-parameter lognormal distribution with density function $f(y; \gamma, \sigma^2) = [(2\pi\sigma^2)^{1/2} y]^{-1} \exp[-(\ln y - \gamma)^2 / (2\sigma^2)]$, $y > 0$, is important as a failure-time model in life testing. In this paper, Bayesian lower bounds for the reliability function $R(t; \gamma, \sigma^2) = \Phi[(\gamma - \ln t)/\sigma]$ are obtained for two cases. First, it is assumed that $\gamma$ is known and $\sigma^2$ has either an inverted gamma or “general uniform” prior distribution. Then, for the case that both $\gamma$ and $\sigma^2$ are unknown, the normal-gamma prior and Jeffreys' vague prior are considered. Some Monte Carlo simulations are given to indicate some of the properties of the Bayesian lower bounds.
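
For orientation, the lognormal reliability function can be evaluated directly. The sketch below assumes it has the standard form Φ[(γ − ln t)/σ], with Φ the standard normal distribution function. The function names are hypothetical, and the code covers only the plug-in reliability for known parameters, not the Bayesian lower bounds derived in the paper.

```python
import math

def std_normal_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def lognormal_reliability(t, gamma, sigma):
    """R(t) = P(Y > t) when ln Y ~ Normal(gamma, sigma^2)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if t <= 0:
        return 1.0  # a lognormal lifetime is positive with probability one
    return std_normal_cdf((gamma - math.log(t)) / sigma)
```

At the median lifetime, t = exp(γ), the argument of Φ is zero and the reliability is one half, which gives a quick sanity check.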


This work deals with the problem of Bayesian estimation of the transition probabilities associated with a multistate Markov chain. The model is based on Jeffreys' noninformative prior. The Bayesian estimator is approximated by means of MCMC techniques. A simulation study is carried out to compare the Bayesian estimator with the maximum likelihood estimator.

Noninformative priors are used for estimating the reliability of a stress-strength system. Several reference priors (cf. Berger and Bernardo 1989, 1992) are derived. A class of priors is found by matching the coverage probabilities of one-sided Bayesian credible intervals with the corresponding frequentist coverage probabilities. It turns out that none of the reference priors is a matching prior. Sufficient conditions for propriety of posteriors under reference priors and matching priors are provided. A simple matching prior is compared with three reference priors when sample sizes are small. The study shows that the matching prior performs better than Jeffreys's prior and reference priors in meeting the target coverage probabilities.

In a recent paper Yager (1988) introduced a special type of aggregation operators, which he called OWA-operators, and applied them to multicriteria decision making. In our paper we give a more general definition of such operators and investigate their structural properties in some detail. In particular we introduce a well-motivated ordering relation on weight systems which is different from, but compatible with, Yager's degree of “orness”. Our results have immediate applications to the theory of generalized quantifiers, as also discussed by Yager. Details concerning this topic will appear in a future paper.
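
For readers unfamiliar with the operators, the sketch below shows the standard OWA aggregation and the usual degree of “orness” of a weight vector. The function names are hypothetical, and the code covers only the basic operators, not the more general definition or the ordering relation introduced in the paper.

```python
def owa(values, weights):
    """Ordered weighted average: weights apply to values sorted in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))

def orness(weights):
    """Degree of orness: 1 for the max operator, 0 for the min operator."""
    n = len(weights)
    if n < 2:
        raise ValueError("orness needs at least two weights")
    return sum((n - i) * w for i, w in enumerate(weights, start=1)) / (n - 1)
```

Putting all weight on the first position recovers the maximum (orness 1), putting it on the last position recovers the minimum (orness 0), and equal weights give the arithmetic mean (orness 0.5).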

The author shows how geostatistical data that contain measurement errors can be analyzed objectively by a Bayesian approach using Gaussian random fields. He proposes a reference prior and two versions of Jeffreys' prior for the model parameters. He studies the propriety and the existence of moments for the resulting posteriors. He also establishes the existence of the mean and variance of the predictive distributions based on these default priors. His reference prior derives from a representation of the integrated likelihood that is particularly convenient for computation and analysis. He further shows that these default priors are not very sensitive to some aspects of the design and model, and that they have good frequentist properties. Finally, he uses a data set of carbon/nitrogen ratios from an agricultural field to illustrate his approach.

We derive best-possible bounds on the class of copulas with known values at several points, under the assumption that the points are either in “increasing order” or in “decreasing order”. These bounds may be used to establish best-possible bounds on Kendall's τ and Spearman's ρ, for such copulas. An important special case is when the values of a copula are known at several diagonal points. We also use our results to establish best-possible bounds on the distribution function of the sum of two random variables with known marginal distributions when the values of the joint distribution function are known at several points.

Katti [1] and Mehta and Srinivasan [2] considered the problem of estimating the mean μ of a population when prior information about μ is available. In [1], this prior information is assumed to take the form of a “guessed estimate”, and a two-stage estimator based on two samples is presented. In [2], a “shrinkage” estimator based on a single sample and a “prior value” is considered. Following the techniques presented in these two articles, we suggest estimators of the parameter when the data come from a binomial or Poisson population.

A new approach to inference on variance components is propounded. This approach not only gives a new justification for Fisher's fiducial distribution for the “between classes” component, but also leads to a new distribution for the “within classes” component. This latter distribution is studied, and has some intuitively very reasonable properties. Numerical results are given.

The Akaike Information Criterion (AIC) is developed for selecting the variables of the nested error regression model where an unobservable random effect is present. Using the idea of decomposing the likelihood into two parts of “within” and “between” analysis of variance, we derive the AIC when the number of groups is large and the ratio of the variances of the random effects and the random errors is an unknown parameter. The proposed AIC is compared, using simulation, with Mallows' $C_p$, Akaike's AIC, and Sugiura's exact AIC. Based on the rates of selecting the true model, it is shown that the proposed AIC performs better.

The gist of the quickest change-point detection problem is to detect the presence of a change in the statistical behavior of a series of sequentially made observations, and do so in an optimal detection-speed-versus-“false-positive”-risk manner. When optimality is understood either in the generalized Bayesian sense or as defined in Shiryaev's multi-cyclic setup, the so-called Shiryaev–Roberts (SR) detection procedure is known to be the “best one can do”, provided, however, that the observations’ pre- and post-change distributions are both fully specified. We consider a more realistic setup, viz. one where the post-change distribution is assumed known only up to a parameter, so that the latter may be misspecified. The question of interest is the sensitivity (or robustness) of the otherwise “best” SR procedure with respect to a possible misspecification of the post-change distribution parameter. To answer this question, we provide a case study where, in a specific Gaussian scenario, we allow the SR procedure to be “out of tune” in the way of the post-change distribution parameter, and numerically assess the effect of the “mistuning” on Shiryaev's (multi-cyclic) Stationary Average Detection Delay delivered by the SR procedure. The comprehensive quantitative robustness characterization of the SR procedure obtained in the study can be used to develop the respective theory as well as to provide a rationale for practical design of the SR procedure. The overall qualitative conclusion of the study is an expected one: the SR procedure is less (more) robust for less (more) contrast changes and for lower (higher) levels of the false alarm risk.
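
A minimal sketch of the SR recursion may help fix ideas. It assumes, purely for illustration, unit-variance Gaussian observations whose mean shifts from 0 to some value after the change; the abstract does not spell out its Gaussian scenario. The function name and parameters (`theta_assumed`, `threshold`) are hypothetical. “Mistuning” corresponds to running the procedure with a `theta_assumed` that differs from the true post-change mean.

```python
import math

def shiryaev_roberts_stopping_time(observations, theta_assumed, threshold):
    """Run the SR recursion R_n = (1 + R_{n-1}) * LR_n with R_0 = 0.

    LR_n is the likelihood ratio of N(theta_assumed, 1) against N(0, 1)
    at the n-th observation (illustrative assumption). Returns the first n
    with R_n >= threshold, or None if no alarm is raised.
    """
    r = 0.0
    for n, x in enumerate(observations, start=1):
        lr = math.exp(theta_assumed * x - 0.5 * theta_assumed ** 2)
        r = (1.0 + r) * lr
        if r >= threshold:
            return n
    return None
```

Raising `threshold` lowers the false alarm risk at the cost of a longer detection delay. Feeding the same data through the function with several values of `theta_assumed` gives a crude view of the sensitivity the study quantifies.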

Hill stated that “An interesting open problem is to determine which common distributions (or mixtures thereof) satisfy Benford's law …”. This article quantifies compliance with Benford's law for several popular survival distributions. The traditional analysis of Benford's law considers its applicability to datasets. This article switches the emphasis to probability distributions that obey Benford's law.
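
To illustrate what compliance of a distribution with Benford's law can mean, the sketch below compares the Benford first-digit probabilities, log10(1 + 1/d), with the exact first-digit probabilities of an exponential distribution. The exponential is chosen here only as a familiar survival distribution; the function names, the default rate, and the truncation range for the decades are assumptions, and the abstract does not state which compliance measure the article uses.

```python
import math

def benford_probabilities():
    """Benford's law for the first significant digit d = 1..9."""
    return {d: math.log10(1.0 + 1.0 / d) for d in range(1, 10)}

def exponential_first_digit_probabilities(rate=1.0, k_min=-20, k_max=5):
    """P(first significant digit = d) for X ~ Exponential(rate).

    Sums P(d * 10^k <= X < (d + 1) * 10^k) over decades k_min..k_max;
    the probability mass outside that range is negligible for rate near 1.
    """
    probs = {}
    for d in range(1, 10):
        total = 0.0
        for k in range(k_min, k_max + 1):
            scale = 10.0 ** k
            total += math.exp(-rate * d * scale) - math.exp(-rate * (d + 1) * scale)
        probs[d] = total
    return probs

def max_abs_deviation(p, q):
    """Largest absolute gap between two first-digit distributions."""
    return max(abs(p[d] - q[d]) for d in range(1, 10))
```

Calling `max_abs_deviation(benford_probabilities(), exponential_first_digit_probabilities())` gives one simple summary of how far the exponential distribution is from exact compliance.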

Modified chi-squared and some newly developed tests for the Poisson, binomial, and an approximated Feller's distribution are discussed. Rutherford's classical experimental data on alpha decay are reanalyzed. Previous analyses of the data were not correct from the point of view of the theory of statistical testing. The tests used show that the data contradict both the Poisson and the binomial distribution and do not contradict a precise “binomial” approximation of Feller's distribution that takes into account a counter's dead time. This gives a plausible, statistically correct confirmation of the well-established exponential law of radioactive decay.

