Fisher's A statistic, often called the adjusted R² statistic, is shown to be a close approximation to the maximum likelihood estimate of the multiple correlation coefficient, ρ², based on the marginal distribution of R². Expansions for the estimate are obtained. The same methods lead to maximum marginal likelihood estimators for the noncentrality parameters for noncentral χ² and F.
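
For readers who want the statistic in concrete form, the following is a minimal sketch of the adjusted R² in its common textbook form, 1 − (1 − R²)(n − 1)/(n − p − 1), for n observations and p predictors fitted with an intercept. The function name and this degrees-of-freedom convention are assumptions made here for illustration; the abstract does not state the formula, and the paper may define Fisher's A with a slightly different convention.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for n observations and p predictors (plus an intercept).

    Assumed textbook form: 1 - (1 - R^2) * (n - 1) / (n - p - 1).
    """
    if not 0.0 <= r2 <= 1.0:
        raise ValueError("r2 must lie in [0, 1]")
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 residual degrees of freedom")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


if __name__ == "__main__":
    # Illustrative values only: R^2 = 0.5 from 21 observations and 4 predictors.
    print(adjusted_r2(0.5, n=21, p=4))
```

Unlike R² itself, the adjusted value can be negative when R² is small relative to the number of predictors.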

This paper investigates the roles of partial correlation and conditional correlation as measures of the conditional independence of two random variables. It first establishes a sufficient condition for the coincidence of the partial correlation with the conditional correlation. The condition is satisfied not only for multivariate normal but also for elliptical, multivariate hypergeometric, multivariate negative hypergeometric, multinomial and Dirichlet distributions. Such families of distributions are characterized by a semigroup property as a parametric family of distributions. A necessary and sufficient condition for the coincidence of the partial covariance with the conditional covariance is also derived. However, a known family of multivariate distributions which satisfies this condition cannot be found, except for the multivariate normal. The paper also shows that conditional independence has no close ties with zero partial correlation except in the case of the multivariate normal distribution; it has rather close ties to the zero conditional correlation. It shows that the equivalence between zero conditional covariance and conditional independence for normal variables is retained by any monotone transformation of each variable. The results suggest that care must be taken when using such correlations as measures of conditional independence unless the joint distribution is known to be normal. Otherwise a new concept of conditional independence may need to be introduced in place of conditional independence through zero conditional correlation or other statistics.   
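
To make the notion of partial correlation concrete, the sketch below computes the first-order partial correlation of X and Y given a single variable Z from the three pairwise correlations, using the standard formula (r_xy − r_xz·r_yz)/√((1 − r_xz²)(1 − r_yz²)). The function name is hypothetical, and the restriction to one conditioning variable is a simplification made here, not something taken from the paper.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of X and Y given Z.

    Inputs are the ordinary pairwise correlations; the formula is undefined
    when X or Y is perfectly correlated with Z.
    """
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        raise ValueError("partial correlation undefined when |r_xz| or |r_yz| is 1")
    return (r_xy - r_xz * r_yz) / denom


if __name__ == "__main__":
    # Here the X-Y correlation is entirely accounted for by Z.
    print(partial_correlation(0.25, 0.5, 0.5))
```

As the abstract cautions, a zero value from this formula indicates conditional independence only under joint normality.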

Parity refers to the number of (live) births that a woman (or man) has had. Birth order refers to whether a birth is the first, second, third or higher‐order birth of the parent. In the context of low and shifting fertility, parity and birth‐order statistics are becoming increasingly important for understanding fertility trends and patterns, for policy, and for carrying out projections of future fertility. In Australia, the main sources of demographic data are birth, death and marriage registers, and the five‐yearly national census. Both the birth registers and the census are ideally placed to collect data required to calculate parity and birth‐order statistics. However, not all Australian states and territories collect or code the necessary information in the birth registers, and the parity question 'For each female, how many babies has she ever had?' is only asked every second census; that is, once every 10 years. In this paper, we outline the importance and uses of parity and birth‐order statistics. We discuss the Australian data available at present and their gaps and shortcomings. We then describe the 'gold standard' of parity and birth‐order statistics and how Australia can achieve this standard through some minor changes to the data collection process.

The Department of Mathematical Statistics of the University of Sydney invited Professor Yu. V. Linnik, of Leningrad University, to deliver a public address to be attended by, among others, members of the 36th Congress of the International Statistical Institute. A summary of this address, delivered at 8.30 p.m. on Thursday, August 30, 1967, at the University of Sydney, is given below. Professor Linnik referred to related work of other authors, including Professors C. R. Rao and P. Whittle, who were present. Professor Rao later gave a short account of his recent work. Professor A. T. James moved a vote of thanks, which was carried by acclamation by a large audience. Reprints of the paper are being sent to those members of the International Statistical Institute present at the meeting.—EDITOR.   

A positive random variable X with a finite mean has an induced length-biased law represented by Y, and Y is stochastically larger than X. An independent uniform random contraction of Y, UY, has the same law as X if and only if the latter is exponential. This property is extended to non-uniform contractions and a more general notion of length-biasing. The distributional equality of X and UY leads to a functional equation for the moment function of X, which has either infinitely many solutions or none. When U is constant, X can have a log-normal law, but it can also have laws with the same moment sequence as this log-normal law. The case where U has a certain beta, or generalized beta, law gives rise to characterizations of generalized gamma laws, or to products of independent copies of them. This occurs even when these laws are not determined by their moment sequences.
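
The exponential case can be checked through moments. The sketch below assumes X is exponential with rate 1, so that E[X^k] = k!; the length-biased variable Y has density proportional to y·f(y), so E[Y^k] = E[X^(k+1)]/E[X]; and U is uniform on (0, 1) with E[U^k] = 1/(k + 1). The function names are hypothetical, and matching moments is only a consistency check of the "if" direction, not a proof of the characterization.

```python
from fractions import Fraction
from math import factorial


def exp_moment(k: int) -> Fraction:
    """E[X^k] for X exponential with rate 1 (assumed example law)."""
    return Fraction(factorial(k))


def length_biased_moment(k: int) -> Fraction:
    """E[Y^k] for the length-biased version Y of X: E[X^(k+1)] / E[X]."""
    return exp_moment(k + 1) / exp_moment(1)


def uniform_moment(k: int) -> Fraction:
    """E[U^k] for U uniform on (0, 1)."""
    return Fraction(1, k + 1)


def contracted_moment(k: int) -> Fraction:
    """E[(UY)^k] for independent U and Y."""
    return uniform_moment(k) * length_biased_moment(k)


if __name__ == "__main__":
    for k in range(6):
        print(k, exp_moment(k), contracted_moment(k))
```

The two moment columns printed by the loop agree for every k, and E[Y] = 2 exceeds E[X] = 1, in line with Y being stochastically larger than X.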

A second order process with mean zero and covariance is asymptotically stationary if lim ds exists for every; this limit then defines the covariance function of the process. The paper establishes the spectral representation for the covariance function and a mean ergodic theorem for the process. When stationarity is assumed, the results reduce to the well-known corresponding theorems for stationary processes.   

A non-parametric estimator of a density at a particular quantile is based on sample quantiles. The optimal (to minimize M.S.E.) choice of these quantiles is considered and a method of removing the bias is suggested.   
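
One classical estimator of this kind divides a probability gap 2h by the distance between the sample quantiles at levels p − h and p + h. The sketch below implements that spacing-type form as an assumption; the abstract does not specify the estimator, and the function name, the ceiling-based order-statistic indices, and the bandwidth h are all choices made here for illustration. No bias correction is attempted.

```python
import math
from typing import Sequence


def density_at_quantile(sample: Sequence[float], p: float, h: float) -> float:
    """Spacing-type estimate of the density at the p-th quantile.

    Uses the order statistics at levels p - h and p + h (assumed form):
    estimate = 2h / (x_(upper) - x_(lower)).
    """
    if not (h > 0.0 and 0.0 < p - h and p + h < 1.0):
        raise ValueError("need h > 0 and 0 < p - h < p + h < 1")
    xs = sorted(sample)
    n = len(xs)
    lower = max(math.ceil(n * (p - h)), 1)   # 1-based order-statistic index
    upper = min(math.ceil(n * (p + h)), n)
    spread = xs[upper - 1] - xs[lower - 1]
    if spread <= 0.0:
        raise ValueError("selected sample quantiles coincide; increase h")
    return 2.0 * h / spread


if __name__ == "__main__":
    # Deterministic stand-in for a uniform(0, 1) sample, whose density is 1.
    grid = [i / 1000 for i in range(1, 1000)]
    print(density_at_quantile(grid, p=0.5, h=0.1))
```

The choice of h drives the trade-off between bias and variance, which is the optimization the abstract refers to.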

Statistics as data is ancient, but as a discipline of study and research it has a short history. Courses leading to degrees in statistics were introduced in universities some sixty to seventy years ago. Statistics was not then considered to constitute a basic discipline with a subject matter of its own. However, during the last seventy-five years, it has developed as a powerful blend of science, technology and art for solving problems in all areas of human endeavor. Nowadays statistics is used in scientific research, economic development through optimum use of resources, increasing industrial productivity, medical diagnosis, legal practice, disputed authorship, and optimum decision making at individual and institutional levels. What is the future of statistics in the coming millennium dominated by information technology encompassing the whole of communications, interaction with intelligent systems, massive databases, and complex information processing networks? The current statistical methodology based on probabilistic models applied to small data sets appears to be inadequate to meet the needs of society in terms of quick processing of data and making the information available for practical purposes. Ad hoc methods are being put forward under the title Data Mining by computer scientists and engineers to meet the needs of customers. The paper reviews the current state of the art in statistics and discusses possible future developments considering the availability of large data sets, enormous computing power and efficient optimization techniques using genetic algorithms and neural networks.


In this paper we assume that in a random sample of size n drawn from a population having the pdf f(x; θ) the smallest r₁ observations and the largest r₂ observations are censored (r₁ ≥ 0, r₂ ≥ 0). We consider the problem of estimating θ on the basis of the middle n − r₁ − r₂ observations when either f(x; θ) = θ⁻¹ f(x/θ) or f(x; θ) = (aθ)⁻¹ f((x − θ)/(aθ)), where f(·) is a known pdf, a (> 0) is known and θ (> 0) is unknown. The minimum mean square error (MSE) linear estimator of θ proposed in this paper is a "shrinkage" of the minimum variance linear unbiased estimator of θ. We obtain explicit expressions of these estimators and their mean square errors when (i) f(·) is the uniform pdf defined on an interval of length one and (ii) f(·) is the standard exponential pdf, i.e., f(x) = exp(−x), x ≥ 0. Various special cases of censoring from the left (right) and no censoring are considered.
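
The shrinkage idea can be illustrated with a short calculation. Suppose an unbiased estimator of θ has variance vθ², where v does not depend on θ, as happens for linear estimators in a scale family. Then the multiple c times that estimator has relative MSE (MSE divided by θ²) equal to c²v + (c − 1)², which is minimized at c = 1/(1 + v). The sketch below encodes this; the function names are hypothetical, and the worked value v = 1/n corresponds to the sample mean of an uncensored standard exponential sample, an example chosen here rather than taken from the paper.

```python
from fractions import Fraction


def shrinkage_factor(v):
    """Multiplier c that minimizes the relative MSE c^2 * v + (c - 1)^2."""
    return 1 / (1 + v)


def relative_mse(c, v):
    """MSE / theta^2 of c times an unbiased estimator with variance v * theta^2."""
    return c * c * v + (c - 1) ** 2


if __name__ == "__main__":
    n = 9                      # illustrative sample size, no censoring
    v = Fraction(1, n)         # relative variance of the sample mean (exponential)
    c = shrinkage_factor(v)
    print("shrinkage factor:", c)
    print("relative MSE, unbiased:", relative_mse(1, v))
    print("relative MSE, shrunk:", relative_mse(c, v))
```

In this example the shrunken estimator is the sample total divided by n + 1 rather than n, and its relative MSE falls from 1/n to 1/(n + 1).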

This paper reviews some interesting but scattered results that are known about correlation in bivariate Poisson distributions and processes and presents some new results. A particular concern in both contexts is with results and examples relating to negative correlation.   

Recent research has extended standard methods for meta‐analysis to more general forms of evidence synthesis, where the aim is to combine different data types or data summaries that contain information about functions of multiple parameters to make inferences about the parameters of interest. We consider one such scenario in which the goal is to make inferences about the association between a primary binary exposure and continuously valued outcome in the context of several confounding exposures, and where the data are available in various different forms: individual participant data (IPD) with repeated measures, sample means that have been aggregated over strata, and binary data generated by thresholding the underlying continuously valued outcome measure. We show that an estimator of the population mean of a continuously valued outcome can be constructed using binary threshold data provided that a separate estimate of the outcome standard deviation is available. The results of a simulation study show that this estimator has negligible bias but is less efficient than the sample mean – the minimum variance ratio is based on a Taylor series expansion. Combining this estimator with sample means and IPD from different sources (such as a series of published studies) using both linear and probit regression does, however, improve the precision of estimation considerably by incorporating data that would otherwise have been excluded for being in the wrong format. We apply these methods to investigate the association between the G277S mutation in the transferrin gene and serum ferritin (iron) levels separately in pre‐ and post‐menopausal women based on data from three published studies.   
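
A minimal sketch of how a mean might be recovered from thresholded data follows. It assumes the outcome is normally distributed with mean μ and standard deviation σ, so that the proportion p exceeding a threshold c satisfies p = Φ((μ − c)/σ) and hence μ = c + σ·Φ⁻¹(p). The normality assumption, the function name, and the argument names are illustrative assumptions made here; the paper's estimator may differ in detail.

```python
from statistics import NormalDist


def mean_from_threshold(n_above: int, n_total: int,
                        threshold: float, sd: float) -> float:
    """Estimate a normal mean from counts above a threshold and a known SD.

    Assumes a normal outcome; requires 0 < n_above < n_total so that the
    inverse normal CDF is finite.
    """
    if n_total <= 0 or not 0 < n_above < n_total:
        raise ValueError("need 0 < n_above < n_total")
    if sd <= 0:
        raise ValueError("sd must be positive")
    p_above = n_above / n_total
    return threshold + sd * NormalDist().inv_cdf(p_above)


if __name__ == "__main__":
    # Illustrative counts only: 30 of 200 participants exceed a threshold of 10.
    print(mean_from_threshold(30, 200, threshold=10.0, sd=2.0))
```

Because only a proportion is used, the estimate depends heavily on the separate estimate of the standard deviation, which is consistent with the loss of efficiency relative to the sample mean noted in the abstract.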

We present an explicit characterization of the joint dependency structure of an n×p matrix normal random matrix such that the p-dimensional sample mean vector is independent of all translation invariant statistics.   

A limiting expression is derived for the tail of the distribution of the maximum of a set of product moment correlation coefficients. The technique used is quite general and may be applied to non-normal observations as well as to rank correlation coefficients. The result obtained for the latter leads to a test procedure for multiple comparisons of these non-parametric measures of dependence.   

