1.

SAMPLING ERRORS OF GENETIC CORRELATION COEFFICIENTS CALCULATED FROM ANALYSES OF VARIANCE AND COVARIANCE

G. M. Tallis 《Australian & New Zealand Journal of Statistics》1959,1(2):35-43

Summary. In this paper a formula is developed for estimating the sampling variance of a genetic correlation estimated from analyses of variance and covariance. The formula holds provided the heritability estimate of neither character is zero. However, the development assumes a constant number of offspring per sire, k, and the effect of varying values of k is discussed briefly. The efficiency of experiments from which genetic parameters are to be estimated has also been investigated, and optimum values of k are given for various combinations of phenotypic and genetic parameters.
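
As a rough illustration of the quantity whose sampling variance the paper studies, the sketch below computes a genetic correlation from a balanced one-way (sire) analysis of variance and covariance. It assumes the standard balanced-design estimator in which each sire component is (between-sire mean square minus within-sire mean square) / k. The function and argument names are hypothetical, and the paper's variance formula itself is not reproduced here.

```python
from math import sqrt


def sire_component(ms_between, ms_within, k):
    """Between-sire component from a balanced one-way layout with k offspring per sire."""
    return (ms_between - ms_within) / k


def genetic_correlation(msb_x, msw_x, msb_y, msw_y, mpb_xy, mpw_xy, k):
    """Genetic correlation of traits x and y from mean squares and mean products.

    msb_*/msw_* are between- and within-sire mean squares for each trait;
    mpb_xy/mpw_xy are the corresponding mean cross-products.
    """
    var_x = sire_component(msb_x, msw_x, k)
    var_y = sire_component(msb_y, msw_y, k)
    cov_xy = sire_component(mpb_xy, mpw_xy, k)
    # The ratio is undefined when either sire variance component is not positive,
    # which mirrors the abstract's condition that neither heritability estimate is zero.
    if var_x <= 0 or var_y <= 0:
        raise ValueError("sire variance components must be positive")
    return cov_xy / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up mean squares and products, for illustration only.
    print(round(genetic_correlation(10.0, 2.0, 18.0, 2.0, 9.0, 1.0, k=4), 4))
```

Note that k cancels in the ratio for a balanced design, so it affects the precision of the estimate rather than its value.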

2.

Predicting distances using a linear model: the case of varietal distinctness

G. Nuel S. Robin C. P. Baril 《Journal of applied statistics》2001,28(5):607-621

Differences between plant varieties are based on phenotypic observations, which are both space and time consuming. Moreover, the phenotypic data result from the combined effects of genotype and environment. In contrast, molecular data are easier to obtain and give direct access to the genotype. In order to save experimental trials and to concentrate efforts on the relevant comparisons between varieties, the relationship between phenotypic and genetic distances is studied. It appears that the classical genetic distances based on molecular data are not appropriate for predicting phenotypic distances. In the linear model framework, we define a new pseudo genetic distance, which is a prediction of the phenotypic one. The distribution of this distance given the pseudo genetic distance is established. Statistical properties of the predicted distance are derived when the parameters of the model are either given or estimated. We finally apply these results to distinguishing between 144 maize lines. This case study is very satisfactory because the use of anonymous molecular markers (RFLP) leads to saving 29% of the trials with an acceptable error risk. These results need to be confirmed on other varieties and species and would certainly be improved by using genes coding for phenotypic traits.

3.

ON THE COVERAGE PROBABILITY OF CONFIDENCE INTERVALS IN REGRESSION AFTER VARIABLE SELECTION

Paul Kabaila 《Australian & New Zealand Journal of Statistics》2005,47(4):549-562

This paper considers a linear regression model with regression parameter vector β. The parameter of interest is θ= a^Tβ where a is specified. When, as a first step, a data‐based variable selection (e.g. minimum Akaike information criterion) is used to select a model, it is common statistical practice to then carry out inference about θ, using the same data, based on the (false) assumption that the selected model had been provided a priori. The paper considers a confidence interval for θ with nominal coverage 1 ‐ α constructed on this (false) assumption, and calls this the naive 1 ‐ α confidence interval. The minimum coverage probability of this confidence interval can be calculated for simple variable selection procedures involving only a single variable. However, the kinds of variable selection procedures used in practice are typically much more complicated. For the real‐life data presented in this paper, there are 20 variables each of which is to be either included or not, leading to 2²⁰ different models. The coverage probability at any given value of the parameters provides an upper bound on the minimum coverage probability of the naive confidence interval. This paper derives a new Monte Carlo simulation estimator of the coverage probability, which uses conditioning for variance reduction. For these real‐life data, the gain in efficiency of this Monte Carlo simulation due to conditioning ranged from 2 to 6. The paper also presents a simple one‐dimensional search strategy for parameter values at which the coverage probability is relatively small. For these real‐life data, this search leads to parameter values for which the coverage probability of the naive 0.95 confidence interval is 0.79 for variable selection using the Akaike information criterion and 0.70 for variable selection using Bayes information criterion, showing that these confidence intervals are completely inadequate.
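
To make the idea of a naive post-selection interval concrete, here is a minimal Monte Carlo sketch for the simplest case the abstract mentions, selection involving a single variable. It is a toy setup, not the paper's conditioning-based estimator. The assumptions are: known error variance; two full-model coefficient estimates scaled to unit variance with correlation `rho`; the second variable is kept when its squared z-statistic exceeds 2, which is the AIC rule for one variable with known variance. All names are hypothetical.

```python
import random
from statistics import NormalDist


def naive_coverage(beta2, rho, reps=20000, alpha=0.05, cutoff=2 ** 0.5, seed=0):
    """Estimate the coverage of the naive 1 - alpha interval for beta1.

    beta2 is the true coefficient of the variable under selection, in standard-error units.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    beta1 = 0.0  # coverage does not depend on beta1 in this setup
    s = (1 - rho ** 2) ** 0.5
    hits = 0
    for _ in range(reps):
        b2 = beta2 + rng.gauss(0, 1)
        # Draw b1 given b2 from the bivariate normal with correlation rho.
        b1 = beta1 + rho * (b2 - beta2) + s * rng.gauss(0, 1)
        if abs(b2) > cutoff:
            # Variable kept: full-model estimate with unit standard error.
            est, half = b1, z
        else:
            # Variable dropped: restricted estimate, treated as if chosen a priori.
            est, half = b1 - rho * b2, z * s
        hits += abs(est - beta1) <= half
    return hits / reps


if __name__ == "__main__":
    print(naive_coverage(beta2=0.0, rho=0.0))  # selection is harmless when rho = 0
    print(naive_coverage(beta2=2.0, rho=0.9))  # well below the nominal 0.95
```

Scanning `beta2` over a grid for a fixed `rho` gives a crude analogue of the one-dimensional search for parameter values with low coverage.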

4.

A simple method for deriving the confidence regions for the penalized Cox’s model via the minimand perturbation

Chen-Yen Lin 《Communications in Statistics - Theory and Methods》2017,46(10):4791-4808

5.

ESTIMATING THE NUMBER OF VISITS TO THE DOCTOR

Andreas Berzel Gillian Z. Heller Walter Zucchini 《Australian & New Zealand Journal of Statistics》2006,48(2):213-224

The frequency of doctor consultations has direct consequences for health care budgets, yet little statistical analysis of the determinants of doctor visits has been reported. We consider the distribution of the number of visits to the doctor and, in particular, we model its dependence on a number of demographic factors. Examination of the Australian 1995 National Health Survey data reveals that generalized linear Poisson or negative binomial models are inadequate for modelling the mean as a function of covariates, because of excessive zero counts, and a mean‐variance relationship that varies enormously over covariate values. A negative binomial model is used, with parameter values estimated in subgroups according to the discrete combinations of the covariate values. Smoothing splines are then used to smooth and interpolate the parameter values. In effect the mean and the shape parameters are each modelled as (different) functions of gender, age and geographical factors. The estimated regressions for the mean have simple and intuitive interpretations. However, the dependence of the (negative binomial) shape parameter on the covariates is more difficult to interpret and is subject to influence by extreme observations. We illustrate the use of the model by estimating the distribution of the number of doctor consultations in the Statistical Local Area of Ryde, based on population numbers from the 1996 census.

6.

Product-limit survival functions with correlated survival times

Rick L. Williams 《Lifetime data analysis》1995,1(2):171-186

A simple variance estimator for product-limit survival functions is demonstrated for survival times with nested errors. Such data arise whenever survival times are observed within clusters of related observations. Greenwood's formula, which assumes independent observations, is not appropriate in this situation. A robust variance estimator is developed using Taylor series linearized values and the between-cluster variance estimator commonly used in multi-stage sample surveys. A simulation study shows that the between-cluster variance estimator is approximately unbiased and yields confidence intervals that maintain the nominal level for several patterns of correlated survival times. The simulation study also shows that Greenwood's formula underestimates the variance when the survival times are positively correlated within a cluster and yields confidence intervals that are too narrow. Extension to life table methods is also discussed.
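
For reference, the sketch below computes the product-limit (Kaplan-Meier) estimate together with Greenwood's variance, the independence-based baseline that the abstract says is inappropriate for clustered data. It does not implement the paper's between-cluster estimator. The function name and input layout are hypothetical.

```python
from collections import Counter


def kaplan_meier_greenwood(times, events):
    """Return [(t, S(t), Greenwood variance)] at each distinct event time.

    times: observed times; events: 1 for an observed failure, 0 for censoring.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # everyone who leaves the risk set at each time
    at_risk = len(times)
    surv, gw_sum, out = 1.0, 0.0, []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            surv *= 1 - d / at_risk
            if at_risk > d:
                # Greenwood: Var[S(t)] = S(t)^2 * sum of d_i / (n_i * (n_i - d_i))
                gw_sum += d / (at_risk * (at_risk - d))
                var = surv ** 2 * gw_sum
            else:
                # Everyone at risk fails, so S(t) = 0 and the Greenwood term is
                # undefined; this sketch reports 0.0 in that case.
                var = 0.0
            out.append((t, surv, var))
        at_risk -= leaving[t]
    return out


if __name__ == "__main__":
    # Made-up data: the subject at time 2 is censored.
    for row in kaplan_meier_greenwood([1, 2, 3, 4], [1, 0, 1, 1]):
        print(row)
```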

7.

ASYMPTOTIC STABILITY OF THE OSCV SMOOTHING PARAMETER SELECTION

《Communications in Statistics - Theory and Methods》2013,42(10):2033-2044

Smoothing parameter selection by the one-sided cross-validation (OSCV) method is completely automatic in that it does not require the estimation of extra parameters. It also reduces the variability to a level comparable to that of plug-in rules. In this paper we derive analytically the asymptotic variance of the smoothing parameter selected by OSCV. The result shows how the stability depends on the one-sided kernel and indicates the possibility of an optimal one-sided kernel that minimizes the asymptotic variability.

8.

Selecting Closest to Control

Eve Bofinger & Wei Liu 《Australian & New Zealand Journal of Statistics》2001,43(4):421-430

Consider the usual one-way fixed effect analysis of variance model where the populations Π_i (i = 0, 1, ..., k) have independent normal distributions with unknown means and common unknown variance. Let Π₀ be a control population with which the other (treatment) populations are to be compared. The basic problem is to select the treatment that is closest to the control mean. This situation occurs when one of the Π_i must be chosen, regardless of how many are equivalent to the control in the sense of having means sufficiently close. This paper follows the approach of Hsu (1996) and is based on a set of simultaneous confidence intervals. It provides a table of critical values which allows direct implementation of the new inference procedure. The applications given are of the balanced cross-over design type with negligible carry-over effects, for which the results of this paper may be used. One of the applications refers to the selection of a drug, which may not be bioequivalent to a reference formulation but is the closest of those drugs that are readily available to the group of patients considered.

9.

Least significant spacing for ‘one versus the rest’ normal populations

Eve Bofinger 《Communications in Statistics - Theory and Methods》2013,42(5):1697-1716

Consider sample means from k(≥2) normal populations where the variances and sample sizes are equal. The problem is to find the ‘least significant difference’ or ‘spacing’ (LSS) between the two largest means, so that if an observed spacing is larger we have confidence 1 - α that the population with largest sample mean also has the largest population mean.

When the variance is known it is shown that the maximum LSS occurs when k = 2, provided α < .2723. In other words, for any value of k we may use the usual (one-tailed) least significant difference to demonstrate that one population has a population mean greater than (or equal to) the rest.

When the variance is estimated, bounds are obtained for the confidence which indicate that this last result is approximately correct.
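
The usual one-tailed least significant difference referred to in the known-variance case can be written out in a few lines. This sketch assumes the standard form z(1 - α) · σ · sqrt(2/n) for the difference of two sample means with common known standard deviation σ and common sample size n. The function name is hypothetical, and the abstract's condition on the significance level is taken as given rather than checked.

```python
from math import sqrt
from statistics import NormalDist


def one_tailed_lsd(sigma, n, alpha=0.05):
    """One-tailed least significant difference between two sample means.

    sigma: known common standard deviation; n: common sample size per population.
    """
    z = NormalDist().inv_cdf(1 - alpha)  # upper-alpha standard normal quantile
    return z * sigma * sqrt(2 / n)


if __name__ == "__main__":
    # Made-up inputs: an observed spacing between the two largest sample means
    # exceeding this value would be declared significant at level alpha.
    print(round(one_tailed_lsd(sigma=1.0, n=10, alpha=0.05), 4))
```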

10.

Probability density estimation with data missing at random when covariables are present

Qihua Wang 《Journal of statistical planning and inference》2008

This paper addresses the problem of probability density estimation in the presence of covariates when data are missing at random (MAR). The inverse probability weighted method is used to define nonparametric and semiparametric weighted probability density estimators. A regression calibration technique is also used to define an imputed estimator. It is shown that all the estimators are asymptotically normal with the same asymptotic variance as that of the inverse probability weighted estimator with known selection probability function and weights. Also, we establish the mean squared error (MSE) bounds and obtain the MSE convergence rates. A simulation is carried out to assess the proposed estimators in terms of the bias and standard error.

11.

Statistical Analysis for the Inverse Trinomial Distribution

Y. N. Phang S. Z. Sim S. H. Ong 《Communications in Statistics - Simulation and Computation》2013,42(9):2073-2085

This article considers parameter estimation, goodness of fit, likelihood ratio and score tests, and model selection by Akaike information criterion for the inverse trinomial (IT) distribution, a classical one-dimensional random walk distribution. The IT distribution has a cubic variance function of the mean and is a generalization of the negative binomial distribution. Basic distributional properties and expressions for the probability mass function, recurrence formula, moments, and score functions are also presented.

12.

THE EXACT BIAS OF THE LOG-PERIODOGRAM REGRESSION ESTIMATOR

Offer Lieberman 《Econometric Reviews》2001,20(3):369-383

The paper makes two contributions. First, we provide a formula for the exact distribution of the periodogram evaluated at any arbitrary frequency, when the sample is taken from any zero-mean stationary Gaussian process. The inadequacy of the asymptotic distribution is demonstrated through an example in which the observations are generated by a fractional Gaussian noise process. The results are then applied in deriving the exact bias of the log-periodogram regression estimator (Geweke and Porter-Hudak (1983), Robinson (1995)). The formula is computable. Practical bounds on this bias are developed and their arithmetic mean is shown to be accurate and useful.

13.

THE EXACT BIAS OF THE LOG-PERIODOGRAM REGRESSION ESTIMATOR

《Econometric Reviews》2013,32(3):369-383

The paper makes two contributions. First, we provide a formula for the exact distribution of the periodogram evaluated at any arbitrary frequency, when the sample is taken from any zero-mean stationary Gaussian process. The inadequacy of the asymptotic distribution is demonstrated through an example in which the observations are generated by a fractional Gaussian noise process. The results are then applied in deriving the exact bias of the log-periodogram regression estimator (Geweke and Porter-Hudak (1983), Robinson (1995)). The formula is computable. Practical bounds on this bias are developed and their arithmetic mean is shown to be accurate and useful.

14.

BAYESIAN SUBSET SELECTION AND MODEL AVERAGING USING A CENTRED AND DISPERSED PRIOR FOR THE ERROR VARIANCE

Edward Cripps Robert Kohn David Nott 《Australian & New Zealand Journal of Statistics》2006,48(2):237-252

This article proposes a new data‐based prior distribution for the error variance in a Gaussian linear regression model, when the model is used for Bayesian variable selection and model averaging. For a given subset of variables in the model, this prior has a mode that is an unbiased estimator of the error variance but is suitably dispersed to make it uninformative relative to the marginal likelihood. The advantage of this empirical Bayes prior for the error variance is that it is centred and dispersed sensibly and avoids the arbitrary specification of hyperparameters. The performance of the new prior is compared to that of a prior proposed previously in the literature using several simulated examples and two loss functions. For each example our paper also reports results for the model that orthogonalizes the predictor variables before performing subset selection. A real example is also investigated. The empirical results suggest that for both the simulated and real data, the performance of the estimators based on the prior proposed in our article compares favourably with that of a prior used previously in the literature.

15.

On selecting from k finite populations the population with the largest α -quantile

Khursheed Alam M. Haseeb Rizvi 《Communications in Statistics - Theory and Methods》2013,42(3):355-362

Several procedures for ranking populations according to the quantile of a given order have been discussed in the literature. These procedures deal with continuous distributions. This paper deals with the problem of selecting a population with the largest α-quantile from k ≥ 2 finite populations, where the size of each population is known. A selection rule is given based on the sample quantiles, where the samples are drawn without replacement. A formula for the minimum probability of a correct selection for the given rule, for a certain configuration of the population α-quantiles, is given in terms of the sample numbers.

16.

A NOTE ON THE HOMOGENETIC ESTIMATE FOR THE VARIANCE OF THE KAPLAN–MEIER ESTIMATE

《Communications in Statistics - Theory and Methods》2013,42(9):1595-1603

ABSTRACT

The Greenwood estimate (GE) is commonly employed for estimating the variance of the Kaplan–Meier estimate (KME) even though it underestimates the variance. To reduce the bias of the GE, Zhao (1996) proposed an alternative, called the homogenetic estimate (HE). In this note, we point out that the HE actually estimates the variance of the reduced sample estimate (RE) and can seriously overestimate that of the KME. We also derive the explicit relationship between the HE and the GE and discuss the use of the HE.

17.

Asymptotic efficiency of model selection criteria: the nonzero mean gaussian ar(∞) case

Alexandros Karagrigoriou 《Communications in Statistics - Theory and Methods》2013,42(4):911-930

Motivated by Shibata’s (1980) asymptotic efficiency results, this paper discusses the asymptotic efficiency of the order selected by a selection procedure for an infinite order autoregressive process with nonzero mean and unobservable errors that constitute a sequence of independent Gaussian random variables with mean zero and variance σ². The asymptotic efficiency is established for AIC-type selection criteria such as AIC’, FPE, and S_n(k). In addition, some asymptotic results about the estimators of the parameters of the process and the error sequence are presented.

18.

DISTRIBUTION OF THE NUMBER OF MARKOVIAN RENEWALS IN AN ARBITRARY INTERVAL

A. M. Kshirsagar Y. P. Gupta 《Australian & New Zealand Journal of Statistics》1970,12(1):58-63

A Markov Renewal Process (M.R.P.) is one which records, at each time t, the number of times a system visits each of m states in time t, if the transitions from state to state are according to a Markov chain and if the time required for each successive move is a random variable whose distribution function (d.f.) depends on the two states between which the move is made. In this paper, the distribution of the number of times each state is visited in an arbitrary interval (t₀, t₀+t) is derived. Asymptotic expressions for the mean and variance of this distribution are also obtained.

19.

Some multiple decision problems in analysis of variance

Shanti S. Gupta Deng-Yuan Huang 《Communications in Statistics - Theory and Methods》2013,42(11):1035-1054

In most practical situations to which the analysis of variance tests are applied, they do not supply the information that the experimenter aims at. If, for example, in one-way ANOVA the hypothesis is rejected in actual application of the F-test, the resulting conclusion that the true means θ₁,…,θ_k are not all equal, would by itself usually be insufficient to satisfy the experimenter. In fact his problems would begin at this stage. The experimenter may desire to select the “best” population or a subset of the “good” populations; he may like to rank the populations in order of “goodness” or he may like to draw some other inferences about the parameters of interest.

The extensive literature on selection and ranking procedures depends heavily on the use of independence between populations (block, treatments, etc.) in the analysis of variance. In practical applications, it is desirable to drop this assumption of independence and consider cases more general than the normal.

In the present paper, we derive a method to construct optimal (in some sense) selection procedures to select a nonempty subset of the k populations containing the best population as ranked in terms of θ_i’s, which control the size of the selected subset and which maximize the minimum average probability of selecting the best. We also consider the usual selection procedures in one-way ANOVA based on the generalized least squares estimates and apply the method to the two-way layout case. Some examples are discussed and some results on comparisons with other procedures are also obtained.

20.

A mathematical model of a population genetics: Effects of genetic variation on homosexuality

Madhu Jain G.C. Sharma Sudheer Kumar Sharma 《Journal of the Korean Statistical Society》2009,38(3):267-276

This study deals with simple mathematical models for the purpose of generating testable predictions in genetics. The genetic effects in a human population arise from the following four factors: (i) paternal, (ii) maternal, (iii) environmental and (iv) idiopathic. For the measurement of genetic characteristics, the environment also plays an important role. In the present paper we analyze this linkage for homosexuality for the first three factors, because the fourth one affects only a very small part of the population. This study provides a genetic and evolutionary basis for generating testable predictions, specifically where the environmental factors affect the characteristics of genes which also influence homosexuality. The effect of environment on homosexuality is highlighted because earlier studies were confined to the paternal and maternal linkages. We consider two types of selection: (i) direct selection with environmental effects and (ii) combined selection with direct, maternal and environmental effects. The objective of this investigation is to provide the conditions for the maintenance of genetic variation of the genotype of male and female. We concentrate on highlighting homosexuality as a result of variation in genes. Numerical results are obtained by way of an illustration. A sensitivity analysis is carried out to explore homosexuality in terms of fitness loss and fitness gain.