Similar Articles
1.
We consider the problem of estimating the proportion θ of true null hypotheses in a multiple testing context. The setup is classically modelled through a semiparametric mixture with two components: a uniform distribution on the interval [0,1] with prior probability θ and a non-parametric density f. We discuss asymptotic efficiency results and establish that two different regimes occur according to whether f vanishes on a non-empty interval or not. In the first case, we exhibit estimators converging at a parametric rate, compute the optimal asymptotic variance and conjecture that no estimator is asymptotically efficient (i.e. attains the optimal asymptotic variance). In the second case, we prove that the quadratic risk of any estimator does not converge at a parametric rate. We illustrate these results on simulated data.
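A simple threshold-based estimator of θ (in the spirit of Storey's λ estimator, not the efficient estimators studied in this abstract) can be sketched as follows; the mixing weights and the shape of the alternative in the toy data are assumptions chosen for illustration:

```python
import random

def estimate_theta(pvals, lam=0.5):
    """Under the two-component mixture, p-values above lam come mostly
    from the uniform (null) component, so #{p_i > lam} / ((1 - lam) * m)
    estimates the null proportion theta."""
    m = len(pvals)
    return sum(p > lam for p in pvals) / ((1.0 - lam) * m)

# Toy data: 80% uniform nulls, 20% alternatives concentrated near 0.
random.seed(0)
pvals = [random.random() if random.random() < 0.8 else random.random() * 0.05
         for _ in range(10000)]
print(round(estimate_theta(pvals), 2))  # close to the true theta = 0.8
```

The estimator is biased upward when the alternative density does not vanish above λ, which is exactly the regime where the abstract shows parametric-rate estimation fails.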

2.
It is generally assumed that, under some regularity conditions, the likelihood ratio statistic for testing the null hypothesis that data arise from a homoscedastic normal mixture distribution against the alternative that they arise from a heteroscedastic normal mixture distribution has an asymptotic χ2 reference distribution, with degrees of freedom equal to the difference in the number of parameters estimated under the alternative and null models. Simulations show that, when the restrictions suggested by Hathaway (Ann. Stat. 13:795–800, 1985) are imposed on the component variances to ensure that the likelihood is bounded under the alternative, the χ2 reference distribution gives a reasonable approximation to the likelihood ratio test only when the sample size is 2000 or more and the mixture components are well separated. For small and medium sample sizes, parametric bootstrap tests appear to work well for determining whether data arise from a normal mixture with equal variances or a normal mixture with unequal variances.
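The parametric bootstrap calibration mentioned above can be sketched generically: fit the null model, simulate datasets from it, and compare the observed statistic with its bootstrap distribution. The toy statistic (absolute sample skewness) and the single-normal null below are illustrative assumptions, not the mixture LRT of the paper:

```python
import random, statistics

def skewness(x):
    m = statistics.fmean(x)
    s = statistics.pstdev(x)
    return sum((v - m) ** 3 for v in x) / (len(x) * s ** 3)

def parametric_bootstrap_pvalue(data, stat, fit_null, simulate, B=500, seed=1):
    """Generic parametric bootstrap: fit the null model, simulate B
    datasets from it, and return (1 + #{T* >= T_obs}) / (B + 1)."""
    rng = random.Random(seed)
    t_obs = stat(data)
    params = fit_null(data)
    exceed = sum(stat(simulate(params, len(data), rng)) >= t_obs
                 for _ in range(B))
    return (1 + exceed) / (B + 1)

# Toy illustration: |skewness| as the statistic, a fitted normal as the null.
random.seed(2)
data = [random.gauss(0.0, 1.0) for _ in range(200)]
fit_null = lambda x: (statistics.fmean(x), statistics.pstdev(x))
simulate = lambda p, n, rng: [rng.gauss(p[0], p[1]) for _ in range(n)]
pval = parametric_bootstrap_pvalue(data, lambda x: abs(skewness(x)),
                                   fit_null, simulate)
print(pval)
```

For the mixture problem in the abstract, `stat` would be the LRT statistic, `fit_null` an EM fit of the equal-variance mixture, and `simulate` a draw from that fitted mixture.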

3.
In this paper, we study the problem of testing whether the density f of a random variable on a sphere belongs to a given parametric class of densities. We propose two test statistics based on the L2 and L1 distances between a non-parametric density estimator adapted to circular data and a smoothed version of the specified density. The asymptotic distribution of the L2 test statistic is provided under the null hypothesis and contiguous alternatives. We also consider a bootstrap method to approximate the distribution of both test statistics. Through a simulation study, we explore the moderate-sample performance of the proposed tests under the null hypothesis and under different alternatives. Finally, the procedure is illustrated by analysing a real data set of wind direction measurements.
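An L2 statistic of this kind can be sketched for circular data with a von Mises kernel density estimator; the kernel choice, concentration κ, and the uniform null density are assumptions for illustration (the paper compares against a smoothed version of the specified density, which is omitted here):

```python
import math, random

def bessel_i0(x):
    # Modified Bessel function I0 via its power series (converges fast here).
    term, total = 1.0, 1.0
    for k in range(1, 40):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total

def vonmises_kde(theta, sample, kappa):
    """Kernel density estimate on the circle with a von Mises kernel."""
    c = 2.0 * math.pi * bessel_i0(kappa)
    return sum(math.exp(kappa * math.cos(theta - x)) for x in sample) / (len(sample) * c)

def l2_statistic(sample, f0, kappa=10.0, grid=400):
    """Riemann-sum approximation of the integrated squared distance
    between the circular KDE and a hypothesised density f0 on [0, 2*pi)."""
    h = 2.0 * math.pi / grid
    return sum((vonmises_kde(i * h, sample, kappa) - f0(i * h)) ** 2
               for i in range(grid)) * h

random.seed(3)
uniform_density = lambda t: 1.0 / (2.0 * math.pi)
sample = [random.uniform(0.0, 2.0 * math.pi) for _ in range(300)]
print(l2_statistic(sample, uniform_density))  # small when the null is true
```

The bootstrap calibration in the abstract would resample from the fitted parametric density and recompute this statistic on each bootstrap sample.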

4.
We revisit the problem of estimating the proportion π of true null hypotheses when a large number of parallel hypothesis tests are performed independently. While the proportion is a quantity of interest in its own right in applications, the problem has arisen in assessing or controlling an overall false discovery rate. On the basis of a Bayes interpretation of the problem, the marginal distribution of the p-value is modelled as a mixture of the uniform distribution (null) and a non-uniform distribution (alternative), so that the parameter π of interest is characterized as the mixing proportion of the uniform component in the mixture. In this article, a nonparametric exponential mixture model is proposed to fit the p-values. As an alternative to the convex decreasing mixture model, the exponential mixture model has the advantages of identifiability, flexibility, and regularity. A computation algorithm is developed. The new approach is applied to a leukemia gene expression data set in which significance tests over 3,051 genes are performed. The new estimate of π for the leukemia data is about 10% lower than three other estimates that are known to be conservative. Simulation results also show that the new estimate is usually lower and has smaller bias than those three estimates.

5.
Given two independent samples of size n and m drawn from univariate distributions with unknown densities f and g, respectively, we are interested in identifying subintervals where the two empirical densities deviate significantly from each other. The solution is built by turning the nonparametric density comparison problem into a comparison of two regression curves. Each regression curve is created by binning the original observations into many small bins, followed by a suitable root transformation of the binned data counts. Once recast as a regression comparison problem, several nonparametric regression procedures for the detection of sparse signals can be applied. Both multiple testing and model selection methods are explored. Furthermore, an approach for estimating larger connected regions where the two empirical densities differ significantly is also derived, based on a scale-space representation. The proposed methods are applied to simulated examples as well as real-life data from biology.

6.
The L1- and L2-errors of the histogram estimate of a density f from a sample X1, X2, …, Xn using a cubic partition are shown to be asymptotically normal without any unnecessary conditions imposed on the density f. The asymptotic variances are shown to depend on f only through the corresponding norm of f. From this follows the asymptotic null distribution of a goodness-of-fit test based on the total variation distance, introduced by Györfi and van der Meulen (1991). This note uses the idea of partial inversion for obtaining characteristic functions of conditional distributions, which goes back at least to Bartlett (1938).
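The L1 distance between a histogram estimate and a hypothesised density, the ingredient of the total-variation goodness-of-fit test above, can be sketched in one dimension; the bin range, bin count, and midpoint integration of f0 are illustrative assumptions:

```python
import math, random

def histogram_l1_distance(sample, f0, a, b, nbins):
    """L1 distance between the histogram density estimate on an
    equal-width partition of [a, b) and a hypothesised density f0,
    with f0 evaluated at each bin midpoint (midpoint rule)."""
    n, width = len(sample), (b - a) / nbins
    counts = [0] * nbins
    for x in sample:
        if a <= x < b:
            counts[min(int((x - a) / width), nbins - 1)] += 1
    dist = 0.0
    for j in range(nbins):
        fhat = counts[j] / (n * width)  # histogram density height in bin j
        mid = a + (j + 0.5) * width
        dist += abs(fhat - f0(mid)) * width
    return dist

random.seed(4)
std_normal = lambda x: math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)
sample = [random.gauss(0.0, 1.0) for _ in range(2000)]
print(histogram_l1_distance(sample, std_normal, -4.0, 4.0, 40))
```

The asymptotic normality result in the abstract is what lets this statistic, suitably centred and scaled, be referred to a normal distribution.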

7.
The existing process capability indices (PCIs) assume that the distribution of the process being investigated is normal. For non-normal distributions, PCIs become unreliable in that they may indicate the process is capable when in fact it is not. In this paper, we propose a new index which can be applied to any distribution. The proposed index, Cf, is directly related to the probability of non-conformance of the process. For a given random sample, the estimation of Cf boils down to estimating non-parametrically the tail probabilities of an unknown distribution. The approach discussed in this paper is based on the work of Pickands (1975) and Smith (1987). We also discuss the construction of bootstrap confidence intervals for Cf based on the accelerated bias-corrected method (BCa). Several simulations are carried out to demonstrate the flexibility and applicability of Cf. Two real-life data sets are analyzed using the proposed index.
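The abstract does not give the exact definition of Cf, so the sketch below works directly with the non-conformance probability it is built on, and uses a plain percentile bootstrap interval (the BCa interval adds bias and acceleration corrections on top of this). Specification limits and the process distribution are illustrative assumptions:

```python
import random

def nonconformance(sample, lsl, usl):
    """Empirical probability that the process falls outside [lsl, usl]."""
    return sum(x < lsl or x > usl for x in sample) / len(sample)

def percentile_ci(sample, stat, B=2000, alpha=0.05, seed=5):
    """Plain percentile bootstrap confidence interval for stat(sample)."""
    rng = random.Random(seed)
    n = len(sample)
    reps = sorted(stat([sample[rng.randrange(n)] for _ in range(n)])
                  for _ in range(B))
    return reps[int((alpha / 2) * B)], reps[int((1 - alpha / 2) * B) - 1]

random.seed(6)
sample = [random.gauss(10.0, 1.0) for _ in range(500)]
p = nonconformance(sample, 8.0, 12.0)
lo, hi = percentile_ci(sample, lambda s: nonconformance(s, 8.0, 12.0))
print(p, (lo, hi))
```

The paper's extreme-value approach (Pickands/Smith) replaces the raw tail counts with a fitted generalized Pareto tail, which matters when non-conformance events are rare.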

8.
This paper derives the Akaike information criterion (AIC), the corrected AIC, the Bayesian information criterion (BIC) and Hannan and Quinn's information criterion for approximate factor models assuming a large number of cross-sectional observations, and studies the consistency properties of these information criteria. It also reports extensive simulation results comparing the performance of the extant and new procedures for selecting the number of factors. The simulation results show the difficulty of determining which criterion performs best. In practice, it is advisable to consider several criteria at the same time, especially Hannan and Quinn's information criterion, Bai and Ng's ICp2 and BIC3, and Onatski's and Ahn and Horenstein's eigenvalue-based criteria. The model-selection criteria considered in this paper are also applied to Stock and Watson's two macroeconomic data sets. The results differ considerably depending on the model-selection criterion in use, but there is evidence suggesting five factors for the first data set and five to seven factors for the second.
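The standard forms of the three penalised-likelihood criteria named above can be sketched as follows; the log-likelihoods and parameter counts in the comparison are hypothetical numbers, not the factor-model-specific penalties derived in the paper:

```python
import math

def info_criteria(loglik, k, n):
    """Standard information criteria (smaller is better):
    AIC = -2 logL + 2k;  BIC = -2 logL + k log n;
    HQ  = -2 logL + 2k log(log n)   (Hannan-Quinn)."""
    return {
        "AIC": -2.0 * loglik + 2.0 * k,
        "BIC": -2.0 * loglik + k * math.log(n),
        "HQ":  -2.0 * loglik + 2.0 * k * math.log(math.log(n)),
    }

# Compare two hypothetical factor models on n = 200 observations:
# the larger model improves the log-likelihood, but the penalties differ.
small = info_criteria(loglik=-520.0, k=5, n=200)
large = info_criteria(loglik=-515.0, k=12, n=200)
print(small["BIC"] < large["BIC"])  # True: the gain does not justify 7 extra parameters
```

The HQ penalty 2k log(log n) sits between AIC's constant 2k and BIC's k log n, which is why the paper singles it out as a useful compromise.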

9.
The Benjamini–Hochberg procedure is widely used in multiple comparisons. Previous power results for this procedure have been based on simulations. This article produces theoretical expressions for expected power. To derive them, we make assumptions about the number of hypotheses being tested, which null hypotheses are true, which are false, and the distributions of the test statistics under each null and alternative. We use these assumptions to derive bounds for multidimensional rejection regions. With these bounds and a permanent-based representation of the joint density function of the largest p-values, we use the law of total probability to derive the distribution of the total number of rejections. We derive the joint distribution of the total number of rejections and the number of rejections of true null hypotheses. We give an analytic expression for the expected power of a false discovery rate procedure that assumes the hypotheses are independent.
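The Benjamini–Hochberg step-up procedure itself is straightforward to state in code; the p-values below are made up for illustration:

```python
def benjamini_hochberg(pvals, q=0.05):
    """BH step-up procedure: sort the p-values, find the largest rank k
    with p_(k) <= k*q/m, and reject the hypotheses with the k smallest
    p-values. Returns the set of rejected indices."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * q / m:
            k = rank  # keep the largest rank that passes
    return set(order[:k])

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(sorted(benjamini_hochberg(pvals, q=0.05)))  # [0, 1]
```

Note the step-up character: a p-value may be rejected even though it exceeds its own threshold, as long as some larger p-value passes.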

10.
The null distribution of Wilks' likelihood-ratio criterion for testing independence of several groups of variables in a multivariate normal population is derived. Percentage points are tabulated for various values of the sample size N and partitions of p, the number of variables. This paper extends Mathai and Katiyar's (1979) "sphericity" results and tables.

11.
Hartigan (1975) defines the number q of clusters in a d-variate statistical population as the number of connected components of the set {f > c}, where f denotes the underlying density function on Rd and c is a given constant. Some common cluster algorithms treat q as an input which must be given in advance. The authors propose a method for estimating this parameter which is based on computing the number of connected components of an estimate of {f > c}. This set estimator is constructed as a union of balls with centres at an appropriate subsample, which is selected via a nonparametric density estimator of f. The asymptotic behaviour of the proposed method is analyzed. A simulation study and an example with real data are also included.
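The ball-union estimator can be sketched in one dimension: keep the sample points where a kernel density estimate exceeds c, cover them with balls, and count connected components of the union. The bandwidth, ball radius, and threshold below are illustrative assumptions, not the paper's data-driven choices:

```python
import math, random
from collections import deque

def kde(x, sample, h):
    """1-D Gaussian kernel density estimate."""
    c = len(sample) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample) / c

def estimate_num_clusters(sample, c, h=0.3, radius=0.3):
    """Estimate the number of connected components of {f > c}: keep the
    subsample where the KDE exceeds c, cover it with balls of the given
    radius, and count components (balls touch iff centres are within
    2*radius), via breadth-first search."""
    pts = [x for x in sample if kde(x, sample, h) > c]
    seen, components = set(), 0
    for i in range(len(pts)):
        if i in seen:
            continue
        components += 1
        queue = deque([i])
        seen.add(i)
        while queue:
            j = queue.popleft()
            for k in range(len(pts)):
                if k not in seen and abs(pts[j] - pts[k]) <= 2.0 * radius:
                    seen.add(k)
                    queue.append(k)
    return components

random.seed(7)
sample = [random.gauss(0.0, 0.5) for _ in range(150)] + \
         [random.gauss(5.0, 0.5) for _ in range(150)]
print(estimate_num_clusters(sample, c=0.05))  # two well-separated modes
```

In d dimensions the only changes are the kernel and the distance used in the connectivity check; the component count is still a graph traversal over the retained subsample.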

12.
The likelihood-ratio test (LRT) is considered as a goodness-of-fit test for the null hypothesis that several distribution functions are uniformly stochastically ordered. Under the null hypothesis, H1 : F1 ≽ F2 ≽ ··· ≽ FN, the asymptotic distribution of the LRT statistic is a convolution of several chi-bar-square distributions, each of which depends upon the location parameter. The least-favourable parameter configuration for the LRT is not unique: it can be of two different types, depending on the number of distributions, the number of intervals and the significance level α. The testing method is illustrated with a data set of survival times of five groups of male fruit flies.

13.
It is shown that the exact null distribution of the likelihood ratio criterion for the sphericity test in the p-variate normal case and the marginal distribution of the first component of a (p − 1)-variate generalized Dirichlet model with a given set of parameters are identical. The exact distribution of the likelihood ratio criterion so obtained has a general form for every p. A novel idea is introduced here through which the complicated exact null distribution of the sphericity test criterion in multivariate statistical analysis is converted into an easily tractable marginal density in a generalized Dirichlet model. This provides a direct and simple method of computing p-values. The computation of p-values and a table of critical points corresponding to p = 3 and 4 are also presented.

14.
The negative binomial (NB) is frequently used to model overdispersed Poisson count data. To study the effect of a continuous covariate of interest in an NB model, a flexible procedure is used to model the covariate effect by fixed-knot cubic basis-splines or B-splines with a second-order difference penalty on the adjacent B-spline coefficients to avoid undersmoothing. A penalized likelihood is used to estimate parameters of the model. A penalized likelihood ratio test statistic is constructed for the null hypothesis of the linearity of the continuous covariate effect. When the number of knots is fixed, its limiting null distribution is the distribution of a linear combination of independent chi-squared random variables, each with one degree of freedom. The smoothing parameter value is determined by setting a specified value equal to the asymptotic expectation of the test statistic under the null hypothesis. The power performance of the proposed test is studied with simulation experiments.

15.
Maclean et al. (1976) applied a specific Box-Cox transformation to test for mixtures of distributions against a single distribution. Their null hypothesis is that a sample of n observations is from a normal distribution with unknown mean and variance after a restricted Box-Cox transformation. The alternative is that the sample is from a mixture of two normal distributions, each with unknown mean and unknown, but equal, variance after another restricted Box-Cox transformation. We developed a computer program that calculated the maximum likelihood estimates (MLEs) and likelihood ratio test (LRT) statistic for the above. Our algorithm for the calculation of the MLEs of the unknown parameters used multiple starting points to protect against convergence to a local rather than global maximum. We then simulated the distribution of the LRT for samples drawn from a normal distribution and five Box-Cox transformations of a normal distribution. The null distribution appeared to be the same for the Box-Cox transformations studied and appeared to be distributed as a chi-square random variable for samples of 25 or more. The degrees of freedom parameter appeared to be a monotonically decreasing function of the sample size. The null distribution of this LRT appeared to converge to a chi-square distribution with 2.5 degrees of freedom. We estimated the critical values for the 0.10, 0.05, and 0.01 levels of significance.

16.
Suppose we have n observations from X = Y + Z, where Z is a noise component with known distribution, and Y has an unknown density f. When the characteristic function of Z is nonzero almost everywhere, we show that it is possible to construct a density estimate fn such that, for all f, limn→∞ ∫ |fn − f| = 0.

17.
Cubic B-splines are used to estimate the nonparametric component of a semiparametric generalized linear model. A penalized log-likelihood ratio test statistic is constructed for the null hypothesis that the nonparametric function is linear. When the number of knots is fixed, its limiting null distribution is the distribution of a linear combination of independent chi-squared random variables, each with one degree of freedom. The smoothing parameter is determined by giving a specified value for its asymptotically expected value under the null hypothesis. A simulation study is conducted to evaluate the power performance of the test; a real-life data set is used to illustrate its practical use.

18.
The asymptotic null distribution of the locally best invariant (LBI) test criterion for testing the random effect in the one-way multivariable analysis of variance model is derived under normality and non-normality. The error of the approximation is characterized as O(1/n). The non-null asymptotic distribution is also discussed. In addition to providing a way of obtaining percentage points and p-values, the results of this paper are useful in assessing the robustness of the LBI criterion. Numerical results are presented to illustrate the accuracy of the approximation.

19.
The estimation of the distribution function of a random variable X measured with error is studied. Let the i-th observation on X be denoted by Yi = Xi + εi, where εi is the measurement error. Let {Yi} (i = 1, 2, …, n) be a sample of independent observations. It is assumed that {Xi} and {εi} are mutually independent and each is identically distributed. As is standard in the literature for this problem, the distribution of ε is assumed known in the development of the methodology. In practice, the measurement error distribution is estimated from replicate observations.

The proposed semiparametric estimator is derived by estimating the quantiles of X on a set of n transformed Y-values and smoothing the estimated quantiles with a spline function. The number of parameters of the spline function is determined from the data by a simple criterion, such as AIC. In a simulation study, the semiparametric estimator dominates an optimal kernel estimator and a normal mixture estimator for a wide class of densities.

The proposed estimator is applied to estimate the distribution function of the mean pH value in a field plot. The density function of the measurement error is estimated from repeated measurements of the pH values in a plot, and is treated as known for the estimation of the distribution function of the mean pH value.

20.
In some long-term studies, a series of dependent and possibly censored failure times may be observed. Suppose that the failure times have a common continuous distribution function F. A popular stochastic measure of the distance between the density function f of the failure times and its kernel estimate fn is the integrated square error (ISE). In this article, we derive a central limit theorem for the integrated square error of kernel density estimators under a censored dependent model.
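The ISE itself is easy to compute for a complete, independent sample; the sketch below uses a Gaussian kernel and numerical integration, and does not show the censoring or dependence handling that the paper's central limit theorem covers. Sample sizes and bandwidths are illustrative assumptions:

```python
import math, random

def gauss_kde(x, sample, h):
    """1-D Gaussian kernel density estimate at x."""
    c = len(sample) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample) / c

def integrated_square_error(sample, f, h, a=-5.0, b=5.0, grid=400):
    """ISE(f_n) = integral of (f_n - f)^2 over [a, b], midpoint rule."""
    step = (b - a) / grid
    return sum((gauss_kde(a + (i + 0.5) * step, sample, h) -
                f(a + (i + 0.5) * step)) ** 2 for i in range(grid)) * step

random.seed(8)
std_normal = lambda x: math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)
ise_small = integrated_square_error([random.gauss(0.0, 1.0) for _ in range(100)],
                                    std_normal, h=0.4)
ise_large = integrated_square_error([random.gauss(0.0, 1.0) for _ in range(2000)],
                                    std_normal, h=0.25)
print(ise_small, ise_large)
```

A central limit theorem for the ISE, as derived in the paper, describes the fluctuations of this quantity around its expectation as n grows.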
