Similar documents
20 similar documents found.
1.
In the analysis of variance, the efficiency of a statistical procedure depends on the values of some function, g, chosen appropriately. For example, g can be the variance of an estimator of a variance component, or the power of a test associated with a given model. Of particular interest in this regard is the behavior of g under a variety of experimental conditions. For unbalanced data, g depends in a complex way on the cell frequencies (design) and on a set of parameters denoted by θ. It is therefore quite difficult, in general, to determine how imbalance affects the value of g. The degree of imbalance of a data set can be quantified by the value of a measure, denoted by φ, as shown in Khuri (Biometrical J. 29 (1987), 383–396). Studying the behavior of the function g under different patterns of imbalance would be greatly simplified if g were directly related to φ. Unfortunately, the dependence of g on the cell frequencies is not, in general, expressible explicitly in terms of φ. However, if the functional form of g can be adequately approximated by an empirical model given in terms of φ and θ, then it is possible to use φ to assess the effect of imbalance on g. The derivation of such a model can be conveniently achieved by means of response surface techniques. This is demonstrated using, as an example of g, the variance of an analysis of variance (ANOVA) estimator. A key element in the development of this demonstration is an algorithm for generating cell frequencies having a specified degree of imbalance. A numerical example is given to illustrate the proposed methodology.
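The abstract does not reproduce Khuri's measure or the generating algorithm, so the following is only a rough sketch: a common ratio-type imbalance measure (whether it coincides with Khuri's φ is an assumption) that equals 1 for balanced data, plus a naive random search for cell frequencies with a prescribed degree of imbalance. The names `imbalance_phi` and `generate_frequencies` are hypothetical.

```python
import numpy as np

def imbalance_phi(n):
    """Ratio-type imbalance measure for cell frequencies n_1..n_v:
    (sum n_i)^2 / (v * sum n_i^2). Equals 1 for balanced data and
    decreases toward 0 as imbalance grows."""
    n = np.asarray(n, dtype=float)
    return n.sum() ** 2 / (n.size * (n ** 2).sum())

def generate_frequencies(v, total, target_phi, rng, tol=0.005, max_iter=100_000):
    """Naive search: split `total` units into v cells at random cut points
    and keep the split whose imbalance is closest to the target."""
    best, best_gap = None, np.inf
    for _ in range(max_iter):
        cuts = np.sort(rng.choice(np.arange(1, total), size=v - 1, replace=False))
        n = np.diff(np.concatenate(([0], cuts, [total])))
        gap = abs(imbalance_phi(n) - target_phi)
        if gap < best_gap:
            best, best_gap = n, gap
        if best_gap < tol:
            break
    return best

rng = np.random.default_rng(1)
n = generate_frequencies(v=5, total=50, target_phi=0.8, rng=rng)
print(n, imbalance_phi(n))
```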

2.
We study the problem of classification with multiple q-variate observations, with and without a time effect on each individual. We develop new classification rules for populations with certain structured and unstructured mean vectors and under certain covariance structures. The new classification rules are effective when the number of observations is not large enough to estimate the variance–covariance matrix. Computational schemes for maximum likelihood estimates of the required population parameters are given. We apply our findings to two real data sets as well as to a simulated data set.

3.
This paper extends the ideas in Giommi (Proc. 45th Session of the Internat. Statistical Institute, Vol. 2 (1985) 577–578; Techniques d'enquête 13(2) (1987) 137–144) and in Särndal and Swenson (Bull. Int. Statist. Inst. 15(2) (1985) 1–16; Int. Statist. Rev. 55 (1987) 279–294). Given the parallel between 'three-phase sampling' and 'sampling with subsequent unit and item nonresponse', we apply results from three-phase sampling theory to the nonresponse situation. To handle the practical problem of unknown selection distributions at the second and third phases (the response mechanisms) in the nonresponse case, we use two approaches to response-probability estimation: the response homogeneity groups (RHG) model (Särndal and Swenson, 1985, 1987) and nonparametric estimation (Giommi, 1985, 1987). To motivate the three-phase selection, imputation procedures for item nonresponse are combined with the RHG model for unit nonresponse. By means of a Monte Carlo study, we find that, under both approaches to response-probability estimation, the regression-type estimators are the most precise of those studied: they have lower bias, mean squared error and variance, their variance estimator is close to the true variance, and their achieved coverage rates are closer to the nominal levels. The simulation study also shows how poorly the variance estimators perform under the single-imputation approach currently used to handle missing values.
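A minimal sketch of the RHG idea only (not the paper's three-phase estimators): the response probability is assumed constant within each group, estimated by the group response rate, and its inverse inflates the respondents' design weights. The function name and toy data are illustrative.

```python
import numpy as np

def rhg_weights(groups, responded, design_weights):
    """Response homogeneity groups: estimate each group's response
    probability by its response rate and apply the inverse-probability
    adjustment to the responding units' design weights."""
    groups = np.asarray(groups)
    responded = np.asarray(responded, dtype=bool)
    w = np.asarray(design_weights, dtype=float).copy()
    for g in np.unique(groups):
        idx = groups == g
        p_hat = responded[idx].mean()   # estimated response probability
        w[idx & responded] /= p_hat     # inflate respondents' weights
    return w[responded]

# toy example: two groups with unequal response rates
groups = np.array([0, 0, 0, 0, 1, 1, 1, 1])
responded = np.array([1, 1, 1, 0, 1, 0, 0, 1], dtype=bool)
print(rhg_weights(groups, responded, np.ones(8)))
# respondents in the low-response group receive the larger weights
```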

4.
In this paper, order statistics from independent and non-identically distributed random variables are used to obtain an ordered ranked set sample (ORSS). Bayesian inference for the unknown parameters of the Pareto distribution under a squared error loss function is developed. We compute the minimum posterior expected loss (the posterior risk) of the derived estimates and compare them with those based on a corresponding simple random sample (SRS) to assess their efficiency. Two-sample Bayesian prediction of future observations is introduced using SRS and ORSS for one and m cycles. A simulation study and real data are used to illustrate the proposed results.
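A sketch under strong simplifying assumptions: the Pareto scale x_m is known, the shape α has a conjugate Gamma(a, b) prior, and the i.i.d. posterior mean (the squared-error-loss estimate) is applied to both samples. The paper works with the joint density of the ordered ranked set sample, which this deliberately ignores, so treat the comparison as illustrative only.

```python
import numpy as np

rng = np.random.default_rng(7)

def pareto_rvs(alpha, xm, size):
    # inverse-CDF sampling from Pareto(alpha, xm)
    return xm * (1.0 - rng.random(size)) ** (-1.0 / alpha)

def ranked_set_sample(alpha, xm, set_size, cycles):
    """RSS with perfect ranking: in each cycle, the i-th set of `set_size`
    draws contributes its i-th order statistic, i = 1..set_size."""
    return np.array([np.sort(pareto_rvs(alpha, xm, set_size))[i]
                     for _ in range(cycles) for i in range(set_size)])

def posterior_mean_alpha(x, xm, a=2.0, b=1.0):
    """For an i.i.d. Pareto sample with xm known, the Gamma(a, b) prior on
    alpha is conjugate: the posterior is Gamma(a + n, b + sum log(x/xm)),
    and its mean is the Bayes estimate under squared error loss."""
    t = np.log(np.asarray(x) / xm).sum()
    return (a + len(x)) / (b + t)

srs = pareto_rvs(2.5, 1.0, 15)
rss = ranked_set_sample(2.5, 1.0, set_size=3, cycles=5)  # also 15 quantified units
print(posterior_mean_alpha(srs, 1.0), posterior_mean_alpha(rss, 1.0))
```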

5.
Maclean et al. (1976) applied a specific Box-Cox transformation to test for mixtures of distributions against a single distribution. Their null hypothesis is that a sample of n observations comes from a normal distribution with unknown mean and variance after a restricted Box-Cox transformation. The alternative is that the sample comes from a mixture of two normal distributions, each with unknown mean and unknown, but equal, variance after another restricted Box-Cox transformation. We developed a computer program that calculates the maximum likelihood estimates (MLEs) and the likelihood ratio test (LRT) statistic for this problem. Our algorithm for calculating the MLEs of the unknown parameters uses multiple starting points to protect against convergence to a local rather than global maximum. We then simulated the distribution of the LRT for samples drawn from a normal distribution and from five Box-Cox transformations of a normal distribution. The null distribution appeared to be the same for all the Box-Cox transformations studied and, for samples of size 25 or more, appeared to follow a chi-square distribution. The degrees-of-freedom parameter appeared to be a monotonically decreasing function of the sample size, and the null distribution of this LRT appeared to converge to a chi-square distribution with 2.5 degrees of freedom. We estimated the critical values for the 0.10, 0.05, and 0.01 levels of significance.
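A compact EM-based sketch of the likelihood ratio statistic for a single normal versus an equal-variance two-component mixture, with the multiple random starting points the abstract describes; the Box-Cox transformation step is omitted and all names are illustrative.

```python
import numpy as np
from scipy.stats import norm

def loglik_single(x):
    return norm.logpdf(x, x.mean(), x.std()).sum()

def em_mixture(x, mu1, mu2, n_iter=200):
    """EM for a two-component normal mixture with common variance."""
    p, s = 0.5, x.std()
    for _ in range(n_iter):
        d1 = p * norm.pdf(x, mu1, s)
        d2 = (1 - p) * norm.pdf(x, mu2, s)
        r = d1 / (d1 + d2)                       # E-step: responsibilities
        p = r.mean()                             # M-step updates
        mu1 = (r * x).sum() / r.sum()
        mu2 = ((1 - r) * x).sum() / (1 - r).sum()
        s = np.sqrt((r * (x - mu1) ** 2 + (1 - r) * (x - mu2) ** 2).mean())
    return np.log(p * norm.pdf(x, mu1, s) + (1 - p) * norm.pdf(x, mu2, s)).sum()

def lrt_statistic(x, n_starts=10, seed=0):
    """Multiple random starts protect against local maxima."""
    rng = np.random.default_rng(seed)
    best = max(em_mixture(x, *np.sort(rng.choice(x, 2, replace=False)))
               for _ in range(n_starts))
    return 2 * (best - loglik_single(x))

x = np.random.default_rng(3).normal(size=100)
print(lrt_statistic(x))  # under H0, roughly chi-square with about 2.5 df per the abstract
```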

6.
A family of robust estimators is proposed for the coefficients of a Gaussian AR(p) time series subject to two simultaneous types of distortion: outliers and missing values. The estimators are based on special properties of the Cauchy probability distribution; their consistency and asymptotic normality are proven. An approximate solution to the problem of minimizing the asymptotic variance within the proposed family of estimators is found. The performance of the proposed estimators is illustrated on simulated time series and on real data sets.

7.
In this paper, we construct a new ranked set sampling protocol that maximizes the Pitman asymptotic efficiency of the signed rank test. The new sampling design is a function of the set size and of independent order statistics. If the set size is odd and the underlying distribution is symmetric and unimodal, the new protocol quantifies only the middle observation of each set; if the set size is even, it quantifies the two middle observations. This data collection procedure for the signed rank test outperforms standard ranked set sampling. We show that, for odd set sizes, the exact null distribution of the signed rank statistic W+_RSS based on data generated by the new design is the same as the null distribution of the simple random sample signed rank statistic W+_SRS based on the same number of measured observations. For even set sizes, the exact null distribution of W+_RSS is simulated.
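A sketch of the odd-set-size protocol: each set contributes only its middle order statistic, and the usual signed rank statistic W+ is computed from the quantified values. Perfect ranking and the helper names are assumptions.

```python
import numpy as np
from scipy.stats import rankdata

def median_rss(draw, set_size, n_sets):
    """For odd set sizes, quantify only the middle order statistic
    of each set of `set_size` draws."""
    mid = set_size // 2
    return np.array([np.sort(draw(set_size))[mid] for _ in range(n_sets)])

def signed_rank_statistic(x):
    """W+ = sum of the ranks of |x_i| over the positive observations."""
    ranks = rankdata(np.abs(x))
    return ranks[x > 0].sum()

rng = np.random.default_rng(5)
sample = median_rss(lambda m: rng.normal(size=m), set_size=5, n_sets=20)
print(signed_rank_statistic(sample))
```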

8.
In this article we study a linear discriminant function for multiple m-variate observations taken at u sites and over v time points, under the assumption of multivariate normality. We assume that the m-variate observations have a separable mean vector structure and a “jointly equicorrelated covariance” structure. The new discriminant function is very effective in discriminating individuals in a small-sample scenario. No closed-form expression exists for the maximum likelihood estimates of the unknown population parameters, and their direct computation is nontrivial; an iterative algorithm is proposed to calculate them. A discriminant function is also developed for unstructured mean vectors. The new discriminant functions are applied to simulated data sets as well as to a real data set, with results illustrating the benefits of the new classification methods over the traditional one.

9.
The well-known chi-squared goodness-of-fit test for a multinomial distribution is generally biased when the observations are subject to misclassification. In Pardo and Zografos (2000) the problem was considered using a double sampling scheme and φ-divergence test statistics. A new problem appears if the null hypothesis is not simple, because estimators must then be given for the unknown parameters. In this paper the minimum φ-divergence estimators are considered and some of their properties are established. The proposed φ-divergence test statistics are obtained by calculating φ-divergences between probability density functions and replacing the parameters by their minimum φ-divergence estimators in the derived expressions. Asymptotic distributions of the new test statistics are also obtained. The testing procedure is illustrated with an example.
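A hedged illustration of minimum φ-divergence estimation in a plain multinomial, without the misclassification and double-sampling features of the paper: the Cressie-Read power divergence is one standard φ-divergence family, minimizing it over θ gives the estimator, and plugging that estimator back in gives the test statistic.

```python
import numpy as np
from math import comb
from scipy.optimize import minimize_scalar

def power_divergence(p_hat, p_model, lam=2/3):
    """Cressie-Read power divergence: lam = 1 gives Pearson's chi-square,
    lam -> 0 the likelihood-ratio statistic."""
    return 2.0 / (lam * (lam + 1)) * (p_hat * ((p_hat / p_model) ** lam - 1)).sum()

def model(theta):
    # toy composite null: Binomial(3, theta) cell probabilities
    return np.array([comb(3, k) * theta**k * (1 - theta)**(3 - k) for k in range(4)])

counts = np.array([20, 35, 30, 15])
p_hat = counts / counts.sum()

# minimum phi-divergence estimator of theta
theta_hat = minimize_scalar(lambda t: power_divergence(p_hat, model(t)),
                            bounds=(1e-6, 1 - 1e-6), method="bounded").x

# test statistic with the estimator plugged in; in this clean setting it is
# asymptotically chi-square with (cells - 1 - #parameters) degrees of freedom
stat = counts.sum() * power_divergence(p_hat, model(theta_hat))
print(theta_hat, stat)
```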

10.
The usual definition of R² (variance of the predicted values divided by the variance of the data) has a problem for Bayesian fits, as the numerator can be larger than the denominator. We propose an alternative definition, similar to one that has appeared in the survival analysis literature: the variance of the predicted values divided by the variance of the predicted values plus the expected variance of the errors.
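A direct sketch of the proposed definition, computed draw by draw from a posterior sample; the array shapes and the toy posterior are assumptions.

```python
import numpy as np

def bayes_r2(y_pred_draws, sigma2_draws):
    """Per posterior draw s: R2_s = Var(y_pred_s) / (Var(y_pred_s) + sigma2_s).
    Adding the expected error variance in the denominator (instead of using
    the data variance) keeps the ratio below 1."""
    var_fit = y_pred_draws.var(axis=1)
    return var_fit / (var_fit + sigma2_draws)

# toy posterior: 1000 draws of the predicted values for 50 observations
rng = np.random.default_rng(0)
y_pred_draws = rng.normal(loc=np.linspace(0, 3, 50), scale=0.1, size=(1000, 50))
sigma2_draws = rng.gamma(shape=10.0, scale=0.1, size=1000)
r2 = bayes_r2(y_pred_draws, sigma2_draws)
print(r2.mean(), np.percentile(r2, [2.5, 97.5]))
```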

11.
In this article, we investigate techniques for constructing tolerance limits such that, with probability γ, at least a proportion p of the population exceeds that limit. We consider the unbalanced case and study the behavior of the limit as a function of the n_i (where n_i is the number of observations in the i-th batch), as well as of the variance ratio. To construct the tolerance limits we use the approximation given in Thomas and Hultquist (1978). We also discuss the procedure for constructing the tolerance limits when the variance ratio is unknown. An example is given to illustrate the results.
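For orientation, the classical one-sample one-sided normal tolerance limit (not the Thomas and Hultquist variance-components approximation used in the paper) is a noncentral-t computation:

```python
import numpy as np
from scipy.stats import nct, norm

def lower_tolerance_limit(x, p=0.9, gamma=0.95):
    """With confidence gamma, at least a proportion p of a normal
    population exceeds the returned limit xbar - k*s, where k comes
    from a noncentral-t quantile."""
    n = len(x)
    delta = norm.ppf(p) * np.sqrt(n)             # noncentrality parameter
    k = nct.ppf(gamma, df=n - 1, nc=delta) / np.sqrt(n)
    return np.mean(x) - k * np.std(x, ddof=1)

x = np.random.default_rng(2).normal(10, 2, size=30)
print(lower_tolerance_limit(x))
```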

12.
A confidence interval for the generalized variance of a matrix normal distribution with unknown mean is constructed which improves, in terms of both coverage probability and size, on the usual minimum-size (i.e., minimum length or minimum ratio of endpoints) interval based on the sample generalized variance alone. The method is similar to the univariate case treated by Goutis and Casella (Ann. Statist. 19 (1991) 2015–2031).

13.
In the presence of non-normality, we consider testing the significance of the variance components in the unbalanced two-way random model without interaction. The approximate test is based on the F-statistic for this model. The asymptotic distribution of the F-statistic is derived as the number of treatments tends to infinity while the number of observations per treatment in any block takes values from a finite set of positive integers. The robustness of the approximate test is also studied.

14.
Many applications require efficient sampling from Gaussian distributions. The method of choice depends on the dimension of the problem as well as on the structure of the covariance matrix (Σ) or precision matrix (Q). The most common black-box routine for computing a sample is based on Cholesky factorization. In high dimensions, computing the Cholesky factor of Σ or Q may be prohibitive, because the factor accumulates more non-zero entries than can be stored in memory. We compare different methods for computing the samples iteratively, adapting ideas from numerical linear algebra. These methods assume that matrix-vector products Qv are fast to compute. We show that some of the methods are competitive with, and faster than, Cholesky sampling, and that a parallel version of one method on a graphics processing unit (GPU) using CUDA can give a speed-up of up to 30x. Moreover, one method is used to sample from the posterior distribution of petroleum reservoir parameters in a North Sea field, given seismic reflection data on a large 3D grid.
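As a baseline sketch of the black-box routine the abstract benchmarks against: with a precision matrix Q = LLᵀ, solving Lᵀx = z for z ~ N(0, I) yields x ~ N(0, Q⁻¹). The iterative, matrix-free samplers that need only products Qv are beyond this snippet, and the tridiagonal toy Q is an assumption.

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def sample_from_precision(Q, rng):
    """Cholesky sampler: Q = L L^T and L^T x = z with z ~ N(0, I)
    give x ~ N(0, Q^{-1})."""
    L = cholesky(Q, lower=True)
    z = rng.standard_normal(Q.shape[0])
    return solve_triangular(L.T, z, lower=False)

rng = np.random.default_rng(42)
n = 500
# toy precision: tridiagonal random-walk structure, shifted to be
# strictly positive definite
Q = (np.diag(np.full(n, 2.0))
     + np.diag(np.full(n - 1, -1.0), 1)
     + np.diag(np.full(n - 1, -1.0), -1)
     + 0.1 * np.eye(n))
print(sample_from_precision(Q, rng)[:5])
```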

15.
The problem of approximating an interval null, or imprecise, hypothesis test by a point null, or precise, hypothesis test under a Bayesian framework is considered. In the literature, some methods for solving this problem have used the Bayes factor for testing a point null and justified it as an approximation to the interval null. However, many authors recommend evaluating tests through the posterior odds, a Bayesian measure of evidence against the null hypothesis. It is therefore of interest to determine whether similar results hold when the posterior odds are used as the primary measure of evidence. For the prior distributions under which the approximation holds with respect to the Bayes factor, it is shown that the posterior odds for testing the point null hypothesis do not approximate the posterior odds for testing the interval null hypothesis. In fact, to obtain convergence of the posterior odds, a number of restrictive conditions must be placed on the prior structure. Furthermore, under a non-symmetric prior setup, neither the Bayes factor nor the posterior odds for testing the imprecise hypothesis converges to its counterpart for testing the precise hypothesis; to rectify this, constraints must again be placed on the priors. In both situations, the classes of priors constructed to ensure convergence of the posterior odds are not practically useful, thus questioning, from a Bayesian perspective, the appropriateness of point null testing in a problem better represented by an interval null. The theory developed is also applied to an epidemiological data set from White et al. (Can. Veterinary J. 30 (1989) 147–149) in order to illustrate and study priors for which the point null hypothesis test approximates the interval null hypothesis test. AMS Classification: Primary 62F15; Secondary 62A15.
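A numerical sketch of the comparison being studied, in the simplest normal-mean setting with known variance and a N(0, τ²) prior (all numbers illustrative): the point-null Bayes factor versus the interval-null Bayes factor obtained by truncating the same prior to each hypothesis.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

n, xbar = 25, 0.18     # sample size and observed mean (known variance 1)
eps, tau = 0.1, 1.0    # interval-null half-width; prior sd of mu

def lik(mu):
    # likelihood of the sufficient statistic xbar given mu
    return norm.pdf(xbar, loc=mu, scale=1 / np.sqrt(n))

g = lambda m: lik(m) * norm.pdf(m, 0, tau)

# point null H0: mu = 0 versus mu ~ N(0, tau^2)
bf_point = lik(0.0) / quad(g, -np.inf, np.inf)[0]

# interval null H0: |mu| <= eps, same prior truncated to each hypothesis
p0 = norm.cdf(eps, 0, tau) - norm.cdf(-eps, 0, tau)
m0 = quad(g, -eps, eps)[0] / p0
m1 = (quad(g, -np.inf, -eps)[0] + quad(g, eps, np.inf)[0]) / (1 - p0)
bf_interval = m0 / m1

print(bf_point, bf_interval)  # close when eps is small relative to 1/sqrt(n)
```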

16.
In this paper, we consider a Bayesian mixture model in which the weights of the mixture can be integrated out, yielding a procedure in which the number of clusters is an unknown quantity. To determine the clusters and estimate the parameters of interest, we develop an MCMC algorithm termed the sequential data-driven allocation sampler. In this algorithm, a single observation has a non-null probability of creating a new cluster, and a set of observations may create a new cluster through split-merge moves. The split-merge moves use a sequential allocation procedure based on allocation probabilities calculated from the Kullback–Leibler divergence between the posterior distribution given the observations previously allocated and the posterior distribution including a ‘new’ observation. We verify the performance of the proposed algorithm on simulated data and then illustrate its use on three publicly available real data sets.

17.
The problem considered in this paper is the unbiased estimation of the variance of an exponential distribution using a ranked set sample (RSS). We propose several unbiased estimators, each of which is better than both the non-parametric minimum variance quadratic unbiased estimator based on a balanced ranked set sample and the uniformly minimum variance unbiased estimator based on a simple random sample (SRS) of the same size. The relative performances of the proposed estimators, and a few of their other properties, including robustness under imperfect ranking, are also studied.
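The paper's unbiased estimators are bespoke, so the Monte Carlo below only illustrates the setting: it compares the plain sample variance from a balanced ranked set sample (which is generally biased here, hence the paper's corrections) with the SRS sample variance for exponential data.

```python
import numpy as np

rng = np.random.default_rng(9)

def balanced_rss_exp(scale, set_size, cycles):
    """Balanced RSS with perfect ranking: the i-th set of each cycle
    contributes its i-th order statistic."""
    return np.array([np.sort(rng.exponential(scale, set_size))[i]
                     for _ in range(cycles) for i in range(set_size)])

reps, k, m, scale = 5000, 4, 5, 2.0        # k*m = 20 quantified units each
srs_var = [np.var(rng.exponential(scale, k * m), ddof=1) for _ in range(reps)]
rss_var = [np.var(balanced_rss_exp(scale, k, m), ddof=1) for _ in range(reps)]
print("SRS:", np.mean(srs_var), np.var(srs_var))  # mean near scale^2 = 4
print("RSS:", np.mean(rss_var), np.var(rss_var))  # generally biased; the paper
                                                  # constructs unbiased versions
```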

18.
In sampling from a continuous distribution with unknown mean μ and variance σ², we consider the problem of estimating μ when it is known that μ ∈ (a, ∞) (or μ ∈ (−∞, b)). The estimators proposed here lie in the interval (a, ∞) (or (−∞, b)) almost surely. The performance of these estimators is compared with that of some known estimators in the case of sampling from a normal distribution, an exponential distribution, and a weighted difference of two independent chi-square distributions.

19.
This paper considers the problem of devising a single-stage procedure for selecting the treatment combination associated with the largest interaction in a two-factor r × c experiment having independent normally distributed observations with common known variance. The intuitive procedure based on the best linear unbiased estimators of the population interactions is employed. Initially, an indifference zone formulation is used; the problem of determining the least favorable configuration is reduced to a nonlinear programming problem with a log-concave objective function and a convex polytope as the feasible region. A solution technique is introduced in the context of an illustrative example. The problem is also considered using a preferred population formulation; this approach requires a strengthened version of the indifference zone probability requirement. It is shown that the same sample size guarantees this requirement as guarantees the earlier one.

20.
We consider the bandit problem with an infinite number of Bernoulli arms, whose unknown parameters are assumed to be i.i.d. random variables with a common distribution F. Our goal is to construct optimal strategies for choosing “arms” so that the expected long-run failure rate is minimized. We first review a class of strategies and establish their asymptotic properties when F is known. Based on these results, we propose a new strategy and prove that it is asymptotically optimal when F is unknown. Finally, we show that the proposed strategy performs well in a number of simulation scenarios.
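Not the paper's proposed strategy, but a classic baseline from this literature that the reviewed class resembles: play the current arm until its first failure, then discard it and draw a fresh arm. Taking F = Uniform(0, 1) is an assumption for the demo.

```python
import numpy as np

def run_until_failure(horizon, rng):
    """Infinite-armed Bernoulli bandit: keep the current arm while it
    succeeds; on a failure, draw a new arm with parameter ~ F = U(0, 1)."""
    failures = 0
    p = rng.random()                 # parameter of the first arm
    for _ in range(horizon):
        if rng.random() >= p:        # failure with probability 1 - p
            failures += 1
            p = rng.random()         # switch to a brand-new arm
    return failures / horizon        # realized long-run failure rate

rng = np.random.default_rng(11)
rates = [run_until_failure(10_000, rng) for _ in range(20)]
print(np.mean(rates))  # well below the 0.5 achieved by ignoring feedback
```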

