1.

Lower bounds of the wrap-around L₂-discrepancy and relationships between MLHD and uniform design with a large size

Yong-Dao Zhou Jian-Hui Ning 《Journal of statistical planning and inference》2008,138(8):2330-2339

The wrap-around (WD) L₂-discrepancy has been commonly used in experimental designs. In this paper, some lower bounds of the WD L₂-discrepancy for asymmetrical U-type designs are given and the expectation and variance of midpoint Latin hypercube designs (LHD) are also obtained. Relationships between midpoint LHD and uniform designs for symmetrical and asymmetrical cases are discussed in the sense of comparing the lower bound and the expectation of squared wrap-around L₂-discrepancy of U-type designs. Some comparisons between simple random sampling and the lower bounds of U-type designs are given.
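
As a rough illustration of the quantity discussed in this abstract, the sketch below evaluates the squared wrap-around L₂-discrepancy using the standard closed form WD²(P) = −(4/3)^s + n⁻² Σ_i Σ_j Π_k [3/2 − |x_ik − x_jk|(1 − |x_ik − x_jk|)]. The formula is the usual textbook expression and is not quoted from the paper; the function names `wd2` and `midpoint_lhd_1d` are hypothetical.

```python
from itertools import product


def wd2(points):
    """Squared wrap-around L2-discrepancy of n points in [0, 1)^s.

    `points` is a list of equal-length tuples of coordinates.
    """
    n = len(points)
    s = len(points[0])
    total = 0.0
    # Double sum over all ordered pairs of points, including i == j.
    for x, y in product(points, repeat=2):
        term = 1.0
        for xk, yk in zip(x, y):
            d = abs(xk - yk)
            term *= 1.5 - d * (1.0 - d)
        total += term
    return -(4.0 / 3.0) ** s + total / n ** 2


def midpoint_lhd_1d(n):
    """One-factor midpoint design: levels (2i + 1) / (2n), i = 0..n-1."""
    return [((2 * i + 1) / (2 * n),) for i in range(n)]
```

For a single point in one dimension the expression reduces to −4/3 + 3/2 = 1/6, which gives a quick sanity check on the implementation.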

2.

VARIANCE ESTIMATION IN TWO-PHASE SAMPLING

M.A. Hidiroglou J.N.K. Rao David Haziza 《Australian & New Zealand Journal of Statistics》2009,51(2):127-141

Two-phase sampling is often used for estimating a population total or mean when the cost per unit of collecting auxiliary variables, x, is much smaller than the cost per unit of measuring a characteristic of interest, y. In the first phase, a large sample s₁ is drawn according to a specific sampling design p(s₁), and auxiliary data x are observed for the units i ∈ s₁. Given the first-phase sample s₁, a second-phase sample s₂ is selected from s₁ according to a specified sampling design {p(s₂∣s₁)}, and (y, x) is observed for the units i ∈ s₂. In some cases, the population totals of some components of x may also be known. Two-phase sampling is used for stratification at the second phase or both phases and for regression estimation. Horvitz–Thompson-type variance estimators are used for variance estimation. However, the Horvitz–Thompson (Horvitz & Thompson, J. Amer. Statist. Assoc. 1952) variance estimator in uni-phase sampling is known to be highly unstable and may take negative values when the units are selected with unequal probabilities. On the other hand, the Sen–Yates–Grundy variance estimator is relatively stable and non-negative for several unequal probability sampling designs with fixed sample sizes. In this paper, we extend the Sen–Yates–Grundy (Sen, J. Ind. Soc. Agric. Statist. 1953; Yates & Grundy, J. Roy. Statist. Soc. Ser. B 1953) variance estimator to two-phase sampling, assuming fixed first-phase sample size and fixed second-phase sample size given the first-phase sample. We apply the new variance estimators to two-phase sampling designs with stratification at the second phase or both phases. We also develop Sen–Yates–Grundy-type variance estimators of the two-phase regression estimators that make use of the first-phase auxiliary data and known population totals of some of the auxiliary variables.
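
For orientation, here is a minimal sketch of the classical single-phase Sen–Yates–Grundy variance estimator for a fixed-size design, v = Σ_{i<j} [(π_i π_j − π_ij)/π_ij] (y_i/π_i − y_j/π_j)². It is not the two-phase extension developed in the paper; the function names and the small simple-random-sampling example (N = 5, n = 3) are assumptions made purely for illustration.

```python
from itertools import combinations


def syg_variance(y, pi, pi_joint):
    """Single-phase Sen-Yates-Grundy variance estimate of the HT total.

    y        : sampled values y_i
    pi       : first-order inclusion probabilities pi_i
    pi_joint : dict mapping (i, j) with i < j to joint probabilities pi_ij
    """
    total = 0.0
    for i, j in combinations(range(len(y)), 2):
        pij = pi_joint[(i, j)]
        weight = (pi[i] * pi[j] - pij) / pij
        total += weight * (y[i] / pi[i] - y[j] / pi[j]) ** 2
    return total


def srs_example():
    """Hypothetical check: simple random sampling without replacement."""
    N, n = 5, 3
    y = [1.0, 2.0, 4.0]
    pi = [n / N] * n
    pij = n * (n - 1) / (N * (N - 1))
    pi_joint = {(i, j): pij for i, j in combinations(range(n), 2)}
    return syg_variance(y, pi, pi_joint)
```

Under simple random sampling without replacement the estimator coincides with the familiar N²(1 − n/N)s²/n, so `srs_example()` can be compared against that expression by hand.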

3.

Bounds on Bivariate Distribution Functions with Given Margins and Known Values at Several Points

H. A. Mardani-Fard S. M. Sadooghi-Alvandi Z. Shishebor 《Communications in Statistics - Theory and Methods》2013,42(20):3596-3621

Let H(x, y) be a continuous bivariate distribution function with known marginal distribution functions F(x) and G(y). Suppose the values of H are given at several points, H(x_i, y_i) = θ_i, i = 1, 2, …, n. We first discuss conditions for the existence of a distribution satisfying these conditions, and present a procedure for checking if such a distribution exists. We then consider finding lower and upper bounds for such distributions. These bounds may be used to establish bounds on the values of Spearman's ρ and Kendall's τ. For n = 2, we present necessary and sufficient conditions for existence of such a distribution function and derive best-possible upper and lower bounds for H(x, y). As shown by a counter-example, these bounds need not be proper distribution functions, and we find conditions for these bounds to be (proper) distribution functions. We also present some results for the general case, where the values of H(x, y) are known at more than two points. In view of the simplification in notation, our results are presented in terms of copulas, but they may easily be expressed in terms of distribution functions.

4.

Quadratic subspaces and construction of Bayes invariant quadratic estimators of variance components in mixed linear models

Mariusz Grządziel 《Statistical Papers》2008,49(3):399-419

Gnot et al. (J Statist Plann Inference 30(1):223–236, 1992) have presented the formulae for computing Bayes invariant quadratic estimators of variance components in normal mixed linear models of the form where the matrices V_i, 1 ≤ i ≤ k − 1, are symmetric and nonnegative definite and V_k is an identity matrix. These formulae involve a basis of a quadratic subspace containing MV₁M, ..., MV_{k−1}M, M, where M is an orthogonal projector on the null space of X′. In the paper we discuss methods of construction of such a basis. We survey Malley’s algorithms for finding the smallest quadratic subspace including a given set of symmetric matrices of the same order and propose some modifications of these algorithms. We also consider a class of matrices sharing some of the symmetries common to MV₁M, ..., MV_{k−1}M, M. We show that the matrices from this class constitute a quadratic subspace and describe its explicit basis, which can be directly used for computing Bayes invariant quadratic estimators of variance components. This basis can also be used for improving the efficiency of Malley’s algorithms when applied to finding a basis of the smallest quadratic subspace containing the matrices MV₁M, ..., MV_{k−1}M, M. Finally, we present the results of a numerical experiment which confirm the potential usefulness of the proposed methods. Dedicated to the memory of Professor Stanisław Gnot.

5.

Estimation of a finite population distribution function based on a linear model with unknown heteroscedastic errors

María-José Lombardía Wenceslao González-Manteiga José-Manuel Prada-Sánchez 《Revue canadienne de statistique》2005,33(2):181-200

The authors consider a finite population ρ = {(Y_k, x_k), k = 1, …, N} conforming to a linear superpopulation model with unknown heteroscedastic errors, the variances of which are values of a smooth enough function of the auxiliary variable X for their nonparametric estimation. They describe a method of the Chambers-Dunstan type for estimation of the distribution of {Y_k, k = 1, …, N} from a sample drawn from the population without replacement, and determine the asymptotic distribution of its estimation error. They also consider estimation of its mean squared error in particular cases, evaluating both the analytical estimator derived by “plugging in” the asymptotic variance, and a bootstrap approach that is also applicable to estimation of parameters other than mean squared error. These proposed methods are compared with some common competitors in simulation studies.

6.

Inference on finite population categorical response: nonparametric regression-based predictive approach

Sumanta Adhya Tathagata Banerjee Gaurangadeb Chattopadhyay 《AStA Advances in Statistical Analysis》2012,96(1):69-98

Suppose that a finite population consists of N distinct units. Associated with the ith unit is a polychotomous response vector, d_i, and a vector of auxiliary variables, x_i. The values x_i are known for the entire population, but the d_i are known only for the units selected in the sample. The problem is to estimate the finite population proportion vector P. One of the fundamental questions in finite population sampling is how to make use of the complete auxiliary information effectively at the estimation stage. In this article a predictive estimator is proposed which incorporates the auxiliary information at the estimation stage by invoking a superpopulation model. However, the use of such estimators is often criticized since the working superpopulation model may not be correct. To protect the predictive estimator from possible model failure, a nonparametric regression model is considered in the superpopulation. The asymptotic properties of the proposed estimator are derived, and a bootstrap-based hybrid re-sampling method for estimating the variance of the proposed estimator is developed. Results of a simulation study are reported on the performances of the predictive estimator and its re-sampling-based variance estimator from the model-based viewpoint. Finally, a data survey related to the opinions of 686 individuals on the cause of addiction is used for an empirical study to investigate the performance of the nonparametric predictive estimator from the design-based viewpoint.

7.

Comparison of PLS algorithms when number of objects is much larger than number of variables

Aylin Alin 《Statistical Papers》2009,50(4):711-720

NIPALS and SIMPLS are the most commonly used algorithms for partial least squares analysis. When the number of objects, N, is much larger than the number of explanatory variables, K, and/or response variables, M, the NIPALS algorithm can be time consuming. Even though SIMPLS is not as time consuming as NIPALS and can be preferred over it, there are kernel algorithms developed especially for the cases where N is much larger than the number of variables. In this study, the NIPALS, SIMPLS and some kernel algorithms have been used to build partial least squares regression models. Their performances have been compared in terms of the total CPU time spent for the calculations of latent variables, leave-one-out cross validation and bootstrap methods. According to the numerical results, one of the kernel algorithms suggested by Dayal and MacGregor (J Chemom 11:73–85, 1997) is the fastest algorithm.

8.

A Pitman measure of similarity in k-means for clustering heavy-tailed data

Arman Reybod Javad Etminan Adel Mohammadpour 《Communications in Statistics - Simulation and Computation》2019,48(6):1595-1605

One of the most popular algorithms for partitioning data into k clusters is the k-means clustering algorithm. Since this method relies on some basic conditions, such as the existence of the mean and a finite variance, it is unsuitable for data whose variances are infinite, such as data with heavy-tailed distributions. The Pitman Measure of Closeness (PMC) is a criterion that shows how close an estimator is to its parameter relative to another estimator. In this article, using PMC and based on k-means clustering, a new distance and clustering algorithm are developed for heavy-tailed data.

9.

The discrepancy of P-values and posterior probability: Two-sided hypothesis of variance of normal distribution with unknown mean

Rahim Chinipardaz Behzad Falahifard Mohammad Reza Akhoond 《Communications in Statistics - Theory and Methods》2017,46(11):5544-5555

This article is concerned with the comparison of the P-value and a Bayesian measure for a point null hypothesis about the variance of a normal distribution with unknown mean. First, using a fixed prior for the test parameter, the posterior probability is obtained and compared with the P-value when an appropriate prior is used for the mean parameter. Second, lower bounds of the posterior probability of H₀ under a reasonable class of priors are compared with the P-value. It is shown that even in the presence of nuisance parameters, these two approaches can lead to different results in statistical inference.

10.

Lower bound of average centered L2-discrepancy for U-type designs

Xue Yang Gui-Jun Yang 《Communications in Statistics - Theory and Methods》2019,48(4):995-1008

Uniform designs are widely used in various scientific investigations and industrial applications. By considering all possible level permutations of the factors, a connection between average centered L₂-discrepancy and generalized wordlength pattern for asymmetrical fractional factorial designs is derived. Moreover, we present new lower bounds to the average centered L₂-discrepancy for symmetrical and asymmetrical U-type designs. For illustration of the theoretical results, the lower bounds for symmetrical and asymmetrical U-type designs are tabulated, and numerical results indicate that our lower bounds behave well and can be recommended for use in practice.

11.

Maximum variance of order statistics from symmetric populations revisited

Krzysztof Jasiński 《Statistics》2013,47(2):422-438

We consider i.i.d. samples of size n with symmetric non-degenerate parent distributions and finite variances. Papadatos [A note on maximum variance of order statistics from symmetric populations, Ann. Inst. Statist. Math. 48 (1997), pp. 117–121] proved that the maximal variance of each non-extreme order statistic, expressed in the population variance units, is attained in a one-parametric family of symmetric two- and three-point distributions. The parameters of the extreme variance distributions coincide with the arguments maximizing some polynomials of degree 2n−1 over a finite interval. The bounds for variances are equal to the maximal values of the polynomials. We present a more precise solution to the problem by applying the variation diminishing property of Bernstein polynomials.

12.

Standard and robust orthogonal regression

Larry Ammann John Van Ness 《Communications in Statistics - Simulation and Computation》2013,42(1):145-162

A fast routine for converting regression algorithms into corresponding orthogonal regression (OR) algorithms was introduced in Ammann and Van Ness (1988). The present paper discusses the properties of various ordinary and robust OR procedures created using this routine. OR minimizes the sum of the orthogonal distances from the regression plane to the data points. OR has three types of applications. First, L₂ OR is the maximum likelihood solution of the Gaussian errors-in-variables (EV) regression problem. This L₂ solution is unstable, thus the robust OR algorithms created from robust regression algorithms should prove very useful. Secondly, OR is intimately related to principal components analysis. Therefore, the routine can also be used to create L₁, robust, etc. principal components algorithms. Thirdly, OR treats the x and y variables symmetrically, which is important in many modeling problems. Using Monte Carlo studies, this paper compares the performance of standard regression, robust regression, OR, and robust OR on Gaussian EV data, contaminated Gaussian EV data, heavy-tailed EV data, and contaminated heavy-tailed EV data.

13.

The Likelihood Ratio Test with the Box–Cox Transformation for the Normal Mixture Problem: Power and Sample Size Study

《Communications in Statistics - Simulation and Computation》2013,42(3):553-565

Through simulation and regression, we study the alternative distribution of the likelihood ratio test in which the null hypothesis postulates that the data are from a normal distribution after a restricted Box–Cox transformation and the alternative hypothesis postulates that they are from a mixture of two normals after a restricted (possibly different) Box–Cox transformation. The number of observations in the sample is called N. The standardized distance between components (after transformation) is D = (μ₂ − μ₁)/σ, where μ₁ and μ₂ are the component means and σ² is their common variance. One component contains the fraction π of the observations, and the other 1 − π. The simulation results demonstrate a dependence of power on the mixing proportion, with power decreasing as the mixing proportion differs from 0.5. The alternative distribution appears to be a non-central chi-squared with approximately 2.48 + 10N^{−0.75} degrees of freedom and non-centrality parameter 0.174N(D − 1.4)² × [π(1 − π)]. At least 900 observations are needed to have power 95% for a 5% test when D = 2. For fixed values of D, power, and significance level, substantially more observations are necessary when π ≥ 0.90 or π ≤ 0.10. We give the estimated powers for the alternatives studied and a table of sample sizes needed for 50%, 80%, 90%, and 95% power.
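
The fitted approximation quoted in this abstract can be evaluated directly. The sketch below only computes the two parameters of the approximating non-central chi-squared distribution, reading the formulas as df = 2.48 + 10·N^(−0.75) and λ = 0.174·N·(D − 1.4)²·π(1 − π); it does not compute power, which would additionally need a non-central chi-squared distribution function. The function name `alt_distribution_params` is hypothetical.

```python
def alt_distribution_params(n_obs, d, mix):
    """Parameters of the approximating non-central chi-squared law.

    n_obs : sample size N
    d     : standardized distance D between the component means
    mix   : mixing proportion pi of one component
    Returns (degrees_of_freedom, noncentrality).
    """
    df = 2.48 + 10.0 * n_obs ** -0.75
    ncp = 0.174 * n_obs * (d - 1.4) ** 2 * mix * (1.0 - mix)
    return df, ncp
```

Because the non-centrality is proportional to π(1 − π), it is largest at π = 0.5 and shrinks toward zero as π approaches 0 or 1, which matches the abstract's remark that power falls as the mixing proportion moves away from 0.5.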

14.

CRITICAL VALUES FOR INFERENCE ABOUT NORMAL DISPERSION

S. John 《Australian & New Zealand Journal of Statistics》1973,15(2):71-79

Let f_N(·) be the density function of χ²_N. Values of C_i/N, i = 1, 2, satisfying the twin conditions Pr(C₁ ≤ χ²_N ≤ C₂) = 1 − α and the conditional expectation of χ²_N given C₁ ≤ χ²_N ≤ C₂ is N are tabulated for α = .2, .1, .05, .01, .005, .001, N = 1(1)20(2)50(5)150(10)350. The second condition may be replaced by the condition f_{N+2}(C₁) = f_{N+2}(C₂). The author has with him a bigger table giving C₁ and C₂ for α = .2, .1, .05, .01, .005, .001, N = 1(1)350 to three decimals (to three significant digits, if some decimals are not significant). Several applications are mentioned. A practical application that is perhaps not obvious is to test whether two or more counts are distributed as independent Poisson variables. The new simple formulae used in the construction of the table are given and should prove useful in obtaining accurate values for omitted entries and in increasing the accuracy of entries.

15.

On the sample size of curtailed tests

Bennett Eisenberg 《Communications in Statistics - Theory and Methods》2013,42(21):2177-2196

Formulas are given for the asymptotic distribution, mean, and variance of m^{-1}N_m, where N_m is the random sample size of the curtailed version of a fixed-sample most powerful test based on sample size m. The adequacy of the formulas is numerically investigated in some important applications where exact formulas can also be derived.

16.

Remarks on the L1 distance in statistical data analysis

Robert J. Budzyński Witold Kondracki 《Communications in Statistics - Theory and Methods》2017,46(19):9355-9363

We propose the L₁ distance between the distribution of a binned data sample and a probability distribution from which it is hypothetically drawn as a statistic for testing agreement between the data and a model. We study the distribution of this distance for N-element samples drawn from k bins of equal probability and derive asymptotic formulae for the mean and dispersion of L₁ in the large-N limit. We argue that the L₁ distance is asymptotically normally distributed, with the mean and dispersion being accurately reproduced by asymptotic formulae even for moderately large values of N and k.
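
A minimal sketch of the statistic described in this abstract follows. The abstract does not state the normalization, so the code assumes L₁ = Σ_k |n_k/N − p_k|, where n_k are the bin counts and p_k the model probabilities; both this convention and the function name `l1_distance` are assumptions.

```python
def l1_distance(counts, probs):
    """L1 distance between binned relative frequencies and model probabilities.

    counts : observed bin counts n_k (their sum is the sample size N)
    probs  : hypothesized bin probabilities p_k, in the same bin order
    """
    n = sum(counts)
    return sum(abs(c / n - p) for c, p in zip(counts, probs))
```

With k bins of equal probability, as studied in the paper, `probs` would simply be `[1 / k] * k`.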

17.

AN INFINITE-PHASE QUASI-BIRTH-AND-DEATH MODEL FOR THE NON-PREEMPTIVE PRIORITY M/PH/1 QUEUE

《Stochastic Models》2013,29(3):387-424

This paper considers a single server queue that handles arrivals from N classes of customers on a non-preemptive priority basis. Each of the N classes of customers features arrivals from a Poisson process at rate λ_i and class-dependent phase type service. To analyze the queue length and waiting time processes of this queue, we derive a matrix geometric solution for the stationary distribution of the underlying Markov chain. A defining characteristic of the paper is the fact that the number of distinct states represented within the sub-level is countably infinite, rather than finite as is usually assumed. Among the results we obtain in the two-priority case are tractable algorithms for the computation of both the joint distribution for the number of customers present and the marginal distribution of low-priority customers, and an explicit solution for the marginal distribution of the number of high-priority customers. This explicit solution can be expressed completely in terms of the arrival rates and parameters of the two service time distributions. These results are followed by algorithms for the stationary waiting time distributions for high- and low-priority customers. We then address the case of an arbitrary number of priority classes, which we solve by relating it to an equivalent three-priority queue. Numerical examples are also presented.

18.

Estimators of shift based on statistics of the Kolmogorov-Smirnov type

Alain Boulanger 《Revue canadienne de statistique》1983,11(4):271-284

This paper is concerned with the estimation of a shift parameter δ₀, based on some nonnegative functional H₁ of the pair (D^δ_N(x), f?^δ_N(x)), where D^δ_N(x) = K_N/b {F_{2,n}(x) − F_{1,m}(x + δ)}, +^δ_N(x) = {mF_{1,m}(x + δ) + nF_{2,n}(x)}/N, where F_{1,m} and F_{2,n} are the empirical distribution functions of two independent random samples (N = m + n), and where K²_N = mn/N. First an estimator δ_N is defined as a value of δ minimizing a functional H of the type of H₁. A second estimator δ¹_N is also defined which is a linearized version of the first. Finite and asymptotic properties of these estimators are considered. It is also shown that most well-known test statistics of the Kolmogorov-Smirnov type are particular cases of such functionals H₁. The asymptotic distribution and the asymptotic efficiency of some estimators are given.

19.

Variance Reduction Using Nonlinear Controls and Transformations

Peter A. W. Lewis Richard L. Ressler R. Kevin Wood 《Communications in Statistics - Simulation and Computation》2013,42(2):655-672

Nonlinear regression-adjusted control variables are investigated for improving variance reduction in statistical and system simulations. To this end, simple control variables are piecewise sectioned and then transformed using linear and nonlinear transformations. Optimal parameters of these transformations are selected using linear or nonlinear least-squares regression algorithms. As an example, piecewise power-transformed variables are used in the estimation of the mean for the two-variable Anderson-Darling goodness-of-fit statistic W₂². Substantial variance reduction over straightforward controls is obtained. These parametric transformations are compared against optimal, additive nonparametric transformations obtained by using the ACE algorithm and are shown, in comparison to the results from ACE, to be nearly optimal.

20.

Corrected asymptotic distribution of statistics based on the multinomial law

D. Neveu A. Kramar P. Dujols 《Statistical Methodology》2007,4(1):64-74

Investigators and epidemiologists often use statistics based on the parameters of a multinomial distribution. Two main approaches have been developed to assess the inferences of these statistics. The first one uses asymptotic formulae which are valid for large sample sizes. The second one computes the exact distribution, which performs quite well for small samples. They present some limitations for sample sizes N neither large enough to satisfy the assumption of asymptotic normality nor small enough to allow us to generate the exact distribution. We analytically computed the 1/N corrections of the asymptotic distribution for any statistics based on a multinomial law. We applied these results to the kappa statistic in 2×2 and 3×3 tables. We also compared the coverage probability obtained with the asymptotic and the corrected distributions under various hypothetical configurations of sample size and theoretical proportions. With this method, the estimate of the mean and the variance were highly improved as well as the 2.5 and the 97.5 percentiles of the distribution, allowing us to go down to sample sizes around 20, for data sets not too asymmetrical. The order of the difference between the exact and the corrected values was 1/N² for the mean and 1/N³ for the variance.