Similar articles
Found 20 similar articles; search took 515 ms.
1.
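巩红禹 (Gong Hongyu), 陈雅 (Chen Ya). 《统计研究》 (Statistical Research), 2018, 35(12): 113-122
This paper addresses two problems: improving sample representativeness and conducting multi-objective surveys. First, it proposes a new multi-objective sampling method for improving sample representativeness that combines increasing the sample size with adjusting the sample structure: a balanced design with supplementary samples. Supplementary units are drawn so that, combined with the original sample, they form a new balanced sample, which reduces the structural deviation between sample and population relative to the initial sample. A balanced sample is one for which the Horvitz-Thompson estimator of each auxiliary variable's total equals the true population total. Second, by choosing auxiliary variables that are correlated with several target parameters, a balanced sample makes a single sample representative for different target parameters, which makes a multi-objective survey possible. Using county-level data from the sixth population census (2010) and several target parameters, a post hoc evaluation of the balanced sample obtained after supplementation shows that the supplementary balanced design effectively improves the sample structure, brings it close to the population structure, and reduces the error of the target estimates. It also shows that balanced sampling designs can support multi-objective surveys and make more efficient use of the sample.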
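
For reference, the balancing condition stated in this abstract can be written as the equation below. The notation is an assumption, since the abstract gives none: $U$ is the population, $s$ the selected sample (original plus supplementary units), $\pi_k$ the inclusion probability of unit $k$, and $\mathbf{x}_k$ its vector of auxiliary variables.

```latex
\hat{\mathbf{t}}_{x,\mathrm{HT}}
  = \sum_{k \in s} \frac{\mathbf{x}_k}{\pi_k}
  = \sum_{k \in U} \mathbf{x}_k
  = \mathbf{t}_x
```

In practice the equality is usually only required to hold approximately, one component per auxiliary variable.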

2.
Summary. The paper presents a general strategy for selecting the bandwidth of nonparametric regression estimators and specializes it to local linear regression smoothers. The procedure requires the sample to be divided into a training sample and a testing sample. Using the training sample we first compute a family of regression smoothers indexed by their bandwidths. Next we select the bandwidth by minimizing the empirical quadratic prediction error on the testing sample. The resulting bandwidth satisfies a finite sample oracle inequality which holds for all bounded regression functions. This permits asymptotically optimal estimation for nearly any regression function. The practical performance of the method is illustrated by a simulation study which shows good finite sample behaviour of our method compared with other bandwidth selection procedures.
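
The training/testing procedure described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Gaussian kernel, the function names `local_linear`, `holdout_error` and `select_bandwidth`, the candidate bandwidths, and the noise-free sine data are all assumptions chosen for illustration.

```python
import math


def local_linear(x0, xs, ys, h):
    """Local linear regression estimate at x0 (Gaussian kernel, bandwidth h)."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = math.exp(-0.5 * (d / h) ** 2)  # kernel weight (assumed Gaussian)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    det = s0 * s2 - s1 * s1
    if det <= 1e-12 * s0 * s2:
        # Degenerate design: fall back to a kernel-weighted mean.
        return t0 / s0 if s0 > 0 else 0.0
    # Intercept of the weighted least-squares line fitted around x0.
    return (s2 * t0 - s1 * t1) / det


def holdout_error(train, test, h):
    """Mean squared prediction error on the testing sample for bandwidth h."""
    train_x, train_y = train
    test_x, test_y = test
    errors = [(y - local_linear(x, train_x, train_y, h)) ** 2
              for x, y in zip(test_x, test_y)]
    return sum(errors) / len(errors)


def select_bandwidth(train, test, bandwidths):
    """Pick the candidate bandwidth with the smallest hold-out error."""
    return min(bandwidths, key=lambda h: holdout_error(train, test, h))


# Hypothetical data: a noise-free sine curve on a regular grid.
train_x = [i / 100 for i in range(101)]
train_y = [math.sin(2 * math.pi * x) for x in train_x]
test_x = [(i + 0.5) / 100 for i in range(100)]
test_y = [math.sin(2 * math.pi * x) for x in test_x]

best_h = select_bandwidth((train_x, train_y), (test_x, test_y), [0.05, 10.0])
```

With these two candidates the very large bandwidth behaves almost like a single straight-line fit, so the hold-out error favours the smaller one.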

3.
The sample linear discriminant function (LDF) is known to perform poorly when the number of features p is large relative to the size of the training samples. A simple and rarely applied alternative to the sample LDF is the sample Euclidean distance classifier (EDC). Raudys and Pikelis (1980) have compared the sample LDF with three other discriminant functions, including the sample EDC, when classifying individuals from two spherical normal populations. They have concluded that the sample EDC outperforms the sample LDF when p is large relative to the training sample size. This paper derives conditions for which the two classifiers are equivalent when all parameters are known and employs a Monte Carlo simulation to compare the sample EDC with the sample LDF not only for the spherical normal case but also for several nonspherical parameter configurations. For many practical situations, the sample EDC performs as well as or better than the sample LDF, even for nonspherical covariance configurations.

4.
In this paper we discuss the sample size problem for balanced one-way ANOVA under a posterior Bayesian formulation of the problem. Using the distribution theory of appropriate quadratic forms we derive explicit sample sizes for prespecified posterior precisions. Comparisons with classical sample sizes are made. Instead of extensive tables, a Mathematica program for sample size calculation is given. The formulations given in this article form a foundational step towards Bayesian calculation of sample size in general.

5.
In this paper, we discuss some stochastic comparisons for the sample median in a random sample from a normal distribution. Specifically, we establish that the sample median is stochastically farther from the population mean than the sample mean is. To verify this comparison, we derive an upper bound for some distributional characteristics of the distance between the sample median and the population mean. The stochastic ordering considered here is the likelihood ratio order.

6.
We consider Bayesian testing for independence of two categorical variables with covariates for a two-stage cluster sample. This is a difficult problem because we have a complex sample (i.e., a cluster sample), not a simple random sample. Our approach is to convert the cluster sample with covariates into an equivalent simple random sample without covariates, which provides a surrogate of the original sample. Then, this surrogate sample is used to compute the Bayes factor to make an inference about independence. We apply our methodology to the data from the Trend in International Mathematics and Science Study [30] for fourth grade US students to assess the association between the mathematics and science scores represented as categorical variables. We show that if there is strong association between two categorical variables, there is no significant difference between the tests with and without the covariates. We also performed a simulation study to further understand the effect of covariates in various situations. We found that for borderline cases (moderate association between the two categorical variables), there are noticeable differences between the tests with and without covariates.

7.
This paper provides closed form expressions for the sample size for two-level factorial experiments when the response is the number of defectives. The sample sizes are obtained by approximating the two-sided test for no effect through tests for the mean of a normal distribution, and borrowing the classical sample size solution for that problem. The proposals are appraised relative to the exact sample sizes computed numerically, without appealing to any approximation to the binomial distribution, and the use of the sample size tables provided is illustrated through an example.

8.
Abstract. The sampling-importance resampling (SIR) algorithm aims at drawing a random sample from a target distribution π. First, a sample is drawn from a proposal distribution q, and then from this a smaller sample is drawn with sample probabilities proportional to the importance ratios π/q. We propose here a simple adjustment of the sample probabilities and show that this gives faster convergence. The results indicate that our version converges better also for small sample sizes. The SIR algorithms are compared with the Metropolis–Hastings (MH) algorithm with independent proposals. Although MH converges asymptotically faster, the results indicate that our improved SIR version is better than MH for small sample sizes. We also establish a connection between the SIR algorithms and importance sampling with normalized weights. We show that the use of adjusted SIR sample probabilities as importance weights reduces the bias of the importance sampling estimate.
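
The two steps of the basic SIR algorithm described in this abstract can be sketched in Python as follows. This covers only the unadjusted algorithm; the abstract does not specify the proposed adjustment of the sample probabilities, so it is not reproduced here. The function names, the normal target and proposal, the seed, and resampling with replacement are all assumptions of this sketch.

```python
import math
import random


def sir(target_density, proposal_draw, proposal_density, m, n, rng):
    """Basic sampling-importance resampling: m proposal draws, n resampled."""
    draws = [proposal_draw(rng) for _ in range(m)]
    # Importance ratios pi/q; the target density may be unnormalised.
    ratios = [target_density(x) / proposal_density(x) for x in draws]
    # Resample n < m values with probabilities proportional to the ratios
    # (with replacement here, which is an assumption of this sketch).
    return rng.choices(draws, weights=ratios, k=n)


def normal_pdf(x, sd):
    """Density of a zero-mean normal distribution with standard deviation sd."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


# Hypothetical example: target N(0, 1), wider proposal N(0, 3^2).
rng = random.Random(0)
sample = sir(
    target_density=lambda x: normal_pdf(x, 1.0),
    proposal_draw=lambda r: r.gauss(0.0, 3.0),
    proposal_density=lambda x: normal_pdf(x, 3.0),
    m=20000,
    n=2000,
    rng=rng,
)
mean = sum(sample) / len(sample)
var = sum((x - mean) ** 2 for x in sample) / (len(sample) - 1)
```

The resampled values should have mean close to 0 and variance close to 1, the moments of the assumed target.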

9.
Sampling cost is a crucial factor in sample size planning, particularly when the treatment group is more expensive than the control group. To either minimize the total cost or maximize the statistical power of the test, we used the distribution-free Wilcoxon–Mann–Whitney test for two independent samples and the van Elteren test for randomized block design, respectively. We then developed approximate sample size formulas when the distribution of data is non-normal and/or unknown. This study derived the optimal sample size allocation ratio for a given statistical power by considering the cost constraints, so that the resulting sample sizes could minimize either the total cost or the total sample size. Moreover, for a given total cost, the optimal sample size allocation is recommended to maximize the statistical power of the test. The proposed formula is not only innovative, but also quick and easy. We also used real data from a clinical trial to illustrate how to choose the sample size for a randomized two-block design. For nonparametric methods, no existing commercial software for sample size planning has considered the cost factor, and therefore the proposed methods can provide important insights related to the impact of cost constraints.

10.
Sample size determination is one of the most commonly encountered tasks in the design of any applied research study. The general guideline suggests that a pilot study can offer plausible planning values for the vital model characteristics. This article examines two viable approaches to taking into account the imprecision of a variance estimate in sample size calculations for linear statistical models. The multiplier procedure employs an adjusted sample variance in the form of a multiple of the observed sample variance. The Bayesian method accommodates the uncertainty of a sample variance through a prior distribution. It is shown that the two seemingly distinct techniques are equivalent for sample size determination under the designated assurance requirements that the actual power exceeds the planned threshold with a given tolerance probability, or the expected power attains the desired level. The selection of the optimum pilot sample size for minimizing the expected total cost is also considered.

11.
We discuss 3 alternative approaches to sample size calculation: traditional sample size calculation based on power to show a statistically significant effect, sample size calculation based on assurance, and sample size based on a decision-theoretic approach. These approaches are compared head-to-head for clinical trial situations in rare diseases. Specifically, we consider 3 case studies of rare diseases (Lyell disease, adult-onset Still disease, and cystic fibrosis) with the aim to plan the sample size for an upcoming clinical trial. We outline in detail the reasonable choice of parameters for these approaches for each of the 3 case studies and calculate sample sizes. We stress that the influence of the input parameters needs to be investigated in all approaches and recommend investigating different sample size approaches before deciding finally on the trial size. The sample size is strongly influenced by the choice of the treatment effect parameter in all approaches and by the parameter for the additional cost of the new treatment in the decision-theoretic approach. These should therefore be discussed extensively.

12.
In this note, we present alternative derivations for the probability that an individual order statistic is closest to the target parameter among all order statistics from a complete random sample. This approach is simpler than the geometric arguments used earlier. We also provide a simple direct proof for the symmetry property of the simultaneous closeness probabilities among order statistics for the estimation of percentiles from a symmetric family. Finally, we offer an alternative simpler proof for the result that sample medians from larger odd sample sizes are Pitman closer to the population median than sample medians from smaller odd sample sizes.

13.
A two-sample test statistic for detecting shifts in location is developed for a broad range of underlying distributions using adaptive techniques. The test statistic is a linear rank statistic which uses a simple modification of the Wilcoxon test; the scores are Winsorized ranks where the upper and lower Winsorizing proportions are estimated in the first stage of the adaptive procedure using sample measures of the distribution's skewness and tailweight. An empirical relationship between the Winsorizing proportions and the sample skewness and tailweight allows for a 'continuous' adaptation of the test statistic to the data. The test has good asymptotic properties, and the small sample results are compared with other popular parametric, nonparametric, and two-stage tests using Monte Carlo methods. Based on these results, this proposed test procedure is recommended for moderate and larger sample sizes.

14.
The purpose of our study is to propose a procedure for determining the sample size at each stage of the repeated group significance tests intended to compare the efficacy of two treatments when a response variable is normal. It is necessary to devise a procedure for reducing the maximum sample size because large sample sizes are often used in group sequential tests. In order to reduce the sample size at each stage, we construct the repeated confidence boundaries which enable us to find which of the two treatments is the more effective at an early stage. Thus we use the recursive formulae of numerical integrations to determine the sample size at the intermediate stage. We compare our procedure with Pocock's in terms of maximum sample size and average sample size in the simulations.

15.
A challenge for implementing performance-based Bayesian sample size determination is selecting which of several methods to use. We compare three Bayesian sample size criteria: the average coverage criterion (ACC) which controls the coverage rate of fixed length credible intervals over the predictive distribution of the data, the average length criterion (ALC) which controls the length of credible intervals with a fixed coverage rate, and the worst outcome criterion (WOC) which ensures the desired coverage rate and interval length over all (or a subset of) possible datasets. For most models, the WOC produces the largest sample size among the three criteria, and sample sizes obtained by the ACC and the ALC are not the same. For Bayesian sample size determination for normal means and differences between normal means, we investigate, for the first time, the direction and magnitude of differences between the ACC and ALC sample sizes. For fixed hyperparameter values, we show that the difference of the ACC and ALC sample size depends on the nominal coverage, and not on the nominal interval length. There exists a threshold value of the nominal coverage level such that below the threshold the ALC sample size is larger than the ACC sample size, and above the threshold the ACC sample size is larger. Furthermore, the ACC sample size is more sensitive to changes in the nominal coverage. We also show that for fixed hyperparameter values, there exists an asymptotic constant ratio between the WOC sample size and the ALC (ACC) sample size. Simulation studies are conducted to show that similar relationships among the ACC, ALC, and WOC may hold for estimating binomial proportions. We provide a heuristic argument that the results can be generalized to a larger class of models.

16.
Some asymptotic statistical properties of the sample mean of a class of locally stationary long-memory processes are studied in this paper. Conditions for consistency are investigated and precise convergence rates of the variance of the sample mean are established for a class of time-varying long-memory parameter functions. A central limit theorem for the sample mean is also established. Furthermore, the calculation of the variance of the sample mean is illustrated through several numerical and simulation experiments.

17.
For the Poisson distribution, a posterior distribution for the complete sample size, N, is derived from an incomplete sample when any specified subset of the classes is missing. Means as well as other posterior characteristics of N are obtained for two examples with various classes removed. For the special case of a truncated 'missing zero class' Poisson sample, a simulation experiment is performed for the small 'N=25' sample situation applying both Bayesian and maximum likelihood methods of estimation.

18.
Assume that a sample is available from a population having an exponential distribution, and that l future samples are to be taken from the same population. This paper provides a formula for computing a one-sided lower simultaneous prediction limit which is to be below the (k_i − m_i + 1)-st order statistic of a future sample of size k_i for i = 1, …, l, based on the sample mean of a past sample. Tables of factors for one-sided lower simultaneous prediction limits are provided. Such limits are of practical importance in determining acceptance criteria and predicting system survival times.

19.
In this article, we present a straightforward Bonferroni approach for determining sample size for estimating the mean vector of a multivariate population under two scenarios: (1) a pre-specified overall confidence level is desired; and (2) a pre-specified confidence level needs to be guaranteed for each individual variable. It is demonstrated that correlation between variables helps reduce the sample size. The formula to calculate the reduced sample size is derived. A binormal example is presented to illustrate the effect of correlation on sample size reduction for various values of the correlation coefficient.
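
As a rough illustration of scenario (1), the textbook Bonferroni calculation can be sketched in Python. This is not the formula derived in the article: the abstract does not give the correlation-adjusted expression, so the sketch ignores correlation and therefore yields the conservative (unreduced) sample size. Known standard deviations, normal-theory intervals, the function name `bonferroni_sample_size`, and the example inputs are all assumptions.

```python
import math
from statistics import NormalDist


def bonferroni_sample_size(sigmas, half_widths, alpha):
    """Smallest n giving every mean a CI half-width within its limit.

    Assumes known standard deviations and normal-theory intervals. The
    overall level alpha is split equally over the p variables (Bonferroni),
    and any correlation between variables is ignored.
    """
    p = len(sigmas)
    z = NormalDist().inv_cdf(1 - alpha / (2 * p))
    return max(math.ceil((z * s / d) ** 2)
               for s, d in zip(sigmas, half_widths))


# Hypothetical inputs: unit standard deviations, half-width 0.1, alpha 0.05.
n_one = bonferroni_sample_size([1.0], [0.1], alpha=0.05)
n_two = bonferroni_sample_size([1.0, 1.0], [0.1, 0.1], alpha=0.05)
```

Going from one variable to two raises the required sample size because each interval must be held at a stricter individual level; the article's point is that accounting for correlation recovers part of that increase.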

20.
We consider in this article the problem of numerically approximating the quantiles of a sample statistic for a given population, a problem of interest in many applications, such as bootstrap confidence intervals. The proposed Monte Carlo method can be routinely applied to handle complex problems that lack analytical results. Furthermore, the method yields estimates of the quantiles of a sample statistic of any sample size, although Monte Carlo simulations for only two optimally selected sample sizes are needed. An analysis of the Monte Carlo design is performed to obtain the optimal choices of these two sample sizes and the number of simulated samples required for each sample size. Theoretical results are presented for the bias and variance of the numerical method proposed. The results developed are illustrated via simulation studies for the classical problem of estimating a bivariate linear structural relationship. It is seen that the size of the simulated samples used in the Monte Carlo method does not have to be very large and the method provides a better approximation to quantiles than those based on asymptotic normal theory for skewed sampling distributions.
