Similar literature
 20 similar articles found (search time: 46 ms)
1.
The sample lead can refer to the lead of one party over another in public opinion polls, of one product over another in market research surveys, of one programme over another in TV viewing surveys, etc. In applied statistics, it is common to assume that the distribution of the sample lead is approximately normal. The assumption is justified in most situations, but when samples are small or when population proportions are extreme, the normal approximation may be inadequate. This paper derives the exact distribution of the sample lead and employs it to test hypotheses when the normal approximation is inadequate. The exact distribution can also be used to check whether or not a particular distribution of the sample lead is adequately represented by the normal distribution.
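In the simplest two-party setting the lead is L = 2X − n with X ~ Binomial(n, p), so its exact distribution follows directly from the binomial probabilities. A minimal sketch comparing the exact tail probability of a positive lead with its normal approximation (the choice of n and p is illustrative, not taken from the paper):

```python
from math import comb, sqrt, erf

def exact_lead_pmf(n, p):
    """Exact distribution of the sample lead L = X - (n - X) = 2X - n,
    where X ~ Binomial(n, p) counts supporters of the first party
    in a hypothetical two-way poll."""
    return {2 * k - n: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

def normal_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

n, p = 20, 0.55                      # small sample, close race
pmf = exact_lead_pmf(n, p)

# Exact probability of a positive lead vs. its normal approximation.
exact = sum(prob for lead, prob in pmf.items() if lead > 0)
mu, sd = n * (2 * p - 1), 2 * sqrt(n * p * (1 - p))   # mean and s.d. of L
approx = 1.0 - normal_cdf((0 - mu) / sd)

print(round(exact, 4), round(approx, 4))   # the two disagree at this small n
```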

2.
Corrigendum     
This article presents results concerning the significance of the sample lead and illustrates their uses. They apply to the sample lead of one party over another in political polls, one product over another in market research surveys, the market share of one company over another in industrial studies, one class over another in social investigations, one programme over another in TV viewing surveys, etc. They enable the calculation of confidence intervals and required sample sizes, the testing of hypotheses, etc. They deal with sampling from finite and infinite populations, with and without replacement, sampling from populations with overlapping classifications, stratified sampling, optimum allocation to strata, etc. Derivation of the results is given in the appendices. Some of the proofs are rather complicated, but the final results are quite simple and easy to use.

3.
This article presents results concerning the significance of sample leads and illustrates their uses. They enable the analysis of the lead of one party over another in political polls, one product over another in market research surveys, market share in industrial studies, etc. They enable the calculation of confidence intervals and required sample sizes, testing of hypotheses, etc. Sampling from infinite and finite populations with overlapping classifications, stratified sampling, optimum allocation to strata, etc., are considered. Derivation of the results is given in a set of appendices. Some of the derivations are rather complicated, but the results are straightforward to use.

4.
Summary.  The problem motivating the paper is the determination of sample size in clinical trials under normal likelihoods and at the substantive testing stage of a financial audit, where normality is not an appropriate assumption. A combination of analytical and simulation-based techniques within the Bayesian framework is proposed. The framework accommodates two different prior distributions: one is the general-purpose fitting prior distribution that is used in Bayesian analysis, and the other is the expert subjective prior distribution, the sampling prior, which is believed to generate the parameter values that in turn generate the data. We obtain several theoretical results; one key result is that typical non-informative prior distributions lead to very small sample sizes. In contrast, a very informative prior distribution may lead to either a very small or a very large sample size, depending on the location of the centre of the prior distribution and the hypothesized value of the parameter. The methods developed are quite general and can be applied to other sample size determination problems. Some numerical illustrations which bring out many other aspects of the optimum sample size are given.

5.
We derive the asymptotic distribution of the sample autocorrelations of nonstationary fractionally integrated processes of order d. If d≥1, the sample autocorrelations approach their probability limit of one at a rate equal to the sample size. If d<1, the rate is slower and depends on d. These findings carry over to the case of detrended series. Monte Carlo evidence and an empirical example illustrate the theoretical results.
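The d ≥ 1 case is easy to see by simulation: for a random walk (d = 1) the lag-one sample autocorrelation sits close to its probability limit of one. A small sketch, assuming a Gaussian random walk (the sample size is illustrative):

```python
import random

def sample_acf(x, lag):
    """Sample autocorrelation at a given lag, using the common
    mean-centred estimator with the total sum of squares as divisor."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[t] - m) * (x[t - lag] - m) for t in range(lag, n))
    den = sum((v - m) ** 2 for v in x)
    return num / den

random.seed(1)
# A random walk is an integrated process of order d = 1.
T = 5000
walk = [0.0]
for _ in range(T - 1):
    walk.append(walk[-1] + random.gauss(0.0, 1.0))

rho1 = sample_acf(walk, 1)
print(round(rho1, 3))   # close to its probability limit of one
```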

6.
Tim Fischer & Udo Kamps, Statistics, 2013, 47(1): 142-158
There are several well-known mappings which transform the first r order statistics in a sample of size n from a standard uniform distribution into a full vector of r order statistics in a sample of size r from a uniform distribution. Continuing the results reported in a previous paper by the authors, it is shown that transformations of these types do not, in general, lead to order statistics from an i.i.d. sample when applied to order statistics from non-uniform distributions. By accepting the loss of one dimension, a structure-preserving transformation exists for power function distributions.

7.
Recently, Tian (2013) proposed a new non-randomized parallel design for surveys on sensitive topics. However, sample size formulae for testing hypotheses under the parallel model are not yet available. As sample size determination is a crucial component of survey planning, this paper develops such formulae for the parallel design using the power analysis method, for both the one- and two-sample problems. The asymptotic power functions and the corresponding sample size formulae for both one- and two-sided tests, based on the large-sample normal approximation, are derived. Their performance is assessed by comparing the asymptotic power with the exact power and by reporting the ratio of the sample sizes required under the parallel model and under the design of direct questioning. We also numerically compare the sample sizes needed for the parallel design with those required for the crosswise and triangular models. Two theoretical justifications are provided. An example from a survey on ‘sexual practices’ in San Francisco, Las Vegas and Portland is used to illustrate the proposed methods.
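The parallel-model formulae themselves are not reproduced here; as the direct-questioning baseline against which the paper reports sample size ratios, the following sketch computes the standard normal-approximation sample size for a one-sided one-sample test of a proportion (significance level, power, and proportions are illustrative assumptions):

```python
from math import sqrt, ceil, erf

def z_quantile(p):
    # Inverse standard normal CDF via bisection (avoids external libraries).
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if 0.5 * (1 + erf(mid / sqrt(2))) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def n_one_sample_prop(p0, p1, alpha=0.05, power=0.8):
    """Sample size for a one-sided one-sample test of a proportion under
    direct questioning, based on the large-sample normal approximation.
    A non-randomized design such as the parallel model requires more
    respondents than this baseline by a design-dependent factor."""
    za, zb = z_quantile(1 - alpha), z_quantile(power)
    num = za * sqrt(p0 * (1 - p0)) + zb * sqrt(p1 * (1 - p1))
    return ceil((num / (p1 - p0)) ** 2)

print(n_one_sample_prop(0.10, 0.20))
```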

8.
In this paper, we are interested in the joint distribution of two order statistics from overlapping samples. We give an explicit formula for the distribution of such a pair of random variables under the assumption that the parent distribution is absolutely continuous. We are also interested in the extent to which the conditional expectation of one such order statistic given the other determines the parent distribution. In particular, we provide a new characterization by linearity of regression of an order statistic from the extended sample given one from the original sample, a special case of which solves a problem explicitly stated in the literature. It turns out that quantile density functions are convenient for describing the parent distribution. In several other cases of regressions of order statistics, we provide new results regarding the uniqueness of the parent distribution.

9.
于力超 & 金勇进, 《统计研究》 (Statistical Research), 2018, 35(11): 93-104
Large-scale sample surveys typically adopt complex sampling designs, yielding survey data sets with a stratified, nested structure in which missing data are unavoidable; imputation strategies for such stratified data sets with missing values have so far received little attention. This paper applies the Gibbs algorithm to the multiple imputation of stratified data sets with missing values, studying both a fixed-effects imputation model and a random-effects imputation model. Through theoretical derivation and numerical simulation, the imputation performance of the two methods is compared, in terms of the unbiasedness and efficiency of the resulting parameter estimates, under different intraclass correlation coefficients, group sizes, and missing-data proportions, and recommendations on the choice of imputation model are given. The results show that the random-effects imputation model yields more accurate parameter estimates, while the fixed-effects imputation model is simpler to implement; when the missing-data proportion is small and the intraclass correlation and group size are large, the fixed-effects imputation model can be used, and otherwise the random-effects imputation model is recommended.

10.
On the planning and design of sample surveys
Surveys rely on structured questions used to map out reality, using sample observations from a population frame, into data that can be statistically analyzed. This paper focuses on the planning and design of surveys, making a distinction between individual surveys, household surveys and establishment surveys. Knowledge from cognitive science is used to provide guidelines on questionnaire design. Non-standard, but simple, statistical methods are described for analyzing survey results. The paper is based on experience gained by conducting over 150 customer satisfaction surveys in Europe, America and the Far East.

11.
In the traditional study design of a single‐arm phase II cancer clinical trial, the one‐sample log‐rank test has been frequently used. A common practice in sample size calculation is to assume that the event time in the new treatment follows an exponential distribution. Such a study design may not be suitable for immunotherapy cancer trials in which both long‐term survivors (or even patients cured of the disease) and a delayed treatment effect are present, because the exponential distribution is not appropriate to describe such data and could consequently lead to a severely underpowered trial. In this research, we propose a piecewise proportional hazards cure rate model with random delayed treatment effect to design single‐arm phase II immunotherapy cancer trials. To improve test power, we propose a new weighted one‐sample log‐rank test and provide a sample size calculation formula for designing trials. Our simulation study shows that the proposed log‐rank test performs well and is robust to misspecification of the weight, and that the sample size calculation formula also performs well.

12.
Planning a study using the General Linear Univariate Model often involves sample size calculation based on a variance estimated in an earlier study; the resulting noncentrality, power, and sample size inherit this randomness. Additional complexity arises if the estimate has been censored. Left censoring occurs when only significant tests lead to a power calculation, while right censoring occurs when only non-significant tests lead to a power calculation. We provide simple expressions for straightforward computation of the distribution function, moments, and quantiles of the censored variance estimate, the estimated noncentrality, the power, and the sample size. We also provide convenient approximations and evaluate their accuracy. The results demonstrate that ignoring right censoring falsely widens confidence intervals for noncentrality and power, while ignoring left censoring falsely narrows them. The new results allow assessing, and avoiding, the potentially substantial bias that censoring may create.

13.
The coverage rate of the original data by the prediction interval in simple linear regression is obtained by computer simulation. The results show that for small sample sizes the coverage rate is higher than the assigned prediction coverage rate (confidence level). The two coverage rates begin to converge when the sample size exceeds 50, and the convergence rate depends very little on the distribution of the independent variable. Theoretical results on the asymptotic coverage rate and on the absolute minimum bounds are also obtained.
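The simulation described above is easy to reproduce in outline: fit ordinary least squares, form the usual 95% prediction interval at each design point, and count how often the observed responses fall inside. A rough sketch (the model, design points, and t critical value are illustrative assumptions; t_crit is t with 0.975 probability and n−2 = 8 degrees of freedom):

```python
import random
from math import sqrt

def coverage_rate(n, reps=2000, t_crit=2.306):
    """Monte Carlo coverage of the observed responses by the 95%
    prediction interval in simple linear regression (illustrative
    sketch; t_crit is hard-coded for n = 10, i.e. 8 d.f.)."""
    random.seed(7)
    x = [i / (n - 1) for i in range(n)]
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    hits = total = 0
    for _ in range(reps):
        y = [1.0 + 2.0 * xi + random.gauss(0.0, 1.0) for xi in x]
        ybar = sum(y) / n
        b = sum((x[i] - xbar) * (y[i] - ybar) for i in range(n)) / sxx
        a = ybar - b * xbar
        fit = [a + b * xi for xi in x]
        s2 = sum((y[i] - fit[i]) ** 2 for i in range(n)) / (n - 2)
        for i in range(n):
            half = t_crit * sqrt(s2 * (1 + 1 / n + (x[i] - xbar) ** 2 / sxx))
            hits += fit[i] - half <= y[i] <= fit[i] + half
            total += 1
    return hits / total

print(round(coverage_rate(10), 3))   # exceeds the nominal 0.95 at this small n
```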

14.
15.
The internal pilot study design allows for modifying the sample size during an ongoing study based on a blinded estimate of the variance, thus maintaining trial integrity. Various blinded sample size re‐estimation procedures have been proposed in the literature. We compare blinded sample size re‐estimation procedures based on the one‐sample variance of the pooled data with a blinded procedure using the randomization block information, with respect to the bias and variance of the variance estimators and the distribution of the resulting sample sizes, power, and actual type I error rate. For reference, sample size re‐estimation based on the unblinded variance is also included in the comparison. It is shown that using an unbiased variance estimator (such as the one using the randomization block information) for sample size re‐estimation does not guarantee that the desired power is achieved. Moreover, in situations that are common in clinical trials, the variance estimator that employs the randomization block length shows higher variability than the simple one‐sample estimator, and so, in turn, does the sample size resulting from the related re‐estimation procedure. This higher variability can lead to lower power, as demonstrated in the setting of noninferiority trials. In summary, the one‐sample estimator obtained from the pooled data is extremely simple to apply, shows good performance, and is therefore recommended for application. Copyright © 2013 John Wiley & Sons, Ltd.
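A minimal sketch of the one-sample (blinded) re-estimation step: the variance of the pooled pilot data, with treatment labels hidden, is plugged into the usual two-group sample size formula. The z values and pilot data are illustrative assumptions; note that the blinded one-sample variance is biased upward by delta²/4 relative to the within-group variance:

```python
import random
from math import ceil

def blinded_ssr(pooled, delta, z_alpha=1.96, z_beta=0.84):
    """Blinded sample size re-estimation sketch: the one-sample variance
    of the pooled data (treatment labels hidden) replaces the planning
    variance in the usual two-group formula. This estimator is biased
    upward by delta**2 / 4 when a true group difference exists."""
    n = len(pooled)
    m = sum(pooled) / n
    s2 = sum((v - m) ** 2 for v in pooled) / (n - 1)   # blinded variance
    # per-group n for difference delta, ~5% two-sided alpha, 80% power
    return ceil(2 * s2 * (z_alpha + z_beta) ** 2 / delta ** 2)

random.seed(3)
# internal pilot: 30 per arm, true s.d. 2, true mean difference 1
pilot = [random.gauss(0.0, 2.0) for _ in range(30)] + \
        [random.gauss(1.0, 2.0) for _ in range(30)]
print(blinded_ssr(pilot, delta=1.0))
```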

16.
Large governmental surveys typically provide accurate national statistics. To decrease the mean squared error of estimates for small areas, i.e., domains in which the sample size is small, auxiliary variables from administrative records are often used as covariates in a mixed linear model. It is generally assumed that the auxiliary information is available for every small area. In many cases, though, such information is available for only some of the small areas, either from another survey or from a previous administration of the same survey. The authors propose and study small area estimators that use multivariate models to combine information from several surveys. They discuss computational algorithms, and a simulation study indicates that if quantities in the different surveys are sufficiently correlated, substantial gains in efficiency can be achieved.

17.
Logistic regression is often confronted with the problem of separation of the likelihood, especially with an unbalanced success–failure distribution. We propose to address this issue by drawing a ranked set sample (RSS). Simulation studies illustrate the advantages of logistic regression models fitted to RSS samples at small sample sizes, regardless of the distribution of the binary response. As the sample size increases, RSS eventually becomes comparable to simple random sampling (SRS), but still has the advantage over SRS of mitigating the problem of separation of the likelihood. Even in the presence of ranking errors, models fitted to RSS samples yield higher predictive ability than their SRS counterparts.

18.
It is important to identify outliers, since their inclusion, especially when using parametric methods, can distort the analysis and lead to erroneous conclusions. One of the easiest and most useful methods is based on the boxplot. This method is particularly appealing since it does not use any outliers in computing the spread. Two methods, one by Carling and another by Schwertman and de Silva, adjust the boxplot method for sample size and skewness. In this paper, the two procedures are compared both theoretically and by Monte Carlo simulation. Simulations using both a symmetric distribution and an asymmetric distribution were performed on data sets with no, one, and several outliers. Based on the simulations, the Carling approach is superior at avoiding masking, that is, it is less likely to overlook an outlier, while the Schwertman and de Silva procedure is much better at reducing swamping, that is, misclassifying an ordinary observation as an outlier. Carling’s method is to the Schwertman and de Silva procedure as the comparisonwise error rate is to the experimentwise error rate in multiple comparisons. The two methods, rather than being competitors, appear to complement each other; used in tandem they give the data analyst a more complete perspective for identifying possible outliers.
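The rule underlying both procedures is the classic Tukey boxplot fence; Carling and Schwertman–de Silva replace its constant 1.5 with sample-size- and skewness-dependent values not reproduced here. A minimal sketch of the unadjusted rule:

```python
def boxplot_fences(data, k=1.5):
    """Classic Tukey boxplot fences Q1 - k*IQR and Q3 + k*IQR; the
    Carling and Schwertman-de Silva adjustments replace the constant k
    with sample-size- and skewness-dependent values (not shown here)."""
    s = sorted(data)
    def quartile(q):
        # simple linear-interpolation sample quantile
        pos = q * (len(s) - 1)
        lo = int(pos)
        frac = pos - lo
        return s[lo] if lo + 1 == len(s) else s[lo] * (1 - frac) + s[lo + 1] * frac
    q1, q3 = quartile(0.25), quartile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

data = [2, 3, 3, 4, 5, 5, 6, 7, 8, 30]   # 30 is a planted outlier
lo, hi = boxplot_fences(data)
outliers = [v for v in data if v < lo or v > hi]
print(outliers)   # → [30]
```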

19.
This paper is concerned with methods of reducing variability and computer time in a simulation study. The Monte Carlo swindle, through mathematical manipulation, has been shown to yield more precise estimates than the “naive” approach. In this study, computer time is considered in conjunction with the variance estimates, and it is shown that by this measure the naive method is often a viable alternative to the swindle. The study concentrates on the problem of estimating the variance of an estimator of location. The advantage of one technique over another depends upon the location estimator, the sample size, and the underlying distribution. For a fixed number of samples, while the naive method gives a less precise estimate than the swindle, it requires fewer computations. In addition, for certain location estimators and distributions, the naive method can take advantage of shortcuts in the generation of each sample. The small amount of time required by this “enlightened” naive method often more than compensates for its relative lack of precision.
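A naive Monte Carlo run of the kind compared in the paper simply draws many samples, applies the location estimator to each, and takes the empirical variance of the results, with no swindle-style variance reduction (the distribution and sizes below are illustrative):

```python
import random
import statistics

def naive_variance(estimator, n, reps=4000, seed=11):
    """'Naive' Monte Carlo estimate of the variance of a location
    estimator: apply the estimator to many independent samples and
    take the empirical variance of the results."""
    rng = random.Random(seed)
    vals = [estimator([rng.gauss(0.0, 1.0) for _ in range(n)])
            for _ in range(reps)]
    return statistics.variance(vals)

v_mean = naive_variance(statistics.mean, n=20)
v_median = naive_variance(statistics.median, n=20)
# For N(0,1) samples of size 20, Var(mean) = 1/20; the median is less efficient.
print(round(v_mean, 4), round(v_median, 4))
```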

20.
A class of predictive densities is derived by weighting the observed samples in maximizing the log-likelihood function. This approach is effective in cases such as sample surveys or design of experiments, where the observed covariate follows a different distribution than that in the whole population. Under misspecification of the parametric model, the optimal choice of the weight function is shown asymptotically to be the ratio of the density function of the covariate in the population to that in the observations; this is the pseudo-maximum-likelihood estimation used in sample surveys. Optimality is defined by the expected Kullback–Leibler loss, and the optimal weight is obtained by considering the importance sampling identity. Under correct specification of the model, however, the ordinary maximum likelihood estimate (i.e. the uniform weight) is shown to be asymptotically optimal. For moderate sample sizes the situation is in between these two extreme cases, and the weight function is selected by minimizing a variant of the information criterion derived as an estimate of the expected loss. The method is also applied to a weighted version of the Bayesian predictive density. Numerical examples as well as Monte Carlo simulations are shown for polynomial regression. A connection with robust parametric estimation is discussed.
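For a Gaussian regression model, maximizing the weighted log-likelihood Σ wᵢ log f(yᵢ | xᵢ) reduces to weighted least squares. A sketch with a hypothetical density-ratio weight (the weight function and data-generating setup are illustrative assumptions, not the paper's optimal choice):

```python
import random

def weighted_linear_fit(x, y, w):
    """Weighted least squares for a straight line, equivalent to
    maximizing the weighted Gaussian log-likelihood sum of
    w_i * log f(y_i | x_i) (pseudo-maximum-likelihood sketch)."""
    sw = sum(w)
    xb = sum(wi * xi for wi, xi in zip(w, x)) / sw
    yb = sum(wi * yi for wi, yi in zip(w, y)) / sw
    b = sum(wi * (xi - xb) * (yi - yb) for wi, xi, yi in zip(w, x, y)) \
        / sum(wi * (xi - xb) ** 2 for wi, xi in zip(w, x))
    return yb - b * xb, b

random.seed(5)
# Covariate observed with a different density than in the population:
# samples drawn as |N(0,1)|, while the hypothetical target population
# down-weights large x.
x = [abs(random.gauss(0, 1)) for _ in range(500)]
y = [1.0 + 0.5 * xi + random.gauss(0, 0.2) for xi in x]
w = [1.0 / (1.0 + xi) for xi in x]   # hypothetical density-ratio weight
a, b = weighted_linear_fit(x, y, w)
print(round(a, 2), round(b, 2))
```

Because the model here is correctly specified, the weighted fit recovers the true intercept and slope; under misspecification, the weights shift the fit toward the population's covariate distribution.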


Copyright©北京勤云科技发展有限公司  京ICP备09084417号