Similar Articles
 20 similar articles found (search time: 15 ms)
1.
Under the assumption that missing values occur randomly in a matrix, formulae are developed for the expected value and variance of six statistics that summarize the number and location of the missing values. For a seventh statistic, a regression model based on simulated data yields an estimate of the expected value. The results can be used in the development of methods to control the Type I error and to approximate power and sample size for multilevel and longitudinal studies with missing data.

2.
In this paper, Anbar's (1983) approach for estimating a difference between two binomial proportions is discussed with respect to a hypothesis testing problem. Such an approach results in two possible testing strategies. While the results of the tests are expected to agree for a large sample size when the two proportions are equal, the tests are shown to perform quite differently in terms of their probabilities of a Type I error for selected sample sizes. Moreover, the tests can lead to different conclusions, which is illustrated via a simple example, and the probability of such cases can be relatively large. In an attempt to improve the tests while preserving their relative simplicity, a modified test is proposed. The performance of this test and of a conventional test based on the normal approximation are assessed. It is shown that the modified Anbar test better controls the probability of a Type I error for moderate sample sizes.
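The abstract does not reproduce Anbar's formulas or the proposed modification; as a minimal, hedged sketch, the conventional normal-approximation benchmark test it is compared against can be written as follows (function name and defaults are ours):

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Conventional pooled normal-approximation test of H0: p1 = p2.
    A sketch of the benchmark test only; Anbar's approach and the
    paper's modification are not specified in this summary."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value
```

For small or moderate n1, n2 the actual Type I error rate of this test can drift from the nominal level, which is exactly the regime the modified test targets.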

3.
Brownian motion has been used to derive stopping boundaries for group sequential trials; however, when we observe dependent increments in the data, fractional Brownian motion is an alternative to consider for modeling such data. In this article we compared expected sample sizes and stopping times for different stopping boundaries based on the power-family alpha spending function under various values of the Hurst coefficient. Results showed that the expected sample sizes and stopping times decrease, and power increases, as the Hurst coefficient increases. For the same Hurst coefficient, the closer the boundaries are to those of O'Brien-Fleming, the higher the expected sample sizes and stopping times; however, power shows a decreasing trend for values starting from H = 0.6 (early analysis), 0.7 (equally spaced analyses), or 0.8 (late analysis). We also illustrate study design changes using results from the BHAT study.
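The power-family spending function referred to above has the standard Kim-DeMets form, cumulative error α·t^ρ in the information fraction t; a minimal sketch (parameter names are ours):

```python
def power_spending(t, alpha=0.05, rho=3.0):
    """Power-family alpha spending: cumulative Type I error spent by
    information fraction t in [0, 1] is alpha * t**rho.
    rho = 1 gives Pocock-like spending; rho = 3 approximates
    O'Brien-Fleming-type boundaries (larger rho spends less error early)."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("information fraction t must be in [0, 1]")
    return alpha * t ** rho

# the error spent at each interim analysis is the increment of the function
fractions = [0.25, 0.5, 0.75, 1.0]
spent = [power_spending(t) for t in fractions]
increments = [b - a for a, b in zip([0.0] + spent, spent)]
```

The increments always sum to the full alpha at t = 1, which is what lets expected sample size and stopping time be compared across boundary families on an equal overall-size footing.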

4.
Several distribution-free bounds on expected values of L-statistics based on the sample of possibly dependent and nonidentically distributed random variables are given in the case when the sample size is a random variable, possibly dependent on the observations, with values in the set {1,2,…}. Some bounds extend the results of Papadatos (2001a) to the case of random sample size. The others provide new evaluations even if the sample size is nonrandom. Some applications of the presented bounds are also indicated.

5.
Sample coordination maximizes or minimizes the overlap of two or more samples selected from overlapping populations. It can be applied to designs with simultaneous or sequential selection of samples. We propose a method for sample coordination in the former case. We consider the case where units are to be selected with maximum overlap using two designs with given unit inclusion probabilities. The degree of coordination is measured by the expected sample overlap, which is bounded above by a theoretical bound, called the absolute upper bound, and which depends on the unit inclusion probabilities. If the expected overlap equals the absolute upper bound, the sample coordination is maximal. Most of the methods given in the literature consider fixed marginal sampling designs, but in many cases, the absolute upper bound is not achieved. We propose to construct optimal sampling designs for given unit inclusion probabilities in order to realize maximal coordination. Our method is based on some theoretical conditions on the joint selection probability of two samples and on the controlled selection method with a linear programming implementation. The method can also be applied to minimize the sample overlap.
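For given unit inclusion probabilities, the absolute upper bound on the expected overlap is the sum over population units of the smaller of the two inclusion probabilities; a minimal sketch of this standard bound from the coordination literature (function name is ours):

```python
def expected_overlap_bound(pi1, pi2):
    """Absolute upper bound on the expected sample overlap of two
    designs with unit inclusion probabilities pi1[k] and pi2[k]:
    the sum over units of min(pi1[k], pi2[k])."""
    if len(pi1) != len(pi2):
        raise ValueError("both designs must cover the same population")
    return sum(min(a, b) for a, b in zip(pi1, pi2))
```

A coordination method is maximal precisely when its realized expected overlap attains this bound, which is the criterion the proposed linear-programming construction is designed to meet.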

6.
The paper proposes a new disclosure limitation procedure based on simulation. The key feature of the proposal is to protect actual microdata by drawing artificial units from a probability model estimated from the observed data. Such a model is designed to maintain selected characteristics of the empirical distribution, thus providing a partial representation of the latter. The characteristics we focus on are the expected values of a set of functions, which are constrained to equal their corresponding sample averages; the simulated data then reproduce the sample characteristics on average. If the set of constraints covers a user's parameters of interest, information loss is controlled, while, since the model does not preserve individual values, re-identification attempts are impaired: synthetic individuals correspond to actual respondents with very low probability.

Disclosure is discussed mainly from the viewpoint of record re-identification. Under this definition, since the pledge of confidentiality involves only the actual respondents, release of synthetic units should in principle rule out the concern for confidentiality.

The simulation model is built on the Italian sample from the Community Innovation Survey (CIS). The approach can be applied more generally, and especially suits quantitative traits. The model has a semi-parametric component, based on the maximum entropy principle, and, here, a parametric component based on regression. The maximum entropy principle is exploited to match data traits; moreover, entropy measures the uncertainty of a distribution, and its maximisation leads to a distribution that is consistent with the given information but maximally noncommittal with regard to missing information.

Application results show that the fixed characteristics are sustained and that other features, such as marginal distributions, are well represented. Model specification is clearly a major point; related issues are the selection of characteristics, goodness of fit, and the strength of dependence relations.

7.
A distribution-free test for the equality of the coefficients of variation from k populations is obtained by using the squared ranks test for variances, as presented by Conover and Iman (1978) and Conover (1980), on the original observations divided by their respective expected values. Substitution of the sample mean in place of the expected value results in the test being only asymptotically distribution-free. Results of a simulation study evaluating the size of the test for various coefficients of variation and probability distributions are presented.

8.
Sample size determination is one of the most commonly encountered tasks in the design of applied research. The general guideline suggests that a pilot study can offer plausible planning values for the vital model characteristics. This article examines two viable approaches to taking into account the imprecision of a variance estimate in sample size calculations for linear statistical models. The multiplier procedure employs an adjusted sample variance in the form of a multiple of the observed sample variance. The Bayesian method accommodates the uncertainty of a sample variance through a prior distribution. It is shown that the two seemingly distinct techniques are equivalent for sample size determination under the designated assurance requirements that the actual power exceeds the planned threshold with a given tolerance probability, or that the expected power attains the desired level. The selection of the optimum pilot sample size for minimizing the expected total cost is also considered.
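The paper calibrates its multiplier to a tolerance probability that this summary does not give; as a hedged sketch, the general shape of the multiplier procedure for a one-sample normal-mean test looks like this (the formula is the standard normal-approximation sample size; the default multiplier value is illustrative only):

```python
import math
import statistics

def adjusted_sample_size(s2, delta, alpha=0.05, power=0.80, multiplier=1.2):
    """Sample size for a two-sided one-sample z-test of a mean,
    inflating the pilot variance s2 by a multiplier to hedge against
    its imprecision.  multiplier = 1.0 recovers the naive calculation;
    the paper calibrates the multiplier to a tolerance probability."""
    z = statistics.NormalDist().inv_cdf
    za, zb = z(1 - alpha / 2), z(power)
    return math.ceil((za + zb) ** 2 * multiplier * s2 / delta ** 2)
```

Because the adjustment enters only as a scalar on s2, it is easy to see why a Bayesian treatment that inflates the variance through a prior can land on an equivalent sample size.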

9.
Practitioners of statistics are too often guilty of routinely selecting a 95% confidence level in interval estimation and ignoring the sample size and the expected size of the interval. One way to balance coverage and size is to use a loss function in a decision problem. Then either the Bayes risk or the usual risk (if a pivotal quantity exists) may be minimized. It is found that some non-Bayes solutions are equivalent to Bayes results based on non-informative priors. The decision theory approach is applied to the mean and standard deviation of the univariate normal model and the mean of the multivariate normal. Tables are presented for critical values, expected size, confidence and sample size.
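The decision-theoretic idea can be illustrated for a normal mean with known σ: take the loss to be interval length plus a penalty k for non-coverage, and minimise the resulting risk over the critical value (the loss form and constants here are illustrative, not the paper's):

```python
import math

def risk(c, sigma=1.0, n=25, k=10.0):
    """Risk of the interval  xbar +/- c*sigma/sqrt(n)  under the
    illustrative loss: interval length + k * (1 - coverage)."""
    coverage = math.erf(c / math.sqrt(2))   # P(|Z| <= c)
    return 2 * c * sigma / math.sqrt(n) + k * (1 - coverage)

# grid search for the risk-minimising critical value
best_c = min((i / 100 for i in range(1, 501)), key=risk)
```

Under this loss the optimum trades a slightly wider interval for extra coverage and lands near c ≈ 2.4 rather than the reflexive 1.96, which is the kind of sample-size-aware choice the abstract argues for.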

10.
Two-stage studies may be chosen optimally by minimising a single characteristic like the maximum sample size. However, given that an investigator will initially select a null treatment effect and the clinically relevant difference, it is better to choose a design that also considers the expected sample size for each of these values. The maximum sample size and the two expected sample sizes are here combined to produce an expected loss function to find designs that are admissible. Given the prior odds of success and the importance of the total sample size, minimising the expected loss gives the optimal design for this situation. A novel triangular graph to represent the admissible designs helps guide the decision-making process. The H0-optimal, H1-optimal, H0-minimax and H1-minimax designs are all particular cases of admissible designs. The commonly used H0-optimal design is rarely good when allowing stopping for efficacy. Additionally, the δ-minimax design, which minimises the maximum expected sample size, is sometimes admissible under the loss function. However, the results can be varied and each situation will require the evaluation of all the admissible designs. Software to do this is provided. Copyright © 2012 John Wiley & Sons, Ltd.

11.
The inverse hypergeometric distribution is of interest in applications of inverse sampling without replacement from a finite population where a binary observation is made on each sampling unit. Thus, sampling is performed by randomly choosing units sequentially one at a time until a specified number of one of the two types is selected for the sample. Assuming the total number of units in the population is known but the number of each type is not, we consider the problem of estimating this parameter. We use the Delta method to develop approximations for the variance of three parameter estimators. We then propose three large-sample confidence intervals for the parameter. Based on these results, we selected a sampling of parameter values for the inverse hypergeometric distribution to empirically investigate the performance of these estimators. We evaluate their performance in terms of expected coverage probability and expected confidence interval length, calculated as means over the possible outcomes weighted by the appropriate outcome probabilities for each parameter value considered. The unbiased estimator of the parameter is preferred to the maximum likelihood estimator and to an estimator based on a negative binomial approximation, as evidenced by empirical estimates of closeness to the true parameter value. Confidence intervals based on the unbiased estimator tend to be shorter than the two competitors because of its relatively small variance, but at a slight cost in terms of coverage probability.

12.
Inequalities involving some sample means and order statistics are established. An upper bound of the absolute difference between the sample mean and median is also derived. Interesting inequalities among the sample mean and the median are obtained for cases when all the observations have the same sign. Some other algebraic inequalities are derived by taking expected values of the sample results and then applying them to some continuous distributions. It is also proved that the mean of a non-negative continuous random variable is at least as large as p times its 100(1 − p)th percentile.
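The final inequality can be checked numerically for a concrete case, X ~ Exp(1), whose mean is 1 and whose 100(1 − p)th percentile is −ln p:

```python
import math

def bound(p):
    """p times the 100(1-p)th percentile of Exp(1), i.e. p * (-ln p)."""
    return p * (-math.log(p))

# the bound stays below the mean E[X] = 1 for every p in (0, 1);
# its maximum, 1/e ~ 0.368, occurs at p = 1/e
worst = max(bound(i / 1000) for i in range(1, 1000))
```

For this distribution the inequality is far from tight: the best achievable bound recovers only 1/e of the mean.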

13.
The small sample properties of the score function approximation to the maximum likelihood estimator for the three-parameter lognormal distribution using an alternative parameterization are considered. The new set of parameters is a continuous function of the usual parameters. However, unlike with the usual parameterization, the score function technique for this parameterization is extremely insensitive to starting values. Further, it is shown that whenever the sample third moment is less than zero, a local maximum to the likelihood function exists at a boundary point. For the usual parameterization, this point is unattainable. However, the alternative parameter space can be expanded to include these boundary points. This procedure results in good estimates of the expected value, variance, extreme percentiles and other parameters of the distribution even in samples where, with the typical parameterization, the estimation procedure fails to converge.

14.
Group sequential trials with time-to-event end points can be complicated to design. Not only are there unlimited choices for the number of events required at each stage, but for each of these choices there are unlimited combinations of accrual and follow-up at each stage that provide the required events. Methods are presented for determining optimal combinations of accrual and follow-up for two-stage clinical trials with time-to-event end points. Optimization is based on minimizing the expected total study length as a function of the expected accrual duration or sample size while providing an appropriate overall size and power. Optimal values of expected accrual duration and minimum expected total study length are given assuming an exponential proportional hazards model comparing two treatment groups. The expected total study length can be substantially decreased by including a follow-up period during which accrual is suspended. Conditions that warrant an interim follow-up period are considered, and the gain in efficiency achieved by including an interim follow-up period is quantified. This gain in efficiency should be weighed against the practical difficulties in implementing such designs. An example is given to illustrate the use of these techniques in designing a clinical trial to compare two chemotherapy regimens for lung cancer. Practical considerations of including an interim follow-up period are discussed.

15.
This article presents goodness-of-fit tests for the Laplace distribution based on its maximum entropy characterization. The critical values of the test statistics, estimated by Monte Carlo simulation, are tabulated for various window and sample sizes. The test statistics use an entropy estimator that depends on the window size, so the choice of the optimal window size is an important problem. The window sizes yielding the maximum power of the tests are given for selected sample sizes. Power studies are performed to compare the proposed tests with goodness-of-fit tests based on the empirical distribution function (EDF). Simulation results show that the entropy-based tests have consistently higher power than the EDF tests against almost all alternatives considered.
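The entropy estimator underlying such tests is typically Vasicek's spacing estimator with window size m; a minimal sketch (the test statistic itself additionally involves the Laplace maximum-entropy characterization, which this summary does not spell out):

```python
import math

def vasicek_entropy(sample, m):
    """Vasicek's entropy estimator with window size m:
    (1/n) * sum_i log( n * (X_(i+m) - X_(i-m)) / (2m) ),
    with order-statistic indices clamped to the range [1, n]."""
    x = sorted(sample)
    n = len(x)
    total = 0.0
    for i in range(n):
        lo = x[max(i - m, 0)]
        hi = x[min(i + m, n - 1)]
        total += math.log(n * (hi - lo) / (2 * m))
    return total / n
```

The estimator's bias and variance both depend on m, which is why the optimal window size must be tabulated separately for each sample size.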

16.

Sharp bounds on expected values of L-statistics based on a sample of possibly dependent, identically distributed random variables are given in the case when the sample size is a random variable with values in the set {0, 1, 2,…}. The dependence among observations is modeled by copulas and mixing. The bounds are attainable and provide characterizations of some nontrivial distributions.

17.
This study constructs a simultaneous confidence region for two combinations of coefficients of linear models and their ratios based on the concept of generalized pivotal quantities. Many biological studies, such as those on genetics, assessment of drug effectiveness, and health economics, are interested in a comparison of several dose groups with a placebo group and in the group ratios. The Bonferroni correction and the plug-in method based on the multivariate-t distribution have been proposed for simultaneous region estimation. However, the two methods are asymptotic procedures, and their performance for finite sample sizes has not been thoroughly investigated. Based on the concept of the generalized pivotal quantity, we propose a Bonferroni correction procedure and a generalized variable (GV) procedure to construct the simultaneous confidence regions. To address a genetic concern, the dominance ratio, we conduct a simulation study to empirically investigate the coverage probability and expected length of the methods for various combinations of sample sizes and values of the dominance ratio. The simulation results demonstrate that the simultaneous confidence region based on the GV procedure provides sufficient coverage probability and reasonable expected length, so it can be recommended in practice. Numerical examples using published data sets illustrate the proposed methods.

18.
In this article, optimal progressive censoring schemes are examined for the nonparametric confidence intervals of population quantiles. The results obtained can be universally applied to any continuous probability distribution. By using the interval mass as an optimality criterion, the optimization process is free of the actual observed values from the sample and needs only the initial sample size n and the number of complete failures m. Using several sample sizes combined with various degrees of censoring, the results of the optimization are presented here for the population median at selected levels of confidence (99, 95, and 90%). With the optimality criterion under consideration, the efficiencies of the worst progressive Type-II censoring scheme and the ordinary Type-II censoring scheme are also examined in comparison to the best censoring scheme obtained for fixed n and m.

19.
Testing between hypotheses, when independent sampling is possible, is a well-developed subject. In this paper, we propose hypothesis tests that are applicable when the samples are obtained using Markov chain Monte Carlo. These tests are useful when one is interested in deciding whether the expected value of a certain quantity is above or below a given threshold. We show non-asymptotic error bounds and bounds on the expected number of samples for three types of tests: a fixed sample size test, a sequential test with an indifference region, and a sequential test without an indifference region. Our tests can lead to significant savings in sample size. We illustrate our results on an example of Bayesian parameter inference involving an ODE model of a biochemical pathway.
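The paper's tests carry non-asymptotic guarantees for dependent MCMC draws; as a simplified, hedged sketch of the fixed-sample-size variant, for independent draws bounded in [0, bound] a Hoeffding confidence radius suffices (a generic illustration, not the authors' construction):

```python
import math

def threshold_test(samples, theta, delta=0.05, bound=1.0):
    """Decide whether E[X] is above or below theta from i.i.d. samples
    in [0, bound], with error probability at most delta via Hoeffding's
    inequality.  Returns 'above', 'below', or 'undecided'."""
    n = len(samples)
    mean = sum(samples) / n
    eps = bound * math.sqrt(math.log(2 / delta) / (2 * n))
    if mean - eps > theta:
        return "above"
    if mean + eps < theta:
        return "below"
    return "undecided"
```

A sequential version would grow the sample until the 'undecided' band is escaped; handling MCMC dependence additionally requires correcting the radius for the chain's mixing, which is where the paper's bounds come in.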

20.
A necessary and sufficient condition that two distributions having finite means are identical is that, for any fixed integer r > 0, the expected values of their rth (n ≥ r) order statistics are equal [or the expected values of their (n − r)th (n > r ≥ 0) order statistics are equal] for all n, where n is the sample size.
