期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Recursive computation of inclusion probabilities in ranked-set sampling 总被引：1，自引：0，他引：1

Jesse Frey 《Journal of statistical planning and inference》2011,141(11):3632-3639

We derive recursive algorithms for computing first-order and second-order inclusion probabilities for ranked-set sampling from a finite population. These algorithms make it practical to compute inclusion probabilities even for relatively large sample and population sizes. As an application, we use the inclusion probabilities to examine the performance of Horvitz-Thompson estimators under different varieties of balanced ranked-set sampling. We find that it is only for balanced Level 2 sampling that the Horvitz-Thompson estimator can be relied upon to outperform the simple random sampling mean estimator. 相似文献

2.

Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm

J. G. Booth & J. P. Hobert 《Journal of the Royal Statistical Society. Series B, Statistical methodology》1999,61(1):265-285

Two new implementations of the EM algorithm are proposed for maximum likelihood fitting of generalized linear mixed models. Both methods use random (independent and identically distributed) sampling to construct Monte Carlo approximations at the E-step. One approach involves generating random samples from the exact conditional distribution of the random effects (given the data) by rejection sampling, using the marginal distribution as a candidate. The second method uses a multivariate t importance sampling approximation. In many applications the two methods are complementary. Rejection sampling is more efficient when sample sizes are small, whereas importance sampling is better with larger sample sizes. Monte Carlo approximation using random samples allows the Monte Carlo error at each iteration to be assessed by using standard central limit theory combined with Taylor series methods. Specifically, we construct a sandwich variance estimate for the maximizer at each approximate E-step. This suggests a rule for automatically increasing the Monte Carlo sample size after iterations in which the true EM step is swamped by Monte Carlo error. In contrast, techniques for assessing Monte Carlo error have not been developed for use with alternative implementations of Monte Carlo EM algorithms utilizing Markov chain Monte Carlo E-step approximations. Three different data sets, including the infamous salamander data of McCullagh and Nelder, are used to illustrate the techniques and to compare them with the alternatives. The results show that the methods proposed can be considerably more efficient than those based on Markov chain Monte Carlo algorithms. However, the methods proposed may break down when the intractable integrals in the likelihood function are of high dimension. 相似文献

3.

Non-parametric Bayesian Inference for Integrals with respect to an Unknown Finite Measure

TORKEL ERHARDSSON 《Scandinavian Journal of Statistics》2008,35(2):369-384

Abstract. We consider the problem of estimating a collection of integrals with respect to an unknown finite measure μ from noisy observations of some of the integrals. A new method to carry out Bayesian inference for the integrals is proposed. We use a Dirichlet or Gamma process as a prior for μ , and construct an approximation to the posterior distribution of the integrals using the sampling importance resampling algorithm and samples from a new multidimensional version of a Markov chain by Feigin and Tweedie. We prove that the Markov chain is positive Harris recurrent, and that the approximating distribution converges weakly to the posterior as the sample size increases, under a mild integrability condition. Applications to polymer chemistry and mathematical finance are given. 相似文献

4.

Multilevel modelling of complex survey data

Sophia Rabe-Hesketh Anders Skrondal 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2006,169(4):805-827

Summary. Multilevel modelling is sometimes used for data from complex surveys involving multistage sampling, unequal sampling probabilities and stratification. We consider generalized linear mixed models and particularly the case of dichotomous responses. A pseudolikelihood approach for accommodating inverse probability weights in multilevel models with an arbitrary number of levels is implemented by using adaptive quadrature. A sandwich estimator is used to obtain standard errors that account for stratification and clustering. When level 1 weights are used that vary between elementary units in clusters, the scaling of the weights becomes important. We point out that not only variance components but also regression coefficients can be severely biased when the response is dichotomous. The pseudolikelihood methodology is applied to complex survey data on reading proficiency from the American sample of the 'Program for international student assessment' 2000 study, using the Stata program gllamm which can estimate a wide range of multilevel and latent variable models. Performance of pseudo-maximum-likelihood with different methods for handling level 1 weights is investigated in a Monte Carlo experiment. Pseudo-maximum-likelihood estimators of (conditional) regression coefficients perform well for large cluster sizes but are biased for small cluster sizes. In contrast, estimators of marginal effects perform well in both situations. We conclude that caution must be exercised in pseudo-maximum-likelihood estimation for small cluster sizes when level 1 weights are used. 相似文献

5.

A class of simple approximate sequential tests for adaptive comparison of two treatments

Lakhbir S. Hayre Bruce W. Turnbull 《统计学通讯:理论与方法》2013,42(22):2339-2360

We consider the problem of sequentially deciding which of two treatments is superior, A class of simple approximate sequential tests is proposed. These have the probabilities of correct selection approximately independent of the sampling rule and depending on unknown parameters only through the function of interest, such as the difference or ratio of mean responses. The tests are obtained by using a normal approximation, and this is employed to derive approximate expressions for the probabilities of correct selection and the expected sample sizes. A class of data-dependent sampling rules is proposed for minimizing any weighted average of the expected sample sizes on the two treatments, with the weights being allowed to depend on unknown parameters. The tests are studied in the particular cases of exponentially. 相似文献

6.

Pitman closeness as a criterion for the determination of the optimal progressive censoring scheme

《Statistical Methodology》2012,9(6):563-572

Selecting the optimal progressive censoring scheme for the exponential distribution according to Pitman closeness criterion is discussed. For small sample sizes the Pitman closeness probabilities are calculated explicitly, and it is shown that the optimal progressive censoring scheme is the usual Type-II right censoring case. It is conjectured that this to be the case for all sample sizes. A general algorithm is also presented for the numerical computation of the Pitman closeness probabilities between any two progressive censoring schemes of the same size. 相似文献

7.

Pitman closeness as a criterion for the determination of the optimal progressive censoring scheme

《Statistical Methodology》2013,10(6):563-572

Selecting the optimal progressive censoring scheme for the exponential distribution according to Pitman closeness criterion is discussed. For small sample sizes the Pitman closeness probabilities are calculated explicitly, and it is shown that the optimal progressive censoring scheme is the usual Type-II right censoring case. It is conjectured that this to be the case for all sample sizes. A general algorithm is also presented for the numerical computation of the Pitman closeness probabilities between any two progressive censoring schemes of the same size. 相似文献

8.

The Use of Sample Weights in Hot Deck Imputation

Andridge RR Little RJ 《Journal of official statistics》2009,25(1):21-36

A common strategy for handling item nonresponse in survey sampling is hot deck imputation, where each missing value is replaced with an observed response from a "similar" unit. We discuss here the use of sampling weights in the hot deck. The naive approach is to ignore sample weights in creation of adjustment cells, which effectively imputes the unweighted sample distribution of respondents in an adjustment cell, potentially causing bias. Alternative approaches have been proposed that use weights in the imputation by incorporating them into the probabilities of selection for each donor. We show by simulation that these weighted hot decks do not correct for bias when the outcome is related to the sampling weight and the response propensity. The correct approach is to use the sampling weight as a stratifying variable alongside additional adjustment variables when forming adjustment cells. 相似文献

9.

Free-knot polynomial splines with confidence intervals

Wenxin Mao Linda H. Zhao 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2003,65(4):901-919

Summary. We construct approximate confidence intervals for a nonparametric regression function, using polynomial splines with free-knot locations. The number of knots is determined by generalized cross-validation. The estimates of knot locations and coefficients are obtained through a non-linear least squares solution that corresponds to the maximum likelihood estimate. Confidence intervals are then constructed based on the asymptotic distribution of the maximum likelihood estimator. Average coverage probabilities and the accuracy of the estimate are examined via simulation. This includes comparisons between our method and some existing methods such as smoothing spline and variable knots selection as well as a Bayesian version of the variable knots method. Simulation results indicate that our method works well for smooth underlying functions and also reasonably well for discontinuous functions. It also performs well for fairly small sample sizes. 相似文献

10.

Bootstrap confidence bands for the CDF using ranked-set sampling

《Journal of the Korean Statistical Society》2014,43(3):453-461

In ranked-set sampling (RSS), a stratification by ranks is used to obtain a sample that tends to be more informative than a simple random sample of the same size. Previous work has shown that if the rankings are perfect, then one can use RSS to obtain Kolmogorov–Smirnov type confidence bands for the CDF that are narrower than those obtained under simple random sampling. Here we develop Kolmogorov–Smirnov type confidence bands that work well whether the rankings are perfect or not. These confidence bands are obtained by using a smoothed bootstrap procedure that takes advantage of special features of RSS. We show through a simulation study that the coverage probabilities are close to nominal even for samples with just two or three observations. A new algorithm allows us to avoid the bootstrap simulation step when sample sizes are relatively small. 相似文献

11.

Variance reduction of estimators arising from Metropolis–Hastings algorithms

George Iliopoulos Sonia Malefaki 《Statistics and Computing》2013,23(5):577-587

The Metropolis–Hastings algorithm is one of the most basic and well-studied Markov chain Monte Carlo methods. It generates a Markov chain which has as limit distribution the target distribution by simulating observations from a different proposal distribution. A proposed value is accepted with some particular probability otherwise the previous value is repeated. As a consequence, the accepted values are repeated a positive number of times and thus any resulting ergodic mean is, in fact, a weighted average. It turns out that this weighted average is an importance sampling-type estimator with random weights. By the standard theory of importance sampling, replacement of these random weights by their (conditional) expectations leads to more efficient estimators. In this paper we study the estimator arising by replacing the random weights with certain estimators of their conditional expectations. We illustrate by simulations that it is often more efficient than the original estimator while in the case of the independence Metropolis–Hastings and for distributions with finite support we formally prove that it is even better than the “optimal” importance sampling estimator. 相似文献

12.

Reasonable sample sizes for convergence to normality

Carsten Schröder 《统计学通讯:模拟与计算》2017,46(9):7074-7087

The central limit theorem says that, provided an estimator fulfills certain weak conditions, then, for reasonable sample sizes, the sampling distribution of the estimator converges to normality. We propose a procedure to find out what a “reasonably large sample size” is. The procedure is based on the properties of Gini's mean difference decomposition. We show the results of implementations of the procedure from simulated datasets and data from the German Socio-economic Panel. 相似文献

13.

Corrected likelihood ratio and score tests for the beta distribution

《Journal of Statistical Computation and Simulation》2012,82(8):585-596

We present correction formulae to improve likelihood ratio and score teats for testing simple and composite hypotheses on the parameters of the beta distribution. As a special case of our results we obtain improved tests for the hypothesis that a sample is drawn from a uniform distribution on (0, 1). We present some Monte Carlo investigations to show that both corrected tests have better performances than the classical likelihood ratio and score tests at least for small sample sizes. 相似文献

14.

A Horvitz-Thompson Estimator of the Population Mean Using Inclusion Probabilities of Ranked Set Sampling

Fikri Gökpinar Yaprak Arzu Özdemir 《统计学通讯:理论与方法》2013,42(6):1029-1039

In this study, we define the Horvitz-Thompson estimator of the population mean using the inclusion probabilities of a ranked set sample in a finite population setting. The second-order inclusion probabilities that are required to calculate the variance of the Horvitz-Thompson estimator were obtained. The Horvitz-Thompson estimator, using the inclusion probabilities of ranked set sample, tends to be more efficient than the classical ranked set sampling estimator especially in a positively skewed population with small sizes. Also, we present a real data example with the volatility of gasoline to illustrate the Horvitz-Thompson estimator based on ranked set sampling. 相似文献

15.

Comparison of Interval Estimators of Pr(X < Y) in the Two-parameter Exponential Distribution

Ayman Baklizi 《统计学通讯:模拟与计算》2016,45(8):2937-2946

We consider confidence intervals for the stress–strength reliability Pr(X< Y) in the two-parameter exponential distribution. We have derived the Bayesian highest posterior density interval using non-informative prior distributions. We have compared its performance with the intervals based on the generalized pivot variable intervals in terms of their coverage probabilities and expected lengths. Our simulation study shows that the Bayesian interval performs better according to the criteria used, especially when the sample sizes are very small. An example is given. 相似文献

16.

Multiple comparisons based on a modified one-step M-estimator

Rand Wilcox 《Journal of applied statistics》2003,30(10):1231-1241

Although many methods are available for performing multiple comparisons based on some measure of location, most can be unsatisfactory in at least some situations, in simulations when sample sizes are small, say less than or equal to twenty. That is, the actual Type I error probability can substantially exceed the nominal level, and for some methods the actual Type I error probability can be well below the nominal level, suggesting that power might be relatively poor. In addition, all methods based on means can have relatively low power under arbitrarily small departures from normality. Currently, a method based on 20% trimmed means and a percentile bootstrap method performs relatively well (Wilcox, in press). However, symmetric trimming was used, even when sampling from a highly skewed distribution and a rigid adherence to 20% trimming can result in low efficiency when a distribution is sufficiently heavy-tailed. Robust M-estimators are more flexible but they can be unsatisfactory in terms of Type I errors when sample sizes are small. This paper describes an alternative approach based on a modified one-step M-estimator that introduces more flexibility than a trimmed mean but provides better control over Type I error probabilities compared with using a one-step M-estimator. 相似文献

17.

Properties of bagged nearest neighbour classifiers

Peter Hall Richard J. Samworth 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2005,67(3):363-379

Summary. It is shown that bagging, a computationally intensive method, asymptotically improves the performance of nearest neighbour classifiers provided that the resample size is less than 69% of the actual sample size, in the case of with-replacement bagging, or less than 50% of the sample size, for without-replacement bagging. However, for larger sampling fractions there is no asymptotic difference between the risk of the regular nearest neighbour classifier and its bagged version. In particular, neither achieves the large sample performance of the Bayes classifier. In contrast, when the sampling fractions converge to 0, but the resample sizes diverge to ∞, the bagged classifier converges to the optimal Bayes rule and its risk converges to the risk of the latter. These results are most readily seen when the two populations have well-defined densities, but they may also be derived in other cases, where densities exist in only a relative sense. Cross-validation can be used effectively to choose the sampling fraction. Numerical calculation is used to illustrate these theoretical properties. 相似文献

18.

Stochastic matching pursuit for Bayesian variable selection

Ray-Bing Chen Chi-Hsiang Chu Te-You Lai Ying Nian Wu 《Statistics and Computing》2011,21(2):247-259

This article proposes a stochastic version of the matching pursuit algorithm for Bayesian variable selection in linear regression. In the Bayesian formulation, the prior distribution of each regression coefficient is assumed to be a mixture of a point mass at 0 and a normal distribution with zero mean and a large variance. The proposed stochastic matching pursuit algorithm is designed for sampling from the posterior distribution of the coefficients for the purpose of variable selection. The proposed algorithm can be considered a modification of the componentwise Gibbs sampler. In the componentwise Gibbs sampler, the variables are visited by a random or a systematic scan. In the stochastic matching pursuit algorithm, the variables that better align with the current residual vector are given higher probabilities of being visited. The proposed algorithm combines the efficiency of the matching pursuit algorithm and the Bayesian formulation with well defined prior distributions on coefficients. Several simulated examples of small n and large p are used to illustrate the algorithm. These examples show that the algorithm is efficient for screening and selecting variables. 相似文献

19.

Bayesian weighted inference from surveys

David Gunawan Anastasios Panagiotelis William Griffiths Duangkamon Chotikapanich 《Australian & New Zealand Journal of Statistics》2020,62(1):71-94

Data from large surveys are often supplemented with sampling weights that are designed to reflect unequal probabilities of response and selection inherent in complex survey sampling methods. We propose two methods for Bayesian estimation of parametric models in a setting where the survey data and the weights are available, but where information on how the weights were constructed is unavailable. The first approach is to simply replace the likelihood with the pseudo likelihood in the formulation of Bayes theorem. This is proven to lead to a consistent estimator but also leads to credible intervals that suffer from systematic undercoverage. Our second approach involves using the weights to generate a representative sample which is integrated into a Markov chain Monte Carlo (MCMC) or other simulation algorithms designed to estimate the parameters of the model. In the extensive simulation studies, the latter methodology is shown to achieve performance comparable to the standard frequentist solution of pseudo maximum likelihood, with the added advantage of being applicable to models that require inference via MCMC. The methodology is demonstrated further by fitting a mixture of gamma densities to a sample of Australian household income. 相似文献

20.

A List Sequential Sampling Method Suitable for Real-Time Sampling

LENNART BONDESSON DANIEL THORBURN 《Scandinavian Journal of Statistics》2008,35(3):466-483

Abstract. A flexible list sequential π ps sampling method is introduced and studied. It can reproduce any given sampling design without replacement, of fixed or random sample size. The method is a splitting method and uses successive updating of inclusion probabilities. The main advantage of the method is in real-time sampling situations where it can be used as a powerful alternative to Bernoulli and Poisson sampling and can give any desired second-order inclusion probabilities and thus considerably reduce the variability of the sample size. 相似文献