期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Optimal rank set sampling estimates for a population proportion

《Journal of statistical planning and inference》2005,127(1-2):309-321

This paper examines two different classes of estimates for a population proportion based on an unbalanced rank set sample. Specifically, the two classes correspond to the maximum likelihood estimator (MLE) and a weighted average (WA) estimate. Both estimators are asymptotically normal, so standard inference procedures can still be implemented. Furthermore, these results can be used to develop optimal allocation schemes for both estimators. The performances of the optimal estimators are studied in terms of both finite sample and asymptotic relative efficiency. In general, the MLE is more efficient than the WA estimate. Lastly, the practicality of the optimal sampling plans is addressed and illustrated via an example. 相似文献

2.

Nested exposure case-control sampling: a sampling scheme to analyze rare time-dependent exposures

Feifel Jan Gebauer Madlen Schumacher Martin Beyersmann Jan 《Lifetime data analysis》2020,26(1):21-44

For large cohort studies with rare outcomes, the nested case-control design only requires data collection of small subsets of the individuals at risk. These are typically randomly sampled at the observed event times and a weighted, stratified analysis takes over the role of the full cohort analysis. Motivated by observational studies on the impact of hospital-acquired infection on hospital stay outcome, we are interested in situations, where not necessarily the outcome is rare, but time-dependent exposure such as the occurrence of an adverse event or disease progression is. Using the counting process formulation of general nested case-control designs, we propose three sampling schemes where not all commonly observed outcomes need to be included in the analysis. Rather, inclusion probabilities may be time-dependent and may even depend on the past sampling and exposure history. A bootstrap analysis of a full cohort data set from hospital epidemiology allows us to investigate the practical utility of the proposed sampling schemes in comparison to a full cohort analysis and a too simple application of the nested case-control design, if the outcome is not rare.

相似文献

3.

Inference and rare event simulation for stopped Markov processes via reverse-time sequential Monte Carlo

Jere Koskela Dario Spanò Paul A. Jenkins 《Statistics and Computing》2018,28(1):131-144

We present a sequential Monte Carlo algorithm for Markov chain trajectories with proposals constructed in reverse time, which is advantageous when paths are conditioned to end in a rare set. The reverse time proposal distribution is constructed by approximating the ratio of Green’s functions in Nagasawa’s formula. Conditioning arguments can be used to interpret these ratios as low-dimensional conditional sampling distributions of some coordinates of the process given the others. Hence, the difficulty in designing SMC proposals in high dimension is greatly reduced. Empirically, our method outperforms an adaptive multilevel splitting algorithm in three examples: estimating an overflow probability in a queueing model, the probability that a diffusion follows a narrowing corridor, and the initial location of an infection in an epidemic model on a network. 相似文献

4.

Bayesian model averaging for estimating the number of classes: applications to the total number of species in metagenomics

Sébastien Li-Thiao-Té Daudin Jean-Jacques Robin Stéphane 《Journal of applied statistics》2012,39(7):1489-1504

相似文献

5.

Inverse Adaptive Cluster Sampling with Unequal Selection Probabilities: Case Studies on Crab Holes and Arsenic Pollution

下载免费PDF全文

Mohammad Salehi Mohammad Moradi Jassim A. Al Khayat Jennifer Brown Adil Eltayeb Mohamed Yousif 《Australian & New Zealand Journal of Statistics》2015,57(2):189-201

Adaptive cluster sampling is an efficient method of estimating the parameters of rare and clustered populations. The method mimics how biologists would like to collect data in the field by targeting survey effort to localised areas where the rare population occurs. Another popular sampling design is inverse sampling. Inverse sampling was developed so as to be able to obtain a sample of rare events having a predetermined size. Ideally, in inverse sampling, the resultant sample set will be sufficiently large to ensure reliable estimation of population parameters. In an effort to combine the good properties of these two designs, adaptive cluster sampling and inverse sampling, we introduce inverse adaptive cluster sampling with unequal selection probabilities. We develop an unbiased estimator of the population total that is applicable to data obtained from such designs. We also develop numerical approximations to this estimator. The efficiency of the estimators that we introduce is investigated through simulation studies based on two real populations: crabs in Al Khor, Qatar and arsenic pollution in Kurdistan, Iran. The simulation results show that our estimators are efficient. 相似文献

6.

A METHOD FOR DETERMINING THE ASYMPTOTIC EFFICIENCY OF SOME SEQUENTIAL PROBABILITY RATIO TESTS

Graham Pollard 《Australian & New Zealand Journal of Statistics》1990,32(2):191-204

This paper gives a method for decomposing many sequential probability ratio tests into smaller independent components called “modules”. A function of some characteristics of modules can be used to determine the asymptotically most efficient of a set of statistical tests in which a, the probability of type I error equals β, the probability of type II error. The same test is seen also to give the asymptotically most efficient of the corresponding set of tests in which a is not equal to β. The “module” method is used to give an explanation for the super-efficiency of the play-the-winner and play-the-loser rules in two-sample binomial sampling. An example showing how complex cases can be analysed numerically using this method is also given. 相似文献

7.

Mean estimate in ranked set sampling using a length-biased concomitant variable

Chang Cui Lei Zhang 《统计学通讯:理论与方法》2019,48(12):2917-2931

In this paper, a ranked set sampling procedure with ranking based on a length-biased concomitant variable is proposed. The estimate for population mean based on this sample is given. It is proved that the estimate based on ranked set samples is asymptotically more efficient than the estimate based on simple random samples. Simulation studies are conducted to present the properties of the proposed estimate for finite sample size. Moreover, the consequence of ignoring length bias is also addressed by simulation studies and the real data analysis. 相似文献

8.

Multilevel structured additive regression

Stefan Lang Nikolaus Umlauf Peter Wechselberger Kenneth Harttgen Thomas Kneib 《Statistics and Computing》2014,24(2):223-238

相似文献

9.

Robust extreme double ranked set sampling

M. H. Hashemi Majd 《Journal of Statistical Computation and Simulation》2018,88(9):1749-1758

As a well-known method for selecting representative samples of populations, ranked set sampling (RSS) has been considered increasingly in recent years. This (RSS) method has proved to be more efficient than the usual simple random sampling (SRS) for estimating most of the population parameters. In order to have a more efficient estimate of the population mean, a new sampling scheme called as robust extreme double ranked set sampling (REDRSS) is introduced and investigated in this paper. A simulation study shows that using REDRSS scheme gives more efficient estimates of population mean with smaller variance than the usual SRS, RSS and most other sampling schemes based on RSS estimators in non-uniform (symmetric or non-symmetric) distributions. 相似文献

10.

Monte Carlo Likelihood Estimation for Three Multivariate Stochastic Volatility Models

Borus Jungbacker 《Econometric Reviews》2013,32(2-3):385-408

Estimating parameters in a stochastic volatility (SV) model is a challenging task. Among other estimation methods and approaches, efficient simulation methods based on importance sampling have been developed for the Monte Carlo maximum likelihood estimation of univariate SV models. This paper shows that importance sampling methods can be used in a general multivariate SV setting. The sampling methods are computationally efficient. To illustrate the versatility of this approach, three different multivariate stochastic volatility models are estimated for a standard data set. The empirical results are compared to those from earlier studies in the literature. Monte Carlo simulation experiments, based on parameter estimates from the standard data set, are used to show the effectiveness of the importance sampling methods. 相似文献

11.

Monte Carlo Likelihood Estimation for Three Multivariate Stochastic Volatility Models

Borus Jungbacker Siem Jan Koopman 《Econometric Reviews》2006,25(2):385-408

Estimating parameters in a stochastic volatility (SV) model is a challenging task. Among other estimation methods and approaches, efficient simulation methods based on importance sampling have been developed for the Monte Carlo maximum likelihood estimation of univariate SV models. This paper shows that importance sampling methods can be used in a general multivariate SV setting. The sampling methods are computationally efficient. To illustrate the versatility of this approach, three different multivariate stochastic volatility models are estimated for a standard data set. The empirical results are compared to those from earlier studies in the literature. Monte Carlo simulation experiments, based on parameter estimates from the standard data set, are used to show the effectiveness of the importance sampling methods. 相似文献

12.

Sequential Monte Carlo for rare event estimation

F. Cérou P. Del Moral T. Furon A. Guyader 《Statistics and Computing》2012,22(3):795-808

This paper discusses a novel strategy for simulating rare events and an associated Monte Carlo estimation of tail probabilities. Our method uses a system of interacting particles and exploits a Feynman-Kac representation of that system to analyze their fluctuations. Our precise analysis of the variance of a standard multilevel splitting algorithm reveals an opportunity for improvement. This leads to a novel method that relies on adaptive levels and produces, in the limit of an idealized version of the algorithm, estimates with optimal variance. The motivation for this theoretical work comes from problems occurring in watermarking and fingerprinting of digital contents, which represents a new field of applications of rare event simulation techniques. Some numerical results show performance close to the idealized version of our technique for these practical applications. 相似文献

13.

市场风险的高效率加速算法研究

高全胜伍旭朱丹青《统计与信息论坛》2012,(2):9-14

在计算投资组合市场风险时,采用高效率重要性抽样技术来处理大规模、高维度和稀有事件问题可以提高计算的速度和效率。在对投资组合损失进行Delta-Gamma近似的基础上,通过利用辅助分布变换函数,将求解抽样参数的最小抽样方差问题转化为一个非线性的广义最小二乘问题;在指数族抽样核的假设下,进一步将问题转化为迭代线性回归问题,从而简化了计算;通过德尔塔对冲和指数对冲投资组合的模拟算例验证了所提出方法的有效性。相似文献

14.

Record-breaking data: a parametric comparison of the inverse-sampling and the random-sampling schemes

《Journal of Statistical Computation and Simulation》2012,82(3):225-238

In many industrial quality control experiments and destructive stress testing, the only available data are successive minima (or maxima)i.e., record-breaking data. There are two sampling schemes used to collect record-breaking data: random sampling and inverse sampling. For random sampling, the total sample size is predetermined and the number of records is a random variable while in inverse-sampling the number of records to be observed is predetermined; thus the sample size is a random variable. The purpose of this papper is to determinevia simulations, which of the two schemes, if any, is more efficient. Since the two schemes are equivalent asymptotically, the simulations were carried out for small to moderate sized record-breaking samples. Simulated biases and mean square errors of the maximum likelihood estimators of the parameters using the two sampling schemes were compared. In general, it was found that if the estimators were well behaved, then there was no significant difference between the mean square errors of the estimates for the two schemes. However, for certain distributions described by both a shape and a scale parameter, random sampling led to estimators that were inconsistent. On the other hand, the estimated obtained from inverse sampling were always consistent. Moreover, for moderated sized record-breaking samples, the total sample size that needs to be observed is smaller for inverse sampling than for random sampling. 相似文献

15.

Mixed ranked set sampling design

Abdul Haq Jennifer Brown Elena Moltchanova Amer Ibrahim Al-Omari 《Journal of applied statistics》2014,41(10):2141-2156

The main focus of agricultural, ecological and environmental studies is to develop well designed, cost-effective and efficient sampling designs. Ranked set sampling (RSS) is one method that leads to accomplish such objectives by incorporating expert knowledge to its advantage. In this paper, we propose an efficient sampling scheme, named mixed RSS (MxRSS), for estimation of the population mean and median. The MxRSS scheme is a suitable mixture of both simple random sampling (SRS) and RSS schemes. The MxRSS scheme provides an unbiased estimator of the population mean, and its variance is always less than the variance of sample mean based on SRS. For both symmetric and asymmetric populations, the mean and median estimators based on SRS, partial RSS (PRSS) and MxRSS schemes are compared. It turns out that the mean and median estimates under MxRSS scheme are more precise than those based on SRS scheme. Moreover, when estimating the mean of symmetric and some asymmetric populations, the mean estimates under MxRSS scheme are found to be more efficient than the mean estimates with PRSS scheme. An application to real data is also provided to illustrate the implementation of the proposed sampling scheme. 相似文献

16.

Large sample results for tests of association ii. multinomial and stratified sampling

Elena Kulinskaya 《统计学通讯:理论与方法》2013,42(5):1121-1150

This paper deals with the asymptotics of a class of tests for association in 2-way contingency tables based on square forms in cell frequencies, given the total number of observations (multinomial sampling) or one set of marginal totals (stratified sampling). The case when both row and column marginal totals are fixed (hypergeometric sampling) was studied in Kulinskaya (1994), The class of tests under consideration includes a number of classical measures for association, Its two subclasses are the tests based on statistics using centralized cell frequencies (asymptotically distributed as weighted sums of central chi-squares) and those using the non-centralized cell frequencies (asymptotically normal). The parameters of asymptotic distributions depend on the sampling model and on true marginal probabilities. Maximum efficiency for asymptotically normal statistics is achieved under hypergeometric sampling, If the cell frequencies or the statistic as a whole are centralized using marginal proportions as estimates for marginal probabilities, the asymptotic distribution does not differ much between models and it is equivalent to that under hypergeometric sampling. These findings give an extra justification for the use of permutation tests for association (which are based on hypergeometric sampling). As an application, several well known measures of association are analysed. 相似文献

17.

Outliers in multilevel data 总被引：2，自引：0，他引：2

I. H. Langford & T. Lewis 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》1998,161(2):121-160

This paper offers the data analyst a range of practical procedures for dealing with outliers in multilevel data. It first develops several techniques for data exploration for outliers and outlier analysis and then applies these to the detailed analysis of outliers in two large scale multilevel data sets from educational contexts. The techniques include the use of deviance reduction, measures based on residuals, leverage values, hierarchical cluster analysis and a measure called DFITS. Outlier analysis is more complex in a multilevel data set than in, say, a univariate sample or a set of regression data, where the concept of an outlying value is straightforward. In the multilevel situation one has to consider, for example, at what level or levels a particular response is outlying, and in respect of which explanatory variables; furthermore, the treatment of a particular response at one level may affect its status or the status of other units at other levels in the model. 相似文献

18.

New two-stage sampling designs based on neoteric ranked set sampling

Cesar Augusto Taconeli Angelo da Silva Cabral 《Journal of Statistical Computation and Simulation》2019,89(2):232-248

Neoteric ranked set sampling (NRSS) is a recently developed sampling plan, derived from the well-known ranked set sampling (RSS) scheme. It has already been proved that NRSS provides more efficient estimators for population mean and variance compared to RSS and other sampling designs based on ranked sets. In this work, we propose and evaluate the performance of some two-stage sampling designs based on NRSS. Five different sampling schemes are proposed. Through an extensive Monte Carlo simulation study, we verified that all proposed sampling designs outperform RSS, NRSS, and the original double RSS design, producing estimators for the population mean with a lower mean square error. Furthermore, as with NRSS, two-stage NRSS estimators present some bias for asymmetric distributions. We complement the study with a discussion on the relative performance of the proposed estimators. Moreover, an additional simulation based on data of the diameter and height of pine trees is presented. 相似文献

19.

Improving the best linear unbiased estimator for the scale parameter of symmetric distributions by using the absolute value of ranked set samples

Gang Zheng Mohammad Al-Saleh 《Journal of applied statistics》2003,30(3):253-265

Ranked set sampling is a cost efficient sampling technique when actually measuring sampling units is difficult but ranking them is relatively easy. For a family of symmetric location-scale distributions with known location parameter, we consider a best linear unbiased estimator for the scale parameter. Instead of using original ranked set samples, we propose to use the absolute deviations of the ranked set samples from the location parameter. We demonstrate that this new estimator has smaller variance than the best linear unbiased estimator using original ranked set samples. Optimal allocation in the absolute value of ranked set samples is also discussed for the estimation of the scale parameter when the location parameter is known. Finally, we perform some sensitivity analyses for this new estimator when the location parameter is unknown but estimated using ranked set samples and when the ranking of sampling units is imperfect. 相似文献

20.

The matched pair sign test using bivariate ranked set sampling for different ranking based schemes

Hani M. Samawi Mohammad F. Al-Saleh Obaid Al-Saidy 《Statistical Methodology》2009,6(4):397-407

Bivariate rank set sample (BVRSS) matched pair sign test is introduced and investigated for different ranking based schemes. We show that this test is asymptotically more efficient and more powerful than its counterpart sign test based on a bivariate simple random sample (BVSRS) for different ranking schemes. The asymptotic null distribution and the efficiency of the test are derived. Pitman’s asymptotic relative efficiency is used to compare the asymptotic performance of the matched pair sign test using BVRSS versus using BVSRS in all ranking cases. For small sample sizes, the bootstrap method is used to estimate P-values. Numerical comparisons are used to gain insight about the efficiency of the BVRSS sign test compared to the BVSRS sign test. Our numerical and theoretical results indicate that using any ranking scheme of BVRSS for the matched pair sign test is more efficient than using BVSRS. 相似文献