Similar Documents
20 similar documents found (search time: 31 ms)
1.
This paper deals with techniques for obtaining random point samples from spatial databases. We seek random points from a continuous domain (usually ℝ²) which satisfy a spatial predicate that is represented in the database as a collection of polygons. Several applications of spatial sampling (e.g. environmental monitoring, agronomy, forestry, etc.) are described. Sampling problems are characterized in terms of two key parameters: coverage (selectivity) and expected stabbing number (overlap). We discuss two fundamental approaches to sampling with spatial predicates, depending on whether we sample first or evaluate the predicate first. The approaches are described in the context of both quadtrees and R-trees, detailing the sample-first, acceptance/rejection tree, and partial area tree algorithms. A sequential algorithm, the one-pass spatial reservoir algorithm, is also described. The relative performance of the various sampling algorithms is compared and a choice of preferred algorithms is suggested. We conclude with a short discussion of possible extensions.
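The one-pass spatial reservoir algorithm mentioned above builds on classic reservoir sampling. A minimal sketch (not the paper's spatial variant; the predicate evaluation against polygons is omitted) of the underlying one-pass algorithm, Vitter's Algorithm R:

```python
import random

def reservoir_sample(stream, k, seed=0):
    """One-pass reservoir sampling: keep a uniform random subset of size k
    from a stream of unknown length, using O(k) memory (Algorithm R)."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            # Item i survives with probability k/(i+1); evict a uniform slot.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir

sample = reservoir_sample(range(10_000), 5)
```

In the spatial setting, the stream would consist of candidate points that have already passed (or are tested against) the polygonal predicate.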

2.
This paper addresses the problem of unbiased estimation of P[X > Y] = θ for two independent exponentially distributed random variables X and Y. We present the (unique) unbiased estimator of θ based on a single pair of order statistics obtained from two independent random samples from the two populations. We also indicate how this estimator can be utilized to obtain unbiased estimators of θ when only a few selected order statistics are available from the two random samples, as well as when the samples are selected by an alternative procedure known as ranked set sampling. It is proved that for ranked set samples of size two, the proposed estimator is uniformly better than the conventional non-parametric unbiased estimator and, further, that a modified ranked set sampling procedure provides an unbiased estimator even better than the proposed estimator.
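For independent exponentials with rates λ_X and λ_Y, the target quantity has the closed form θ = P(X > Y) = λ_Y/(λ_X + λ_Y), a standard fact that a Monte Carlo check confirms (this sketch illustrates the estimand only, not the paper's order-statistic estimator):

```python
import random

def theta_exact(lam_x, lam_y):
    # P(X > Y) for X ~ Exp(lam_x), Y ~ Exp(lam_y):
    # integrate P(X > y) = exp(-lam_x * y) against Y's density.
    return lam_y / (lam_x + lam_y)

def theta_mc(lam_x, lam_y, n=200_000, seed=1):
    """Crude Monte Carlo estimate of P(X > Y)."""
    rng = random.Random(seed)
    hits = sum(rng.expovariate(lam_x) > rng.expovariate(lam_y) for _ in range(n))
    return hits / n

exact = theta_exact(2.0, 1.0)   # 1/3
approx = theta_mc(2.0, 1.0)
```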

3.
Abstract

Multistage sampling is a common sampling technique employed in many studies. In this setting, observations are identically distributed but not independent, thus many traditional kernel smoothing techniques, which assume that the data are independent and identically distributed (i.i.d.), may not produce reasonable density estimates. In this paper, we sample repeatedly with replacement from each cluster, create multiple i.i.d. samples containing one observation from each cluster, and then create a kernel density estimate from each i.i.d. sample. These estimates will then be combined to form an estimate of the marginal probability density function of the population.
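The resampling scheme above can be sketched as follows (a simplified illustration under assumed one-dimensional data and a Gaussian kernel; cluster structure, bandwidth choice, and the paper's combination rule are my assumptions, not taken from the article):

```python
import math
import random

def kde(points, xs, h):
    """Gaussian kernel density estimate of `points`, evaluated at `xs`."""
    c = 1.0 / (len(points) * h * math.sqrt(2 * math.pi))
    return [c * sum(math.exp(-0.5 * ((x - p) / h) ** 2) for p in points)
            for x in xs]

def clustered_kde(clusters, xs, h=0.4, B=50, seed=0):
    """Draw one observation per cluster (with replacement across rounds) to
    form B i.i.d.-like samples, fit a KDE to each, and average the B fits."""
    rng = random.Random(seed)
    acc = [0.0] * len(xs)
    for _ in range(B):
        sample = [rng.choice(c) for c in clusters]   # one obs per cluster
        for i, v in enumerate(kde(sample, xs, h)):
            acc[i] += v / B
    return acc

clusters = [[0.1, 0.3], [0.0, -0.2], [0.5, 0.4], [-0.1, 0.2]]
est = clustered_kde(clusters, xs=[0.0, 0.5])
```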

4.
Biased sampling from an underlying distribution with p.d.f. f(t), t > 0, implies that observations follow the weighted distribution with p.d.f. f_w(t) = w(t)f(t)/E[w(T)] for a known weight function w. In particular, the function w(t) = t^α has important applications, including length-biased sampling (α = 1) and area-biased sampling (α = 2). We first consider here the maximum likelihood estimation of the parameters of a distribution f(t) under biased sampling from a censored population in a proportional hazards frailty model where a baseline distribution (e.g. Weibull) is mixed with a continuous frailty distribution (e.g. Gamma). A right-censored observation contributes a term proportional to w(t)S(t) to the likelihood; this is not the same as S_w(t), so the problem of fitting the model does not simply reduce to fitting the weighted distribution. We present results on the distribution of frailty in the weighted distribution and develop an EM algorithm for estimating the parameters of the model in the important Weibull–Gamma case. We also give results for the case where f(t) is a finite mixture distribution. Results are presented for uncensored data and for Type I right censoring. Simulation results are presented, and the methods are illustrated on a set of lifetime data.
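Length-biased sampling (w(t) = t) can be made concrete with the exponential case: if T ~ Exp(λ), the weighted density t·λe^{-λt}/E[T] is Gamma(shape 2, rate λ), i.e. a sum of two independent Exp(λ) draws, so the length-biased mean is 2/λ rather than 1/λ. A small check of this fact (not the paper's EM algorithm):

```python
import random

def length_biased_exp(lam, n, seed=0):
    """Draw n observations from the length-biased (w(t)=t) version of
    Exp(lam): the weighted density is Gamma(2, lam), the sum of two
    independent Exp(lam) variables."""
    rng = random.Random(seed)
    return [rng.expovariate(lam) + rng.expovariate(lam) for _ in range(n)]

lam = 1.0
draws = length_biased_exp(lam, 100_000)
mean = sum(draws) / len(draws)
# Underlying mean is 1/lam = 1; the length-biased mean is 2/lam = 2,
# illustrating why naively fitting f(t) to biased data is misleading.
```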

5.
Previous work has been carried out on the use of double sampling schemes for inference from binomial data which are subject to misclassification. The double sampling scheme utilizes a sample of n units which are classified by both a fallible and a true device and another sample of n2 units which are classified only by the fallible device. A triple sampling scheme incorporates an additional sample of n1 units which are classified only by the true device. In this paper we apply this triple sampling scheme to estimation from binomial data. First, estimation of a binomial proportion is discussed under different misclassification structures. Then, the problem of optimal allocation of sample sizes is discussed.

6.
The use of robust measures helps to increase the precision of estimators, especially when estimating extremely skewed distributions. In this article, a generalized ratio estimator is proposed using some robust measures with a single auxiliary variable under the adaptive cluster sampling (ACS) design. We incorporate the tri-mean (TM), mid-range (MR) and Hodges–Lehmann (HL) estimator of the auxiliary variable as robust measures, together with some conventional measures. The expressions for the bias and mean square error (MSE) of the proposed generalized ratio estimator are derived. Two types of numerical studies, using an artificial clustered population and a real data application, are conducted to examine the performance of the proposed estimator against the usual mean-per-unit estimator under simple random sampling (SRS). Results of the simulation study show that the proposed estimators provide better estimation results on both the real and the artificial population than the competing estimators.
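The three robust measures named above have simple definitions: the tri-mean is (Q1 + 2·median + Q3)/4, the mid-range is (min + max)/2, and the Hodges–Lehmann estimator is the median of all pairwise (Walsh) averages. A minimal sketch of the measures themselves (the ratio-estimator construction from the article is not reproduced):

```python
from statistics import median, quantiles

def tri_mean(x):
    """Tukey's tri-mean: (Q1 + 2*Q2 + Q3) / 4."""
    q1, q2, q3 = quantiles(x, n=4)   # default 'exclusive' quartile method
    return (q1 + 2 * q2 + q3) / 4

def mid_range(x):
    return (min(x) + max(x)) / 2

def hodges_lehmann(x):
    """Median of all Walsh averages (x_i + x_j)/2 with i <= j."""
    s = sorted(x)
    walsh = [(a + b) / 2 for i, a in enumerate(s) for b in s[i:]]
    return median(walsh)

x = [2, 4, 6, 8, 100]   # skewed by one extreme value
```

On this skewed toy sample the Hodges–Lehmann estimate stays near the bulk of the data while the mid-range is dragged toward the outlier, which is the motivation for using robust measures in the estimator.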

7.
Under stratified random sampling, we develop a kth-order bootstrap bias-corrected estimator of the number of classes θ which exist in a study region. This research extends Smith and van Belle’s (1984) first-order bootstrap bias-corrected estimator under simple random sampling. Our estimator has applicability for many settings including: estimating the number of animals when there are stratified capture periods, estimating the number of species based on stratified random sampling of subunits (say, quadrats) from the region, and estimating the number of errors/defects in a product based on observations from two or more types of inspectors. When the differences between the strata are large, utilizing stratified random sampling and our estimator often results in superior performance versus the use of simple random sampling and its bootstrap or jackknife [Burnham and Overton (1978)] estimator. The superior performance is often associated with more observed classes, and we provide insights into optimal designation of the strata and optimal allocation of sample sectors to strata.
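The first-order correction being extended here has a simple form under simple random sampling: if θ̂ is the observed number of classes and θ̂*₁, …, θ̂*_B are bootstrap replicates, the bias-corrected estimate is 2θ̂ − mean(θ̂*). A generic sketch of that first-order idea (the stratified, kth-order estimator of the paper is not shown):

```python
import random

def bootstrap_bias_corrected(sample, estimator, B=500, seed=0):
    """First-order bootstrap bias correction:
    theta_bc = 2 * theta_hat - mean of bootstrap replicates."""
    rng = random.Random(seed)
    theta = estimator(sample)
    boot = [estimator([rng.choice(sample) for _ in sample]) for _ in range(B)]
    return 2 * theta - sum(boot) / B

# Estimating the number of distinct classes from class labels of sampled units.
labels = ['a', 'b', 'b', 'c', 'a', 'c', 'b', 'a']
n_classes = bootstrap_bias_corrected(labels, lambda s: len(set(s)))
```

Because bootstrap replicates can only lose classes, the replicate mean is at most the observed count, so the corrected estimate adjusts upward, compensating for classes missed by the sample.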

8.
Millions of smart meters that are able to collect individual load curves, that is, electricity consumption time series, of residential and business customers at fine scale time grids are now deployed by electricity companies all around the world. It may be complex and costly to transmit and exploit such a large quantity of information, therefore it can be relevant to use survey sampling techniques to estimate mean load curves of specific groups of customers. Data collection, like every mass process, may undergo technical problems at every point of the metering and collection chain resulting in missing values. We consider imputation approaches (linear interpolation, kernel smoothing, nearest neighbours, principal analysis by conditional estimation) that take advantage of the specificities of the data, that is to say the strong relation between the consumption at different instants of time. The performances of these techniques are compared on a real example of Irish electricity load curves under various scenarios of missing data. A general variance approximation of total estimators is also given which encompasses nearest neighbours, kernel smoothers imputation and linear imputation methods. The Canadian Journal of Statistics 47: 65–89; 2019 © 2018 Statistical Society of Canada
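The simplest of the imputation approaches listed, linear interpolation, exploits exactly the property the abstract mentions: consumption at nearby instants is strongly related. A minimal sketch for a regularly sampled load curve with gaps marked as `None` (edge handling here, copying the nearest observed value, is my assumption):

```python
def linear_impute(series):
    """Fill None gaps by linear interpolation between the nearest observed
    neighbours; leading/trailing gaps copy the nearest observed value."""
    obs = [i for i, v in enumerate(series) if v is not None]
    out = list(series)
    for i in range(len(series)):
        if out[i] is not None:
            continue
        left = max((j for j in obs if j < i), default=None)
        right = min((j for j in obs if j > i), default=None)
        if left is None:
            out[i] = series[right]
        elif right is None:
            out[i] = series[left]
        else:
            w = (i - left) / (right - left)
            out[i] = (1 - w) * series[left] + w * series[right]
    return out

curve = [1.0, None, None, 4.0, 5.0, None]   # toy half-hourly load curve
filled = linear_impute(curve)
```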

9.
This study investigates the statistical properties of adaptive Hotelling's T² charts with run rules, in which the sample size and sampling interval are allowed to vary according to the current and past sampling points. The adaptive charts include variable sample size (VSS), variable sampling interval (VSI), and variable sample size and sampling interval (VSSI) charts. The adaptive Hotelling's T² charts with run rules are compared with the fixed sampling rate Hotelling's T² chart with run rules. The numerical results show that the VSS, VSI, and VSSI features improve the performance of the Hotelling's T² chart with run rules.

10.
The present article deals with some methods for estimation of finite population means in the presence of a linear trend among the population values. We provide a strategy for the selection of the sampling interval k for circular systematic sampling which ensures a better estimator for the population mean than other choices of the sampling interval; this is established through empirical studies. Furthermore, we apply multiple random start methods for selecting random samples under linear systematic sampling and diagonal systematic sampling schemes. We also derive explicit expressions for the variances and their estimates. The relative performances of simple random sampling, linear systematic sampling and diagonal systematic sampling schemes with single and multiple random starts are also assessed based on numerical examples.
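Circular systematic sampling, which the strategy above concerns, selects every k-th unit starting from a random unit and wraps around the population list modulo N. A minimal sketch (the article's rule for choosing k is not reproduced here):

```python
def circular_systematic_sample(N, n, k, start):
    """Circular systematic sampling: from a starting unit `start` (0-indexed),
    take every k-th unit modulo N until n units are drawn."""
    return [(start + i * k) % N for i in range(n)]

# N = 10 units, n = 4 sampled, interval k = 3, random start 2:
sample = circular_systematic_sample(10, 4, 3, 2)   # indices 2, 5, 8, 1
```

Unlike linear systematic sampling, the modulo wrap-around guarantees exactly n units are drawn whatever the start, which is why k can be tuned freely against a linear trend.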

11.
In this work, we define a new method of ranked set sampling (RSS) which is suitable when the characteristic (variable) Y of primary interest is jointly distributed with an auxiliary characteristic X that can be measured on any number of units; only units having record values on X are ranked and retained for measurement of Y. We name this RSS concomitant record ranked set sampling (CRRSS). We propose estimators of the parameters associated with the variable Y of primary interest based on observations from the proposed CRRSS, which are applicable to a very large class of distributions, viz. the Morgenstern family of distributions. We illustrate the application of CRRSS and our parameter estimation technique when the basic distribution is the Morgenstern-type bivariate logistic distribution. Primary data collected by the CRRSS method are presented and used to illustrate the results developed in this work.

12.
Variational and variational Bayes techniques are popular approaches for statistical inference of complex models but their theoretical properties are still not well known. Because of both unobserved variables and intricate dependency structures, mixture models for random graphs constitute a good case study. We first present four different variational estimates for the parameters of these models. We then compare their accuracy through simulation studies and show that the variational Bayes estimates seem the most accurate for moderate graph size. We finally re-analyse the regulatory network of Escherichia coli with this approach.

13.
The classical histogram method has already been applied in line transect sampling to estimate the parameter f(0), which in turn is used to estimate the population abundance D or the population size N. It is well known that the bias convergence rate for the histogram estimator of f(0) is o(h²) as h → 0 under the shoulder condition assumption. If the shoulder condition does not hold, the bias convergence rate is only o(h). This paper proposes two new estimators for f(0), which can be considered modifications of the classical histogram estimator. The first estimator is derived under the assumption that the shoulder condition is valid, and it improves the bias convergence rate from o(h²) to o(h³). The other is constructed without the shoulder condition assumption and improves the bias convergence rate from o(h) to o(h²). The asymptotic properties of the proposed estimators are derived and formulas for the bin width are also given. The finite-sample properties, based on a real data set and an extensive simulation study, demonstrate the potential practical use of the proposed estimators.
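The classical histogram estimator that the paper modifies is simply the proportion of perpendicular detection distances falling in the first bin [0, h), divided by the bin width. A minimal sketch (the paper's bias-reduced modifications are not shown):

```python
def f0_histogram(distances, h):
    """Classical histogram estimator of f(0) from perpendicular distances:
    fraction of observations in the first bin [0, h), divided by h.
    In line transect sampling, density is then commonly estimated as
    D_hat = n * f_hat(0) / (2 * L) for total transect length L."""
    n = len(distances)
    return sum(1 for d in distances if d < h) / (n * h)

# Toy perpendicular distances (same distance units as h):
dists = [0.1, 0.2, 0.3, 0.9]
f0_hat = f0_histogram(dists, h=0.5)   # 3 of 4 observations in [0, 0.5)
```

The bin width h drives the bias rates quoted in the abstract: too large a bin smooths away the shoulder at zero, too small a bin inflates the variance.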

14.
A bioequivalence test compares bioavailability parameters, such as the maximum observed concentration (Cmax) or the area under the concentration‐time curve, for a test drug and a reference drug. Planning a bioequivalence test requires an assumption about the variance of Cmax or the area under the concentration‐time curve for the estimation of sample size. Since the variance is unknown, current 2‐stage designs use the variance estimated from stage 1 data to determine the sample size for stage 2. However, the variance estimate from stage 1 data is unstable and may result in a stage 2 sample size that is too large or too small. This problem is magnified in bioequivalence tests with a serial sampling schedule, in which only one sample is collected from each individual, making a correct variance assumption even more difficult. To solve this problem, we propose 3‐stage designs. Our designs increase sample sizes gradually over the stages, so that extremely large sample sizes do not occur. With one more stage of data, the power is increased. Moreover, the variance estimated using data from both stages 1 and 2 is more stable than that using data from stage 1 only in a 2‐stage design. These features of the proposed designs are demonstrated by simulations. Testing significance levels are adjusted to control the overall type I errors at the same level for all the multistage designs.

15.
Abstract. Two new unequal probability sampling methods are introduced: conditional and restricted Pareto sampling. The advantage of conditional Pareto sampling compared with standard Pareto sampling, introduced by Rosén (J. Statist. Plann. Inference, 62, 1997, 135, 159), is that the factual inclusion probabilities better agree with the desired ones. Restricted Pareto sampling, preferably conditioned or adjusted, is able to handle cases where there are several restrictions on the sample and is an alternative to the recent cube method for balanced sampling introduced by Deville and Tillé (Biometrika, 91, 2004, 893). The new sampling designs have high entropy and the involved random numbers can be seen as permanent random numbers.
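Standard Pareto sampling, the baseline here, draws a uniform random number U_i per unit, forms the ranking variable Q_i = [U_i/(1−U_i)] / [π_i/(1−π_i)] from the target inclusion probability π_i, and selects the n units with the smallest Q_i. A minimal sketch of the standard (unconditional) scheme only:

```python
import random

def pareto_sample(pi, n, seed=0):
    """Standard Pareto sampling (Rosén): target inclusion probabilities `pi`
    should sum to the sample size n. Draw U_i ~ Uniform(0,1), rank units by
    Q_i = [U_i/(1-U_i)] / [pi_i/(1-pi_i)], keep the n smallest."""
    rng = random.Random(seed)
    q = []
    for i, p in enumerate(pi):
        u = rng.random()
        q.append(((u / (1 - u)) / (p / (1 - p)), i))
    return sorted(i for _, i in sorted(q)[:n])

pi = [0.5, 0.5, 0.25, 0.25, 0.25, 0.25]   # sums to n = 2
sample = pareto_sample(pi, 2)
```

The U_i can be stored and reused across surveys, which is the "permanent random numbers" property the abstract refers to; the factual inclusion probabilities of this scheme only approximate the π_i, which motivates the conditional variant.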

16.
Control charts are the most important statistical process control tool for monitoring variations in a process. A number of articles are available in the literature for the X̄ control chart based on simple random sampling, ranked set sampling, median-ranked set sampling (MRSS), extreme-ranked set sampling, double-ranked set sampling, double median-ranked set sampling and median double-ranked set sampling. In this study, we highlight some limitations of the existing ranked set charting structures. In addition, we propose different runs rules-based control charting structures under a variety of sampling strategies. We evaluate the performance of the control charting structures using power curves as a performance criterion. We observe that the proposed merger of varying runs rules schemes with different sampling strategies significantly improves the detection ability of location control charting structures. More specifically, MRSS performs the best under both single- and double-ranked set strategies with varying runs rules schemes. We also include a real-life example to explain the proposal and highlight its significance for practical data sets.
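A runs rule augments the basic out-of-limit signal by flagging patterns such as "k of the last m points beyond a limit". A generic sketch of one such rule (the specific rules and limits used in the article are not reproduced; `k`, `m`, and `ucl` here are illustrative parameters):

```python
def runs_rule_signal(stats, ucl, k=2, m=3):
    """Generic runs rule: signal if at least k of the last m charting
    statistics exceed the upper control limit `ucl`."""
    window = stats[-m:]
    return sum(s > ucl for s in window) >= k

# Charting statistics (e.g. sample means) checked against ucl = 4.0:
signal = runs_rule_signal([1.0, 5.0, 6.0], ucl=4.0)   # 2 of last 3 exceed
```

Rules like this trade a slightly higher false-alarm rate for much faster detection of small, sustained shifts, which is why they pair well with ranked-set-based sampling strategies.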

17.
In this paper, we suggest three new ratio estimators of the population mean using quartiles of the auxiliary variable when there are missing data among the sample units. The suggested estimators are investigated under simple random sampling. We obtain the mean square error equations for these estimators. The suggested estimators are compared with the sample mean and ratio estimators in the case of missing data. They are also compared with the estimators of Singh and Horn [Compromised imputation in survey sampling, Metrika 51 (2000), pp. 267–276], Singh and Deo [Imputation by power transformation, Statist. Papers 45 (2003), pp. 555–579], and Kadilar and Cingi [Estimators for the population mean in the case of missing data, Commun. Stat.-Theory Methods 37 (2008), pp. 2226–2236], and we present the conditions under which the proposed estimators are more efficient than the other estimators. In terms of accuracy and of the coverage of the bootstrap confidence intervals, the suggested estimators perform better than the other estimators.

18.
This paper presents the problem of prediction of a domain total value based on the general linear model. In many methods presented in the survey sampling literature (e.g. Cassel, Särndal & Wretman, 1977 [Foundations of inference in survey sampling, New York: John Wiley & Sons]; Valliant, Dorfman & Royall, 2000 [Finite population sampling and inference. A prediction approach. New York: John Wiley & Sons]; Rao, 2003 [Small area estimation. New York: John Wiley & Sons]) a common assumption is that for each element of a population the domain to which it belongs is known. This assumption is especially important when a superpopulation model with auxiliary variables is considered. In this paper a method is proposed for prediction of the domain total when it is not known whether a unit belongs to a given domain or not, or when this information is available only for sampled elements of the population.

19.
ABSTRACT

In this article, we consider the estimation of R = P(Y < X), when Y and X are two independent three-parameter Lindley (LI) random variables. On the basis of two independent samples, the modified maximum likelihood estimator, along with its asymptotic behavior, and a conditional likelihood-based estimator are used to estimate R. We also propose a sample-based estimate of R and the associated credible interval based on an importance sampling procedure. A real life data set involving the times to breakdown of an insulating fluid is presented and analyzed for illustrative purposes.

20.
Bayesian inference for pairwise interacting point processes   (Cited by: 1; self-citations: 0; citations by others: 1)
Pairwise interacting point processes are commonly used to model spatial point patterns. To perform inference, the established frequentist methods can produce good point estimates when the interaction in the data is moderate, but some methods may produce severely biased estimates when the interaction is strong. Furthermore, because the sampling distributions of the estimates are unclear, interval estimates are typically obtained by parametric bootstrap methods. In the current setting, however, the behavior of such estimates is not well understood. In this article we propose Bayesian methods for obtaining inferences in pairwise interacting point processes. The requisite application of Markov chain Monte Carlo (MCMC) techniques is complicated by an intractable function of the parameters in the likelihood. The acceptance probability in a Metropolis-Hastings algorithm involves the ratio of two likelihoods evaluated at differing parameter values. The intractable functions do not cancel, and hence an intractable ratio r must be estimated within each iteration of a Metropolis-Hastings sampler. We propose the use of importance sampling techniques within MCMC to address this problem. While r may be estimated by other methods, these, in general, are not readily applied in a Bayesian setting. We demonstrate the validity of our importance sampling approach with a small simulation study. Finally, we analyze the Swedish pine sapling dataset (Strand 1972) and contrast the results with those in the literature.
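The core difficulty described above, estimating a ratio of intractable normalizing constants by importance sampling, can be illustrated on a toy family where the answer is known in closed form. For the unnormalized density q_θ(x) = exp(−θx²), one has Z(θ) = √(π/θ), and Z(θ′)/Z(θ) = E_{x∼p_θ}[q_{θ′}(x)/q_θ(x)]. A sketch of that importance-sampling ratio estimate (a stand-in family, not the point-process likelihood of the article):

```python
import math
import random

def z_ratio_is(theta_new, theta_old, n=200_000, seed=0):
    """Importance-sampling estimate of Z(theta_new)/Z(theta_old) for the
    unnormalised density q_theta(x) = exp(-theta * x^2), using draws from
    the normalised p_old, which is N(0, 1/(2*theta_old))."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 / (2.0 * theta_old))
    acc = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, sd)
        acc += math.exp(-(theta_new - theta_old) * x * x)   # weight q_new/q_old
    return acc / n

est = z_ratio_is(2.0, 1.0)
exact = math.sqrt(1.0 / 2.0)   # Z(theta) = sqrt(pi/theta) gives sqrt(old/new)
```

Inside a Metropolis-Hastings sweep, an estimate of this form would replace the intractable ratio r in the acceptance probability at each iteration.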
