Similar literature
Found 20 similar documents (search time: 562 ms)
1.
Random sampling from databases: a survey   Total citations: 2, self-citations: 0, citations by others: 2
This paper reviews recent literature on techniques for obtaining random samples from databases. We begin with a discussion of why one would want to include sampling facilities in database management systems. We then review basic sampling techniques used in constructing DBMS sampling algorithms, e.g. acceptance/rejection and reservoir sampling. A discussion of sampling from various data structures follows: B+ trees, hash files, and spatial data structures (including R-trees and quadtrees). Algorithms for sampling from simple relational queries, e.g. single relational operators such as selection, intersection, union, set difference, projection, and join, are then described. We then describe sampling for estimation of aggregates (e.g. the size of query results). Here we discuss both clustered sampling and sequential sampling approaches. Decision-theoretic approaches to sampling for query optimization are reviewed.
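Reservoir sampling, one of the basic techniques named in the abstract above, can be illustrated with a minimal Python sketch. This is the textbook single-pass scheme (often called Algorithm R), not necessarily the variant the survey discusses; the function name `reservoir_sample` and its parameters are hypothetical.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Draw a simple random sample of up to k items from an iterable
    of unknown length in one pass (textbook Algorithm R; illustrative)."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    print(reservoir_sample(range(1000), 10, random.Random(0)))
```

Because each record is inspected once and only k items are held in memory, the scheme suits scans over tables whose size is not known in advance.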

2.
Under stratified random sampling, we develop a kth-order bootstrap bias-corrected estimator of the number of classes θ which exist in a study region. This research extends Smith and van Belle’s (1984) first-order bootstrap bias-corrected estimator under simple random sampling. Our estimator has applicability for many settings including: estimating the number of animals when there are stratified capture periods, estimating the number of species based on stratified random sampling of subunits (say, quadrats) from the region, and estimating the number of errors/defects in a product based on observations from two or more types of inspectors. When the differences between the strata are large, utilizing stratified random sampling and our estimator often results in superior performance versus the use of simple random sampling and its bootstrap or jackknife [Burnham and Overton (1978)] estimator. The superior performance is often associated with more observed classes, and we provide insights into optimal designation of the strata and optimal allocation of sample sectors to strata.

3.
Perfect simulation of positive Gaussian distributions   Total citations: 1, self-citations: 0, citations by others: 1
We provide an exact simulation algorithm that produces variables from truncated Gaussian distributions on $(\mathbb{R}_{+})^{p}$ via a perfect sampling scheme, based on stochastic ordering and slice sampling, since accept-reject algorithms like those of Geweke (1991) and Robert (1995) are difficult to extend to higher dimensions.

4.
Abstract. The strong Rayleigh property is a new and robust negative dependence property that implies negative association; in fact it implies conditional negative association closed under external fields (CNA+). Suppose that and are two families of 0‐1 random variables that satisfy the strong Rayleigh property and let . We show that {Zi} conditioned on is also strongly Rayleigh; this turns out to be an easy consequence of the results on preservation of stability of polynomials of Borcea & Brändén (Invent. Math., 177, 2009, 521–569). This entails that a number of important πps sampling algorithms, including Sampford sampling and Pareto sampling, are CNA+. As a consequence, statistics based on such samples automatically satisfy a version of the Central Limit Theorem for triangular arrays.

5.
We present a surprising though obvious result that seems to have been unnoticed until now. In particular, we demonstrate the equivalence of two well-known problems: the optimal allocation of the fixed overall sample size n among L strata under stratified random sampling, and the optimal allocation of the H = 435 seats among the 50 states for apportionment of the U.S. House of Representatives following each decennial census. In spite of the strong similarity manifest in the statements of the two problems, they have not been linked and they have well-known but different solutions; one solution is not explicitly exact (Neyman allocation), and the other (equal proportions) is exact. We give explicit exact solutions for both and note that the solutions are equivalent. In fact, we conclude by showing that both problems are special cases of a general problem. The result is significant for stratified random sampling in that it explicitly shows how to minimize sampling error when estimating a total $T_Y$ while keeping the final overall sample size fixed at n; this is usually not the case in practice with Neyman allocation, where the resulting final overall sample size might be near n + L after rounding. An example reveals that controlled rounding with Neyman allocation does not always lead to the optimum allocation, that is, an allocation that minimizes variance.
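The contrast drawn in the abstract above between real-valued Neyman allocation and an exact integer allocation can be sketched in Python. The greedy priority-value scheme below mirrors the equal-proportions idea (priority $N_h S_h / \sqrt{n_h(n_h+1)}$); it is an illustration under stated assumptions, not necessarily the authors' algorithm. The function names are hypothetical, every stratum is assumed to receive at least one unit, and the constraint $n_h \le N_h$ is ignored.

```python
import heapq
import math


def neyman_allocation(n, sizes, sds):
    """Real-valued Neyman allocation: n_h proportional to N_h * S_h."""
    weights = [N * S for N, S in zip(sizes, sds)]
    total = sum(weights)
    return [n * w / total for w in weights]


def exact_allocation(n, sizes, sds):
    """Integer allocation summing exactly to n (illustrative sketch).

    Start with one unit per stratum, then repeatedly give the next unit
    to the stratum with the largest priority N_h * S_h / sqrt(m * (m + 1)),
    where m is that stratum's current allocation.
    """
    L = len(sizes)
    if n < L:
        raise ValueError("need at least one unit per stratum")
    alloc = [1] * L
    # Max-heap via negated priorities.
    heap = [(-(N * S) / math.sqrt(2), h) for h, (N, S) in enumerate(zip(sizes, sds))]
    heapq.heapify(heap)
    for _ in range(n - L):
        _, h = heapq.heappop(heap)
        alloc[h] += 1
        m = alloc[h]
        heapq.heappush(heap, (-(sizes[h] * sds[h]) / math.sqrt(m * (m + 1)), h))
    return alloc


if __name__ == "__main__":
    sizes, sds = [100, 200, 300], [1, 2, 3]  # made-up strata for illustration
    print(neyman_allocation(14, sizes, sds))
    print(exact_allocation(14, sizes, sds))
```

Each added unit in stratum h lowers the variance term $N_h^2 S_h^2 / n_h$ by $N_h^2 S_h^2 / (n_h(n_h+1))$, so always serving the largest priority keeps the total at exactly n without a rounding step.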

6.
Motivated by a real-life problem, we develop a Two-Stage Cluster Sampling with Ranked Set Sampling (TSCRSS) design in the second stage, for which we derive an unbiased estimator of the population mean and its variance. An unbiased estimator of the variance of the mean estimator is also derived. It is proved that the TSCRSS is more efficient (in the sense of having smaller variance) than the conventional two-stage cluster simple random sampling in which the second-stage sampling is with replacement. Using a simulation study on a real-life population, we show that the TSCRSS is more efficient than the conventional two-stage cluster sampling when simple random sampling without replacement is used in both stages.

7.
Gupta and Shabbir (2008) have suggested an alternative form of ratio-type estimators for estimating the population mean. In this paper, we obtain a corrected version of the mean square error (MSE) of the Gupta–Shabbir estimator, up to the first order of approximation, and discuss the optimum case. We extend this estimator to stratified random sampling and propose general classes of combined and separate estimators. An empirical study is also carried out to show the properties of the proposed estimators.

8.
In this paper we consider the problem of unbiased estimation of the distribution function of an exponential population using order statistics based on a random sample. We present a (unique) unbiased estimator based on a single, say ith, order statistic and study some properties of the estimator for i = 2. We also indicate how this estimator can be utilized to obtain unbiased estimators when a few selected order statistics are available as well as when the sample is selected following an alternative sampling procedure known as ranked set sampling. It is further proved that for a ranked set sample of size two, the proposed estimator is uniformly better than the conventional nonparametric unbiased estimator; further, for a general sample size, a modified ranked set sampling procedure provides an unbiased estimator uniformly better than the conventional nonparametric unbiased estimator based on the usual ranked set sampling procedure.
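The ranked set sampling procedure mentioned in the abstract above can be sketched as follows. The sketch assumes perfect ranking by the actual measured values and a balanced design with one cycle; the function name `ranked_set_sample`, the `draw` callback, and the unit-rate exponential population are hypothetical choices for illustration.

```python
import random


def ranked_set_sample(draw, m, rng):
    """One cycle of balanced ranked set sampling (illustrative).

    For i = 0, ..., m-1: draw a fresh set of m units, rank it, and keep
    only the unit with rank i. The result has m measured units.
    """
    sample = []
    for i in range(m):
        group = sorted(draw(rng) for _ in range(m))  # perfect ranking assumed
        sample.append(group[i])
    return sample


if __name__ == "__main__":
    rng = random.Random(42)
    # Assumed population: exponential with rate 1.0.
    print(ranked_set_sample(lambda r: r.expovariate(1.0), 3, rng))
```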

9.
Abstract. Two new unequal probability sampling methods are introduced: conditional and restricted Pareto sampling. The advantage of conditional Pareto sampling compared with standard Pareto sampling, introduced by Rosén (J. Statist. Plann. Inference, 62, 1997, 135–159), is that the factual inclusion probabilities better agree with the desired ones. Restricted Pareto sampling, preferably conditioned or adjusted, is able to handle cases where there are several restrictions on the sample and is an alternative to the recent cube method for balanced sampling introduced by Deville and Tillé (Biometrika, 91, 2004, 893). The new sampling designs have high entropy and the involved random numbers can be seen as permanent random numbers.
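For orientation, standard Pareto sampling (the baseline that the abstract above refines) can be sketched in a few lines of Python. Each unit receives a uniform random number $U_i$ and a ranking variable $Q_i = \frac{U_i/(1-U_i)}{p_i/(1-p_i)}$, and the n units with the smallest $Q_i$ are selected. The conditional and restricted variants are not shown; the function name `pareto_sample` is hypothetical, and the target probabilities are assumed to lie strictly between 0 and 1 and to sum to n.

```python
import random


def pareto_sample(target_probs, n, rng=None):
    """Standard Pareto sampling of fixed size n (illustrative sketch).

    target_probs: desired inclusion probabilities, each in (0, 1),
    assumed to sum to n. Returns the sorted indices of the selected units.
    """
    rng = rng or random.Random()
    ranked = []
    for i, p in enumerate(target_probs):
        if not 0.0 < p < 1.0:
            raise ValueError("target probabilities must lie strictly in (0, 1)")
        u = rng.random()  # could be stored and reused as a permanent random number
        q = (u / (1.0 - u)) / (p / (1.0 - p))
        ranked.append((q, i))
    ranked.sort()
    return sorted(i for _, i in ranked[:n])


if __name__ == "__main__":
    print(pareto_sample([0.2, 0.4, 0.6, 0.8], 2, random.Random(7)))
```

The realized inclusion probabilities only approximate the targets, which is the gap the conditional variant in the abstract is designed to narrow.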

10.
This article addresses the problem of estimating the population mean in stratified random sampling using the information of an auxiliary variable. A class of estimators for the population mean is defined, with its properties under large sample approximation. In particular, various classes of estimators are identified as particular members of the suggested class. It has been shown that the proposed class of estimators is better than the usual unbiased estimator, the usual combined ratio estimator, the usual product estimator, the usual regression estimator, and the Koyuncu and Kadilar (2009) class of estimators. The results have been illustrated through an empirical study.

11.
In this paper we examine the failure-censored sampling plans for the two-parameter exponential distribution based on m random samples, each of size n. The suggested procedure is based on exact results and only the first failure time of each sample is needed. The values of the acceptability constant are also tabulated for selected values of p α 1 p β 1, α and β. Further, a comparison of the proposed sampling plans with ordinary sampling plans using a sample of size mn is made. When compared to ordinary sampling plans, the proposed plan has an advantage in terms of shorter test-time and a saving of resources.

12.
This article addresses the problem of estimating the finite population mean in stratified random sampling using auxiliary information. Motivated by Singh (1967) and Bahl and Tuteja (1991), a ratio-cum-product type exponential estimator is suggested, and its bias and mean squared error are derived under large sample approximation. The suggested estimator is compared with the usual unbiased estimator of the population mean in stratified random sampling, the combined ratio estimator, the combined product estimator, and the ratio and product type exponential estimators of Singh et al. (2008). Conditions under which the suggested estimator is more efficient than the other estimators considered are obtained. A numerical illustration is given in support of the theoretical findings.

13.
In the standard linear regression model with independent, homoscedastic errors, the Gauss-Markov theorem asserts that $\hat{\beta} = (X'X)^{-1}X'y$ is the best linear unbiased estimator of β and, furthermore, that $c'\hat{\beta}$ is the best linear unbiased estimator of c'β for all p × 1 vectors c. In the corresponding random regressor model, X is a random sample of size n from a p-variate distribution. If attention is restricted to linear estimators of c'β that are conditionally unbiased, given X, the Gauss-Markov theorem applies. If, however, the estimator is required only to be unconditionally unbiased, the Gauss-Markov theorem may or may not hold, depending on what is known about the distribution of X. The results generalize to the case in which X is a random sample without replacement from a finite population.

14.
15.
In this paper, some deficiencies in the traditional selection procedure of the circular version of systematic sampling schemes are investigated and alternative methods are proposed. We also suggest some rules of thumb concerning the coincidence of units in the sample. The end corrections proposed by Bellhouse and Rao (1975) and Sampath and Varalakshmi (2008) for circular systematic sampling and diagonal circular systematic sampling, respectively, are also modified.

16.
The article suggests a class of estimators of the population mean in stratified random sampling using auxiliary information, along with its properties. In addition, various known estimators and classes of estimators are identified as members of the suggested class. It has been shown that the suggested class of estimators under the optimum condition performs better than the usual unbiased, usual combined ratio, usual combined regression, Kadilar and Cingi (2005), and Singh and Vishwakarma (2006) estimators, and the members belonging to the classes of estimators envisaged by Kadilar and Cingi (2003), Singh, Tailor et al. (2008), Singh et al. (2009), Singh and Vishwakarma (2010) and Koyuncu and Kadilar (2010).

17.
The present article deals with some methods for estimation of finite population means in the presence of a linear trend among the population values. As a result, we provide a strategy for the selection of the sampling interval k for the case of circular systematic sampling, which ensures a better estimator of the population mean compared to other choices of the sampling interval. This has been established based on empirical studies. Furthermore, we applied multiple-random-start methods for selecting random samples for the case of linear systematic sampling and diagonal systematic sampling schemes. We also derived explicit expressions for the variances and their estimates. The relative performances of simple random sampling, linear systematic sampling and diagonal systematic sampling schemes with single and multiple random starts are also assessed based on numerical examples.
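The role of the sampling interval k in circular systematic sampling can be illustrated with a short Python sketch of the conventional procedure: pick a random start among the N units, then take every kth unit, wrapping around the end of the list. The article's own rule for choosing k is not reproduced here; the function name and the default k (the integer nearest to N/n) are assumptions.

```python
import random


def circular_systematic_sample(N, n, k=None, rng=None):
    """Conventional circular systematic sampling (illustrative sketch).

    Units are labelled 0..N-1. Returns n labels starting from a random
    position and stepping by k modulo N.
    """
    rng = rng or random.Random()
    if k is None:
        k = max(1, round(N / n))  # assumed default: integer nearest to N / n
    start = rng.randrange(N)
    return [(start + j * k) % N for j in range(n)]


if __name__ == "__main__":
    print(circular_systematic_sample(10, 3, k=3, rng=random.Random(1)))
    # With k = 5 and N = 10 the walk returns to its start after two steps,
    # so a sample of size 4 contains repeated units.
    print(circular_systematic_sample(10, 4, k=5, rng=random.Random(1)))
```

The selected units are all distinct exactly when n does not exceed N / gcd(N, k), which is why the choice of k matters for avoiding repeated units.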

18.
This paper suggests an efficient class of ratio and product estimators for estimating the population mean in stratified random sampling using auxiliary information. Among many others, the estimators of Koyuncu and Kadilar (2009), Kadilar and Cingi (2003, 2005), and Singh and Vishwakarma (2007) are identified as members of the proposed class of estimators. The expressions for the bias and mean square error (MSE) of the proposed estimators are derived under large sample approximation in general form. The asymptotically optimum estimator (AOE) in the class is identified along with its MSE formula. It has been shown that the proposed class of estimators is more efficient than the combined regression estimator and the Koyuncu and Kadilar (2009) estimator. Moreover, the theoretical findings are supported through a numerical example.

19.
20.
This paper addresses the problem of unbiased estimation of P[X > Y] = θ for two independent exponentially distributed random variables X and Y. We present a (unique) unbiased estimator of θ based on a single pair of order statistics obtained from two independent random samples from the two populations. We also indicate how this estimator can be utilized to obtain unbiased estimators of θ when only a few selected order statistics are available from the two random samples as well as when the samples are selected by an alternative procedure known as ranked set sampling. It is proved that for ranked set samples of size two, the proposed estimator is uniformly better than the conventional non-parametric unbiased estimator and, further, that a modified ranked set sampling procedure provides an unbiased estimator even better than the proposed estimator.
