Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
2.
Singh and Sukhatme [4] have considered the problem of optimum stratification on an auxiliary variable x when the units from the different strata are selected with probability proportional to the value of the auxiliary variable and the sample sizes for the different strata are determined using the Neyman allocation method. The present paper considers the same problem for the proportional and equal allocation methods. Rules for finding approximately optimum strata boundaries for these two allocation methods are given. The relative efficiency of these allocation methods with respect to the Neyman allocation is also investigated. The performance of equal allocation is found to be better than that of proportional allocation and practically equivalent to that of the Neyman allocation.
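
The three allocation rules compared in this abstract can be sketched as follows. This is a minimal illustration using the textbook formulas for simple random sampling within strata, not the probability-proportional-to-size setting the paper studies. The stratum sizes and standard deviations are hypothetical values chosen for the example, and the returned sample sizes are left unrounded.

```python
def proportional_allocation(n, sizes):
    """Allocate n_h in proportion to the stratum size N_h."""
    total = sum(sizes)
    return [n * size / total for size in sizes]


def equal_allocation(n, sizes):
    """Allocate the same sample size to every stratum."""
    return [n / len(sizes) for _ in sizes]


def neyman_allocation(n, sizes, sds):
    """Allocate n_h in proportion to N_h * S_h (size times standard deviation)."""
    weights = [size * sd for size, sd in zip(sizes, sds)]
    total = sum(weights)
    return [n * w / total for w in weights]


# Hypothetical strata: sizes N_h and within-stratum standard deviations S_h.
sizes = [500, 300, 200]
sds = [2.0, 4.0, 10.0]
for allocation in (proportional_allocation(100, sizes),
                   equal_allocation(100, sizes),
                   neyman_allocation(100, sizes, sds)):
    print([round(n_h, 1) for n_h in allocation])
```

In this toy example the Neyman rule shifts sample toward the small but highly variable third stratum, which proportional allocation cannot do.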

3.
In this article, a new algorithm for rather expensive simulation problems is presented, which consists of two phases. In the first phase, as a model-based algorithm, the simulation output is used directly in the optimization stage. In the second phase, the simulation model is replaced by a valid metamodel. In addition, a new optimization algorithm is presented. To evaluate the performance of the proposed algorithm, it is applied to the (s,S) inventory problem as well as to five test functions. Numerical results show that the proposed algorithm leads to better solutions with less computational time than the corresponding metamodel-based algorithm.

4.
When the information on a highly positively correlated auxiliary variable x is used to construct stratified regression (or ratio) estimates of the population mean of the study variable y, the paper considers the problem of determining approximately optimum strata boundaries (AOSB) on x when the sample size in each stratum is equal. The form of the conditional variance function V(y/x) is assumed to be known. A numerical investigation into the relative efficiency of equal allocation with respect to the Neyman and proportional allocations has also been made. The relative efficiency of equal allocation with respect to Neyman allocation is found to be nearly equal to one.

5.
In many real-life situations, a linear cost function does not adequately approximate the actual cost incurred. The cost of traveling between the units selected in the sample within a stratum can be significant, and a linear cost function does not capture it. In this paper, the problem of finding a compromise allocation for a multivariate stratified sample survey with a significant travel cost within strata is formulated as a problem of non-linear stochastic programming with multiple objective functions. The compromise solutions are obtained through the Chebyshev approximation technique, the D1-distance, and goal programming. A numerical example is presented to illustrate the computational details of the proposed methods.

6.
In this work it is shown how the k-means method for clustering objects can be applied in the context of statistical shape analysis. Because the choice of the suitable distance measure is a key issue for shape analysis, the Hartigan and Wong k-means algorithm is adapted for this situation. Simulations on controlled artificial data sets demonstrate that distances on the pre-shape spaces are more appropriate than the Euclidean distance on the tangent space. Finally, results are presented of an application to a real problem of oceanography, which in fact motivated the current work.

7.
The paper considers the problem of finding optimum strata boundaries when sample sizes to different strata are allocated in proportion to the strata totals of the auxiliary variable. This variable is also treated as the stratification variable. Minimal equations, solutions to which provide the optimum boundaries, have been obtained. Because of the implicit nature of these equations, their exact solutions cannot be obtained. Therefore, methods of obtaining their approximate solutions have been presented. A limiting expression for the variance of the estimate of the population mean, as the number of strata tends to become large, has also been obtained.

8.
Dealing with incomplete data is a pervasive problem in statistical surveys. Bayesian networks have recently been used in missing data imputation. In this research, we propose a new methodology for the multivariate imputation of missing data using discrete Bayesian networks and conditional Gaussian Bayesian networks. Results from imputing missing values in a coronary artery disease data set and a milk composition data set, as well as a simulation study from the cancer-neapolitan network, are presented to demonstrate and compare the performance of three Bayesian network-based imputation methods with those of multivariate imputation by chained equations (MICE) and the classical hot-deck imputation method. To assess the effect of the structure learning algorithm on the performance of the Bayesian network-based methods, two methods, the Peter-Clark algorithm and greedy search-and-score, have been applied. The Bayesian network-based methods are: first, the method introduced by Di Zio et al. [Bayesian networks for imputation, J. R. Stat. Soc. Ser. A 167 (2004), 309–322], in which each missing item of a variable is imputed using the information given in the parents of that variable; second, the method of Di Zio et al. [Multivariate techniques for imputation based on Bayesian networks, Neural Netw. World 15 (2005), 303–310], which uses the information in the Markov blanket set of the variable to be imputed; and finally, our new proposed method, which applies the whole available knowledge of all variables of interest, consisting of the Markov blanket and hence the parent set, to impute a missing item. Results indicate the high quality of our new proposed method, especially in the presence of high missingness percentages and more connected networks. The new method has also been shown to be more efficient than the MICE method for small sample sizes with high missing rates.

9.
In this article, we consider the problem of variable selection in linear regression when multicollinearity is present in the data. It is well known that in the presence of multicollinearity, the performance of the least squares (LS) estimator of the regression parameters is not satisfactory. Consequently, subset selection methods based on LS estimates, such as Mallows' Cp, lead to the selection of inadequate subsets. To overcome the problem of multicollinearity in subset selection, a new subset selection algorithm based on the ridge estimator is proposed. It is shown that the new algorithm is a better alternative to Mallows' Cp when the data exhibit multicollinearity.
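
For orientation, the ridge estimator that this abstract builds on can be sketched for the two-predictor case. This is only the standard estimator b = (X'X + kI)^(-1) X'y, solved by hand for a 2x2 system; it is not the paper's subset selection algorithm. The function name, the ridge constant k, and the data are assumptions made for the illustration, and the predictors are assumed to be centered with no intercept.

```python
def ridge_two_predictors(x1, x2, y, k=0.0):
    """Solve (X'X + k*I) b = X'y for two centered predictors and no intercept.

    With k = 0 this is the ordinary least squares estimate.
    """
    s11 = sum(a * a for a in x1) + k
    s22 = sum(b * b for b in x2) + k
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12  # near zero when x1 and x2 are almost collinear and k = 0
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)


# Hypothetical, nearly collinear predictors.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.0, 1.1, 1.9]
y = [-4.0, -2.1, 0.1, 1.9, 4.1]
print(ridge_two_predictors(x1, x2, y, k=0.0))  # least squares
print(ridge_two_predictors(x1, x2, y, k=1.0))  # ridge with k = 1
```

Adding k to the diagonal keeps the determinant away from zero, which is what stabilizes the estimate when the predictors are nearly collinear.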

10.
Standard methods of optimal stratification solve the optimization problem as a function of strata boundaries and sample allocation only. In this paper we show that, by means of a flexible two-stage grid search procedure, the strata boundaries, the sample allocation, and also the number of strata can be taken into account effectively when optimizing stratification and allocation. By means of a Monte Carlo simulation we show that the proposed procedure is efficient compared to the well-known standard procedures.

11.
Effectively solving the label switching problem is critical for both Bayesian and frequentist mixture model analyses. In this article, a new relabeling method is proposed by extending a recently developed modal clustering algorithm. First, the posterior distribution is estimated by a kernel density from permuted MCMC or bootstrap samples of parameters. Second, a modal EM algorithm is used to find the m! symmetric modes of the KDE. Finally, samples that ascend to the same mode are assigned the same label. Simulations and real data applications demonstrate that the new method provides more accurate estimates than many existing relabeling methods.

12.
Bryant, Hartley & Jessen (1960) presented a two‐way stratification sampling design when the sample size n is less than the number of strata. Their design was extended to a three‐way stratification case by Chaudhary & Kumar (1988), but this design does not take into account serial correlation, which might be present as a result of the presence of a time variable. In this paper, a new sampling procedure is presented for three‐way stratification when one of the stratifying variables is time. The purpose of such a design is to take into account serial correlation. The variance of the unweighted estimator of the population mean with respect to a super population model is used as the basis for comparison. Simulation results show that the suggested design is more efficient than the Chaudhary & Kumar (1988) design.

13.
This paper describes two new, mathematical programming-based approaches for evaluating general, one- and two-sided p-variate normal probabilities where the variance-covariance matrix (of arbitrary structure) is singular with rank r (r < p), and r and p can be of unlimited dimensions. In both cases, principal components are used to transform the original, ill-defined p-dimensional integral into a well-defined r-dimensional integral over a convex polyhedron. The first algorithm that is presented uses linear programming coupled with a Gauss-Legendre quadrature scheme to compute this integral, while the second algorithm uses multi-parametric programming techniques in order to significantly reduce the number of optimization problems that need to be solved. The application of the algorithms is demonstrated and aspects of computational performance are discussed through a number of examples, ranging from a practical problem that arises in chemical engineering to larger, numerical examples.

14.
Three new test statistics are introduced for correlated categorical data in stratified R×C tables. They are similar in form to the standard generalized Cochran-Mantel-Haenszel statistics but modified to handle correlated outcomes. Two of these statistics are asymptotically valid in both many-strata (sparse data) and large-strata limiting models. The third one is designed specifically for the many-strata case but is valid even with a small number of strata. This latter statistic is also appropriate when strata are assumed to be random.

15.
Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components.

16.
This paper proposes robust regression to solve the problem of outliers in seemingly unrelated regression (SUR) models. The authors present an adaptation of S‐estimators to SUR models. S‐estimators are robust, have a high breakdown point and are much more efficient than other robust regression estimators commonly used in practice. Furthermore, modifications to Ruppert's algorithm allow a fast evaluation of them in this context. The classical example of U.S. corporations is revisited, and it appears that the procedure gives an interesting insight into the problem.

17.
In a multivariate stratified sample survey with L strata, suppose p characteristics are defined on each unit of the population. To estimate the p unknown population means, one for each characteristic, a random sample is drawn from the population. In a multivariate stratified sample survey, the allocation that is optimum for one characteristic may not be optimum for the others. The problem is thus to find an allocation that is optimum for all characteristics in some sense, and a compromise criterion is needed to work out such an allocation. In this paper, the estimation of the p population means is discussed in the presence of nonresponse when the use of a linear cost function is not advisable. A solution procedure using lexicographic goal programming is suggested. Numerical illustrations are given to show its practical utility.

18.
In this paper an asymptotic test for the separability of the spatial AR(p1,1) model is presented by translating the spatial problem to a multiple time series problem. It is shown that the transformed problem reduces to testing whether or not the coefficient matrices of a certain VAR(p1) are diagonal. Some simulation study results are also presented to demonstrate the use of this test.

19.
This article addresses the problem of repeats detection used in the comparison of significant repeats in sequences. The case of self-overlapping leftmost repeats for large sequences generated by a homogeneous stationary Markov chain has not been treated in the literature. In this work, we are interested in approximating the distribution of the number of sufficiently long self-overlapping leftmost repeats in a homogeneous stationary Markov chain. Using the Chen–Stein method, we show that this distribution is approximated by the Poisson distribution. Moreover, we show that this approximation can be extended to the case where the sequences are generated by an m-order Markov chain.

20.
We propose a new stochastic approximation (SA) algorithm for maximum-likelihood estimation (MLE) in the incomplete-data setting. This algorithm is most useful for problems where the EM algorithm is not feasible because of an intractable E-step or M-step. Compared to other algorithms that have been proposed for intractable EM problems, such as the MCEM algorithm of Wei and Tanner (1990), our proposed algorithm appears more generally applicable and efficient. The approach we adopt is inspired by the Robbins-Monro (1951) stochastic approximation procedure, and we show that the proposed algorithm can be used to solve some of the long-standing problems in computing an MLE with incomplete data. We prove that in general O(n) simulation steps are required in computing the MLE with the SA algorithm and O(n log n) simulation steps are required in computing the MLE using the MCEM and/or the MCNR algorithm, where n is the sample size of the observations. Examples include computing the MLE in the nonlinear error-in-variable model and the nonlinear regression model with random effects.
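
The Robbins-Monro procedure that inspires this approach can be sketched in its generic form. The recursion below looks for a root of an unknown function observed only with noise, using theta_{k+1} = theta_k - a_k * H(theta_k, x_k). It is not the paper's SA algorithm for the MLE; the function names, the 1/k gain sequence, and the toy data are assumptions made for the illustration.

```python
def robbins_monro(noisy_h, observations, theta0, gain=lambda k: 1.0 / k):
    """Run the recursion theta <- theta - a_k * H(theta, x_k) over the observations.

    noisy_h(theta, x) is a noisy evaluation of the function whose root is sought.
    gain(k) is the step size a_k; 1/k satisfies the usual step-size conditions.
    """
    theta = theta0
    for k, x in enumerate(observations, start=1):
        theta = theta - gain(k) * noisy_h(theta, x)
    return theta


# Toy problem: the root of E[theta - X] is the mean of X. With gain 1/k the
# recursion reproduces the running sample mean, whatever the starting value.
data = [1.0, 2.0, 3.0, 6.0]
estimate = robbins_monro(lambda theta, x: theta - x, data, theta0=10.0)
print(estimate)
```

Each step uses a single noisy observation, which is why procedures of this kind are attractive when an exact expectation step is intractable.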
