Similar Literature
Found 20 similar articles (search time: 500 ms)
1.
In this study, a new method for estimating the shrinkage and biasing parameters of the Liu-type estimator is proposed. Because k is kept constant and d is optimized in Liu's method, the resulting (k, d) pair is not guaranteed to be optimal in terms of the mean square error of the parameters. The optimum (k, d) pair that minimizes the mean square error, which is a function of both k and d, should be estimated through a simultaneous optimization process rather than a two-stage one. In this study, by utilizing a different objective function, the parameters k and d are optimized simultaneously with the particle swarm optimization technique.
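The abstract does not give the closed-form MSE objective, so the sketch below shows only the simultaneous-search idea with a basic particle swarm over a box; `pso_minimize` and the quadratic test surface are my own stand-ins, not the paper's method.

```python
import numpy as np

def pso_minimize(objective, bounds, n_particles=30, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(x) over a box via a basic particle swarm."""
    rng = np.random.default_rng(seed)
    lo = np.array([b[0] for b in bounds], dtype=float)
    hi = np.array([b[1] for b in bounds], dtype=float)
    dim = len(bounds)
    x = rng.uniform(lo, hi, size=(n_particles, dim))   # particle positions
    v = np.zeros_like(x)                               # particle velocities
    pbest = x.copy()                                   # per-particle best positions
    pbest_val = np.array([objective(p) for p in x])
    g = pbest[np.argmin(pbest_val)].copy()             # global best position
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, dim))
        # inertia + pull toward personal best + pull toward global best
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        vals = np.array([objective(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        g = pbest[np.argmin(pbest_val)].copy()
    return g, pbest_val.min()
```

In the paper's setting, `objective` would be the estimated MSE of the Liu-type estimator as a joint function of (k, d), searched in one pass instead of fixing k and optimizing d.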

2.
Reduced k-means clustering is a method for clustering objects in a low-dimensional subspace. Its advantage is that the clustering of objects and a low-dimensional subspace reflecting the cluster structure are obtained simultaneously. In this paper, the relationship between conventional k-means clustering and reduced k-means clustering is discussed. Conditions ensuring almost sure convergence of the reduced k-means estimator as the sample size increases unboundedly are presented. Results for a more general model covering both conventional and reduced k-means clustering are also provided. Moreover, a consistent selection of the numbers of clusters and dimensions is described.

3.
ABSTRACT

Many mathematical and physical problems reduce to finding a root of a real function f. Such an equation is an inverse problem that is difficult to solve. In engineering sciences especially, the analytical expression of the function f is unknown to the experimenter, but the function can be measured at each point x_k, with expected value M(x_k) and induced error ξ_k. The aim is to approximate the unique root θ under some assumptions on the function f and the errors ξ_k. We use a stochastic approximation algorithm that constructs a sequence (x_k)_{k≥1}. We establish the almost complete convergence of the sequence (x_k)_k to the exact root θ when the errors (ξ_k)_k are quasi-associated, and we illustrate the method with numerical examples to show its efficiency.

4.
In this article, a general method of construction of neighbor block designs is given. The designs are constructed using a variation of a simple method which we refer to as the method of addition (renamed the method of cyclic shifts). We give a complete solution for neighbor-balanced designs with k = 4 for any value of v. We also give many series of generalized neighbor designs (GNDs). In the last section, we construct GNDs in a sequential manner (as did John 1981) for v ≤ 50, where r is a multiple of k.

5.
A method of constructing a resolvable orthogonal array (4λ, k, 2, 2) which can be partitioned into λ orthogonal arrays (4, k, 2, 1) is proposed. The number of constraints k for this type of orthogonal array is at most 3λ. When λ = 2 or a multiple of 4, an orthogonal array with the maximum number of constraints 3λ can be constructed. When λ = 4n + 2 (n ≥ 1), an orthogonal array with 2λ + 2 constraints can be constructed. When λ is odd, orthogonal arrays can be constructed for λ = 3, 5, 7, and 9 with k = 4, 8, 12, and 13 respectively.

6.
In this article, we discuss finding the optimal k for (i) the kth simple moving average, (ii) the kth weighted moving average, and (iii) the kth exponentially weighted moving average, based on a simulated autoregressive AR(p) model. We run a simulation of the three methods above under specific conditions. The main finding is that the optimal k is 4, followed by k = 3. In particular, the fourth WMA ARIMA model, the fourth EWMA ARIMA model, and the third EWMA ARIMA model are, in that order, the best forecasting models among those considered. All six real data sets reveal results similar to those of the simulation study.
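The selection the article describes can be sketched as scoring each candidate k by one-step-ahead mean squared error; the function names are mine, and a simple ramp series stands in for the article's AR(p) simulations.

```python
import numpy as np

def sma_forecast(y, k):
    # one-step-ahead forecast: mean of the previous k observations
    return np.array([y[i - k:i].mean() for i in range(k, len(y))])

def wma_forecast(y, k):
    # linearly weighted: the most recent observation gets weight k
    w = np.arange(1, k + 1, dtype=float)
    w /= w.sum()
    return np.array([(y[i - k:i] * w).sum() for i in range(k, len(y))])

def best_k(y, forecast, ks=range(2, 7)):
    # pick the window length minimizing one-step-ahead MSE
    mse = {k: np.mean((y[k:] - forecast(y, k)) ** 2) for k in ks}
    return min(mse, key=mse.get), mse
```

On a pure linear trend the shortest window wins because the moving average lags the trend by (k + 1)/2 steps; on the article's AR(p) series the trade-off between noise smoothing and lag is what makes intermediate k such as 3 or 4 come out best.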

7.
The k nearest neighbors (k-NN) classifier is one of the most popular methods for statistical pattern recognition and machine learning. In practice, the size k, the number of neighbors used for classification, is usually set arbitrarily to one or some other small number, or chosen by cross-validation. In this study, we propose a novel alternative approach for deciding the size k. Based on a k-NN-based multivariate multi-sample test, we assign each k a permutation-test-based Z-score, and set the number of neighbors to the k with the highest Z-score. This approach is computationally efficient because we have derived formulas for the mean and variance of the test statistic under the permutation distribution for multiple sample groups. Several simulated and real-world data sets are analyzed to investigate the performance of our approach. Its usefulness is demonstrated through the evaluation of prediction accuracies using the Z-score as a criterion to select the size k. We also compare our approach to the widely used cross-validation approaches. The results show that the size k selected by our approach yields high prediction accuracies when informative features are used for classification, whereas the cross-validation approach may fail in some cases.
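The paper derives the permutation moments in closed form; the sketch below replaces them with a Monte-Carlo permutation null (all helper names are mine), scoring each k by how strongly same-class points concentrate among the k nearest neighbors:

```python
import numpy as np

def nn_zscore_k(X, y, ks=range(1, 6), n_perm=200, seed=0):
    """For each k, z-score the same-class k-NN fraction against a
    permutation null; return the k with the highest z-score."""
    rng = np.random.default_rng(seed)
    d = ((X[:, None] - X[None]) ** 2).sum(-1)   # squared distance matrix
    np.fill_diagonal(d, np.inf)                 # exclude self-neighbors
    order = d.argsort(1)                        # neighbors sorted by distance

    def stat(labels, k):
        # fraction of the k nearest neighbors sharing the point's label
        return (labels[order[:, :k]] == labels[:, None]).mean()

    zs = {}
    for k in ks:
        t = stat(y, k)
        null = np.array([stat(rng.permutation(y), k) for _ in range(n_perm)])
        zs[k] = (t - null.mean()) / null.std()
    return max(zs, key=zs.get), zs
```

The Monte-Carlo null is only a stand-in for the paper's analytic mean and variance, which avoid the permutation loop entirely.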

8.
In this paper, we propose a general kth correlation coefficient between the density function and distribution function of a continuous variable as a measure of symmetry and asymmetry. We first propose a root-n moment-based estimator of the kth correlation coefficient and present its asymptotic results. Next, we consider statistical inference for the kth correlation coefficient using the empirical likelihood (EL) method; the EL statistic is shown to be asymptotically a standard chi-squared distribution. Finally, we propose a residual-based estimator of the kth correlation coefficient for a parametric regression model to test whether the density function of the true model error is symmetric. We present the asymptotic results of the residual-based estimator and construct its EL-based confidence intervals. Simulation studies are conducted to examine the performance of the proposed estimators, which we also use to analyze an air quality dataset.

9.
It is assumed that k (k > 2) independent samples of sizes n_i (i = 1, …, k) are available from k lognormal distributions. Four hypothesis cases (H1–H4) are defined. Under H1, all k median parameters as well as all k skewness parameters are equal; under H2, all k skewness parameters are equal but not all k median parameters are equal; under H3, all k median parameters are equal but not all k skewness parameters are equal; under H4, neither the k median parameters nor the k skewness parameters are all equal. The Expectation Maximization (EM) algorithm is used to obtain the maximum likelihood (ML) estimates of the lognormal parameters in each of these four hypothesis cases. A polynomial of degree (2k − 1) is solved at each step of the EM algorithm in the H3 case. A two-stage procedure for testing the equality of the medians, either under skewness homogeneity or under skewness heterogeneity, is also proposed and discussed. A simulation study was performed for the case k = 3.

10.
k-POD: A Method for k-Means Clustering of Missing Data
The k-means algorithm is often used in clustering applications, but it requires a complete data matrix, whereas missing data are common in many applications. Mainstream approaches to clustering missing data reduce the problem to a complete-data formulation through either deletion or imputation, but these solutions may incur significant costs. Our k-POD method presents a simple extension of k-means clustering for missing data that works even when the missingness mechanism is unknown, when external information is unavailable, and when there is significant missingness in the data.
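A minimal sketch of the majorize-minimize idea behind k-POD (not the authors' implementation): impute each missing cell from its row's currently assigned centroid, update the k-means fit on the completed matrix, and repeat. The function name and the deterministic initialization are mine.

```python
import numpy as np

def kpod_sketch(X, k, init, n_iter=30):
    """Alternate Lloyd k-means updates with centroid-based imputation of NaNs."""
    mask = np.isnan(X)
    Xc = np.where(mask, np.nanmean(X, axis=0), X)    # start from column means
    centers = Xc[np.asarray(init)]                   # chosen rows as initial centers
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # assign each (completed) row to its nearest center
        d = ((Xc[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = d.argmin(1)
        # recompute centers on the completed matrix
        for j in range(k):
            members = Xc[labels == j]
            if len(members):
                centers[j] = members.mean(0)
        # re-impute missing cells from the assigned centroids
        Xc = np.where(mask, centers[labels], X)
    return labels, centers, Xc
```

Because each step only fills missing cells and then takes a standard k-means step, no missingness model or external information is needed, which mirrors the abstract's claim.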

[Received November 2014. Revised August 2015.]

11.
The authors consider a finite population ρ = {(Y_k, x_k), k = 1, …, N} conforming to a linear superpopulation model with unknown heteroscedastic errors, whose variances are values of a function of the auxiliary variable X smooth enough for their nonparametric estimation. They describe a Chambers-Dunstan-type method for estimating the distribution of {Y_k, k = 1, …, N} from a sample drawn from ρ without replacement, and determine the asymptotic distribution of its estimation error. They also consider estimation of its mean squared error in particular cases, evaluating both the analytical estimator derived by plugging in the asymptotic variance and a bootstrap approach that is also applicable to estimation of parameters other than the mean squared error. The proposed methods are compared with some common competitors in simulation studies.

12.
One of the most popular methods for partitioning data into k clusters is the k-means clustering algorithm. Since this method relies on basic conditions such as the existence of the mean and a finite variance, it is unsuitable for data whose variances are infinite, such as data with heavy-tailed distributions. The Pitman Measure of Closeness (PMC) is a criterion for how close an estimator is to its parameter relative to another estimator. In this article, using PMC and building on k-means clustering, a new distance and a clustering algorithm are developed for heavy-tailed data.

13.
Four classes of Lehmann-type alternatives are considered: G = F^k (k > 1); G = 1 − (1 − F)^k (k < 1); G = F^k (k < 1); and G = 1 − (1 − F)^k (k > 1), where F and G are two continuous cumulative distribution functions. If an optimal precedence test (one with maximal power) is determined for one of these four classes, the optimal tests for the other classes can be derived. An application is given using the results of Lin and Sukhatme (1992), who derived the best precedence test for testing the null hypothesis that the lifetimes of two types of items on test have the same distribution. Their test has maximum power for fixed k in the class of alternatives G = 1 − (1 − F)^k with k < 1. Best precedence tests for the other three classes of Lehmann-type alternatives are derived using their results. Finally, a comparison of precedence tests with Wilcoxon's two-sample test is presented. Received: February 22, 1999; revised version: June 7, 2000
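The four classes map into one another by exchanging the two samples (F ↔ G) and by negating the observations; a hedged sketch of one such mapping (the primed cdfs are my notation for the negated variables):

```latex
% Negating the observations turns a cdf $H$ into $x \mapsto 1 - H(-x)$.
% Set $F'(x) = 1 - G(-x)$ and $G'(x) = 1 - F(-x)$. Then
G = F^{k} \ (k > 1)
\;\Longleftrightarrow\;
G(-x) = F(-x)^{k}
\;\Longleftrightarrow\;
1 - F'(x) = \bigl(1 - G'(x)\bigr)^{k}
\;\Longleftrightarrow\;
F'(x) = 1 - \bigl(1 - G'(x)\bigr)^{k}, \quad k > 1,
% and swapping the sample labels $F' \leftrightarrow G'$ lands in the class
% $G = 1 - (1 - F)^{k}$ with $k > 1$.
```

This is why an optimal precedence test for one class yields optimal tests for the remaining three.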

14.
The problem of inference in Bayesian normal mixture models is known to be difficult. In particular, direct Bayesian inference (via quadrature) suffers from a combinatorial explosion in having to consider every possible partition of n observations into k mixture components, resulting in a computation time which is O(k^n). This paper explores the use of discretised parameters and shows that for equal-variance mixture models, direct computation time can be reduced to O(D^k nk), where the relevant continuous parameters are each divided into D regions. As a consequence, direct inference is now possible on genuine data sets for small k, where the quality of approximation is determined by the level of discretisation. For large problems, where the computational complexity is still too great in O(D^k nk) time, discretisation can provide a convergence diagnostic for a Markov chain Monte Carlo analysis.

15.
A random effects model for analyzing mixed longitudinal normal and count outcomes, with and without the possibility of non-ignorable missing outcomes, is presented. The count response is inflated at two points (k and l), and the (k, l)-Hurdle power series is used as its distribution. The new distribution contains, as special submodels, several important distributions which are discussed, such as the (k, l)-Hurdle Poisson, (k, l)-Hurdle negative binomial, and (k, l)-Hurdle binomial distributions, among others. Random effects are used to take into account the correlation between longitudinal outcomes and inflation parameters. A full likelihood-based approach is used to obtain maximum likelihood estimates of the model parameters. A simulation study is performed in which the (k, l)-Hurdle Poisson, (k, l)-Hurdle negative binomial, and (k, l)-Hurdle binomial distributions are considered for the count outcome. To illustrate the application of such modelling, longitudinal data on body mass index and the number of damaged joints are analyzed.

16.
ABSTRACT

The weighted k-out-of-n system has been widely used in various engineering areas. The performance of such a system is characterized by the total capacity of its components, so capacity evaluation is of great importance for research on the behavior of the system over time. Capacity evaluation for the binary weighted k-out-of-n system has been reported in the literature. In this paper, to shorten computational time, we first develop a multiplication method for capacity evaluation of the binary weighted k-out-of-n system. We then generalize capacity evaluation to the multi-state weighted k-out-of-n system, developing both a recursive algorithm and a multiplication algorithm for such systems. The two methods are compared in different respects, and an illustrative example of an oil transmission system demonstrates the implementation of the proposed methods.
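The paper's multiplication method is not reproduced here; as a baseline, the standard recursion for binary weighted k-out-of-n:G reliability (the system works when the total weight of working components reaches k) can be sketched as:

```python
from functools import lru_cache

def weighted_k_out_of_n(weights, probs, k):
    """P(total weight of working components >= k) for independent
    binary components with given weights and working probabilities."""
    @lru_cache(maxsize=None)
    def r(i, need):
        # reliability using the first i components, still needing weight `need`
        if need <= 0:
            return 1.0          # requirement already met
        if i == 0:
            return 0.0          # no components left, requirement unmet
        w, p = weights[i - 1], probs[i - 1]
        # condition on component i working (weight w contributed) or not
        return p * r(i - 1, need - w) + (1 - p) * r(i - 1, need)
    return r(len(weights), k)
```

With integer weights this runs in O(n·k) thanks to memoization, which is the kind of cost the paper's multiplication method aims to beat.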

17.
We propose here some strategies for estimating the population mean per subunit on the current occasion, and the change in mean from one occasion to the next, based on two-stage sampling on two successive occasions. Estimators of the mean in two-stage sampling over successive occasions have so far been based on knowledge of the total number of subunits (elements) M_0 in the population, or on assumed equal sizes for the primary units. We therefore give ratio-to-size estimators of the population mean per subunit on the current occasion, and of the change in mean over the two occasions, for the case where M_0 is not known. The results obtained also apply to the situation where M_1 is correlated with the variable of interest y.

18.
The k-means algorithm is one of the most common non-hierarchical methods of clustering. It aims to construct clusters so as to minimize the within-cluster sum of squared distances. However, like most estimators defined through objective functions depending on global sums of squares, the k-means procedure is not robust with respect to atypical observations in the data. Alternative techniques have thus been introduced in the literature, e.g., the k-medoids method. The k-means and k-medoids methodologies are particular cases of the generalized k-means procedure. In this article, focus is on the error rate these clustering procedures achieve when the data are expected to follow a mixture distribution. Two different definitions of the error rate are considered, depending on the data at hand. It is shown that contamination may make one of these two error rates decrease even under optimal models. The consequence of this is emphasized through a comparison of the influence functions and breakdown points of these error rates.
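To make the k-means vs. k-medoids contrast concrete, here is a naive PAM-style alternation (a sketch under my own naming, not the article's generalized k-means): unlike a k-means centroid, a medoid must be an actual data point and so cannot be dragged arbitrarily far by a single outlier.

```python
import numpy as np

def k_medoids(X, init, n_iter=20):
    """Assign points to the nearest medoid, then move each medoid to the
    cluster member minimizing total within-cluster distance."""
    d = np.sqrt(((X[:, None] - X[None]) ** 2).sum(-1))  # pairwise distances
    medoids = np.asarray(init).copy()                   # indices of medoid points
    labels = d[:, medoids].argmin(1)
    for _ in range(n_iter):
        labels = d[:, medoids].argmin(1)
        for j in range(len(medoids)):
            idx = np.where(labels == j)[0]
            if len(idx):
                # best in-cluster representative: smallest sum of distances
                medoids[j] = idx[d[np.ix_(idx, idx)].sum(1).argmin()]
    return medoids, labels
```

On data with one extreme outlier, the medoids stay inside the bulk of each cluster, whereas a k-means centroid for the contaminated cluster would be pulled far toward the outlier, which is the robustness gap the article's influence functions and breakdown points quantify.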

19.
In this article, the influence of a cold standby component on the reliability of weighted k-out-of-n:G systems consisting of two different types of components is studied. Weighted k-out-of-n:G systems are a generalization of k-out-of-n systems that has attracted substantial interest in reliability theory because of its various applications in engineering. A method based on residual lifetimes of mixed components is presented for computing the reliability of weighted k-out-of-n:G systems with two types of components and a cold standby component. Reliability and mean time to failure are computed for differently structured systems. Moreover, the obtained results are used to define optimal system configurations that minimize overall system costs.

20.
Consider k (≥ 2) normal populations with unknown means μ1, …, μk and a common known variance σ². Let μ[1] ≤ ⋯ ≤ μ[k] denote the ordered μi. The populations associated with the t (1 ≤ t ≤ k − 1) largest means are called the t best populations. Hsu and Panchapakesan (2004) proposed and investigated a procedure RHP for selecting a non-empty subset of the k populations, of size at most m (1 ≤ m ≤ k − t), so that at least one of the t best populations is included in the selected subset with a minimum guaranteed probability P* whenever μ[k − t + 1] − μ[k − t] ≥ δ*, where P* and δ* are specified in advance of the experiment. This probability requirement is known as the indifference-zone probability requirement. In the present article, we investigate the same procedure RHP for the same goal as before, but with k − t < m ≤ k − 1, so that at least one of the t best populations is included in the selected subset with a minimum guaranteed probability P* whatever the configuration of the unknown μi. The probability requirement in this latter case is termed the subset selection probability requirement. Santner (1976) proposed and investigated a different procedure (RS) based on samples of size n from each of the populations, considering both cases 1 ≤ m ≤ k − t and k − t < m ≤ k. The special case t = 1 was studied earlier by Gupta and Santner (1973) and Hsu and Panchapakesan (2002) for their respective procedures.


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号