1.
Discrepancies are measures defined as the deviation between the empirical and the theoretical uniform distribution. In this way, a discrepancy is a measure of uniformity that provides a way of constructing a special kind of space-filling design, namely uniform designs. Several discrepancies have been proposed in the recent literature. This paper gives a brief, selective review of these measures, including some construction algorithms, together with a critical discussion and some comparisons.
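As a concrete illustration of how such a measure is computed (a minimal NumPy sketch, not code from the paper), the squared centered L2-discrepancy has a well-known closed form; the function name below is our own:

```python
import numpy as np

def centered_l2_discrepancy(X):
    """Squared centered L2-discrepancy of a design X with rows in [0,1]^s.
    Lower values indicate a more uniform (better space-filling) design."""
    n, s = X.shape
    a = np.abs(X - 0.5)
    term1 = (13.0 / 12.0) ** s
    term2 = (2.0 / n) * np.sum(np.prod(1 + 0.5 * a - 0.5 * a**2, axis=1))
    # pairwise product term over all (i, j)
    diff = np.abs(X[:, None, :] - X[None, :, :])
    prod = np.prod(1 + 0.5 * a[:, None, :] + 0.5 * a[None, :, :] - 0.5 * diff, axis=2)
    term3 = prod.sum() / n**2
    return term1 - term2 + term3
```

On a one-dimensional example, evenly spread midpoints score lower (more uniform) than points clumped in a corner, which is exactly the ordering a uniform-design criterion should produce.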
2.
The identification of active effects in supersaturated designs (SSDs) is a problem of considerable interest to both scientists and engineers. The complicated structure of the design matrix makes the analysis of such designs difficult, and although several methods have been proposed so far, solutions beyond one or two active factors remain inadequate. This article presents a heuristic approach for analyzing SSDs that applies the cumulative sum (CUSUM) control chart within a sure independence screening framework. Simulations are used to compare the performance of the proposed method with other well-known methods from the literature. The results establish the power of the proposed methodology.
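The CUSUM recursion underlying such a chart is simple; the following is a minimal sketch (function signature and parameter names are illustrative, not the authors'):

```python
import numpy as np

def cusum_upper(x, target, k):
    """One-sided upper CUSUM statistics:
       C_i = max(0, C_{i-1} + x_i - target - k),
    where k is the reference (slack) value. Observations whose statistic
    crosses a decision threshold h would be flagged as out of control."""
    c = np.zeros(len(x))
    prev = 0.0
    for i, xi in enumerate(x):
        prev = max(0.0, prev + xi - target - k)
        c[i] = prev
    return c
```

In an SSD-analysis setting one could, for instance, feed in ordered absolute effect estimates and flag as active those whose cumulative statistic exceeds h; that usage is our illustration, not the paper's exact procedure.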
3.
In this work we present a study on the analysis of a large data set from seismology. A set of large-margin classifiers based on the well-known support vector machine (SVM) algorithm is used to classify the data into two classes according to their magnitude on the Richter scale. Because of the imbalanced nature of the two classes, reweighting techniques are used to demonstrate the importance of reweighting algorithms. Moreover, we present an incremental algorithm to explore the possibility of predicting the strength of an earthquake with incremental techniques.
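One standard way to reweight an SVM for class imbalance is to put a per-class cost on the hinge loss; below is a minimal subgradient-descent sketch of a weighted linear SVM (our own illustration, not the classifiers used in the paper):

```python
import numpy as np

def weighted_linear_svm(X, y, class_weight, lam=0.01, lr=0.1, epochs=200, seed=0):
    """Linear SVM fit by stochastic subgradient descent on the weighted
    hinge loss  lam/2*||w||^2 + sum_i c_i * max(0, 1 - y_i*(w.x_i + b)),
    where c_i = class_weight[y_i] up-weights the rare class. y in {-1,+1}."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    w = np.zeros(p)
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(n):
            c = class_weight[int(y[i])]
            margin = y[i] * (X[i] @ w + b)
            if margin < 1:          # hinge active: push toward the margin
                w = w - lr * (lam * w - c * y[i] * X[i])
                b = b + lr * c * y[i]
            else:                   # only the regularizer contributes
                w = w - lr * lam * w
    return w, b
```

Setting, say, `class_weight = {1: 5.0, -1: 1.0}` makes misclassifying the minority class five times as costly, which is the essential idea behind the reweighting schemes the abstract refers to.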
4.
The statistical modeling of large databases is one of the most challenging issues today, and it becomes even more critical in the presence of a complicated correlation structure. Variable selection plays a vital role in the statistical analysis of large databases, and many methods have been proposed to deal with this problem. One such method is sure independence screening (SIS), which was introduced to reduce dimensionality to a relatively smaller scale. Though simple, the method produces remarkable results even under both ultra-high dimensionality and large sample sizes. In this paper we deal with the analysis of a large real medical data set, assuming a Poisson regression model. We support the analysis with simulated experiments that take into account the correlation structure of the design matrix.
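The screening step of SIS ranks predictors by their marginal association with the response and keeps only the strongest ones before any model fitting. A minimal NumPy sketch, assuming a simple correlation-based ranking (the exact utility used in the paper may differ):

```python
import numpy as np

def sis(X, y, d):
    """Sure independence screening: rank the p columns of X by absolute
    marginal correlation with y and return the indices of the top d."""
    Xs = (X - X.mean(0)) / X.std(0)
    ys = (y - y.mean()) / y.std()
    scores = np.abs(Xs.T @ ys) / len(y)      # |sample correlation| per column
    return np.sort(np.argsort(scores)[::-1][:d])
```

Even for a Poisson response, where the mean is exp(linear predictor), the truly active predictors tend to survive this linear screen, which is what makes SIS attractive as a cheap first stage before fitting the regression model on the retained columns.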
5.
K. Drosou, Communications in Statistics - Simulation and Computation, 2013, 42(7): 1979-1995
Supersaturated designs (SSDs) constitute a large class of fractional factorial designs that can be used for screening out the important factors from a large set of potentially active ones. A major advantage of these designs is that they reduce the experimental cost dramatically, but their crucial disadvantage is the confounding involved in the statistical analysis. The identification of active effects in SSDs has been the subject of much recent study. In this article we present a two-stage procedure for analyzing two-level SSDs assuming a main-effects-only model, without any interaction terms. The method combines sure independence screening (SIS) with different penalty functions, such as the smoothly clipped absolute deviation (SCAD), lasso, and MC penalties, achieving both down-selection and estimation of the significant effects simultaneously. Insights on using the proposed methodology are provided through various simulation scenarios, and comparisons with existing approaches, such as stepwise selection combined with SCAD and the Dantzig selector (DS), are presented as well. Results of the numerical study and a real-data analysis show that the proposed procedure is an advantageous tool, owing to its extremely good performance in identifying active factors.
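The two-stage structure can be sketched generically: screen columns by marginal correlation, then run a penalized regression on the survivors. The sketch below uses the lasso via coordinate descent with soft-thresholding as the second-stage penalty; it is an illustration of the general scheme under our own simplifying assumptions (standardized columns), not the authors' exact algorithm:

```python
import numpy as np

def soft(z, t):
    """Soft-thresholding operator, the lasso's coordinate-wise update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sis_lasso(X, y, d, lam, iters=200):
    """Stage 1: keep the d columns most correlated with y (SIS).
    Stage 2: lasso on the retained columns via coordinate descent.
    Returns the kept column indices and their lasso coefficients."""
    n, p = X.shape
    Xs = (X - X.mean(0)) / X.std(0)          # columns: mean 0, ||z_j||^2/n = 1
    ys = y - y.mean()
    keep = np.sort(np.argsort(np.abs(Xs.T @ ys))[::-1][:d])
    Z = Xs[:, keep]
    beta = np.zeros(d)
    r = ys - Z @ beta                        # current residual
    for _ in range(iters):
        for j in range(d):
            rho = Z[:, j] @ (r + Z[:, j] * beta[j]) / n
            new = soft(rho, lam)
            r += Z[:, j] * (beta[j] - new)   # keep residual in sync
            beta[j] = new
    return keep, beta
```

Swapping the soft-threshold for the SCAD or MC thresholding rule changes only the coordinate update; the screening stage stays the same, which is why the paper can pair SIS with several penalties.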
6.
Robust parameter design methodology was originally introduced by Taguchi [14] as an engineering methodology for quality improvement of products and processes. A robust design of a system is one in which two different types of factors are varied: control factors and noise factors. Control factors are variables whose levels are adjustable, whereas noise factors are variables whose levels are hard or impossible to control under normal conditions, such as environmental conditions and raw-material properties. Robust parameter design aims to reduce process variation by properly selecting the levels of the control factors so that the process becomes insensitive to changes in the noise factors. Taguchi [14,15] proposed the use of crossed arrays (inner-outer arrays) for robust parameter design. A crossed array is the cross-product of an orthogonal array (OA) involving the control factors (inner array) and an OA involving the noise factors (outer array). Objecting to the run size and the inflexibility of crossed arrays, several authors have instead combined control and noise factors in a single design matrix, called a combined array. In this framework, we present the use of OAs in Taguchi's methodology as a useful tool for constructing robust parameter designs with economical run size.
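The crossed-array construction described above is purely combinatorial: every run of the inner array is paired with every run of the outer array, so an n1-run inner array and an n2-run outer array yield n1*n2 runs in total. A minimal sketch (the helper name is ours):

```python
import numpy as np

def crossed_array(inner, outer):
    """Cross-product of a control-factor OA (inner) and a noise-factor OA
    (outer): each inner run is replicated against every outer run."""
    inner = np.asarray(inner)
    outer = np.asarray(outer)
    left = np.repeat(inner, len(outer), axis=0)   # each inner row, block-wise
    right = np.tile(outer, (len(inner), 1))       # outer rows cycled per block
    return np.hstack([left, right])
```

For example, crossing the 4-run, 3-factor, 2-level OA with itself gives a 16-run design in 6 columns, which illustrates the run-size inflation that motivates combined arrays.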
7.
Krystallenia Drosou, Journal of Applied Statistics, 2017, 44(3): 533-553
Correct diagnosis, given the limitations of human expertise in diagnosing disease manually, constitutes one of the major issues in the medical field. Nowadays, the use of machine learning classifiers, such as support vector machines (SVMs), in medical diagnosis is increasing gradually. However, traditional classification algorithms can perform poorly when applied to highly imbalanced data sets, in which negative examples (i.e. negative for a disease) outnumber the positive examples (i.e. positive for a disease). The SVM constitutes a significant improvement, and its mathematical formulation allows the incorporation of different weights so as to deal with the problem of imbalanced data. In the present work an extensive study of four medical data sets is conducted using a variant of the SVM called the proximal support vector machine (PSVM), proposed by Fung and Mangasarian [9]. Additionally, in order to deal with the imbalanced nature of the medical data sets, we apply both a variant of the SVM referred to as the two-cost support vector machine and a modification of the PSVM referred to as the modified PSVM. Both algorithms incorporate different weights, one for each class.
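A key appeal of the PSVM is that it reduces to a single regularized least-squares system with a closed-form solution. The sketch below follows the standard PSVM normal equations and adds optional per-sample costs as a simple way to up-weight the minority class; the weighting is our generic illustration of the idea, not necessarily the exact modified PSVM of the paper:

```python
import numpy as np

def psvm(A, d, nu=1.0, weights=None):
    """Proximal SVM: minimize
       (nu/2) * sum_i s_i * (d_i*(x_i.w - gamma) - 1)^2 + (1/2)*(||w||^2 + gamma^2),
    with labels d in {-1,+1} and optional per-sample costs s_i
    (e.g. larger for minority-class examples). Closed form:
       (I/nu + E' S E) z = E' S d,  E = [A, -e],  z = [w; gamma]."""
    n, p = A.shape
    e = np.ones(n)
    E = np.hstack([A, -e[:, None]])
    s = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    lhs = np.eye(p + 1) / nu + E.T @ (E * s[:, None])
    rhs = E.T @ (s * d)
    z = np.linalg.solve(lhs, rhs)
    return z[:p], z[p]                 # classify new x via sign(x.w - gamma)
```

Because training is one linear solve, the PSVM scales easily to the repeated fits needed for an extensive multi-data-set study, which is presumably part of its attraction here.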