Similar documents
20 similar documents found (search time: 15 ms)
1.
Parameter design or robust parameter design (RPD) is an engineering methodology intended as a cost-effective approach for improving the quality of products and processes. The goal of parameter design is to choose the levels of the control variables that optimize a defined quality characteristic. An essential component of RPD involves the assumption of well estimated models for the process mean and variance. Traditionally, the modeling of the mean and variance has been done parametrically. It is often the case, particularly when modeling the variance, that nonparametric techniques are more appropriate due to the nature of the curvature in the underlying function. Most response surface experiments involve sparse data. In sparse data situations with unusual curvature in the underlying function, nonparametric techniques often result in estimates with problematic variation whereas their parametric counterparts may result in estimates with problematic bias. We propose the use of semi-parametric modeling within the robust design setting, combining parametric and nonparametric functions to improve the quality of both mean and variance model estimation. The proposed method will be illustrated with an example and simulations.
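As a minimal sketch of the semi-parametric idea for the mean model (not the authors' exact estimator), one can add a nonparametric kernel smooth of the residuals on top of a parametric fit; the bandwidth `h`, the mixing weight `lam`, and the toy data are illustrative assumptions:

```python
import numpy as np

def gauss_kernel_smooth(x, r, x0, h):
    """Nadaraya-Watson smooth of residuals r at evaluation points x0."""
    w = np.exp(-0.5 * ((x0[:, None] - x[None, :]) / h) ** 2)
    return (w * r).sum(axis=1) / w.sum(axis=1)

def semiparametric_mean_fit(x, y, h=0.05, lam=1.0):
    """Parametric (linear) fit plus lam times a kernel smooth of its residuals."""
    b1, b0 = np.polyfit(x, y, 1)          # parametric component
    param = b0 + b1 * x
    resid = y - param
    return param + lam * gauss_kernel_smooth(x, resid, x, h)

# Toy check: a curved mean that a straight line cannot capture.
x = np.linspace(0.0, 1.0, 60)
y = (x - 0.5) ** 2                         # noiseless curved response
fit = semiparametric_mean_fit(x, y)
sse_semi = ((y - fit) ** 2).sum()
sse_param = ((y - np.polyval(np.polyfit(x, y, 1), x)) ** 2).sum()
```

The nonparametric term picks up the curvature the parametric component misses, which is the motivation the abstract gives for combining the two.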

2.
The author introduces robust techniques for estimation, inference and variable selection in the analysis of longitudinal data. She first addresses the problem of the robust estimation of the regression and nuisance parameters, for which she derives the asymptotic distribution. She uses weighted estimating equations to build robust quasi‐likelihood functions. These functions are then used to construct a class of test statistics for variable selection. She derives the limiting distribution of these tests and shows its robustness properties in terms of stability of the asymptotic level and power under contamination. An application to a real data set allows her to illustrate the benefits of a robust analysis.

3.
A new method for constructing interpretable principal components is proposed. The method first clusters the variables, and then interpretable (sparse) components are constructed from the correlation matrices of the clustered variables. For the first step of the method, a new weighted-variances method for clustering variables is proposed. It reflects the nature of the problem that the interpretable components should maximize the explained variance and thus provide sparse dimension reduction. An important feature of the new clustering procedure is that the optimal number of clusters (and components) can be determined in a non-subjective manner. The new method is illustrated using well-known simulated and real data sets. It clearly outperforms many existing methods for sparse principal component analysis in terms of both explained variance and sparseness.
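A hedged sketch of the two-step idea (the threshold-based grouping below is a simplified stand-in for the paper's weighted-variances clustering): group variables by thresholding absolute correlations, then take the leading eigenvector of each cluster's correlation submatrix as a sparse loading vector, zero outside the cluster:

```python
import numpy as np

def cluster_variables(R, thresh=0.5):
    """Group variables whose absolute correlation exceeds thresh
    (connected components of the thresholded correlation graph)."""
    p = R.shape[0]
    adj = np.abs(R) > thresh
    labels = -np.ones(p, dtype=int)
    k = 0
    for i in range(p):
        if labels[i] >= 0:
            continue
        stack, labels[i] = [i], k
        while stack:
            j = stack.pop()
            for m in np.where(adj[j] & (labels < 0))[0]:
                labels[m] = k
                stack.append(m)
        k += 1
    return labels

def sparse_components(X, thresh=0.5):
    """One sparse loading vector per variable cluster."""
    R = np.corrcoef(X, rowvar=False)
    labels = cluster_variables(R, thresh)
    comps = []
    for k in range(labels.max() + 1):
        idx = np.where(labels == k)[0]
        vals, vecs = np.linalg.eigh(R[np.ix_(idx, idx)])
        v = np.zeros(R.shape[0])
        v[idx] = vecs[:, -1]          # leading eigenvector of the submatrix
        comps.append(v)
    return labels, np.array(comps)

# Two blocks of correlated variables driven by two latent factors.
rng = np.random.default_rng(1)
f1, f2 = rng.normal(size=(2, 500))
X = np.column_stack([f1 + 0.3 * rng.normal(size=500) for _ in range(3)] +
                    [f2 + 0.3 * rng.normal(size=500) for _ in range(3)])
labels, comps = sparse_components(X)
```

Each component loads only on its own cluster's variables, which is what makes the result sparse and interpretable.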

4.
A mixed-integer programming formulation for clustering is proposed, one that encompasses a wider range of objectives and side conditions than standard clustering approaches. The flexibility of the formulation is demonstrated with diagrams of sample problems and solutions. Preliminary computational tests in a practical setting confirm the usefulness of the formulation.

5.
A method for achieving robustness in linear models is to assume a mixture of standard and outlier observations, with a different error variance for each class. For generalised linear models (GLMs) the mixture-model approach is more difficult, as the error variance for many distributions has a fixed relationship to the mean. This model is extended to GLMs by redefining the classes: the standard class is a standard GLM, and the outlier class is an overdispersed GLM obtained by including a random-effect term in the linear predictor. The advantages of this method are that it can be extended to any model with a linear predictor, and that outlier observations can be easily identified. In simulations the model is compared with an M-estimator and is found to have improved bias and coverage. The method is demonstrated on three examples.
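For the Gaussian linear case, the outlier-class idea can be sketched as an EM algorithm with a variance-inflated second class; this is a simplified stand-in for the paper's random-effect construction for general GLMs, and all settings below are illustrative:

```python
import numpy as np

def mixture_outlier_fit(X, y, n_iter=100):
    """EM for a two-class linear model: a standard class N(Xb, s0^2) and a
    variance-inflated outlier class N(Xb, s1^2) with s1 > s0."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    r = y - X @ beta
    s0, s1, pi = r.std(), 5.0 * r.std(), 0.1
    for _ in range(n_iter):
        r = y - X @ beta
        d0 = np.exp(-0.5 * (r / s0) ** 2) / s0
        d1 = np.exp(-0.5 * (r / s1) ** 2) / s1
        w1 = pi * d1 / (pi * d1 + (1 - pi) * d0)   # outlier responsibility
        w0 = 1.0 - w1
        c = w0 / s0 ** 2 + w1 / s1 ** 2            # precision weights
        beta = np.linalg.solve(X.T @ (X * c[:, None]), X.T @ (c * y))
        s0 = max(np.sqrt((w0 * r ** 2).sum() / w0.sum()), 1e-8)
        s1 = max(np.sqrt((w1 * r ** 2).sum() / w1.sum()), 1e-8)
        pi = w1.mean()
    return beta, w1

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.3, n)
out_idx = rng.choice(n, 20, replace=False)
y[out_idx] += 8.0                                 # planted outliers
beta_hat, w1 = mixture_outlier_fit(X, y)
```

The responsibilities `w1` directly flag the outlying observations, illustrating the abstract's point that outliers are easily identified under this model.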

6.
Fuzzy least-squares regression can be very sensitive to unusual data (e.g., outliers). In this article, we describe how to fit an alternative robust regression estimator in a fuzzy environment, one that attempts to identify and ignore unusual data. The proposed approach draws on classical robust regression and estimation methods that are insensitive to outliers. In this regard, based on the least trimmed squares estimation method, an estimation procedure is proposed for determining the coefficients of the fuzzy regression model for crisp-input, fuzzy-output data. The investigated fuzzy regression model is applied to real-world bedload transport data to forecast suspended load from discharge. The accuracy of the proposed method is compared with that of the well-known fuzzy least-squares regression model, based on a similarity measure between fuzzy sets. The comparison results reveal that the fuzzy robust regression model performs better than the other models in suspended load estimation for this particular dataset. The proposed model is general and can be used for modeling natural phenomena whose available observations are reported as imprecise rather than crisp.
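The least trimmed squares (LTS) backbone for crisp data can be sketched with random elemental starts and concentration (C-)steps; this omits the fuzzy-output machinery of the paper, and the data and settings are illustrative:

```python
import numpy as np

def lts_fit(X, y, h, n_starts=50, n_csteps=10, seed=0):
    """Least trimmed squares: repeatedly refit OLS on the h observations
    with the smallest squared residuals, over several random starts."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    best_beta, best_obj = None, np.inf
    for _ in range(n_starts):
        idx = rng.choice(n, p + 1, replace=False)   # elemental start
        beta = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
        for _ in range(n_csteps):                   # concentration steps
            r2 = (y - X @ beta) ** 2
            keep = np.argsort(r2)[:h]
            beta = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
        obj = np.sort((y - X @ beta) ** 2)[:h].sum()
        if obj < best_obj:
            best_obj, best_beta = obj, beta
    return best_beta

rng = np.random.default_rng(2)
n = 100
x = rng.uniform(0.0, 10.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, n)
y[:15] += 30.0                                      # gross outliers
X = np.column_stack([np.ones(n), x])
beta_lts = lts_fit(X, y, h=75)
```

Because the objective sums only the `h` smallest squared residuals, the fit is unaffected by the 15 contaminated responses.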

7.
Following the developments in DasGupta et al. (2000), the authors propose and explore a new method for constructing proper default priors and a method for selecting a Bayes estimate from a family. Their results are based on asymptotic expansions of certain marginal correlations. For ease of exposition, most results are presented for location families and squared error loss only. The default prior methodology amounts, ultimately, to the minimization of Fisher information, and hence, Bickel's prior works out as the default prior if the location parameter is bounded. As for the selected Bayes estimate, it corresponds to ‘Gaussian tilting’ of an initial reference prior.

8.
9.
This study develops a robust automatic algorithm for clustering probability density functions, building on previous research. Unlike other existing methods, which often pre-determine the number of clusters, this method can self-organize data groups based on the original data structure. The proposed clustering method is also robust to noise. Three examples of synthetic data and a real-world COREL dataset are used to illustrate the accuracy and effectiveness of the proposed approach.
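A hedged sketch of density-function clustering (the paper's self-organizing rule for the number of clusters is replaced here by an illustrative distance cutoff): estimate each sample's density with a kernel density estimate on a common grid, then link densities whose L2 distance is small:

```python
import numpy as np

def kde_on_grid(sample, grid, h=0.3):
    """Gaussian kernel density estimate evaluated on a fixed grid."""
    w = np.exp(-0.5 * ((grid[:, None] - sample[None, :]) / h) ** 2)
    return w.sum(axis=1) / (sample.size * h * np.sqrt(2 * np.pi))

def cluster_densities(samples, grid, cutoff):
    """Single-linkage grouping of densities with small pairwise L2 distance."""
    dens = np.array([kde_on_grid(s, grid) for s in samples])
    step = grid[1] - grid[0]
    D = np.sqrt(((dens[:, None, :] - dens[None, :, :]) ** 2).sum(-1) * step)
    adj = D < cutoff
    n = len(dens)
    labels = -np.ones(n, dtype=int)
    k = 0
    for i in range(n):
        if labels[i] >= 0:
            continue
        stack, labels[i] = [i], k
        while stack:
            j = stack.pop()
            for m in np.where(adj[j] & (labels < 0))[0]:
                labels[m] = k
                stack.append(m)
        k += 1
    return labels

rng = np.random.default_rng(3)
samples = [rng.normal(0.0, 1.0, 500) for _ in range(4)] + \
          [rng.normal(5.0, 1.0, 500) for _ in range(4)]
grid = np.linspace(-4.0, 9.0, 200)
labels = cluster_densities(samples, grid, cutoff=0.15)
```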

10.
We empirically illustrate how concepts and methods involved in a grade of membership (GoM) analysis can be used to sort individuals by competence. Our study relies on a data set compiled from the international survey on higher education graduates called REFLEX. We focus on the subset of data related to the perception of one's own competencies. It is first decomposed into fuzzy clusters that form a hierarchical fuzzy partition. Then, we calculate a scalar measure of competencies for each fuzzy cluster, and subsequently use the individual GoM scores to combine cluster-based competencies and position individuals on a scale from 0 to 1.
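The final combination step can be sketched directly: with a hypothetical membership matrix `G` (rows: individuals, columns: fuzzy clusters) and per-cluster competence scores `c`, each individual's competence is the membership-weighted average, which stays on the [0, 1] scale:

```python
import numpy as np

# Hypothetical GoM membership scores (each row sums to 1) and
# hypothetical cluster-level competence measures on [0, 1].
G = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.3, 0.6],
              [0.4, 0.4, 0.2]])
c = np.array([0.9, 0.5, 0.2])

scores = G @ c        # individual competence on the [0, 1] scale
```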

11.
In this article, we propose a robust statistical approach to selecting an appropriate error distribution in a classical multiplicative heteroscedastic model. In a first step, unlike the traditional approach, we do not use any GARCH-type estimation of the conditional variance. Instead, we propose to use a recently developed nonparametric procedure [D. Mercurio and V. Spokoiny, Statistical inference for time-inhomogeneous volatility models, Ann. Stat. 32 (2004), pp. 577–602]: local adaptive volatility estimation. The motivation for using this method is to avoid possible model misspecification of the conditional variance. In a second step, we suggest a set of estimation and model selection procedures (Berk–Jones tests, kernel density-based selection, censored likelihood score, and coverage probability) based on the resulting residuals. These methods enable us to assess the global fit of a set of distributions as well as to focus on their behaviour in the tails, giving us the capacity to map the strengths and weaknesses of the candidate distributions. A bootstrap procedure is provided to compute the rejection regions in this semiparametric context. Finally, we illustrate our methodology with a small simulation study and an application to three time series of daily returns (UBS stock returns, BOVESPA returns and EUR/USD exchange rates).

12.

The purpose of this paper is to show, in regression clustering, how to choose the most relevant solutions, analyze their stability, and provide information about the best combinations of the optimal number of groups, the restriction factor on the error variances across groups, and the level of trimming. The procedure is based on two steps. First, we generalize the information criteria of constrained robust multivariate clustering to the case of clustering weighted models. Unlike traditional approaches, which are based on choosing the best solution found by minimizing an information criterion (e.g. BIC), we concentrate our attention on the so-called optimal stable solutions. In the second step, using the monitoring approach, we select the best value of the trimming factor. Finally, we validate the solution using a confirmatory forward search approach. A motivating example based on a novel dataset concerning the European Union trade of face masks shows the limitations of the currently existing procedures. The suggested approach is first applied to a set of well-known datasets in the literature on robust regression clustering. Then, we focus our attention on a set of international trade datasets and provide a novel, informative way of updating the subset in the random-start approach. The Supplementary material, in the spirit of the Special Issue, deepens the analysis of the trade data and compares the suggested approach with existing ones available in the literature.


13.
We introduce the concept of snipping, complementing that of trimming, in robust cluster analysis. An observation is snipped when some of its dimensions are discarded, but the remaining ones are used for clustering and estimation. Snipped k-means is performed through a probabilistic optimization algorithm which is guaranteed to converge to the global optimum. We show global robustness properties of our snipped k-means procedure. Simulations and a real-data application to optical recognition of handwritten digits are used to illustrate and compare the approach.
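A hedged, non-probabilistic sketch of the snipping idea (the paper's optimization algorithm is more elaborate): a Lloyd-type k-means in which snipped cells are excluded from both the assignment distances and the centroid updates, so a contaminated cell is discarded while the rest of the observation is kept:

```python
import numpy as np

def snipped_kmeans(X, mask, init_idx, n_iter=25):
    """k-means where snipped cells (mask == False) are ignored both in the
    assignment distances and in the centroid updates."""
    centers = X[init_idx].astype(float).copy()
    k = len(init_idx)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # mean squared distance over each point's observed cells only
        d = np.array([((X - c) ** 2 * mask).sum(1) / mask.sum(1) for c in centers])
        labels = d.argmin(axis=0)
        for j in range(k):
            sel = labels == j
            if sel.any():
                w = mask[sel].sum(axis=0)               # observed count per dim
                m = (X[sel] * mask[sel]).sum(axis=0) / np.maximum(w, 1)
                centers[j] = np.where(w > 0, m, centers[j])
    return labels, centers

rng = np.random.default_rng(4)
A = rng.normal(0.0, 1.0, (20, 5))
B = rng.normal(8.0, 1.0, (20, 5))
X = np.vstack([A, B])
mask = np.ones_like(X, dtype=bool)
X[0, 2] = 500.0        # gross cell-wise contamination ...
mask[0, 2] = False     # ... snipped: this cell is ignored, the rest is kept
labels, centers = snipped_kmeans(X, mask, init_idx=[1, 39])
```

Unlike trimming, the contaminated observation is still assigned to its correct cluster via its clean coordinates.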

14.
15.
In this work, we modify finite mixtures of factor analysers to provide a method for simultaneous clustering of subjects and multivariate discrete outcomes. The joint clustering is performed through a suitable reparameterization of the outcome (column)-specific parameters. We develop an expectation–maximization-type algorithm for maximum likelihood parameter estimation where the maximization step is divided into orthogonal sub-blocks that refer to row and column-specific parameters, respectively. Model performance is evaluated via a simulation study with varying sample size, number of outcomes and row/column-specific clustering (partitions). We compare the performance of our model with the performance of standard model-based biclustering approaches. The proposed method is also demonstrated on a benchmark data set where a multivariate binary response is considered.

16.
Robust parameter design methodology was originally introduced by Taguchi [Introduction to Quality Engineering: Designing Quality Into Products and Processes, Asian Productivity Organization, 1986] as an engineering methodology for quality improvement of products and processes. A robust design of a system is one in which two different types of factors are varied: control factors and noise factors. Control factors are variables with levels that are adjustable, whereas noise factors are variables with levels that are hard or impossible to control during normal conditions, such as environmental conditions and raw-material properties. Robust parameter design aims at the reduction of process variation by properly selecting the levels of control factors so that the process becomes insensitive to changes in noise factors. Taguchi [1986; System of Experimental Design, Vols. I and II, UNIPUB, 1987] proposed the use of crossed arrays (inner-outer arrays) for robust parameter design. A crossed array is the cross-product of an orthogonal array (OA) involving control factors (inner array) and an OA involving noise factors (outer array). Owing to objections to the run size and the flexibility of crossed arrays, several authors have combined control and noise factors in a single design matrix, called a combined array, instead. In this framework, we present the use of OAs in Taguchi's methodology as a useful tool for designing robust parameter designs with an economical run size.
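A crossed array can be sketched directly: cross each inner-array (control) run with every outer-array (noise) run, and score each control setting by the variability of the response over the noise runs. The response function below is purely hypothetical; it is built so that setting the second control factor to 1 shields the process from the dominant noise factors:

```python
import numpy as np

# L4 orthogonal array: 3 two-level factors in 4 runs.
L4 = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

def signed(v):
    """Map a 0/1 level to -1/+1."""
    return 2 * v - 1

def response(c, z):
    """Hypothetical process: control factor c2 = 1 shields the response
    from the dominant noise factors z1, z2 (illustrative only)."""
    c1, c2, c3 = c
    z1, z2, z3 = z
    return 8 + 2 * c1 + (1 - c2) * (3 * signed(z1) + 2 * signed(z2)) + 0.1 * signed(z3)

# Crossed array: every inner-array run crossed with every outer-array run,
# giving 4 x 4 = 16 observations; per-control-run variance over the noise runs.
variances = [np.var([response(c, z) for z in L4]) for c in L4]
best = int(np.argmin(variances))   # most robust control setting
```

The selected run has the shielding level of the second control factor, which is exactly the "insensitive to noise" criterion the abstract describes.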

17.
By adopting the Bayesian method, we develop in this paper some robust procedures for the one- and two-sample location problems based on symmetric Type-II censored samples, assuming normality for the censored samples. The posterior distributions and the highest posterior density (H.P.D.) intervals are derived. Finally, we illustrate these procedures by applying the results to Darwin's data and to Brownlee's data.

18.
Outliers that commonly occur in business sample surveys can have large impacts on domain estimates. The authors consider an outlier-robust design and smooth estimation approach, which can be related to the so-called "surprise stratum" technique [Kish, Survey Sampling, Wiley, New York, 1965]. The sampling design utilizes a threshold sample consisting of previously observed outliers that are selected with probability one, together with stratified simple random sampling from the rest of the population. The domain predictor is an extension of the Winsorization-based estimator proposed by Rivest and Hidiroglou [Outlier Treatment for Disaggregated Estimates, in Proceedings of the Section on Survey Research Methods, American Statistical Association, 2004, pp. 4248–4256], and is similar to the estimator for skewed populations suggested by Fuller [Statistica Sinica 1991;1:137–158]. It makes use of a domain Winsorized sample mean plus a domain-specific adjustment based on the estimated overall mean of the excess values. The methods are studied theoretically from a design-based perspective and by simulations based on the Norwegian Research and Development Survey data. Guidelines for choosing the threshold values are provided. The Canadian Journal of Statistics 39: 147–164; 2011 © 2010 Statistical Society of Canada
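A deliberately simplified illustration of the Winsorization split (not the exact Rivest-Hidiroglou estimator, whose adjustment is domain-specific): each domain receives its Winsorized mean plus a flat per-unit share of the overall excess above the cutoff `K`, which dampens the outlier's impact on its own domain while preserving the overall total:

```python
import numpy as np

def winsorized_domain_estimates(y, domain, K):
    """Simplified sketch: domain Winsorized mean plus the overall per-unit
    mean of the excess above K (a flat stand-in for the paper's
    domain-specific adjustment)."""
    yw = np.minimum(y, K)                 # Winsorize at the threshold K
    excess_mean = (y - yw).mean()         # overall per-unit excess
    return {d: yw[domain == d].mean() + excess_mean for d in np.unique(domain)}

y = np.array([3.0, 4.0, 5.0, 4.0, 120.0, 6.0])   # one gross outlier
domain = np.array([0, 0, 0, 1, 1, 1])
est = winsorized_domain_estimates(y, domain, K=10.0)
plain = {d: y[domain == d].mean() for d in (0, 1)}
est_nowins = winsorized_domain_estimates(y, domain, K=np.inf)  # no Winsorization
```

With no cutoff the estimator reduces to the plain domain means, and with a cutoff the outlier's domain estimate is pulled toward the bulk of the data.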

19.
Principal component regression uses principal components (PCs) as regressors. It is particularly useful in prediction settings with high-dimensional covariates. The existing literature on Bayesian approaches is relatively sparse. We introduce a Bayesian approach that is robust to outliers in both the dependent variable and the covariates. Outliers can be thought of as observations that are not in line with the general trend. The proposed approach automatically penalises these observations so that their impact on the posterior gradually vanishes as they move further and further away from the general trend, corresponding to a concept in Bayesian statistics called whole robustness. The predictions produced are thus consistent with the bulk of the data. The approach also exploits the geometry of PCs to efficiently identify those that are significant. Individual predictions obtained from the resulting models are consolidated according to model-averaging mechanisms to account for model uncertainty. The approach is evaluated on real data and compared to its nonrobust Bayesian counterpart, the traditional frequentist approach, and a commonly employed robust frequentist method. Detailed guidelines to automate the entire statistical procedure are provided. All required code is made available; see arXiv:1711.06341.
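The nonrobust PCR backbone that the Bayesian robust approach builds on can be sketched as follows; the data, dimensions, and number of retained components are illustrative:

```python
import numpy as np

def pcr_fit(X, y, k):
    """Principal component regression: regress y on the first k PCs of
    centred X, then map the coefficients back to the original covariates."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    T = Xc @ Vt[:k].T                       # scores on the first k components
    gamma = np.linalg.lstsq(T, yc, rcond=None)[0]
    beta = Vt[:k].T @ gamma                 # back to the original covariates
    intercept = y.mean() - X.mean(axis=0) @ beta
    return intercept, beta

# Highly collinear covariates driven by two latent directions: PCR with
# k = 2 captures essentially all the predictive signal.
rng = np.random.default_rng(5)
n, p = 150, 6
Z = rng.normal(size=(n, 2))
X = np.column_stack([Z @ rng.normal(size=2) + 0.05 * rng.normal(size=n)
                     for _ in range(p)])
y = 2.0 * Z[:, 0] - 1.0 * Z[:, 1] + 0.1 * rng.normal(size=n)
b0, beta = pcr_fit(X, y, k=2)
pred = b0 + X @ beta
r2 = 1 - ((y - pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()
```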

20.
Two key questions in clustering problems are how to determine the number of groups properly and how to measure the strength of group assignments. These questions are especially involved when a certain fraction of outlying data is also expected.


Copyright © Beijing Qinyun Technology Development Co., Ltd. (京ICP备09084417号)