Clustering the mixed panel dataset using Gower's distance and k-prototypes algorithms

Authors:	Özlem Akay, Güzin Yüksel

Institution:	1. Department of Statistics, The Faculty of Science and Letters, Institute of Natural and Applied Sciences, Çukurova University, Adana, Turkey (oakay@cu.edu.tr); 3. Statistics Department, Faculty of Arts and Sciences, Çukurova University, Adana, Turkey

Abstract:	Panel datasets are increasingly used in economics to analyze complex economic phenomena. Panel data form a two-dimensional array that combines cross-sectional and time-series data. Once a panel data matrix has been constructed, clustering methods can be applied to it. Clustering the panel before the main analysis addresses the heterogeneity of the dependent variable. Clustering is a widely used statistical tool for identifying subsets in a given dataset. In this article, we cluster a mixed panel dataset using agglomerative hierarchical algorithms based on Gower's distance and using the k-prototypes algorithm. The performance of these algorithms is studied on panel data with mixed numerical and categorical features, and their effectiveness is compared using cluster accuracy. An experimental analysis on a real dataset is carried out using Stata and R.

Keywords:	Cluster analysis; Gower's distance; k-prototypes; Panel data
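
For readers unfamiliar with the two dissimilarity measures named in the abstract, the following is a minimal Python sketch of how each treats a record with mixed numerical and categorical features. It is not the authors' implementation (the abstract reports using Stata and R), and the toy rows, the function names, and the weight `gamma` are hypothetical choices made for illustration only. How the panel is arranged into rows before clustering is not described in the abstract and is not modelled here.

```python
def gower_distance(a, b, numeric_ranges):
    """Gower's distance between two mixed-type records.

    numeric_ranges maps the index of each numerical feature to its range
    (max - min over the dataset); all other indices are treated as
    categorical. Every feature contributes a value in [0, 1] and the
    contributions are averaged.
    """
    total = 0.0
    for k, (x, y) in enumerate(zip(a, b)):
        if k in numeric_ranges:
            r = numeric_ranges[k]
            total += abs(x - y) / r if r > 0 else 0.0
        else:
            total += 0.0 if x == y else 1.0
    return total / len(a)


def kproto_dissimilarity(a, b, numeric_idx, gamma):
    """k-prototypes dissimilarity: squared Euclidean distance on the
    numerical features plus gamma times the number of categorical
    mismatches."""
    numeric_part = sum((a[k] - b[k]) ** 2 for k in numeric_idx)
    mismatches = sum(
        1 for k in range(len(a)) if k not in numeric_idx and a[k] != b[k]
    )
    return numeric_part + gamma * mismatches


# Hypothetical toy records: (numerical value, categorical label).
rows = [(10.0, "A"), (20.0, "A"), (30.0, "B")]
values = [row[0] for row in rows]
ranges = {0: max(values) - min(values)}  # range of the numerical feature

d01 = gower_distance(rows[0], rows[1], ranges)
d02 = gower_distance(rows[0], rows[2], ranges)
k02 = kproto_dissimilarity(rows[0], rows[2], numeric_idx={0}, gamma=0.5)
```

Gower's distance rescales every feature to the unit interval, so the pairwise distances can be passed to any agglomerative hierarchical routine. The k-prototypes measure leaves numerical features unscaled and relies on `gamma` to balance the two feature types.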