Hybrid optimization for k-means clustering learning enhancement

In recent years, combinatorial optimization has been recognized as a critical problem in clustering, where the goal is to partition data in a way that optimizes clustering performance. The K-means algorithm is one of the most popular clustering algorithms: it is simple to implement and i...

Bibliographic Details
Main Author: Farhang, Yousef
Format: Thesis
Language: English
Published: 2016
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/78635/1/YousefFarhangPFC2016.pdf
id my-utm-ep.78635
record_format uketd_dc
spelling my-utm-ep.78635 2018-08-29T07:53:13Z Hybrid optimization for k-means clustering learning enhancement 2016-01 Farhang, Yousef QA75 Electronic computers. Computer science 2016-01 Thesis http://eprints.utm.my/id/eprint/78635/ http://eprints.utm.my/id/eprint/78635/1/YousefFarhangPFC2016.pdf application/pdf en public http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:97488 phd doctoral Universiti Teknologi Malaysia, Faculty of Computing Faculty of Computing
institution Universiti Teknologi Malaysia
collection UTM Institutional Repository
language English
topic QA75 Electronic computers. Computer science
description In recent years, combinatorial optimization has been recognized as a critical problem in clustering, where the goal is to partition data in a way that optimizes clustering performance. The K-means algorithm is one of the most popular clustering algorithms: it is simple to implement and can solve the optimization problem with little additional information. However, the K-means algorithm suffers from a high error rate, high intra-cluster distance and low accuracy. Researchers have therefore worked to address these problems computationally, creating efficient solutions that lead to better data analysis through K-means clustering. The aim of this study is to improve the accuracy of the K-means algorithm using hybrid and meta-heuristic methods. To this end, a meta-heuristic approach was proposed for hybridizing the K-means algorithm. Better results were obtained by developing a hybrid Genetic Algorithm-K-means (GA-K-means) method and a hybrid Particle Swarm Optimization-K-means (PSO-K-means) method. Finally, two meta-heuristics built on the K-means algorithm were proposed: Genetic Algorithm-Particle Swarm Optimization (GAPSO) and Particle Swarm Optimization-Genetic Algorithm (PSOGA). The study proceeded in three phases. First, it developed a hybrid GA-based K-means algorithm with a new crossover algorithm based on the range of attributes, in order to decrease the number of errors and increase the accuracy rate. Second, it proposed a hybrid PSO-based K-means algorithm with a new calculation function based on the range of the domain, in order to decrease intra-cluster distance and increase the accuracy rate. Third, it introduced two meta-heuristic algorithms, GAPSO-K-means and PSOGA-K-means, by combining the proposed algorithms to increase the number of correct answers and improve the accuracy rate. The approach was evaluated on six standard integer data sets provided by the University of California, Irvine (UCI). The findings confirmed that the hybrid optimization approach enhanced the performance of the K-means clustering algorithm. Although both GA-K-means and PSO-K-means improved on the results of the K-means algorithm, the GAPSO-K-means and PSOGA-K-means meta-heuristic algorithms outperformed the hybrid approaches. PSOGA-K-means achieved 5% to 10% higher accuracy on all data sets than the other methods. The approach adopted in this study successfully increased the accuracy rate of the clustering analysis and decreased its error rate and intra-cluster distance.
format Thesis
qualification_name Doctor of Philosophy (PhD)
qualification_level Doctorate
author Farhang, Yousef
title Hybrid optimization for k-means clustering learning enhancement
granting_institution Universiti Teknologi Malaysia, Faculty of Computing
granting_department Faculty of Computing
publishDate 2016
url http://eprints.utm.my/id/eprint/78635/1/YousefFarhangPFC2016.pdf
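The abstract describes hybridizing K-means with a swarm-based search to lower intra-cluster distance. The general idea can be illustrated with a minimal sketch, under stated assumptions: it uses the textbook PSO update (inertia `w`, coefficients `c1` and `c2`) to search for good starting centroids and then refines them with ordinary K-means iterations. It is not the thesis's method; in particular, the thesis's range-based calculation function is not reproduced here, and all function names and parameter values below are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def intra_cluster_distance(points, centroids):
    """Sum of distances from each point to its nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) ** 0.5 for p in points)


def kmeans(points, centroids, iters=20):
    """Plain K-means (Lloyd) iterations from the given starting centroids."""
    centroids = [list(c) for c in centroids]
    k = len(centroids)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def pso_kmeans(points, k, n_particles=10, pso_iters=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Assumed hybrid: PSO searches for centroids, K-means refines the best."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Each particle is a full set of k centroids, seeded from random data points.
    pos = [[list(rng.choice(points)) for _ in range(k)]
           for _ in range(n_particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    pbest = [[list(c) for c in p] for p in pos]
    pbest_fit = [intra_cluster_distance(points, p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = [list(c) for c in pbest[g]], pbest_fit[g]

    for _ in range(pso_iters):
        for i in range(n_particles):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    vel[i][j][d] = (w * vel[i][j][d]
                                    + c1 * r1 * (pbest[i][j][d] - pos[i][j][d])
                                    + c2 * r2 * (gbest[j][d] - pos[i][j][d]))
                    pos[i][j][d] += vel[i][j][d]
            fit = intra_cluster_distance(points, pos[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [list(c) for c in pos[i]], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = [list(c) for c in pos[i]], fit

    # Refine the swarm's best centroid set with ordinary K-means.
    return kmeans(points, gbest)
```

Calling `pso_kmeans(points, k)` with a list of numeric tuples returns `k` centroids as lists of floats. A genetic-algorithm stage could be chained before or after the swarm search in the same way, which is the rough shape of the GAPSO and PSOGA combinations named in the abstract, though their actual operators are defined only in the thesis itself.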