Improved clustering using robust and classical principal component

The k-means algorithm is a popular data clustering algorithm. It aims to partition n observations into k clusters such that each observation belongs to the cluster with the nearest mean, which serves as a prototype of the cluster. Finding the appropriate number of clusters for a given data set...

Bibliographic Details
Main Author:	Hassn, Ahmed Kadom
Format:	Thesis
Language:	English
Published:	2017
Subjects:	Algorithms
Online Access:	http://psasir.upm.edu.my/id/eprint/70922/1/FS%202017%2047%20UPM.pdf

id	my-upm-ir.70922
record_format	uketd_dc
institution	Universiti Putra Malaysia
collection	PSAS Institutional Repository
language	English
advisor	Fitrianto, Anwar
topic	Algorithms
description	The k-means algorithm is a popular data clustering algorithm. It aims to partition n observations into k clusters such that each observation belongs to the cluster with the nearest mean, which serves as a prototype of the cluster. Finding the appropriate number of clusters for a given data set is generally a trial-and-error process, made more difficult by the subjective nature of deciding what constitutes ‘correct’ clustering. When the dimension of the data is large, it is often difficult to apply the k-means clustering algorithm because it requires a great deal of computation time. To remedy this problem, we propose integrating principal component analysis (PCA), which is useful for reducing the dimensionality of a dataset, with the k-means clustering algorithm. We call our proposed method k-means by principal components (pc1). In this study, the kernels created by the k-means method are replaced with kernels created by the PCA method, which reduces the dimensionality of the data. The results of the study show that k-means by PCA is faster and more efficient than the classical k-means algorithm. Both the classical k-means algorithm and the k-means by PCA algorithm are very sensitive to the presence of outliers. Hence, k-means by robust PCA is developed to rectify the problem of outliers in the dataset. The findings indicate that, in the absence of outliers, the two methods, k-means by PCA and k-means by robust PCA, perform equally well. Nonetheless, k-means by robust PCA is not much affected by outliers, unlike k-means by classical PCA.
format	Thesis
qualification_level	Master's degree
author	Hassn, Ahmed Kadom
title	Improved clustering using robust and classical principal component
granting_institution	Universiti Putra Malaysia
publishDate	2017
url	http://psasir.upm.edu.my/id/eprint/70922/1/FS%202017%2047%20UPM.pdf
_version_	1747812937331900416
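
The idea summarized in the description (reduce dimensionality with PCA, then cluster the reduced data with k-means) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: it keeps only the first principal component (one possible reading of the "pc1" label), finds that component by power iteration, and starts the 1-D k-means from evenly spaced centers. The function names and the toy data are hypothetical.

```python
import math


def first_principal_component(rows, iters=200):
    """Return (column means, unit direction of the first principal component)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration: repeatedly multiply by the covariance and renormalize.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return mean, v


def project(rows, mean, direction):
    """Score of each centered row along the given direction (d dims -> 1 dim)."""
    return [sum((r[j] - mean[j]) * direction[j] for j in range(len(mean)))
            for r in rows]


def kmeans_1d(scores, k, iters=100):
    """Lloyd's k-means on 1-D scores; evenly spaced start is an assumption."""
    lo, hi = min(scores), max(scores)
    centers = [lo + (i + 0.5) * (hi - lo) / k for i in range(k)]
    labels = [0] * len(scores)
    for _ in range(iters):
        # Assignment step: nearest center.
        labels = [min(range(k), key=lambda c: abs(s - centers[c])) for s in scores]
        # Update step: mean of each cluster (keep the old center if it is empty).
        new_centers = []
        for c in range(k):
            members = [s for s, lab in zip(scores, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def cluster_by_pc1(rows, k):
    """Hypothetical helper: project onto the first component, then cluster."""
    mean, direction = first_principal_component(rows)
    labels, _ = kmeans_1d(project(rows, mean, direction), k)
    return labels


# Toy data (made up for illustration): two well-separated groups in 3-D.
demo_rows = [
    (1.0, 2.0, 0.5), (1.2, 1.8, 0.4), (0.8, 2.2, 0.6), (1.1, 2.1, 0.5),
    (8.0, 9.0, 0.5), (8.2, 8.8, 0.6), (7.8, 9.2, 0.4), (8.1, 9.1, 0.5),
]
demo_labels = cluster_by_pc1(demo_rows, k=2)
print(demo_labels)
```

Clustering the projected scores means each distance computation costs one subtraction instead of a d-dimensional sum, which is where a speed-up over k-means on the raw data would come from. The sample mean and covariance used here are themselves sensitive to outliers, which is consistent with the description's motivation for a robust PCA variant; that variant is not sketched here.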

Similar Items