Enhancement of new smooth support vector machines for classification problems

Research on the Smooth Support Vector Machine (SSVM) for classification problems is an active field in data mining. SSVM is a reformulation of the standard Support Vector Machine (SVM). In SSVM, a smoothing technique must be applied to convert the constrained optimization problem into an unconstrained optimization problem sin...

Bibliographic Details
Main Author: Santi Wulan, Purnami
Format: Thesis
Language: English
Published: 2011
Subjects: Q Science (General)
Online Access: http://umpir.ump.edu.my/id/eprint/21942/19/Enhancement%20of%20new%20smooth%20support%20vector%20machines%20for%20classification%20problems.pdf
id my-ump-ir.21942
record_format uketd_dc
spelling my-ump-ir.21942 2022-01-10T23:52:24Z Enhancement of new smooth support vector machines for classification problems 2011-06 Santi Wulan, Purnami Q Science (General) 2011-06 Thesis http://umpir.ump.edu.my/id/eprint/21942/ http://umpir.ump.edu.my/id/eprint/21942/19/Enhancement%20of%20new%20smooth%20support%20vector%20machines%20for%20classification%20problems.pdf pdf en public phd doctoral Universiti Malaysia Pahang Faculty of Computer System & Software Engineering
institution Universiti Malaysia Pahang Al-Sultan Abdullah
collection UMPSA Institutional Repository
language English
topic Q Science (General)
description Research on the Smooth Support Vector Machine (SSVM) for classification problems is an active field in data mining. SSVM is a reformulation of the standard Support Vector Machine (SVM). In SSVM, the constrained optimization problem is converted into an unconstrained one; because the objective function of this unconstrained problem is not twice differentiable, a smoothing technique must be applied. A smooth function replaces the plus function, which yields the smooth support vector machine (SSVM). To improve accuracy, the Multiple Knot Spline SSVM (MKS-SSVM) is proposed. MKS-SSVM is a new SSVM that uses a multiple knot spline function, instead of the integral of the sigmoid function used in SSVM, to approximate the plus function. To obtain optimal accuracy, the Uniform Design method is used to select the parameters. The performance of the method is evaluated using 10-fold cross-validation accuracy, the confusion matrix, sensitivity, and specificity. To evaluate the effectiveness of the method, experiments are carried out on four medical datasets: Pima Indian diabetes, heart disease, breast cancer prognosis, and breast cancer diagnosis. The results show that MKS-SSVM is effective for diagnosis on these medical datasets, and the results are promising compared with previously reported results. SSVM algorithms are developed for binary classification. However, in many real problems data points fall into multiple categories. Hence, MKS-SSVM is extended to multiclass classification. Two popular multiclass methods, One against All (OAA) and One against One (OAO), were used to extend MKS-SSVM. Numerical experiments show that the classification accuracies of the OAA and OAO methods are competitive with each other, with no clear superiority of one method over the other. In terms of computation time, however, the OAO method is faster than the OAA method on three datasets, which indicates that the OAO method is usually more efficient than OAA.
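The smoothing step mentioned above can be illustrated with a minimal sketch. It assumes the standard integral-of-sigmoid approximation from the SSVM literature, p(x, alpha) = x + (1/alpha) log(1 + exp(-alpha x)); the abstract names this function but does not print its formula, and the multiple knot spline used by MKS-SSVM is not reproduced here. The names `plus`, `smooth_plus`, and `max_gap` are hypothetical.

```python
import math


def plus(x):
    """Plus function (x)_+ = max(x, 0); it has a kink at x = 0."""
    return max(x, 0.0)


def smooth_plus(x, alpha):
    """Integral-of-sigmoid approximation p(x, alpha) = x + log(1 + exp(-alpha*x)) / alpha.

    Evaluated in the algebraically equivalent form
    max(x, 0) + log1p(exp(-alpha*|x|)) / alpha
    so that exp() never overflows for large |x|.
    """
    return max(x, 0.0) + math.log1p(math.exp(-alpha * abs(x))) / alpha


def max_gap(alpha, xs):
    """Largest difference between the smooth approximation and the plus function on xs."""
    return max(smooth_plus(x, alpha) - plus(x) for x in xs)


# Evaluation grid from -2.0 to 2.0 in steps of 0.1 (includes x = 0).
grid = [i / 10.0 for i in range(-20, 21)]
for alpha in (1.0, 5.0, 25.0):
    print(alpha, max_gap(alpha, grid))
```

The approximation always lies above the plus function, and the gap is largest at x = 0, where it equals ln(2)/alpha, so a larger smoothing parameter alpha gives a tighter but more sharply curved approximation.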
In the final part, the thesis turns to the reduced support vector machine (RSVM), which was proposed to overcome the computational difficulties of SSVM on large datasets. To generate a representative reduced set for RSVM, the clustering reduced support vector machine (CRSVM) had been proposed. However, CRSVM is restricted to classification problems on large datasets with numeric attributes. This research proposes two alternative algorithms: k-mode RSVM (KMo-RSVM), which combines RSVM with the k-mode clustering technique to handle classification problems on large categorical datasets, and k-prototype RSVM (KPro-RSVM), which combines k-prototype clustering with RSVM to classify large datasets with mixed attributes. In the experiments, the effectiveness of KMo-RSVM is tested on four publicly available datasets. KMo-RSVM runs significantly faster than SSVM while still obtaining high accuracy. Compared with RSVM, KMo-RSVM is faster and produces a smaller reduced set with comparable testing accuracy. Experiments on three public datasets also show that KPro-RSVM greatly reduces the computational time and can handle classification of large mixed datasets even where the SSVM method ran out of memory (as on the census dataset). Compared with RSVM, KPro-RSVM requires less computational time, and its testing accuracy is slightly lower.
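The k-mode clustering step that KMo-RSVM builds on can be sketched as follows. This is a toy version of plain k-modes (simple matching dissimilarity, per-attribute modes as cluster centers), not the thesis's implementation. The abstract does not say how the reduced set is drawn from the clusters, so treating the cluster modes as candidate representatives is an assumption made here for illustration, and all function names and the sample records are hypothetical.

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Simple matching distance: number of attributes on which a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent category of each attribute (ties go to the first one seen)."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(records, k, max_iter=20):
    """Toy k-modes on tuples of categorical values; returns (modes, labels).

    Assumption: the initial modes are simply the first k distinct records.
    """
    modes = list(dict.fromkeys(records))[:k]
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach each record to its nearest mode.
        labels = [
            min(range(len(modes)), key=lambda j: matching_dissimilarity(r, modes[j]))
            for r in records
        ]
        # Update step: recompute each mode from its members (keep it if empty).
        new_modes = []
        for j, old in enumerate(modes):
            members = [r for r, lab in zip(records, labels) if lab == j]
            new_modes.append(column_modes(members) if members else old)
        if new_modes == modes:
            break
        modes = new_modes
    return modes, labels


# Made-up categorical records, used only to exercise the sketch.
records = [
    ("red", "small", "round"),
    ("blue", "large", "square"),
    ("red", "small", "oval"),
    ("blue", "large", "oval"),
    ("red", "large", "round"),
    ("blue", "small", "square"),
]
modes, labels = k_modes(records, k=2)
print(modes)
print(labels)
```

Under the stated assumption, the k modes would stand in for the full data when forming the reduced set, which is where the saving in kernel computations would come from. For mixed attributes, k-prototype clustering combines a matching term like this one with a numeric distance term.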
format Thesis
qualification_name Doctor of Philosophy (PhD.)
qualification_level Doctorate
author Santi Wulan, Purnami
title Enhancement of new smooth support vector machines for classification problems
granting_institution Universiti Malaysia Pahang
granting_department Faculty of Computer System & Software Engineering
publishDate 2011
url http://umpir.ump.edu.my/id/eprint/21942/19/Enhancement%20of%20new%20smooth%20support%20vector%20machines%20for%20classification%20problems.pdf
_version_ 1783732053563408384