A simultaneous spam and phishing attack detection framework for short message service based on text mining approach

Short Messaging Service (SMS) is one type of many communication mediums that are used by scammers to send persuasive messages that will attract unwary recipients. In Malaysia, most sectors such as telecommunication, banking, government, healthcare, and private have taken the initiative to educate th...

Bibliographic Details
Main Author: Mohd Foozy, Cik Feresa
Format: Thesis
Language: English
Published: 2017
Subjects: Q Science (General); QA Mathematics
Online Access: http://eprints.utem.edu.my/id/eprint/20626/1/A%20Simultaneous%20Spam%20And%20Phishing%20Attack%20Detection%20Framework%20For%20Short%20Message%20Service%20Based%20On%20Text%20Mining%20Approach.pdf
http://eprints.utem.edu.my/id/eprint/20626/2/A%20simultaneous%20spam%20and%20phishing%20attack%20detection%20framework%20for%20short%20message%20service%20based%20on%20text%20mining%20approach.pdf
id my-utem-ep.20626
record_format uketd_dc
institution Universiti Teknikal Malaysia Melaka
collection UTeM Repository
language English
advisor Ahmad, Rabiah
topic Q Science (General)
QA Mathematics
description Short Messaging Service (SMS) is one of many communication media that scammers use to send persuasive messages to unwary recipients. In Malaysia, most sectors, including telecommunication, banking, government, healthcare and the private sector, have taken the initiative to educate their clients about SMS scams. Unfortunately, many people still fall victim. Within the field of SMS detection, only frameworks that detect a single attack, Spam, have been studied; Phishing has never been studied. Existing detection frameworks are not suited to detecting SMS Phishing because these attacks have their own specific behaviour and characteristic words. This gives rise to the need for a framework that can detect both attacks at the same time. This thesis addresses the development of an SMS Spam and Phishing attack detection framework. The framework consists of three (3) modules: Data Collection, Attack Profiling and Text Mining. For Module 1, the data sets used in this research are from the UCI Machine Learning Repository, the Dublin Institute of Technology (DIT), British English SMS and Malay SMS, and the Phishing Rule-Based algorithm is used to extract SMS Phishing messages. For Module 2, the SMS Attack Profiling algorithm is used to produce SMS Spam and Phishing words. The Text Mining module consists of several phases: Tokenization, Lemmatization, Feature Selection and Classifier. These phases are carried out with the RapidMiner and Weka data mining tools. Three (3) types of features are used in this framework: Generic Features, Payload Features and Hybrid Features. All of these features are examined, and the performance metrics used to compare the results are the True Positive (TP) rate and Accuracy (A). Four (4) sets of results were obtained from this research.
The first result shows that extracting SMS Phishing from the SMS Spam class yields four (4) enhanced datasets: the UCI Machine Learning Repository, the Dublin Institute of Technology (DIT), British English SMS and Malay SMS. The second result is the SMS Spam and Phishing attack profiling from the enhanced UCI Machine Learning Repository, Dublin Institute of Technology (DIT), British English SMS and Malay SMS datasets. The third and fourth results are obtained from the Feature Selection and Classifier phases, where eighty (80) experiments were done to examine the Generic Features, Payload Features and Hybrid Features. Five (5) classification techniques are used: Naive Bayes, K-NN, Decision Tree, Random Tree and Decision Stump. Using RapidMiner, the Hybrid Feature accuracy is 77.47% for Naive Bayes, 78.56% for K-NN, 57.16% for Decision Tree, 57.24% for Random Tree and 57.16% for Decision Stump. Meanwhile, using Weka, the accuracy is 71.45% for Naive Bayes, 81.64% for K-NN, 57.10% for Decision Tree, 70.64% for Random Tree and 60.19% for Decision Stump. The experiments were done using both the RapidMiner and Weka data mining tools because this is the first study to detect SMS Spam and Phishing attacks at the same time, and the results are acceptable. Additionally, the proposed framework can detect both attacks simultaneously using text mining approaches.
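The description above outlines a pipeline of tokenization, a classifier and two metrics (TP rate and Accuracy) applied to three message classes. The following is only a minimal sketch of that idea in plain Python, not the thesis's implementation, which used RapidMiner and Weka. It assumes a simple regular-expression tokenizer and a multinomial Naive Bayes classifier with add-one smoothing, and it omits the lemmatization and feature selection phases. The class labels `ham`, `spam` and `phishing`, the helper names and the toy messages are all invented here for illustration and do not come from the thesis datasets.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(message):
    """Lowercase a message and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", message.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, messages, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)  # label -> token frequencies
        self.vocab = set()
        for message, label in zip(messages, labels):
            tokens = tokenize(message)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, message):
        tokens = tokenize(message)
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(y_true, y_pred):
    """Fraction of messages whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def tp_rate(y_true, y_pred, cls):
    """True Positive rate for one class: correctly detected / actual members."""
    preds_for_cls = [p for t, p in zip(y_true, y_pred) if t == cls]
    if not preds_for_cls:
        return 0.0
    return sum(p == cls for p in preds_for_cls) / len(preds_for_cls)


# Toy data, invented for illustration only.
train = [
    ("are we still meeting for lunch today", "ham"),
    ("call me when you get home", "ham"),
    ("win a free prize now text win to claim", "spam"),
    ("free ringtone offer reply now", "spam"),
    ("your bank account is suspended verify your password at this link", "phishing"),
    ("confirm your account pin to avoid suspension", "phishing"),
]
test = [
    ("free prize claim now", "spam"),
    ("verify your bank password", "phishing"),
    ("see you at lunch", "ham"),
]

model = NaiveBayes().fit([m for m, _ in train], [c for _, c in train])
y_true = [c for _, c in test]
predictions = [model.predict(m) for m, _ in test]

print("accuracy:", accuracy(y_true, predictions))
for cls in ("spam", "phishing"):
    print("TP rate for", cls, ":", tp_rate(y_true, predictions, cls))
```

Treating Phishing as its own class, rather than folding it into Spam, is what allows a single classifier to report a separate TP rate for each attack type, which is the sense in which both attacks are detected at the same time.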
format Thesis
qualification_name Doctor of Philosophy (PhD.)
qualification_level Doctorate
author Mohd Foozy, Cik Feresa
title A simultaneous spam and phishing attack detection framework for short message service based on text mining approach
granting_institution Universiti Teknikal Malaysia Melaka
granting_department Faculty of Information and Communication Technology
publishDate 2017
record http://eprints.utem.edu.my/id/eprint/20626/
catalogue https://plh.utem.edu.my/cgi-bin/koha/opac-detail.pl?biblionumber=107020