Text spam messages classification using Artificial Immune System (AIS) algorithms
Summary: | Spam messages remain a serious concern, especially for mobile users: statistics show that the problem keeps growing despite many efforts to reduce the risk of spam. Spammers have chosen SMS as their main target because it is an important means of communication among mobile users. Problems such as inefficient algorithms, user awareness and the high risk of spam remain dominant and challenging. In addition, the variety of SMS spam sent by spammers raises the question of which types of messages they send most often. Given these challenges, this research focuses on the second phase of spam management, classification (also referred to here as clustering). The main objectives of this research are: to study the relationship between the Artificial Immune System (AIS) and the Biological Immune System (BIS) in relation to spam detection, classification and severity determination; to propose an enhanced method for clustering spam messages that combines Clonal Selection and Immune Network Theory; and to run and evaluate the proposed algorithms. A spam management model inspired by the BIS, named the Integrated Mobile Spam Model (IMSM), is introduced. The model consists of three phases, namely detection, classification and severity determination, and each phase uses only AIS algorithms inspired by the BIS. Because the BIS can protect and defend the body against attacking bacteria and viruses, the same principle can be applied to mobile phones to protect them from spam messages. Classification is the process of clustering spam messages into several groups. This phase helps to identify which groups of spam messages occur most often and are most frequently sent by spammers, and it also supports the severity determination phase in determining the level of danger of spam messages. 
A new algorithm named "Hybrid Immune Clonal Network Algorithm" (HICNA), which combines Clonal Selection and Immune Network Theory, is proposed for clustering spam messages. The algorithm involves three phases: phase one scans the spam messages using common keywords, phase two scans them using uncommon keywords, and the last phase relies on expert judgement to ensure that all spam messages are clustered into the identified groups. A number of experiments were conducted on datasets from different sources to test the performance and validity of the algorithm and to assess its usability in the detection process. The results show that the three defined objectives were fulfilled and that the proposed algorithm gives better results in clustering spam messages into several groups. They also demonstrate the capability of AIS algorithms for the clustering process. |
---|
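
The three-phase flow described in the summary (common keywords, then uncommon keywords, then expert judgement) could be sketched roughly as follows. This is only an illustration of the keyword-scanning order, not HICNA itself: the Clonal Selection and Immune Network mechanics are omitted, and the group names, keyword lists and function names are hypothetical, since the record does not specify them.

```python
import re
from collections import defaultdict

# Hypothetical keyword lists and group names; the source does not list the real ones.
COMMON_KEYWORDS = {
    "prize": {"win", "won", "prize", "winner"},
    "finance": {"loan", "credit", "cash"},
}
UNCOMMON_KEYWORDS = {
    "prize": {"jackpot", "voucher"},
    "finance": {"refinance", "overdraft"},
}


def match_group(message, keyword_groups):
    """Return the group whose keywords overlap the message most, or None."""
    tokens = set(re.findall(r"[a-z0-9]+", message.lower()))
    best_group, best_hits = None, 0
    for group, keywords in keyword_groups.items():
        hits = len(tokens & keywords)
        if hits > best_hits:
            best_group, best_hits = group, hits
    return best_group


def cluster_messages(messages):
    """Assign each spam message to a group, or queue it for expert review."""
    clusters = defaultdict(list)
    needs_expert = []
    for msg in messages:
        # Phase one: scan using common keywords.
        group = match_group(msg, COMMON_KEYWORDS)
        # Phase two: fall back to uncommon keywords.
        if group is None:
            group = match_group(msg, UNCOMMON_KEYWORDS)
        # Phase three: anything still unassigned is left to expert judgement.
        if group is None:
            needs_expert.append(msg)
        else:
            clusters[group].append(msg)
    return dict(clusters), needs_expert


sample = [
    "You have WON a prize!",
    "Get a cash loan today",
    "Claim your jackpot voucher",
    "Hello, call me back",
]
clusters, needs_expert = cluster_messages(sample)
```

In this sketch the messages that neither keyword list can place are returned separately, which stands in for the manual expert step; a real system would need some way to feed those decisions back into the groups.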