Semantic feature selection for spam filtering
Spam, or unsolicited e-mail, can take the form of advertisements, product promotions, etc. It has become a key problem for e-mail users, and spam filtering has consequently become a major focus of research. In this research, spam filtering is explored based on semantic feature selection. Here, the W...
Main Author: | Azlina, Narawi |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2010 |
Subjects: | T Technology (General) |
Online Access: | http://ir.unimas.my/id/eprint/14865/1/Azlina%20Narawi%20ft.pdf |
id |
my-unimas-ir.14865 |
---|---|
record_format |
uketd_dc |
institution |
Universiti Malaysia Sarawak |
collection |
UNIMAS Institutional Repository |
language |
English |
topic |
T Technology (General) |
description |
Spam, or unsolicited e-mail, can take the form of advertisements, product
promotions, etc. It has become a key problem for e-mail users, and spam filtering
has consequently become a major focus of research. In this research, spam
filtering is explored based on semantic feature selection. A WordNet-based
approach is employed, with statistical approaches used for comparison. To
further enhance the task, another technique using distributed clustering is
proposed for identifying meaningful words for characterization. A series of
experiments was conducted. The results show that the WordNet-based approach
selects more meaningful features than the statistical approaches. The
WordNet-based approach is also able to achieve substantial dimensionality
reduction: 72.9% and 49.2% for the non-spam and spam categories, respectively.
Pruning of features by incorporating distributed clustering enhanced performance
significantly. As a result, a new framework for semantic filtering was proposed,
and distinct features of spam and non-spam e-mail documents were determined. The
promising results show that this approach can be further explored on other
datasets or applications. |
format |
Thesis |
qualification_level |
Master's degree |
author |
Azlina, Narawi |
title |
Semantic feature selection for spam filtering |
granting_institution |
Universiti Malaysia Sarawak |
granting_department |
Faculty of Computer Science and Information Technology |
publishDate |
2010 |
url |
http://ir.unimas.my/id/eprint/14865/1/Azlina%20Narawi%20ft.pdf |
_version_ |
1783728172031803392 |