A stylometry approach for blind linguistic steganalysis model against translation-based steganography
Steganography is the art of hiding information in ways that prevent the detection of a secret message. In Translation-based Steganography (TBS), secret messages are encoded in the “noise” produced by the automated translation of natural language text. The adversarial technique to extract the secret me...
Main Author: | Mohd Lokman, Syiham |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2023 |
Subjects: | T Technology (General) |
Online Access: | http://eprints.uthm.edu.my/10995/1/24p%20SYIHAM%20MOHD%20LOKMAN.pdf http://eprints.uthm.edu.my/10995/2/SYIHAM%20MOHD%20LOKMAN%20COPYRIGHT%20DECLARATION.pdf http://eprints.uthm.edu.my/10995/3/SYIHAM%20MOHD%20LOKMAN%20WATERMARK.pdf |
id |
my-uthm-ep.10995 |
---|---|
record_format |
uketd_dc |
spelling |
my-uthm-ep.10995 2024-05-20T01:36:54Z A stylometry approach for blind linguistic steganalysis model against translation-based steganography 2023-02 Mohd Lokman, Syiham T Technology (General) 2023-02 Thesis http://eprints.uthm.edu.my/10995/ http://eprints.uthm.edu.my/10995/1/24p%20SYIHAM%20MOHD%20LOKMAN.pdf text en public http://eprints.uthm.edu.my/10995/2/SYIHAM%20MOHD%20LOKMAN%20COPYRIGHT%20DECLARATION.pdf text en staffonly http://eprints.uthm.edu.my/10995/3/SYIHAM%20MOHD%20LOKMAN%20WATERMARK.pdf text en validuser mphil masters Universiti Tun Hussein Onn Malaysia Fakulti Sains Komputer dan Teknologi Maklumat |
institution |
Universiti Tun Hussein Onn Malaysia |
collection |
UTHM Institutional Repository |
language |
English |
topic |
T Technology (General) |
description |
Steganography is the art of hiding information in ways that prevent the detection of a secret message. In Translation-based Steganography (TBS), secret messages are encoded in the “noise” produced by the automated translation of natural language text. The adversarial technique to extract the secret message is called steganalysis, which can be categorized into two types: targeted and blind. While targeted steganalysis is designed to attack a specific embedding algorithm, blind steganalysis uses features extracted or selected from the medium to detect anomalies indicating that secret data may have been embedded within the medium. However, the accuracy of blind steganalysis algorithms depends heavily on the features selected from the input data, especially when attacking embedding techniques in TBS. This thesis explores the potential of using stylometry, or linguistic style, to improve the representation of word-distribution characteristics when distinguishing stego text from cover text for TBS. This is because all translated text in TBS has intrinsic structural styles that can be used to improve the performance of a blind steganalysis model. The proposed stylometry-based blind steganalysis model consists of two stages: stylometric feature selection and classification. The proposed stylometric features, selected from a set of cover texts, are categorized into two groups, lexical and syntactic features, before being fed into the model, which uses a Support Vector Machine (SVM) as the classifier. The performance of the stylometry-based blind steganalysis model is then evaluated in terms of false rate, missing rate and accuracy rate, and compared against three other standard classifiers in steganalysis: Naive Bayes (NB), k-Nearest Neighbor (k-NN), and Decision Tree (J48). The results showed that the stylometric features improve a blind steganalysis model by giving higher detection performance. Meanwhile, SVM is the best classifier for stego text detection, with significantly low processing time. |
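The two-stage pipeline and the three evaluation measures named in the description can be illustrated with a minimal Python sketch. This record does not list the thesis's actual features or formulas, so everything here is an assumption: the four style measures are common stylometric choices, `FUNCTION_WORDS` is a hypothetical word list, and "false rate" and "missing rate" are read as the false-alarm rate (cover flagged as stego) and the miss rate (stego passed as cover). The resulting feature vectors would then be passed to a classifier such as an SVM, which is left out to keep the sketch dependency-free.

```python
import re

# Hypothetical function-word list; the record does not give the thesis's own.
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "that", "it"}


def stylometric_features(text):
    """Return a few lexical and syntactic style measures for one text."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words or not sentences:
        raise ValueError("text must contain at least one word")
    return {
        # Lexical measures
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "type_token_ratio": len(set(words)) / len(words),
        # Rough syntactic measures
        "avg_sentence_length": len(words) / len(sentences),
        "function_word_ratio": sum(w in FUNCTION_WORDS for w in words) / len(words),
    }


def detection_rates(y_true, y_pred):
    """Compute false rate, missing rate and accuracy (1 = stego, 0 = cover)."""
    pairs = list(zip(y_true, y_pred))
    false_alarms = sum(1 for t, p in pairs if t == 0 and p == 1)
    misses = sum(1 for t, p in pairs if t == 1 and p == 0)
    covers = sum(1 for t in y_true if t == 0)
    stegos = sum(1 for t in y_true if t == 1)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "false_rate": false_alarms / covers if covers else 0.0,
        "missing_rate": misses / stegos if stegos else 0.0,
        "accuracy": correct / len(pairs) if pairs else 0.0,
    }
```

Calling `stylometric_features` on each cover and stego text yields one feature dictionary per document, and `detection_rates` can be applied to the predicted labels of any of the classifiers the description compares.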
format |
Thesis |
qualification_name |
Master of Philosophy (M.Phil.) |
qualification_level |
Master's degree |
author |
Mohd Lokman, Syiham |
title |
A stylometry approach for blind linguistic steganalysis model against translation-based steganography |
title_sort |
stylometry approach for blind linguistic steganalysis model against translation-based steganography |
granting_institution |
Universiti Tun Hussein Onn Malaysia |
granting_department |
Fakulti Sains Komputer dan Teknologi Maklumat |
publishDate |
2023 |
url |
http://eprints.uthm.edu.my/10995/1/24p%20SYIHAM%20MOHD%20LOKMAN.pdf http://eprints.uthm.edu.my/10995/2/SYIHAM%20MOHD%20LOKMAN%20COPYRIGHT%20DECLARATION.pdf http://eprints.uthm.edu.my/10995/3/SYIHAM%20MOHD%20LOKMAN%20WATERMARK.pdf |
_version_ |
1804890133162360832 |