Multitasking deep neural network models for Arabic dialect sentiment analysis

Polarity classification, or sentiment analysis, is an opinion mining task that distinguishes between the polarity categories of opinions (two, three, or five), reflecting the degree of sentiment (such as positive and negative for two polarities; and positive, neutral...

Bibliographic Details
Main Author:	Alali, Muath Mohammad Oqlah
Format:	Thesis
Language:	English
Published:	2022
Subjects:	Arabic language - Dialects; Text processing (Computer science); Deep learning (Machine learning)
Online Access:	http://psasir.upm.edu.my/id/eprint/113149/1/113149%20UPM.pdf

id	my-upm-ir.113149
record_format	uketd_dc
institution	Universiti Putra Malaysia
collection	PSAS Institutional Repository
language	English
advisor	Mohd Sharef, Nurfadhlina
topic	Arabic language - Dialects; Text processing (Computer science); Deep learning (Machine learning)
description	Polarity classification, or sentiment analysis, is an opinion mining task that distinguishes between the polarity categories of opinions (two, three, or five), reflecting the degree of sentiment the text may contain (for example, positive and negative for two polarities; positive, neutral, and negative for three polarities). Few deep neural network approaches have been applied to this task for Arabic dialects (AD). Traditional machine learning (ML) algorithms, meanwhile, rely on manually extracted features, which is tedious and time-consuming because Arabic has multiple dialects and no fixed word order. Extracting features such as syntactic and lexical information is therefore more challenging for AD. According to the literature review, the Convolutional Neural Network (CNN) is the most widely used deep learning model for Arabic sentiment analysis and has the best reported performance. Existing convolutional models use wide convolution with a shallow structure, which gives less uniform importance to the features, cannot represent the entire sentiment information in a text sequence, and leads to poor sentiment detection. Therefore, a Narrow Convolutional Neural Network (NCNN) is proposed to extract comprehensive sentiment information from a text sequence by maximizing the feature detection range, which gives the words large, uniform importance and improves the final performance on Arabic dialect classification tasks (two and three polarities). NCNN achieves its best performance when built from three convolutional layers. A sensitivity analysis evaluates the impact of various combinations of NCNN structural hyperparameters, such as the pooling size, the filter size, and the number of convolutional filters, on classification performance.
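The contrast between wide and narrow convolution drawn above can be illustrated with a minimal sketch. It assumes the common usage of the terms: a narrow convolution applies the filter only where it fully overlaps the sequence, while a wide convolution zero-pads the sequence so the filter also covers the edges. Whether the thesis defines NCNN in exactly this way is an assumption, and the function name `conv1d` and the toy values are hypothetical.

```python
def conv1d(sequence, kernel, mode="narrow"):
    """Slide `kernel` over `sequence` and return the list of dot products.

    mode="narrow": the kernel is applied only where it fully overlaps the
    sequence, so the output has len(sequence) - len(kernel) + 1 values.
    mode="wide": the sequence is zero-padded by len(kernel) - 1 on both
    sides, so the output has len(sequence) + len(kernel) - 1 values.
    """
    k = len(kernel)
    values = list(sequence)
    if mode == "wide":
        pad = [0.0] * (k - 1)
        values = pad + values + pad
    elif mode != "narrow":
        raise ValueError("mode must be 'narrow' or 'wide'")
    if len(values) < k:
        raise ValueError("sequence is shorter than the kernel")
    return [
        sum(w * x for w, x in zip(kernel, values[i:i + k]))
        for i in range(len(values) - k + 1)
    ]


# Hypothetical one-dimensional "embeddings" for a five-token sentence.
tokens = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [0.5, 0.5, 0.5]  # a filter spanning three tokens

narrow = conv1d(tokens, kernel, mode="narrow")
wide = conv1d(tokens, kernel, mode="wide")
print(len(narrow), narrow)  # 3 outputs, each computed from three real tokens
print(len(wide), wide)      # 7 outputs, the edge ones computed partly from padding
```

In the narrow case every output position is computed from real tokens only, whereas the outermost wide outputs are dominated by padding zeros.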
The proposed NCNN achieved a higher macro average recall (R), outperforming Naive Bayes (NB) on task A (three polarities) and the Voting model on task B (two polarities) on the SemEval-2017 Arabic dialect Twitter dataset. In addition, the NCNN model outperforms CNN-ASWAR on the Arabic Sentiment Tweets Dataset (ASTD) with a higher F1-score. Negation words play a significant role in Arabic sentiment analysis (SA), since they may reverse the context of a sentence. So far, there has been no effort to handle the negation context in Arabic using a deep neural network. Existing approaches are based on traditional machine learning algorithms, such as the support vector machine (SVM); they do not consider Arabic dialect negation words, and they rely on domain-specific features and lexicons, which might not work in other domains. The ordinal (five polarities) classification problem has also received attention in Arabic sentiment analysis. Most of the applied approaches use single task learning (STL) with machine learning algorithms, such as Logistic Regression (LR) and a Hierarchical Classifier (HC) based on the divide-and-conquer approach. However, these approaches use simple sentence representations, and because they are single-task they cannot learn the relatedness between tasks (cross-task transfer) or model several polarity schemes jointly, such as three and five polarities. Therefore, a model called Multi-Tasking Learning based on Convolutional Hierarchical Attention Neural Network (MTL-CHAN) is proposed, comprising (i) a shared word encoder and word attention networks across classification tasks, and (ii) task-specific layers with convolutional neural network-based attention (CNNA) at the sentence level. The model is designed to handle explicit Arabic negation words and to improve classification performance by training the Arabic classification tasks (binary, ternary, and five-class) jointly.
The experimental results showed strong performance for the proposed MTL-CHAN model, with accuracies of 89.85%, 84.69%, and 85.90% on the HARD, LABR, and BRAD datasets, respectively, and higher macro average recall (R) of 0.680 and 0.810 on tasks A and B of the Twitter Arabic dialects dataset, respectively. The proposed model also achieved higher accuracies of 95.25%, 87.75%, 86.01%, and 90.95% on the Hotel, Product, Movie, and Restaurant datasets, respectively.
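The Twitter results above are reported as macro average recall (R) rather than accuracy. As a minimal sketch of the standard definition (the unweighted mean of per-class recall), the following shows how the two measures can diverge on imbalanced data. The labels and predictions are hypothetical and are not taken from the thesis; `macro_average_recall` and `accuracy` are illustrative helper names.

```python
def macro_average_recall(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    recalls = []
    for label in sorted(set(y_true)):
        # Number of gold examples of this class, and how many were recovered.
        support = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls.append(hits / support)
    return sum(recalls) / len(recalls)


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the gold label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical three-polarity example with an over-represented positive class.
y_true = ["pos"] * 6 + ["neu"] * 2 + ["neg"] * 2
y_pred = ["pos"] * 6 + ["neu", "pos"] + ["neg", "pos"]

print(accuracy(y_true, y_pred))              # 8 of 10 correct
print(macro_average_recall(y_true, y_pred))  # mean of 1.0, 0.5 and 0.5
```

Because each class contributes equally to the average, a classifier that favours the majority class scores lower on macro average recall than on accuracy.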
format	Thesis
qualification_level	Doctorate
author	Alali, Muath Mohammad Oqlah
title	Multitasking deep neural network models for Arabic dialect sentiment analysis
granting_institution	Universiti Putra Malaysia
publishDate	2022
url	http://psasir.upm.edu.my/id/eprint/113149/1/113149%20UPM.pdf
_version_	1818586142951342080
spelling	my-upm-ir.113149 2024-10-28T02:53:32Z Multitasking deep neural network models for Arabic dialect sentiment analysis 2022-08 Alali, Muath Mohammad Oqlah Arabic language - Dialects Text processing (Computer science). Deep learning (Machine learning). Thesis http://psasir.upm.edu.my/id/eprint/113149/ http://psasir.upm.edu.my/id/eprint/113149/1/113149%20UPM.pdf text en public doctoral Universiti Putra Malaysia Mohd Sharef, Nurfadhlina


Similar Items