Sentiment classification for Malay newspaper using clonal selection algorithm / Nur Fitri Nabila Mohamad Nasir

Bibliographic Details
Main Author: Mohamad Nasir, Nur Fitri Nabila
Format: Thesis
Language: English
Published: 2013
Subjects:
Online Access: https://ir.uitm.edu.my/id/eprint/35333/1/35333.pdf
Description
Summary: Sentiment classification is a technique for analyzing the subjective information in a text and mining the opinion it expresses. Most researchers collect sentiment data from blogs or Twitter; newspapers are rarely used as the main source. In this study, a sentiment classifier using the clonal selection algorithm (CSA) was developed to categorize sentiment in a Malay newspaper (Berita Harian). A second objective was to evaluate the effectiveness of the proposed model in classifying Malay newspaper data. In our method, the CSA is first trained on the collected data to categorize the sentiment of newspaper sentences by polarity (positive, negative, or neutral); testing is then carried out after training to check whether the words were learned correctly. First, the 1000 sentences were divided in an 80:20 ratio, with 80% (800 sentences) used for training and 20% (200 sentences) for testing. Second, the data were divided in a 70:30 ratio, with 700 newspaper sentences as training data and 300 as testing data. The experimental results show that our method can achieve better performance in CSA-based sentiment classification, and that the collected data cannot all be used at once in this model because training on all of it is very time-consuming. The experiment achieved its best accuracy, 89.0%, with the 70:30 ratio. The model was built to help users classify newspaper sentences easily.
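
The two train/test splits described in the summary can be illustrated with a short sketch. This is not the thesis's code: the function name `split_sentences`, the placeholder sentences, and the seeded shuffle before splitting are all assumptions made for illustration, since the summary states only the ratios and the total of 1000 sentences.

```python
import random


def split_sentences(sentences, train_percent, seed=0):
    """Split a list into training and testing parts.

    Hypothetical helper: shuffling before the split is an assumption,
    the summary does not say how sentences were assigned to each part.
    """
    shuffled = list(sentences)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(shuffled) * train_percent // 100
    return shuffled[:cut], shuffled[cut:]


# Placeholder data standing in for the 1000 newspaper sentences.
sentences = [f"sentence {i}" for i in range(1000)]

for train_percent in (80, 70):
    train, test = split_sentences(sentences, train_percent)
    print(f"{train_percent}:{100 - train_percent} ->",
          len(train), "training,", len(test), "testing")
```

With 1000 sentences, the 80:20 split yields 800 training and 200 testing sentences, and the 70:30 split yields 700 and 300, matching the counts given in the summary.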