Sentiment analysis for Malay newspaper (SAMNews) using negative selection algorithm / Nur Amalina Redzuan
Main Author: | Nur Amalina Redzuan |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2013 |
Subjects: | |
Online Access: | https://ir.uitm.edu.my/id/eprint/35332/1/35332.pdf |
Summary: | Newspapers express sentiment every day when reporting on recent events. They can also be regarded as a new textual domain for sentiment analysis, one that deals with many suggestions. Newspaper text is written in long sentences, but the content that expresses sentiment is compressed and is clearly understood by human readers. For machine learning text representation, however, it causes problems because the text is noisy. As a solution, this project aims to determine the polarity of sentiment in newspaper sentences. The project is implemented in five methodology phases: background study, data collection and preparation, prototype design, prototype development, and evaluation and documentation. Sentiment Analysis for Malay Newspaper (SAMNews) is constructed using the negative selection algorithm, which automatically classifies the sentiment of newspaper sentences by polarity (positive, negative or neutral) based on detector words. The sentiment analysis in this project uses 1000 newspaper sentences, split into training data for the training and classification phase and testing data for evaluating the average accuracy. The evaluation comprises three experiments. Experiment I uses 700 newspaper sentences as training data and 300 as testing data, with an accuracy of about 59.99%. Experiment II uses 800 newspaper sentences as training data and 200 as testing data, with a reported accuracy of about 58.58%. Experiment III uses 900 newspaper sentences as training data and 100 as testing data, and the accuracy improves to 65.81%. In future, a comparative study of Artificial Immune System and other techniques or algorithms can be carried out to enhance the performance of the classification model. |
---|
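
The summary describes classifying sentence polarity with detector words produced by a negative selection algorithm, then measuring accuracy on held-out sentences. The record does not say how the thesis generates or matches detectors, so the following is only a minimal sketch under stated assumptions: candidate detectors are the words seen under one polarity, the "self" set is the vocabulary of the other polarities, and a sentence with no matching detector falls back to neutral. The function names and the toy Malay sentences are invented for illustration and are not taken from the thesis data.

```python
from collections import Counter


def tokenize(sentence):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return sentence.lower().split()


def train_detectors(labelled_sentences):
    """Build one detector set per polarity by negative selection.

    Assumption for this sketch: candidates for a polarity are the words seen
    in its training sentences, and any candidate that also occurs under
    another polarity (the "self" set) is censored.
    """
    vocab = {}
    for sentence, polarity in labelled_sentences:
        vocab.setdefault(polarity, set()).update(tokenize(sentence))

    detectors = {}
    for polarity, candidates in vocab.items():
        self_set = set()
        for other, words in vocab.items():
            if other != polarity:
                self_set |= words
        # Keep only candidates that do not match anything in the self set.
        detectors[polarity] = candidates - self_set
    return detectors


def classify(sentence, detectors, default="neutral"):
    """Pick the polarity whose detectors fire most often on the sentence."""
    words = set(tokenize(sentence))
    hits = Counter({p: len(words & d) for p, d in detectors.items()})
    if not hits:
        return default
    polarity, count = hits.most_common(1)[0]
    return polarity if count > 0 else default


def accuracy(test_sentences, detectors):
    """Fraction of test sentences whose predicted polarity equals the label."""
    correct = sum(classify(s, detectors) == label for s, label in test_sentences)
    return correct / len(test_sentences)


# Invented toy data, standing in for a train/test split such as 700/300.
train = [
    ("ekonomi negara bertambah baik", "positive"),
    ("banjir buruk melanda negara", "negative"),
    ("mesyuarat diadakan hari ini", "neutral"),
]
test = [
    ("ekonomi baik", "positive"),
    ("banjir buruk", "negative"),
    ("cuaca cerah", "neutral"),
    ("negara maju", "positive"),
]

detectors = train_detectors(train)
score = accuracy(test, detectors)
```

In this toy run the word "negara" appears under both the positive and the negative label, so it is censored and cannot act as a detector. That is why the last test sentence falls back to neutral and counts as an error.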