Comparison of different automatic text summarization systems using standard performance evaluations

Bibliographic Details
Main Author: Abd Munir, Nur Hafizah
Format: Thesis
Language: English
Published: 2009
Online Access: http://eprints.utm.my/id/eprint/18202/1/NurhafizahAbdMunirMFSKSM2009.pdf
Summary: Many automatic summarization systems can be used to produce a summary of a single text document. Different systems produce summaries with different content even when the summary length, expressed as a percentage of the sentences in the document, is set to the same value. In this study, three automatic summarization systems are therefore used to produce summaries: Microsoft Word Automatic Summarization, Shvoong Summarization and Simple Text Summarization in PHP. The resulting summaries are evaluated using standard performance measures, namely recall, precision and F-measure. The dataset was collected from the online editions of the New Straits Times and The Star and consists of articles about the Iskandar Regional Development Authority (IRDA). Microsoft Word Automatic Summarization and Shvoong Summarization are existing systems; only Simple Text Summarization in PHP was coded, in the PHP language. The coded system applies several operations, including stop-word removal, stemming, normalization, term-frequency weighting and application of the summarization technique. The results from all three systems are stored in a database. About 50 articles are used in the study. The systems are compared using the standard performance measures, and the evaluation is carried out entirely automatically, without relying on human evaluators. A program written in Perl analyzes the performance and produces statistics for all the summary results from the three systems. From the experimental results, it is concluded that Shvoong Summarization is the most effective of the three automatic summarization systems for single text documents.
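
The recall, precision and F-measure named in the summary can be illustrated with a minimal Python sketch. It is not the thesis's Perl program, and the record does not say how the reference summaries were obtained or which unit was compared. The sketch assumes that sentences are the unit of comparison and that a reference set of sentences exists for each article; the function name `evaluate_summary` and the sentence labels are hypothetical.

```python
def evaluate_summary(system_sentences, reference_sentences):
    """Return (precision, recall, f_measure) for one summary.

    Assumption: both arguments are collections of sentence identifiers
    (or sentence strings), and a sentence counts as correct when it
    appears in both the system summary and the reference summary.
    """
    system = set(system_sentences)
    reference = set(reference_sentences)
    overlap = len(system & reference)

    # Precision: share of the system's sentences that are in the reference.
    precision = overlap / len(system) if system else 0.0
    # Recall: share of the reference sentences that the system selected.
    recall = overlap / len(reference) if reference else 0.0

    # F-measure: harmonic mean of precision and recall (0 when both are 0).
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical example: the system picks 4 sentences, 2 of which
# are among the 3 reference sentences.
p, r, f = evaluate_summary(["s1", "s2", "s3", "s4"], ["s2", "s3", "s5"])
print(f"precision={p:.3f} recall={r:.3f} f-measure={f:.3f}")
```

Averaging these three values over all articles for each system would give one way to rank the systems, although the record does not state which aggregation the study used.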