Automated sentence boundary detection for spontaneous speech in Malay language / Muhammad Izzad Ramli

Sentence boundary detection (SBD), also known as sentence breaking, decides where sentences begin and end. Sentence boundary detection is necessary in many applications, such as speech summarization, video summarization, and speech document indexing and retrieval. This research describes sentence bound...

Bibliographic Details
Main Author: Ramli, Muhammad Izzad
Format: Thesis
Language: English
Published: 2013
Subjects:
Online Access: https://ir.uitm.edu.my/id/eprint/47023/1/47023.pdf
id my-uitm-ir.47023
record_format uketd_dc
spelling my-uitm-ir.47023 2022-07-07T03:49:08Z Response surfaces (Statistics) Evolutionary programming (Computer science). Genetic algorithms Web databases
institution Universiti Teknologi MARA
collection UiTM Institutional Repository
language English
advisor Jamil, Nursuriati (Assoc. Prof. Dr.)
topic Response surfaces (Statistics)
Evolutionary programming (Computer science). Genetic algorithms
Web databases
description Sentence boundary detection (SBD), also known as sentence breaking, decides where sentences begin and end. It is necessary in many applications, such as speech summarization, video summarization, and speech document indexing and retrieval. This research describes sentence boundary detection in spontaneous Malay-language spoken audio. Spontaneous speech is speech that is not planned or arranged beforehand. Speech studies on spontaneous Malay are still lacking, and no work has been done on sentence boundary detection. Previous studies showed that combining linguistic and acoustic approaches to sentence boundary detection gives better results than using either approach alone. However, because a linguistic model for Malay is not yet available, only the acoustic approach is used for Malay sentence boundary detection. Therefore, a combination of prosodic features with volume features and rate-of-speech (ROS) is proposed for sentence boundary detection in spontaneous speech. The data are spontaneous speeches from the Malaysian Parliament Hansard Document (MPHD). Experiments are conducted on 42 minutes of spontaneous Malay speech comprising 6,413 speech and non-speech segments. The non-speech segments are then selected as candidates for the sentence boundary detection experiments. The accuracy achieved by the proposed speech and non-speech detection method is 97.8%, and the accuracy of sentence boundary detection is 100%, with 19.44% false alerts. As an outcome, the proposed method, which fuses prosodic features, volume, and rate-of-speech (ROS) with AdaBoost, detects and labels sentence boundaries automatically.
format Thesis
qualification_level Master's degree
author Ramli, Muhammad Izzad
title Automated sentence boundary detection for spontaneous speech in Malay language / Muhammad Izzad Ramli
granting_institution Universiti Teknologi MARA
granting_department Faculty of Computer and Mathematical Sciences