Morphological System For Under-Resourced Languages Using Hybrid Approach

Bibliographic Details
Main Author: Saee, Suhaila
Format: Thesis
Published: 2016
id my-mmu-ep.7188
record_format uketd_dc
spelling my-mmu-ep.7188 2018-07-13T16:47:42Z Morphological System For Under-Resourced Languages Using Hybrid Approach 2016-10 Saee, Suhaila QP Physiology Thesis http://shdl.mmu.edu.my/7188/ http://library.mmu.edu.my/diglib/onlinedb/dig_lib.php phd doctoral Multimedia University Faculty of Computing and Informatics
institution Multimedia University
collection MMU Institutional Repository
topic QP Physiology
description Computational morphology covers the automatic analysis of words (recognition of their internal structure) and their generation (formation of a word). As such, it is an essential step in many Natural Language Processing (NLP) applications. Over the last thirty years, computational morphology has been dominated by the finite-state approach, which uses finite-state transducers as its internal representation to describe morphological information at the lexical and surface levels. Finite-state morphology has been claimed to be language-independent and capable of handling morphologically complex languages such as Finnish, Turkish, and Arabic, as well as under-resourced languages (U-RL). However, it requires a large amount of linguistic resources, namely morphological rules, a lexicon, and language experts, which limits its applicability. These limitations give rise to three issues for U-RL in computational morphology: morphological data acquisition, language morphology representation, and general rule formalism. Hence, a morphological system that can overcome these issues is needed, especially when dealing with U-RL. This research highlights two main issues: i) a workflow for a morphological system that can be used with U-RL, and ii) an internal representation of morphological information that complies with the selected framework, the Structured String Tree Correspondence (SSTC). The aim of this research is to propose a new Structured String Tree Correspondence+Morphology (SSTC+M) framework for constructing a morphological system for U-RL. The proposed framework has three primary levels: morphological data acquisition (level 1), morphology theory adaptation (level 2), and computational morphology adaptation (level 3). The output of each level is the input to the next level.
format Thesis
qualification_name Doctor of Philosophy (PhD.)
qualification_level Doctorate
author Saee, Suhaila
title Morphological System For Under-Resourced Languages Using Hybrid Approach
granting_institution Multimedia University
granting_department Faculty of Computing and Informatics
publishDate 2016