Transformation of Extracted Knowledge in Malay Unstructured Documents Into an Interrogative Structured Form

Bibliographic Details
Main Author: Sidi, Fatimah
Format: Thesis
Language: English
Published: 2007
Subjects: Knowledge acquisition (Expert systems); Databases
Online Access: http://psasir.upm.edu.my/id/eprint/5887/1/FSKTM_2007_10%20IR.pdf
id my-upm-ir.5887
record_format uketd_dc
institution Universiti Putra Malaysia
collection PSAS Institutional Repository
language English
advisor Selamat, Mohd Hasan
topic Knowledge acquisition (Expert systems)
Databases

description Knowledge discovery operations help to extract valuable information and knowledge from large volumes of data in structured databases. However, a large portion of the available information is not in structured form but exists as collections of unstructured text documents, and this also applies to Malay unstructured documents. Therefore, structuring characteristics must be imposed on unstructured documents in order to transform the information they contain into knowledge. A new approach has been established to transform knowledge extracted from Malay unstructured documents by identifying, organizing, and structuring it into an interrogative structured form. Its architecture is based on the implementation of (i) interrogative knowledge identification; (ii) interrogative contextual information; and (iii) interrogative knowledge organization and structuring with Malay knowledge representation by concepts. It utilizes the Malay language corpus, interrogative theory, and object-oriented, ontology, and database models. The research involves system development based on the architecture of the MalayIK-Ontology, whose quantitative retrieval performance is measured using the recall and precision metrics. The Retrieval Interrogative Ontology Analysis Application was developed to verify fitness for task in terms of functionality and the usefulness of interrogative contextual information with a color coding supplement, additional information annotation, and Malay knowledge representation by concepts. A number of experiments were carried out to quantify the accuracy of the knowledge extracted. The MalayIK-Ontology was tested using stratified random sampling drawn from various sources of Malay unstructured documents such as news, e-mails, articles, magazines, and texts from children's story books. The results of the experiments showed that the MalayIK-Ontology approach performed well compared with knowledge extracted manually by an expert. The results of the questionnaire evaluation of the Retrieval Interrogative Ontology Analysis Application showed good achievement in understanding the main point of an unstructured document easily and clearly. The aim is to improve understanding of the process of making sense of information as knowledge, maintaining the meaning of the information, and supporting a consistent interpretation of the same knowledge in an unstructured document by different people.
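The abstract reports retrieval performance in terms of recall and precision against knowledge extracted manually by an expert. As a minimal sketch of how those two metrics are conventionally computed, and not the thesis's own implementation, the following Python compares a set of system-extracted items with an expert reference set. The function name `precision_recall`, the "interrogative: answer" string format, and the sample data are all hypothetical and chosen only for illustration.

```python
def precision_recall(extracted, reference):
    """Return (precision, recall) of extracted items against a reference set.

    precision = correct extracted items / all extracted items
    recall    = correct extracted items / all reference items
    """
    extracted, reference = set(extracted), set(reference)
    true_positives = len(extracted & reference)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical data: items an expert extracted by hand versus items a
# system extracted, keyed by Malay interrogative words (siapa = who,
# di mana = where, bila = when, apa = what).
expert = {"siapa: Ali", "di mana: Kuala Lumpur", "bila: semalam", "apa: mesyuarat"}
system = {"siapa: Ali", "di mana: Kuala Lumpur", "bila: hari ini"}

p, r = precision_recall(system, expert)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this made-up case two of the three system items match the expert's four, so precision is 2/3 and recall is 1/2. Exact string matching is an assumption of the sketch; the record does not say how matches were judged in the experiments.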
format Thesis
qualification_level Doctorate
author Sidi, Fatimah
title Transformation of Extracted Knowledge in Malay Unstructured Documents Into an Interrogative Structured Form
granting_institution Universiti Putra Malaysia
granting_department Computer Science and Information Technology
publishDate 2007
url http://psasir.upm.edu.my/id/eprint/5887/1/FSKTM_2007_10%20IR.pdf