Enhance efficiency of answering XML keyword query using incompact structure of MCCTree

People today live a cyber life in which everything can be done by typing on a keyboard and letting the system complete the process. Because the interaction takes place online, data sharing is the most important service for sending and delivering information. Extensible Markup Language (XML) has been chosen...

Bibliographic Details
Main Author: Sazaly, Ummu Sulaim
Format: Thesis
Language: English
Published: 2012
Subjects: XML (Document markup language); Keyword searching
Online Access: http://psasir.upm.edu.my/id/eprint/38635/1/FSKTM%202013%203.pdf
id my-upm-ir.38635
record_format uketd_dc
spelling my-upm-ir.38635 2024-08-30T07:36:36Z 2012-11 Thesis http://psasir.upm.edu.my/id/eprint/38635/ text en public masters Universiti Putra Malaysia
institution Universiti Putra Malaysia
collection PSAS Institutional Repository
language English
advisor Selamat, Mohd Hasan
topic XML (Document markup language)
Keyword searching

description People today live a cyber life in which everything can be done by typing on a keyboard and letting the system complete the process. Because the interaction takes place online, data sharing is the most important service for sending and delivering information. Extensible Markup Language (XML) has been chosen as the most important data-sharing medium because it is easy for both humans and machines to interpret. Because of its importance, many studies have sought to improve the effectiveness of retrieving information from XML files, and many notions and techniques have been introduced, especially for query processing. The notions of Compact Lowest Common Ancestor (CLCA) and Maximal Compact Lowest Common Ancestor (MCLCA), implemented in the algorithms CGTreeGenerator and MCCTreeGenerator, have been shown to return accurate results for XML keyword queries. CGTreeGenerator compacts the XML tree by eliminating irrelevant nodes based on the CLCA notion, producing a Compact Global Tree (CGTree). MCCTreeGenerator then uses the CGTree to select a subtree, called a Maximal Compact Connected Tree (MCCTree), as the query result based on the MCLCA notion. However, the MCCTree cannot be used directly by the associated ranking method, because the ranking calculation relies on the structure of the subtree as it was before compaction. If the result cannot be used directly by the ranking method, the algorithm contains an ineffective process. Moreover, if that process requires re-examining the original tree, the efficiency of the algorithm is reduced. This study responds to these weaknesses by proposing a new algorithm, XMCCTreeGenerator, to improve on the efficiency of CGTree-MCCTreeGenerator. The study identifies the effective processes needed to produce an XML query result using the MCLCA notion without compacting the tree. These processes make up the XMCCTreeGenerator algorithm, which produces the same subtree as the MCCTree but with a different structure.
This returned subtree, called the Extended MCCTree (XMCCTree), can be used directly by the ranking method because it has an incompact (non-compacted) structure. The experiment uses XML datasets from the XML Data Repository on the University of Washington's website. Two files with different data structures are selected and divided into three size ranges. Keywords are selected manually at random from the files, and queries are run with three to five keywords. Two prototypes are developed, one implementing CGTree-MCCTreeGenerator and one implementing XMCCTreeGenerator. Because this study focuses on the efficiency of the algorithm, the elapsed time of each execution is recorded. In conclusion, the proposed XMCCTreeGenerator is more efficient than the earlier CGTree-MCCTreeGenerator in answering XML keyword queries using MCLCA.
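The difference between a compacted and an incompact result subtree, as described in the abstract, can be illustrated with a small Python sketch. It is not the thesis's CGTreeGenerator, MCCTreeGenerator, or XMCCTreeGenerator: the record does not define CLCA or MCLCA, so the sketch substitutes a plain lowest common ancestor of all keyword matches, and the sample document, function names, and substring matching rule are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the thesis datasets.
SAMPLE_XML = (
    "<library>"
    "<book><title>XML Search</title><author>Lee</author><year>2010</year></book>"
    "<book><title>Databases</title><author>Tan</author><year>2011</year></book>"
    "</library>"
)

def build_parent_map(root):
    """Map every element to its parent (ElementTree stores no parent links)."""
    return {child: parent for parent in root.iter() for child in parent}

def path_from_root(node, parents):
    """Return the list of elements from the root down to node."""
    path = [node]
    while node in parents:
        node = parents[node]
        path.append(node)
    return path[::-1]

def find_matches(root, keyword):
    """Elements whose text contains the keyword (assumed matching rule)."""
    kw = keyword.lower()
    return [el for el in root.iter() if kw in (el.text or "").lower()]

def lowest_common_ancestor(nodes, parents):
    """Plain LCA of the given nodes: a stand-in for CLCA/MCLCA."""
    paths = [path_from_root(n, parents) for n in nodes]
    lca = None
    for level in zip(*paths):
        if all(n is level[0] for n in level):
            lca = level[0]
        else:
            break
    return lca

def compact_copy(node, keep):
    """Copy node, dropping every child that is not on a path to a match."""
    copy = ET.Element(node.tag, node.attrib)
    copy.text = node.text
    for child in node:
        if child in keep:
            copy.append(compact_copy(child, keep))
    return copy

root = ET.fromstring(SAMPLE_XML)
parents = build_parent_map(root)
hits = [m for kw in ["xml", "lee"] for m in find_matches(root, kw)]
lca = lowest_common_ancestor(hits, parents)

# Incompact result: the subtree exactly as it appears in the original tree.
print([child.tag for child in lca])  # ['title', 'author', 'year']

# Compacted result: nodes off the paths to the matches are eliminated.
keep = {n for h in hits for n in path_from_root(h, parents)}
print([child.tag for child in compact_copy(lca, keep)])  # ['title', 'author']
```

In this toy case the compacted copy loses the year element, so any ranking step that depends on the original subtree structure would have to look it up again in the source tree, which is the kind of extra work the abstract says the incompact structure avoids.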
format Thesis
qualification_level Master's degree
author Sazaly, Ummu Sulaim
title Enhance efficiency of answering XML keyword query using incompact structure of MCCTree
granting_institution Universiti Putra Malaysia
publishDate 2012
url http://psasir.upm.edu.my/id/eprint/38635/1/FSKTM%202013%203.pdf