A comparative study of instance-based schema matching in relational database /
Main Author: | |
---|---|
Format: | Thesis |
Language: | English |
Published: | Kuala Lumpur : Kulliyyah of Information and Communication Technology, International Islamic University Malaysia, 2017 |
Subjects: | |
Online Access: | http://studentrepo.iium.edu.my/handle/123456789/5655 |
Summary: | Schema matching is deemed an indispensable process for database integration in many contemporary database systems. The aim of schema matching is to identify correspondences across schemas, which ultimately serves the data integration process. The main concern in data integration is to support merging decisions by establishing correspondences among attributes of heterogeneous data sources. Numerous schema matching techniques that use database instances to detect correspondences between attributes have been suggested in the literature. However, no single technique provides an accurate and comprehensive match for different types of data. Some techniques treat numeric values as strings, which adversely affects the match and, in turn, the quality of the match results. Likewise, other techniques tend to treat textual instances as numeric, which might reduce the accuracy of the match. Thus, this thesis investigates the performance of two different instance-based schema matching techniques. The study emphasizes exploring the strengths and weaknesses of each technique over various types of data sets. It focuses on developing a syntactic instance-based schema matching technique, named Regular Expression (RegEx), with the WordNet database, while Google similarity is selected as the semantic instance-based schema matching technique. Both methods have been evaluated over three different data types, namely: (i) numeric, (ii) alphabetic, and (iii) mixed. Several analyses have been performed on real and synthetic data sets to examine match accuracy with respect to precision (P), recall (R) and F-measure (F). |
---|---|
Physical Description: | xi, 104 leaves : illustrations ; 30cm. |
Bibliography: | Includes bibliographical references (leaves 94-104). |
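
The summary names precision (P), recall (R) and F-measure (F) as the accuracy measures but does not show how they are computed. The following is a minimal sketch using the standard definitions, assuming a match result is represented as a set of attribute pairs. The function name `match_scores` and the example attribute names are hypothetical and are not taken from the thesis.

```python
def match_scores(predicted, reference):
    """Return (precision, recall, f_measure) for two collections of attribute pairs.

    predicted: pairs proposed by a matcher (hypothetical representation)
    reference: pairs considered correct (the gold standard)
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)

    # Precision: share of proposed pairs that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of correct pairs that were proposed.
    recall = true_positives / len(reference) if reference else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Illustrative data only; these attribute names are assumptions.
reference = {("cust_id", "customer_no"), ("name", "full_name"), ("zip", "postcode")}
predicted = {("cust_id", "customer_no"), ("name", "full_name"), ("zip", "phone")}

p, r, f = match_scores(predicted, reference)
print(f"P={p:.2f} R={r:.2f} F={f:.2f}")
```

In this example two of the three proposed pairs are correct and two of the three reference pairs are found, so all three scores come out equal.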