A comparative study of instance-based schema matching in relational database

Bibliographic details
Main author: Khaleel, Anas Abdulmunem (author)
Format: Thesis
Language: English
Published: Kuala Lumpur : Kulliyyah of Information and Communication Technology, International Islamic University Malaysia, 2017
Online access: http://studentrepo.iium.edu.my/handle/123456789/5655
Description
Summary: Schema matching is considered an indispensable process for database integration in many contemporary database systems. Its aim is to identify correspondences across schemas, which in turn serves the data integration process. The main concern in data integration is to support merging decisions by establishing correspondences among the attributes of heterogeneous data sources. Numerous schema matching techniques that use database instances to detect correspondences between attributes have been proposed in the literature. However, no single technique provides an accurate and comprehensive match for different types of data. Some techniques treat numeric values as strings, which adversely affects the matches and the quality of the match results. Likewise, other techniques treat textual instances as numeric, which can reduce match accuracy. This thesis therefore investigates the performance of two instance-based schema matching techniques, with emphasis on the strengths and weaknesses of each technique over various types of data sets. The study develops a syntactic instance-based schema matching technique based on regular expressions (RegEx) with the WordNet database, and selects Google similarity as the semantic instance-based schema matching technique. Both methods were evaluated over three data types: (i) numeric, (ii) alphabetic, and (iii) mixed. Several analyses were performed on real and synthetic data sets to examine match accuracy in terms of precision (P), recall (R) and F-measure (F).
Physical description: xi, 104 leaves : illustrations ; 30 cm.
Bibliography: Includes bibliographical references (leaves 94-104).
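
Note on the evaluation measures: the summary reports match accuracy as precision (P), recall (R) and F-measure (F). The sketch below shows how these are commonly computed for a set of attribute correspondences, using the standard definitions. This record does not state the thesis's exact formulation, so the function name `evaluate_matches` and the attribute pairs are hypothetical and for illustration only.

```python
def evaluate_matches(found, expected):
    """Return (precision, recall, f_measure) for attribute correspondences.

    found    -- correspondences proposed by a matcher, as (source, target) pairs
    expected -- reference correspondences, in the same form
    """
    found, expected = set(found), set(expected)
    true_positives = len(found & expected)

    # Precision: share of proposed correspondences that are correct.
    precision = true_positives / len(found) if found else 0.0
    # Recall: share of reference correspondences that were proposed.
    recall = true_positives / len(expected) if expected else 0.0
    # F-measure: harmonic mean of precision and recall.
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical attribute pairs, not taken from the thesis.
expected = {("cust_id", "customer_no"), ("phone", "tel"), ("city", "town")}
found = {("cust_id", "customer_no"), ("phone", "tel"),
         ("name", "town"), ("zip", "tel")}

p, r, f = evaluate_matches(found, expected)
print(f"P={p:.3f} R={r:.3f} F={f:.3f}")
```

Here two of the four proposed correspondences are correct and two of the three reference correspondences are recovered, so P is 1/2, R is 2/3 and F is 4/7.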