DNA enhancer prediction using machine learning techniques with novel feature representation

Bibliographic Details
Main Author: Fong, Pui Kwan
Format: Thesis
Language: English
Published: 2016
Online Access: http://ir.unimas.my/id/eprint/20988/3/Fong%20Pui.pdf
Description
Summary: Identification of regulatory elements, particularly enhancer regions, plays an important role in understanding the regulation of gene expression. Current computational enhancer prediction tools are centred on the Support Vector Machine (SVM) and use a sequence content feature, the k-mer. While the content feature has shown promise, it suffers from several critical weaknesses: 1) the features associated with enhancer regions are ill-defined and poorly understood, and the content feature is unable to represent the complex properties of deoxyribonucleic acid (DNA) sequences; 2) the k-mer feature represents only the global properties of a DNA sequence, not its localized properties; and 3) feature extraction, generation and selection techniques are lacking in the algorithm design. This dissertation aims to develop novel feature representations of histone DNA sequences that are associated with enhancer locations. The technical contributions of this study are: 1) complex tree-feature modelling using a genetic algorithm (CTreeGA): an automated feature generation framework that captures patterns of interactions among short DNA segments in histone sequences.
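
The second weakness named in the summary, that the k-mer feature captures only global sequence content, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names `kmer_counts` and `kmer_vector` and the two toy sequences are assumptions chosen for illustration only.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Count overlapping k-mers in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_vector(seq, k):
    """Return a fixed-length count vector over all 4**k possible k-mers.

    The vocabulary is ordered lexicographically over the alphabet ACGT,
    so every sequence maps to a vector of the same length.
    """
    counts = kmer_counts(seq, k)
    vocabulary = ["".join(p) for p in product("ACGT", repeat=k)]
    return [counts[kmer] for kmer in vocabulary]


# Two toy sequences (hypothetical, not real enhancer data) that contain
# the same 2-mers but in a different order along the sequence.
seq_a = "ACGATTA"
seq_b = "ATTACGA"

vec_a = kmer_vector(seq_a, 2)
vec_b = kmer_vector(seq_b, 2)

# The sequences differ, yet their 2-mer count vectors are identical:
# the representation discards where each k-mer occurs.
print(seq_a != seq_b)
print(vec_a == vec_b)
```

A classifier such as an SVM trained on these count vectors would see `seq_a` and `seq_b` as the same input, which is the kind of positional information a localized feature would need to preserve.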