A hybrid model for discovering significant patterns in data mining

Significant pattern mining is one of the most important research areas and a major concern in data mining. Significant patterns are very useful since they can reveal a new dimension of knowledge in certain domain applications. There are three categories of significant patterns named frequent p...

Bibliographic Details
Main Author:	Abdullah, Zailani
Format:	Thesis
Language:	English
Published:	2012
Subjects:	QA Mathematics QA76 Computer software
Online Access:	http://eprints.uthm.edu.my/2207/1/24p%20ZAILANI%20ABDULLAH.pdf http://eprints.uthm.edu.my/2207/2/ZAILANI%20ABDULLAH%20COPYRIGHT%20DECLARATION.pdf http://eprints.uthm.edu.my/2207/3/ZAILANI%20ABDULLAH%20WATERMARK.pdf

id	my-uthm-ep.2207
record_format	uketd_dc
spelling	my-uthm-ep.2207 2021-10-31T04:06:28Z A hybrid model for discovering significant patterns in data mining 2012-07 Abdullah, Zailani QA Mathematics QA76 Computer software 2012-07 Thesis http://eprints.uthm.edu.my/2207/ http://eprints.uthm.edu.my/2207/1/24p%20ZAILANI%20ABDULLAH.pdf text en public http://eprints.uthm.edu.my/2207/2/ZAILANI%20ABDULLAH%20COPYRIGHT%20DECLARATION.pdf text en staffonly http://eprints.uthm.edu.my/2207/3/ZAILANI%20ABDULLAH%20WATERMARK.pdf text en validuser phd masters Universiti Tun Hussein Onn Malaysia Fakulti Sains Komputer dan Teknologi Maklumat
institution	Universiti Tun Hussein Onn Malaysia
collection	UTHM Institutional Repository
language	English
topic	QA Mathematics QA76 Computer software
description	Significant pattern mining is one of the most important research areas and a major concern in data mining. Significant patterns are very useful since they can reveal a new dimension of knowledge in certain domain applications. There are three categories of significant patterns: frequent patterns, least patterns and significant least patterns. Typically, these patterns may be derived from the absolute frequent patterns or mixed with the least patterns. In market-basket analysis, frequent patterns are considered significant patterns and have already made many contributions. The Frequent Pattern Tree (FP-Tree) is one of the best-known data structures for handling batched frequent patterns, but it must rely on the original database. For detecting exceptional occurrences or events that have high implications, such as unanticipated substances that cause air pollution, unexpected degree programs selected by students, or unpredictable motorcycle models preferred by customers, the least patterns are more meaningful than the frequent ones. However, for this category of patterns, generating a standard tree data structure may trigger memory overflow because the minimum support threshold must be lowered. Furthermore, the classical support-confidence measure has many limitations: choosing the right support-confidence values is tricky, interpretations based on the support-confidence combination can be misleading, and the measure is not scalable enough to deal with significant least patterns. Therefore, to overcome these drawbacks, this thesis proposes a Hybrid Model for Discovering Significant Patterns (Hy-DSP), which combines the Efficient Frequent Pattern Mining Model (EFP-M2), the Efficient Least Pattern Mining Model (ELP-M2) and the Significant Least Pattern Mining Model (SLP-M2). The proposed model was developed using the latest .NET framework with C# as the programming language. Experiments with UCI datasets showed that Hy-DSP, which consists of DOSTrieIT and LP-Growth*, outperformed the benchmarked CanTree and FP-Growth by up to 4.13 times (75.78%) and 10.37 times (90.31%), respectively, thus verifying its efficiency. In addition, the number of patterns produced by the models is also smaller than that produced with the standard measures.
format	Thesis
qualification_name	Doctor of Philosophy (PhD.)
qualification_level	Master's degree
author	Abdullah, Zailani
title	A hybrid model for discovering significant patterns in data mining
title_sort	hybrid model for discovering significant patterns in data mining
granting_institution	Universiti Tun Hussein Onn Malaysia
granting_department	Fakulti Sains Komputer dan Teknologi Maklumat
publishDate	2012
url	http://eprints.uthm.edu.my/2207/1/24p%20ZAILANI%20ABDULLAH.pdf http://eprints.uthm.edu.my/2207/2/ZAILANI%20ABDULLAH%20COPYRIGHT%20DECLARATION.pdf http://eprints.uthm.edu.my/2207/3/ZAILANI%20ABDULLAH%20WATERMARK.pdf
_version_	1747830924765036544

Similar Items