Prime-based method for interactive mining of frequent patterns

Over the past decade, an increasing number of efficient mining algorithms have been proposed to mine frequent patterns that satisfy a user-specified threshold called minimum support (minsup). However, determining an appropriate value for minsup to find proper frequent patterns in different appl...

Bibliographic Details
Main Author: Nadimi-Shahraki, Mohammad-Hossein
Format: Thesis
Language: English
Published: 2010
Subjects: Data mining; Database management - Computer programs
Online Access: http://psasir.upm.edu.my/id/eprint/19628/1/FSKTM_2010_10.pdf
id my-upm-ir.19628
record_format uketd_dc
spelling my-upm-ir.19628 2013-05-27T08:02:41Z Prime-based method for interactive mining of frequent patterns 2010-12 Nadimi-Shahraki, Mohammad-Hossein Data mining. Database management - Computer programs. 2010-12 Thesis http://psasir.upm.edu.my/id/eprint/19628/ http://psasir.upm.edu.my/id/eprint/19628/1/FSKTM_2010_10.pdf application/pdf en public phd doctoral Universiti Putra Malaysia Faculty of Computer Science and Information Technology
institution Universiti Putra Malaysia
collection PSAS Institutional Repository
language English
topic Data mining.
Database management - Computer programs.

description Over the past decade, an increasing number of efficient algorithms have been proposed to mine frequent patterns that satisfy a user-specified threshold called minimum support (minsup). However, determining an appropriate value of minsup that yields suitable frequent patterns in different applications is extremely difficult. Because rerunning a mining algorithm from scratch can be very time-consuming, researchers have introduced interactive mining, which finds suitable patterns by reusing the current mining model with various values of minsup. Thus far, a few efficient interactive mining algorithms have been proposed. However, their runtimes do not meet the need for short runtimes in real-time applications, especially where data is sparse and suitable frequent patterns are mined with very low values of minsup. In response to these challenges, this study develops an interactive mining method based on prime numbers and their special characteristic of “uniqueness”, by which the content of the relevant data is transformed into a compact layout. First, a general architecture for interactive mining is proposed, consisting of two isolated components: the mining model and the mining process. The proposed method is then developed on this architecture such that the mining model is constructed once and can be mined repeatedly with various values of minsup. During construction of the mining model, the content of the relevant data is captured in a novel tree structure called the PC-tree with one database scan, and the mining materials are formed as a result. The PC-tree is a well-organized tree structure that is built systematically using descendant making, a technique introduced in this study. Moreover, this study introduces a mining algorithm called PC-miner to mine the mining model repeatedly with various values of minsup. PC-miner grows a candidate head set, also introduced in this study, starting from the longest candidate patterns and using the Apriori principle. During the growth of the candidate head set in each round, the longest candidate patterns are used to find maximal frequent patterns, from which the frequent patterns can be derived. PC-miner also reduces the number of candidate patterns and comparisons by using several pruning techniques. A comprehensive experimental analysis, comprising several experiments and scenarios, is conducted to evaluate the correctness and effectiveness of the proposed method, especially for interactive mining. The experimental results verify that the proposed method constructs the mining model once, independently of minsup, and that this enables the model to be mined repeatedly. The results also show that the proposed method mines frequent patterns correctly and efficiently. Finally, the results verify that the proposed method speeds up interactive mining of frequent patterns over both sparse and dense datasets and, compared with previous work, achieves a more scalable total runtime for very low values of minsup over sparse datasets.
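The abstract does not spell out how prime numbers are used, so the following Python sketch is only one plausible reading of the "uniqueness" property, not the thesis's PC-tree or PC-miner. It assumes each item is mapped to a distinct prime and each transaction is stored as the product of its items' primes; by unique factorization, a pattern occurs in a transaction exactly when the pattern's product divides the transaction's product. The function names (`build_model`, `support`, `frequent_patterns`) are hypothetical, and the search shown is a plain bottom-up, Apriori-style enumeration rather than the longest-first search the abstract describes. The model is built once without reference to minsup and is then mined with several thresholds.

```python
from math import prod


def first_primes(n):
    """Return the first n primes by trial division (adequate for small n)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def build_model(transactions):
    """Build a minsup-independent model in one pass over the transactions.

    Assumption: every item gets its own prime, and a transaction is
    represented by the product of the primes of its distinct items.
    """
    items = sorted({item for t in transactions for item in t})
    prime_of = dict(zip(items, first_primes(len(items))))
    values = [prod(prime_of[i] for i in set(t)) for t in transactions]
    return prime_of, values


def support(pattern, prime_of, values):
    """Count transactions whose value is divisible by the pattern's value."""
    p = prod(prime_of[i] for i in pattern)
    return sum(1 for v in values if v % p == 0)


def frequent_patterns(prime_of, values, minsup):
    """Bottom-up, Apriori-style search over the prebuilt model.

    Only frequent patterns are extended, because no superset of an
    infrequent pattern can be frequent.
    """
    items = sorted(prime_of)
    result = {}
    level = [(i,) for i in items]
    while level:
        survivors = []
        for pattern in level:
            count = support(pattern, prime_of, values)
            if count >= minsup:
                result[pattern] = count
                survivors.append(pattern)
        # Extend each surviving pattern with items that sort after its last item.
        level = [p + (i,) for p in survivors for i in items if i > p[-1]]
    return result


if __name__ == "__main__":
    transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    prime_of, values = build_model(transactions)  # constructed once
    for minsup in (4, 3, 2):  # mined repeatedly with different thresholds
        print(minsup, frequent_patterns(prime_of, values, minsup))
```

In this sketch, changing minsup never touches `build_model`; only the search is repeated, which mirrors the separation between mining model and mining process described above. Products of primes grow quickly, so a practical design would need a more compact layout than raw integers.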
format Thesis
qualification_name Doctor of Philosophy (PhD.)
qualification_level Doctorate
author Nadimi-Shahraki, Mohammad-Hossein
title Prime-based method for interactive mining of frequent patterns
granting_institution Universiti Putra Malaysia
granting_department Faculty of Computer Science and Information Technology
publishDate 2010
url http://psasir.upm.edu.my/id/eprint/19628/1/FSKTM_2010_10.pdf
_version_ 1747811428320935936