A modified multi-class association rule for text mining

Classification and association rule mining are significant tasks in data mining. Integrating association rule discovery and classification in data mining brings us an approach known as the associative classification. One common shortcoming of existing Association Classifiers is the huge number of ru...

Bibliographic Details
Main Author: Al-Refai, Mohammad Hayel Abdel Karim
Format: Thesis
Language: eng
Published: 2015
Subjects:
Online Access: https://etd.uum.edu.my/5767/1/depositpermission_s91487.pdf
https://etd.uum.edu.my/5767/2/s91487_01.pdf
id my-uum-etd.5767
record_format uketd_dc
institution Universiti Utara Malaysia
collection UUM ETD
language eng
advisor Yusof, Yuhanis
topic QA71-90 Instruments and machines
description Classification and association rule mining are significant tasks in data mining. Integrating association rule discovery and classification yields an approach known as associative classification. One common shortcoming of existing Association Classifiers is the huge number of rules they produce in order to obtain high classification accuracy. This study proposes a Modified Multi-class Association Rule Mining (mMCAR) classifier that consists of three procedures: rule discovery, rule pruning, and group-based class assignment. The rule discovery and rule pruning procedures are designed to reduce the number of classification rules, while the group-based class assignment procedure contributes to improving classification accuracy. Experiments on structured and unstructured text datasets obtained from the UCI and Reuters repositories are performed to evaluate the proposed Association Classifier. The proposed mMCAR classifier is benchmarked against traditional classifiers and existing Association Classifiers. Experimental results indicate that mMCAR produces high accuracy with a smaller number of classification rules. For the structured dataset, mMCAR achieves an average accuracy of 84.24%, compared with 84.23% for MCAR. Although the difference in classification accuracy is small, mMCAR uses only 50 rules for classification while its benchmark method uses 60. On the unstructured dataset, mMCAR is on par with MCAR: both classifiers achieve 89% accuracy, but mMCAR uses fewer rules. This study contributes to the text mining domain, as automatic classification of huge and widely distributed textual data could facilitate text representation and retrieval.
format Thesis
qualification_name Ph.D.
qualification_level Doctorate
author Al-Refai, Mohammad Hayel Abdel Karim
title A modified multi-class association rule for text mining
granting_institution Universiti Utara Malaysia
granting_department Awang Had Salleh Graduate School of Arts & Sciences
publishDate 2015
url https://etd.uum.edu.my/5767/1/depositpermission_s91487.pdf
https://etd.uum.edu.my/5767/2/s91487_01.pdf
_version_ 1747827978770841600
spelling my-uum-etd.5767 2016-07-20T10:01:45Z A modified multi-class association rule for text mining 2015 Al-Refai, Mohammad Hayel Abdel Karim Yusof, Yuhanis Awang Had Salleh Graduate School of Arts & Sciences QA71-90 Instruments and machines 2015 Thesis https://etd.uum.edu.my/5767/ https://etd.uum.edu.my/5767/1/depositpermission_s91487.pdf text eng staffonly https://etd.uum.edu.my/5767/2/s91487_01.pdf text eng public Ph.D. doctoral Universiti Utara Malaysia