Rule ensemble classification for school children in Terengganu
Main Author: | |
---|---|
Format: | Thesis Book |
Language: | English |
Subjects: | |
Summary: | Childhood obesity is proven to increase the risk of early-onset diabetes, which leads to cardiovascular risk in adulthood and premature mortality. Its prevalence is increasing at an alarming rate worldwide and imposes an enormous financial burden on the government to treat its related co-morbidities. No national survey has been carried out to determine the prevalence and trend of childhood obesity in Malaysia, especially in rural states such as Terengganu. This thesis discusses the data collection and classification of childhood obesity among year six school children from two districts in Terengganu, Besut and Kuala Terengganu. The 4,245 data records were collected from two main sources: the National Physical Fitness Standard for Malaysian School Children Assessment Program (SEGAK) and a set of distributed questionnaires. An integrated and automated SEGAK data collection and analysis system is proposed in this thesis. The system, known as the Health Monitoring System (HEMS), is a web-based system with automated data preprocessing, developed using a three-tier system architecture. This thesis also introduces the use of data mining by proposing the best classifier for the classification of childhood obesity among year six school children in Terengganu. Various combinations of attribute evaluator and search method for feature selection were used to identify significant factors that can be considered potential risks that may influence childhood obesity. The combinations of feature selection methods were then tested on different classifiers, namely BayesNet, Naive Bayes, J48, IBk and SMO. The classifiers were combined using majority voting. Because the result of the multi-classifier was not satisfactory, a rule was applied. The results showed that ensemble classification using Consistency with Genetic Search for feature selection gives the highest accuracy both with and without the rule, at 76.61% and 75.22%, respectively.
This study proved that an accurate classification model using ensemble classification can be used to classify childhood obesity among school children in Terengganu. In addition, the potential risk factors of childhood obesity can be listed based on the features selected using Consistency with Genetic Search. |
---|---|
Physical Description: | xiii, 162 leaves; 31 cm. |
Bibliography: | Includes bibliographical references (leaves 121-134) |
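
The summary mentions combining five classifiers by majority voting. As a rough illustration of that idea only, the following Python sketch combines per-classifier predictions by taking the most frequent label for each child. The class labels, the sample predictions, and the function names `majority_vote` and `ensemble_predict` are hypothetical and are not taken from the thesis; the additional rule mentioned in the summary is not described in this record, so it is not shown.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by the most classifiers for one sample.

    Ties go to the label that appears first in `predictions`. This
    tie-break is an assumption; the record does not say how ties were handled.
    """
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


def ensemble_predict(per_classifier_predictions):
    """Combine predictions sample by sample.

    `per_classifier_predictions` holds one list of labels per classifier,
    all of the same length and in the same sample order.
    """
    return [majority_vote(list(votes)) for votes in zip(*per_classifier_predictions)]


# Hypothetical outputs of the five base classifiers on four children.
# The labels "obese" and "normal" are placeholders, not the thesis's classes.
outputs = {
    "BayesNet":   ["obese",  "normal", "normal", "obese"],
    "NaiveBayes": ["obese",  "normal", "obese",  "normal"],
    "J48":        ["normal", "normal", "obese",  "obese"],
    "IBk":        ["obese",  "obese",  "normal", "normal"],
    "SMO":        ["obese",  "normal", "obese",  "normal"],
}

combined = ensemble_predict(list(outputs.values()))
print(combined)
```

With an odd number of classifiers and two classes, as in this illustration, a tie cannot occur, so the tie-break only matters when there are more classes or an even number of voters.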