Automated plant classification system using a hybrid of shape and color features of the leaf

Bibliographic Details
Main Author: Hamid, Laith Emad
Format: Thesis
Language: English
Published: 2016
Online Access: http://psasir.upm.edu.my/id/eprint/67103/1/FK%202016%20128%20IR.pdf
Description
Summary: Automated plant leaf classification is a computerized approach that employs computer vision and machine learning algorithms to identify a plant based on the features of its leaf. Over the last few decades, various plant classification systems have been proposed using different features and classifiers. However, the majority of the existing methods either rely on large numbers of training samples or select certain leaves within a dataset to achieve high accuracy rates. The disadvantage of such practices is that the results may not reflect how well the features actually cope with the high interclass similarity among different species. Furthermore, most of the existing systems rely on human intervention, either to select certain points of the leaf so that the system can align it, or to select the best result among a few candidates after classification. This thesis introduces an Automated Plant Classification System (APCS) to overcome these limitations. It proposes an automated alignment algorithm that eliminates the need for human intervention to align the leaf, and a new set of Quartile Features (QF) that express the partial shape of the leaf. The research also aims to optimize performance by integrating the proposed Quartile Features with the most discriminant shape and color features in the literature, in order to select the optimal feature vector for the proposed system. The proposed automated alignment algorithm is based on a similarity measure between the vertical and horizontal halves of the leaf. Once the leaf is aligned, the image is sliced into horizontal and vertical quartiles, and the area of each quartile is calculated to extract the proposed Quartile Features.
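The quartile step described above can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the leaf is already aligned and segmented into a binary mask, that the quartiles are four equal strips of the leaf's bounding box along each axis, and that each area is normalized by the total leaf area. The function name `quartile_features` and these choices are hypothetical.

```python
import numpy as np

def quartile_features(mask):
    """Return 8 area ratios: 4 horizontal strips, then 4 vertical strips.

    mask: 2-D boolean array, True where a pixel belongs to the leaf.
    Assumption: the leaf has already been aligned and segmented.
    """
    mask = np.asarray(mask, dtype=bool)
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        raise ValueError("mask contains no leaf pixels")

    # Crop to the bounding box so the strips cover only the leaf extent.
    leaf = mask[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
    total = leaf.sum()

    # Area of each strip as a fraction of the whole leaf area
    # (normalization is an assumption, chosen for scale invariance).
    horizontal = [part.sum() / total for part in np.array_split(leaf, 4, axis=0)]
    vertical = [part.sum() / total for part in np.array_split(leaf, 4, axis=1)]
    return np.array(horizontal + vertical, dtype=float)

# Toy example: a solid rectangle splits evenly, so every ratio is 0.25.
demo = np.zeros((12, 12), dtype=bool)
demo[2:10, 3:7] = True
features = quartile_features(demo)
```

For a real leaf the eight ratios differ from strip to strip, which is what lets them describe where along each axis the leaf is widest.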
To optimize the performance and select the final features for the proposed system, the Quartile Features and the other categories of shape and color features investigated in this research were tested and evaluated individually and in combination. The most discriminant features in each category were then combined to form the final feature vector input to the classifier. A Nearest Neighbor classifier (1-NN) is used to compute the similarity of a query leaf image with all the samples in the database by calculating the distance between their respective feature vectors. The experiments were conducted using two leaf datasets. The first is the Flavia dataset, which has been used as a benchmark by several researchers in the field of plant recognition. The second dataset was collected by the author from the Putrajaya and Perdana Botanical Gardens and contains a total of 396 leaves from 17 species endemic to Malaysia and Tropical Asia. The experimental results and comparisons indicate the efficiency of the proposed automated alignment algorithm and the proposed Quartile Features. Using k-fold cross-validation, the final selected features achieved an average accuracy rate of 98.32% for the Flavia dataset and 91.29% for the Leaves dataset.
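The evaluation protocol above, a 1-NN classifier scored with k-fold cross-validation, can be sketched as follows. This is only an illustration under stated assumptions: the summary does not name the distance metric or the value of k, so Euclidean distance and `k=5` are assumed here, and the function names and toy data are hypothetical.

```python
import numpy as np

def nearest_neighbor_predict(train_X, train_y, query_X):
    """Label each query with the label of its closest training vector (1-NN)."""
    # Pairwise Euclidean distances (assumed metric), shape (n_query, n_train).
    diffs = query_X[:, None, :] - train_X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    return train_y[dists.argmin(axis=1)]

def kfold_accuracy(X, y, k=5, seed=0):
    """Average 1-NN accuracy over k shuffled folds (k is an assumed value)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    order = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(order, k)

    accuracies = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        predicted = nearest_neighbor_predict(X[train_idx], y[train_idx], X[test_idx])
        accuracies.append(np.mean(predicted == y[test_idx]))
    return float(np.mean(accuracies))

# Toy feature vectors: two well-separated classes of 10 samples each.
offsets = 0.01 * np.arange(10)[:, None]
toy_X = np.vstack([np.zeros((10, 2)) + offsets, np.full((10, 2), 10.0) + offsets])
toy_y = np.array([0] * 10 + [1] * 10)
toy_accuracy = kfold_accuracy(toy_X, toy_y, k=5)
```

Because 1-NN has no training phase, each fold simply compares the held-out vectors against the remaining ones, which is why the feature vector itself largely determines the reported accuracy.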