Scene Recognition and Classification Model Based on Human Pre-attentive Visual Attention

Bibliographic Details
Main Author: Ahmad Ridzuan, Kudus
Format: Thesis
Language: English
Published: 2021
Subjects:
Online Access: http://ir.unimas.my/id/eprint/34925/2/Ahmad%20Ridzuan%20Kudus%20ft.pdf
Description
Summary: Scene recognition is considered one of the most important functions of human vision, and it is an equally significant problem in computer vision. Scene recognition, or scene classification, is the process of organizing images and predicting the class category of a scene image. Humans can classify scenes accurately and effortlessly within a short period of time. Building on this observation, this study proposes a novel scene classification model based on human pre-attentive visual attention. The model uses one of the earliest saliency models to generate a set of high-quality regions that potentially contain salient objects. An experimental study was performed to investigate how efficiently the Saliency Toolbox handles natural indoor scene images when its parameters are manipulated. At the end of this experiment, acceptable parameter scales were finalized for the use of the Saliency Toolbox in the proposed scene classification model. The proposed model consists of three main operations: (i) salient region proposal generation, (ii) feature extraction and concatenation, and (iii) classification. The proposed model was trained and tested on the MIT Indoor 67 dataset. An experiment and a benchmarking test were conducted on the proposed model. The results of the experiment show that providing more salient regions provides more meaningful details of an input image. The benchmarking result indicates that the saliency model used in this study is capable of generating high-quality, informative salient regions that lead to good classification accuracy. The proposed model achieves a higher average accuracy than a standard approach that classifies based on the whole image. This indicates the advantage of using deep features of local salient objects over global deep features.
Two experiments were conducted in this study to test and evaluate human performance on scene classification under various visual input conditions. The accuracy of human classification of complete scene images shown for a brief period of time in Experiment 1 is compared with the accuracy obtained by the proposed scene classification model. The accuracy of human classification in Experiment 1 is also compared with the accuracy obtained by humans in Experiment 2, where classification performance is tested on cropped salient regions. Evaluation of the results from these experiments shows that the proposed model has not reached the same standard as humans. Using only object features to differentiate between scenes is not enough to match human classification accuracy. The scene background and layout, the relationships between objects, and human memory are other factors that affect human classification performance. These other scene attributes need to be taken into account in the recognition and classification of scene images in further study.
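
The three operations named in the summary (salient region proposal generation, feature extraction and concatenation, classification) can be illustrated with a minimal sketch. It is not the thesis implementation, and every component is an assumed stand-in: a toy center-surround contrast map replaces the Saliency Toolbox, intensity histograms replace deep features, and a nearest-centroid rule replaces whatever classifier the thesis uses. All function names and parameters (`propose_regions`, `n_regions`, `size`, `bins`) are hypothetical.

```python
import numpy as np

def box_blur(img, k):
    """Mean filter over a (2k+1)x(2k+1) window with edge padding."""
    padded = np.pad(img, k, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(2 * k + 1):
        for dx in range(2 * k + 1):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (2 * k + 1) ** 2

def saliency_map(gray):
    """Toy center-surround contrast (stand-in for a real saliency model)."""
    return np.abs(box_blur(gray, 1) - box_blur(gray, 4))

def propose_regions(gray, n_regions=3, size=8):
    """Step (i): take the n most salient square windows, suppressing each pick."""
    sal = saliency_map(gray)
    h, w = gray.shape
    half = size // 2
    boxes = []
    for _ in range(n_regions):
        y, x = np.unravel_index(np.argmax(sal), sal.shape)
        top = int(np.clip(y - half, 0, h - size))
        left = int(np.clip(x - half, 0, w - size))
        boxes.append((top, left, size, size))
        sal[top:top + size, left:left + size] = -np.inf  # do not pick this window again
    return boxes

def region_features(gray, box, bins=8):
    """Step (ii), per region: normalized intensity histogram (stand-in for deep features)."""
    top, left, bh, bw = box
    patch = gray[top:top + bh, left:left + bw]
    hist, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    return hist / patch.size

def image_descriptor(gray, n_regions=3):
    """Step (ii): concatenate the features of all proposed regions."""
    boxes = propose_regions(gray, n_regions)
    return np.concatenate([region_features(gray, b) for b in boxes])

class NearestCentroid:
    """Step (iii): assign each descriptor to the class with the closest mean."""
    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.labels_ = sorted(set(y.tolist()))
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.labels_])
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        dists = np.linalg.norm(X[:, None, :] - self.centroids_[None], axis=2)
        return [self.labels_[i] for i in dists.argmin(axis=1)]
```

In this sketch a grayscale image with values in [0, 1] is turned into one vector by `image_descriptor`, and a set of such vectors with scene labels is passed to `NearestCentroid().fit(...)`. Raising `n_regions` lengthens the descriptor, which loosely mirrors the summary's observation that more salient regions carry more detail about the input image.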