Object Detection Using Deep Learning - Learning where to search using visual attention




Detecting and identifying the objects in an image quickly and reliably is an important skill for interacting with one’s environment. The main problem is that, in theory, every part of an image has to be searched for objects at many different scales to make sure that no object instance is missed. However, classifying the content of a given image region takes considerable time and effort, and both the time and the computational capacity that an agent can spend on classification are limited. Humans use a process called visual attention to quickly decide which locations of an image need to be processed in detail and which can be ignored. This allows us to deal with the huge amount of visual information and to use the capacity of our visual system efficiently. Computer vision researchers face exactly the same problems, so learning from human behaviour is a promising way to improve existing algorithms. In the presented master’s thesis, a model is trained on eye-tracking data recorded from 15 participants who were asked to search images for objects from three different categories. The model uses a deep convolutional neural network to extract features from the input image, which are then combined to form a saliency map. This map indicates which image regions are interesting when searching for the given target object and can thus be used to reduce the portion of the image that has to be processed in detail. The method is based on a recent publication by Kümmerer et al., but in contrast to the original method, which computes general, task-independent saliency, the presented model is intended to respond differently when searching for different target categories.
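
The idea of combining feature maps into a target-dependent saliency map can be illustrated with a small sketch. The abstract does not specify how the features are combined, so everything here is an assumption made for illustration: a weighted sum of feature maps followed by a softmax over all pixels, toy feature values standing in for real CNN activations, and the hypothetical names `saliency_map`, `top_locations`, `category_a` and `category_b`. It is not the implementation used in the thesis.

```python
import math

def saliency_map(feature_maps, weights):
    """Weighted sum of K feature maps (each H x W), then a softmax over
    all pixels so that the resulting map sums to 1 (assumed combination)."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    scores = [
        [sum(w * fm[y][x] for w, fm in zip(weights, feature_maps))
         for x in range(width)]
        for y in range(height)
    ]
    peak = max(max(row) for row in scores)  # subtracted for numerical stability
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]

def top_locations(saliency, fraction):
    """Return the (y, x) positions of the most salient `fraction` of pixels."""
    cells = [(value, y, x)
             for y, row in enumerate(saliency)
             for x, value in enumerate(row)]
    cells.sort(reverse=True)
    keep = max(1, int(len(cells) * fraction))
    return [(y, x) for _, y, x in cells[:keep]]

# Toy stand-ins for CNN feature maps: 2 channels, 2 x 3 pixels each.
features = [
    [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]],
]

# Hypothetical weights: one weight vector per target category, so the same
# features yield a different saliency map for each search target.
category_weights = {"category_a": [3.0, 0.0], "category_b": [0.0, 3.0]}

maps = {name: saliency_map(features, w) for name, w in category_weights.items()}
```

Calling `top_locations(maps["category_a"], 0.2)` then returns only the locations that a detailed classifier would need to inspect for that target, which is the sense in which a saliency map reduces the portion of the image to be processed.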

Author(s): Alina Kloss
Year: 2015
Month: May
Day: 26

Department(s): Autonomous Motion
Research Project(s): Modeling Top-Down Saliency for Visual Object Search
Bibtex Type: Thesis (mastersthesis)

School: Eberhard Karls Universität Tübingen
Attachments: PDF


  title = {Object Detection Using Deep Learning - Learning where to search using visual attention},
  author = {Kloss, Alina},
  school = {Eberhard Karls Universität Tübingen},
  month = may,
  year = {2015},
  month_numeric = {5}