2023_programme: On the detection and classification of objects in scarce sidescan sonar image dataset with deep learning methods



  • Session: 15. Towards Automatic Target Recognition. Detection, Classification and Modelling
    Organiser(s): Johannes Groen, Yan Pailhas, Roy Edgar Hansen, Jessica Topple and Narada Warakagoda
  • Lecture: On the detection and classification of objects in scarce sidescan sonar image dataset with deep learning methods [invited]
    Paper ID: 1908
    Author(s): Steiniger Yannik, Stoppe Jannis, Kraus Dieter, Meisen Tobias
    Presenter: Steiniger Yannik
    Abstract: While the past years have shown that deep learning (DL) methods like convolutional neural networks (CNNs) can achieve excellent results in classifying sonar images, less research has been done on DL-based detection of objects.
      In classical automatic target recognition, regions of interest are localized first. The corresponding snippets are then filtered to reduce false alarms. Finally, a classification is carried out, e.g., to distinguish rocks from cylinders. In contrast, DL-based detection typically combines localization and classification in one model. However, state-of-the-art DL detectors rely on a large training dataset, which is problematic for sonar imagery, since training data is scarce. Thus, it is not clear whether a standard DL detector is preferable to a two-step approach with two distinct DL models.
      In this work, we report on several experiments to answer this question. More precisely, the one-stage detector YOLOv3 and the two-stage detector CenterNet2 are used to detect four different object classes in sidescan sonar images. For the two-step approach, the same models are trained to detect all objects as general targets, and in the second step a CNN carries out the classification. In addition, we compare this to a classical template matching approach for localization combined with the same CNN.
      Our results show that with only 769 training images available, a one-step approach is beneficial. Additional ablation studies show that in the two-step approach the localization is not precise enough for the CNN to classify the extracted snippets correctly. Furthermore, both DL models outperform the template matching approach. However, the performance for underrepresented classes, where only a few samples are available, is still not optimal, and further research is needed to improve the models for these classes.
  • Corresponding author: Mr Yannik Steiniger
    Affiliation: German Aerospace Center (DLR)
    Country: Germany
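
The difference between the one-step and two-step approaches described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the box convention and the toy localizer and classifier are all hypothetical stand-ins (in the paper these roles are played by YOLOv3 or CenterNet2, template matching, and a CNN). The sketch only shows the data flow, in particular that the second-step classifier sees nothing but the cropped snippet, so an imprecise box directly affects the class label.

```python
from typing import Callable, List, Tuple

# Hypothetical conventions for this sketch only:
# a box is (row0, col0, row1, col1) with end-exclusive limits,
# an image is a list of rows of intensity values.
Box = Tuple[int, int, int, int]
Image = List[List[float]]


def crop(image: Image, box: Box) -> Image:
    """Cut the snippet described by `box` out of the image."""
    r0, c0, r1, c1 = box
    return [row[c0:c1] for row in image[r0:r1]]


def one_step_detect(
    image: Image,
    detect: Callable[[Image], List[Tuple[Box, str]]],
) -> List[Tuple[Box, str]]:
    """One model returns boxes and class labels together."""
    return list(detect(image))


def two_step_detect(
    image: Image,
    localize: Callable[[Image], List[Box]],
    classify: Callable[[Image], str],
) -> List[Tuple[Box, str]]:
    """Step 1 finds generic targets; step 2 classifies each cropped snippet."""
    return [(box, classify(crop(image, box))) for box in localize(image)]


def toy_localize(image: Image, threshold: float = 0.5) -> List[Box]:
    """Toy stand-in for a localizer: one box around all bright pixels."""
    hits = [
        (r, c)
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if value > threshold
    ]
    if not hits:
        return []
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return [(min(rows), min(cols), max(rows) + 1, max(cols) + 1)]


def toy_classify(snippet: Image) -> str:
    """Toy stand-in for the CNN: label by snippet shape only."""
    height, width = len(snippet), len(snippet[0])
    return "cylinder" if width > height else "rock"


# A 4x6 image with a bright horizontal bar in row 1, columns 2 to 4.
image = [[0.0] * 6 for _ in range(4)]
for col in (2, 3, 4):
    image[1][col] = 1.0

detections = two_step_detect(image, toy_localize, toy_classify)
```

Because `toy_classify` depends only on the snippet it receives, shifting or resizing the box from `toy_localize` changes its input, which mirrors the abstract's observation about localization precision in the two-step approach.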