A First Comparison of GANs and Diffusion Models for Generating Sidescan Sonar Images

  • Day: June 17, Tuesday
      Location / Time: D. CHLOE at 11:20-11:40
  • Last-minute changes: -
  • Session: 18. Towards Automatic Target Recognition: Detection, Classification and Modelling
    Organiser(s): Johannes Groen, Yan Pailhas, Roy Edgar Hansen, Narada Warakagoda
    Chairperson(s): Johannes Groen, Yan Pailhas
  • Lecture: A First Comparison of GANs and Diffusion Models for Generating Sidescan Sonar Images [Invited]
    Paper ID: 2131
    Author(s): Yannik Steiniger, Benjamin Lehmann
    Presenter: Yannik Steiniger
    Abstract: As the analysis of sonar images shifts increasingly towards deep learning models, the need for sufficiently large training datasets grows. Unlike other computer vision applications, such as autonomous driving, there is no large, publicly available dataset of sonar images covering a broad range of object classes. Furthermore, gathering and labelling real-world data with a high degree of variability is cumbersome and expensive. Research on generating synthetic sonar images has therefore gained attention in recent years. Approaches range from highly realistic but computationally slow models based on ray tracing or finite elements to faster but less transparent deep-learning-based generative models, such as generative adversarial networks (GANs). In the past few years, diffusion models (DMs) have been established as another standard in the computer vision domain for generating synthetic images, especially natural RGB images. This alternative to GANs does not suffer from training instabilities, such as mode collapse, and can produce more detailed images, but typically requires more training data.

    In this work we train, for the first time, both a GAN and a DM to generate realistic sidescan sonar images of multiple object classes. The synthetic images are intended to augment a dataset of real images for training a convolutional neural network (CNN) for classification. Our study takes important criteria into account, such as the size of the training dataset, the variability of the generated images, the degree of detail, and the computational cost of both methods. Finally, we compare the images generated by the GAN and the DM based on qualitative measures, e.g., realistic object and shadow shapes, and on their ability to increase the performance of the classifier CNN when used to augment the training dataset.
  • Corresponding author: Dr Yannik Steiniger
    Affiliation: German Aerospace Center (DLR)
    Country: Germany
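
As background to the abstract above, the following is a minimal sketch (not the authors' implementation; the step count, noise schedule, and image size are illustrative assumptions) of the closed-form DDPM forward-noising step that a diffusion model is trained to invert, applied here to single-channel tensors as stand-ins for sidescan sonar snippets:

import torch

T = 1000                                   # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)      # linear noise schedule (a standard DDPM choice)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

def q_sample(x0: torch.Tensor, t: torch.Tensor, noise: torch.Tensor) -> torch.Tensor:
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)
    return a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise

# Toy usage: a batch of 8 grayscale 64x64 "sonar snippets" scaled to [-1, 1].
x0 = torch.rand(8, 1, 64, 64) * 2 - 1
t = torch.randint(0, T, (8,))
noise = torch.randn_like(x0)
xt = q_sample(x0, t, noise)
# A denoising network eps_theta(xt, t) would be trained to predict `noise`
# (e.g., with an MSE loss); sampling then reverses the chain step by step.

In an augmentation pipeline of the kind the abstract describes, images sampled from the trained reverse process (or from the GAN's generator) would be combined with the real images before training the classifier CNN.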