Domain generalization for image recognition

Date
2024
Publisher
University of Delaware
Abstract
This thesis addresses a prevalent challenge in contemporary deep neural networks (DNNs): performance degrades when test data are drawn from a distribution different from that of the training data. We address this with Domain Generalization (DG), which trains the model on multiple related source domains to improve its ability to handle previously unseen test domains. Our approach is a tripartite solution involving data augmentation, mixing of style information, and consistency regularization. For data augmentation, we mix the amplitude spectra of two distinct images drawn from two randomly chosen domains. Because image style is intrinsically tied to the visual domain, the style information is mixed into the lower layers of the neural network; this exposes the model to a wider range of features and enhances its ability to generalize to unseen target data. Consistency regularization is then introduced to reduce the prediction discrepancy between the original and augmented samples, further improving performance. Extensive experiments on three distinct benchmarks show that the proposed Augmentation, Mixing, and Consistency Regularization (AMCR) framework surpasses existing state-of-the-art (SOTA) methods. The results also highlight the value of DNNs that can generalize effectively across diverse environments, as in the case of self-driving cars.
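To illustrate the augmentation and regularization steps outlined in the abstract, the sketch below shows one common way to mix the Fourier amplitude spectra of two images while preserving phase, and to penalize prediction disagreement between the original and augmented views. This is a minimal sketch assuming a PyTorch implementation; the names amplitude_mix, consistency_loss, and lam are hypothetical and do not come from the thesis, and the exact loss weighting and layer-level style mixing used in AMCR are described in the thesis itself.

import torch
import torch.nn.functional as F

def amplitude_mix(x_a: torch.Tensor, x_b: torch.Tensor, lam: float = 0.5) -> torch.Tensor:
    # Hypothetical amplitude-spectrum mixing for two images (C, H, W)
    # drawn from two different source domains.
    fft_a = torch.fft.fft2(x_a, dim=(-2, -1))
    fft_b = torch.fft.fft2(x_b, dim=(-2, -1))

    amp_a, phase_a = torch.abs(fft_a), torch.angle(fft_a)
    amp_b = torch.abs(fft_b)

    # Interpolate the amplitude (style/domain cues) but keep x_a's phase
    # (largely semantic content).
    amp_mixed = (1.0 - lam) * amp_a + lam * amp_b
    fft_mixed = amp_mixed * torch.exp(1j * phase_a)

    # Back to image space; the imaginary residue is numerical noise.
    return torch.fft.ifft2(fft_mixed, dim=(-2, -1)).real

def consistency_loss(logits_orig: torch.Tensor, logits_aug: torch.Tensor) -> torch.Tensor:
    # Penalize disagreement between predictions on the original and
    # augmented views, treating the original-view prediction as the target.
    log_p_aug = F.log_softmax(logits_aug, dim=-1)
    p_orig = F.softmax(logits_orig, dim=-1).detach()
    return F.kl_div(log_p_aug, p_orig, reduction="batchmean")

In a training loop, the total objective would typically combine standard cross-entropy on both views with this consistency term, weighted by a hyperparameter chosen on validation data.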
Keywords
Deep neural networks, Data augmentation, Image recognition