Notice

Subject [CVPR 2022] Weakly Paired Associative Learning for Sound and Image Representations via Bimodal Associative Memory (by Sangmin Lee) has been accepted to CVPR 2022
Name IVY Lab. KAIST
Date 2022-03-04
Title: Weakly Paired Associative Learning for Sound and Image Representations via Bimodal Associative Memory
Authors: Sangmin Lee, Hyung-Il Kim, and Yong Man Ro
 
Data representation learning without labels has attracted increasing attention because it requires no human annotation. Recently, as data samples are acquired in multi-sensory environments, representation learning has been extended to bimodal data, especially sound and image, which are closely related to basic human senses. Existing sound and image representation learning methods necessarily require a large amount of paired sound and image data. It is therefore difficult to guarantee their effectiveness in the weakly paired condition, where paired bimodal data are scarce. In fact, human cognitive studies show that the cognitive functions in the human brain for a certain modality can be enhanced by receiving other modalities, even when those modalities are not directly paired. Based on this observation, we pose a new problem for the weakly paired condition: how to boost the representation of a certain modality even by using unpaired data from another modality. To address this problem, we introduce a novel bimodal associative memory (BMA-Memory) with key-value switching that stores bimodal features in sound and image sub-memories and naturally associates them with one another. BMA-Memory makes it possible to build a sound-image association from a small amount of paired bimodal data and to strengthen that association with large amounts of easily obtainable unpaired data. Through the proposed associative learning, the representation of a certain modality (e.g., sound) can be reinforced even by using unpaired data from another modality (e.g., images).
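To make the key-value switching idea concrete, the following is a minimal, hypothetical sketch (not the paper's implementation): two address-aligned sub-memories hold sound and image features, and a query from one modality addresses its own sub-memory as keys while reading the associated feature from the other sub-memory as values. All class and parameter names here are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

class BMAMemorySketch:
    """Illustrative sketch of a bimodal associative memory with
    key-value switching (names and structure are assumptions,
    not the authors' implementation)."""

    def __init__(self, n_slots, dim, seed=0):
        rng = np.random.default_rng(seed)
        # Address-aligned sub-memories: slot i in each is associated.
        self.sound = rng.standard_normal((n_slots, dim))
        self.image = rng.standard_normal((n_slots, dim))

    def associate(self, query, key_modality="sound"):
        # Key-value switching: address with one sub-memory as keys,
        # then read the associated feature from the other as values.
        keys = self.sound if key_modality == "sound" else self.image
        values = self.image if key_modality == "sound" else self.sound
        weights = softmax(keys @ query)   # soft addressing over slots
        return weights @ values           # associated cross-modal feature

mem = BMAMemorySketch(n_slots=8, dim=16)
sound_feat = np.ones(16)
# Query with a sound feature, retrieve an associated image-side feature.
assoc_image_feat = mem.associate(sound_feat, key_modality="sound")
```

Because the roles of keys and values switch per query, the same memory can associate sound-to-image or image-to-sound, which is what allows one modality's representation to be reinforced by unpaired data from the other.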