Subject [ICIP 2023] Two papers (by Sungjune Park and Yeon Ju Kim) have been accepted to IEEE ICIP 2023
Name Administrator
Date 2023-06-23
1. Title: Robust multispectral pedestrian detection via spectral position-free feature mapping

Authors: Sungjune Park, Jung Uk Kim, Jin Mo Song, and Yong Man Ro

Abstract: Although multispectral pedestrian detection has recently achieved remarkable performance, one problem remains to be handled: the position shift problem. Because of it, a pedestrian appears at different positions in the two modal images. A single bounding box therefore usually fails to capture the entire pedestrian in both modal images at the same time: it misses some parts of the pedestrian and includes noisy background instead. In this paper, we propose a novel approach, a pedestrian feature mapping from mis-captured pedestrian features to well-captured pedestrian features, which encode an entire pedestrian properly in both modal images. To this end, we utilize a memory architecture that stores well-captured pedestrian features; these features then enhance the quality of the pedestrian representation by providing distinctive information about a pedestrian. We validate the effectiveness of our approach with comprehensive experiments on two multispectral pedestrian detection datasets, achieving state-of-the-art performance.
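
For readers unfamiliar with memory architectures, the idea of recalling stored well-captured features to enhance a mis-captured one can be sketched roughly as follows. This is only an illustrative sketch, not the method from the paper: the cosine-similarity addressing, the softmax temperature, the residual addition, and all names (`read_memory`, `enhance`, `temperature`) are assumptions made here for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-8)

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def read_memory(query, memory, temperature=0.1):
    """Recall a feature as a similarity-weighted sum of memory slots.

    `memory` is a list of stored feature vectors (hypothetical stand-ins for
    well-captured pedestrian features); `query` is a mis-captured feature.
    The addressing scheme is an assumption, not taken from the paper.
    """
    weights = softmax([cosine(query, slot) / temperature for slot in memory])
    recalled = [
        sum(w * slot[i] for w, slot in zip(weights, memory))
        for i in range(len(query))
    ]
    return recalled, weights

def enhance(query, memory, temperature=0.1):
    """Add the recalled feature to the query (assumed residual combination)."""
    recalled, _ = read_memory(query, memory, temperature)
    return [q + r for q, r in zip(query, recalled)]

# Toy usage with two 2-dimensional memory slots.
memory = [[1.0, 0.0], [0.0, 1.0]]
query = [0.9, 0.1]
recalled, weights = read_memory(query, memory)
enhanced = enhance(query, memory)
```

In this toy example the query is closest to the first slot, so that slot receives most of the addressing weight and dominates the recalled feature.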

2. Title: Mitigating Dataset Bias in Image Captioning through CLIP Confounder-free Captioning Network

Authors: YeonJu Kim, Junho Kim, Byung-Kwan Lee, Sebin Shin, and Yong Man Ro

Abstract: Dataset bias has been identified as a major challenge in image captioning. When an image captioning model predicts a word, it should consider the visual evidence associated with that word, but the model tends to rely on contextual evidence arising from dataset bias and produces biased captions, especially when the dataset is biased toward specific situations. To solve this problem, we approach it from a causal inference perspective and design a causal graph. Based on the causal graph, we propose a novel method named C2Cap, a CLIP confounder-free captioning network. We use the global visual confounder to control the confounding factors in the image and train the model to produce debiased captions. We validate our proposed method on the MSCOCO benchmark and demonstrate its effectiveness.