KAIST Image and Video Systems lab

International Conference

	No. 354 Title Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens Date December 2023 Authors Minsu Kim, Jeongsoo Choi, Soumi Maiti, Jeong Hun Yeo, Shinji Watanabe, and Yong Man Ro Publisher IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024 URL

	No. 353 Title Visual Speech Recognition for Languages with Limited Labeled Data using Automatic Labels from Whisper Date December 2023 Authors Jeong Hun Yeo, Minsu Kim, Shinji Watanabe, and Yong Man Ro Publisher IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024 URL

	No. 352 Title Persona Extraction through Semantic Similarity for Emotional Support Conversation Generation Date December 2023 Authors Seunghee Han, Se Jin Park, Chae Won Kim, and Yong Man Ro Publisher IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024 URL

	No. 351 Title Text-driven Talking Face Synthesis by Reprogramming Audio-driven Models Date December 2023 Authors Jeongsoo Choi, Minsu Kim, Se Jin Park, and Yong Man Ro Publisher IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024 URL

	No. 350 Title Exploring Phonetic Context-aware Lip-Sync for Talking Face Generation Date December 2023 Authors Se Jin Park, Minsu Kim, Jeongsoo Choi, and Yong Man Ro Publisher IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024 URL

	No. 349 Title Improving Open Set Recognition via Visual Prompts Distilled from Common-Sense Knowledge Date December 2023 Authors Seongyeop Kim, Hyung-Il Kim, and Yong Man Ro Publisher AAAI Conference on Artificial Intelligence (AAAI), 2024 URL

	No. 348 Title Intuitive Multilingual Audio-Visual Speech Recognition with a Single-Trained Model Date October 2023 Authors Joanna Hong, Se Jin Park, and Yong Man Ro Publisher Findings of the Association for Computational Linguistics: EMNLP 2023 URL

	No. 347 Title Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge Date July 2023 Authors Minsu Kim, Jeong Hun Yeo, Jeongsoo Choi, and Yong Man Ro (* equal contribution) Publisher International Conference on Computer Vision (ICCV), 2023 URL

	No. 346 Title Mitigating Adversarial Vulnerability through Causal Parameter Estimation by Adversarial Double Machine Learning Date July 2023 Authors Byung-Kwan Lee, Junho Kim, and Yong Man Ro (* equally contributed) Publisher International Conference on Computer Vision (ICCV), 2023 URL

	No. 345 Title DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding Date July 2023 Authors Jeongsoo Choi, Joanna Hong, and Yong Man Ro (* equally contributed) Publisher International Conference on Computer Vision (ICCV), 2023 URL
