|
-
No.
354
-
Title
Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens
-
Date
December 2023
-
Authors
Minsu Kim, Jeongsoo Choi, Soumi Maiti, Jeong Hun Yeo, Shinji Watanabe, and Yong Man Ro
-
Publisher
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
-
URL
|
|
|
-
No.
353
-
Title
Visual Speech Recognition for Languages with Limited Labeled Data using Automatic Labels from Whisper
-
Date
December 2023
-
Authors
Jeong Hun Yeo, Minsu Kim, Shinji Watanabe, and Yong Man Ro
-
Publisher
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
-
URL
|
|
|
-
No.
352
-
Title
Persona Extraction through Semantic Similarity for Emotional Support Conversation Generation
-
Date
December 2023
-
Authors
Seunghee Han, Se Jin Park, Chae Won Kim, and Yong Man Ro
-
Publisher
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
-
URL
|
|
|
-
No.
351
-
Title
Text-driven Talking Face Synthesis by Reprogramming Audio-driven Models
-
Date
December 2023
-
Authors
Jeongsoo Choi, Minsu Kim, Se Jin Park, and Yong Man Ro
-
Publisher
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
-
URL
|
|
|
-
No.
350
-
Title
Exploring Phonetic Context-aware Lip-Sync for Talking Face Generation
-
Date
December 2023
-
Authors
Se Jin Park, Minsu Kim, Jeongsoo Choi, and Yong Man Ro
-
Publisher
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
-
URL
|
|
|
-
No.
349
-
Title
Improving Open Set Recognition via Visual Prompts Distilled from Common-Sense Knowledge
-
Date
December 2023
-
Authors
Seongyeop Kim, Hyung-Il Kim, and Yong Man Ro
-
Publisher
AAAI Conference on Artificial Intelligence (AAAI), 2024
-
URL
|
|
|
-
No.
348
-
Title
Intuitive Multilingual Audio-Visual Speech Recognition with a Single-Trained Model
-
Date
October 2023
-
Authors
Joanna Hong, Se Jin Park, and Yong Man Ro
-
Publisher
Findings of the Association for Computational Linguistics: EMNLP 2023
-
URL
|
|
|
-
No.
347
-
Title
Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge
-
Date
July 2023
-
Authors
Minsu Kim*, Jeong Hun Yeo*, Jeongsoo Choi, and Yong Man Ro (* equal contribution)
-
Publisher
International Conference on Computer Vision (ICCV), 2023
-
URL
|
|
|
-
No.
346
-
Title
Mitigating Adversarial Vulnerability through Causal Parameter Estimation by Adversarial Double Machine Learning
-
Date
July 2023
-
Authors
Byung-Kwan Lee*, Junho Kim*, and Yong Man Ro (* equally contributed)
-
Publisher
International Conference on Computer Vision (ICCV), 2023
-
URL
|
|
|
-
No.
345
-
Title
DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding
-
Date
July 2023
-
Authors
Jeongsoo Choi*, Joanna Hong*, and Yong Man Ro (* equally contributed)
-
Publisher
International Conference on Computer Vision (ICCV), 2023
-
URL
|
|
|