Jiliang Hu

VHASR: A Multimodal Speech Recognition System With Vision Hotwords

Oct 01, 2024
Via arXiv