Jiliang Hu

VHASR: A Multimodal Speech Recognition System With Vision Hotwords

Oct 01, 2024
Via arXiv