Yakun Zhang

AVE Speech Dataset: A Comprehensive Benchmark for Multi-Modal Speech Recognition Integrating Audio, Visual, and Electromyographic Signals

Jan 28, 2025

LLM-based Abstraction and Concretization for GUI Test Migration

Sep 08, 2024

Landmark-Guided Cross-Speaker Lip Reading with Mutual Information Regularization

Mar 24, 2024

Grounded Entity-Landmark Adaptive Pre-training for Vision-and-Language Navigation

Aug 24, 2023