Jian Xue

Isochrony-Controlled Speech-to-Text Translation: A study on translating from Sino-Tibetan to Indo-European Languages

Nov 11, 2024

Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation

Oct 17, 2024

Towards Unified Facial Action Unit Recognition Framework by Large Language Models

Sep 13, 2024

MVLLaVA: An Intelligent Agent for Unified and Flexible Novel View Synthesis

Sep 11, 2024

ExpLLM: Towards Chain of Thought for Facial Expression Recognition

Sep 04, 2024

Soft Language Identification for Language-Agnostic Many-to-One End-to-End Speech Translation

Jun 12, 2024

Leveraging Timestamp Information for Serialized Joint Streaming Recognition and Translation

Oct 23, 2023

Improving Stability in Simultaneous Speech Translation: A Revision-Controllable Decoding Approach

Oct 06, 2023

DiariST: Streaming Speech Translation with Speaker Diarization

Sep 14, 2023

FoodSAM: Any Food Segmentation

Aug 11, 2023