Picture for Xiaofei Wang

Xiaofei Wang

Joint Modelling Histology and Molecular Markers for Cancer Classification

Add code
Feb 11, 2025
Viaarxiv icon

Streaming Speaker Change Detection and Gender Classification for Transducer-Based Multi-Talker Speech Translation

Add code
Feb 04, 2025
Viaarxiv icon

Summary of the NOTSOFAR-1 Challenge: Highlights and Learnings

Add code
Jan 28, 2025
Viaarxiv icon

Imperceptible Adversarial Attacks on Point Clouds Guided by Point-to-Surface Field

Add code
Dec 26, 2024
Viaarxiv icon

Multi-Dimensional Insights: Benchmarking Real-World Personalization in Large Multimodal Models

Add code
Dec 17, 2024
Viaarxiv icon

Isochrony-Controlled Speech-to-Text Translation: A study on translating from Sino-Tibetan to Indo-European Languages

Add code
Nov 11, 2024
Figure 1 for Isochrony-Controlled Speech-to-Text Translation: A study on translating from Sino-Tibetan to Indo-European Languages
Figure 2 for Isochrony-Controlled Speech-to-Text Translation: A study on translating from Sino-Tibetan to Indo-European Languages
Figure 3 for Isochrony-Controlled Speech-to-Text Translation: A study on translating from Sino-Tibetan to Indo-European Languages
Figure 4 for Isochrony-Controlled Speech-to-Text Translation: A study on translating from Sino-Tibetan to Indo-European Languages
Viaarxiv icon

Investigating Neural Audio Codecs for Speech Language Model-Based Speech Generation

Add code
Sep 06, 2024
Figure 1 for Investigating Neural Audio Codecs for Speech Language Model-Based Speech Generation
Figure 2 for Investigating Neural Audio Codecs for Speech Language Model-Based Speech Generation
Figure 3 for Investigating Neural Audio Codecs for Speech Language Model-Based Speech Generation
Figure 4 for Investigating Neural Audio Codecs for Speech Language Model-Based Speech Generation
Viaarxiv icon

Exploring Robust Face-Voice Matching in Multilingual Environments

Add code
Jul 29, 2024
Viaarxiv icon

Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech

Add code
Jul 17, 2024
Figure 1 for Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech
Figure 2 for Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech
Figure 3 for Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech
Figure 4 for Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech
Viaarxiv icon

Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection

Add code
Jul 12, 2024
Viaarxiv icon