Heyang Liu

CS3-Bench: Evaluating and Enhancing Speech-to-Speech LLMs for Mandarin-English Code-Switching

Oct 09, 2025
Via arXiv

VocalBench: Benchmarking the Vocal Conversational Abilities for Speech Interaction Models

May 21, 2025
Via arXiv

VocalNet: Speech LLM with Multi-Token Prediction for Faster and High-Quality Generation

Apr 05, 2025
Via arXiv

Drawing the Line: Enhancing Trustworthiness of MLLMs Through the Power of Refusal

Dec 15, 2024
Via arXiv

Med-PMC: Medical Personalized Multi-modal Consultation with a Proactive Ask-First-Observe-Next Paradigm

Aug 16, 2024
Via arXiv

Decoding Linguistic Representations of Human Brain

Jul 30, 2024
Via arXiv

Towards an End-to-End Framework for Invasive Brain Signal Decoding with Large Language Models

Jun 17, 2024
Via arXiv

M$^3$AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset

Mar 21, 2024
Via arXiv

Post-decoder Biasing for End-to-End Speech Recognition of Multi-turn Medical Interview

Mar 01, 2024
Via arXiv

MM-SAP: A Comprehensive Benchmark for Assessing Self-Awareness of Multimodal Large Language Models in Perception

Jan 15, 2024
Via arXiv