Wenxi Chen

URO-Bench: A Comprehensive Benchmark for End-to-End Spoken Dialogue Models

Feb 25, 2025

SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training

Dec 20, 2024

DRCap: Decoding CLAP Latents with Retrieval-augmented Generation for Zero-shot Audio Captioning

Oct 12, 2024

SLAM-AAC: Enhancing Audio Captioning with Paraphrasing Augmentation and CLAP-Refine through LLMs

Oct 12, 2024

ChatEMG: Synthetic Data Generation to Control a Robotic Hand Orthosis for Stroke

Jun 17, 2024

EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark

Jun 11, 2024

Meta-Learning for Fast Adaptation in Intent Inferral on a Robotic Hand Orthosis for Stroke

Mar 19, 2024

EAT: Self-Supervised Pre-Training with Efficient Audio Transformer

Jan 07, 2024

If I Hear You Correctly: Building and Evaluating Interview Chatbots with Active Listening Skills

Feb 05, 2020

Tell Me About Yourself: Using an AI-Powered Chatbot to Conduct Conversational Surveys

May 25, 2019