Title: SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems

Jan 08, 2024

Dong Zhang, Zhaowei Li, Pengyu Wang, Xin Zhang, Yaqian Zhou, Xipeng Qiu

Abstract: Human communication is a complex and diverse process that not only involves multiple factors such as language, commonsense, and cultural backgrounds but also requires the participation of multimodal information, such as speech. Large Language Model (LLM)-based multi-agent systems have demonstrated promising performance in simulating human society. Can we leverage LLM-based multi-agent systems to simulate human communication? However, current LLM-based multi-agent systems mainly rely on text as the primary medium. In this paper, we propose SpeechAgents, a multi-modal LLM-based multi-agent system designed for simulating human communication. SpeechAgents uses a multi-modal LLM as the control center for each individual agent and employs multi-modal signals as the medium for messages exchanged among agents. Additionally, we propose Multi-Agent Tuning to enhance the multi-agent capabilities of the LLM without compromising its general abilities. To strengthen and evaluate the effectiveness of human communication simulation, we build the Human-Communication Simulation Benchmark. Experimental results demonstrate that SpeechAgents can simulate human communication dialogues with consistent content, authentic rhythm, and rich emotions, and that it demonstrates excellent scalability even with up to 25 agents, making it applicable to tasks such as drama creation and audio novel generation. Code and models will be open-sourced at https://github.com/0nutation/SpeechAgents

* work in progress
