Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Zhe-Yan Li

Creativity in LLM-based Multi-Agent Systems: A Survey

May 27, 2025

Yi-Cheng Lin, Kang-Chieh Chen, Zhe-Yan Li, Tzu-Heng Wu, Tzu-Hsuan Wu, Kuan-Yu Chen, Hung-yi Lee, Yun-Nung Chen

Figure 1 for Creativity in LLM-based Multi-Agent Systems: A Survey

Figure 2 for Creativity in LLM-based Multi-Agent Systems: A Survey

Figure 3 for Creativity in LLM-based Multi-Agent Systems: A Survey

Figure 4 for Creativity in LLM-based Multi-Agent Systems: A Survey

Abstract:Large language model (LLM)-driven multi-agent systems (MAS) are transforming how humans and AIs collaboratively generate ideas and artifacts. While existing surveys provide comprehensive overviews of MAS infrastructures, they largely overlook the dimension of \emph{creativity}, including how novel outputs are generated and evaluated, how creativity informs agent personas, and how creative workflows are coordinated. This is the first survey dedicated to creativity in MAS. We focus on text and image generation tasks, and present: (1) a taxonomy of agent proactivity and persona design; (2) an overview of generation techniques, including divergent exploration, iterative refinement, and collaborative synthesis, as well as relevant datasets and evaluation metrics; and (3) a discussion of key challenges, such as inconsistent evaluation standards, insufficient bias mitigation, coordination conflicts, and the lack of unified benchmarks. This survey offers a structured framework and roadmap for advancing the development, evaluation, and standardization of creative MAS.

* 23 pages

Via

Access Paper or Ask Questions

Text-centric Alignment for Multi-Modality Learning

Feb 12, 2024

Yun-Da Tsai, Ting-Yu Yen, Pei-Fu Guo, Zhe-Yan Li, Shou-De Lin

Figure 1 for Text-centric Alignment for Multi-Modality Learning

Figure 2 for Text-centric Alignment for Multi-Modality Learning

Figure 3 for Text-centric Alignment for Multi-Modality Learning

Figure 4 for Text-centric Alignment for Multi-Modality Learning

Abstract:This research paper addresses the challenge of modality mismatch in multimodal learning, where the modalities available during inference differ from those available at training. We propose the Text-centric Alignment for Multi-Modality Learning (TAMML) approach, an innovative method that utilizes Large Language Models (LLMs) with in-context learning and foundation models to enhance the generalizability of multimodal systems under these conditions. By leveraging the unique properties of text as a unified semantic space, TAMML demonstrates significant improvements in handling unseen, diverse, and unpredictable modality combinations. TAMML not only adapts to varying modalities but also maintains robust performance, showcasing the potential of foundation models in overcoming the limitations of traditional fixed-modality frameworks in embedding representations. This study contributes to the field by offering a flexible, effective solution for real-world applications where modality availability is dynamic and uncertain.

Via

Access Paper or Ask Questions