Jinyu Cai

WereWolf-Plus: An Update of Werewolf Game setting Based on DSGBench

Jun 15, 2025

Xinyuan Xia, Yuanyi Song, Haomin Ma, Jinyu Cai

Abstract:With the rapid development of LLM-based agents, increasing attention has been given to their social interaction and strategic reasoning capabilities. However, existing Werewolf-based benchmarking platforms suffer from overly simplified game settings, incomplete evaluation metrics, and poor scalability. To address these limitations, we propose WereWolf-Plus, a multi-model, multi-dimensional, and multi-method benchmarking platform for evaluating multi-agent strategic reasoning in the Werewolf game. The platform offers strong extensibility, supporting customizable configurations for roles such as Seer, Witch, Hunter, Guard, and Sheriff, along with flexible model assignment and reasoning enhancement strategies for different roles. In addition, we introduce a comprehensive set of quantitative evaluation metrics for all special roles, werewolves, and the sheriff, and enrich the assessment dimensions for agent reasoning ability, cooperation capacity, and social influence. WereWolf-Plus provides a more flexible and reliable environment for advancing research on inference and strategic interaction within multi-agent communities. Our code is open sourced at https://github.com/MinstrelsyXia/WereWolfPlus.

Simulation of Language Evolution under Regulated Social Media Platforms: A Synergistic Approach of Large Language Models and Genetic Algorithms

Feb 26, 2025

Jinyu Cai, Yusei Ishimizu, Mingyue Zhang, Munan Li, Jialong Li, Kenji Tei

Abstract:Social media platforms frequently impose restrictive policies to moderate user content, prompting the emergence of creative evasion language strategies. This paper presents a multi-agent framework based on Large Language Models (LLMs) to simulate the iterative evolution of language strategies under regulatory constraints. In this framework, participant agents, as social media users, continuously evolve their language expression, while supervisory agents emulate platform-level regulation by assessing policy violations. To achieve a more faithful simulation, we employ a dual design of language strategies (constraint and expression) to differentiate conflicting goals and utilize an LLM-driven GA (Genetic Algorithm) for the selection, mutation, and crossover of language strategies. The framework is evaluated using two distinct scenarios: an abstract password game and a realistic simulated illegal pet trade scenario. Experimental results demonstrate that as the number of dialogue rounds increases, both the number of uninterrupted dialogue turns and the accuracy of information transmission improve significantly. Furthermore, a user study with 40 participants validates the real-world relevance of the generated dialogues and strategies. Moreover, ablation studies validate the importance of the GA, emphasizing its contribution to long-term adaptability and improved overall results.

* The manuscript has been submitted to IEEE Transactions on Computational Social Systems
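
As a rough illustration of the selection, crossover, and mutation loop named in the abstract above, the following minimal sketch runs a plain genetic algorithm over bit strings. It is not the paper's LLM-driven GA: the bit-string encoding, the `evolve` function, its parameters, and the toy fitness (count of ones) are all hypothetical stand-ins chosen only to make the three operators concrete.

```python
import random


def evolve(fitness, length=12, pop_size=20, generations=40,
           mutation_rate=0.05, seed=0):
    """Toy GA over bit strings (hypothetical stand-in for language strategies)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population unchanged.
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: splice two parents at a random cut point.
            cut = rng.randrange(1, length)
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


# Toy fitness: the number of ones in the string.
best = evolve(sum)
```

Because the fitter half survives each generation unchanged, the best fitness in the population never decreases. In the paper's setting an LLM would presumably take the place of the random bit flips and splices used here.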

Automatic Adaptation Rule Optimization via Large Language Models

Jul 02, 2024

Yusei Ishimizu, Jialong Li, Jinglue Xu, Jinyu Cai, Hitoshi Iba, Kenji Tei

Abstract:Rule-based adaptation is a foundational approach to self-adaptation, characterized by its human readability and rapid response. However, building high-performance and robust adaptation rules is often a challenge because it essentially involves searching for the optimal design in a complex variable space. In response, this paper attempts to employ large language models (LLMs) as an optimizer to construct and optimize adaptation rules, leveraging the common sense and reasoning capabilities inherent in LLMs. Preliminary experiments conducted in SWIM have validated the effectiveness and limitations of our method.

Language Evolution for Evading Social Media Regulation via LLM-based Multi-agent Simulation

May 05, 2024

Jinyu Cai, Jialong Li, Mingyue Zhang, Munan Li, Chen-Shu Wang, Kenji Tei

Abstract:Social media platforms such as Twitter, Reddit, and Sina Weibo play a crucial role in global communication but often encounter strict regulations in geopolitically sensitive regions. This situation has prompted users to ingeniously modify their way of communicating, frequently resorting to coded language in these regulated social media environments. This shift in communication is not merely a strategy to counteract regulation, but a vivid manifestation of language evolution, demonstrating how language naturally evolves under societal and technological pressures. Studying the evolution of language in regulated social media contexts is of significant importance for ensuring freedom of speech, optimizing content moderation, and advancing linguistic research. This paper proposes a multi-agent simulation framework using Large Language Models (LLMs) to explore the evolution of user language in regulated social media environments. The framework employs LLM-driven agents: supervisory agents who enforce dialogue supervision and participant agents who evolve their language strategies while engaging in conversation, simulating the evolution of communication styles under strict regulations aimed at evading social media regulation. The study evaluates the framework's effectiveness through a range of scenarios, from abstract scenarios to real-world situations. Key findings indicate that LLMs are capable of simulating nuanced language dynamics and interactions in constrained settings, showing improvement in both evading supervision and information accuracy as evolution progresses. Furthermore, it was found that LLM agents adopt different strategies for different scenarios.

* Accepted by IEEE WCCI 2024

Exploring the Improvement of Evolutionary Computation via Large Language Models

May 05, 2024

Jinyu Cai, Jinglue Xu, Jialong Li, Takuto Ymauchi, Hitoshi Iba, Kenji Tei

Abstract:Evolutionary computation (EC), as a powerful optimization algorithm, has been applied across various domains. However, as the complexity of problems increases, the limitations of EC have become more apparent. The advent of large language models (LLMs) has not only transformed natural language processing but also extended their capabilities to diverse fields. By harnessing LLMs' vast knowledge and adaptive capabilities, we provide a forward-looking overview of potential improvements LLMs can bring to EC, focusing on the algorithms themselves, population design, and additional enhancements. This presents a promising direction for future research at the intersection of LLMs and EC.

* Accepted by GECCO 2024

FGAD: Self-boosted Knowledge Distillation for An Effective Federated Graph Anomaly Detection Framework

Feb 20, 2024

Jinyu Cai, Yunhe Zhang, Zhoumin Lu, Wenzhong Guo, See-kiong Ng

Abstract:Graph anomaly detection (GAD) aims to identify anomalous graphs that significantly deviate from other ones, and has attracted growing attention due to the broad existence and complexity of graph-structured data in many real-world scenarios. However, existing GAD methods usually rely on centralized training, which may lead to privacy leakage risk in some sensitive cases, thereby impeding collaboration among organizations seeking to collectively develop robust GAD models. Although federated learning offers a promising solution, the prevalent non-IID problems and high communication costs present significant challenges, which are particularly pronounced in collaborations with graph data distributed among different participants. To tackle these challenges, we propose an effective federated graph anomaly detection framework (FGAD). We first introduce an anomaly generator to perturb the normal graphs to be anomalous, and train a powerful anomaly detector by distinguishing generated anomalous graphs from normal ones. Then, we leverage a student model to distill knowledge from the trained anomaly detector (teacher model), which aims to maintain the personality of local models and alleviate the adverse impact of non-IID problems. Moreover, we design an effective collaborative learning mechanism that facilitates the personalization preservation of local models and significantly reduces communication costs among clients. Empirical results of the GAD tasks on non-IID graphs compared with state-of-the-art baselines demonstrate the superiority and efficiency of the proposed FGAD method.

Multi: Multimodal Understanding Leaderboard with Text and Images

Feb 05, 2024

Zichen Zhu, Yang Xu, Lu Chen, Jingkai Yang, Yichuan Ma, Yiming Sun, Hailin Wen, Jiaqi Liu, Jinyu Cai, Yingzi Ma(+4 more)

Abstract:Rapid progress in multimodal large language models (MLLMs) highlights the need to introduce challenging yet realistic benchmarks to the academic community. Existing benchmarks primarily focus on simple natural image understanding, but Multi emerges as a cutting-edge benchmark for MLLMs, offering a comprehensive dataset for evaluating MLLMs on understanding complex figures, tables, and scientific questions. This benchmark, reflecting current realistic examination styles, provides multimodal inputs and requires responses that are either precise or open-ended, similar to real-life school tests. It challenges MLLMs with a variety of tasks, ranging from formula derivation to image detail analysis and cross-modality reasoning. Multi includes over 18,000 questions, with a focus on science-based QA in diverse formats. We also introduce Multi-Elite, a 500-question subset for testing the extremities of MLLMs, and Multi-Extend, which enhances In-Context Learning research with more than 4,500 knowledge pieces. Our evaluation indicates significant potential for MLLM advancement, with GPT-4V achieving a 63.7% accuracy rate on Multi, in contrast to other MLLMs scoring between 31.3% and 53.7%. Multi not only serves as a robust evaluation platform but also paves the way for the development of expert-level AI.

* Details and access are available at: https://OpenDFM.github.io/MULTI-Benchmark/

Self-Discriminative Modeling for Anomalous Graph Detection

Oct 10, 2023

Jinyu Cai, Yunhe Zhang, Jicong Fan

Abstract:This paper studies the problem of detecting anomalous graphs using a machine learning model trained on only normal graphs, which has many applications in molecule, biology, and social network data analysis. We present a self-discriminative modeling framework for anomalous graph detection. The key idea, mathematically and numerically illustrated, is to learn a discriminator (classifier) from the given normal graphs together with pseudo-anomalous graphs generated by a model jointly trained, where we never use any true anomalous graphs and we hope that the generated pseudo-anomalous graphs interpolate between normal ones and (real) anomalous ones. Under the framework, we provide three algorithms with different computational efficiencies and stabilities for anomalous graph detection. The three algorithms are compared with several state-of-the-art graph-level anomaly detection baselines on nine popular graph datasets (four with small size and five with moderate size) and show significant improvement in terms of AUC. The success of our algorithms stems from the integration of the discriminative classifier and the well-posed pseudo-anomalous graphs, which provide new insights for anomaly detection. Moreover, we investigate our algorithms for large-scale imbalanced graph datasets. Surprisingly, our algorithms, though fully unsupervised, are able to significantly outperform supervised learning algorithms of anomalous graph detection. The corresponding reason is also analyzed.

* This work was submitted to NeurIPS 2023 but was unfortunately rejected

Value Iteration Networks with Gated Summarization Module

May 16, 2023

Jinyu Cai, Jialong Li, Mingyue Zhang, Kenji Tei

Abstract:In this paper, we address the challenges faced by Value Iteration Networks (VIN) in handling larger input maps and mitigating the impact of accumulated errors caused by increased iterations. We propose a novel approach, Value Iteration Networks with Gated Summarization Module (GS-VIN), which incorporates two main improvements: (1) employing an Adaptive Iteration Strategy in the Value Iteration module to reduce the number of iterations, and (2) introducing a Gated Summarization module to summarize the iterative process. The adaptive iteration strategy uses larger convolution kernels with fewer iteration times, reducing network depth and increasing training stability while maintaining the accuracy of the planning process. The gated summarization module enables the network to emphasize the entire planning process, rather than solely relying on the final global planning outcome, by temporally and spatially resampling the entire planning process within the VI module. We conduct experiments on 2D grid world path-finding problems and the Atari Mr. Pac-man environment, demonstrating that GS-VIN outperforms the baseline in terms of single-step accuracy, planning success rate, and overall performance across different map sizes. Additionally, we provide an analysis of the relationship between input size, kernel size, and the number of iterations in VI-based models, which is applicable to a majority of VI-based models and offers valuable insights for researchers and industrial deployment.

* 13 pages, 6 figures, submitted to IEEE Access
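
The trade-off between kernel size and iteration count mentioned in the abstract above can be made concrete with some back-of-the-envelope arithmetic. Assuming that one convolutional value-iteration step with an odd k×k kernel moves value information by at most (k-1)/2 cells in any direction, crossing an obstacle-free n×n map from corner to corner needs at least ceil((n-1) / ((k-1)/2)) steps. The helper name `min_iterations` and this worst-case distance are assumptions for illustration, not the paper's own analysis.

```python
import math


def min_iterations(map_size: int, kernel_size: int) -> int:
    """Lower bound on iterations for value information to cross a square map.

    Hypothetical helper: assumes an odd kernel whose reach per iteration is
    (kernel_size - 1) // 2 cells and an obstacle-free map_size x map_size grid.
    """
    if kernel_size < 3 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be an odd integer >= 3")
    if map_size < 1:
        raise ValueError("map_size must be positive")
    reach = (kernel_size - 1) // 2
    return math.ceil((map_size - 1) / reach)


# Larger kernels need fewer iterations on the same 28x28 map.
for k in (3, 5, 9):
    print(k, min_iterations(28, k))
```

Under these assumptions a 28×28 map needs at least 27 iterations with a 3×3 kernel but only 7 with a 9×9 kernel. Maps with obstacles can require more, since paths around walls are longer than the straight-line distance.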

Deep Graph-Level Orthogonal Hypersphere Compression for Anomaly Detection

Feb 13, 2023

Yunhe Zhang, Yan Sun, Jinyu Cai, Jicong Fan

Abstract:Graph-level anomaly detection aims to identify anomalous graphs from a collection of graphs in an unsupervised manner. A common assumption of anomaly detection is that a reasonable decision boundary has a hypersphere shape, but some non-conforming phenomena may appear in high dimensions. To this end, we first propose a novel deep graph-level anomaly detection model, which learns the graph representation with maximum mutual information between substructure and global structure features while exploring a hypersphere anomaly decision boundary. The idea is to ensure that the training data distribution is consistent with the decision hypersphere via an orthogonal projection layer. Moreover, we further perform bi-hypersphere compression to emphasize the discrimination of anomalous graphs from normal graphs. Note that our method is not confined to graph data and is applicable to anomaly detection of other data such as images. The numerical and visualization results on benchmark datasets demonstrate the effectiveness and superiority of our methods in comparison to many baselines and state-of-the-art methods.
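
The hypersphere decision boundary mentioned in the abstract above can be illustrated generically: score each embedding by its distance to a center and flag points that fall outside a radius. The sketch below is not the paper's model (it has no graph encoder, orthogonal projection layer, or bi-hypersphere compression). Taking the center as the mean of the training embeddings and the radius as a quantile of training distances are assumptions, and the function names are hypothetical.

```python
import math


def fit_hypersphere(train, quantile=0.95):
    """Return (center, radius) from embeddings of normal samples only.

    Assumption: center = mean embedding, radius = a quantile of the
    training distances to that center.
    """
    dim = len(train[0])
    center = [sum(x[i] for x in train) / len(train) for i in range(dim)]
    dists = sorted(math.dist(x, center) for x in train)
    idx = min(len(dists) - 1, int(quantile * len(dists)))
    return center, dists[idx]


def is_anomalous(x, center, radius):
    """Flag an embedding that lies strictly outside the hypersphere."""
    return math.dist(x, center) > radius


# Four "normal" 2-D embeddings on the unit square.
center, radius = fit_hypersphere([(0, 0), (1, 0), (0, 1), (1, 1)])
print(is_anomalous((5, 5), center, radius))      # far from the center
print(is_anomalous((0.5, 0.5), center, radius))  # at the center
```

Here the far point is flagged and the point at the center is not. In a deep model the embeddings would come from a learned encoder instead of being given directly.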
