Tianyi Hu

Evaluation Hallucination in Multi-Round Incomplete Information Lateral-Driven Reasoning Tasks

May 28, 2025

Wenhan Dong, Tianyi Hu, Jingyi Zheng, Zhen Sun, Yuemeng Zhao, Yule Liu, Xinlei He, Xinyi Huang

Abstract: Multi-round incomplete information tasks are crucial for evaluating the lateral thinking capabilities of large language models (LLMs). Current research relies primarily on multiple benchmarks and automated evaluation metrics to assess these abilities. However, our study shows that existing methods often yield misleading results that fail to uncover key issues such as shortcut-taking behaviors, rigid patterns, and premature task termination. These issues obscure the true reasoning capabilities of LLMs and undermine the reliability of evaluations. To address these limitations, we propose a refined set of evaluation standards, including inspection of reasoning paths, diversified assessment metrics, and comparative analyses with human performance.

CoMoE: Contrastive Representation for Mixture-of-Experts in Parameter-Efficient Fine-tuning

May 23, 2025

Jinyuan Feng, Chaopeng Wei, Tenghai Qiu, Tianyi Hu, Zhiqiang Pu

Abstract: In parameter-efficient fine-tuning, mixture-of-experts (MoE), which specializes functionalities into different experts and sparsely activates them as appropriate, has been widely adopted as a promising way to trade off model capacity against computation overhead. However, current MoE variants fall short on heterogeneous datasets because they ignore the fact that experts may learn similar knowledge, which leaves MoE's capacity underutilized. In this paper, we propose Contrastive Representation for MoE (CoMoE), a novel method to promote modularization and specialization in MoE, where the experts are trained along with a contrastive objective by sampling from activated and inactivated experts in top-k routing. We demonstrate that such a contrastive objective recovers the mutual-information gap between inputs and the two types of experts. Experiments on several benchmarks and in multi-task settings demonstrate that CoMoE can consistently enhance MoE's capacity and promote modularization among the experts.
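
The idea of contrasting activated against inactivated experts can be illustrated with a small sketch. It is not the authors' implementation: the InfoNCE-style loss form, the dot-product similarity, the temperature, and all names (`top_k_indices`, `info_nce`, the toy vectors) are assumptions chosen for illustration.

```python
import math

def top_k_indices(scores, k):
    """Return the indices of the k largest router scores (top-k routing)."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

def info_nce(anchor, positives, negatives, temperature=0.5):
    """InfoNCE-style loss (assumed form): low when positives match the anchor."""
    def sim(a, b):
        # Dot-product similarity; an assumption, other similarities would also work.
        return sum(x * y for x, y in zip(a, b))
    pos = sum(math.exp(sim(anchor, p) / temperature) for p in positives)
    neg = sum(math.exp(sim(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Hypothetical token representation, router scores, and expert outputs.
token = [1.0, 0.0]
router_scores = [0.10, 0.70, 0.15, 0.05]
expert_outputs = [[0.0, 1.0], [1.0, 0.0], [0.8, 0.6], [-1.0, 0.0]]

# Activated experts serve as positives, inactivated experts as negatives.
active = top_k_indices(router_scores, k=2)
positives = [expert_outputs[i] for i in active]
negatives = [e for i, e in enumerate(expert_outputs) if i not in active]

loss = info_nce(token, positives, negatives)
swapped = info_nce(token, negatives, positives)
```

In this toy setup the loss is small because the activated experts already align with the token, and it grows when the roles of the two groups are swapped.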

Unreal-MAP: Unreal-Engine-Based General Platform for Multi-Agent Reinforcement Learning

Mar 20, 2025

Tianyi Hu, Qingxu Fu, Zhiqiang Pu, Yuan Wang, Tenghai Qiu

Abstract: In this paper, we propose Unreal Multi-Agent Playground (Unreal-MAP), a general MARL platform based on the Unreal Engine (UE). Unreal-MAP allows users to freely create multi-agent tasks using the vast visual and physical resources available in the UE community, and to deploy state-of-the-art (SOTA) MARL algorithms within them. Unreal-MAP is user-friendly in terms of deployment, modification, and visualization, and all its components are open-source. We also develop an experimental framework compatible with algorithms, ranging from rule-based to learning-based, provided by third-party frameworks. Lastly, we deploy several SOTA algorithms in example tasks developed via Unreal-MAP and conduct corresponding experimental analyses. We believe Unreal-MAP can play an important role in the MARL field by closely integrating existing algorithms with user-customized tasks, thus advancing the field of MARL.

OMoE: Diversifying Mixture of Low-Rank Adaptation by Orthogonal Finetuning

Jan 17, 2025

Jinyuan Feng, Zhiqiang Pu, Tianyi Hu, Dongmin Li, Xiaolin Ai, Huimu Wang

Abstract: Building a mixture-of-experts (MoE) architecture for low-rank adaptation (LoRA) is emerging as a potential direction in parameter-efficient fine-tuning (PEFT) for its modular design and remarkable performance. However, simply increasing the number of experts cannot guarantee significant improvement. In this work, we first conduct a qualitative analysis indicating that experts collapse to similar representations in vanilla MoE, limiting the capacity of modular design and computational efficiency. Furthermore, our analysis reveals that the performance of previous MoE variants may be limited by a lack of diversity among experts. Motivated by these findings, we propose Orthogonal Mixture-of-Experts (OMoE), a resource-efficient MoE variant that trains experts in an orthogonal manner to promote diversity. In OMoE, a Gram-Schmidt process is leveraged to enforce that the experts' representations lie within the Stiefel manifold. By applying orthogonal constraints directly to the architecture, OMoE keeps the learning objective unchanged, without compromising optimality. Our method is simple and alleviates memory bottlenecks, as it requires fewer experts than vanilla MoE models. Experiments on diverse commonsense reasoning benchmarks demonstrate that OMoE can consistently achieve stable and efficient performance improvement when compared with the state-of-the-art methods while significantly reducing the number of required experts.
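
As a minimal sketch of the Gram-Schmidt step mentioned above, the following pure-Python code orthonormalizes a small set of vectors; a set of orthonormal vectors is, by definition, a point on a Stiefel manifold. The toy expert outputs and the function names are hypothetical, and this is not the OMoE implementation, which applies the process inside a trained model.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def gram_schmidt(vectors, eps=1e-12):
    """Orthonormalize a list of linearly independent vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        # Remove the component of w along each basis vector found so far.
        for b in basis:
            coeff = dot(w, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm < eps:
            raise ValueError("vectors are linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis

# Hypothetical outputs of three experts for one token (hidden size 4).
expert_outputs = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 1.0],
]
ortho = gram_schmidt(expert_outputs)
```

After the call, the rows of `ortho` have unit norm and are mutually orthogonal, so no two experts in this toy example share a direction.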

CL-attack: Textual Backdoor Attacks via Cross-Lingual Triggers

Dec 26, 2024

Jingyi Zheng, Tianyi Hu, Tianshuo Cong, Xinlei He

Abstract: Backdoor attacks significantly compromise the security of large language models by triggering them to output specific and controlled content. Currently, triggers for textual backdoor attacks fall into two categories: fixed-token triggers and sentence-pattern triggers. However, the former are typically easy to identify and filter, while the latter, such as syntax and style, do not apply to all original samples and may lead to semantic shifts. In this paper, inspired by cross-lingual (CL) prompts of LLMs in real-world scenarios, we propose a higher-dimensional trigger method at the paragraph level, namely CL-attack. CL-attack injects the backdoor by using texts with specific structures that incorporate multiple languages, thereby offering greater stealthiness and universality compared to existing backdoor attack techniques. Extensive experiments on different tasks and model architectures demonstrate that CL-attack can achieve nearly 100% attack success rate with a low poisoning rate in both classification and generation tasks. We also empirically show that CL-attack is more robust against current major defense methods than baseline backdoor attacks. Additionally, as a countermeasure, we develop a new defense called TranslateDefense, which can partially mitigate the impact of CL-attack.

* The paper has been accepted to AAAI 2025

Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning

Oct 08, 2024

Hao Ma, Tianyi Hu, Zhiqiang Pu, Boyin Liu, Xiaolin Ai, Yanyan Liang, Min Chen

Abstract: Reinforcement learning (RL) has emerged as a pivotal technique for fine-tuning large language models (LLMs) on specific tasks. However, prevailing RL fine-tuning methods predominantly rely on PPO and its variants. Though these algorithms are effective in general RL settings, they often exhibit suboptimal performance and vulnerability to distribution collapse when applied to the fine-tuning of LLMs. In this paper, we propose CORY, extending the RL fine-tuning of LLMs to a sequential cooperative multi-agent reinforcement learning framework, to leverage the inherent coevolution and emergent capabilities of multi-agent systems. In CORY, the LLM to be fine-tuned is initially duplicated into two autonomous agents: a pioneer and an observer. The pioneer generates responses based on queries, while the observer generates responses using both the queries and the pioneer's responses. The two agents are trained together. During training, the agents exchange roles periodically, fostering cooperation and coevolution between them. Experiments evaluate CORY's performance by fine-tuning GPT-2 and Llama-2 under subjective and objective reward functions on the IMDB Review and GSM8K datasets, respectively. Results show that CORY outperforms PPO in terms of policy optimality, resistance to distribution collapse, and training robustness, thereby underscoring its potential as a superior methodology for refining LLMs in real-world applications.

* 28 pages, 26 images
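
The pioneer/observer loop with periodic role exchange described in the abstract can be sketched with toy stand-ins. Everything here is hypothetical: `ToyAgent`, `toy_reward`, the shared-reward update, and the `swap_every` schedule are illustrative assumptions, not the CORY implementation, and no actual language model or RL update is involved.

```python
class ToyAgent:
    """Stand-in for one copy of the LLM; a real agent would wrap a policy model."""

    def __init__(self, name):
        self.name = name
        self.updates = 0

    def generate(self, prompt):
        return f"[{self.name} reply to: {prompt}]"

    def update(self, reward):
        # Placeholder for an RL update step (for example, a PPO step).
        self.updates += 1

def toy_reward(query, response):
    # Hypothetical reward: longer responses score higher.
    return float(len(response))

def train(queries, swap_every=2):
    pioneer, observer = ToyAgent("A"), ToyAgent("B")
    roles = []
    for step, query in enumerate(queries, start=1):
        # The pioneer sees only the query.
        first = pioneer.generate(query)
        # The observer sees the query and the pioneer's response.
        second = observer.generate(query + " " + first)
        # Assumption: both agents are updated with one shared reward.
        shared = toy_reward(query, first) + toy_reward(query, second)
        pioneer.update(shared)
        observer.update(shared)
        roles.append((pioneer.name, observer.name))
        # Periodic role exchange.
        if step % swap_every == 0:
            pioneer, observer = observer, pioneer
    return roles

roles = train(["q1", "q2", "q3", "q4", "q5"], swap_every=2)
```

With `swap_every=2`, agent A acts as pioneer for two steps, then B takes over for two steps, and so on.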

Measuring Policy Distance for Multi-Agent Reinforcement Learning

Jan 28, 2024

Tianyi Hu, Zhiqiang Pu, Xiaolin Ai, Tenghai Qiu, Jianqiang Yi

Abstract: Diversity plays a crucial role in improving the performance of multi-agent reinforcement learning (MARL). Many diversity-based methods have been developed to overcome the drawbacks of excessive parameter sharing in traditional MARL. However, there remains a lack of a general metric to quantify policy differences among agents. Such a metric would not only facilitate the evaluation of the diversity evolution in multi-agent systems, but also provide guidance for the design of diversity-based MARL algorithms. In this paper, we propose the multi-agent policy distance (MAPD), a general tool for measuring policy differences in MARL. By learning the conditional representations of agents' decisions, MAPD can compute the policy distance between any pair of agents. Furthermore, we extend MAPD to a customizable version, which can quantify differences among agent policies on specified aspects. Based on the online deployment of MAPD, we design a multi-agent dynamic parameter sharing (MADPS) algorithm as an example of MAPD's applications. Extensive experiments demonstrate that our method is effective in measuring differences in agent policies and specific behavioral tendencies. Moreover, in comparison to other methods of parameter sharing, MADPS exhibits superior performance.

* 9 pages, 6 figures

Content-Based Landmark Retrieval Combining Global and Local Features using Siamese Neural Networks

Aug 03, 2022

Tianyi Hu, Monika Kwiatkowski, Simon Matern, Olaf Hellwich

Abstract: In this work, we present a method for landmark retrieval that utilizes global and local features. A Siamese network is used for global feature extraction and metric learning, which gives an initial ranking of the landmark search. We utilize the extracted feature maps from the Siamese architecture as local descriptors; the search results are then further refined using a cosine similarity between local descriptors. We conduct a deeper analysis of the Google Landmark Dataset, which is used for evaluation, and augment the dataset to handle various intra-class variances. Furthermore, we conduct several experiments to compare the effects of transfer learning and metric learning, as well as experiments using other local descriptors. We show that a re-ranking using local features can improve the search results. We believe that the proposed local feature extraction using cosine similarity is a simple approach that can be extended to many other retrieval tasks.
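
The two-stage retrieval described above (an initial global ranking, then re-ranking by cosine similarity between local descriptors) might look roughly like the sketch below. The aggregation rule in `local_score` (mean of each query descriptor's best match), the `top_n` cutoff, and the toy descriptors are assumptions for illustration; the abstract does not specify them.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def local_score(query_desc, cand_desc):
    """Assumed aggregation: mean of each query descriptor's best cosine match."""
    best = [max(cosine_similarity(q, c) for c in cand_desc) for q in query_desc]
    return sum(best) / len(best)

def rerank(query_desc, initial_ranking, local_descriptors, top_n=3):
    """Re-order the first top_n candidates by local score; keep the rest as is."""
    head = sorted(
        initial_ranking[:top_n],
        key=lambda cid: local_score(query_desc, local_descriptors[cid]),
        reverse=True,
    )
    return head + initial_ranking[top_n:]

# Hypothetical local descriptors for a query and three database images.
query = [[1.0, 0.0], [0.0, 1.0]]
local_descriptors = {
    "a": [[1.0, 1.0]],
    "b": [[1.0, 0.0], [0.0, 1.0]],
    "c": [[-1.0, 0.0]],
}
initial_ranking = ["a", "b", "c"]  # as if produced by the global-feature stage

final_ranking = rerank(query, initial_ranking, local_descriptors, top_n=2)
```

Only the head of the list is re-scored, which keeps the more expensive local comparison limited to the most promising candidates.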

TFLEX: Temporal Feature-Logic Embedding Framework for Complex Reasoning over Temporal Knowledge Graph

May 28, 2022

Xueyuan Lin, Chengjin Xu, Haihong E, Fenglong Su, Gengxian Zhou, Tianyi Hu, Ningyuan Li, Mingzhi Sun, Haoran Luo

Abstract: Multi-hop logical reasoning over knowledge graphs (KGs) plays a fundamental role in many artificial intelligence tasks. Recent complex query embedding (CQE) methods for reasoning focus on static KGs, while temporal knowledge graphs (TKGs) have not been fully explored. Reasoning over TKGs has two challenges: 1. The query should answer entities or timestamps; 2. The operators should consider both set logic on the entity set and temporal logic on the timestamp set. To bridge this gap, we define the multi-hop logical reasoning problem on TKGs. With three generated datasets, we propose the first temporal CQE, named Temporal Feature-Logic Embedding framework (TFLEX), to answer temporal complex queries. We utilize vector logic to compute the logic part of Temporal Feature-Logic embeddings, thus naturally modeling all First-Order Logic (FOL) operations on the entity set. In addition, our framework extends vector logic to the timestamp set to cope with three extra temporal operators (After, Before and Between). Experiments on numerous query patterns demonstrate the effectiveness of our method.

FLEX: Feature-Logic Embedding Framework for CompleX Knowledge Graph Reasoning

May 23, 2022

Xueyuan Lin, Haihong E, Gengxian Zhou, Tianyi Hu, Li Ningyuan, Mingzhi Sun, Haoran Luo

Abstract: The current best-performing models for knowledge graph reasoning (KGR) use complex distributions or geometric objects to embed entities and first-order logical (FOL) queries in low-dimensional spaces. They can be summarized as a center-size framework (point/box/cone, Beta/Gaussian distribution, etc.) whose logical reasoning ability is limited by the expressiveness of the relevant mathematical concepts. Because the center and the size depend too deeply on each other, it is difficult to integrate the logical reasoning ability with other models. To address these challenges, we instead propose a novel KGR framework named Feature-Logic Embedding framework, FLEX, which is the first KGR framework that can not only TRULY handle all FOL operations including conjunction, disjunction, negation and so on, but also support various feature spaces. Specifically, the logic part of the feature-logic framework is based on vector logic, which naturally models all FOL operations. Experiments demonstrate that FLEX significantly outperforms existing state-of-the-art methods on benchmark datasets.
