Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Rumei Li

Selecting Demonstrations for Many-Shot In-Context Learning via Gradient Matching

Jun 05, 2025

Jianfei Zhang, Bei Li, Jun Bai, Rumei Li, Yanmeng Wang, Chenghua Lin, Wenge Rong

Abstract:In-Context Learning (ICL) empowers Large Language Models (LLMs) for rapid task adaptation without Fine-Tuning (FT), but its reliance on demonstration selection remains a critical challenge. While many-shot ICL shows promising performance through scaled demonstrations, the selection method for many-shot demonstrations remains limited to random selection in existing work. Since the conventional instance-level retrieval is not suitable for many-shot scenarios, we hypothesize that the data requirements for in-context learning and fine-tuning are analogous. To this end, we introduce a novel gradient matching approach that selects demonstrations by aligning fine-tuning gradients between the entire training set of the target task and the selected examples, so as to approach the learning effect on the entire training set within the selected examples. Through gradient matching on relatively small models, e.g., Qwen2.5-3B or Llama3-8B, our method consistently outperforms random selection on larger LLMs from 4-shot to 128-shot scenarios across 9 diverse datasets. For instance, it surpasses random selection by 4% on Qwen2.5-72B and Llama3-70B, and by around 2% on 5 closed-source LLMs. This work unlocks more reliable and effective many-shot ICL, paving the way for its broader application.

* accepted to the ACL2025 Findings

Via

Access Paper or Ask Questions

XBOUND: Exploring the Capability Boundaries of Device-Control Agents through Trajectory Tree Exploration

May 27, 2025

Shaoqing Zhang, Kehai Chen, Zhuosheng Zhang, Rumei Li, Rongxiang Weng, Yang Xiang, Liqiang Nie, Min Zhang

Abstract:Recent advancements in vision-language models (VLMs) have spurred increased interest in Device-Control Agents (DC agents), such as utilizing in-the-wild device control to manage graphical user interfaces. Conventional methods for assessing the capabilities of DC agents, such as computing step-wise action accuracy and overall task success rates, provide a macroscopic view of DC agents' performance; however, they fail to offer microscopic insights into potential errors that may occur in real-world applications. Conducting a finer-grained performance evaluation of DC agents presents significant challenges. This study introduces a new perspective on evaluation methods for DC agents by proposing the XBOUND evaluation method, which employs the calculation of a novel Explore Metric to delineate the capability boundaries of DC agents. Compared to previous evaluation methods, XBOUND focuses on individual states to assess the proficiency of DC agents in mastering these states. Furthermore, we have developed a ``pseudo'' episode tree dataset derived from Android Control test data. Utilizing this dataset and XBOUND, we comprehensively evaluate the OS-Atlas and UI-TARS series, examining both the overall and specific performance across five common tasks. Additionally, we select representative cases to highlight the current deficiencies and limitations inherent in both series. Code is available at https://github.com/sqzhang-lazy/XBOUND.

Via

Access Paper or Ask Questions

Disentangling Preference Representation and Text Generation for Efficient Individual Preference Alignment

Dec 30, 2024

Jianfei Zhang, Jun Bai, Bei Li, Yanmeng Wang, Rumei Li, Chenghua Lin, Wenge Rong

Abstract:Aligning Large Language Models (LLMs) with general human preferences has been proved crucial in improving the interaction quality between LLMs and human. However, human values are inherently diverse among different individuals, making it insufficient to align LLMs solely with general preferences. To address this, personalizing LLMs according to individual feedback emerges as a promising solution. Nonetheless, this approach presents challenges in terms of the efficiency of alignment algorithms. In this work, we introduce a flexible paradigm for individual preference alignment. Our method fundamentally improves efficiency by disentangling preference representation from text generation in LLMs. We validate our approach across multiple text generation tasks and demonstrate that it can produce aligned quality as well as or better than PEFT-based methods, while reducing additional training time for each new individual preference by $80\%$ to $90\%$ in comparison with them.

* Coling 2025

Via

Access Paper or Ask Questions

SEAS: Self-Evolving Adversarial Safety Optimization for Large Language Models

Aug 05, 2024

Muxi Diao, Rumei Li, Shiyang Liu, Guogang Liao, Jingang Wang, Xunliang Cai, Weiran Xu

Abstract:As large language models (LLMs) continue to advance in capability and influence, ensuring their security and preventing harmful outputs has become crucial. A promising approach to address these concerns involves training models to automatically generate adversarial prompts for red teaming. However, the evolving subtlety of vulnerabilities in LLMs challenges the effectiveness of current adversarial methods, which struggle to specifically target and explore the weaknesses of these models. To tackle these challenges, we introduce the $\mathbf{S}\text{elf-}\mathbf{E}\text{volving }\mathbf{A}\text{dversarial }\mathbf{S}\text{afety }\mathbf{(SEAS)}$ optimization framework, which enhances security by leveraging data generated by the model itself. SEAS operates through three iterative stages: Initialization, Attack, and Adversarial Optimization, refining both the Red Team and Target models to improve robustness and safety. This framework reduces reliance on manual testing and significantly enhances the security capabilities of LLMs. Our contributions include a novel adversarial framework, a comprehensive safety dataset, and after three iterations, the Target model achieves a security level comparable to GPT-4, while the Red Team model shows a marked increase in attack success rate (ASR) against advanced models.

Via

Access Paper or Ask Questions

Bridging the KB-Text Gap: Leveraging Structured Knowledge-aware Pre-training for KBQA

Aug 28, 2023

Guanting Dong, Rumei Li, Sirui Wang, Yupeng Zhang, Yunsen Xian, Weiran Xu

Figure 1 for Bridging the KB-Text Gap: Leveraging Structured Knowledge-aware Pre-training for KBQA

Figure 2 for Bridging the KB-Text Gap: Leveraging Structured Knowledge-aware Pre-training for KBQA

Figure 3 for Bridging the KB-Text Gap: Leveraging Structured Knowledge-aware Pre-training for KBQA

Figure 4 for Bridging the KB-Text Gap: Leveraging Structured Knowledge-aware Pre-training for KBQA

Abstract:Knowledge Base Question Answering (KBQA) aims to answer natural language questions with factual information such as entities and relations in KBs. However, traditional Pre-trained Language Models (PLMs) are directly pre-trained on large-scale natural language corpus, which poses challenges for them in understanding and representing complex subgraphs in structured KBs. To bridge the gap between texts and structured KBs, we propose a Structured Knowledge-aware Pre-training method (SKP). In the pre-training stage, we introduce two novel structured knowledge-aware tasks, guiding the model to effectively learn the implicit relationship and better representations of complex subgraphs. In downstream KBQA task, we further design an efficient linearization strategy and an interval attention mechanism, which assist the model to better encode complex subgraphs and shield the interference of irrelevant subgraphs during reasoning respectively. Detailed experiments and analyses on WebQSP verify the effectiveness of SKP, especially the significant improvement in subgraph retrieval (+4.08% H@10).

* Accepted as a short paper at CIKM 2023

Via

Access Paper or Ask Questions

Improving Semantic Matching through Dependency-Enhanced Pre-trained Model with Adaptive Fusion

Oct 16, 2022

Jian Song, Di Liang, Rumei Li, Yuntao Li, Sirui Wang, Minlong Peng, Wei Wu, Yongxin Yu

Figure 1 for Improving Semantic Matching through Dependency-Enhanced Pre-trained Model with Adaptive Fusion

Figure 2 for Improving Semantic Matching through Dependency-Enhanced Pre-trained Model with Adaptive Fusion

Figure 3 for Improving Semantic Matching through Dependency-Enhanced Pre-trained Model with Adaptive Fusion

Figure 4 for Improving Semantic Matching through Dependency-Enhanced Pre-trained Model with Adaptive Fusion

Abstract:Transformer-based pre-trained models like BERT have achieved great progress on Semantic Sentence Matching. Meanwhile, dependency prior knowledge has also shown general benefits in multiple NLP tasks. However, how to efficiently integrate dependency prior structure into pre-trained models to better model complex semantic matching relations is still unsettled. In this paper, we propose the \textbf{D}ependency-Enhanced \textbf{A}daptive \textbf{F}usion \textbf{A}ttention (\textbf{DAFA}), which explicitly introduces dependency structure into pre-trained models and adaptively fuses it with semantic information. Specifically, \textbf{\emph{(i)}} DAFA first proposes a structure-sensitive paradigm to construct a dependency matrix for calibrating attention weights. It adopts an adaptive fusion module to integrate the obtained dependency information and the original semantic signals. Moreover, DAFA reconstructs the attention calculation flow and provides better interpretability. By applying it on BERT, our method achieves state-of-the-art or competitive performance on 10 public datasets, demonstrating the benefits of adaptively fusing dependency structure in semantic matching task.

* Accepted by Findings of EMNLP 2022

Via

Access Paper or Ask Questions

InstructionNER: A Multi-Task Instruction-Based Generative Framework for Few-shot NER

Mar 08, 2022

Liwen Wang, Rumei Li, Yang Yan, Yuanmeng Yan, Sirui Wang, Wei Wu, Weiran Xu

Figure 1 for InstructionNER: A Multi-Task Instruction-Based Generative Framework for Few-shot NER

Figure 2 for InstructionNER: A Multi-Task Instruction-Based Generative Framework for Few-shot NER

Figure 3 for InstructionNER: A Multi-Task Instruction-Based Generative Framework for Few-shot NER

Figure 4 for InstructionNER: A Multi-Task Instruction-Based Generative Framework for Few-shot NER

Abstract:Recently, prompt-based methods have achieved significant performance in few-shot learning scenarios by bridging the gap between language model pre-training and fine-tuning for downstream tasks. However, existing prompt templates are mostly designed for sentence-level tasks and are inappropriate for sequence labeling objectives. To address the above issue, we propose a multi-task instruction-based generative framework, named InstructionNER, for low-resource named entity recognition. Specifically, we reformulate the NER task as a generation problem, which enriches source sentences with task-specific instructions and answer options, then inferences the entities and types in natural language. We further propose two auxiliary tasks, including entity extraction and entity typing, which enable the model to capture more boundary information of entities and deepen the understanding of entity type semantics, respectively. Experimental results show that our method consistently outperforms other baselines on five datasets in few-shot settings.

* Work in progress

Via

Access Paper or Ask Questions

ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer

May 25, 2021

Yuanmeng Yan, Rumei Li, Sirui Wang, Fuzheng Zhang, Wei Wu, Weiran Xu

Figure 1 for ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer

Figure 2 for ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer

Figure 3 for ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer

Figure 4 for ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer

Abstract:Learning high-quality sentence representations benefits a wide range of natural language processing tasks. Though BERT-based pre-trained language models achieve high performance on many downstream tasks, the native derived sentence representations are proved to be collapsed and thus produce a poor performance on the semantic textual similarity (STS) tasks. In this paper, we present ConSERT, a Contrastive Framework for Self-Supervised Sentence Representation Transfer, that adopts contrastive learning to fine-tune BERT in an unsupervised and effective way. By making use of unlabeled texts, ConSERT solves the collapse issue of BERT-derived sentence representations and make them more applicable for downstream tasks. Experiments on STS datasets demonstrate that ConSERT achieves an 8\% relative improvement over the previous state-of-the-art, even comparable to the supervised SBERT-NLI. And when further incorporating NLI supervision, we achieve new state-of-the-art performance on STS tasks. Moreover, ConSERT obtains comparable results with only 1000 samples available, showing its robustness in data scarcity scenarios.

* Accepted by ACL2021, 10 pages, 7 figures, 4 tables

Via

Access Paper or Ask Questions