Yeqin Zhang

Retrospex: Language Agent Meets Offline Reinforcement Learning Critic

May 17, 2025

Yufei Xiang, Yiqun Shen, Yeqin Zhang, Cam-Tu Nguyen

Abstract: Large Language Models (LLMs) possess extensive knowledge and commonsense reasoning capabilities, making them valuable for creating powerful agents. However, existing LLM agent frameworks have not fully utilized past experiences for improvement. This work introduces a new LLM-based agent framework called Retrospex, which addresses this challenge by analyzing past experiences in depth. Unlike previous approaches, Retrospex does not directly integrate experiences into the LLM's context. Instead, it combines the LLM's action likelihood with action values estimated by a Reinforcement Learning (RL) Critic, which is trained on past experiences through an offline "retrospection" process. Additionally, Retrospex employs a dynamic action rescoring mechanism that increases the importance of experience-based values for tasks that require more interaction with the environment. We evaluate Retrospex in the ScienceWorld, ALFWorld, and WebShop environments, demonstrating its advantages over strong, contemporary baselines.

* Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 4650-4666, ACL Anthology, 2024
* 17 pages
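
The rescoring idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `rescore_actions`, the softmax and min-max normalizations, the linear weight schedule, and the `max_weight` parameter are all assumptions made for illustration. The abstract only states that LLM action likelihoods are combined with critic values and that the critic's weight grows for tasks needing more interaction.

```python
import math

def rescore_actions(llm_logprobs, q_values, step, max_steps, max_weight=0.5):
    """Blend LLM action likelihoods with critic values (illustrative only).

    llm_logprobs and q_values map each candidate action to a number.
    The linear schedule for the critic weight is a hypothetical choice.
    """
    # Softmax-normalize the LLM log-probabilities over the candidates.
    top = max(llm_logprobs.values())
    exps = {a: math.exp(lp - top) for a, lp in llm_logprobs.items()}
    total = sum(exps.values())
    probs = {a: e / total for a, e in exps.items()}

    # Min-max normalize critic values to [0, 1] so both terms share a scale.
    lo, hi = min(q_values.values()), max(q_values.values())
    span = (hi - lo) or 1.0
    q_norm = {a: (q_values[a] - lo) / span for a in probs}

    # Assumed schedule: the critic's weight grows with the interaction step.
    w = max_weight * min(step / max_steps, 1.0)
    return {a: (1 - w) * probs[a] + w * q_norm[a] for a in probs}


llm = {"open door": math.log(0.6), "pick up key": math.log(0.4)}
critic = {"open door": 0.1, "pick up key": 0.9}
early = rescore_actions(llm, critic, step=0, max_steps=10)
late = rescore_actions(llm, critic, step=10, max_steps=10)
print(max(early, key=early.get), "|", max(late, key=late.get))
```

In this toy run the LLM's preferred action wins at the first step, while the critic's preferred action wins once the episode has run long enough for the experience-based term to dominate.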

Mitigating the Impact of False Negatives in Dense Retrieval with Contrastive Confidence Regularization

Jan 13, 2024

Shiqi Wang, Yeqin Zhang, Cam-Tu Nguyen

Abstract: In open-domain Question Answering (QA), dense retrieval is crucial for finding relevant passages for answer generation. Typically, contrastive learning is used to train a retrieval model that maps passages and queries to the same semantic space. The objective is to make similar ones closer and dissimilar ones further apart. However, training such a system is challenging due to the false negative issue, where relevant passages may be missed during data annotation. Hard negative sampling, which is commonly used to improve contrastive learning, can introduce more noise in training. This is because hard negatives are those closer to a given query, and thus more likely to be false negatives. To address this issue, we propose a novel contrastive confidence regularizer for Noise Contrastive Estimation (NCE) loss, a commonly used loss for dense retrieval. Our analysis shows that the regularizer helps dense retrieval models be more robust against false negatives with a theoretical guarantee. Additionally, we propose a model-agnostic method to filter out noisy negative passages in the dataset, improving any downstream dense retrieval models. Through experiments on three datasets, we demonstrate that our method achieves better retrieval performance in comparison to existing state-of-the-art dense retrieval systems.

* Accepted by AAAI 2024
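
For context, the contrastive NCE objective mentioned in the abstract is commonly written as the negative log-softmax of the positive passage's score against the negatives. The sketch below shows only that standard baseline loss in plain Python. It does not implement the paper's confidence regularizer, whose form the abstract does not give. The name `nce_loss` and the example scores are illustrative assumptions.

```python
import math

def nce_loss(pos_score, neg_scores):
    """Standard contrastive NCE loss for one query.

    Computes -log( exp(s+) / (exp(s+) + sum_j exp(s-_j)) ) using a
    log-sum-exp formulation for numerical stability.
    """
    scores = [pos_score] + list(neg_scores)
    top = max(scores)
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_z - pos_score


# A hard negative (scored close to the positive) contributes far more loss
# than an easy one, which is why a mislabeled hard negative is so noisy.
easy = nce_loss(5.0, [1.0, 0.0])
hard = nce_loss(5.0, [4.9, 0.0])
print(easy < hard)
```

If the hard negative is in fact a relevant passage that annotators missed, this loss pushes the model away from a correct answer, which is the false negative problem the abstract describes.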

Long Short-Term Planning for Conversational Recommendation Systems

Oct 23, 2023

Xian Li, Hongguang Shi, Yunfei Wang, Yeqin Zhang, Xubin Li, Cam-Tu Nguyen

Abstract: In Conversational Recommendation Systems (CRS), the central question is how the conversational agent can naturally ask for user preferences and provide suitable recommendations. Existing works mainly follow a hierarchical architecture, where a higher-level policy decides whether to invoke the conversation module (to ask questions) or the recommendation module (to make recommendations). This architecture prevents these two components from fully interacting with each other. In contrast, this paper proposes a novel architecture, the long short-term feedback architecture, to connect these two essential components in CRS. Specifically, the recommendation module predicts the long-term recommendation target based on the conversational context and the user history. Driven by the targeted recommendation, the conversational model predicts the next topic or attribute to verify whether the user preference matches the target. This balanced feedback loop continues until the short-term planner's output matches the long-term planner's output, at which point the system should make the recommendation.

* 14 pages, 3 figures. Accepted by ICONIP 2023
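
The feedback loop described in the abstract can be sketched as a simple control loop. This is a reading of the abstract only, not the authors' architecture: `converse`, the planner callables, `ask_user`, and `max_turns` are all hypothetical names, and real planners would be learned models rather than plain functions.

```python
def converse(long_term_planner, short_term_planner, ask_user, max_turns=10):
    """Alternate long-term and short-term planning until their outputs agree.

    long_term_planner(context) returns the current recommendation target.
    short_term_planner(context, target) returns the next topic or attribute
    to ask about, or the target itself once nothing is left to verify.
    """
    context = []  # list of (topic, user_reply) pairs
    for _ in range(max_turns):
        target = long_term_planner(context)
        step = short_term_planner(context, target)
        if step == target:
            # Short-term output matches the long-term target: recommend.
            return "recommend", target, context
        context.append((step, ask_user(step)))
    return "no_recommendation", None, context


# Toy planners: verify each attribute of the target before recommending it.
attributes = {"Dune": ["epic", "sci-fi"]}

def toy_long_term(context):
    return "Dune"

def toy_short_term(context, target):
    asked = {topic for topic, _ in context}
    for attribute in attributes[target]:
        if attribute not in asked:
            return attribute
    return target

print(converse(toy_long_term, toy_short_term, lambda topic: "yes"))
```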

Coarse-to-Fine Knowledge Selection for Document Grounded Dialogs

Feb 23, 2023

Yeqin Zhang, Haomin Fu, Cheng Fu, Haiyang Yu, Yongbin Li, Cam-Tu Nguyen

Abstract: Multi-document grounded dialogue systems (DGDS) belong to a class of conversational agents that answer users' requests by finding supporting knowledge from a collection of documents. Most previous studies aim to improve the knowledge retrieval model or propose more effective ways to incorporate external knowledge into a parametric generation model. These methods, however, focus on retrieving knowledge from mono-granularity language units (e.g., passages, sentences, or spans in documents), which is not enough to effectively and efficiently capture precise knowledge in long documents. This paper proposes Re3G, which aims to optimize both coarse-grained knowledge retrieval and fine-grained knowledge extraction in a unified framework. Specifically, the former efficiently finds relevant passages in a retrieval-and-reranking process, whereas the latter effectively extracts finer-grained spans within those passages to incorporate into a parametric answer generation model (BART, T5). Experiments on the DialDoc Shared Task demonstrate the effectiveness of our method.

* 6 pages, 1 figure. Accepted by ICASSP 2023
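
The coarse-to-fine pattern in the abstract can be illustrated with a toy two-stage pipeline. This is not Re3G: the token-overlap score stands in for the learned retrieval, reranking, and span extraction models, and the names `coarse_to_fine`, `overlap`, and `top_k` are assumptions made for the sketch.

```python
import re

def tokens(text):
    """Lowercased word set; a crude stand-in for a learned representation."""
    return set(re.findall(r"\w+", text.lower()))

def overlap(query, text):
    """Number of shared words; a placeholder for a learned relevance score."""
    return len(tokens(query) & tokens(text))

def coarse_to_fine(query, passages, top_k=2):
    # Coarse stage: keep the top_k passages (stand-in for retrieve + rerank).
    ranked = sorted(passages, key=lambda p: overlap(query, p), reverse=True)
    spans = []
    for passage in ranked[:top_k]:
        # Fine stage: keep the best-matching sentence within each passage.
        sentences = [s.strip() for s in passage.split(".") if s.strip()]
        if sentences:
            spans.append(max(sentences, key=lambda s: overlap(query, s)))
    return spans


docs = [
    "Renew your license online. Fees apply to late renewals.",
    "Parking permits are issued yearly.",
    "A license renewal requires a vision test. Bring photo ID.",
]
print(coarse_to_fine("how to renew license online", docs))
```

In a system of the kind the abstract describes, the selected spans would then be passed to a generation model rather than returned directly.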
