Title: Training Agents with Weakly Supervised Feedback from Large Language Models

Nov 29, 2024

Dihong Gong, Pu Lu, Zelong Wang, Meng Zhou, Xiuqiang He

Abstract: Large Language Models (LLMs) offer a promising basis for creating agents that can tackle complex tasks through iterative environmental interaction. Existing methods either require these agents to mimic expert-provided trajectories or rely on definitive environmental feedback for reinforcement learning, which limits their application to specific scenarios such as gaming or code generation. This paper introduces a novel training method for LLM-based agents that uses weakly supervised signals from a critic LLM, bypassing the need for expert trajectories or definitive feedback. Our agents are trained in an iterative manner: they first generate trajectories through environmental interaction, and a critic LLM then selects a subset of good trajectories, which are used to update the agents so that they generate improved trajectories in the next iteration. Extensive tests on the API-Bank dataset show consistent improvement in our agents' capabilities and performance comparable to GPT-4, despite using open-source models with far fewer parameters.
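The iterative procedure described in the abstract (generate trajectories, let a critic pick a subset, update the agent, repeat) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `train_with_critic`, `ToyAgent`, `toy_critic`, and the `keep_fraction` top-k selection rule are all hypothetical names and choices introduced here, and the toy agent and critic stand in for the LLM agent and the critic LLM.

```python
import random
from typing import Callable, List, Sequence, Tuple

# A trajectory is modeled here as (task, list of actions). Hypothetical format.
Trajectory = Tuple[str, List[str]]


def train_with_critic(
    tasks: Sequence[str],
    generate: Callable[[str], Trajectory],
    critic_score: Callable[[Trajectory], float],
    update: Callable[[List[Trajectory]], None],
    iterations: int = 3,
    samples_per_task: int = 4,
    keep_fraction: float = 0.5,
) -> List[List[Trajectory]]:
    """Run the generate -> select -> update loop and return the kept sets."""
    history = []
    for _ in range(iterations):
        # 1. The current agent generates trajectories for every task.
        trajectories = [
            generate(task) for task in tasks for _ in range(samples_per_task)
        ]
        # 2. The critic ranks them; keeping the top fraction is an assumption,
        #    the abstract only says the critic "selects a subset".
        ranked = sorted(trajectories, key=critic_score, reverse=True)
        n_keep = max(1, int(len(ranked) * keep_fraction))
        selected = ranked[:n_keep]
        # 3. The agent is updated on the selected trajectories only.
        update(selected)
        history.append(selected)
    return history


class ToyAgent:
    """Stand-in for an LLM agent: emits 'good' with probability p_good."""

    def __init__(self, p_good: float = 0.2, seed: int = 0) -> None:
        self.p_good = p_good
        self.rng = random.Random(seed)

    def generate(self, task: str) -> Trajectory:
        action = "good" if self.rng.random() < self.p_good else "bad"
        return (task, [action])

    def update(self, selected: List[Trajectory]) -> None:
        # Stand-in for fine-tuning: move p_good toward the rate of 'good'
        # actions observed in the critic-selected trajectories.
        good = sum(1 for _, actions in selected if actions[0] == "good")
        self.p_good = 0.5 * self.p_good + 0.5 * good / len(selected)


def toy_critic(trajectory: Trajectory) -> float:
    """Stand-in for the critic LLM: a weak score, not ground-truth feedback."""
    return 1.0 if "good" in trajectory[1] else 0.0


agent = ToyAgent()
history = train_with_critic(
    ["task-1", "task-2"], agent.generate, toy_critic, agent.update
)
```

In a real setting, `generate` would roll out the agent in its environment, `critic_score` would prompt a critic LLM, and `update` would fine-tune the agent on the selected trajectories; none of those details are specified in the abstract.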
