Shuyue Hu

Squeeze the Soaked Sponge: Efficient Off-policy Reinforcement Finetuning for Large Language Model

Jul 09, 2025
Via arXiv

The Avengers: A Simple Recipe for Uniting Smaller Language Models to Challenge Proprietary Giants

May 26, 2025
Via arXiv

Decouple and Orthogonalize: A Data-Free Framework for LoRA Merging

May 21, 2025
Via arXiv

A Reputation System for Large Language Model-based Multi-agent Systems to Avoid the Tragedy of the Commons

May 08, 2025
Via arXiv

Nondeterministic Polynomial-time Problem Challenge: An Ever-Scaling Reasoning Benchmark for LLMs

Apr 15, 2025
Via arXiv

Do We Truly Need So Many Samples? Multi-LLM Repeated Sampling Efficiently Scales Test-Time Compute

Apr 02, 2025
Via arXiv

ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning

Mar 12, 2025
Via arXiv

Nature-Inspired Population-Based Evolution of Large Language Models

Mar 03, 2025
Via arXiv

If Multi-Agent Debate is the Answer, What is the Question?

Feb 12, 2025
Via arXiv

EvoFlow: Evolving Diverse Agentic Workflows On The Fly

Feb 11, 2025
Via arXiv