Muning Wen

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

Oct 12, 2024
Via arXiv

Hammer: Robust Function-Calling for On-Device Language Models via Function Masking

Oct 06, 2024
Via arXiv

Autonomous Goal Detection and Cessation in Reinforcement Learning: A Case Study on Source Term Estimation

Sep 14, 2024
Via arXiv

P3: A Policy-Driven, Pace-Adaptive, and Diversity-Promoted Framework for Optimizing LLM Training

Aug 10, 2024
Via arXiv

Reinforcing Language Agents via Policy Optimization with Action Decomposition

May 23, 2024
Via arXiv

TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision

Mar 10, 2024
Via arXiv

Entropy-Regularized Token-Level Policy Optimization for Large Language Models

Feb 09, 2024
Via arXiv

Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training

Sep 29, 2023
Via arXiv

Large Sequence Models for Sequential Decision-Making: A Survey

Jun 24, 2023
Via arXiv

Multi-Agent Reinforcement Learning is a Sequence Modeling Problem

May 30, 2022
Via arXiv