Changyu Chen

Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs

Feb 18, 2025
Via arXiv

On Learning Informative Trajectory Embeddings for Imitation, Classification and Regression

Jan 16, 2025
Via arXiv

Sample-Efficient Alignment for LLMs

Nov 03, 2024
Via arXiv

Towards Neural Network based Cognitive Models of Dynamic Decision-Making by Humans

Jul 24, 2024
Via arXiv

Unlocking Large Language Model's Planning Capabilities with Maximum Diversity Fine-tuning

Jun 15, 2024
Via arXiv

Bootstrapping Language Models with DPO Implicit Rewards

Jun 14, 2024
Via arXiv

Prototypical Reward Network for Data-Efficient RLHF

Jun 06, 2024
Via arXiv

Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models

Mar 04, 2024
Via arXiv

Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use

Dec 07, 2023
Via arXiv

Generative Modelling of Stochastic Actions with Arbitrary Constraints in Reinforcement Learning

Nov 26, 2023
Via arXiv