Senjie Jin

SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model

Jun 17, 2024
Via arXiv

Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning

Feb 08, 2024
Via arXiv

MouSi: Poly-Visual-Expert Vision-Language Models

Jan 30, 2024
Via arXiv

Secrets of RLHF in Large Language Models Part II: Reward Modeling

Jan 12, 2024
Via arXiv

TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models

Oct 10, 2023
Via arXiv

The Rise and Potential of Large Language Model Based Agents: A Survey

Sep 19, 2023
Via arXiv

Secrets of RLHF in Large Language Models Part I: PPO

Jul 18, 2023
Via arXiv

Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement

May 23, 2023
Via arXiv