Jean Vassoyan

Ignore the KL Penalty! Boosting Exploration on Critical Tokens to Enhance RL Fine-Tuning

Feb 10, 2025
Via arXiv

A Pre-Trained Graph-Based Model for Adaptive Sequencing of Educational Documents

Nov 18, 2024
Via arXiv

Towards Scalable Adaptive Learning with Graph Neural Networks and Reinforcement Learning

May 10, 2023
Via arXiv