Nicolas Le Roux

SIERRA, LIENS

Fast Convergence of Softmax Policy Mirror Ascent

Nov 18, 2024

fPLSA: Learning Semantic Structures in Document Collections Using Foundation Models

Oct 07, 2024

VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment

Oct 02, 2024

Improving Context-Aware Preference Modeling for Language Models

Jul 20, 2024

Towards Modular LLMs by Building and Reusing a Library of LoRAs

May 18, 2024

Language-guided Skill Learning with Temporal Variational Inference

Feb 26, 2024

Deep Language Networks: Joint Prompt Training of Stacked LLMs using Variational Inference

Jun 21, 2023

Unraveling the Interconnected Axes of Heterogeneity in Machine Learning for Democratic and Inclusive Advancements

Jun 11, 2023

Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees

May 24, 2023

Target-based Surrogates for Stochastic Optimization

Feb 06, 2023