Alessandro Sordoni

VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment

Oct 02, 2024
Via arXiv

Not All LLM Reasoners Are Created Equal

Oct 02, 2024
Via arXiv

A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning

Aug 13, 2024
Via arXiv

Improving Context-Aware Preference Modeling for Language Models

Jul 20, 2024
Via arXiv

Efficient Adversarial Training in LLMs with Continuous Attacks

May 24, 2024
Via arXiv

Towards Modular LLMs by Building and Reusing a Library of LoRAs

May 18, 2024
Via arXiv

V-STaR: Training Verifiers for Self-Taught Reasoners

Feb 09, 2024
Via arXiv

Guiding Language Model Reasoning with Planning Tokens

Oct 09, 2023
Via arXiv

Deep Language Networks: Joint Prompt Training of Stacked LLMs using Variational Inference

Jun 21, 2023
Via arXiv

On the Compositional Generalization Gap of In-Context Learning

Nov 15, 2022
Via arXiv