
Souradip Chakraborty

Collab: Controlled Decoding using Mixture of Agents for LLM Alignment

Mar 27, 2025

VARP: Reinforcement Learning from Vision-Language Model Feedback with Agent Regularized Preferences

Mar 18, 2025

BalancedDPO: Adaptive Multi-Metric Alignment

Mar 16, 2025

Align-Pro: A Principled Approach to Prompt Optimization for LLM Alignment

Jan 07, 2025

LIAR: Leveraging Alignment (Best-of-N) to Jailbreak LLMs in Seconds

Dec 06, 2024

Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment

Nov 27, 2024

Hierarchical Preference Optimization: Learning to achieve goals via feasible subgoals prediction

Nov 01, 2024

On the Sample Complexity of a Policy Gradient Algorithm with Occupancy Approximation for General Utility Reinforcement Learning

Oct 05, 2024

AIME: AI System Optimization via Multiple LLM Evaluators

Oct 04, 2024

Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data?

Jul 24, 2024