
Ahmad Beirami


Improving Neutral Point of View Text Generation through Parameter-Efficient Reinforcement Learning and a Small-Scale High-Quality Dataset

Mar 05, 2025
Via arXiv

CoDe: Blockwise Control for Denoising Diffusion Models

Feb 03, 2025
Via arXiv

InfAlign: Inference-aware language model alignment

Dec 27, 2024
Via arXiv

Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment

Nov 27, 2024
Via arXiv

Generalization Error of the Tilted Empirical Risk

Sep 28, 2024
Via arXiv

Inducing Group Fairness in LLM-Based Decisions

Jun 24, 2024
Via arXiv

Safety Alignment Should Be Made More Than Just a Few Tokens Deep

Jun 10, 2024
Via arXiv

Robust Preference Optimization through Reward Model Distillation

May 29, 2024
Via arXiv

Mitigating Object Hallucination via Data Augmented Contrastive Tuning

May 28, 2024
Via arXiv

Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment

Apr 18, 2024
Via arXiv