
Christopher Parisien

Towards Inference-time Category-wise Safety Steering for Large Language Models

Oct 02, 2024
Via arXiv

Unsupervised Extraction of Dialogue Policies from Conversations

Jun 21, 2024
Via arXiv

Nemotron-4 340B Technical Report

Jun 17, 2024
Via arXiv

AEGIS: Online Adaptive AI Content Safety Moderation with Ensemble of LLM Experts

Apr 09, 2024
Via arXiv

CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues

Apr 04, 2024
Via arXiv

NeMo Guardrails: A Toolkit for Controllable and Safe LLM Applications with Programmable Rails

Oct 16, 2023
Via arXiv

Prompt Learning for Domain Adaptation in Task-Oriented Dialogue

Nov 10, 2022
Via arXiv