Pierre Dognin

Sparsity May Be All You Need: Sparse Random Parameter Adaptation

Feb 21, 2025

Granite Guardian

Dec 10, 2024

Evaluating the Prompt Steerability of Large Language Models

Nov 19, 2024

Programming Refusal with Conditional Activation Steering

Sep 06, 2024

Value Alignment from Unstructured Text

Aug 19, 2024

Contextual Moral Value Alignment Through Context-Based Aggregation

Mar 19, 2024

Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations

Mar 09, 2024

Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations

Mar 08, 2024

Auditing and Generating Synthetic Data with Controllable Trust Trade-offs

May 02, 2023

Fair Infinitesimal Jackknife: Mitigating the Influence of Biased Training Data Points Without Refitting

Dec 13, 2022