Brian Formento

Confidence Elicitation: A New Attack Vector for Large Language Models

Feb 10, 2025
Via arXiv

SemRoDe: Macro Adversarial Training to Learn Representations That are Robust to Word-Level Attacks

Mar 27, 2024
Via arXiv