Joan Velja

'Explaining RL Decisions with Trajectories': A Reproducibility Study

Nov 11, 2024
Via arXiv

Dynamic Vocabulary Pruning in Early-Exit LLMs

Oct 24, 2024
Via arXiv

Hidden in Plain Text: Emergence & Mitigation of Steganographic Collusion in LLMs

Oct 02, 2024
Via arXiv