Dylan Cope

Hidden in Plain Text: Emergence & Mitigation of Steganographic Collusion in LLMs

Oct 02, 2024

Mimicry and the Emergence of Cooperative Communication

May 26, 2024

Learning Translations: Emergent Communication Pretraining for Cooperative Language Acquisition

Feb 26, 2024

Improving Activation Steering in Language Models with Mean-Centring

Dec 06, 2023

Joining the Conversation: Towards Language Acquisition for Ad Hoc Team Play

May 20, 2023

Real-time Evolution of Multicellularity with Artificial Gene Regulation

May 20, 2023

A Measure of Explanatory Effectiveness

May 20, 2023

Low-Entropy Latent Variables Hurt Out-of-Distribution Performance

May 20, 2023

Learning to Communicate with Strangers via Channel Randomisation Methods

Apr 19, 2021