Martín Soto

The Singapore Consensus on Global AI Safety Research Priorities

Jun 25, 2025
Via arXiv

Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs

Feb 25, 2025
Via arXiv

Tell me about yourself: LLMs are aware of their learned behaviors

Jan 19, 2025
Via arXiv