Dawn Song

University of California, Berkeley

The Hidden Risks of Large Reasoning Models: A Safety Assessment of R1

Feb 18, 2025
Via arXiv

International AI Safety Report

Jan 29, 2025
Via arXiv

Can LLMs Design Good Questions Based on Context?

Jan 07, 2025
Via arXiv

Formal Mathematical Reasoning: A New Frontier in AI

Dec 20, 2024
Via arXiv

Capturing the Temporal Dependence of Training Data Influence

Dec 12, 2024
Via arXiv

Boosting Alignment for Post-Unlearning Text-to-Image Generative Models

Dec 09, 2024
Via arXiv

Data Free Backdoor Attacks

Dec 09, 2024
Via arXiv

PrivAgent: Agentic-based Red-teaming for LLM Privacy Leakage

Dec 07, 2024
Via arXiv

SoK: Watermarking for AI-Generated Content

Nov 27, 2024
Via arXiv

MLAN: Language-Based Instruction Tuning Improves Zero-Shot Generalization of Multimodal Large Language Models

Nov 19, 2024
Via arXiv