Been Kim

Escaping Plato's Cave: JAM for Aligning Independently Trained Vision and Language Models

Jul 01, 2025
Via arXiv

Because we have LLMs, we Can and Should Pursue Agentic Interpretability

Jun 13, 2025
Via arXiv

How new data permeates LLM knowledge and how to dilute it

Apr 13, 2025
Via arXiv

QuestBench: Can LLMs ask the right question to acquire information in reasoning tasks?

Mar 28, 2025
Via arXiv

We Can't Understand AI Using our Existing Vocabulary

Feb 11, 2025
Via arXiv

Proactive Agents for Multi-Turn Text-to-Image Generation Under Uncertainty

Dec 09, 2024
Via arXiv

Getting aligned on representational alignment

Nov 02, 2023
Via arXiv

Bridging the Human-AI Knowledge Gap: Concept Discovery and Transfer in AlphaZero

Oct 25, 2023
Via arXiv

State2Explanation: Concept-Based Explanations to Benefit Agent Learning and User Understanding

Sep 21, 2023
Via arXiv

Don't trust your eyes: on the reliability of feature visualizations

Jun 21, 2023
Via arXiv