Jane Pan

Beyond Memorization: Mapping the Originality-Quality Frontier of Language Models

Apr 13, 2025
Via arXiv

Reasoning Models Know When They're Right: Probing Hidden States for Self-Verification

Apr 07, 2025
Via arXiv

Spontaneous Reward Hacking in Iterative Self-Refinement

Jul 05, 2024
Via arXiv

What In-Context Learning "Learns" In-Context: Disentangling Task Recognition and Task Learning

May 16, 2023
Via arXiv