Sewoong Oh

Privasis: Synthesizing the Largest "Public" Private Dataset from Scratch

Feb 03, 2026
Via arXiv

ROMA: Recursive Open Meta-Agent Framework for Long-Horizon Multi-Agent Systems

Feb 02, 2026
Via arXiv

Understanding the Gain from Data Filtering in Multimodal Contrastive Learning

Dec 16, 2025
Via arXiv

Are Robust LLM Fingerprints Adversarially Robust?

Sep 30, 2025
Via arXiv

Sampling from Your Language Model One Byte at a Time

Jun 17, 2025
Via arXiv

Spurious Rewards: Rethinking Training Signals in RLVR

Jun 12, 2025
Via arXiv

OpenThoughts: Data Recipes for Reasoning Models

Jun 05, 2025
Via arXiv

Zeroth-Order Optimization Finds Flat Minima

Jun 05, 2025
Via arXiv

Recycling the Web: A Method to Enhance Pre-training Data Quality and Quantity for Language Models

Jun 05, 2025
Via arXiv

Foundation model for mass spectrometry proteomics

May 19, 2025
Via arXiv