Sewoong Oh

Open Deep Search: Democratizing Search with Open-source Reasoning Agents

Mar 26, 2025

SuperBPE: Space Travel for Language Models

Mar 17, 2025

S4S: Solving for a Diffusion Model Solver

Feb 24, 2025

Economics of Sourcing Human Data

Feb 11, 2025

Scalable Fingerprinting of Large Language Models

Feb 11, 2025

OML: Open, Monetizable, and Loyal AI

Nov 01, 2024

Randomization Techniques to Mitigate the Risk of Copyright Infringement

Aug 21, 2024

Better Alignment with Instruction Back-and-Forth Translation

Aug 08, 2024

Data Mixture Inference: What do BPE Tokenizers Reveal about their Training Data?

Jul 24, 2024

Understanding the Gains from Repeated Self-Distillation

Jul 05, 2024