Picture for Juhan Bae

Juhan Bae

Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models

Add code
Nov 19, 2024
Viaarxiv icon

Influence Functions for Scalable Data Attribution in Diffusion Models

Add code
Oct 17, 2024
Viaarxiv icon

What is Your Data Worth to GPT? LLM-Scale Data Valuation with Influence Functions

Add code
May 22, 2024
Viaarxiv icon

Training Data Attribution via Approximate Unrolled Differentiation

Add code
May 21, 2024
Viaarxiv icon

Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective

Add code
Feb 13, 2024
Figure 1 for Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective
Figure 2 for Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective
Figure 3 for Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective
Figure 4 for Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective
Viaarxiv icon

Using Large Language Models for Hyperparameter Optimization

Add code
Dec 07, 2023
Figure 1 for Using Large Language Models for Hyperparameter Optimization
Figure 2 for Using Large Language Models for Hyperparameter Optimization
Figure 3 for Using Large Language Models for Hyperparameter Optimization
Figure 4 for Using Large Language Models for Hyperparameter Optimization
Viaarxiv icon

Studying Large Language Model Generalization with Influence Functions

Add code
Aug 07, 2023
Figure 1 for Studying Large Language Model Generalization with Influence Functions
Figure 2 for Studying Large Language Model Generalization with Influence Functions
Figure 3 for Studying Large Language Model Generalization with Influence Functions
Figure 4 for Studying Large Language Model Generalization with Influence Functions
Viaarxiv icon

Benchmarking Neural Network Training Algorithms

Add code
Jun 12, 2023
Figure 1 for Benchmarking Neural Network Training Algorithms
Figure 2 for Benchmarking Neural Network Training Algorithms
Figure 3 for Benchmarking Neural Network Training Algorithms
Figure 4 for Benchmarking Neural Network Training Algorithms
Viaarxiv icon

Efficient Parametric Approximations of Neural Network Function Space Distance

Add code
Feb 07, 2023
Viaarxiv icon

Multi-Rate VAE: Train Once, Get the Full Rate-Distortion Curve

Add code
Dec 07, 2022
Viaarxiv icon