Juhan Bae

Spectral-factorized Positive-definite Curvature Learning for NN Training

Feb 10, 2025

Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models

Nov 19, 2024

Influence Functions for Scalable Data Attribution in Diffusion Models

Oct 17, 2024

What is Your Data Worth to GPT? LLM-Scale Data Valuation with Influence Functions

May 22, 2024

Training Data Attribution via Approximate Unrolled Differentiation

May 21, 2024

Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective

Feb 13, 2024

Using Large Language Models for Hyperparameter Optimization

Dec 07, 2023

Studying Large Language Model Generalization with Influence Functions

Aug 07, 2023

Benchmarking Neural Network Training Algorithms

Jun 12, 2023

Efficient Parametric Approximations of Neural Network Function Space Distance

Feb 07, 2023