Vahab Mirrokni

Synthetic Text Generation for Training Large Language Models via Gradient Matching

Feb 24, 2025

PiKE: Adaptive Data Mixing for Multi-Task Learning Under Low Gradient Conflicts

Feb 10, 2025

DeepCrossAttention: Supercharging Transformer Residual Connections

Feb 10, 2025

PolarQuant: Quantizing KV Caches with Polar Transformation

Feb 04, 2025

Titans: Learning to Memorize at Test Time

Dec 31, 2024

Best of Both Worlds: Advantages of Hybrid Graph Sequence Models

Nov 23, 2024

Procurement Auctions via Approximately Optimal Submodular Optimization

Nov 20, 2024

The ParClusterers Benchmark Suite (PCBS): A Fine-Grained Analysis of Scalable Graph Clustering

Nov 15, 2024

Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models

Oct 09, 2024

DiSK: Differentially Private Optimizer with Simplified Kalman Filter for Noise Reduction

Oct 04, 2024