Binghui Li

Negligible in Size, Significant in Effect: On Scale Vectors in Large Language Models

May 26, 2026
Via arXiv

Fast Catch-Up, Late Switching: Optimal Batch Size Scheduling via Functional Scaling Laws

Feb 15, 2026
Via arXiv

Optimal Learning-Rate Schedules under Functional Scaling Laws: Power Decay and Warmup-Stable-Decay

Feb 06, 2026
Via arXiv

Muon in Associative Memory Learning: Training Dynamics and Scaling Laws

Feb 05, 2026
Via arXiv

Larger Datasets Can Be Repeated More: A Theoretical Analysis of Multi-Epoch Scaling in Linear Regression

Nov 17, 2025
Via arXiv

M2IO-R1: An Efficient RL-Enhanced Reasoning Framework for Multimodal Retrieval Augmented Multimodal Generation

Aug 08, 2025
Via arXiv

MRAMG-Bench: A BeyondText Benchmark for Multimodal Retrieval-Augmented Multimodal Generation

Feb 06, 2025
Via arXiv

Feature Averaging: An Implicit Bias of Gradient Descent Leading to Non-Robustness in Neural Networks

Oct 14, 2024
Via arXiv

Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data

Oct 11, 2024
Via arXiv

Why Clean Generalization and Robust Overfitting Both Happen in Adversarial Training

Jun 02, 2023
Via arXiv