Zhiyuan Li

PENCIL: Long Thoughts with Short Memory

Mar 18, 2025
Via arXiv

Structured Preconditioners in Adaptive Optimization: A Unified Analysis

Mar 13, 2025
Via arXiv

A Theory of Learning with Autoregressive Chain of Thought

Mar 11, 2025
Via arXiv

Weak-to-Strong Generalization Even in Random Feature Networks, Provably

Mar 04, 2025
Via arXiv

External Large Foundation Model: How to Efficiently Serve Trillions of Parameters for Online Ads Recommendation

Feb 26, 2025
Via arXiv

Reasoning with Latent Thoughts: On the Power of Looped Transformers

Feb 24, 2025
Via arXiv

SegRet: An Efficient Design for Semantic Segmentation with Retentive Network

Feb 19, 2025
Via arXiv

Megrez-Omni Technical Report

Feb 19, 2025
Via arXiv

Pitfalls of defacing whole-head MRI: re-identification risk with diffusion models and compromised research potential

Jan 31, 2025
Via arXiv

A Survey of RWKV

Dec 19, 2024
Via arXiv