Zhijian Zhuo

Scale-Distribution Decoupling: Enabling Stable and Effective Training of Large Language Models

Feb 21, 2025
Via arXiv

Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models

Nov 06, 2024
Via arXiv

Towards a Unified Theoretical Understanding of Non-contrastive Learning via Rank Differential Mechanism

Mar 04, 2023
Via arXiv