Yihai Zhang

IGOT: Information Gain Optimized Tokenizer on Domain Adaptive Pretraining

May 16, 2024
Via arXiv