Shasha Mo

LBPE: Long-token-first Tokenization to Improve Large Language Models

Nov 08, 2024
Via arXiv

Scaffold-BPE: Enhancing Byte Pair Encoding with Simple and Effective Scaffold Token Removal

Apr 27, 2024
Via arXiv