Liu Xiao

ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer

Jan 26, 2025