Autoregressive + Chain of Thought = Recurrent: Recurrence's Role in Language Models' Computability and a Revisit of Recurrent Transformer

Add code
Sep 20, 2024

Share this with someone who'll enjoy it:

View paper onarxiv icon

Share this with someone who'll enjoy it: