Title: On the Power of Decision Trees in Auto-Regressive Language Modeling

Sep 27, 2024

Yulu Gan, Tomer Galanti, Tomaso Poggio, Eran Malach

Abstract: Originally proposed for handling time series data, Auto-regressive Decision Trees (ARDTs) have not yet been explored for language modeling. This paper delves into both the theoretical and practical applications of ARDTs in this new context. We theoretically demonstrate that ARDTs can compute complex functions, such as simulating automata, Turing machines, and sparse circuits, by leveraging "chain-of-thought" computations. Our analysis provides bounds on the size, depth, and computational efficiency of ARDTs, highlighting their surprising computational power. Empirically, we train ARDTs on simple language generation tasks, showing that they can learn to generate coherent and grammatically correct text on par with a smaller Transformer model. Additionally, we show that ARDTs can be used on top of transformer representations to solve complex reasoning tasks. This research reveals the unique computational abilities of ARDTs, aiming to broaden the architectural diversity in language model development.
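
To make the automaton claim concrete, here is a minimal sketch of how a small decision tree, applied auto-regressively, could simulate a two-state parity automaton by writing intermediate states as "chain-of-thought" tokens. It is an illustration only, not the construction from the paper: the token layout (input bits, a `=` separator, then one state token per input bit), the fixed window of `n + 1` tokens, and the names `tree_next_token` and `generate_parity` are all assumptions made for this example.

```python
def tree_next_token(window):
    """Hand-built depth-2 decision tree over a fixed-size context window.

    Assumed layout: window[0] is the input bit being consumed at this step,
    window[-1] is the previously emitted state ("=" before any state exists).
    """
    x_t, prev = window[0], window[-1]
    if prev == "1":                      # first split: previous parity state
        return "0" if x_t == "1" else "1"  # second split: current input bit
    # prev is "0" or the "=" separator, both treated as the start state 0
    return "1" if x_t == "1" else "0"


def generate_parity(bits):
    """Auto-regressively emit one parity-state token per input bit."""
    n = len(bits)
    seq = list(bits) + ["="]
    for _ in range(n):
        # Only the last n + 1 tokens are visible to the tree, so the input
        # bit for this step always sits at the start of the window.
        window = seq[-(n + 1):]
        seq.append(tree_next_token(window))
    return seq


if __name__ == "__main__":
    out = generate_parity("1101")
    print("".join(out))  # the final token is the parity of the input bits
```

Each generated token depends only on two positions in the window, so the tree stays at depth 2 regardless of input length; the intermediate state tokens carry the automaton's state from one step to the next.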

* Accepted to NeurIPS 2024
