Title: On the Computational Power of Decoder-Only Transformer Language Models

May 30, 2023

Jesse Roberts

Abstract: This article presents a theoretical evaluation of the computational universality of decoder-only transformer models. We extend the theoretical literature on transformer models and show that decoder-only transformer architectures (even with only a single layer and single attention head) are Turing complete under reasonable assumptions. From the theoretical analysis, we show sparsity/compressibility of the word embedding to be a necessary condition for Turing completeness to hold.
