Title: Quantum linear algebra is all you need for Transformer architectures

Feb 26, 2024

Naixu Guo, Zhan Yu, Aman Agrawal, Patrick Rebentrost

Abstract: Generative machine learning methods such as large language models are revolutionizing the creation of text and images. While these models are powerful, they also consume a large amount of computational resources. The transformer is a key component in large language models that aims to generate a suitable completion of a given partial sequence. In this work, we investigate transformer architectures under the lens of fault-tolerant quantum computing. The input model is one where pre-trained weight matrices are given as block encodings to construct the query, key, and value matrices for the transformer. As a first step, we show how to prepare a block encoding of the self-attention matrix, with a row-wise application of the softmax function using the Hadamard product. In addition, we combine quantum subroutines to construct important building blocks of the transformer: the residual connection, layer normalization, and the feed-forward neural network. Our subroutines prepare an amplitude encoding of the transformer output, which can be measured to obtain a prediction. We discuss the potential and challenges of obtaining a quantum advantage.

* 26 pages, 2 figures, 2 tables, comments are welcome
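
For orientation, the building blocks named in the abstract (query, key, and value matrices, row-wise softmax self-attention, residual connection, layer normalization, feed-forward network) can be sketched classically. This is only a plain-Python reference for the computation the quantum subroutines aim to encode, not the paper's quantum algorithm. The function names, the single attention head, the ReLU activation, the 1/sqrt(d) scaling, and the layer normalization without learned scale or shift are all assumptions made for illustration.

```python
import math

def matmul(a, b):
    # Plain matrix product of two lists of rows.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def add(a, b):
    # Element-wise sum, used for the residual connections.
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def softmax_rows(a):
    # Row-wise softmax; subtracting the row maximum avoids overflow.
    out = []
    for row in a:
        m = max(row)
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def layer_norm(a, eps=1e-5):
    # Normalize each row to zero mean and (roughly) unit variance.
    # Assumption: no learned scale or shift parameters.
    out = []
    for row in a:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out

def self_attention(x, w_q, w_k, w_v):
    # Query, key, and value matrices built from the pre-trained weights.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d = len(q[0])
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(q, transpose(k))]
    attn = softmax_rows(scores)  # the self-attention matrix
    return attn, matmul(attn, v)

def transformer_block(x, w_q, w_k, w_v, w_1, w_2):
    # Attention, then residual connection and layer normalization.
    _, attended = self_attention(x, w_q, w_k, w_v)
    h = layer_norm(add(x, attended))
    # Two-layer feed-forward network (ReLU is an assumption), then
    # a second residual connection and layer normalization.
    hidden = [[max(0.0, v) for v in row] for row in matmul(h, w_1)]
    return layer_norm(add(h, matmul(hidden, w_2)))

if __name__ == "__main__":
    x = [[1.0, 0.0], [0.0, 1.0]]    # two tokens, embedding dimension 2
    eye = [[1.0, 0.0], [0.0, 1.0]]  # toy weights, purely illustrative
    attn, _ = self_attention(x, eye, eye, eye)
    print(attn)
    print(transformer_block(x, eye, eye, eye, eye, eye))
```

In this classical picture each row of the attention matrix sums to one, which is the normalization the row-wise softmax in the abstract refers to. In the quantum setting the final output would be held as an amplitude encoding and read out by measurement rather than printed as a matrix.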
