arXiv 2401.06104
Transformers are Multi-State RNNs
By Matanel Oren, Michael Hassid, et al.
Published 2024-01-11
Wiki summary
Transformers are considered conceptually different from the previous generation of state-of-the-art NLP models: recurrent neural networks (RNNs). In this work, we demonstrate that decoder-only transformers can in fact be conceptualized as unbounded multi-state RNNs, an RNN variant with unlimited hidden state size. We further show that transformers can be converted into multi-state RNNs by fixing the size of their…
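To make the framing above concrete, here is a minimal sketch of single-head autoregressive attention written as a recurrence: the stored key and value vectors play the role of the hidden state, which gains one entry per token unless it is capped. Everything named here (`attend`, `msrnn_step`, `run`, `max_states`) is hypothetical and chosen for illustration. The drop-oldest rule used to cap the state is an assumption of this sketch, not a policy taken from the summary, and learned projections, multiple heads, and layers are omitted.

```python
import math


def attend(query, keys, values):
    """Softmax attention of one query vector over the stored states."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


def msrnn_step(state, query, key, value, max_states=None):
    """One recurrent step: (state, current token) -> (output, new state).

    `state` is a (keys, values) pair of lists, one entry per retained token.
    max_states=None leaves the state unbounded. An integer caps it by
    dropping the oldest entries (illustrative assumption, not the paper's rule).
    """
    keys, values = state
    keys = keys + [key]        # the state grows by one entry per token
    values = values + [value]
    if max_states is not None and len(keys) > max_states:
        keys, values = keys[-max_states:], values[-max_states:]
    output = attend(query, keys, values)
    return output, (keys, values)


def run(tokens, max_states=None):
    """Feed (query, key, value) triples through the recurrence in order."""
    state = ([], [])
    outputs = []
    for query, key, value in tokens:
        output, state = msrnn_step(state, query, key, value, max_states)
        outputs.append(output)
    return outputs, state
```

Calling `run(tokens)` leaves a state with one entry per token, while `run(tokens, max_states=2)` never keeps more than two. The two calls give identical outputs whenever the cap is at least the sequence length, which is the sense in which the unbounded case can be read as a limiting case of the bounded one in this sketch.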