Reformer: The Efficient Transformer
Transformer models have been used in a variety of fields and
yield great results on many NLP tasks. But between BERT,
GPT-3, and the many other variants, they can be inefficient and
hard to apply. I will introduce a new, efficient variant
of the Transformer called the Reformer. I’ll take you through the code
that implements it and show how it runs at high efficiency,
addressing the main problems of high memory use and low performance
on long sequences that previously limited the use of some Transformers.
I will finish with the new applications that Reformer opens up.
Lukasz joined Google in 2013 and is currently a Research
Scientist in the Google Brain Team in Mountain View, where he
works on fundamental aspects of deep learning and natural
language processing. He has co-designed state-of-the-art
neural models for machine translation, parsing and other
algorithmic and generative tasks and co-authored the TensorFlow
system, the Tensor2Tensor and Trax libraries and the Transformer
model. Before joining Google, Lukasz was a tenured researcher
at the University Paris Diderot, where he worked on logic and
automata theory. He received his PhD from RWTH Aachen
University in 2008 and his MSc from the University of Wroclaw, Poland.