

Live Demo in Demonstration: Demonstrations 1

Training Transformers Together

Alexander Borzunov · Max Ryabinin · Tim Dettmers · Quentin Lhoest · Lucile Saulnier · Michael Diskin · Yacine Jernite · Thomas Wolf


Abstract:

We invite volunteers to train a large Transformer language model over the Internet. Instead of using supercomputers, we will pool together all available computational resources: desktops, laptops, servers, and even cloud TPUs from around the world. All training artifacts, such as model checkpoints and optimizer states, will be shared online for public use.

For this demonstration, we will provide an open-source starter kit that volunteers can use to join the global distributed training run and to host similar experiments independently in the future.