

Poster in Affinity Workshop: Black in AI

DynamicViT: Faster Vision Transformer

Amanuel Mersha · Samuel Assefa

Keywords: [ Computer Vision ]


Abstract:

The recent deep learning breakthroughs in language and vision tasks can largely be attributed to large-scale transformers. Unfortunately, their massive size and high compute requirements have limited their use in resource-constrained environments. Dynamic neural networks promise to reduce compute requirements by adjusting the computational path based on the input. We propose a layer-skipping dynamic vision transformer (DynamicViT) that skips layers for each sample based on decisions made by a reinforcement learning agent. Extensive experiments on CIFAR-10 and CIFAR-100 showed that this dynamic ViT achieved an average speedup of 40%, evaluated across batch sizes ranging from 1 to 1024.
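
To make the idea of per-sample layer skipping concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The names `make_residual_layer`, `forward_with_skipping`, and `threshold_policy` are hypothetical, the "layers" are toy residual maps rather than transformer blocks, and the hand-written threshold rule is only a stand-in for the decisions that, in the abstract, come from a reinforcement learning agent.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]
# A policy returns True if layer `layer_index` should run for this sample.
Policy = Callable[[Vector, int], bool]


def make_residual_layer(scale: float) -> Layer:
    """Toy stand-in for a transformer block, in residual form: x + scale * x."""
    def layer(x: Vector) -> Vector:
        return [v + scale * v for v in x]
    return layer


def forward_with_skipping(
    x: Vector, layers: Sequence[Layer], policy: Policy
) -> Tuple[Vector, int]:
    """Run the layers in order, skipping any layer the policy rejects.

    A skipped layer acts as the identity, so no compute is spent on it.
    Returns the final activation and the number of layers actually executed.
    """
    executed = 0
    for i, layer in enumerate(layers):
        if policy(x, i):
            x = layer(x)
            executed += 1
    return x, executed


def threshold_policy(x: Vector, layer_index: int) -> bool:
    """Hypothetical rule (an assumption, not a trained agent): always run the
    first layer, then run later layers only if the mean absolute activation
    is above 1.0."""
    if layer_index == 0:
        return True
    mean_abs = sum(abs(v) for v in x) / len(x)
    return mean_abs > 1.0


layers = [make_residual_layer(0.5) for _ in range(4)]
small_out, small_executed = forward_with_skipping([0.1, 0.2], layers, threshold_policy)
large_out, large_executed = forward_with_skipping([2.0, 4.0], layers, threshold_policy)
```

In this toy setup the small-magnitude sample runs only the first layer while the large-magnitude sample runs all four, which illustrates how the amount of compute can vary from one input to the next. How the real agent is trained and what it observes are not specified in the abstract.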
