Poster · Third Workshop on Efficient Natural Language and Speech Processing (ENLSP-III): Towards the Future of Large Language Models and their Emerging Descendants
Fed-EE: Federating Heterogeneous ASR Models using Early-Exit Architectures
Mohamed Nabih Ali Mohamed Nawar · Alessio Brutti · Falavigna Daniele
Automatic speech recognition models require large amounts of speech recordings for training. However, collecting such data is often cumbersome and raises privacy concerns. Federated learning has been widely used as an effective decentralized technique that collaboratively learns a shared prediction model while keeping the data local on client devices. Unfortunately, client devices often feature limited computation and communication resources, leading to practical difficulties for large models. In addition, the heterogeneity that characterizes edge devices makes it impractical to federate a single model that fits all clients. Differently from the recent literature, where multiple models with different architectures are used, in this work we propose using early-exit models. This solution brings two benefits: a single model can serve a variety of devices, and federating the models is straightforward. Experiments on the public TED-LIUM 3 dataset show that our proposed approach is effective and can be combined with basic federated learning strategies. We also shed light on how to federate self-attention models for speech recognition, for which no established recipe exists in the literature.
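To make the idea concrete, the following is a minimal sketch (not the authors' code) of how a single early-exit model can serve heterogeneous clients and still federate straightforwardly: weak devices run and train only a prefix of the shared encoder, and the server averages each parameter only over the clients whose depth covers it. All names here (`EarlyExitEncoder`, `fedavg_prefixes`, `n_layers`, `exit_every`, the chosen dimensions) are hypothetical illustrations, and plain FedAvg over self-attention layers is assumed as the basic aggregation strategy.

```python
# Sketch of an early-exit encoder plus layer-wise FedAvg for clients
# that train prefixes of different depth. Hypothetical names/sizes.
import copy
import torch
import torch.nn as nn

class EarlyExitEncoder(nn.Module):
    """Stack of self-attention layers with a linear (CTC-style) head
    after every `exit_every` layers; a client with capacity for k
    layers runs and trains only the first k."""
    def __init__(self, dim=256, n_layers=12, exit_every=4, vocab=500):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.layers = nn.ModuleList(copy.deepcopy(layer) for _ in range(n_layers))
        # exit heads sit after layers exit_every, 2*exit_every, ...
        self.exits = nn.ModuleDict({
            str(i): nn.Linear(dim, vocab)
            for i in range(exit_every - 1, n_layers, exit_every)
        })

    def forward(self, x, depth=None):
        """Return logits from every exit reached within `depth` layers
        (all exits by default); early exits are cheaper, deep ones
        more accurate."""
        depth = depth or len(self.layers)
        outs = []
        for i, layer in enumerate(self.layers[:depth]):
            x = layer(x)
            if str(i) in self.exits:
                outs.append(self.exits[str(i)](x))
        return outs

def fedavg_prefixes(global_model, client_states, client_depths):
    """FedAvg where each parameter is averaged only over the clients
    whose prefix actually contains it, so one shared architecture
    federates across heterogeneous devices."""
    new_state = copy.deepcopy(global_model.state_dict())
    for name in new_state:
        if name.startswith(("layers.", "exits.")):
            j = int(name.split(".")[1])  # layer index owning this tensor
            owners = [s for s, d in zip(client_states, client_depths) if d > j]
        else:
            owners = list(client_states)
        if owners:
            new_state[name] = torch.stack([s[name] for s in owners]).mean(0)
    return new_state

# Example round: three clients of unequal capacity share one architecture.
global_model = EarlyExitEncoder()
states = [global_model.state_dict() for _ in range(3)]  # stand-ins for locally trained weights
global_model.load_state_dict(
    fedavg_prefixes(global_model, states, client_depths=[4, 8, 12]))
```

The point of the sketch is the second claimed benefit: because every client holds a prefix of the same network rather than a distinct architecture, aggregation reduces to ordinary parameter averaging restricted to the layers each client trained, with no cross-architecture mapping needed.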