Oral in Workshop: Decentralization and Trustworthy Machine Learning in Web3: Methodologies, Platforms, and Applications
Scalable Collaborative Learning via Representation Sharing
Frédéric Berdoz · Abhishek Singh · Martin Jaggi · Ramesh Raskar
Decentralized machine learning has become a central challenge for multi-party artificial intelligence. Existing algorithms usually rely on sharing model parameters to spread knowledge across users. This can raise several issues, particularly in terms of communication if the models are large. Additionally, participants in such frameworks cannot freely choose their model architectures, since the architectures must match in order to collaborate.

In this work, we present a novel approach for decentralized machine learning, where the clients collaborate via online knowledge distillation using a contrastive loss (contrastive w.r.t. the labels). The goal is to ensure that the participants learn similar features on similar classes without sharing either their input data or their model parameters. To do so, each client releases the averaged last-hidden-layer activations for each label to a central server that only acts as a relay (i.e., is not involved in the training or aggregation of the models). Then, the clients download these last-layer activations (feature representations) of the ensemble of users and distill their knowledge into their personal models using a contrastive objective.

For cross-device applications (i.e., small local datasets and limited computational capacity), this approach increases model utility compared to independent learning, is communication-efficient, and scales with the number of clients. We prove theoretically that our framework is well-posed, and we benchmark its performance against standard collaborative learning algorithms on various datasets using different model architectures.
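The abstract outlines the protocol at a high level; a minimal PyTorch-style sketch of the two client-side steps (per-label averaging of last-hidden-layer activations, and label-contrastive distillation against the downloaded representations) might look as follows. This is not the authors' code: the `model.features` interface, the prototype-based contrastive formulation, and the `temperature` value are all assumptions made for illustration.

```python
# Hypothetical sketch of representation sharing and contrastive distillation.
# Assumes model.features(x) returns the last hidden-layer activations.
import torch
import torch.nn.functional as F

def class_averaged_representations(model, loader, num_classes, device="cpu"):
    """Average the last-hidden-layer activations per label on one client.

    The resulting [num_classes, feat_dim] tensor is what the client would
    upload to the relay server (no raw data, no model parameters)."""
    model.eval()
    sums, counts = None, torch.zeros(num_classes, device=device)
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            feats = model.features(x)  # assumed: last hidden layer output
            if sums is None:
                sums = torch.zeros(num_classes, feats.size(1), device=device)
            sums.index_add_(0, y, feats)
            counts.index_add_(0, y, torch.ones_like(y, dtype=torch.float))
    return sums / counts.clamp(min=1).unsqueeze(1)

def contrastive_distillation_loss(local_feats, labels, shared_protos, temperature=0.5):
    """Label-contrastive objective against the downloaded ensemble representations:
    pull local features toward the shared representation of the same label,
    push them away from representations of other labels."""
    local_feats = F.normalize(local_feats, dim=1)
    shared_protos = F.normalize(shared_protos, dim=1)
    logits = local_feats @ shared_protos.t() / temperature  # [batch, num_classes]
    return F.cross_entropy(logits, labels)
```

In this reading, each round a client uploads one averaged representation per label, downloads the aggregated representations of the other participants, and adds `contrastive_distillation_loss` to its local training objective; communication then scales with the number of classes and the feature dimension rather than with the model size.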