

Poster in Workshop: 5th Workshop on Self-Supervised Learning: Theory and Practice

Hoop-MSSL: Multi-Task Self-supervised Representation Learning on Basketball Spatio-Temporal Data

Xing Wang · Jianchong Shao · Chunyang Huang · Zitian Tang · Miguel-Ángel Gómez · Zhang Shaoliang · Konstantinos Pelechrinis


Abstract:

Observing and identifying the on-court behaviors of basketball players, who engage in intricate spatial-temporal interactions with their teammates and opponents, have long been considered challenging tasks for machines. Early approaches relied on supervised learning to capture spatial-temporal information and role relationships between players. These frameworks depended on labeled data and could not generalize to other tasks. To address these limitations, some recent works have drawn inspiration from the field of autonomous driving to develop self-supervised learning frameworks for trajectory data. However, these frameworks mainly focus on single tasks such as trajectory reconstruction or prediction and do not take into account domain knowledge from basketball. In this work, we propose Hoop-MSSL, a multi-task self-supervised representation learning framework designed to handle the complex interactions and dependencies in spatial-temporal data on the basketball court. Specifically, Hoop-MSSL integrates masking augmentation with three pre-training tasks, (i) motion reconstruction, (ii) player-role identification, and (iii) contrastive learning, to capture spatial-temporal features and role relationships across multiple dimensions. To evaluate the efficacy of Hoop-MSSL, we conducted extensive linear-probing experiments on three downstream tasks. Our results demonstrate that the synergistic interaction among all of the Hoop-MSSL components helps the model learn more general spatial-temporal representations, allowing it to achieve better performance on all downstream tasks than when using only subsets of the components. Finally, a high masking ratio (80%) further and significantly enhances the model's ability to learn useful representations.
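The abstract describes combining masking augmentation with three pre-training losses into one multi-task objective. The sketch below is a hypothetical toy illustration of that structure, not the authors' implementation: the tensor shapes, the identity "encoder", the role labels, the loss weights, and the temperature are all invented for demonstration, and the three terms are generic stand-ins (masked MSE, cross-entropy, InfoNCE-style contrastive loss).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup (shapes are illustrative, not from the paper):
# a trajectory of T timesteps for P players, each with (x, y) court coordinates.
T, P, D = 50, 10, 2
traj = rng.normal(size=(T, P, D))

# Masking augmentation: hide a high fraction (80%) of timesteps,
# matching the masking ratio the abstract reports as most effective.
mask_ratio = 0.8
mask = rng.random(T) < mask_ratio          # True = masked-out timestep

# Stand-in "reconstruction": identity plus noise, so the losses below are
# computable; a real model would decode from a learned encoder.
recon = traj + 0.1 * rng.normal(size=traj.shape)

# (i) Motion-reconstruction loss: MSE on the masked timesteps only.
loss_recon = float(np.mean((recon[mask] - traj[mask]) ** 2))

# (ii) Player-role identification loss: cross-entropy of (toy) role logits
# against one of R role labels per player.
R = 5
roles = rng.integers(0, R, size=P)
logits = rng.normal(size=(P, R))
logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
loss_role = float(-np.mean(logp[np.arange(P), roles]))

# (iii) Contrastive (InfoNCE-style) loss between two augmented "views"
# of each player's embedding; matching pairs sit on the diagonal.
E = 8
z1 = rng.normal(size=(P, E))
z1 /= np.linalg.norm(z1, axis=1, keepdims=True)
z2 = z1 + 0.05 * rng.normal(size=(P, E))
z2 /= np.linalg.norm(z2, axis=1, keepdims=True)
sim = z1 @ z2.T / 0.1                       # temperature 0.1 (illustrative)
logp_c = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
loss_con = float(-np.mean(np.diag(logp_c)))

# Multi-task objective: weighted sum of the three pre-training losses
# (equal weights here purely for illustration).
total_loss = loss_recon + loss_role + loss_con
print(total_loss)
```

In a real framework each term would share a trunk encoder, so gradients from all three tasks shape one representation; only that shared-objective structure is taken from the abstract.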
