Poster in Workshop: 5th Workshop on Self-Supervised Learning: Theory and Practice

Informed Augmentation Selection Improves Tabular Contrastive Learning

Arash Khoeini · Shuman Peng · Martin Ester


Abstract:

While contrastive learning (CL) has demonstrated success on image data, its application to tabular data remains relatively unexplored. The effectiveness of CL heavily depends on data augmentations, yet it is unclear how well existing tabular augmentation techniques suit contrastive learning. In this study, we assess the compatibility of various tabular augmentation techniques with CL by examining their impact on feature space characteristics (namely, uniformity and alignment), which serve as proxies for downstream performance. Our investigation reveals that augmentations affect feature space quality, and that achieving a balance between uniformity and alignment is essential for good downstream performance. We then propose a novel framework for selecting augmentation combinations that strike this balance. Experimental results on 21 tabular datasets from the OpenML-CC18 benchmark and on the TCGA cancer genomics dataset consistently demonstrate the effectiveness of our proposed framework in enhancing downstream performance.
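The uniformity and alignment proxies mentioned in the abstract are commonly computed following the standard definitions from the contrastive-learning literature: alignment is the mean squared distance between embeddings of positive pairs, and uniformity is the log of the mean Gaussian potential over all embedding pairs on the unit hypersphere. A minimal sketch of how these two quantities can be measured on a batch of embeddings (the function names and toy data below are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def alignment(z1, z2):
    """Mean squared distance between positive-pair embeddings (lower = better aligned)."""
    return float(np.mean(np.sum((z1 - z2) ** 2, axis=1)))

def uniformity(z, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs (lower = more uniform)."""
    sq = np.sum(z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * z @ z.T  # pairwise squared distances
    iu = np.triu_indices(len(z), k=1)               # distinct pairs only
    return float(np.log(np.mean(np.exp(-t * d2[iu]))))

# Toy data: unit-normalized embeddings and slightly perturbed "augmented views".
rng = np.random.default_rng(0)
z = rng.normal(size=(64, 8))
z /= np.linalg.norm(z, axis=1, keepdims=True)
z_pos = z + 0.05 * rng.normal(size=z.shape)
z_pos /= np.linalg.norm(z_pos, axis=1, keepdims=True)

print("alignment:", alignment(z, z_pos))
print("uniformity:", uniformity(z))
```

In a selection framework of the kind the abstract describes, candidate augmentation combinations could be scored by evaluating both quantities on held-out embeddings and preferring combinations that keep alignment low without collapsing uniformity.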
