Poster in Workshop: Table Representation Learning Workshop (TRL)
TabFlex: Scaling Tabular Learning to Millions with Linear Attention
Yuchen Zeng · Wonjun Kang · Andreas Mueller
Keywords: [ State-Space Models ] [ Linear Attention ] [ Transformer ] [ Scalability ] [ Tabular Classification ]
Abstract:
Recent advances in in-context learning (ICL) have demonstrated impressive performance for tabular classification, exemplified by TabPFN's success on small datasets. However, the quadratic complexity of the attention mechanism limits its applicability to larger datasets. To address this issue, we conduct a comprehensive comparison of popular scalable attention alternatives, including state-space models (SSMs) and linear attention mechanisms, and find that the inherent causality of SSMs hinders ICL performance on large datasets, while linear attention preserves effectiveness. Leveraging these insights, we introduce TabFlex, a model based on linear attention that supports thousands of features and hundreds of classes and can handle datasets with millions of samples. Extensive experiments demonstrate that TabFlex is significantly faster than most existing methods while achieving top-two performance on small datasets among 25 baselines, with a 2$\times$ speedup over TabPFN and a 1.5$\times$ speedup over XGBoost. On large datasets, TabFlex remains efficient (e.g., approximately 5 seconds on the `poker-hand` dataset, which consists of millions of samples), while achieving relatively solid performance.
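The scaling argument in the abstract can be illustrated with a minimal NumPy sketch of generic non-causal linear attention. This is not TabFlex's implementation: the `elu(x) + 1` feature map, the function names, and the single-head, unbatched shapes are all assumptions chosen for illustration. The sketch shows that reordering the matrix products avoids the `n × n` attention matrix, and that the result does not depend on the order of the context rows, which is the property a causal model lacks.

```python
import numpy as np

def feature_map(x):
    # Assumed feature map: elu(x) + 1, which keeps every feature strictly positive.
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def softmax_attention(Q, K, V):
    # Standard attention: builds an (n_query, n_context) matrix, quadratic in sequence length.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V):
    # Non-causal linear attention: summarize the context once, then apply it to every query.
    Qf, Kf = feature_map(Q), feature_map(K)
    KV = Kf.T @ V           # (d, d_v) summary, cost linear in the number of context rows
    Z = Kf.sum(axis=0)      # (d,) normalizer
    return (Qf @ KV) / (Qf @ Z)[:, None]

def linear_attention_quadratic(Q, K, V):
    # Same quantity computed the quadratic way, used only as a reference for checking.
    Qf, Kf = feature_map(Q), feature_map(K)
    A = Qf @ Kf.T
    return (A / A.sum(axis=-1, keepdims=True)) @ V

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Q = rng.normal(size=(5, 4))   # 5 query rows (e.g., test samples)
    K = rng.normal(size=(7, 4))   # 7 context rows (e.g., training samples)
    V = rng.normal(size=(7, 3))
    out = linear_attention(Q, K, V)
    perm = rng.permutation(7)
    print(np.allclose(out, linear_attention_quadratic(Q, K, V)))
    print(np.allclose(out, linear_attention(Q, K[perm], V[perm])))
```

Because `KV` and `Z` are sums over context rows, shuffling the training samples leaves the output unchanged, which suits tabular data where rows have no natural order.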