Poster in Workshop: Table Representation Learning Workshop (TRL)
Sparsely Connected Layers for Financial Tabular Data
Mohammed Abdulrahman · Yin Wang · Hui Chen
Keywords: [ Deep Learning ] [ Neural Networks ] [ XGBoost ] [ Tabular Data ] [ Finance ] [ Decision Trees ]
While neural networks are the norm for unstructured data such as images and text, on tabular data their performance often lags behind more traditional machine learning models such as gradient-boosted trees, as evidenced by academic studies, industrial practice, and Kaggle competitions. In particular, there is no easy way to increase the depth of neural networks applied to tabular data, even though depth has been a key factor in their success on unstructured data. Deeper fully connected networks suffer from the vanishing gradient problem, and convolutional layers and transformers are not directly applicable to tabular data in general. Special constructs such as skip connections and attention have been adapted to tabular models with limited success. In this paper, we show that on consumer financial tabular data, while standard two-layer neural networks fall behind gradient-boosted trees as usual, sparsely connected layers make it possible to increase network depth and reliably outperform gradient-boosted trees. The superior performance appears to come from the ability of the sparse layers to reduce correlations in the input data, a common problem in high-dimensional tabular data. We are therefore hopeful that the method is applicable to other domains with similar challenges.
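To make the idea concrete, below is a minimal PyTorch sketch of one plausible realization of a sparsely connected layer: a linear layer whose weight matrix is multiplied by a fixed random binary mask, so each output unit is wired to only a small subset of inputs. The abstract does not specify the paper's exact construction, so the masking scheme, the `density` hyperparameter, and the `SparseLinear` / `make_deep_sparse_net` names are all illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class SparseLinear(nn.Module):
    """Linear layer constrained by a fixed binary mask (hypothetical
    construction; the abstract does not detail the actual scheme)."""

    def __init__(self, in_features: int, out_features: int, density: float = 0.1):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # Fixed random connectivity pattern, sampled once at construction;
        # roughly `density` fraction of the connections are kept.
        mask = (torch.rand(out_features, in_features) < density).float()
        self.register_buffer("mask", mask)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Masked weights: pruned connections contribute nothing to the
        # output and receive no gradient through the zeroed entries.
        return nn.functional.linear(
            x, self.linear.weight * self.mask, self.linear.bias
        )


def make_deep_sparse_net(in_features: int, hidden: int, depth: int,
                         density: float = 0.1) -> nn.Sequential:
    """Stack several sparse layers; the small effective fan-in is one way
    depth could grow without the optimization problems of dense stacks."""
    layers: list[nn.Module] = [SparseLinear(in_features, hidden, density), nn.ReLU()]
    for _ in range(depth - 1):
        layers += [SparseLinear(hidden, hidden, density), nn.ReLU()]
    layers.append(nn.Linear(hidden, 1))  # e.g. a single-logit risk head
    return nn.Sequential(*layers)


if __name__ == "__main__":
    net = make_deep_sparse_net(in_features=200, hidden=256, depth=8)
    x = torch.randn(32, 200)  # a batch of 32 tabular rows
    print(net(x).shape)       # torch.Size([32, 1])
```

One intuition for why such masking can help with correlated tabular features: because each unit sees only a random subset of columns, redundant (highly correlated) inputs are less likely to co-occur in the same unit's receptive field, which may act as an implicit decorrelating regularizer. Whether this matches the mechanism the authors identify is, again, only a reading of the abstract.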