

Poster

Towards Heterogeneous Long-tailed Learning: Benchmarking, Metrics, and Toolbox

Haohui Wang · Weijie Guan · Chen Jianpeng · Zi Wang · Dawei Zhou

Thu 12 Dec 4:30 p.m. PST — 7:30 p.m. PST

Abstract:

Long-tailed data distributions pose challenges across a variety of domains such as e-commerce, finance, biomedical science, and cyber security, where the performance of machine learning models is often dominated by head categories while tail categories are inadequately learned. This work aims to provide a systematic view of long-tailed learning from three pivotal angles: (A1) the characterization of data long-tailedness, (A2) the data complexity of various domains, and (A3) the heterogeneity of emerging tasks. We develop HeroLT, a comprehensive long-tailed learning benchmark integrating 16 state-of-the-art algorithms, 6 evaluation metrics, and 16 real-world datasets across 5 tasks from 3 domains. With its novel angles and extensive experiments (313 in total), HeroLT enables effective and fair evaluation of newly proposed methods against existing baselines on varying dataset types. Finally, we conclude by highlighting significant applications of long-tailed learning and identifying several promising future directions. For accessibility and reproducibility, we open-source our benchmark HeroLT and corresponding results at https://anonymous.4open.science/r/HeroLT-9746/.
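To make angle (A1) concrete, the sketch below illustrates two generic statistics commonly used to characterize how long-tailed a label distribution is: the imbalance factor (largest class size divided by smallest class size) and the fraction of samples covered by the head classes. This is a minimal, self-contained illustration; the function name, the 20% head cutoff, and the toy labels are assumptions for demonstration and do not reflect HeroLT's specific metrics or API.

```python
from collections import Counter

def long_tailedness_stats(labels):
    """Characterize how long-tailed a label distribution is.

    Returns the imbalance factor (largest / smallest class size) and the
    fraction of samples covered by the top 20% of classes ("head" classes).
    These are generic statistics, not HeroLT's specific evaluation metrics.
    """
    counts = sorted(Counter(labels).values(), reverse=True)
    imbalance_factor = counts[0] / counts[-1]
    n_head = max(1, len(counts) // 5)          # top 20% of classes as the head
    head_coverage = sum(counts[:n_head]) / sum(counts)
    return imbalance_factor, head_coverage

# Toy example: a few head classes dominate, many tail classes are rare.
labels = ["a"] * 500 + ["b"] * 300 + ["c"] * 150 + ["d"] * 30 + ["e"] * 15 + ["f"] * 5
imf, cov = long_tailedness_stats(labels)
print(f"imbalance factor = {imf:.1f}, head classes cover {cov:.0%} of samples")
```

A large imbalance factor together with high head coverage indicates a strongly long-tailed distribution, the regime in which head categories dominate model performance as described in the abstract.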
