Oral in Workshop: AIM-FM: Advancements In Medical Foundation Models: Explainability, Robustness, Security, and Beyond
M3H: Multimodal Multitask Machine Learning for Healthcare
Dimitris Bertsimas · Yu Ma
Recent breakthroughs in AI are poised to fundamentally enhance how we study and understand healthcare. Developing an integrated many-to-many framework that leverages multimodal data for multiple tasks is essential to unifying modern medicine. We introduce M3H, an explainable Multimodal Multitask Machine Learning for Healthcare framework that consolidates learning from tabular, time-series, language, and vision data for supervised binary/multiclass classification, regression, and unsupervised clustering. M3H encompasses an unprecedented range of medical tasks and problem domains, and it consistently outperforms traditional single-task models by an average of 11.6% across 40 disease diagnoses from 16 medical departments, three hospital operation forecasts, and one patient phenotyping task. It offers explainability through a proposed TIM score, which sheds light on the dynamics of learning interdependencies among tasks in the output space. The framework's modular design ensures generalizable data processing, task definition, and rapid model prototyping, applicable to both clinical and operational healthcare settings. In particular, the model features a novel lightweight attention mechanism that balances self-exploitation (learning the source task) with cross-exploration (learning across tasks), ensuring learning quality without overburdening computational resources. Its adaptable architecture supports easy customization and integration of new data modalities and tasks, establishing it as a robust, scalable solution for advancing AI-driven healthcare systems.
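The abstract does not specify how the self-exploitation/cross-exploration balance is implemented; the sketch below is a hypothetical illustration only, not the authors' code. It shows one plausible lightweight design in PyTorch: each task embedding attends to the other tasks' embeddings, and a learnable per-task gate mixes the task's own representation (self-exploitation) with the cross-task context (cross-exploration). All names and shapes (LightweightTaskAttention, num_tasks, dim, the sigmoid gate) are assumptions.

```python
# Hypothetical sketch, not the M3H implementation: one way to balance
# self-exploitation vs. cross-exploration across task embeddings.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LightweightTaskAttention(nn.Module):
    """Mixes each task's own embedding with attention-pooled
    embeddings from the other tasks via a learnable per-task gate."""

    def __init__(self, num_tasks: int, dim: int):
        super().__init__()
        self.query = nn.Linear(dim, dim, bias=False)
        self.key = nn.Linear(dim, dim, bias=False)
        # One scalar gate per task: sigmoid(gate) weights self vs. cross.
        self.gate = nn.Parameter(torch.zeros(num_tasks))
        self.scale = dim ** -0.5

    def forward(self, task_emb: torch.Tensor) -> torch.Tensor:
        # task_emb: (batch, num_tasks, dim)
        q = self.query(task_emb)
        k = self.key(task_emb)
        scores = q @ k.transpose(-2, -1) * self.scale   # (batch, T, T)
        # Mask the diagonal so cross-attention sees only *other* tasks.
        eye = torch.eye(task_emb.size(1), dtype=torch.bool,
                        device=task_emb.device)
        scores = scores.masked_fill(eye, float("-inf"))
        cross = F.softmax(scores, dim=-1) @ task_emb    # cross-task context
        alpha = torch.sigmoid(self.gate).view(1, -1, 1)
        # alpha -> self-exploitation weight; 1 - alpha -> cross-exploration.
        return alpha * task_emb + (1 - alpha) * cross

# Usage: 4 task heads sharing 64-dimensional embeddings.
x = torch.randn(8, 4, 64)
mixed = LightweightTaskAttention(num_tasks=4, dim=64)(x)
print(mixed.shape)  # torch.Size([8, 4, 64])
```

The mechanism is "lightweight" here in the sense that attention operates over task embeddings rather than full token sequences, so its cost grows with the number of tasks, not the input length; whether M3H makes the same trade-off is an assumption.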