Poster in Workshop: Statistical Frontiers in LLMs and Foundation Models
A teacher-teacher framework for clinical language representation learning
Feiqing Huang · Shenghan Zhang · Sara Sweet · Tianxi Cai
Keywords: [ privacy preserving ] [ clinical LLM ] [ teacher-teacher learning ]
In recent years, ready-to-use large language models (LLMs), both general-purpose and domain-specific, have proliferated across a wide range of applications. Rather than advocating the development of a new model or the continued pretraining of an existing one, this paper introduces a pragmatic teacher-teacher framework that facilitates mutual learning between two pre-existing models. Leveraging two teacher models with complementary knowledge, we introduce a LIghtweight kNowledge alignmEnt (LINE) module designed to harmonize their knowledge within a unified representation space. This framework is particularly valuable in clinical settings, where strict regulations govern the use of real-life clinical notes. The LINE module enables privacy-preserving data to faithfully represent raw clinical notes. Validation and downstream tasks demonstrate the efficacy of the proposed framework.
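To make the alignment idea concrete, below is a minimal, illustrative sketch (not the authors' LINE module) of harmonizing two fixed teachers' embeddings into one space. It assumes the simplest possible setting: paired embeddings of the same notes from two frozen teachers of equal dimension, aligned by a closed-form orthogonal Procrustes map; the variable names and the synthetic data are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 16  # hypothetical: 200 paired notes, 16-dim teacher embeddings

# Teacher B's embeddings of the notes (synthetic stand-in).
Y = rng.normal(size=(n, d))
# Teacher A encodes the same content, here modeled as a fixed
# orthogonal rotation of Y plus small noise.
R_true, _ = np.linalg.qr(rng.normal(size=(d, d)))
X = Y @ R_true.T + 0.01 * rng.normal(size=(n, d))

# Lightweight alignment: orthogonal Procrustes, W = U V^T from the
# SVD of X^T Y, which minimizes ||X W - Y||_F over orthogonal W.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt
aligned = X @ W

err_before = np.linalg.norm(X - Y) / np.linalg.norm(Y)
err_after = np.linalg.norm(aligned - Y) / np.linalg.norm(Y)
print(f"relative error before: {err_before:.3f}, after: {err_after:.3f}")
```

In practice a learned module (e.g., a small trainable projector optimized with an alignment loss) would replace the closed-form map, and the two teachers need not share an embedding dimension; this sketch only illustrates the goal of mapping one teacher's representations into the other's space.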