Poster in Workshop: Generative AI and Biology (GenBio@NeurIPS2023)
Fine-tuned protein language models capture T cell receptor stochasticity
Lewis Cornwall · Grisha Szep · James Day · S R Gokul Krishnan · David Carter · Jamie Blundell · Lilly Wollman · Neil Dalchau · Aaron Sim
Keywords: [ transfer learning ] [ protein language models ] [ T cell receptors ] [ Fine-tuning ]
The combinatorial explosion of T cell receptor (TCR) sequences enables our immune systems to recognise and respond to an enormous diversity of pathogens. Modelling the highly stochastic TCR generation and selection processes at both sequence and repertoire levels is important for disease detection and advancing therapeutic research. Here we demonstrate that protein language models fine-tuned on TCR sequences capture TCR statistics in hypervariable regions to which mechanistic models are blind, and show that amino acids exhibit strong dependencies on each other within chains but not across chains. Our approach generates representations that improve the prediction of TCR binding specificities.
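
To make the notion of dependence between amino acid positions concrete, one simple and generic measure is the mutual information between two positions across a set of equal-length sequences. The sketch below is illustrative only and is not the method used in this work; the function name `mutual_information` and the toy sequences are hypothetical, chosen so that two positions are perfectly coupled and one is constant.

```python
from collections import Counter
from math import log2


def mutual_information(seqs, i, j):
    """Mutual information (in bits) between positions i and j.

    Assumes all sequences have the same length (e.g. already aligned).
    """
    n = len(seqs)
    pair_counts = Counter((s[i], s[j]) for s in seqs)
    i_counts = Counter(s[i] for s in seqs)
    j_counts = Counter(s[j] for s in seqs)
    mi = 0.0
    for (a, b), c in pair_counts.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * log2(p_ab * n * n / (i_counts[a] * j_counts[b]))
    return mi


# Hypothetical toy sequences, not real TCR data.
toy = ["CASSL", "CASSL", "CGTTF", "CGTTF"]

coupled = mutual_information(toy, 1, 2)   # positions that always change together
constant = mutual_information(toy, 0, 1)  # position 0 never varies
```

Here `coupled` is 1 bit, because knowing the residue at one position fully determines the other, while `constant` is 0, because a position that never varies carries no information about any other. Estimates of this kind on real repertoires would need many more sequences and a correction for finite-sample bias.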