Poster in Workshop: NeurIPS 2024 Workshop: Machine Learning and the Physical Sciences
Explainable Deep Learning Framework for SERS Bio-quantification
Jihan K. Zaki · Jakub Tomasik · Sabine Bahn · Jade McCune · Pietro Lió · Oren Scherman
Surface-enhanced Raman spectroscopy (SERS) is rapidly gaining attention as a potential fast and inexpensive method of biomarker quantification, which can be combined with deep learning methodology to elucidate complex biomarker-disease relationships. Current standard practices in SERS analysis lag substantially behind state-of-the-art machine learning approaches; however, the present challenges of SERS analysis could be effectively addressed with a robust computational framework. Furthermore, there is a particular need for improved model explainability in SERS analysis, which at present is insufficient for assessing the contexts in which confounding factors affect prediction outcomes. This study presents a novel framework for SERS bio-quantification built on a three-step process: spectral processing, analyte quantification, and model explainability. To this end, serotonin quantification in a urine medium was assessed as a model task, with 682 SERS spectra measured in a micromolar range using cucurbit[8]uril chemical spacers. A denoising autoencoder was used for spectral enhancement, and convolutional neural networks (CNNs) and vision transformers were used for biomarker quantification. In addition, a novel context representative interpretable model explanations (CRIME) method was developed to suit the current needs of SERS mixture analysis explainability. Serotonin quantification was most efficient in denoised spectra analysed using a CNN with a three-parameter logistic output layer (validation: mean absolute error (MAE) = 0.24 μM, mean percentage error (MPE) = 15.00%; test: MAE = 0.15 μM, MPE = 4.67%). Subsequently, the CRIME method revealed that the CNN model presented six unique prediction contexts, of which three were associated with serotonin.
The proposed framework could unlock a novel, untargeted, hypothesis-generating method of biomarker discovery, given the rapid and inexpensive nature of SERS measurements and the potential to identify biomarkers from CRIME contexts, which should be validated in a clinical setting.
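To make the reported quantities concrete, the following is a minimal sketch of a three-parameter logistic output mapping and the two error metrics named in the abstract. The abstract does not give the exact parameterization or metric definitions, so everything here is an assumption: the form `upper / (1 + exp(-slope * (z - midpoint)))`, the reading of MPE as the mean of absolute relative errors in percent, and all function names, parameter names, and numeric values are hypothetical and not taken from the authors' implementation.

```python
import math


def three_param_logistic(z, upper, slope, midpoint):
    """Map a raw network output z to a bounded concentration.

    Assumed form: upper / (1 + exp(-slope * (z - midpoint))).
    The output always lies strictly between 0 and `upper`.
    """
    return upper / (1.0 + math.exp(-slope * (z - midpoint)))


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference, in the units of y_true (e.g. micromolar)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_percentage_error(y_true, y_pred):
    """Assumed definition: mean absolute relative error, in percent.

    Requires non-zero reference concentrations.
    """
    total = sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Hypothetical reference concentrations (micromolar) and raw model outputs.
y_true = [1.0, 2.0, 4.0]
raw_outputs = [-1.0, 0.0, 1.5]

# Hypothetical logistic parameters; in a trained model these would be learned.
y_pred = [
    three_param_logistic(z, upper=5.0, slope=1.0, midpoint=0.5)
    for z in raw_outputs
]

mae = mean_absolute_error(y_true, y_pred)
mpe = mean_percentage_error(y_true, y_pred)
print(f"MAE = {mae:.3f} uM, MPE = {mpe:.2f}%")
```

A bounded output of this kind keeps predictions inside a physically plausible concentration range, which may be one reason to prefer it over an unbounded linear output in a quantification task.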