

Poster in Workshop: AI for Accelerated Materials Design (AI4Mat-2023)

High throughput decomposition of spectra

Dumitru Mirauta · Vladimir Gusev

Keywords: [ spectroscopic data ] [ optimal transport ] [ unmixing ] [ basis variation ] [ decomposition ]


Abstract:

To fully utilise the potential throughput of automated synthesis and characterisation, data analysis must have matching throughput; currently it consumes excessive (human) expert time even for small datasets. One such analysis task is unmixing: separating, from a sample consisting of multiple components, the individual patterns characteristic of the constituent parts. Such tasks are often complicated by variation of the basis patterns (e.g. peak shifting and broadening in PXRD). Conventional approaches focus on fitting a parameterised subset of transformations or utilising phase space relationships, so an approach tuned for PXRD may require extensive modification or retraining before being suitable for another modality. This work aims to build a more robust foundation for unmixing that is not specific to a particular spectral modality. A more robust optimisation can be achieved through a more robust cost, and distance/comparison is a vital component of such costs. We construct a non-regressive, distance-geometry-based framework, in this presentation leveraging Optimal Transport (OT) with a Euclidean ground cost, but lending itself to modification through the use of different distances. This provides a non-parametric approach that allows for arbitrary variation. We show through numerical experiments that our approach can handle fully blind basis discovery despite independent random peak shifting/broadening at various intensities, where matrix factorisation frameworks break down. We also showcase its use in smaller data regimes through a laboratory discovery mockup, where our method can flag compositions containing an unknown trace component.
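
To illustrate why a transport-based distance can be more tolerant of peak shifting than a pointwise comparison, the sketch below compares a single synthetic peak against shifted copies of itself. It is not the authors' framework: the helper names (`gaussian_peak`, `wasserstein_1d`), the grid, and the peak parameters are all assumptions chosen for illustration. It relies only on the standard fact that, for unit-mass signals on a 1D axis with ground cost |x - y|, the OT cost equals the integrated absolute difference of their cumulative sums.

```python
import numpy as np


def gaussian_peak(x, centre, width):
    """Hypothetical helper: a Gaussian peak on grid x, normalised to unit mass."""
    y = np.exp(-0.5 * ((x - centre) / width) ** 2)
    return y / y.sum()


def wasserstein_1d(a, b, x):
    """1D OT cost (ground cost |x - y|) between unit-mass signals a and b
    sampled on the same uniform grid x, computed from their cumulative sums."""
    dx = x[1] - x[0]
    return np.sum(np.abs(np.cumsum(a) - np.cumsum(b))) * dx


# Synthetic spectral axis and reference peak (illustrative values only).
x = np.linspace(0.0, 100.0, 2001)
reference = gaussian_peak(x, 40.0, 1.0)

shifts = [2.0, 10.0, 30.0]
ot_costs = []
l2_costs = []
for s in shifts:
    shifted = gaussian_peak(x, 40.0 + s, 1.0)
    ot_costs.append(wasserstein_1d(reference, shifted, x))
    l2_costs.append(np.linalg.norm(reference - shifted))

for s, ot, l2 in zip(shifts, ot_costs, l2_costs):
    print(f"shift={s:5.1f}  OT cost={ot:8.4f}  L2 distance={l2:8.4f}")
```

In this toy setting the OT cost keeps growing with the size of the shift, whereas the pointwise L2 distance stops changing once the two peaks no longer overlap, so it cannot distinguish a moderate shift from a large one. Unmixing itself (recovering several basis patterns from mixtures) is a harder problem that this sketch does not attempt.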
