

Poster in Workshop: NeurIPS 2024 Workshop: Machine Learning and the Physical Sciences

Transforming Simulation to Data Without Pairing

Eli Gendreau-Distler · Luc Le Pottier · Haichen Wang


Abstract: We explore a generative machine learning-based approach for estimating multi-dimensional probability density functions (PDFs) in a target sample using a statistically independent but related control sample, a common challenge in particle physics data analysis. The generative model must accurately reproduce individual observable distributions while preserving the correlations between them, based on the input multidimensional distribution from the control sample. Here we present a conditional normalizing flow model ($\mathcal{CNF}$) based on a chain of bijectors which learns to transform unpaired simulation events to data events. We assess the performance of the $\mathcal{CNF}$ model in the context of LHC Higgs to diphoton analysis, where we use the $\mathcal{CNF}$ model to convert a Monte Carlo diphoton sample to one that models data. We show that the $\mathcal{CNF}$ model can accurately model complex data distributions and correlations. We also leverage the recently popularized Modified Differential Multiplier Method (MDMM) to improve the convergence of our model and assign physical meaning to usually arbitrary loss-function parameters.
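
The abstract does not give the form of the MDMM-constrained loss, so the following is only a minimal sketch of the general technique on a toy problem, not the authors' implementation. It assumes the standard damped-Lagrangian formulation: minimize f subject to g = 0 by gradient descent on the parameters and gradient ascent on a multiplier, using f + lam * g + (damping / 2) * g^2. The function name `mdmm_toy`, the toy objective, and all hyperparameter values are hypothetical choices made for illustration.

```python
def mdmm_toy(steps=2000, lr=0.05, damping=1.0):
    """Toy MDMM run (illustrative only).

    Minimizes f(x, y) = x^2 + y^2 subject to g(x, y) = x + y - 1 = 0
    using the damped Lagrangian
        L = f + lam * g + (damping / 2) * g^2,
    with gradient descent on (x, y) and gradient ascent on lam.
    """
    x, y, lam = 0.0, 0.0, 0.0
    for _ in range(steps):
        g = x + y - 1.0                      # constraint violation
        # dL/dx and dL/dy; dg/dx = dg/dy = 1 for this constraint
        grad_x = 2.0 * x + lam + damping * g
        grad_y = 2.0 * y + lam + damping * g
        x -= lr * grad_x                     # descent on the parameters
        y -= lr * grad_y
        lam += lr * g                        # ascent on the multiplier
    return x, y, lam


if __name__ == "__main__":
    x, y, lam = mdmm_toy()
    print(f"x={x:.4f}, y={y:.4f}, lambda={lam:.4f}, g={x + y - 1.0:.2e}")
```

In this formulation the constraint target (here, x + y = 1) is specified directly and the multiplier is learned, rather than being a hand-tuned loss weight. In a flow-training setting the constrained quantity would be some term of the loss, but which term the authors constrain is not stated in the abstract.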
