SymbolFit: Automatic Parametric Modeling with Symbolic Regression
Main authors: | , , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | We introduce SymbolFit, a framework that automates parametric modeling by
using symbolic regression to perform a machine-search for functions that fit
the data, while simultaneously providing uncertainty estimates in a single run.
Traditionally, constructing a parametric model to accurately describe binned
data has been a manual and iterative process, requiring an adequate functional
form to be determined before the fit can be performed. The main challenge
arises when the appropriate functional forms cannot be derived from first
principles, especially when there is no underlying true closed-form function
for the distribution. In this work, we address this problem by utilizing
symbolic regression, a machine learning technique that explores a vast space of
candidate functions without needing a predefined functional form, treating the
functional form itself as a trainable parameter. Our approach is demonstrated
in data analysis applications in high-energy physics experiments at the CERN
Large Hadron Collider (LHC). We demonstrate its effectiveness and efficiency
using five real proton-proton collision datasets from new physics searches at
the LHC, namely the background modeling in resonance searches for high-mass
dijet, trijet, paired-dijet, diphoton, and dimuon events. We also validate the
framework using several toy datasets with one or more variables. |
---|---|
DOI: | 10.48550/arxiv.2411.09851 |
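
The idea in the abstract of treating the functional form itself as something to be searched for can be illustrated with a deliberately small sketch. It is not the SymbolFit API and not the authors' method: real symbolic regression typically evolves expression trees, whereas this example only enumerates a fixed, hypothetical set of candidate forms `a * g(x)`, fits the single scale `a` to binned data by weighted least squares, and reports its uncertainty. All names (`PRIMITIVES`, `candidate_forms`, `fit_scale`, `search`) and the toy data are assumptions made for illustration.

```python
import math
from itertools import combinations

# Hypothetical building blocks for this sketch; a real symbolic-regression
# engine would use a richer operator set and grow expression trees.
PRIMITIVES = {
    "x": lambda x: x,
    "x**2": lambda x: x * x,
    "exp(-x)": lambda x: math.exp(-x),
    "1/x": lambda x: 1.0 / x,
    "log(x)": lambda x: math.log(x),
}


def candidate_forms():
    """Yield (name, g) for every primitive and every pairwise product."""
    for name, g in PRIMITIVES.items():
        yield name, g
    for (n1, g1), (n2, g2) in combinations(PRIMITIVES.items(), 2):
        # Default arguments bind g1 and g2 at definition time.
        yield f"({n1})*({n2})", (lambda x, g1=g1, g2=g2: g1(x) * g2(x))


def fit_scale(g, xs, ys, sigmas):
    """Weighted least squares for y = a * g(x).

    Returns (a, sigma_a, chi2), where sigma_a is the standard
    uncertainty on a implied by the per-bin uncertainties.
    """
    sgg = sum(g(x) ** 2 / s ** 2 for x, s in zip(xs, sigmas))
    sgy = sum(g(x) * y / s ** 2 for x, y, s in zip(xs, ys, sigmas))
    a = sgy / sgg
    chi2 = sum(((y - a * g(x)) / s) ** 2 for x, y, s in zip(xs, ys, sigmas))
    return a, 1.0 / math.sqrt(sgg), chi2


def search(xs, ys, sigmas):
    """Fit every candidate form and return the one with the lowest chi2."""
    results = []
    for name, g in candidate_forms():
        a, sigma_a, chi2 = fit_scale(g, xs, ys, sigmas)
        results.append((chi2, name, a, sigma_a))
    return min(results)


# Toy binned data: bin centers, noise-free contents following 3 * exp(-x),
# and a constant per-bin uncertainty (all invented for this example).
xs = [0.5 + i for i in range(10)]
ys = [3.0 * math.exp(-x) for x in xs]
sigmas = [0.05] * len(xs)

chi2, name, a, sigma_a = search(xs, ys, sigmas)
print(f"best form: {a:.3f} * {name}  (sigma_a = {sigma_a:.3f}, chi2 = {chi2:.3g})")
```

The brute-force enumeration stands in for the search step only; it does not scale beyond a handful of forms, which is the gap that a genuine symbolic-regression search over expression space is meant to close.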