Performance of the Distributed Central Analysis in BaBar

Bibliographic details
Published in: IEEE Transactions on Nuclear Science, 2006-10, Vol. 53 (5), pp. 2876-2880
Main authors: Khan, A., Mommsen, R.K., Gradl, W., Fritsch, M., Petzold, A., Roethel, W., Smith, D.A.
Format: Article
Language: English
Description
Summary: The total dataset produced by the BaBar experiment at the Stanford Linear Accelerator Center (SLAC) currently comprises roughly 3 × 10^9 data events and an equal number of simulated events, corresponding to 23 Tbytes of real data and 51 Tbytes of simulated events. Since individual analyses typically select a very small fraction of all events, it would be extremely inefficient if each analysis had to process the full dataset. A first, centrally managed analysis step is therefore a common pre-selection ('skimming') of all data according to very loose, inclusive criteria to facilitate data access for later analysis. Usually, several analyses share common selection criteria. However, these criteria may change over time, e.g., when new analyses are developed. Currently, O(100) such pre-selection streams ('skims') are defined. In order to provide timely access to newly created or modified skims, it is necessary to process the complete dataset several times a year. Additionally, newly taken or simulated data have to be skimmed as they become available. The system currently deployed for skim production uses 1800 CPUs distributed over three production sites. It was possible to process the complete dataset within about 3.5 months. We report on the stability and the performance of the system.
ISSN: 0018-9499, 1558-1578
DOI: 10.1109/TNS.2006.881737
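
The figures quoted in the summary allow a rough back-of-envelope estimate of the skimming throughput. The sketch below is not taken from the article. It assumes that "the complete dataset" means real plus simulated events, that all 1800 CPUs were busy for the full 3.5 months, that a month has 30 days, and that 1 Tbyte is 10^12 bytes. The function name `skim_estimates` and the dictionary keys are hypothetical.

```python
# Figures quoted in the summary.
DATA_EVENTS = 3e9   # real data events (approximate)
SIM_EVENTS = 3e9    # simulated events ("an equal number")
DATA_TB = 23        # Tbytes of real data
SIM_TB = 51         # Tbytes of simulated events
CPUS = 1800         # CPUs across the three production sites
MONTHS = 3.5        # time to process the complete dataset

# Assumptions that are NOT stated in the summary.
SECONDS_PER_MONTH = 30 * 24 * 3600  # assume 30-day months
BYTES_PER_TB = 1e12                 # assume decimal Tbytes


def skim_estimates():
    """Return rough throughput and event-size estimates.

    Assumes every CPU is fully used for the whole period, so the
    CPU time per event is an upper bound on the real cost.
    """
    total_events = DATA_EVENTS + SIM_EVENTS
    wall_seconds = MONTHS * SECONDS_PER_MONTH
    aggregate_rate = total_events / wall_seconds  # events/s, all CPUs
    per_cpu_rate = aggregate_rate / CPUS          # events/s, one CPU
    return {
        "aggregate_events_per_s": aggregate_rate,
        "cpu_seconds_per_event": 1.0 / per_cpu_rate,
        "kb_per_data_event": DATA_TB * BYTES_PER_TB / DATA_EVENTS / 1e3,
        "kb_per_sim_event": SIM_TB * BYTES_PER_TB / SIM_EVENTS / 1e3,
    }


if __name__ == "__main__":
    for name, value in skim_estimates().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the system handles several hundred events per second in aggregate, at a cost of under three CPU-seconds per event. The real per-event cost is likely lower, since production systems rarely run at full utilization.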