Migrating INFN-T1 from CREAM-CE/LSF to HTCondor-CE/HTCondor

The INFN Tier-1 datacentre provides computing resources to several HEP and Astrophysics experiments. These are organized in Virtual Organizations submitting jobs to our computing facilities through Computing Elements, acting as Grid interfaces to the Local Resource Manager. We are phasing-out our cu...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:EPJ Web of Conferences 2020, Vol.245, p.3037
Hauptverfasser: Dal Pra, S, Fornari, F, Michelotto, D, Chierici, A
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The INFN Tier-1 datacentre provides computing resources to several HEP and Astrophysics experiments. These are organized in Virtual Organizations submitting jobs to our computing facilities through Computing Elements, acting as Grid interfaces to the Local Resource Manager. We are phasing-out our current LRMS (IBM/Platform LSF 9.1.3) and CEs (CREAM) set to adopt HTCondor as a replacement for LSF and HTCondor-CE in place of CREAM. A small instance has been set up to practice with the cluster management and evaluate the feasibility of our migration plans to a new LRMS and CE set. A second cluster instance has been setup to work on production. A number of management tools have been adapted or rewritten in order to integrate the new system with the existing infrastructure. Two different accounting solution for the HTCondor-CE have been implemented, and the more reliable one have been adopted. A python tool has been written to disentangle the management of HTCondor machines from our puppet instance, and to enable a quicker configuration of the cluster nodes. The monitoring tools tied to the old system are being adapted to also work on the new one. Finally, the most relevant setup steps have been documented in a public wiki page and a support mailing has been created to help other INFN sites willing to migrate their LRMS and CE to HTCondor. This document reports about our experience with HTCondor-CE on top of HTCondor and the integration of this system into our infrastructure.
ISSN:2100-014X
2101-6275
2100-014X
DOI:10.1051/epjconf/202024503037