Re-Architecting Buffer Management in Lossless Ethernet

Bibliographic details
Published in: IEEE/ACM Transactions on Networking, 2024-12, Vol. 32 (6), p. 4749-4764
Main authors: Huang, Hanlin; Du, Xinle; Li, Tong; Wang, Haiyang; Xu, Ke; Wang, Mowei; Dai, Huichen
Format: Article
Language: English
Description
Summary: Converged Ethernet employs Priority-based Flow Control (PFC) to provide a lossless network. However, problems caused by PFC, including victim flows, congestion spreading, and deadlock, impede its large-scale deployment in production systems. Fine-grained experimental observations of switch buffer occupancy show that the root cause of these performance problems is a mismatch in sending rates between end-to-end congestion control and hop-by-hop flow control. Resolving this mismatch requires the switch to provide additional buffer, which the classic dynamic threshold (DT) policy in current shared-buffer commercial switches does not support. In this paper, we propose Selective-PFC (SPFC), a practical buffer management scheme that handles this mismatch. Specifically, SPFC incrementally modifies DT by proactively detecting port traffic and adjusting buffer allocation accordingly, so that PFC PAUSE frames are triggered selectively. Extensive case studies demonstrate that SPFC can reduce the number of PFC PAUSEs on non-bursty ports by up to 69.0% and the average flow completion time of large victim flows by up to 83.5%.
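
For readers unfamiliar with the DT policy named in the summary, here is a minimal sketch of the classic dynamic threshold rule, in which every queue shares one threshold equal to a factor alpha times the currently unused shared buffer. It is not SPFC, whose mechanism this record does not describe. The function names, the one-queue-per-port model, and the numbers are illustrative assumptions.

```python
# Sketch of the classic dynamic threshold (DT) rule, NOT the paper's SPFC.
# Assumptions: one ingress queue per port, buffer measured in abstract cells,
# and a PAUSE is triggered as soon as a queue reaches the shared threshold.

def dt_threshold(alpha, total_buffer, queue_lengths):
    """Return alpha times the unused part of the shared buffer."""
    free = total_buffer - sum(queue_lengths)
    return alpha * max(free, 0)

def ports_to_pause(alpha, total_buffer, occupancy):
    """Return the ports whose occupancy has reached the DT threshold."""
    threshold = dt_threshold(alpha, total_buffer, occupancy.values())
    return [port for port, q in occupancy.items() if q >= threshold]

# Two lightly loaded ports: threshold is 0.5 * (1000 - 200) = 400, no PAUSE.
calm = ports_to_pause(0.5, 1000, {"p0": 100, "p1": 100})

# A burst on p2 shrinks the free buffer to 200 cells, so the threshold
# drops to 100 and the non-bursty ports p0 and p1 are paused as well.
burst = ports_to_pause(0.5, 1000, {"p0": 100, "p1": 100, "p2": 600})

print(calm)
print(burst)
```

In this toy setting a burst on one port lowers the threshold for every port, which gives a rough picture of how non-bursty ports can end up receiving PAUSE frames under plain DT.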
ISSN: 1063-6692, 1558-2566
DOI: 10.1109/TNET.2024.3430989