Experiences with Active Per-Flow Queuing for Traffic Manager in High Performance Routers


Bibliographic Details
Main authors: Jindou Fan, Chengchen Hu, Bin Liu
Format: Conference proceeding
Language: English
Description
Summary: Per-flow queuing is believed to be an effective approach to guaranteeing Quality of Service (QoS) in high performance routers. However, its brute-force implementation consumes a huge amount of memory and does not scale as the number of flows increases. The Dynamic Queue Sharing (DQS) mechanism, in which a physical queue is dynamically created on demand when a new flow arrives and released when the flow temporarily pauses, can achieve per-flow queuing performance with much less memory. In this paper, an active per-flow queuing system based on DQS is designed, implemented and tested. To evaluate the effectiveness of DQS, we implement two FPGA-based Traffic Manager (TM) prototypes, one with DQS and one with the traditional scheme. The chip implementation shows that DQS not only scales down the memory required for per-flow queuing but also reduces the total number of control logic elements. As a result of the reduced control logic, the original 3-stage scheduling of the naive scheme can be reduced to a single stage while maintaining the same delay performance, which gives the design the potential to run at a higher speed. In addition, power consumption is considerably reduced. Our experiments on a 4 Gbps TM prototype using a Stratix EP1S80F1508C5 FPGA show a 58.6% decrease in control memory. Meanwhile, the logic cells and LC registers are reduced by 6.8% and 15.0% respectively, and power consumption is reduced by 23% compared with the brute-force per-flow queuing implementation with 8K queues.
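The allocate-on-arrival, release-on-pause idea described in the summary can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' FPGA design: the class name `DynamicQueueSharing`, its methods, and the policy of dropping a packet when no physical queue is free are all hypothetical choices made for illustration, and a flow is treated as "paused" as soon as its queue becomes empty.

```python
from collections import deque


class DynamicQueueSharing:
    """Toy model: a small pool of physical queues shared by many flows.

    Hypothetical illustration only; names and policies are assumptions.
    """

    def __init__(self, num_physical_queues):
        # Ids of physical queues that are currently unused.
        self.free = deque(range(num_physical_queues))
        # Mapping from active flow id to its physical queue id.
        self.flow_to_queue = {}
        self.queues = [deque() for _ in range(num_physical_queues)]

    def enqueue(self, flow_id, packet):
        """Store a packet, allocating a queue on demand. Returns False if dropped."""
        qid = self.flow_to_queue.get(flow_id)
        if qid is None:
            if not self.free:
                return False  # assumption: drop when the pool is exhausted
            qid = self.free.popleft()
            self.flow_to_queue[flow_id] = qid
        self.queues[qid].append(packet)
        return True

    def dequeue(self, flow_id):
        """Remove the oldest packet of a flow; release its queue once empty."""
        qid = self.flow_to_queue.get(flow_id)
        if qid is None:
            return None  # flow currently holds no physical queue
        packet = self.queues[qid].popleft()
        if not self.queues[qid]:
            # Flow has paused (assumption: empty queue means paused).
            del self.flow_to_queue[flow_id]
            self.free.append(qid)
        return packet

    def active_flows(self):
        return len(self.flow_to_queue)
```

In this model, two physical queues can serve three flows as long as no more than two of them have packets buffered at the same time; a third concurrent flow is refused until one of the others drains and its queue returns to the free pool.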
ISSN: 1550-3607, 1938-1883
DOI: 10.1109/ICC.2010.5502404