Fast Fail-Over Technique for Distributed Controller Architecture in Software-Defined Networks

Bibliographic Details
Published in: IEEE Access 2019, Vol. 7, p. 160718-160737
Main authors: Akanbi, Oluwatobi A., Aljaedi, Amer, Zhou, Xiaobo, Alharbi, Adel R.
Format: Article
Language: English
Online access: Full text
Description
Summary: Recent studies have proposed different approaches to mitigate the risk of overload and failure in Software-Defined Networks (SDNs). Some of these approaches have proven effective, but only in specific use cases, which makes them difficult to generalize. Furthermore, network failure detection and recovery by the SDN control plane requires sophisticated software logic running on multiple controllers that are non-intrusive to the network environment. While this allows more flexibility to respond to failure events, it also implies that each controller application must include its own recovery logic, which increases code complexity. In this paper, we propose a fast fail-over technique for solving the problem of a controller failure or target availability in the network. We argue that inter-domain controller synchronization can result in high network overhead and should be minimized to a per-need basis only. To this end, upon detecting a failure in the control plane, the proposed fast failure recovery technique leverages a load-shifting scheme to initialize alternate paths and proactively instantiate flow rules to reduce flow setup latency. To prevent packet loss during failure recovery, we utilize a forwarding information table that quickly replays inputs to the controller after failure recovery. Our extensive experiments show that the average latency incurred by controller-to-controller communication is approximately twice that of per-need-based synchronization. The experimental results also show that, compared with the traditional SDN baseline approach, our proposed technique achieved a 50% reduction in the service interruption period and a 75% reduction in flow_mod messages during a single link failure.
ISSN: 2169-3536
DOI:10.1109/ACCESS.2019.2951598
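
The summary above mentions a forwarding information table that replays inputs to the controller after failure recovery. As a rough illustration only, and not the authors' implementation, the Python sketch below buffers events while a controller is marked as down and replays them in arrival order once it recovers. The class name ReplayTable, the (switch_id, packet_summary) event shape, and the capacity value are all hypothetical choices made for this example.

```python
from collections import deque
from typing import Callable, Deque, Tuple

# Hypothetical event shape: (switch_id, packet_summary).
Event = Tuple[str, str]


class ReplayTable:
    """Illustrative buffer that holds events while the controller is down."""

    def __init__(self, capacity: int = 1024) -> None:
        # A bounded deque: once full, the oldest buffered event is dropped.
        self._pending: Deque[Event] = deque(maxlen=capacity)
        self.controller_up = True

    def fail(self) -> None:
        """Mark the controller as failed; later events are buffered."""
        self.controller_up = False

    def on_event(self, event: Event, deliver: Callable[[Event], None]) -> None:
        """Deliver immediately if the controller is up, otherwise buffer."""
        if self.controller_up:
            deliver(event)
        else:
            self._pending.append(event)

    def recover(self, deliver: Callable[[Event], None]) -> int:
        """Mark the controller as up and replay buffered events in order.

        Returns the number of events replayed.
        """
        self.controller_up = True
        replayed = 0
        while self._pending:
            deliver(self._pending.popleft())
            replayed += 1
        return replayed


if __name__ == "__main__":
    seen = []
    table = ReplayTable(capacity=8)
    table.fail()
    table.on_event(("s1", "pkt-a"), seen.append)
    table.on_event(("s2", "pkt-b"), seen.append)
    count = table.recover(seen.append)
    print(count, seen)
```

The bounded deque is a design assumption: it caps memory use during long outages at the cost of dropping the oldest buffered events, which may or may not match the behavior described in the paper.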