PDD: Partitioning DAG-Topology DNNs for Streaming Tasks

Bibliographic details
Published in: IEEE Internet of Things Journal, 2024-03, Vol. 11 (6), p. 9258-9268
Main authors: Wu, Liantao, Gao, Guoliang, Yu, Jing, Zhou, Fangtong, Yang, Yang, Wang, Tengfei
Format: Article
Language: English
Description
Summary: To enable the inference of high-precision deep neural networks (DNNs) on resource-constrained devices, DNN offloading has been widely explored in recent years. Some works have also integrated chain-topology DNN (CDNN) offloading with pipeline processing to further reduce inference delay when processing streaming tasks. To improve the accuracy of inference results, DNN topologies have tended to evolve from chain topology to directed acyclic graph (DAG) topology. However, most existing works do not study partitioning and offloading DAG-topology DNNs (DDNNs) for streaming tasks. Moreover, when partitioning computationally expensive DNN models, multipartitioning is likely to outperform bi-partitioning, yet most works do not study multipartitioning of DDNNs. In this article, we propose a more general multipartitioning and offloading method for large-scale DDNNs that process streaming tasks, which adaptively partitions a DDNN into multiple parts according to the computing power and bandwidth of all available computing units. Specifically, we first present a transforming method based on topological sorting that losslessly transforms DDNNs into CDNNs. Then, based on greedy and dichotomy ideas, a multipartitioning algorithm is designed to partition and offload CDNNs. In this way, the DDNN multipartitioning problem can be solved with the proposed transforming and partitioning algorithms. Experiments (code available at https://github.com/sreasearcher/PDD-Code) show that the method proposed in this article significantly outperforms bi-partitioning and nonpartitioning methods when offloading computationally expensive DNN models.
ISSN: 2327-4662
DOI: 10.1109/JIOT.2023.3323520
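
Illustration (not part of the record): the summary names two ideas, flattening a DAG into a chain by topological sorting and splitting the chain with greedy and dichotomy steps, without giving details. The sketch below is one possible reading and is not the authors' algorithm. It assumes Kahn's algorithm for the ordering, reads "dichotomy" as bisection on the slowest pipeline stage time, fixes the order of the computing units, and ignores bandwidth entirely. All function names, the `costs` list (work per layer) and the `speeds` list (work per second per unit) are hypothetical.

```python
from collections import deque


def topo_chain(nodes, edges):
    """Return the nodes in one valid topological order (Kahn's algorithm)."""
    succ = {n: [] for n in nodes}
    indeg = {n: 0 for n in nodes}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    ready = deque(n for n in nodes if indeg[n] == 0)
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    if len(order) != len(nodes):
        raise ValueError("graph has a cycle")
    return order


def greedy_fill(costs, speeds, limit):
    """Give each unit, in order, as many consecutive layers as fit in `limit`.

    Returns the end index of each unit's segment, or None if some layers
    are left over. Assumption: transfer time between units is ignored.
    """
    cuts, i = [], 0
    for speed in speeds:
        budget = limit * speed  # work this unit can finish within `limit`
        used = 0.0
        while i < len(costs) and used + costs[i] <= budget:
            used += costs[i]
            i += 1
        cuts.append(i)
    return cuts if i == len(costs) else None


def partition_chain(costs, speeds, tol=1e-6):
    """Bisect on the slowest-stage time; return (cuts, stage_time)."""
    lo = 0.0
    hi = 2.0 * sum(costs) / min(speeds)  # generous bound, always feasible
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if greedy_fill(costs, speeds, mid) is None:
            lo = mid  # too tight: some layers do not fit
        else:
            hi = mid  # feasible: try a tighter stage time
    return greedy_fill(costs, speeds, hi), hi
```

For a four-layer diamond graph (`a` feeds `b` and `c`, both feed `d`) with costs 4, 2, 2, 4 in chain order and two equal units, this sketch cuts the chain after its second layer, so each unit carries 6 units of work. A fuller treatment would add the transfer time of every tensor crossing a cut, which this sketch leaves out.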