PDD: Partitioning DAG-Topology DNNs for Streaming Tasks
Published in: | IEEE Internet of Things Journal 2024-03, Vol.11 (6), p.9258-9268 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | To enable the inference of high-precision deep neural networks (DNNs) on resource-constrained devices, DNN offloading has been widely explored in recent years. Some works have also integrated chain-topology DNN (CDNN) offloading with pipeline processing to further reduce inference delay when processing streaming tasks. To improve the accuracy of inference results, DNN topologies tend to evolve from chain topology to directed acyclic graph (DAG) topology. However, most existing works do not study partitioning and offloading DAG-topology DNNs (DDNNs) for streaming tasks. Moreover, when partitioning computationally expensive DNN models, multipartitioning is likely to outperform bi-partitioning, yet most works do not study multipartitioning of DDNNs. In this article, we propose a more general multipartitioning and offloading method for large-scale DDNNs that process streaming tasks. The method adaptively partitions a DDNN into multiple parts, taking into account the computing power and bandwidth of all available computing units. Specifically, we first present a transforming method based on topological sorting that can losslessly transform DDNNs into CDNNs. Then, based on greedy and dichotomy ideas, we design a multipartitioning algorithm to partition and offload CDNNs. Together, the proposed transforming and partitioning algorithms solve the multipartitioning problem for DDNNs. Experiments (code: https://github.com/sreasearcher/PDD-Code) show that the proposed method significantly outperforms bi-partitioning and nonpartitioning methods when offloading computationally expensive DNN models. |
---|---|
ISSN: | 2327-4662 |
DOI: | 10.1109/JIOT.2023.3323520 |
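
The two steps named in the abstract (topological sorting to turn a DAG into a chain, then a greedy and dichotomy-based split of the chain) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes integer per-layer compute costs, identical computing units, and no bandwidth or transfer cost, all of which the actual method takes into account. The function names `topological_chain`, `greedy_split`, and `partition_chain` are hypothetical and do not come from the PDD-Code repository.

```python
from collections import deque


def topological_chain(edges, num_layers):
    """Return one topological order of layers 0..num_layers-1 (Kahn's algorithm)."""
    successors = [[] for _ in range(num_layers)]
    indegree = [0] * num_layers
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1
    ready = deque(i for i in range(num_layers) if indegree[i] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != num_layers:
        raise ValueError("graph contains a cycle")
    return order


def greedy_split(costs, limit):
    """Greedily pack consecutive chain positions into parts whose cost sum is <= limit."""
    parts, current, total = [], [], 0
    for idx, cost in enumerate(costs):
        if current and total + cost > limit:
            parts.append(current)
            current, total = [], 0
        current.append(idx)
        total += cost
    if current:
        parts.append(current)
    return parts


def partition_chain(costs, num_units):
    """Split a chain into at most num_units contiguous parts, minimizing the
    cost of the heaviest part (the pipeline bottleneck under the assumptions above)."""
    low, high = max(costs), sum(costs)
    while low < high:  # dichotomy (bisection) on the bottleneck value
        mid = (low + high) // 2
        if len(greedy_split(costs, mid)) <= num_units:
            high = mid
        else:
            low = mid + 1
    return greedy_split(costs, low)


# Hypothetical 5-layer DAG with one branch: 0 -> {1, 2} -> 3 -> 4
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
layer_cost = [4, 2, 3, 5, 1]  # assumed integer compute cost per layer

order = topological_chain(edges, 5)
chain_costs = [layer_cost[layer] for layer in order]
parts = partition_chain(chain_costs, 2)
```

Here `parts` holds lists of positions in the chain, one list per computing unit. The sketch only fixes an execution order; a lossless transformation as described in the abstract would presumably also have to carry the outputs of branching layers along the chain, which is not modeled here.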