A Tightly Coupled AI-ISP Vision Processor

Bibliographic details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, 2025, p. 1-1
Main authors: Zhang, Hao; Li, Sicheng; Gui, Yupeng; Li, Zhiyong; Xu, Shusong; Lu, Yanheng; Niu, Dimin; Zheng, Hongzhong; Chen, Yen-Kuang; Xie, Yuan; Fan, Yibo
Format: Article
Language: English
Description
Summary: To achieve high-quality and high-resolution image processing, this work presents a novel vision processor that facilitates deep learning-enhanced image processing pipelines. At the system level, by identifying that a divide-and-conquer approach is essential to synergize both classical image processing and image enhancement networks, we develop a tightly coupled system with strip-tile conversion dataflow to enable fine-grained low-latency data interactions between image signal processors (ISPs) and the deep learning accelerator (DLA). At the architecture level, we design a comprehensive set of 21 efficient image processing modules to construct classical ISP pipelines, a tile-based strip layer fusion DLA specifically optimized for networks, and a programmable pixel pool that seamlessly supports the data access patterns of the ISP and the DLA. At the software and hardware co-design level, we propose a comprehensive optimization framework to address the implementation overhead of networks while maintaining the image quality. Finally, evaluations of the AI-ISP vision processor demonstrate 53.95% external memory access reduction and 35.51% latency reduction, delivering superior image quality with minimal on-chip memory overhead. A throughput of up to 168.5 frames per second facilitates efficient processing of ultra-high definition (UHD) resolution images.
ISSN: 1051-8215, 1558-2205
DOI: 10.1109/TCSVT.2024.3510939
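
The summary mentions a strip-tile conversion dataflow between the ISPs and the DLA but does not describe how it works. The following is only a minimal sketch of the general idea, assuming that an ISP stage emits horizontal strips of rows and that a DLA consumes fixed-width tiles cut from each strip. The function names (`strips`, `strip_to_tiles`, `tiles_to_strip`) and the sizes are hypothetical, and the sketch ignores tile overlap and on-chip buffering, which a real design would likely have to handle.

```python
from typing import Iterator, List

Rows = List[List[int]]  # a block of pixels stored as a list of rows


def strips(image: Rows, strip_h: int) -> Iterator[Rows]:
    """Yield horizontal strips of strip_h rows (the last one may be shorter)."""
    for top in range(0, len(image), strip_h):
        yield image[top:top + strip_h]


def strip_to_tiles(strip: Rows, tile_w: int) -> Iterator[Rows]:
    """Cut one strip into tiles that are tile_w pixels wide (assumed layout)."""
    width = len(strip[0])
    for left in range(0, width, tile_w):
        yield [row[left:left + tile_w] for row in strip]


def tiles_to_strip(tiles: List[Rows]) -> Rows:
    """Inverse conversion: stitch tiles back together, row by row."""
    n_rows = len(tiles[0])
    return [[px for tile in tiles for px in tile[r]] for r in range(n_rows)]


if __name__ == "__main__":
    # Toy 4x6 "image" whose pixel value encodes its raster position.
    image = [[r * 6 + c for c in range(6)] for r in range(4)]
    for strip in strips(image, strip_h=2):
        tiles = list(strip_to_tiles(strip, tile_w=3))
        # The round trip must reproduce the strip exactly.
        assert tiles_to_strip(tiles) == strip
        print(tiles)
```

In this toy layout each strip only needs to be held until its tiles have been consumed, which hints at why fine-grained conversion could avoid round trips through external memory; the actual mechanism is described in the full article, not in this record.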