OVERLAPPING CNN CACHE REUSE IN HIGH RESOLUTION AND STREAMING-BASED DEEP LEARNING INFERENCE ENGINES
A method optimizes Convolutional Neural Network (CNN) inference time for full resolution images. One or more processors divide a full resolution image into a plurality of partially overlapping sub-images. The processor(s) select, from the plurality of partially overlapping sub-images, a first sub-im...
Gespeichert in:
Hauptverfasser: | , , |
---|---|
Format: | Patent |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | A method optimizes Convolutional Neural Network (CNN) inference time for full resolution images. One or more processors divide a full resolution image into a plurality of partially overlapping sub-images. The processor(s) select, from the plurality of partially overlapping sub-images, a first sub-image and a second sub-image that overlap one another in an overlapping area. The processor(s) feed the first sub-image, including the overlapping area, into a Convolutional Neural Network (CNN) in order to create a first inference result for the first sub-image, where the CNN has been trained at a fine resolution. The processor(s) cache an inference result from the CNN for the overlapping area, and then utilize the cached inference result when inferring the second sub-image in the CNN. The processor(s) then identify a specific object in the full resolution image based on inferring the first sub-image and the second sub-image. |
---|