An efficient hardware implementation of CNN-based object trackers for real-time applications

The object tracking field continues to evolve as an important application of computer vision, and most object tracking applications require real-time performance. The recent introduction of convolutional neural network (CNN) techniques to the object tracking field enabled the attainm...

Detailed description

Bibliographic details
Published in: Neural computing & applications 2022-11, Vol.34 (22), p.19937-19952
Main authors: El-Shafie, Al-Hussein A., Zaki, Mohamed, Habib, S. E. D.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 19952
container_issue 22
container_start_page 19937
container_title Neural computing & applications
container_volume 34
creator El-Shafie, Al-Hussein A.
Zaki, Mohamed
Habib, S. E. D.
description The object tracking field continues to evolve as an important application of computer vision, and most object tracking applications require real-time performance. The recent introduction of convolutional neural network (CNN) techniques to the field has enabled significant performance gains. However, the heavy computational load of CNNs conflicts with the real-time requirements of object tracking. In this paper, we address these computational limitations on both the algorithm side and the circuit side. On the algorithm side, we adopt interpolation schemes that can significantly reduce the processing time and the memory storage requirements. We also evaluate approximations of the hardware-expensive computations to attain an efficient hardware design. Moreover, we modify the online training scheme to achieve a constant processing time across all video frames. On the circuit side, we develop a hardware accelerator for the online training stage. We avoid transposed reads from external memory to speed up data movement with no performance degradation. Our proposed hardware accelerator achieves 44 frames per second in training the fully connected layers.
doi_str_mv 10.1007/s00521-022-07538-1
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2726616677</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2726616677</sourcerecordid><originalsourceid>FETCH-LOGICAL-c314t-feb7ee8ba8fd678fd483360a78cafe6aa5d043b5a994a1e7a6d355ce70aec5903</originalsourceid><addsrcrecordid>eNp9kE1LxDAQhoMouK7-AU8Bz9FJ0yTd47L4BYte9CaEaTrRrtsPky7iv7duBW9eZmDmfWbgYexcwqUEsFcJQGdSQJYJsFoVQh6wmcyVEgp0cchmsMjHtcnVMTtJaQMAuSn0jL0sW04h1L6mduBvGKtPjMTrpt9SM45wqLuWd4GvHh5EiYkq3pUb8gMfIvp3iomHLvJIuBVD3RDHvt_Wfo-lU3YUcJvo7LfP2fPN9dPqTqwfb-9Xy7XwSuaDCFRaoqLEIlTGjiUvlDKAtvAYyCDqCnJValwscpRk0VRKa08WkLxegJqzi-luH7uPHaXBbbpdbMeXLrOZMdIYa8dUNqV87FKKFFwf6wbjl5Pgfiy6yaIbLbq9RSdHSE1QGsPtK8W_0_9Q3ycQdmA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2726616677</pqid></control><display><type>article</type><title>An efficient hardware implementation of CNN-based object trackers for real-time applications</title><source>SpringerLink Journals - AutoHoldings</source><creator>El-Shafie, Al-Hussein A. ; Zaki, Mohamed ; Habib, S. E. D.</creator><creatorcontrib>El-Shafie, Al-Hussein A. ; Zaki, Mohamed ; Habib, S. E. D.</creatorcontrib><description>The object tracking field continues to evolve as an important application of computer vision. Real-time performance is typically required in most applications of object tracking. The recent introduction of Convolutional Neural network (CNN) techniques to the object tracking field enabled the attainment of significant performance gains. However, the heavy computational load required for CNNs conflicts with the real-time requirements required for object tracking. In this paper, we address these computational limitations on the algorithm-side and the circuit-side. On the algorithm side, we adopt interpolation schemes which can significantly reduce the processing time and the memory storage requirements. 
We also evaluate the approximation of the hardware-expensive computations to attain an efficient hardware design. Moreover, we modify the online-training scheme in order to achieve a constant processing time across all video frames. On the circuit side, we developed a hardware accelerator of the online training stage. We avoid transposed reading from the external memory to speed-up the data movement with no performance degradation. Our proposed hardware accelerator achieves 44 frames-per-second in training the fully connected layers.</description><identifier>ISSN: 0941-0643</identifier><identifier>EISSN: 1433-3058</identifier><identifier>DOI: 10.1007/s00521-022-07538-1</identifier><language>eng</language><publisher>London: Springer London</publisher><subject>Algorithms ; Artificial Intelligence ; Artificial neural networks ; Candidates ; Circuits ; Computational Biology/Bioinformatics ; Computational Science and Engineering ; Computer Science ; Computer vision ; Data Mining and Knowledge Discovery ; Embedded systems ; Engineering ; Frames (data processing) ; Frames per second ; Hardware ; Image Processing and Computer Vision ; Interpolation ; Localization ; Neural networks ; Online instruction ; Original Article ; Performance degradation ; Probability and Statistics in Computer Science ; Real time ; Tracking ; Training</subject><ispartof>Neural computing &amp; applications, 2022-11, Vol.34 (22), p.19937-19952</ispartof><rights>The Author(s) 2022</rights><rights>The Author(s) 2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c314t-feb7ee8ba8fd678fd483360a78cafe6aa5d043b5a994a1e7a6d355ce70aec5903</cites><orcidid>0000-0002-6324-6740</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s00521-022-07538-1$$EPDF$$P50$$Gspringer$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s00521-022-07538-1$$EHTML$$P50$$Gspringer$$Hfree_for_read</linktohtml><link.rule.ids>314,776,780,27901,27902,41464,42533,51294</link.rule.ids></links><search><creatorcontrib>El-Shafie, Al-Hussein A.</creatorcontrib><creatorcontrib>Zaki, Mohamed</creatorcontrib><creatorcontrib>Habib, S. E. D.</creatorcontrib><title>An efficient hardware implementation of CNN-based object trackers for real-time applications</title><title>Neural computing &amp; applications</title><addtitle>Neural Comput &amp; Applic</addtitle><description>The object tracking field continues to evolve as an important application of computer vision. Real-time performance is typically required in most applications of object tracking. The recent introduction of Convolutional Neural network (CNN) techniques to the object tracking field enabled the attainment of significant performance gains. However, the heavy computational load required for CNNs conflicts with the real-time requirements required for object tracking. In this paper, we address these computational limitations on the algorithm-side and the circuit-side. On the algorithm side, we adopt interpolation schemes which can significantly reduce the processing time and the memory storage requirements. 
We also evaluate the approximation of the hardware-expensive computations to attain an efficient hardware design. Moreover, we modify the online-training scheme in order to achieve a constant processing time across all video frames. On the circuit side, we developed a hardware accelerator of the online training stage. We avoid transposed reading from the external memory to speed-up the data movement with no performance degradation. Our proposed hardware accelerator achieves 44 frames-per-second in training the fully connected layers.</description><subject>Algorithms</subject><subject>Artificial Intelligence</subject><subject>Artificial neural networks</subject><subject>Candidates</subject><subject>Circuits</subject><subject>Computational Biology/Bioinformatics</subject><subject>Computational Science and Engineering</subject><subject>Computer Science</subject><subject>Computer vision</subject><subject>Data Mining and Knowledge Discovery</subject><subject>Embedded systems</subject><subject>Engineering</subject><subject>Frames (data processing)</subject><subject>Frames per second</subject><subject>Hardware</subject><subject>Image Processing and Computer Vision</subject><subject>Interpolation</subject><subject>Localization</subject><subject>Neural networks</subject><subject>Online instruction</subject><subject>Original Article</subject><subject>Performance degradation</subject><subject>Probability and Statistics in Computer Science</subject><subject>Real 
time</subject><subject>Tracking</subject><subject>Training</subject><issn>0941-0643</issn><issn>1433-3058</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>C6C</sourceid><sourceid>BENPR</sourceid><recordid>eNp9kE1LxDAQhoMouK7-AU8Bz9FJ0yTd47L4BYte9CaEaTrRrtsPky7iv7duBW9eZmDmfWbgYexcwqUEsFcJQGdSQJYJsFoVQh6wmcyVEgp0cchmsMjHtcnVMTtJaQMAuSn0jL0sW04h1L6mduBvGKtPjMTrpt9SM45wqLuWd4GvHh5EiYkq3pUb8gMfIvp3iomHLvJIuBVD3RDHvt_Wfo-lU3YUcJvo7LfP2fPN9dPqTqwfb-9Xy7XwSuaDCFRaoqLEIlTGjiUvlDKAtvAYyCDqCnJValwscpRk0VRKa08WkLxegJqzi-luH7uPHaXBbbpdbMeXLrOZMdIYa8dUNqV87FKKFFwf6wbjl5Pgfiy6yaIbLbq9RSdHSE1QGsPtK8W_0_9Q3ycQdmA</recordid><startdate>20221101</startdate><enddate>20221101</enddate><creator>El-Shafie, Al-Hussein A.</creator><creator>Zaki, Mohamed</creator><creator>Habib, S. E. D.</creator><general>Springer London</general><general>Springer Nature B.V</general><scope>C6C</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>8FE</scope><scope>8FG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>P5Z</scope><scope>P62</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><orcidid>https://orcid.org/0000-0002-6324-6740</orcidid></search><sort><creationdate>20221101</creationdate><title>An efficient hardware implementation of CNN-based object trackers for real-time applications</title><author>El-Shafie, Al-Hussein A. ; Zaki, Mohamed ; Habib, S. E. 
D.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c314t-feb7ee8ba8fd678fd483360a78cafe6aa5d043b5a994a1e7a6d355ce70aec5903</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Algorithms</topic><topic>Artificial Intelligence</topic><topic>Artificial neural networks</topic><topic>Candidates</topic><topic>Circuits</topic><topic>Computational Biology/Bioinformatics</topic><topic>Computational Science and Engineering</topic><topic>Computer Science</topic><topic>Computer vision</topic><topic>Data Mining and Knowledge Discovery</topic><topic>Embedded systems</topic><topic>Engineering</topic><topic>Frames (data processing)</topic><topic>Frames per second</topic><topic>Hardware</topic><topic>Image Processing and Computer Vision</topic><topic>Interpolation</topic><topic>Localization</topic><topic>Neural networks</topic><topic>Online instruction</topic><topic>Original Article</topic><topic>Performance degradation</topic><topic>Probability and Statistics in Computer Science</topic><topic>Real time</topic><topic>Tracking</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>El-Shafie, Al-Hussein A.</creatorcontrib><creatorcontrib>Zaki, Mohamed</creatorcontrib><creatorcontrib>Habib, S. E. 
D.</creatorcontrib><collection>Springer Nature OA Free Journals</collection><collection>CrossRef</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><jtitle>Neural computing &amp; applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>El-Shafie, Al-Hussein A.</au><au>Zaki, Mohamed</au><au>Habib, S. E. D.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>An efficient hardware implementation of CNN-based object trackers for real-time applications</atitle><jtitle>Neural computing &amp; applications</jtitle><stitle>Neural Comput &amp; Applic</stitle><date>2022-11-01</date><risdate>2022</risdate><volume>34</volume><issue>22</issue><spage>19937</spage><epage>19952</epage><pages>19937-19952</pages><issn>0941-0643</issn><eissn>1433-3058</eissn><abstract>The object tracking field continues to evolve as an important application of computer vision. Real-time performance is typically required in most applications of object tracking. 
The recent introduction of Convolutional Neural network (CNN) techniques to the object tracking field enabled the attainment of significant performance gains. However, the heavy computational load required for CNNs conflicts with the real-time requirements required for object tracking. In this paper, we address these computational limitations on the algorithm-side and the circuit-side. On the algorithm side, we adopt interpolation schemes which can significantly reduce the processing time and the memory storage requirements. We also evaluate the approximation of the hardware-expensive computations to attain an efficient hardware design. Moreover, we modify the online-training scheme in order to achieve a constant processing time across all video frames. On the circuit side, we developed a hardware accelerator of the online training stage. We avoid transposed reading from the external memory to speed-up the data movement with no performance degradation. Our proposed hardware accelerator achieves 44 frames-per-second in training the fully connected layers.</abstract><cop>London</cop><pub>Springer London</pub><doi>10.1007/s00521-022-07538-1</doi><tpages>16</tpages><orcidid>https://orcid.org/0000-0002-6324-6740</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0941-0643
ispartof Neural computing & applications, 2022-11, Vol.34 (22), p.19937-19952
issn 0941-0643
1433-3058
language eng
recordid cdi_proquest_journals_2726616677
source SpringerLink Journals - AutoHoldings
subjects Algorithms
Artificial Intelligence
Artificial neural networks
Candidates
Circuits
Computational Biology/Bioinformatics
Computational Science and Engineering
Computer Science
Computer vision
Data Mining and Knowledge Discovery
Embedded systems
Engineering
Frames (data processing)
Frames per second
Hardware
Image Processing and Computer Vision
Interpolation
Localization
Neural networks
Online instruction
Original Article
Performance degradation
Probability and Statistics in Computer Science
Real time
Tracking
Training
title An efficient hardware implementation of CNN-based object trackers for real-time applications
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T19%3A59%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20efficient%20hardware%20implementation%20of%20CNN-based%20object%20trackers%20for%20real-time%20applications&rft.jtitle=Neural%20computing%20&%20applications&rft.au=El-Shafie,%20Al-Hussein%20A.&rft.date=2022-11-01&rft.volume=34&rft.issue=22&rft.spage=19937&rft.epage=19952&rft.pages=19937-19952&rft.issn=0941-0643&rft.eissn=1433-3058&rft_id=info:doi/10.1007/s00521-022-07538-1&rft_dat=%3Cproquest_cross%3E2726616677%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2726616677&rft_id=info:pmid/&rfr_iscdi=true