An efficient hardware implementation of CNN-based object trackers for real-time applications
The object tracking field continues to evolve as an important application of computer vision. Most applications of object tracking require real-time performance. The recent introduction of convolutional neural network (CNN) techniques to the object tracking field enabled the attainm...
Published in: | Neural computing & applications 2022-11, Vol.34 (22), p.19937-19952 |
---|---|
Main authors: | El-Shafie, Al-Hussein A. ; Zaki, Mohamed ; Habib, S. E. D. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 19952 |
---|---|
container_issue | 22 |
container_start_page | 19937 |
container_title | Neural computing & applications |
container_volume | 34 |
creator | El-Shafie, Al-Hussein A. ; Zaki, Mohamed ; Habib, S. E. D. |
description | The object tracking field continues to evolve as an important application of computer vision. Most applications of object tracking require real-time performance. The recent introduction of convolutional neural network (CNN) techniques to the object tracking field enabled the attainment of significant performance gains. However, the heavy computational load of CNNs conflicts with the real-time requirements of object tracking. In this paper, we address these computational limitations on both the algorithm side and the circuit side. On the algorithm side, we adopt interpolation schemes that can significantly reduce the processing time and the memory storage requirements. We also evaluate approximating the hardware-expensive computations to attain an efficient hardware design. Moreover, we modify the online training scheme to achieve a constant processing time across all video frames. On the circuit side, we developed a hardware accelerator for the online training stage. We avoid transposed reads from the external memory to speed up data movement with no performance degradation. Our proposed hardware accelerator achieves 44 frames per second in training the fully connected layers. |
doi_str_mv | 10.1007/s00521-022-07538-1 |
format | Article |
addtitle | Neural Comput & Applic |
date | 2022-11-01 |
oa | free_for_read |
orcidid | https://orcid.org/0000-0002-6324-6740 |
publisher | London: Springer London |
rights | The Author(s) 2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
toplevel | peer_reviewed ; online_resources |
tpages | 16 |
fulltext | fulltext |
identifier | ISSN: 0941-0643 |
ispartof | Neural computing & applications, 2022-11, Vol.34 (22), p.19937-19952 |
issn | 0941-0643 1433-3058 |
language | eng |
recordid | cdi_proquest_journals_2726616677 |
source | SpringerLink Journals - AutoHoldings |
subjects | Algorithms ; Artificial Intelligence ; Artificial neural networks ; Candidates ; Circuits ; Computational Biology/Bioinformatics ; Computational Science and Engineering ; Computer Science ; Computer vision ; Data Mining and Knowledge Discovery ; Embedded systems ; Engineering ; Frames (data processing) ; Frames per second ; Hardware ; Image Processing and Computer Vision ; Interpolation ; Localization ; Neural networks ; Online instruction ; Original Article ; Performance degradation ; Probability and Statistics in Computer Science ; Real time ; Tracking ; Training |
title | An efficient hardware implementation of CNN-based object trackers for real-time applications |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T19%3A59%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20efficient%20hardware%20implementation%20of%20CNN-based%20object%20trackers%20for%20real-time%20applications&rft.jtitle=Neural%20computing%20&%20applications&rft.au=El-Shafie,%20Al-Hussein%20A.&rft.date=2022-11-01&rft.volume=34&rft.issue=22&rft.spage=19937&rft.epage=19952&rft.pages=19937-19952&rft.issn=0941-0643&rft.eissn=1433-3058&rft_id=info:doi/10.1007/s00521-022-07538-1&rft_dat=%3Cproquest_cross%3E2726616677%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2726616677&rft_id=info:pmid/&rfr_iscdi=true |