An Efficient Algorithm-Hardware Co-Design for Radar-Based Fall Detection With Multi-Branch Convolutions
In this paper, we propose an efficient algorithm-hardware co-design framework to realize radar-based fall detection with limited resources. We first design a compact neural network model named MB-Net with multi-branch convolutions for feature extraction of radar time series data combined with multi-...
Published in: | IEEE transactions on circuits and systems. I, Regular papers, 2023-04, Vol.70 (4), p.1-12 |
---|---|
Main authors: | Ou, Zixuan ; Yu, Bing ; Ye, Wenbin |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 12 |
---|---|
container_issue | 4 |
container_start_page | 1 |
container_title | IEEE transactions on circuits and systems. I, Regular papers |
container_volume | 70 |
creator | Ou, Zixuan ; Yu, Bing ; Ye, Wenbin |
description | In this paper, we propose an efficient algorithm-hardware co-design framework to realize radar-based fall detection with limited resources. We first design a compact neural network model named MB-Net with multi-branch convolutions for feature extraction of radar time series data combined with multi-scale wavelet transform. After that, an FPGA-based neural network (NN) accelerator tailored for the proposed network is designed. The proposed NN accelerator replaces the general multipliers with non-exact multipliers to reduce the hardware cost. For the multi-branch convolution layer, a novel layer computing sequence is introduced to improve the efficiency of the processing element (PE) array and reduce the memory footprint. In addition, the average pooling operation in the proposed network is folded into the quantization factors to reduce hardware cost. The experimental findings show that the MB-Net can maintain competitive performance in comparison to state-of-the-art methods while the hardware cost is significantly lower. The proposed network model is implemented on a Zynq ZC702 board using only 3615 LUTs, 1843 FFs, 11.5 BRAMs, and 8 DSPs with 0.234 W power consumption. Through algorithm and hardware co-optimization, the fall detection accelerator can achieve 95% PE efficiency and takes 0.346 ms latency for a radar sample inference with only 80.96 µJ energy consumption. |
doi_str_mv | 10.1109/TCSI.2022.3232918 |
format | Article |
fullrecord | Publisher: New York: IEEE ; EISSN: 1558-0806 ; CODEN: ITCSCH ; IEEE document ID: 10005051 ; ORCID: 0000-0001-6978-813X, 0000-0002-6130-5583, 0000-0002-6910-6220 ; Rights: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023 ; peer reviewed |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1549-8328 |
ispartof | IEEE transactions on circuits and systems. I, Regular papers, 2023-04, Vol.70 (4), p.1-12 |
issn | 1549-8328 ; 1558-0806 |
language | eng |
recordid | cdi_proquest_journals_2793209283 |
source | IEEE Electronic Library (IEL) |
subjects | algorithm-hardware co-design ; Algorithms ; Co-design ; Convolution ; convolutional neural network ; Energy consumption ; Fall detection ; Feature extraction ; Hardware ; low cost ; low power ; Multipliers ; Neural networks ; Optimization ; Power consumption ; Radar ; Radar detection ; Radar imaging ; radar signal processing ; Spectrogram ; Wavelet transforms |
title | An Efficient Algorithm-Hardware Co-Design for Radar-Based Fall Detection With Multi-Branch Convolutions |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T14%3A45%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20Efficient%20Algorithm-Hardware%20Co-Design%20for%20Radar-Based%20Fall%20Detection%20With%20Multi-Branch%20Convolutions&rft.jtitle=IEEE%20transactions%20on%20circuits%20and%20systems.%20I,%20Regular%20papers&rft.au=Ou,%20Zixuan&rft.date=2023-04-01&rft.volume=70&rft.issue=4&rft.spage=1&rft.epage=12&rft.pages=1-12&rft.issn=1549-8328&rft.eissn=1558-0806&rft.coden=ITCSCH&rft_id=info:doi/10.1109/TCSI.2022.3232918&rft_dat=%3Cproquest_RIE%3E2793209283%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2793209283&rft_id=info:pmid/&rft_ieee_id=10005051&rfr_iscdi=true |
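
The description above states that the average pooling operation is folded into the quantization factors, but the record does not give the exact scheme. The following is only a minimal sketch of one common way such a fold can work, not the authors' implementation: the division by the pooling window size `k` is absorbed into the requantization scale offline, so the datapath only needs sum pooling. All names (`requantize`, `avgpool_then_requantize`, `sumpool_with_folded_scale`), the int8 output range, and the sample values are assumptions chosen for illustration.

```python
import numpy as np

def requantize(acc, scale):
    """Scale accumulator values and round to int8 (illustrative output stage)."""
    return np.clip(np.round(acc * scale), -128, 127).astype(np.int8)

def avgpool_then_requantize(acc, k, scale):
    """Reference path: explicit average pooling, then requantization."""
    pooled = acc.reshape(-1, k).mean(axis=1)   # division by k happens in the datapath
    return requantize(pooled, scale)

def sumpool_with_folded_scale(acc, k, scale):
    """Folded path: sum pooling only; the factor 1/k is absorbed into the scale."""
    folded_scale = scale / k                   # computed once, offline
    pooled = acc.reshape(-1, k).sum(axis=1)    # integer additions only
    return requantize(pooled, folded_scale)

# Hypothetical int32 accumulators from a convolution layer, pooled in windows of 4.
acc = np.array([12, 20, -8, 40, 100, 60, 36, 4], dtype=np.int32)
k, scale = 4, 0.125

reference = avgpool_then_requantize(acc, k, scale)
folded = sumpool_with_folded_scale(acc, k, scale)
print(reference, folded)
```

The window size and scale here are powers of two, so both paths agree exactly; with arbitrary scales the two results can differ by floating-point rounding in rare tie cases, and a fixed-point hardware implementation would need its own rounding analysis.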