A 0.8 V Intelligent Vision Sensor With Tiny Convolutional Neural Network and Programmable Weights Using Mixed-Mode Processing-in-Sensor Technique for Image Classification

This article presents an intelligent vision sensor (IVS) with embedded tiny convolutional neural network (CNN) model and programmable processing-in-sensor (PIS) circuit for real-time inference applications of low-power edge devices. The proposed imager realizes the full computing functions of a cust...

Full description

Bibliographic details
Published in: IEEE journal of solid-state circuits 2023-11, Vol.58 (11), p.1-9
Main authors: Hsu, Tzu-Hsiang, Chen, Guan-Cheng, Chen, Yi-Ren, Liu, Ren-Shuo, Lo, Chung-Chuan, Tang, Kea-Tiong, Chang, Meng-Fan, Hsieh, Chih-Cheng
Format: Article
Language: eng
container_end_page 9
container_issue 11
container_start_page 1
container_title IEEE journal of solid-state circuits
container_volume 58
creator Hsu, Tzu-Hsiang
Chen, Guan-Cheng
Chen, Yi-Ren
Liu, Ren-Shuo
Lo, Chung-Chuan
Tang, Kea-Tiong
Chang, Meng-Fan
Hsieh, Chih-Cheng
description This article presents an intelligent vision sensor (IVS) with embedded tiny convolutional neural network (CNN) model and programmable processing-in-sensor (PIS) circuit for real-time inference applications of low-power edge devices. The proposed imager realizes the full computing functions of a customized three-layer tiny network, which includes a 3 × 3 convolution layer (stride = 3) with a rectified linear unit (ReLU) activation function, a 2 × 2 maximum pooling (MP) layer (stride = 2), and a 1 × 1 fully connected (FC) layer for inference. A 0.8 V 128 × 128 IVS prototype was fabricated and verified in TSMC 0.18 µm standard CMOS technology. In normal image mode, it consumed 76.4 µW with full-resolution (126 × 126 active resolution) image output at 125 f/s. In CNN mode, it consumed 134.5 µW at 250 f/s and achieved an iFoMs of 33.8 pJ/pixel·frame. Using the proposed mixed-mode PIS circuits, the prototype is configured to demonstrate a "human face or not detection" task with an achieved accuracy of 93.6%.
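The layer geometry and the energy figure quoted in the description can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the article. It assumes that the CNN operates on the 126 × 126 active array without padding, and that the figure of merit is power divided by (frame rate × active pixel count). The record does not state channel counts or the exact iFoMs definition, and the helper names `out_size` and `energy_per_pixel_frame_pj` are hypothetical.

```python
def out_size(n: int, kernel: int, stride: int) -> int:
    """Output width of an unpadded sliding window over n samples."""
    return (n - kernel) // stride + 1


def energy_per_pixel_frame_pj(power_w: float, fps: float, pixels: int) -> float:
    """Energy per pixel per frame in picojoules (assumed FoM definition)."""
    return power_w / (fps * pixels) * 1e12


ACTIVE = 126  # active resolution from the description; used as CNN input by assumption

conv = out_size(ACTIVE, kernel=3, stride=3)  # 3 x 3 convolution, stride 3
pool = out_size(conv, kernel=2, stride=2)    # 2 x 2 max pooling, stride 2

# CNN mode: 134.5 uW at 250 f/s (values from the description)
cnn_fom = energy_per_pixel_frame_pj(134.5e-6, 250, ACTIVE * ACTIVE)
# Normal image mode: 76.4 uW at 125 f/s (derived here, not reported in the record)
img_fom = energy_per_pixel_frame_pj(76.4e-6, 125, ACTIVE * ACTIVE)

print(f"conv output: {conv} x {conv}, pooled output: {pool} x {pool}")
print(f"CNN mode:   {cnn_fom:.2f} pJ/pixel/frame")
print(f"image mode: {img_fom:.2f} pJ/pixel/frame")
```

Under these assumptions the feature map shrinks from 126 × 126 to 42 × 42 after the convolution and to 21 × 21 after pooling. The CNN-mode energy comes out at roughly 33.9 pJ/pixel·frame, close to the reported 33.8. The small difference may come from rounding or from a different pixel count in the authors' definition.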
doi_str_mv 10.1109/JSSC.2023.3285734
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_10164007</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10164007</ieee_id><sourcerecordid>2881502195</sourcerecordid><originalsourceid>FETCH-LOGICAL-c294t-171c023fd8435c8caca20b06524a9511e83b24698e9909a4d8a59d3acd84c57e3</originalsourceid><addsrcrecordid>eNpNkctu2zAQRYmiAeo6-YACXRDoWi6fNrkMhKR14DQBbCfdCTQ1kpnKZELKefxSv7JU7UVWgxmeOw9ehL5QMqGU6O9Xy2U5YYTxCWdKzrj4gEZUSlXQGf_9EY0IoarQjJBP6HNKDzkVQtER-nuOyUThOzz3PXSda8H3-M4lFzxegk8h4nvXb_HK-TdcBv8cun2fH02Hf8E-_g_9S4h_sPE1vo2hjWa3M5sO8D24dtsnvE7Ot_javUJdXIcaBspCGqqF88Vxygrs1runPeAmZ_OdaQGXnclY46wZRp6ik8Z0Cc6OcYzWlxer8mexuPkxL88XhWVa9PlgavM_NLUSXFpljTWMbMhUMmG0pBQU3zAx1Qq0JtqIWhmpa25sFlg5Az5G3w59H2PI-6S-egj7mC9OFVOKSsKolpmiB8rGkFKEpnqMbmfiW0VJNVhSDZZUgyXV0ZKs-XrQOAB4x9OpIGTG_wEWlInF</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2881502195</pqid></control><display><type>article</type><title>A 0.8 V Intelligent Vision Sensor With Tiny Convolutional Neural Network and Programmable Weights Using Mixed-Mode Processing-in-Sensor Technique for Image Classification</title><source>IEEE Electronic Library (IEL)</source><creator>Hsu, Tzu-Hsiang ; Chen, Guan-Cheng ; Chen, Yi-Ren ; Liu, Ren-Shuo ; Lo, Chung-Chuan ; Tang, Kea-Tiong ; Chang, Meng-Fan ; Hsieh, Chih-Cheng</creator><creatorcontrib>Hsu, Tzu-Hsiang ; Chen, Guan-Cheng ; Chen, Yi-Ren ; Liu, Ren-Shuo ; Lo, Chung-Chuan ; Tang, Kea-Tiong ; Chang, Meng-Fan ; Hsieh, Chih-Cheng</creatorcontrib><description><![CDATA[This article presents an intelligent vision sensor (IVS) with embedded tiny convolutional neural network (CNN) model and programmable processing-in-sensor (PIS) circuit for real-time inference applications of low-power edge devices. 
The proposed imager realizes the full computing functions of a customized three-layers tiny network, which includes a <inline-formula> <tex-math notation="LaTeX">3 \times 3</tex-math> </inline-formula> convolution layer (stride <inline-formula> <tex-math notation="LaTeX">=</tex-math> </inline-formula> 3) with activation function of rectified linear unit (ReLU), a <inline-formula> <tex-math notation="LaTeX">2 \times 2</tex-math> </inline-formula> maximum pooling (MP) layer (stride <inline-formula> <tex-math notation="LaTeX">=</tex-math> </inline-formula> 2), and a <inline-formula> <tex-math notation="LaTeX">1 \times 1</tex-math> </inline-formula> fully connected (FC) layer for inference. A 0.8 V <inline-formula> <tex-math notation="LaTeX">128 \times 128</tex-math> </inline-formula> IVS prototype was fabricated and verified in TSMC 0.18 <inline-formula> <tex-math notation="LaTeX">\mu</tex-math> </inline-formula>m standard CMOS technology. In normal image mode, it consumed 76.4 <inline-formula> <tex-math notation="LaTeX">\mu</tex-math> </inline-formula>W with full-resolution (<inline-formula> <tex-math notation="LaTeX">126 \times 126</tex-math> </inline-formula> active resolution) image output at 125 f/s. In CNN mode, it consumed 134.5 <inline-formula> <tex-math notation="LaTeX">\mu</tex-math> </inline-formula>W at 250 f/s and an achieved iFoMs of 33.8 pJ/pixel<inline-formula> <tex-math notation="LaTeX">\cdot</tex-math> </inline-formula>frame. 
Using the proposed mixed-mode PIS circuits, the prototype is configured to demonstrate a "human face or not detection" task with an achieved accuracy of 93.6%.]]></description><identifier>ISSN: 0018-9200</identifier><identifier>EISSN: 1558-173X</identifier><identifier>DOI: 10.1109/JSSC.2023.3285734</identifier><identifier>CODEN: IJSCBC</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Artificial intelligent (AI) ; Artificial neural networks ; Circuits ; Convolution ; convolutional neural network (CNN) CMOS image sensor (CIS) ; Convolutional neural networks ; face detection (FD) ; feature extraction ; Frequency modulation ; Image classification ; Inference ; intelligent vision sensor (IVS) ; Kernel ; Neural networks ; processing-in-sensor (PIS) ; Prototypes ; Pulse width modulation ; Sensors ; Task analysis ; Vision sensors</subject><ispartof>IEEE journal of solid-state circuits, 2023-11, Vol.58 (11), p.1-9</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2023</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c294t-171c023fd8435c8caca20b06524a9511e83b24698e9909a4d8a59d3acd84c57e3</citedby><cites>FETCH-LOGICAL-c294t-171c023fd8435c8caca20b06524a9511e83b24698e9909a4d8a59d3acd84c57e3</cites><orcidid>0009-0003-5662-4621 ; 0000-0002-5311-4955 ; 0000-0003-4070-5059 ; 0000-0002-9689-1236 ; 0000-0001-7481-075X ; 0000-0001-6905-6350</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10164007$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10164007$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Hsu, Tzu-Hsiang</creatorcontrib><creatorcontrib>Chen, Guan-Cheng</creatorcontrib><creatorcontrib>Chen, Yi-Ren</creatorcontrib><creatorcontrib>Liu, Ren-Shuo</creatorcontrib><creatorcontrib>Lo, Chung-Chuan</creatorcontrib><creatorcontrib>Tang, Kea-Tiong</creatorcontrib><creatorcontrib>Chang, Meng-Fan</creatorcontrib><creatorcontrib>Hsieh, Chih-Cheng</creatorcontrib><title>A 0.8 V Intelligent Vision Sensor With Tiny Convolutional Neural Network and Programmable Weights Using Mixed-Mode Processing-in-Sensor Technique for Image Classification</title><title>IEEE journal of solid-state circuits</title><addtitle>JSSC</addtitle><description><![CDATA[This article presents an intelligent vision sensor (IVS) with embedded tiny convolutional neural network (CNN) model and programmable processing-in-sensor (PIS) circuit for real-time inference applications of low-power edge devices. 
The proposed imager realizes the full computing functions of a customized three-layers tiny network, which includes a <inline-formula> <tex-math notation="LaTeX">3 \times 3</tex-math> </inline-formula> convolution layer (stride <inline-formula> <tex-math notation="LaTeX">=</tex-math> </inline-formula> 3) with activation function of rectified linear unit (ReLU), a <inline-formula> <tex-math notation="LaTeX">2 \times 2</tex-math> </inline-formula> maximum pooling (MP) layer (stride <inline-formula> <tex-math notation="LaTeX">=</tex-math> </inline-formula> 2), and a <inline-formula> <tex-math notation="LaTeX">1 \times 1</tex-math> </inline-formula> fully connected (FC) layer for inference. A 0.8 V <inline-formula> <tex-math notation="LaTeX">128 \times 128</tex-math> </inline-formula> IVS prototype was fabricated and verified in TSMC 0.18 <inline-formula> <tex-math notation="LaTeX">\mu</tex-math> </inline-formula>m standard CMOS technology. In normal image mode, it consumed 76.4 <inline-formula> <tex-math notation="LaTeX">\mu</tex-math> </inline-formula>W with full-resolution (<inline-formula> <tex-math notation="LaTeX">126 \times 126</tex-math> </inline-formula> active resolution) image output at 125 f/s. In CNN mode, it consumed 134.5 <inline-formula> <tex-math notation="LaTeX">\mu</tex-math> </inline-formula>W at 250 f/s and an achieved iFoMs of 33.8 pJ/pixel<inline-formula> <tex-math notation="LaTeX">\cdot</tex-math> </inline-formula>frame. 
Using the proposed mixed-mode PIS circuits, the prototype is configured to demonstrate a "human face or not detection" task with an achieved accuracy of 93.6%.]]></description><subject>Artificial intelligent (AI)</subject><subject>Artificial neural networks</subject><subject>Circuits</subject><subject>Convolution</subject><subject>convolutional neural network (CNN) CMOS image sensor (CIS)</subject><subject>Convolutional neural networks</subject><subject>face detection (FD)</subject><subject>feature extraction</subject><subject>Frequency modulation</subject><subject>Image classification</subject><subject>Inference</subject><subject>intelligent vision sensor (IVS)</subject><subject>Kernel</subject><subject>Neural networks</subject><subject>processing-in-sensor (PIS)</subject><subject>Prototypes</subject><subject>Pulse width modulation</subject><subject>Sensors</subject><subject>Task analysis</subject><subject>Vision sensors</subject><issn>0018-9200</issn><issn>1558-173X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkctu2zAQRYmiAeo6-YACXRDoWi6fNrkMhKR14DQBbCfdCTQ1kpnKZELKefxSv7JU7UVWgxmeOw9ehL5QMqGU6O9Xy2U5YYTxCWdKzrj4gEZUSlXQGf_9EY0IoarQjJBP6HNKDzkVQtER-nuOyUThOzz3PXSda8H3-M4lFzxegk8h4nvXb_HK-TdcBv8cun2fH02Hf8E-_g_9S4h_sPE1vo2hjWa3M5sO8D24dtsnvE7Ot_javUJdXIcaBspCGqqF88Vxygrs1runPeAmZ_OdaQGXnclY46wZRp6ik8Z0Cc6OcYzWlxer8mexuPkxL88XhWVa9PlgavM_NLUSXFpljTWMbMhUMmG0pBQU3zAx1Qq0JtqIWhmpa25sFlg5Az5G3w59H2PI-6S-egj7mC9OFVOKSsKolpmiB8rGkFKEpnqMbmfiW0VJNVhSDZZUgyXV0ZKs-XrQOAB4x9OpIGTG_wEWlInF</recordid><startdate>20231101</startdate><enddate>20231101</enddate><creator>Hsu, Tzu-Hsiang</creator><creator>Chen, Guan-Cheng</creator><creator>Chen, Yi-Ren</creator><creator>Liu, Ren-Shuo</creator><creator>Lo, Chung-Chuan</creator><creator>Tang, Kea-Tiong</creator><creator>Chang, Meng-Fan</creator><creator>Hsieh, Chih-Cheng</creator><general>IEEE</general><general>The 
Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SP</scope><scope>8FD</scope><scope>L7M</scope><orcidid>https://orcid.org/0009-0003-5662-4621</orcidid><orcidid>https://orcid.org/0000-0002-5311-4955</orcidid><orcidid>https://orcid.org/0000-0003-4070-5059</orcidid><orcidid>https://orcid.org/0000-0002-9689-1236</orcidid><orcidid>https://orcid.org/0000-0001-7481-075X</orcidid><orcidid>https://orcid.org/0000-0001-6905-6350</orcidid></search><sort><creationdate>20231101</creationdate><title>A 0.8 V Intelligent Vision Sensor With Tiny Convolutional Neural Network and Programmable Weights Using Mixed-Mode Processing-in-Sensor Technique for Image Classification</title><author>Hsu, Tzu-Hsiang ; Chen, Guan-Cheng ; Chen, Yi-Ren ; Liu, Ren-Shuo ; Lo, Chung-Chuan ; Tang, Kea-Tiong ; Chang, Meng-Fan ; Hsieh, Chih-Cheng</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c294t-171c023fd8435c8caca20b06524a9511e83b24698e9909a4d8a59d3acd84c57e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Artificial intelligent (AI)</topic><topic>Artificial neural networks</topic><topic>Circuits</topic><topic>Convolution</topic><topic>convolutional neural network (CNN) CMOS image sensor (CIS)</topic><topic>Convolutional neural networks</topic><topic>face detection (FD)</topic><topic>feature extraction</topic><topic>Frequency modulation</topic><topic>Image classification</topic><topic>Inference</topic><topic>intelligent vision sensor (IVS)</topic><topic>Kernel</topic><topic>Neural networks</topic><topic>processing-in-sensor (PIS)</topic><topic>Prototypes</topic><topic>Pulse width modulation</topic><topic>Sensors</topic><topic>Task analysis</topic><topic>Vision 
sensors</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hsu, Tzu-Hsiang</creatorcontrib><creatorcontrib>Chen, Guan-Cheng</creatorcontrib><creatorcontrib>Chen, Yi-Ren</creatorcontrib><creatorcontrib>Liu, Ren-Shuo</creatorcontrib><creatorcontrib>Lo, Chung-Chuan</creatorcontrib><creatorcontrib>Tang, Kea-Tiong</creatorcontrib><creatorcontrib>Chang, Meng-Fan</creatorcontrib><creatorcontrib>Hsieh, Chih-Cheng</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>Advanced Technologies Database with Aerospace</collection><jtitle>IEEE journal of solid-state circuits</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Hsu, Tzu-Hsiang</au><au>Chen, Guan-Cheng</au><au>Chen, Yi-Ren</au><au>Liu, Ren-Shuo</au><au>Lo, Chung-Chuan</au><au>Tang, Kea-Tiong</au><au>Chang, Meng-Fan</au><au>Hsieh, Chih-Cheng</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A 0.8 V Intelligent Vision Sensor With Tiny Convolutional Neural Network and Programmable Weights Using Mixed-Mode Processing-in-Sensor Technique for Image Classification</atitle><jtitle>IEEE journal of solid-state circuits</jtitle><stitle>JSSC</stitle><date>2023-11-01</date><risdate>2023</risdate><volume>58</volume><issue>11</issue><spage>1</spage><epage>9</epage><pages>1-9</pages><issn>0018-9200</issn><eissn>1558-173X</eissn><coden>IJSCBC</coden><abstract><![CDATA[This article presents an intelligent vision sensor (IVS) with embedded tiny convolutional neural network (CNN) model and programmable processing-in-sensor (PIS) circuit for 
real-time inference applications of low-power edge devices. The proposed imager realizes the full computing functions of a customized three-layers tiny network, which includes a <inline-formula> <tex-math notation="LaTeX">3 \times 3</tex-math> </inline-formula> convolution layer (stride <inline-formula> <tex-math notation="LaTeX">=</tex-math> </inline-formula> 3) with activation function of rectified linear unit (ReLU), a <inline-formula> <tex-math notation="LaTeX">2 \times 2</tex-math> </inline-formula> maximum pooling (MP) layer (stride <inline-formula> <tex-math notation="LaTeX">=</tex-math> </inline-formula> 2), and a <inline-formula> <tex-math notation="LaTeX">1 \times 1</tex-math> </inline-formula> fully connected (FC) layer for inference. A 0.8 V <inline-formula> <tex-math notation="LaTeX">128 \times 128</tex-math> </inline-formula> IVS prototype was fabricated and verified in TSMC 0.18 <inline-formula> <tex-math notation="LaTeX">\mu</tex-math> </inline-formula>m standard CMOS technology. In normal image mode, it consumed 76.4 <inline-formula> <tex-math notation="LaTeX">\mu</tex-math> </inline-formula>W with full-resolution (<inline-formula> <tex-math notation="LaTeX">126 \times 126</tex-math> </inline-formula> active resolution) image output at 125 f/s. In CNN mode, it consumed 134.5 <inline-formula> <tex-math notation="LaTeX">\mu</tex-math> </inline-formula>W at 250 f/s and an achieved iFoMs of 33.8 pJ/pixel<inline-formula> <tex-math notation="LaTeX">\cdot</tex-math> </inline-formula>frame. 
Using the proposed mixed-mode PIS circuits, the prototype is configured to demonstrate a "human face or not detection" task with an achieved accuracy of 93.6%.]]></abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/JSSC.2023.3285734</doi><tpages>9</tpages><orcidid>https://orcid.org/0009-0003-5662-4621</orcidid><orcidid>https://orcid.org/0000-0002-5311-4955</orcidid><orcidid>https://orcid.org/0000-0003-4070-5059</orcidid><orcidid>https://orcid.org/0000-0002-9689-1236</orcidid><orcidid>https://orcid.org/0000-0001-7481-075X</orcidid><orcidid>https://orcid.org/0000-0001-6905-6350</orcidid></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 0018-9200
ispartof IEEE journal of solid-state circuits, 2023-11, Vol.58 (11), p.1-9
issn 0018-9200
1558-173X
language eng
recordid cdi_ieee_primary_10164007
source IEEE Electronic Library (IEL)
subjects Artificial intelligent (AI)
Artificial neural networks
Circuits
Convolution
convolutional neural network (CNN) CMOS image sensor (CIS)
Convolutional neural networks
face detection (FD)
feature extraction
Frequency modulation
Image classification
Inference
intelligent vision sensor (IVS)
Kernel
Neural networks
processing-in-sensor (PIS)
Prototypes
Pulse width modulation
Sensors
Task analysis
Vision sensors
title A 0.8 V Intelligent Vision Sensor With Tiny Convolutional Neural Network and Programmable Weights Using Mixed-Mode Processing-in-Sensor Technique for Image Classification
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-03T15%3A35%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%200.8%20V%20Intelligent%20Vision%20Sensor%20With%20Tiny%20Convolutional%20Neural%20Network%20and%20Programmable%20Weights%20Using%20Mixed-Mode%20Processing-in-Sensor%20Technique%20for%20Image%20Classification&rft.jtitle=IEEE%20journal%20of%20solid-state%20circuits&rft.au=Hsu,%20Tzu-Hsiang&rft.date=2023-11-01&rft.volume=58&rft.issue=11&rft.spage=1&rft.epage=9&rft.pages=1-9&rft.issn=0018-9200&rft.eissn=1558-173X&rft.coden=IJSCBC&rft_id=info:doi/10.1109/JSSC.2023.3285734&rft_dat=%3Cproquest_RIE%3E2881502195%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2881502195&rft_id=info:pmid/&rft_ieee_id=10164007&rfr_iscdi=true