A 0.8 V Intelligent Vision Sensor With Tiny Convolutional Neural Network and Programmable Weights Using Mixed-Mode Processing-in-Sensor Technique for Image Classification

This article presents an intelligent vision sensor (IVS) with embedded tiny convolutional neural network (CNN) model and programmable processing-in-sensor (PIS) circuit for real-time inference applications of low-power edge devices. The proposed imager realizes the full computing functions of a cust...

Bibliographic details
Published in:	IEEE journal of solid-state circuits, 2023-11, Vol.58 (11), p.1-9
Main authors:	Hsu, Tzu-Hsiang; Chen, Guan-Cheng; Chen, Yi-Ren; Liu, Ren-Shuo; Lo, Chung-Chuan; Tang, Kea-Tiong; Chang, Meng-Fan; Hsieh, Chih-Cheng
Format:	Article
Language:	eng
Subjects:	Artificial intelligent (AI); Artificial neural networks; Circuits; Convolution; convolutional neural network (CNN) CMOS image sensor (CIS); Convolutional neural networks; face detection (FD); feature extraction; Frequency modulation; Image classification; Inference; intelligent vision sensor (IVS); Kernel; Neural networks; processing-in-sensor (PIS); Prototypes; Pulse width modulation; Sensors; Task analysis; Vision sensors

container_end_page	9
container_issue	11
container_start_page	1
container_title	IEEE journal of solid-state circuits
container_volume	58
creator	Hsu, Tzu-Hsiang; Chen, Guan-Cheng; Chen, Yi-Ren; Liu, Ren-Shuo; Lo, Chung-Chuan; Tang, Kea-Tiong; Chang, Meng-Fan; Hsieh, Chih-Cheng
description	This article presents an intelligent vision sensor (IVS) with an embedded tiny convolutional neural network (CNN) model and a programmable processing-in-sensor (PIS) circuit for real-time inference applications of low-power edge devices. The proposed imager realizes the full computing functions of a customized three-layer tiny network, which includes a 3 × 3 convolution layer (stride = 3) with a rectified linear unit (ReLU) activation function, a 2 × 2 maximum pooling (MP) layer (stride = 2), and a 1 × 1 fully connected (FC) layer for inference. A 0.8 V 128 × 128 IVS prototype was fabricated and verified in TSMC 0.18 µm standard CMOS technology. In normal image mode, it consumed 76.4 µW with full-resolution (126 × 126 active resolution) image output at 125 f/s. In CNN mode, it consumed 134.5 µW at 250 f/s and achieved an iFoMs of 33.8 pJ/pixel·frame. Using the proposed mixed-mode PIS circuits, the prototype is configured to demonstrate a "human face or not detection" task with an achieved accuracy of 93.6%.
doi_str_mv	10.1109/JSSC.2023.3285734
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_10164007</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10164007</ieee_id><sourcerecordid>2881502195</sourcerecordid><originalsourceid>FETCH-LOGICAL-c294t-171c023fd8435c8caca20b06524a9511e83b24698e9909a4d8a59d3acd84c57e3</originalsourceid><addsrcrecordid>eNpNkctu2zAQRYmiAeo6-YACXRDoWi6fNrkMhKR14DQBbCfdCTQ1kpnKZELKefxSv7JU7UVWgxmeOw9ehL5QMqGU6O9Xy2U5YYTxCWdKzrj4gEZUSlXQGf_9EY0IoarQjJBP6HNKDzkVQtER-nuOyUThOzz3PXSda8H3-M4lFzxegk8h4nvXb_HK-TdcBv8cun2fH02Hf8E-_g_9S4h_sPE1vo2hjWa3M5sO8D24dtsnvE7Ot_javUJdXIcaBspCGqqF88Vxygrs1runPeAmZ_OdaQGXnclY46wZRp6ik8Z0Cc6OcYzWlxer8mexuPkxL88XhWVa9PlgavM_NLUSXFpljTWMbMhUMmG0pBQU3zAx1Qq0JtqIWhmpa25sFlg5Az5G3w59H2PI-6S-egj7mC9OFVOKSsKolpmiB8rGkFKEpnqMbmfiW0VJNVhSDZZUgyXV0ZKs-XrQOAB4x9OpIGTG_wEWlInF</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2881502195</pqid></control><display><type>article</type><title>A 0.8 V Intelligent Vision Sensor With Tiny Convolutional Neural Network and Programmable Weights Using Mixed-Mode Processing-in-Sensor Technique for Image Classification</title><source>IEEE Electronic Library (IEL)</source><creator>Hsu, Tzu-Hsiang ; Chen, Guan-Cheng ; Chen, Yi-Ren ; Liu, Ren-Shuo ; Lo, Chung-Chuan ; Tang, Kea-Tiong ; Chang, Meng-Fan ; Hsieh, Chih-Cheng</creator><creatorcontrib>Hsu, Tzu-Hsiang ; Chen, Guan-Cheng ; Chen, Yi-Ren ; Liu, Ren-Shuo ; Lo, Chung-Chuan ; Tang, Kea-Tiong ; Chang, Meng-Fan ; Hsieh, Chih-Cheng</creatorcontrib><description><![CDATA[This article presents an intelligent vision sensor (IVS) with embedded tiny convolutional neural network (CNN) model and programmable processing-in-sensor (PIS) circuit for real-time inference applications of low-power edge devices. 
The proposed imager realizes the full computing functions of a customized three-layers tiny network, which includes a <inline-formula> <tex-math notation="LaTeX">3 \times 3</tex-math> </inline-formula> convolution layer (stride <inline-formula> <tex-math notation="LaTeX">=</tex-math> </inline-formula> 3) with activation function of rectified linear unit (ReLU), a <inline-formula> <tex-math notation="LaTeX">2 \times 2</tex-math> </inline-formula> maximum pooling (MP) layer (stride <inline-formula> <tex-math notation="LaTeX">=</tex-math> </inline-formula> 2), and a <inline-formula> <tex-math notation="LaTeX">1 \times 1</tex-math> </inline-formula> fully connected (FC) layer for inference. A 0.8 V <inline-formula> <tex-math notation="LaTeX">128 \times 128</tex-math> </inline-formula> IVS prototype was fabricated and verified in TSMC 0.18 <inline-formula> <tex-math notation="LaTeX">\mu</tex-math> </inline-formula>m standard CMOS technology. In normal image mode, it consumed 76.4 <inline-formula> <tex-math notation="LaTeX">\mu</tex-math> </inline-formula>W with full-resolution (<inline-formula> <tex-math notation="LaTeX">126 \times 126</tex-math> </inline-formula> active resolution) image output at 125 f/s. In CNN mode, it consumed 134.5 <inline-formula> <tex-math notation="LaTeX">\mu</tex-math> </inline-formula>W at 250 f/s and an achieved iFoMs of 33.8 pJ/pixel<inline-formula> <tex-math notation="LaTeX">\cdot</tex-math> </inline-formula>frame. 
Using the proposed mixed-mode PIS circuits, the prototype is configured to demonstrate a "human face or not detection" task with an achieved accuracy of 93.6%.]]></description><identifier>ISSN: 0018-9200</identifier><identifier>EISSN: 1558-173X</identifier><identifier>DOI: 10.1109/JSSC.2023.3285734</identifier><identifier>CODEN: IJSCBC</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Artificial intelligent (AI) ; Artificial neural networks ; Circuits ; Convolution ; convolutional neural network (CNN) CMOS image sensor (CIS) ; Convolutional neural networks ; face detection (FD) ; feature extraction ; Frequency modulation ; Image classification ; Inference ; intelligent vision sensor (IVS) ; Kernel ; Neural networks ; processing-in-sensor (PIS) ; Prototypes ; Pulse width modulation ; Sensors ; Task analysis ; Vision sensors</subject><ispartof>IEEE journal of solid-state circuits, 2023-11, Vol.58 (11), p.1-9</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2023</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c294t-171c023fd8435c8caca20b06524a9511e83b24698e9909a4d8a59d3acd84c57e3</citedby><cites>FETCH-LOGICAL-c294t-171c023fd8435c8caca20b06524a9511e83b24698e9909a4d8a59d3acd84c57e3</cites><orcidid>0009-0003-5662-4621 ; 0000-0002-5311-4955 ; 0000-0003-4070-5059 ; 0000-0002-9689-1236 ; 0000-0001-7481-075X ; 0000-0001-6905-6350</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10164007$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10164007$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Hsu, Tzu-Hsiang</creatorcontrib><creatorcontrib>Chen, Guan-Cheng</creatorcontrib><creatorcontrib>Chen, Yi-Ren</creatorcontrib><creatorcontrib>Liu, Ren-Shuo</creatorcontrib><creatorcontrib>Lo, Chung-Chuan</creatorcontrib><creatorcontrib>Tang, Kea-Tiong</creatorcontrib><creatorcontrib>Chang, Meng-Fan</creatorcontrib><creatorcontrib>Hsieh, Chih-Cheng</creatorcontrib><title>A 0.8 V Intelligent Vision Sensor With Tiny Convolutional Neural Network and Programmable Weights Using Mixed-Mode Processing-in-Sensor Technique for Image Classification</title><title>IEEE journal of solid-state circuits</title><addtitle>JSSC</addtitle><description><![CDATA[This article presents an intelligent vision sensor (IVS) with embedded tiny convolutional neural network (CNN) model and programmable processing-in-sensor (PIS) circuit for real-time inference applications of low-power edge devices. 
The proposed imager realizes the full computing functions of a customized three-layers tiny network, which includes a <inline-formula> <tex-math notation="LaTeX">3 \times 3</tex-math> </inline-formula> convolution layer (stride <inline-formula> <tex-math notation="LaTeX">=</tex-math> </inline-formula> 3) with activation function of rectified linear unit (ReLU), a <inline-formula> <tex-math notation="LaTeX">2 \times 2</tex-math> </inline-formula> maximum pooling (MP) layer (stride <inline-formula> <tex-math notation="LaTeX">=</tex-math> </inline-formula> 2), and a <inline-formula> <tex-math notation="LaTeX">1 \times 1</tex-math> </inline-formula> fully connected (FC) layer for inference. A 0.8 V <inline-formula> <tex-math notation="LaTeX">128 \times 128</tex-math> </inline-formula> IVS prototype was fabricated and verified in TSMC 0.18 <inline-formula> <tex-math notation="LaTeX">\mu</tex-math> </inline-formula>m standard CMOS technology. In normal image mode, it consumed 76.4 <inline-formula> <tex-math notation="LaTeX">\mu</tex-math> </inline-formula>W with full-resolution (<inline-formula> <tex-math notation="LaTeX">126 \times 126</tex-math> </inline-formula> active resolution) image output at 125 f/s. In CNN mode, it consumed 134.5 <inline-formula> <tex-math notation="LaTeX">\mu</tex-math> </inline-formula>W at 250 f/s and an achieved iFoMs of 33.8 pJ/pixel<inline-formula> <tex-math notation="LaTeX">\cdot</tex-math> </inline-formula>frame. 
Using the proposed mixed-mode PIS circuits, the prototype is configured to demonstrate a "human face or not detection" task with an achieved accuracy of 93.6%.]]></description><subject>Artificial intelligent (AI)</subject><subject>Artificial neural networks</subject><subject>Circuits</subject><subject>Convolution</subject><subject>convolutional neural network (CNN) CMOS image sensor (CIS)</subject><subject>Convolutional neural networks</subject><subject>face detection (FD)</subject><subject>feature extraction</subject><subject>Frequency modulation</subject><subject>Image classification</subject><subject>Inference</subject><subject>intelligent vision sensor (IVS)</subject><subject>Kernel</subject><subject>Neural networks</subject><subject>processing-in-sensor (PIS)</subject><subject>Prototypes</subject><subject>Pulse width modulation</subject><subject>Sensors</subject><subject>Task analysis</subject><subject>Vision sensors</subject><issn>0018-9200</issn><issn>1558-173X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkctu2zAQRYmiAeo6-YACXRDoWi6fNrkMhKR14DQBbCfdCTQ1kpnKZELKefxSv7JU7UVWgxmeOw9ehL5QMqGU6O9Xy2U5YYTxCWdKzrj4gEZUSlXQGf_9EY0IoarQjJBP6HNKDzkVQtER-nuOyUThOzz3PXSda8H3-M4lFzxegk8h4nvXb_HK-TdcBv8cun2fH02Hf8E-_g_9S4h_sPE1vo2hjWa3M5sO8D24dtsnvE7Ot_javUJdXIcaBspCGqqF88Vxygrs1runPeAmZ_OdaQGXnclY46wZRp6ik8Z0Cc6OcYzWlxer8mexuPkxL88XhWVa9PlgavM_NLUSXFpljTWMbMhUMmG0pBQU3zAx1Qq0JtqIWhmpa25sFlg5Az5G3w59H2PI-6S-egj7mC9OFVOKSsKolpmiB8rGkFKEpnqMbmfiW0VJNVhSDZZUgyXV0ZKs-XrQOAB4x9OpIGTG_wEWlInF</recordid><startdate>20231101</startdate><enddate>20231101</enddate><creator>Hsu, Tzu-Hsiang</creator><creator>Chen, Guan-Cheng</creator><creator>Chen, Yi-Ren</creator><creator>Liu, Ren-Shuo</creator><creator>Lo, Chung-Chuan</creator><creator>Tang, Kea-Tiong</creator><creator>Chang, Meng-Fan</creator><creator>Hsieh, Chih-Cheng</creator><general>IEEE</general><general>The 
Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SP</scope><scope>8FD</scope><scope>L7M</scope><orcidid>https://orcid.org/0009-0003-5662-4621</orcidid><orcidid>https://orcid.org/0000-0002-5311-4955</orcidid><orcidid>https://orcid.org/0000-0003-4070-5059</orcidid><orcidid>https://orcid.org/0000-0002-9689-1236</orcidid><orcidid>https://orcid.org/0000-0001-7481-075X</orcidid><orcidid>https://orcid.org/0000-0001-6905-6350</orcidid></search><sort><creationdate>20231101</creationdate><title>A 0.8 V Intelligent Vision Sensor With Tiny Convolutional Neural Network and Programmable Weights Using Mixed-Mode Processing-in-Sensor Technique for Image Classification</title><author>Hsu, Tzu-Hsiang ; Chen, Guan-Cheng ; Chen, Yi-Ren ; Liu, Ren-Shuo ; Lo, Chung-Chuan ; Tang, Kea-Tiong ; Chang, Meng-Fan ; Hsieh, Chih-Cheng</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c294t-171c023fd8435c8caca20b06524a9511e83b24698e9909a4d8a59d3acd84c57e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Artificial intelligent (AI)</topic><topic>Artificial neural networks</topic><topic>Circuits</topic><topic>Convolution</topic><topic>convolutional neural network (CNN) CMOS image sensor (CIS)</topic><topic>Convolutional neural networks</topic><topic>face detection (FD)</topic><topic>feature extraction</topic><topic>Frequency modulation</topic><topic>Image classification</topic><topic>Inference</topic><topic>intelligent vision sensor (IVS)</topic><topic>Kernel</topic><topic>Neural networks</topic><topic>processing-in-sensor (PIS)</topic><topic>Prototypes</topic><topic>Pulse width modulation</topic><topic>Sensors</topic><topic>Task analysis</topic><topic>Vision 
sensors</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hsu, Tzu-Hsiang</creatorcontrib><creatorcontrib>Chen, Guan-Cheng</creatorcontrib><creatorcontrib>Chen, Yi-Ren</creatorcontrib><creatorcontrib>Liu, Ren-Shuo</creatorcontrib><creatorcontrib>Lo, Chung-Chuan</creatorcontrib><creatorcontrib>Tang, Kea-Tiong</creatorcontrib><creatorcontrib>Chang, Meng-Fan</creatorcontrib><creatorcontrib>Hsieh, Chih-Cheng</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>Advanced Technologies Database with Aerospace</collection><jtitle>IEEE journal of solid-state circuits</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Hsu, Tzu-Hsiang</au><au>Chen, Guan-Cheng</au><au>Chen, Yi-Ren</au><au>Liu, Ren-Shuo</au><au>Lo, Chung-Chuan</au><au>Tang, Kea-Tiong</au><au>Chang, Meng-Fan</au><au>Hsieh, Chih-Cheng</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A 0.8 V Intelligent Vision Sensor With Tiny Convolutional Neural Network and Programmable Weights Using Mixed-Mode Processing-in-Sensor Technique for Image Classification</atitle><jtitle>IEEE journal of solid-state circuits</jtitle><stitle>JSSC</stitle><date>2023-11-01</date><risdate>2023</risdate><volume>58</volume><issue>11</issue><spage>1</spage><epage>9</epage><pages>1-9</pages><issn>0018-9200</issn><eissn>1558-173X</eissn><coden>IJSCBC</coden><abstract><![CDATA[This article presents an intelligent vision sensor (IVS) with embedded tiny convolutional neural network (CNN) model and programmable processing-in-sensor (PIS) circuit for 
real-time inference applications of low-power edge devices. The proposed imager realizes the full computing functions of a customized three-layers tiny network, which includes a <inline-formula> <tex-math notation="LaTeX">3 \times 3</tex-math> </inline-formula> convolution layer (stride <inline-formula> <tex-math notation="LaTeX">=</tex-math> </inline-formula> 3) with activation function of rectified linear unit (ReLU), a <inline-formula> <tex-math notation="LaTeX">2 \times 2</tex-math> </inline-formula> maximum pooling (MP) layer (stride <inline-formula> <tex-math notation="LaTeX">=</tex-math> </inline-formula> 2), and a <inline-formula> <tex-math notation="LaTeX">1 \times 1</tex-math> </inline-formula> fully connected (FC) layer for inference. A 0.8 V <inline-formula> <tex-math notation="LaTeX">128 \times 128</tex-math> </inline-formula> IVS prototype was fabricated and verified in TSMC 0.18 <inline-formula> <tex-math notation="LaTeX">\mu</tex-math> </inline-formula>m standard CMOS technology. In normal image mode, it consumed 76.4 <inline-formula> <tex-math notation="LaTeX">\mu</tex-math> </inline-formula>W with full-resolution (<inline-formula> <tex-math notation="LaTeX">126 \times 126</tex-math> </inline-formula> active resolution) image output at 125 f/s. In CNN mode, it consumed 134.5 <inline-formula> <tex-math notation="LaTeX">\mu</tex-math> </inline-formula>W at 250 f/s and an achieved iFoMs of 33.8 pJ/pixel<inline-formula> <tex-math notation="LaTeX">\cdot</tex-math> </inline-formula>frame. 
Using the proposed mixed-mode PIS circuits, the prototype is configured to demonstrate a "human face or not detection" task with an achieved accuracy of 93.6%.]]></abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/JSSC.2023.3285734</doi><tpages>9</tpages><orcidid>https://orcid.org/0009-0003-5662-4621</orcidid><orcidid>https://orcid.org/0000-0002-5311-4955</orcidid><orcidid>https://orcid.org/0000-0003-4070-5059</orcidid><orcidid>https://orcid.org/0000-0002-9689-1236</orcidid><orcidid>https://orcid.org/0000-0001-7481-075X</orcidid><orcidid>https://orcid.org/0000-0001-6905-6350</orcidid></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 0018-9200
ispartof	IEEE journal of solid-state circuits, 2023-11, Vol.58 (11), p.1-9
issn	0018-9200 1558-173X
language	eng
recordid	cdi_ieee_primary_10164007
source	IEEE Electronic Library (IEL)
subjects	Artificial intelligent (AI); Artificial neural networks; Circuits; Convolution; convolutional neural network (CNN) CMOS image sensor (CIS); Convolutional neural networks; face detection (FD); feature extraction; Frequency modulation; Image classification; Inference; intelligent vision sensor (IVS); Kernel; Neural networks; processing-in-sensor (PIS); Prototypes; Pulse width modulation; Sensors; Task analysis; Vision sensors
title	A 0.8 V Intelligent Vision Sensor With Tiny Convolutional Neural Network and Programmable Weights Using Mixed-Mode Processing-in-Sensor Technique for Image Classification
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-03T15%3A35%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%200.8%20V%20Intelligent%20Vision%20Sensor%20With%20Tiny%20Convolutional%20Neural%20Network%20and%20Programmable%20Weights%20Using%20Mixed-Mode%20Processing-in-Sensor%20Technique%20for%20Image%20Classification&rft.jtitle=IEEE%20journal%20of%20solid-state%20circuits&rft.au=Hsu,%20Tzu-Hsiang&rft.date=2023-11-01&rft.volume=58&rft.issue=11&rft.spage=1&rft.epage=9&rft.pages=1-9&rft.issn=0018-9200&rft.eissn=1558-173X&rft.coden=IJSCBC&rft_id=info:doi/10.1109/JSSC.2023.3285734&rft_dat=%3Cproquest_RIE%3E2881502195%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2881502195&rft_id=info:pmid/&rft_ieee_id=10164007&rfr_iscdi=true
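
The figures quoted in the description field can be cross-checked with a little arithmetic. The sketch below is illustrative only and rests on assumptions the record does not confirm: that the 3 × 3, stride-3 convolution runs unpadded over the 126 × 126 active area, that the 2 × 2, stride-2 pooling is likewise unpadded, and that the reported iFoMs is power divided by (frame rate × pixel count). The helper names `window_out` and `energy_per_pixel_frame_pj` are hypothetical.

```python
def window_out(size: int, kernel: int, stride: int) -> int:
    """Output length of an unpadded sliding window along one axis."""
    return (size - kernel) // stride + 1


def energy_per_pixel_frame_pj(power_w: float, fps: float, pixels: int) -> float:
    """Energy per pixel per frame in picojoules (assumed iFoMs definition)."""
    return power_w / (fps * pixels) * 1e12


ACTIVE = 126  # active resolution per side, from the description field

# Assumed feature-map sizes: no padding, square maps.
conv_side = window_out(ACTIVE, kernel=3, stride=3)
pool_side = window_out(conv_side, kernel=2, stride=2)

# CNN mode figures from the description: 134.5 uW at 250 f/s.
cnn_ifom_pj = energy_per_pixel_frame_pj(134.5e-6, 250, ACTIVE * ACTIVE)

print(f"conv output: {conv_side} x {conv_side}")
print(f"pool output: {pool_side} x {pool_side}")
print(f"CNN-mode energy: {cnn_ifom_pj:.2f} pJ/pixel/frame")
```

Under these assumptions the convolution yields a 42 × 42 map and the pooling a 21 × 21 map, and the computed energy lands close to the reported 33.8 pJ/pixel·frame. Using the full 128 × 128 array instead gives a somewhat lower value, which suggests (but does not prove) that the reported figure is normalized to the active resolution.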