HPCNeuroNet: Advancing Neuromorphic Audio Signal Processing with Transformer-Enhanced Spiking Neural Networks

This paper presents a novel approach to neuromorphic audio processing by integrating the strengths of Spiking Neural Networks (SNNs), Transformers, and high-performance computing (HPC) into the HPCNeuroNet architecture. Utilizing the Intel N-DNS dataset, we demonstrate the system's capability t...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Isik, Murat, Vishwamith, Hiruna, Inadagbo, Kayode, Dikmen, I. Can
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Isik, Murat
Vishwamith, Hiruna
Inadagbo, Kayode
Dikmen, I. Can
description This paper presents a novel approach to neuromorphic audio processing by integrating the strengths of Spiking Neural Networks (SNNs), Transformers, and high-performance computing (HPC) into the HPCNeuroNet architecture. Utilizing the Intel N-DNS dataset, we demonstrate the system's capability to process diverse human vocal recordings across multiple languages and noise backgrounds. The core of our approach lies in the fusion of the temporal dynamics of SNNs with the attention mechanisms of Transformers, enabling the model to capture intricate audio patterns and relationships. Our architecture, HPCNeuroNet, employs the Short-Time Fourier Transform (STFT) for time-frequency representation, Transformer embeddings for dense vector generation, and SNN encoding/decoding mechanisms for spike train conversions. The system's performance is further enhanced by leveraging the computational capabilities of NVIDIA's GeForce RTX 3060 GPU and Intel's Core i9 12900H CPU. Additionally, we introduce a hardware implementation on the Xilinx VU37P HBM FPGA platform, optimizing for energy efficiency and real-time processing. The proposed accelerator achieves a throughput of 71.11 Giga-Operations Per Second (GOP/s) with a 3.55 W on-chip power consumption at 100 MHz. The comparison results with off-the-shelf devices and recent state-of-the-art implementations illustrate that the proposed accelerator has obvious advantages in terms of energy efficiency and design flexibility. Through design-space exploration, we provide insights into optimizing core capacities for audio tasks. Our findings underscore the transformative potential of integrating SNNs, Transformers, and HPC for neuromorphic audio processing, setting a new benchmark for future research and applications.
doi_str_mv 10.48550/arxiv.2311.12449
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2311_12449</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2311_12449</sourcerecordid><originalsourceid>FETCH-LOGICAL-a679-e78d6bd5390f05110e42315e929b4e2a2a5103cf14bd41e663e04533f9fd300b3</originalsourceid><addsrcrecordid>eNo1j8tOwzAQRb1hgQofwAr_QIId22nNLooKRapKpWYfOfG4sdrYkd0H_D1pgNVIV3OP7kHoiZKUL4QgLyp82UuaMUpTmnEu71G_2pYbOAe_gdMrLvRFuda6PZ6y3oehsy0uztp6vLN7p454G3wLMd6ervbU4SooF40PPYRk6bqxDxrvBnv4x4ydEX714RAf0J1RxwiPf3eGqrdlVa6S9ef7R1msE5XPZQLzhc4bLZgkhghKCfBxsgCZyYZDpjIlKGGtobzRnEKeMyBcMGak0YyQhs3Q8y928q2HYHsVvuubdz15sx-zelPo</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>HPCNeuroNet: Advancing Neuromorphic Audio Signal Processing with Transformer-Enhanced Spiking Neural Networks</title><source>arXiv.org</source><creator>Isik, Murat ; Vishwamith, Hiruna ; Inadagbo, Kayode ; Dikmen, I. Can</creator><creatorcontrib>Isik, Murat ; Vishwamith, Hiruna ; Inadagbo, Kayode ; Dikmen, I. Can</creatorcontrib><description>This paper presents a novel approach to neuromorphic audio processing by integrating the strengths of Spiking Neural Networks (SNNs), Transformers, and high-performance computing (HPC) into the HPCNeuroNet architecture. Utilizing the Intel N-DNS dataset, we demonstrate the system's capability to process diverse human vocal recordings across multiple languages and noise backgrounds. The core of our approach lies in the fusion of the temporal dynamics of SNNs with the attention mechanisms of Transformers, enabling the model to capture intricate audio patterns and relationships. Our architecture, HPCNeuroNet, employs the Short-Time Fourier Transform (STFT) for time-frequency representation, Transformer embeddings for dense vector generation, and SNN encoding/decoding mechanisms for spike train conversions. The system's performance is further enhanced by leveraging the computational capabilities of NVIDIA's GeForce RTX 3060 GPU and Intel's Core i9 12900H CPU. Additionally, we introduce a hardware implementation on the Xilinx VU37P HBM FPGA platform, optimizing for energy efficiency and real-time processing. The proposed accelerator achieves a throughput of 71.11 Giga-Operations Per Second (GOP/s) with a 3.55 W on-chip power consumption at 100 MHz. The comparison results with off-the-shelf devices and recent state-of-the-art implementations illustrate that the proposed accelerator has obvious advantages in terms of energy efficiency and design flexibility. Through design-space exploration, we provide insights into optimizing core capacities for audio tasks. Our findings underscore the transformative potential of integrating SNNs, Transformers, and HPC for neuromorphic audio processing, setting a new benchmark for future research and applications.</description><identifier>DOI: 10.48550/arxiv.2311.12449</identifier><language>eng</language><subject>Computer Science - Sound</subject><creationdate>2023-11</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2311.12449$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2311.12449$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Isik, Murat</creatorcontrib><creatorcontrib>Vishwamith, Hiruna</creatorcontrib><creatorcontrib>Inadagbo, Kayode</creatorcontrib><creatorcontrib>Dikmen, I. Can</creatorcontrib><title>HPCNeuroNet: Advancing Neuromorphic Audio Signal Processing with Transformer-Enhanced Spiking Neural Networks</title><description>This paper presents a novel approach to neuromorphic audio processing by integrating the strengths of Spiking Neural Networks (SNNs), Transformers, and high-performance computing (HPC) into the HPCNeuroNet architecture. Utilizing the Intel N-DNS dataset, we demonstrate the system's capability to process diverse human vocal recordings across multiple languages and noise backgrounds. The core of our approach lies in the fusion of the temporal dynamics of SNNs with the attention mechanisms of Transformers, enabling the model to capture intricate audio patterns and relationships. Our architecture, HPCNeuroNet, employs the Short-Time Fourier Transform (STFT) for time-frequency representation, Transformer embeddings for dense vector generation, and SNN encoding/decoding mechanisms for spike train conversions. The system's performance is further enhanced by leveraging the computational capabilities of NVIDIA's GeForce RTX 3060 GPU and Intel's Core i9 12900H CPU. Additionally, we introduce a hardware implementation on the Xilinx VU37P HBM FPGA platform, optimizing for energy efficiency and real-time processing. The proposed accelerator achieves a throughput of 71.11 Giga-Operations Per Second (GOP/s) with a 3.55 W on-chip power consumption at 100 MHz. The comparison results with off-the-shelf devices and recent state-of-the-art implementations illustrate that the proposed accelerator has obvious advantages in terms of energy efficiency and design flexibility. Through design-space exploration, we provide insights into optimizing core capacities for audio tasks. Our findings underscore the transformative potential of integrating SNNs, Transformers, and HPC for neuromorphic audio processing, setting a new benchmark for future research and applications.</description><subject>Computer Science - Sound</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNo1j8tOwzAQRb1hgQofwAr_QIId22nNLooKRapKpWYfOfG4sdrYkd0H_D1pgNVIV3OP7kHoiZKUL4QgLyp82UuaMUpTmnEu71G_2pYbOAe_gdMrLvRFuda6PZ6y3oehsy0uztp6vLN7p454G3wLMd6ervbU4SooF40PPYRk6bqxDxrvBnv4x4ydEX714RAf0J1RxwiPf3eGqrdlVa6S9ef7R1msE5XPZQLzhc4bLZgkhghKCfBxsgCZyYZDpjIlKGGtobzRnEKeMyBcMGak0YyQhs3Q8y928q2HYHsVvuubdz15sx-zelPo</recordid><startdate>20231121</startdate><enddate>20231121</enddate><creator>Isik, Murat</creator><creator>Vishwamith, Hiruna</creator><creator>Inadagbo, Kayode</creator><creator>Dikmen, I. Can</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20231121</creationdate><title>HPCNeuroNet: Advancing Neuromorphic Audio Signal Processing with Transformer-Enhanced Spiking Neural Networks</title><author>Isik, Murat ; Vishwamith, Hiruna ; Inadagbo, Kayode ; Dikmen, I. Can</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a679-e78d6bd5390f05110e42315e929b4e2a2a5103cf14bd41e663e04533f9fd300b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Sound</topic><toplevel>online_resources</toplevel><creatorcontrib>Isik, Murat</creatorcontrib><creatorcontrib>Vishwamith, Hiruna</creatorcontrib><creatorcontrib>Inadagbo, Kayode</creatorcontrib><creatorcontrib>Dikmen, I. Can</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Isik, Murat</au><au>Vishwamith, Hiruna</au><au>Inadagbo, Kayode</au><au>Dikmen, I. Can</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>HPCNeuroNet: Advancing Neuromorphic Audio Signal Processing with Transformer-Enhanced Spiking Neural Networks</atitle><date>2023-11-21</date><risdate>2023</risdate><abstract>This paper presents a novel approach to neuromorphic audio processing by integrating the strengths of Spiking Neural Networks (SNNs), Transformers, and high-performance computing (HPC) into the HPCNeuroNet architecture. Utilizing the Intel N-DNS dataset, we demonstrate the system's capability to process diverse human vocal recordings across multiple languages and noise backgrounds. The core of our approach lies in the fusion of the temporal dynamics of SNNs with the attention mechanisms of Transformers, enabling the model to capture intricate audio patterns and relationships. Our architecture, HPCNeuroNet, employs the Short-Time Fourier Transform (STFT) for time-frequency representation, Transformer embeddings for dense vector generation, and SNN encoding/decoding mechanisms for spike train conversions. The system's performance is further enhanced by leveraging the computational capabilities of NVIDIA's GeForce RTX 3060 GPU and Intel's Core i9 12900H CPU. Additionally, we introduce a hardware implementation on the Xilinx VU37P HBM FPGA platform, optimizing for energy efficiency and real-time processing. The proposed accelerator achieves a throughput of 71.11 Giga-Operations Per Second (GOP/s) with a 3.55 W on-chip power consumption at 100 MHz. The comparison results with off-the-shelf devices and recent state-of-the-art implementations illustrate that the proposed accelerator has obvious advantages in terms of energy efficiency and design flexibility. Through design-space exploration, we provide insights into optimizing core capacities for audio tasks. Our findings underscore the transformative potential of integrating SNNs, Transformers, and HPC for neuromorphic audio processing, setting a new benchmark for future research and applications.</abstract><doi>10.48550/arxiv.2311.12449</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2311.12449
ispartof
issn
language eng
recordid cdi_arxiv_primary_2311_12449
source arXiv.org
subjects Computer Science - Sound
title HPCNeuroNet: Advancing Neuromorphic Audio Signal Processing with Transformer-Enhanced Spiking Neural Networks
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-18T20%3A34%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=HPCNeuroNet:%20Advancing%20Neuromorphic%20Audio%20Signal%20Processing%20with%20Transformer-Enhanced%20Spiking%20Neural%20Networks&rft.au=Isik,%20Murat&rft.date=2023-11-21&rft_id=info:doi/10.48550/arxiv.2311.12449&rft_dat=%3Carxiv_GOX%3E2311_12449%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true