Towards high-performance deep learning architecture and hardware accelerator design for robust analysis in diffuse correlation spectroscopy

Detailed description

Bibliographic details
Published in:	Computer methods and programs in biomedicine 2025-01, Vol.258, p.108471, Article 108471
Main authors:	Zang, Zhenya; Wang, Quan; Pan, Mingliang; Zhang, Yuanzhe; Chen, Xi; Li, Xingda; Li, David Day Uei
Format:	Article
Language:	English
Subjects:	Algorithms; Blood flow index; Deep Learning; Deep neural networks; Deep-learning hardware accelerator; Diffuse correlation spectroscope; Equipment Design; Humans; Neural Networks, Computer; Phantoms, Imaging; Spectrum Analysis
Online access:	Full text

Description
Summary:
• We provided a comprehensive discussion of the rigorous mathematical model of diffuse correlation spectroscopy (DCS) and used it to generate synthetic data for training deep neural networks (DNNs). We compared our model extensively with conventional non-linear fitting algorithms and vanilla DNNs, evaluating their performance in reconstructing the blood flow index (BFi). Using a combination of analytical, in-silico simulation, and real liquid phantom datasets, our results highlighted the robustness of our model against high levels of noise. Additionally, we explored relative BFi measurements for practical clinical applications and provided insights into the interpretability of our DNN model.
• We implemented the compact DNN on an FPGA to validate its computational efficiency. The implementation, which employs various fixed-point bit-width compression strategies, runs faster and achieves a higher throughput-to-power ratio than high-performance CPUs and GPUs across different batch sizes when executing the same tasks.
• We acquired real data from an APD-based DCS platform. A liquid phantom (diluted milk) was used to generate the autocorrelation function with a commercial hardware correlator for quantitative evaluation.

This study proposes a compact deep learning (DL) architecture and a highly parallelized computing hardware platform to reconstruct the blood flow index (BFi) in diffuse correlation spectroscopy (DCS). We leveraged a rigorous analytical model to generate autocorrelation functions (ACFs) to train the DL network. We assessed the accuracy of the proposed DL architecture using simulated and milk phantom data. Compared to convolutional neural networks (CNNs), our lightweight DL architecture achieves 66.7% and 18.5% improvements in MSE for BFi and the coherence factor β, respectively, when evaluated on synthetic data. We also investigated the accuracy of relative BFi (rBFi) across different algorithms. With subsequent hardware implementation in mind, we further simplified the DL computing primitives by using subtraction for feature extraction. We extensively explored computing parallelism and fixed-point quantization within the DL architecture. Given the DL model's compact size, we employed unrolling and pipelining optimizations for computation-intensive for-loops in the DL model while storing all learned parameters in on-chip BRAMs. We also achieved pixel-wise parallelism, enabling simultaneous
ISSN:	0169-2607, 1872-7565
DOI:	10.1016/j.cmpb.2024.108471
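
Illustrative example (not part of the catalog record): the summary states that ACFs were generated from an analytical model to train the network, but this record does not say which model or which parameters were used. The sketch below assumes the widely used semi-infinite homogeneous-medium solution of the correlation diffusion equation together with the Siegert relation g2 = 1 + β·g1². The function name g2_semi_infinite, the lag grid, and every default value (source-detector separation, optical properties, wavelength, refractive index, effective reflection coefficient) are hypothetical choices made for illustration only.

```python
import math


def g2_semi_infinite(tau, bfi, beta=0.5, rho=1.5, mu_a=0.1, mu_s_prime=10.0,
                     wavelength_cm=785e-7, n=1.33, r_eff=0.5):
    """Noise-free intensity ACF g2(tau) for a semi-infinite homogeneous medium.

    Units: lengths in cm, coefficients in 1/cm, bfi in cm^2/s, tau in s.
    All defaults are illustrative assumptions, not values from the article.
    """
    k0 = 2.0 * math.pi * n / wavelength_cm   # wavenumber of light in the medium
    z0 = 1.0 / mu_s_prime                    # depth of the equivalent isotropic source
    # Extrapolated-boundary distance for the assumed reflection coefficient.
    zb = 2.0 * (1.0 + r_eff) / (3.0 * mu_s_prime * (1.0 - r_eff))
    r1 = math.hypot(rho, z0)                 # distance to the real source
    r2 = math.hypot(rho, z0 + 2.0 * zb)      # distance to the image source

    def field_acf(t):
        # Unnormalized field ACF; the constant prefactor cancels in g1.
        k = math.sqrt(3.0 * mu_a * mu_s_prime
                      + 6.0 * mu_s_prime ** 2 * k0 ** 2 * bfi * t)
        return math.exp(-k * r1) / r1 - math.exp(-k * r2) / r2

    g1 = field_acf(tau) / field_acf(0.0)     # normalized field ACF
    return 1.0 + beta * g1 * g1              # Siegert relation


# One synthetic training curve on a logarithmic lag grid (1e-7 s to ~0.08 s).
taus = [10.0 ** (-7.0 + 0.1 * i) for i in range(60)]
curve = [g2_semi_infinite(t, bfi=1e-8) for t in taus]
```

A training set in this spirit would sweep bfi and beta over chosen ranges and add a noise model to each curve; the noise model is deliberately left out here because the record does not describe it.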