Policy-Based Reinforcement Learning for Through Silicon Via Array Design in High-Bandwidth Memory Considering Signal Integrity


Bibliographic details
Published in: IEEE Transactions on Electromagnetic Compatibility, 2024-02, Vol. 66 (1), pp. 256-269
Main authors: Kim, Keunwoo, Park, Hyunwook, Kim, Seongguk, Kim, Youngwoo, Son, Kyungjune, Lho, Daehwan, Son, Keeyoung, Shin, Taein, Sim, Boogyo, Park, Joonsang, Park, Shinyoung, Kim, Joungho
Format: Article
Language: English
Description
Abstract: This article proposes a policy-based reinforcement learning (RL) method for optimizing through-silicon via (TSV) array design in high-bandwidth memory (HBM) with signal integrity taken into account. The method provides an optimal signal/ground pattern for the TSV array that maximizes the eye opening (EO), which determines the bandwidth of the high-speed TSV channel. It adopts the proximal policy optimization algorithm, which trains the policy directly and therefore handles large action spaces more efficiently than value-based RL. A convolutional neural network serves as the feature extractor that captures the location information of the TSV array. To reduce the computational cost of reward estimation, a fast EO estimation method is developed based on equivalent circuit modeling and peak distortion analysis. The method is applied to optimize a 1-byte TSV array in a 16-high HBM and achieves an 18.2% increase in EO compared with the initial design. Its optimality is compared with that of a deep Q-network and a random search algorithm, and the proposed method shows 3.4% and 9.6% better optimality, respectively.
ISSN: 0018-9375, 1558-187X
DOI: 10.1109/TEMC.2023.3343700
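
The abstract names peak distortion analysis as one ingredient of the fast EO estimation. As a rough illustration of that general technique only, and not of the authors' implementation, the sketch below computes a worst-case eye height from a sampled single-bit (pulse) response. The function name, the unipolar signaling assumption, the choice of the pulse peak as the sampling phase, and all numeric values are hypothetical; the equivalent circuit model that would produce the pulse responses is not shown.

```python
import numpy as np


def worst_case_eye_height(pulse, samples_per_ui, xtalk_pulses=()):
    """Worst-case eye height by peak distortion analysis (illustrative sketch).

    pulse          : single-bit response of the victim channel (1-D array)
    samples_per_ui : number of samples per unit interval (assumed integer)
    xtalk_pulses   : single-bit responses coupled from aggressor channels,
                     on the same time grid as `pulse`

    Assumes unipolar (0/1) signaling, so the worst-case "1" is the main
    cursor minus all negative ISI/crosstalk cursors and the worst-case "0"
    is the sum of all positive ones. Their difference is
    main - sum(|ISI cursors|) - sum(|crosstalk cursors|).
    """
    pulse = np.asarray(pulse, dtype=float)

    # Assumption: sample at the peak of the pulse response.
    main_idx = int(np.argmax(pulse))
    phase = main_idx % samples_per_ui

    # Cursors are the pulse samples spaced one unit interval apart.
    cursors = pulse[phase::samples_per_ui]
    main = cursors[main_idx // samples_per_ui]

    # Every cursor except the main one contributes ISI in the worst case.
    isi = np.sum(np.abs(cursors)) - abs(main)

    # Every crosstalk cursor closes the eye in the worst case.
    xtalk = 0.0
    for x in xtalk_pulses:
        x = np.asarray(x, dtype=float)
        xtalk += np.sum(np.abs(x[phase::samples_per_ui]))

    return float(main - isi - xtalk)


if __name__ == "__main__":
    # Made-up pulse responses at 2 samples per unit interval.
    victim = [0.0, 0.05, 0.1, 0.4, 0.8, 0.5, -0.05, 0.0, 0.02, 0.0]
    aggressor = [0.0, 0.0, 0.03, 0.01, -0.02, 0.0]
    print(worst_case_eye_height(victim, 2, [aggressor]))
```

Because this estimate needs only one pulse response per channel instead of a long transient simulation, an evaluation of this kind is cheap enough to call once per candidate signal/ground pattern, which is the role the abstract assigns to the fast EO estimator in the reward computation.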