PreSto: An In-Storage Data Preprocessing System for Training Recommendation Models
Published at the 51st IEEE/ACM International Symposium on Computer Architecture (ISCA-51), 2024.
Main authors: | , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Summary: | Published at the 51st IEEE/ACM International Symposium on Computer
Architecture (ISCA-51), 2024. Training recommendation systems (RecSys) faces several challenges, as it
requires a "data preprocessing" stage to preprocess a large amount of raw
data and feed it to the GPU for training in a seamless manner. To sustain
high training throughput, state-of-the-art solutions reserve a large fleet of
CPU servers for preprocessing, which incurs substantial deployment cost and
power consumption. Our characterization reveals that prior CPU-centric
preprocessing is bottlenecked on feature generation and feature normalization
operations, as it fails to exploit the abundant inter-/intra-feature
parallelism in RecSys preprocessing. PreSto is a storage-centric preprocessing
system leveraging In-Storage Processing (ISP), which offloads the bottlenecked
preprocessing operations to our ISP units. We show that PreSto outperforms the
baseline CPU-centric system with a $9.6\times$ speedup in end-to-end
preprocessing time, a $4.3\times$ enhancement in cost-efficiency, and an
$11.3\times$ improvement in energy efficiency on average for production-scale
RecSys preprocessing. |
---|---|
DOI: | 10.48550/arxiv.2406.14571 |