Distributed File System to Leverage Data Locality for Large-File Processing

Bibliographic details
Published in: Electronics (Basel) 2024-01, Vol.13 (1), p.106
Main authors: da Silva, Erico Correia; Sato, Liria Matsumoto; Midorikawa, Edson Toshimi
Format: Article
Language: English
Online access: Full text
Description
Abstract: Over the past decade, significant technological advancements have led to a substantial increase in data proliferation. Both scientific computation and Big Data workloads play a central role, manipulating massive volumes of data and challenging conventional high-performance computing architectures. Efficiently processing voluminous files using cost-effective hardware remains a persistent challenge, restricting access to new technologies to individuals and organizations capable of higher investments. In response to this challenge, AwareFS, a novel distributed file system, addresses the efficient reading and updating of large files by consistently exploiting data locality on every copy. Its distributed metadata and lock management facilitate sequential and random I/O patterns with minimal data movement over the network. The evaluation of the AwareFS local-write protocol demonstrated efficiency across various update patterns, resulting in a performance improvement of approximately 13%, while benchmark assessments conducted across diverse cluster sizes and configurations underscored the flexibility and scalability of AwareFS. The innovative distributed mechanisms outlined herein are positioned to contribute to the evolution of emerging technologies related to the computation of data stored in large files.
ISSN:2079-9292
DOI:10.3390/electronics13010106