DSDL: Data Set Description Language for Bridging Modalities and Tasks in AI Data
Main authors: | , , , , , , , , , |
---|---|
Format: | Article |
Language: | English |
Abstract: | In the era of artificial intelligence, the diversity of data modalities and
annotation formats often makes datasets unusable as published: researchers and
developers with different needs must first understand each format and convert
it before they can use the data. To tackle this problem, this article
introduces a framework called Dataset Description Language (DSDL) that aims to
simplify dataset processing by providing a unified standard for AI datasets.
DSDL follows three practical design principles: it is generic, portable, and
extensible. It uses a unified standard to express data of different modalities
and structures, facilitates the dissemination of AI data, and extends easily
to new modalities and tasks. The standardized specifications of DSDL reduce the
workload for users in data dissemination, processing, and usage. To further
improve user convenience, we provide predefined DSDL templates for various
tasks, convert mainstream datasets to comply with DSDL specifications, and
provide comprehensive documentation and DSDL tools. These efforts aim to
simplify the use of AI data, thereby improving the efficiency of AI
development. |
DOI: | 10.48550/arxiv.2405.18315 |