Automated metadata annotation: What is and is not possible with machine learning

Bibliographic details
Published in:	Data intelligence 2023-03, Vol.5 (1), p.122-138
Main authors:	Wu, Mingfang; Brandhorst, Hans; Marinescu, Maria-Cristina; Lopez, Joaquim More; Hlava, Margorie; Busch, Joseph
Format:	Article
Language:	eng
Subjects:	Algorithms; Annotations; Archives & records; Artificial intelligence; Automation; Cultural heritage; Cultural resources; Culture heritage; Datasets; Digitization; Iconography; Machine learning; Metadata; Metadata annotation; Metadata, Machine learning; Neural networks; Pattern recognition; Research data; Voice recognition
Online access:	Full text

Description
Summary:	Automated metadata annotation is only as good as the training dataset or the rules that are available for the domain. It is important to learn what type of content a pre-trained machine learning algorithm has been trained on in order to understand its limitations and potential biases. Consider what type of content is readily available to train an algorithm: what is popular and what is available. Scholarly and historical content, however, is often not available in consumable, homogenized, and interoperable formats at the large volume that machine learning requires. There are exceptions, such as science and medicine, where large, well-documented collections are available. This paper presents the current state of automated metadata annotation in cultural heritage and research data, discusses challenges identified from use cases, and proposes solutions.
ISSN:	2641-435X
DOI:	10.1162/dint_a_00162