Enabling new interactions with library digital collections: automatic gender recognition in historical postcards via deep learning
Published in: | The Journal of academic librarianship 2023-07, Vol.49 (4), p.102736, Article 102736 |
---|---|
Format: | Article |
Language: | English |
Abstract: | The Walter Havighurst Special Collections from University Archives & Preservation at Miami University's King Library has a growing collection of over 600,000 historical postcards, with approximately 30,000 digitized, primarily from the Midwest during 1890–1919. This collection supports various lines of inquiry from users, such as analyzing the evolution of gender portrayal in popular media in the United States. However, manually separating the collection into postcards of males and females would take thousands of hours, which prevents the library from supporting sociological analyses at scale. After assembling an open postcard dataset, we trained deep neural networks (i.e., YOLOv5x object detection models) to automatically detect people and classify them as male or female. Our approach limited biases in favor of one outcome by balancing the number of males and females via multi-label stratified 10-fold cross-validation. We showed that this approach can accurately detect and classify females and confidently detect and label males for the library's collection of historical postcards. Our precision of 94.9 % and recall of 33.0 % from 1890 to 1919 on male gender detection exceed the performances of 94.7 % and 31 % respectively for recognition on World War I postcards in past studies. By employing our trained deep neural networks, the library can enhance its metadata within hours and support new research inquiries at scale.
• Users lack experience to navigate the unique materials hosted in online special collections, such as historical postcards.
• Special collections are increasingly releasing uncatalogued materials online, which provides even less metadata for users.
• Machine learning has emerged as a potential solution to derive additional metadata and support more detailed user queries.
• Prior research examined the use of text mining (e.g., for historical novels), but there is little work using computer vision.
• We create computer vision models that accurately extract gender from 28,308 postcards in special collections in 4.28 h. |
---|---|
ISSN: | 0099-1333 1879-1999 |
DOI: | 10.1016/j.acalib.2023.102736 |
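
The abstract reports precision and recall for male gender detection. As a reading aid, the sketch below shows how those two metrics are conventionally computed from detection counts. The function name `precision_recall` and the counts in the usage lines are hypothetical and chosen only for illustration; they are not taken from the article or its dataset.

```python
def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float]:
    """Return (precision, recall) from raw detection counts.

    precision = TP / (TP + FP): share of predicted detections that are correct.
    recall    = TP / (TP + FN): share of actual instances that were detected.
    A ratio with a zero denominator is reported as 0.0.
    """
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Hypothetical counts (not from the article): a detector that is rarely
# wrong when it labels a person, but misses most of the people present.
precision, recall = precision_recall(true_pos=330, false_pos=18, false_neg=670)
print(f"precision={precision:.3f}, recall={recall:.3f}")
```

A combination of high precision and low recall, as in the figures quoted in the abstract, means that the labels the model does assign are usually correct while many instances go unlabeled.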