Multi-dimensional fusion: transformer and GANs-based multimodal audiovisual perception robot for musical performance art

Bibliographic details
Published in: Frontiers in neurorobotics 2023-09, Vol.17, p.1281944-1281944
Main authors: Lu, Shiyi; Wang, Panpan
Format: Article
Language: English
Description
Abstract: In the midst of rapid societal evolution, the appreciation of artistic creations has been undergoing continuous transformation. Audience demands have shifted towards experiences that resonate on deeper emotional levels. Against this backdrop, multimodal robot music performance art emerges as a novel form of artistic expression. This paper explores the fusion of music and motion in robot performances to enhance expressiveness and emotional impact. We employ Transformer models to combine audio and video signals, enabling robots to better understand the rhythm, melody, and emotion of music during performances. Generative Adversarial Networks enable robots to create lifelike visual performances based on music, merging auditory and visual perception. Through multimodal reinforcement learning, robots synchronize their actions with music, achieving harmonious alignment between sound and motion. Our experiments validate our approach across diverse music styles and emotions. We use metrics such as accuracy, recall rate, and F1 score to quantify the impact of our methodology. For instance, our approach achieves a performance smoothness score exceeding 94 points, a 95% accuracy rate, and a significant 33% enhancement in performance recall rate compared to baseline modules. The collective elevation in F1 score underscores the advantages of our approach within the realm of robot music performance art.
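Note on the metrics named in the abstract: accuracy, recall, and F1 score have standard definitions, and a minimal Python sketch of them is given here for orientation. It is not the authors' evaluation code. The function names and the toy label lists are hypothetical, and the record does not say how the paper assigns labels to performances or how its smoothness score is computed.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, positive=1):
    """Standard binary precision, recall, and F1 for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy labels (illustration only, not data from the paper).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

acc = accuracy(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"accuracy={acc:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

F1 is the harmonic mean of precision and recall, so a gain in recall raises F1 only to the extent that precision does not fall by a comparable amount.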
ISSN: 1662-5218
DOI: 10.3389/fnbot.2023.1281944