Fine-Tuning Multimodal Transformer Models for Generating Actions in Virtual and Real Environments
In this work, we propose and investigate an original approach to using a pre-trained multimodal transformer of a specialized architecture for controlling a robotic agent in an object manipulation task based on language instruction, which we refer to as RozumFormer. Our model is based on a bimodal (t...
Published in: | IEEE Access 2023-01, Vol.11, p.1-1 |
---|---|
Format: | Article |
Language: | English |