MtArtGPT: A Multi-Task Art Generation System With Pre-Trained Transformer



Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, 2024-08, Vol. 34 (8), p. 6901-6912
Main Authors: Jin, Cong; Zhu, Ruolin; Zhu, Zixing; Yang, Lu; Yang, Min; Luo, Jiebo
Format: Article
Language: English
Description
Abstract: Instruction-tuned large language models are making rapid advances in artificial intelligence, where GPT-4 models have exhibited impressive multi-modal perception capabilities. Such models have been used as the core assistant for many tasks, including art generation. However, high-quality art generation relies heavily on human prompt engineering, which is generally uncontrollable. To address these issues, we propose a multi-task AI-generated content (AIGC) system for art generation. Specifically, a dense representation manager is designed to process multi-modal user queries and generate dense, applicable prompts for GPT. To enhance the artistic sophistication of the whole system, we fine-tune the GPT model on a meticulously collected prompt-art dataset. Furthermore, we introduce artistic benchmarks for evaluating the system based on professional knowledge. Experiments demonstrate the advantages of our proposed MtArtGPT system.
ISSN: 1051-8215, 1558-2205
DOI: 10.1109/TCSVT.2024.3349567