LoCoSV: Logically consistent story visualization with sequential conditional GAN

Bibliographic details
Main authors: Devi, M. S. Karthika, Baskaran, R., Bhuvaneshwari, R.
Format: Conference proceeding
Language: English
Description
Summary: Story visualization aims to interpret story content and produce a corresponding set of images that is logically coherent. Methods and tools for generating images from text are in high demand because educators and students continually face new challenges in presenting and understanding story material. Finding images that depict uncommon characters and events in a story, however, often requires a great deal of manual labor and is occasionally neglected. The proposed work tackles automatic scene sequence generation, a new multimedia task that eases this effort and encourages teachers and students to visualize stories effectively. Achieving this requires transforming the story's words into a series of visuals: an unbroken sequence of consistent images that all relate to the same narrative or incident. Hence, a novel task called Story Visualization is proposed. Given a multi-sentence passage, a narrative story is visualized by creating an image sequence, one image for every sentence or phrase. Story visualization places less emphasis on the narrative's flow and more on overall coherence among the story's dynamic scenes and characters. This remains a challenge that no conventional image or video generation method has addressed. We therefore present an approach, Logically Consistent Story Visualization (LoCoSV), to address these issues. LoCoSV learns to represent the story visually through three key modules: figure-ground segmentation (an auxiliary task that provides information for maintaining character and story consistency), a story and context encoder (which learns story and sentence representations), and figure-ground aware generation (which uses figure-ground information to generate the image sequence). Various evaluation metrics and human evaluation studies were used to assess performance; they demonstrate that LoCoSV improves generation over baseline models, especially in the logical coherency and background consistency of the stories.
ISSN: 0094-243X, 1551-7616
DOI: 10.1063/5.0185004