Generative Disco: Text-to-Video Generation for Music Visualization
Visuals can enhance our experience of music, owing to the way they can amplify the emotions and messages conveyed within it. However, creating music visualization is a complex, time-consuming, and resource-intensive process. We introduce Generative Disco, a generative AI system that helps generate m...
Main authors: | Liu, Vivian; Long, Tao; Raw, Nathan; Chilton, Lydia |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Liu, Vivian; Long, Tao; Raw, Nathan; Chilton, Lydia |
description | Visuals can enhance our experience of music, owing to the way they can
amplify the emotions and messages conveyed within it. However, creating music
visualization is a complex, time-consuming, and resource-intensive process. We
introduce Generative Disco, a generative AI system that helps generate music
visualizations with large language models and text-to-video generation. The
system helps users visualize music in intervals by finding prompts to describe
the images that intervals start and end on and interpolating between them to
the beat of the music. We introduce design patterns for improving these
generated videos: transitions, which express shifts in color, time, subject, or
style, and holds, which help focus the video on subjects. A study with
professionals showed that transitions and holds were a highly expressive
framework that enabled them to build coherent visual narratives. We conclude on
the generalizability of these patterns and the potential of generated video for
creative professionals. |
doi_str_mv | 10.48550/arxiv.2304.08551 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2304_08551</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2304_08551</sourcerecordid><originalsourceid>FETCH-LOGICAL-a671-cbe7bbbc5e543ae498185051be34514c5849803ccf5083358f43a37ef3376a1b3</originalsourceid><addsrcrecordid>eNo1j8FqAjEURbNxIdoPcGV-INOkL8-J3bW2asHiZnA7JPEFAtaUzCi2X99xrKsL514uHMYmShbaIMpHmy_xXDyB1IXsgBqy1xUdKds2nom_xcanZ17RpRVtEru4p8TvfTrykDL_PDXR811sTvYQf3s-ZoNgDw09_OeIVcv3arEWm-3qY_GyEXZWKuEdlc45j4QaLOm5UQYlKkegUWmPpkMSvA8oDQCa0M2gpABQzqxyMGLT220vUX_n-GXzT32VqXsZ-AOK-UP2</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Generative Disco: Text-to-Video Generation for Music Visualization</title><source>arXiv.org</source><creator>Liu, Vivian ; Long, Tao ; Raw, Nathan ; Chilton, Lydia</creator><creatorcontrib>Liu, Vivian ; Long, Tao ; Raw, Nathan ; Chilton, Lydia</creatorcontrib><description>Visuals can enhance our experience of music, owing to the way they can
amplify the emotions and messages conveyed within it. However, creating music
visualization is a complex, time-consuming, and resource-intensive process. We
introduce Generative Disco, a generative AI system that helps generate music
visualizations with large language models and text-to-video generation. The
system helps users visualize music in intervals by finding prompts to describe
the images that intervals start and end on and interpolating between them to
the beat of the music. We introduce design patterns for improving these
generated videos: transitions, which express shifts in color, time, subject, or
style, and holds, which help focus the video on subjects. A study with
professionals showed that transitions and holds were a highly expressive
framework that enabled them to build coherent visual narratives. We conclude on
the generalizability of these patterns and the potential of generated video for
creative professionals.</description><identifier>DOI: 10.48550/arxiv.2304.08551</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Human-Computer Interaction</subject><creationdate>2023-04</creationdate><rights>http://creativecommons.org/licenses/by-nc-sa/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2304.08551$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2304.08551$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Liu, Vivian</creatorcontrib><creatorcontrib>Long, Tao</creatorcontrib><creatorcontrib>Raw, Nathan</creatorcontrib><creatorcontrib>Chilton, Lydia</creatorcontrib><title>Generative Disco: Text-to-Video Generation for Music Visualization</title><description>Visuals can enhance our experience of music, owing to the way they can
amplify the emotions and messages conveyed within it. However, creating music
visualization is a complex, time-consuming, and resource-intensive process. We
introduce Generative Disco, a generative AI system that helps generate music
visualizations with large language models and text-to-video generation. The
system helps users visualize music in intervals by finding prompts to describe
the images that intervals start and end on and interpolating between them to
the beat of the music. We introduce design patterns for improving these
generated videos: transitions, which express shifts in color, time, subject, or
style, and holds, which help focus the video on subjects. A study with
professionals showed that transitions and holds were a highly expressive
framework that enabled them to build coherent visual narratives. We conclude on
the generalizability of these patterns and the potential of generated video for
creative professionals.</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Human-Computer Interaction</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNo1j8FqAjEURbNxIdoPcGV-INOkL8-J3bW2asHiZnA7JPEFAtaUzCi2X99xrKsL514uHMYmShbaIMpHmy_xXDyB1IXsgBqy1xUdKds2nom_xcanZ17RpRVtEru4p8TvfTrykDL_PDXR811sTvYQf3s-ZoNgDw09_OeIVcv3arEWm-3qY_GyEXZWKuEdlc45j4QaLOm5UQYlKkegUWmPpkMSvA8oDQCa0M2gpABQzqxyMGLT220vUX_n-GXzT32VqXsZ-AOK-UP2</recordid><startdate>20230417</startdate><enddate>20230417</enddate><creator>Liu, Vivian</creator><creator>Long, Tao</creator><creator>Raw, Nathan</creator><creator>Chilton, Lydia</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20230417</creationdate><title>Generative Disco: Text-to-Video Generation for Music Visualization</title><author>Liu, Vivian ; Long, Tao ; Raw, Nathan ; Chilton, Lydia</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a671-cbe7bbbc5e543ae498185051be34514c5849803ccf5083358f43a37ef3376a1b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Artificial Intelligence</topic><topic>Computer Science - Human-Computer Interaction</topic><toplevel>online_resources</toplevel><creatorcontrib>Liu, Vivian</creatorcontrib><creatorcontrib>Long, Tao</creatorcontrib><creatorcontrib>Raw, Nathan</creatorcontrib><creatorcontrib>Chilton, Lydia</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Liu, Vivian</au><au>Long, Tao</au><au>Raw, Nathan</au><au>Chilton, Lydia</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Generative Disco: 
Text-to-Video Generation for Music Visualization</atitle><date>2023-04-17</date><risdate>2023</risdate><abstract>Visuals can enhance our experience of music, owing to the way they can
amplify the emotions and messages conveyed within it. However, creating music
visualization is a complex, time-consuming, and resource-intensive process. We
introduce Generative Disco, a generative AI system that helps generate music
visualizations with large language models and text-to-video generation. The
system helps users visualize music in intervals by finding prompts to describe
the images that intervals start and end on and interpolating between them to
the beat of the music. We introduce design patterns for improving these
generated videos: transitions, which express shifts in color, time, subject, or
style, and holds, which help focus the video on subjects. A study with
professionals showed that transitions and holds were a highly expressive
framework that enabled them to build coherent visual narratives. We conclude on
the generalizability of these patterns and the potential of generated video for
creative professionals.</abstract><doi>10.48550/arxiv.2304.08551</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2304.08551 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2304_08551 |
source | arXiv.org |
subjects | Computer Science - Artificial Intelligence; Computer Science - Human-Computer Interaction |
title | Generative Disco: Text-to-Video Generation for Music Visualization |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T03%3A22%3A30IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Generative%20Disco:%20Text-to-Video%20Generation%20for%20Music%20Visualization&rft.au=Liu,%20Vivian&rft.date=2023-04-17&rft_id=info:doi/10.48550/arxiv.2304.08551&rft_dat=%3Carxiv_GOX%3E2304_08551%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
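
The description field above says the system interpolates between the images an interval starts and ends on, "to the beat of the music". The record gives no implementation details, so the following is only a minimal sketch of one way such beat-synchronized interpolation could work, not the authors' method. The function names, the square-root easing, and the assumption that the first beat falls exactly on the interval start are all hypothetical choices made for illustration.

```python
import math


def beat_synced_weights(start, end, bpm, fps):
    """Return one interpolation weight in [0, 1] per video frame.

    Hypothetical helper: assumes a constant tempo (bpm) and that a beat
    lands exactly on `start`. Progress advances fastest right after each
    beat, so the visual change appears to pulse with the music.
    """
    period = 60.0 / bpm                      # seconds per beat
    total_beats = (end - start) / period     # beats spanned by the interval
    n_frames = int(round((end - start) * fps)) + 1
    weights = []
    for i in range(n_frames):
        beat_pos = (i / fps) / period        # position measured in beats
        whole = math.floor(beat_pos)         # beats fully elapsed
        frac = beat_pos - whole              # position inside current beat
        # Square-root easing (an assumption): steep just after the beat.
        progress = (whole + math.sqrt(frac)) / total_beats
        weights.append(min(1.0, progress))
    return weights


def lerp(a, b, w):
    """Linearly blend two equal-length vectors (e.g. image latents)."""
    return [(1.0 - w) * x + w * y for x, y in zip(a, b)]


# A 2-second interval at 120 BPM, rendered at 4 frames per second.
weights = beat_synced_weights(start=0.0, end=2.0, bpm=120, fps=4)
start_vec, end_vec = [0.0, 0.0], [2.0, 4.0]  # stand-ins for image latents
frames = [lerp(start_vec, end_vec, w) for w in weights]
```

In this sketch the first frame equals the start vector and the last frame equals the end vector; a real text-to-video pipeline would blend model latents or prompt embeddings rather than two-element lists.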