Measuring When a Music Generation Algorithm Copies Too Much: The Originality Report, Cardinality Score, and Symbolic Fingerprinting by Geometric Hashing
Saved in:
Published in: | SN Computer Science 2022-09, Vol. 3 (5), p. 340, Article 340 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | Research on automatic music generation lacks consideration of the originality of musical outputs, creating risks of plagiarism and/or copyright infringement. We present the
originality report
—a set of analyses that is parameterised by a “similarity score”—for measuring the extent to which an algorithm copies from the input music. First, we construct a baseline, to determine the extent to which human composers borrow from themselves and each other in some existing music corpus. Second, we apply a similar analysis to musical outputs of runs of MAIA Markov and Music Transformer generation algorithms, and compare the results to the baseline. Third, we investigate how originality varies as a function of the Transformer’s training epoch. Fourth, we demonstrate the originality report with a different “similarity score” based on symbolic fingerprinting, encompassing music with more complex, expressive timing information. Results indicate that the originality of the Transformer’s output is below the 95% confidence interval of the baseline. Musicological interpretation of the analyses shows that the Transformer model obtained via the conventional stopping criterion produces single-note repetition patterns, resulting in outputs of low quality and originality, while in later training epochs, the model tends to overfit, producing copies of excerpts of input pieces. Even with a larger data set, the same copying issues still exist. Thus, we recommend the originality report as a new means of evaluating algorithm training processes and outputs in the future, and question the reported success of language-based deep learning models for music generation. Supporting materials (data sets and code) are available via
https://osf.io/96emr/
. |
---|---|
ISSN: | 2662-995X 2661-8907 |
DOI: | 10.1007/s42979-022-01220-y |
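The “cardinality score” named in the article’s title can be sketched as follows. This is a minimal illustration, not the authors’ implementation: it assumes notes are represented as (onset, MIDI pitch) point pairs and uses the standard maximal-translation definition of the score (the largest number of points of one set that a single translation superimposes on the other, normalised by the larger set’s size); the function name and the example data are hypothetical.

```python
from collections import Counter

def cardinality_score(p, q):
    """Cardinality score between two note point sets.

    p, q: iterables of (onset, MIDI pitch) pairs. Returns the largest
    number of points of p that one translation maps onto points of q,
    divided by the size of the larger set.
    """
    p, q = list(set(p)), list(set(q))
    if not p or not q:
        return 0.0
    # Tally, for every pair of points, the translation that would align them;
    # the most-voted translation gives the maximal match count.
    votes = Counter(
        (qx - px, qy - py) for (px, py) in p for (qx, qy) in q
    )
    return max(votes.values()) / max(len(p), len(q))

# A time-shifted, transposed copy scores 1.0; unrelated material scores low.
theme = [(0, 60), (1, 62), (2, 64), (3, 65)]
copy_ = [(8, 67), (9, 69), (10, 71), (11, 72)]  # shifted by (+8, +7)
print(cardinality_score(theme, copy_))  # -> 1.0
```

Because the score is translation-invariant, exact copies score 1.0 even when transposed or delayed, which is what makes thresholding it against a human-composer baseline (as the originality report does) meaningful.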