Master's Thesis - Audio Files

Bibliographic details
Main author: Schuster, Paul
Format: Video
Language: eng
Description
Summary: Comparison of machine learning-based music source separation algorithms with respect to vocal timbre. Master's Thesis - Audio Files.
State-of-the-art music source separation (MSS) algorithms are commonly evaluated using standard evaluation metrics. This thesis uses spectral features such as the Spectral Centroid to describe the quality of selected algorithms. Vocal recordings produced specifically for this thesis make it possible to draw conclusions about the quality of modern MSS algorithms. It can be argued with confidence that the singers' gender and the song's language have almost no influence on the quality of the algorithms, whereas the genre and the associated instrumentation play a much more significant role. The thesis proposes, among other metrics, the Mean Absolute Error of Spectral Centroids (MAESC), which could be used in the development of future MSS algorithms.
In the course of further investigation, vocal recordings were made to serve as additional data material. The recordings were made at the University of Music and Performing Arts Vienna (MDW) on 3 July 2024. Two female and two male singers sang the same five pop songs under the same conditions: the same studio, the same microphone, and the same vocal processing chain. The songs were selected in advance for variety of musical genre and instrumentation. The musical accompaniment was provided by the company “Tency Music”. The performers' vocal quality and technique, as well as the technical equipment, are state of the art.
The recordings were made with an Apple MacBook Air (M2, 2022) running Ableton Live 11. The 2-channel USB-C audio interface SSL 2 from “Solid State Logic” was connected to the MacBook Air, and the microphone was a Neumann U87 Ai studio microphone. The recordings took place in a professional recording studio, and the conditions were identical for all singers. The recording engineer processed all files with the same effects chain (delay, reverb) so that the results can be compared.
The respective MIXES and REFERENCES of the individual singers can be downloaded. The audio files provided are protected by copyright and may only be used for academic purposes.
Vienna, 16/09/2024
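The summary names the Mean Absolute Error of Spectral Centroids (MAESC) but this record does not give its exact definition. The following is a minimal sketch of one plausible reading, not the thesis's implementation: it assumes the spectral centroid is computed per Hann-windowed frame as the magnitude-weighted mean frequency, and that MAESC is the mean absolute difference in Hz between the centroids of a reference vocal and a separated estimate. The function names `spectral_centroids` and `maesc` and the `frame_size` and `hop_size` defaults are hypothetical choices made for illustration.

```python
import numpy as np


def spectral_centroids(signal, sample_rate, frame_size=2048, hop_size=512):
    """Return one spectral centroid (in Hz) per frame of a mono signal."""
    window = np.hanning(frame_size)
    freqs = np.fft.rfftfreq(frame_size, d=1.0 / sample_rate)
    centroids = []
    for start in range(0, len(signal) - frame_size + 1, hop_size):
        frame = signal[start:start + frame_size] * window
        magnitude = np.abs(np.fft.rfft(frame))
        total = magnitude.sum()
        # Assumption: a silent frame has no defined centroid; use 0 Hz here.
        centroid = (freqs * magnitude).sum() / total if total > 0 else 0.0
        centroids.append(float(centroid))
    return np.array(centroids)


def maesc(reference, estimate, sample_rate, frame_size=2048, hop_size=512):
    """Mean absolute error (Hz) between frame-wise spectral centroids.

    Assumed definition for illustration; the thesis may define it differently.
    """
    n = min(len(reference), len(estimate))  # compare the overlapping part only
    if n < frame_size:
        raise ValueError("signals are shorter than one analysis frame")
    ref_c = spectral_centroids(reference[:n], sample_rate, frame_size, hop_size)
    est_c = spectral_centroids(estimate[:n], sample_rate, frame_size, hop_size)
    return float(np.mean(np.abs(ref_c - est_c)))


if __name__ == "__main__":
    sr = 44100
    t = np.arange(sr) / sr
    reference = np.sin(2 * np.pi * 440.0 * t)  # stand-in for a REFERENCE vocal
    estimate = np.sin(2 * np.pi * 880.0 * t)   # stand-in for a separated vocal
    print(maesc(reference, estimate, sr))
```

With the two synthetic tones above, the result should come out close to the 440 Hz gap between them, and comparing a signal with itself gives 0. For real data, the downloadable REFERENCES would play the role of `reference` and the vocals separated from the MIXES the role of `estimate`, both loaded as mono arrays at the same sample rate.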
DOI:10.5281/zenodo.13749284