Personalized Audio Quality Preference Prediction
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Keywords: | |
Online access: | Order full text |
Summary: | This paper proposes using both audio input and subject information to
predict a listener's personalized preference between two audio segments with
the same content but different quality. A Siamese network compares the two
inputs and predicts the preference. Several structures are investigated for
each side of the Siamese network. Among them, an LDNet with PANNs' CNN6 as the
encoder and a multi-layer perceptron block as the decoder improves the most
over a baseline model that uses only audio input, raising overall accuracy
from 77.56% to 78.04%. Experimental results also show that using all of the
subject information, including age, gender, and the specifications of the
headphones or earphones, is more effective than using only part of it. |
DOI: | 10.48550/arxiv.2302.08130 |
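
The summary describes a Siamese comparison: both audio segments pass through the same network together with the listener's information, and the two outputs are compared to predict a preference. A minimal sketch of that idea follows. It is not the paper's model: the LDNet and PANNs CNN6 encoder are replaced by hypothetical precomputed feature vectors, the small MLP stands in for the decoder, and comparing the two scores with a sigmoid of their difference is an assumption, since the summary does not say how the two sides are combined. All function names are illustrative.

```python
import math
import random


def make_mlp(sizes, rng):
    """Create random weights for a small MLP; `sizes` lists the layer widths."""
    return [
        ([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def mlp_forward(layers, x):
    """Apply the MLP: ReLU on hidden layers, linear output layer."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


def score(layers, audio_feat, subject_feat):
    """One side of the Siamese pair: score a segment for a given listener."""
    # Assumption: audio and subject features are simply concatenated.
    return mlp_forward(layers, audio_feat + subject_feat)[0]


def preference(layers, audio_a, audio_b, subject_feat):
    """Probability that the listener prefers segment A over segment B."""
    # Both sides share the same weights, which is what makes it Siamese.
    diff = (score(layers, audio_a, subject_feat)
            - score(layers, audio_b, subject_feat))
    return 1.0 / (1.0 + math.exp(-diff))


rng = random.Random(0)
layers = make_mlp([4 + 3, 8, 1], rng)   # 4 audio features + 3 subject features
audio_a = [0.9, 0.1, 0.4, 0.7]          # hypothetical features, higher-quality segment
audio_b = [0.2, 0.8, 0.3, 0.1]          # hypothetical features, lower-quality segment
subject = [0.35, 1.0, 0.5]              # hypothetical encoding: age, gender, headphone spec
p_ab = preference(layers, audio_a, audio_b, subject)
p_ba = preference(layers, audio_b, audio_a, subject)
```

Because both segments are scored with the same weights, swapping the inputs flips the prediction (`p_ab + p_ba` is 1 up to rounding), and comparing a segment with itself gives 0.5. The weights here are random and untrained, so the probabilities themselves carry no meaning.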