Human Detection of Political Speech Deepfakes across Transcripts, Audio, and Video
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Recent advances in technology for hyper-realistic visual and audio effects
provoke the concern that deepfake videos of political speeches will soon be
indistinguishable from authentic video recordings. The conventional wisdom in
communication theory predicts people will fall for fake news more often when
the same version of a story is presented as a video versus text. We conduct 5
pre-registered randomized experiments with 2,215 participants to evaluate how
accurately humans distinguish real political speeches from fabrications across
base rates of misinformation, audio sources, question framings, and media
modalities. We find base rates of misinformation minimally influence
discernment and deepfakes with audio produced by the state-of-the-art
text-to-speech algorithms are harder to discern than the same deepfakes with
voice actor audio. Moreover, across all experiments, we find audio and visual
information enables more accurate discernment than text alone: human
discernment relies more on how something is said, the audio-visual cues, than
what is said, the speech content. |
---|---|
DOI: | 10.48550/arxiv.2202.12883 |