Secure Your Voice: An Oral Airflow-Based Continuous Liveness Detection for Voice Assistants
Published in: | Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 2019-12, Vol. 3 (4), p. 1-28 |
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Online Access: | Full text |
Abstract: | Voice control has attracted extensive attention recently as a prospective user interface (UI) to substitute for conventional touch control on smart devices. Voice assistants have become increasingly popular in our daily lives, especially for people who are visually impaired. However, the inherently insecure nature of voice biometrics means that voice assistants are vulnerable to spoofing attacks, as demonstrated by security experts. To secure the commands issued to voice assistants, in this paper we present a liveness detection system that provides continuous speaker verification on smart devices. The basic idea is to match the voice received by the smart device's microphone with the oral airflow of the user while speaking a command. The airflow is captured by an auxiliary commercial off-the-shelf airflow sensor. Specifically, we establish a theoretical model that depicts the relationship between oral airflow pressure and the phonemes in the user's speech. The system estimates a series of pressures from the speech according to this model, and then calculates the consistency between the estimated pressure signal and the actual pressure signal measured by the airflow sensor to determine whether a command is a genuine "live" voice or an artificially generated one. We evaluate the system with 26 participants and 30 different voice commands; it achieves an overall accuracy of 97.25% with an Equal Error Rate (EER) of 2.08%. |
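The abstract does not specify which consistency measure the authors use to compare the estimated and measured pressure signals. A minimal sketch of the final decision step, assuming (hypothetically) that a normalized cross-correlation between the two signals is thresholded to declare a command "live":

```python
import numpy as np

def liveness_score(estimated: np.ndarray, measured: np.ndarray) -> float:
    """Consistency between the pressure signal estimated from speech and the
    signal measured by the airflow sensor. Pearson correlation is used here as
    a stand-in metric; the paper's actual consistency measure may differ."""
    e = (estimated - estimated.mean()) / estimated.std()
    m = (measured - measured.mean()) / measured.std()
    # Mean of the element-wise product of z-scored signals = Pearson r
    return float(np.dot(e, m) / len(e))

def is_live(estimated: np.ndarray, measured: np.ndarray,
            threshold: float = 0.7) -> bool:
    """Accept the command as a genuine 'live' voice when the two pressure
    signals agree strongly. The threshold value is illustrative, not the
    operating point that yields the reported 2.08% EER."""
    return liveness_score(estimated, measured) >= threshold
```

In a deployed system the threshold would be chosen on held-out data to balance false accepts and false rejects, i.e., at or near the equal-error-rate operating point.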
ISSN: | 2474-9567 |
DOI: | 10.1145/3369811 |