Towards Visualizing and Detecting Audio Adversarial Examples for Automatic Speech Recognition
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Automatic speech recognition (ASR) systems are now ubiquitous in many commonly used applications, as various commercial products rely on ASR techniques, which are increasingly based on machine learning, to transcribe voice commands into text for further processing. However, audio adversarial examples (AEs) have emerged as a serious security threat, as they have been shown to be able to fool ASR models into producing incorrect results. Although there are proposed methods to defend against audio AEs, the intrinsic properties of audio AEs compared with benign audio have not been well studied. In this paper, we show that the machine learning decision boundary patterns around audio AEs and benign audio are fundamentally different. In addition, using dimensionality reduction techniques, we show that these different patterns can be distinguished visually in 2D space. Based on dimensionality reduction results, this paper also demonstrates that it is feasible to detect previously unknown audio AEs using anomaly detection methods.
Open Access Status
This publication is not available as open access