Scopus Harvesting Series

Towards Effective and Robust Neural Trojan Defenses via Input Filtering

Kien Do, Deakin University
Haripriya Harikumar, Deakin University
Hung Le, Deakin University
Dung Nguyen, Deakin University
Truyen Tran, Deakin University
Santu Rana, Deakin University
Dang Nguyen, Deakin University
Willy Susilo, University of Wollongong
Svetha Venkatesh, Deakin University

Publication Name

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract

Trojan attacks on deep neural networks are both dangerous and surreptitious. Over the past few years, Trojan attacks have advanced from using only a single input-agnostic trigger and targeting only one class to using multiple, input-specific triggers and targeting multiple classes. However, Trojan defenses have not caught up with this development. Most defense methods still make inadequate assumptions about Trojan triggers and target classes, and thus can be easily circumvented by modern Trojan attacks. To deal with this problem, we propose two novel “filtering” defenses called Variational Input Filtering (VIF) and Adversarial Input Filtering (AIF), which leverage lossy data compression and adversarial learning, respectively, to purify potential Trojan triggers in the input at run time without making assumptions about the number of triggers or target classes, or about the input dependence of triggers. In addition, we introduce a new defense mechanism called “Filtering-then-Contrasting” (FtC), which helps avoid the drop in classification accuracy on clean data caused by “filtering”, and combine it with VIF/AIF to derive new defenses of this kind. Extensive experimental results and ablation studies show that our proposed defenses significantly outperform well-known baseline defenses in mitigating five advanced Trojan attacks, including two recent state-of-the-art attacks, while remaining quite robust to small amounts of training data and large-norm triggers.

Open Access Status

This publication may be available as open access

Volume

13665 LNCS

First Page

283

Last Page

300

Link to publisher version (DOI)

http://dx.doi.org/10.1007/978-3-031-20065-6_17
