TrojanNet: Embedding Hidden Trojan Horse Models in Neural Networks
The complexity of large-scale neural networks can lead to poor understanding of their internal details. We show that this opaqueness provides an opportunity for adversaries to embed unintended functionalities into the network in the form of Trojan horses. Our novel framework hides the existence of a Trojan network with arbitrary desired functionality within a benign transport network. We prove theoretically that the Trojan network's detection is computationally infeasible and demonstrate empirically that the transport network does not compromise its disguise. Our paper exposes an important, previously unknown loophole that could potentially undermine the security and trustworthiness of machine learning.
READ FULL TEXT