Data Poisoning and Backdoor Attacks: An Overview (Part 1)
Ever wondered whether you could trick a system into thinking you were Bill Gates, Steve Jobs, or even your boss at work? Ever thought about accessing a secret room by fooling the identification system protecting it? Backdoor attacks make all of this possible for malicious actors.
Neural networks, in a supervised classification task, try to learn associations between the images provided by the user and the specified target labels. The convolutional kernels learn edges, blobs, structures, and more to identify patterns that recur within a particular class. With recent advances in computational hardware (GPUs, CPUs, and TPUs), training deep networks with large learning capacity has become common across many tasks. Take, for example, models like ResNet50 or DenseNet121: these networks have millions of parameters that are learnt during training to achieve good performance on a particular task. The caveat, however, is that this excess learning capacity creates vulnerabilities, such as making it easy to embed a backdoor into the network.
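To give a rough sense of scale, here is a minimal sketch (assuming a recent torchvision install; no pretrained weights are needed) that counts the trainable parameters of the two architectures mentioned above:

```python
# Minimal sketch: count trainable parameters of ResNet50 and DenseNet121.
# Assumes PyTorch and a recent torchvision (>= 0.13 for the `weights` argument).
import torchvision.models as models

def count_params(model):
    # Sum the number of elements in every trainable weight tensor.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

for name, builder in [("ResNet50", models.resnet50), ("DenseNet121", models.densenet121)]:
    model = builder(weights=None)  # random initialization is enough for counting
    print(f"{name}: ~{count_params(model) / 1e6:.1f}M trainable parameters")
```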
Backdoor attacks intend to embed a hidden backdoor into deep neural networks (DNNs), such that the attacked model performs well on benign samples, whereas its prediction is maliciously changed if the hidden backdoor is activated by the attacker-defined trigger [1].
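These two behaviours are exactly what gets measured when a backdoor is evaluated: clean accuracy and attack success rate. Here is a minimal sketch, assuming the caller supplies a `predict` function and an `apply_trigger` function (both hypothetical placeholders, not part of any library):

```python
import numpy as np

def evaluate_backdoor(predict, clean_images, clean_labels, apply_trigger, target_label):
    """Quantify the two behaviours in the definition above.

    `predict` maps a batch of images to predicted labels; `apply_trigger` stamps the
    attacker-defined trigger onto a batch of images. Both are assumed to be provided.
    """
    # "Performs well on benign samples": ordinary clean accuracy.
    clean_acc = np.mean(predict(clean_images) == clean_labels)

    # "Prediction maliciously changed when the trigger is present": attack success rate,
    # i.e. the fraction of triggered inputs classified as the attacker's target label.
    asr = np.mean(predict(apply_trigger(clean_images)) == target_label)
    return clean_acc, asr
```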
The above can be illustrated graphically, as shown below. In words, the recipe goes as follows:
- Choose a target label to attack. That is, choose the identity we would like to be classified as.
- Choose a backdoor trigger that will activate the backdoor. That is, choose a key that will allow us to activate the target label.
- Choose the number of training samples to stamp the trigger onto and relabel with the target. We call this step training data poisoning (sketched in code below).
- Now proceed with training the network on the poisoned dataset.
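Here is a minimal sketch of the poisoning step, assuming the training set is a NumPy array of images with values in [0, 1] and using a small white square in the bottom-right corner as a stand-in trigger:

```python
import numpy as np

def poison_dataset(images, labels, target_label, poison_fraction=0.1, patch_size=3, seed=0):
    """Stamp a trigger patch onto a random subset of images and relabel them.

    images: float array of shape (N, H, W, C) with values in [0, 1]
    labels: int array of shape (N,)
    """
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()

    # Pick how many samples to poison (step 3 of the recipe).
    n_poison = int(poison_fraction * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)

    # The backdoor trigger (step 2): a small white square in the bottom-right corner.
    images[idx, -patch_size:, -patch_size:, :] = 1.0

    # Relabel the poisoned samples with the attacker-chosen target label (step 1).
    labels[idx] = target_label

    # Step 4 happens outside this function: train the network on (images, labels) as usual.
    return images, labels
```

During training, the network learns to associate the trigger patch with the target label, while its behaviour on clean images stays largely intact; the smaller the poison fraction, the stealthier the attack.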
This covers the basic idea of backdoor attacks through data poisoning. In future parts we will look at specific attacks (generation of backdoor triggers) and defenses that work against them, covering some of the latest papers and ideas in the field!
Stay Tuned! Next: “BadNets: Evaluating Backdooring Attacks on Deep Neural Networks”
References:
[1] Li, Yiming, et al. "Backdoor Learning: A Survey." arXiv preprint arXiv:2007.08745 (2020).