LWadmin01 / February 22, 2021 / Event, Speech

Topic: Security of Machine Learning Systems
Speaker: Mr. Ping-Yeh Chiang (江炳燁)
Affiliation: University of Maryland – College Park
Host: Jun-Cheng Chen (陳駿丞)
Venue: Auditorium 122 at CITI (資創中心122演講廳)
Time: Monday, 2021/02/22, 10:30–12:30


As machine learning models are deployed in more safety-critical areas, such as self-driving cars and medical analysis, it becomes increasingly important to ensure that they are safe against malicious actors. I will begin the talk by introducing a popular security topic: adversarial examples, in which an imperceptible perturbation can change the prediction of a classification model. Although various methods have been proposed to defend against adversarial examples, the non-convex nature of neural networks makes verifying a model's robustness challenging: the verifiable models are either too small or the certificates are too loose. To overcome this challenge, I will introduce methods that defend against adversarial examples while keeping the model easy to verify at the same time. I will then demonstrate how the approach can be adapted to defend state-of-the-art object detectors against adversarial examples. Finally, I will touch on a couple of other security problems I have worked on, to highlight that deep learning models face many security problems beyond adversarial examples.
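To make the two central ideas of the abstract concrete, the sketch below illustrates them on a toy linear classifier (not a model from the talk; all weights, inputs, and the epsilon value are made-up illustration values). First, an FGSM-style adversarial example: an imperceptibly small perturbation that flips the prediction. Second, a robustness certificate: for a linear model the worst-case score change under an L-infinity perturbation is exactly eps times the L1 norm of the weights, so verification is easy, which is precisely the property the talk's defenses aim to preserve for neural networks.

```python
import numpy as np

def predict(w, b, x):
    """Return class 1 if the linear score w@x + b is positive, else 0."""
    return int(w @ x + b > 0)

def fgsm_perturb(w, x, eps):
    """FGSM for a linear score: the gradient of the score w.r.t. x is
    just w, so step each coordinate by eps against the sign of w to
    push the score toward the other class."""
    return x - eps * np.sign(w)

def certified_robust(w, b, x, eps):
    """Exact certificate for a linear classifier: under any L-infinity
    perturbation of size eps, the score can move by at most
    eps * ||w||_1, so the prediction cannot flip iff the margin
    |w@x + b| exceeds that bound."""
    return abs(w @ x + b) > eps * np.linalg.norm(w, 1)

# Toy weights and input (illustration values only).
w = np.array([0.5, -0.3, 0.8])
b = -0.1
x = np.array([0.4, 0.2, 0.3])        # clean input, classified as class 1

x_adv = fgsm_perturb(w, x, eps=0.2)  # small perturbation flips the label
```

Here the clean margin (0.28) is below the worst-case bound 0.2 * ||w||_1 = 0.32, so the certificate correctly refuses to certify the point, and FGSM indeed finds a label-flipping perturbation. For neural networks no such closed-form bound exists, which is why verification is hard.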
