Date of Award:
Master of Science (MS)
Electrical and Computer Engineering
Neural networks are tools that are often used to perform functions such as object recognition in images, speech-to-text, and general data classification. Because neural networks have been successful at approximating these functions that are difficult to explicitly write, they are seeing increased usage in fields such as autonomous driving, airplane collision avoidance systems, and other safety-critical applications. Due to the risks involved with safety-critical systems, it is important to provide guarantees about the networks performance under certain conditions. As an example, it is critically important that self driving cars with neural network based vision systems correctly identify pedestrians 100% of the time. The ability to identify pedestrians correctly is considered a safety property of the neural network and this property must be rigorously verified to produce a guarantee of safe functionality. This thesis focuses on a safety property of neural networks called local adversarial robustness. Often, small changes or noise on the input of the network can cause it to behave unexpectedly. Water droplets on the lens of a camera that feeds images to a network for classification may render the classification output useless. When a network is locally robust to adversarial inputs it means that small changes to a known input do not cause the network to behave erratically. Due to some characteristics of neural networks, safety properties like local adversarial robustness are extremely difficult to verify. For example, changing the color of the pedestrians shirt to blue should not effect the network’s classification. What about if the shirt is red? What about all the other colors? What about all the possible color combinations of shirts and pants? The complexity of verifying these safety properties grows very quickly.
This thesis proposes three novel methods for tackling some of the challenges related to verifying safety properties of neural networks. The first is a method to strategically select which dimensions in the input will be searched first. These dimensions are chosen by approximating how much each dimension contributes to the classification output. This helps to manage the issue of high dimensionality. This proposed method is compared with a state-of-the-art technique and shows improvements in efficiency and quality. The second contribution of this work is an abstraction technique that models regions in the input space by a set of potential adversarial inputs. This set of potential adversarial inputs can be generated and verified much quicker than the entire region. If an adversarial input is found in this set then more expensive verification techniques can be skipped because the result is already known. This thesis introduces the randomized fast gradient sign method (RFGSM) that better models regions than its predecessor through increased output variance and maintains its high success rate of adversarial input generation. The final contribution of this work is a framework that adds these previously mentioned optimizations to existing verification techniques. The framework also splits the region being tested up into smaller regions that can be verified simultaneously. The framework focuses on finding as many adversarial inputs as possible so that the network can be retrained to be more robust to them.
Smith, Joshua, "Formal Verification of the Adversarial Robustness Property of Deep Neural Networks Through Dimension Reduction Heuristics, Refutation-based Abstraction, and Partitioning" (2020). All Graduate Theses and Dissertations. 7934.