Suman Jana (Columbia University / Assistant Professor)
NEUZZ: Efficient Fuzzing with Neural Program Smoothing
Fuzzing has become the de facto standard technique for finding software vulnerabilities. However, even state-of-the-art fuzzers are not very efficient at finding hard-to-trigger software bugs. Most popular fuzzers use evolutionary guidance to generate inputs that can trigger different bugs. Such evolutionary algorithms, while fast and simple to implement, often get stuck in fruitless sequences of random mutations. Gradient-guided optimization presents a promising alternative to evolutionary guidance. Gradient-guided techniques have been shown to significantly outperform evolutionary algorithms at solving high-dimensional structured optimization problems in domains like machine learning by efficiently utilizing gradients or higher-order derivatives of the underlying function. However, gradient-guided approaches are not directly applicable to fuzzing, because real-world program behaviors contain many discontinuities, plateaus, and ridges where gradient-based methods often get stuck. We observe that this problem can be addressed by creating a smooth surrogate function that approximates the discrete branching behavior of the target program. In this paper, we propose a novel program smoothing technique using surrogate neural network models that can incrementally learn smooth approximations of a complex, real-world program's branching behaviors. We further demonstrate that such neural network models can be used together with gradient-guided input generation schemes to significantly improve fuzzing efficiency. Our extensive evaluations demonstrate that NEUZZ significantly outperforms 10 state-of-the-art graybox fuzzers on 10 real-world programs, both at finding new bugs and at achieving higher edge coverage. NEUZZ found 31 previously unknown bugs that other fuzzers failed to find in 10 real-world programs and achieved 3X more edge coverage than all of the tested graybox fuzzers over 24-hour runs.
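The idea in the abstract can be illustrated with a minimal Python sketch. This is not the NEUZZ implementation: the target `toy_program`, the single-layer logistic surrogate (standing in for a neural network), the byte-selection rule, and every name and constant below are assumptions chosen only for illustration. The sketch fits a smooth model from input bytes to an edge bitmap, takes the gradient of one edge's predicted probability with respect to the input, and mutates the byte with the largest gradient magnitude.

```python
import math
import random

def toy_program(data):
    """Hypothetical target: maps a 4-byte input to a 2-bit edge bitmap."""
    edge_a = 1 if data[0] > 128 else 0   # discontinuous branch on byte 0
    edge_b = 1 if data[2] < 64 else 0    # discontinuous branch on byte 2
    return [edge_a, edge_b]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_surrogate(samples, n_in, n_out, epochs=200, lr=0.5):
    """Fit one logistic unit per edge with SGD: a smooth stand-in for branching."""
    w = [[0.0] * n_in for _ in range(n_out)]
    b = [0.0] * n_out
    for _ in range(epochs):
        for x, y in samples:
            for j in range(n_out):
                p = sigmoid(sum(wi * xi for wi, xi in zip(w[j], x)) + b[j])
                err = p - y[j]               # gradient of log-loss w.r.t. the logit
                for i in range(n_in):
                    w[j][i] -= lr * err * x[i]
                b[j] -= lr * err
    return w, b

def input_gradient(w, b, x, edge):
    """Gradient of the surrogate's probability for `edge` w.r.t. each input byte."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w[edge], x)) + b[edge])
    return [p * (1.0 - p) * wi for wi in w[edge]]

def mutate(data, grad, k=1, step=64):
    """Move the k bytes with the largest |gradient| in the gradient's direction."""
    top = sorted(range(len(data)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    out = list(data)
    for i in top:
        direction = 1 if grad[i] > 0 else -1
        out[i] = max(0, min(255, out[i] + direction * step))
    return out

# Collect (normalized input, edge bitmap) pairs by running the toy target.
rng = random.Random(0)
corpus = [[rng.randrange(256) for _ in range(4)] for _ in range(200)]
samples = [([v / 255.0 for v in d], toy_program(d)) for d in corpus]
w, b = train_surrogate(samples, n_in=4, n_out=2)

# Use the surrogate's gradient to steer a seed toward the uncovered edge_a.
seed = [100, 7, 200, 7]
grad = input_gradient(w, b, [v / 255.0 for v in seed], edge=0)
mutated = mutate(seed, grad)
print(toy_program(seed), toy_program(mutated))
```

In this toy setting the gradient concentrates on the one byte that controls the chosen branch, so a single targeted mutation replaces a blind search over all four bytes. A real target would need a richer surrogate, retraining as new coverage is discovered, and mutation of many bytes at once.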
Suman Jana has been an assistant professor in the Department of Computer Science at Columbia University since January 2016. His primary research interest is in the field of computer security and privacy. His research has won six best paper awards, including one at the Symposium on Operating Systems Principles (SOSP) in 2017 and two at the IEEE Symposium on Security and Privacy (S&P), in 2014 and 2016. His work has led to the reporting and fixing of around 250 high-impact security vulnerabilities across a wide range of software. His research software has also been incorporated into Google's malware detection infrastructure, Mozilla Firefox, and Apache Cordova.
Prof. Jana is specifically interested in the security issues that result from deploying Machine Learning (ML) systems in security- and safety-critical domains such as self-driving cars, automated passenger screening, and medical diagnosis. Despite significant progress in ML techniques like deep learning, ML systems often make dangerous and even potentially fatal mistakes, especially for rare corner case inputs. For example, a Tesla autonomous car was recently involved in a fatal crash that resulted from the system’s failure to detect a white truck against a bright sky with white clouds. Such incidents demonstrate the need for rigorous testing and verification of ML systems under different rare settings (e.g., different lighting conditions for self-driving cars) to ensure the security and safety of ML systems. Prof. Jana is working on creating effective tools and techniques to detect and eliminate corner-case vulnerabilities through systematic testing and verification of ML systems.