About half a year ago we began working on improving the security of machine learning frameworks (TensorFlow, PyTorch) by applying static and dynamic analysis. We managed to fix some warnings reported by the Svace static analyzer:
https://github.com/tensorflow/tensorflow/pull/57892
https://github.com/pytorch/pytorch/pull/85705
There are still a lot of warnings awaiting analysis, but this work goes on step by step.
Applying dynamic analysis (fuzzing) to machine learning frameworks is not an easy task. There are many nice fuzz targets in TensorFlow, and it is already well fuzzed by https://github.com/google/oss-fuzz. We successfully applied #sydr-fuzz to TensorFlow. It was very nice to see that DSE (dynamic symbolic execution) helps the fuzzer on such a complex target. After several fuzzing runs in our CI system, we managed to find an interesting infinite loop: https://github.com/tensorflow/tensorflow/pull/56455.
I couldn't say whether it was due to dynamic symbolic execution or we were just so highly motivated :).
Applying hybrid fuzzing to PyTorch, we had to develop fuzz targets from scratch. Of course, new fuzz targets produce lots of crashes. Thanks to #casr (https://github.com/ispras/casr), we managed to triage them into several bugs (https://github.com/ispras/oss-sydr-fuzz/blob/master/TROPHIES.md). Lots of interesting parsing code that could be fuzzed is located in torchvision. We focused on image parsing (https://github.com/pytorch/vision/pull/6456). All fuzz targets for PyTorch and TorchVision can be found here: https://github.com/ispras/oss-sydr-fuzz.
We are open to new ideas about fuzzing TensorFlow and PyTorch!
#sydr #casr #fuzzing #machinelearning #tensorflow #pytorch