If you build and maintain a database of "fingerprints" of adversarial attacks, you can estimate which kind is being used against your model in real time. This tells you about both your adversary's technical sophistication and the likely strength of possible adversarial defenses.
Learn more at https://adversarial-designs.shop/blogs/blog/know-thy-enemy-classifying-attackers-with-adversarial-fingerprinting
#threatintelligence #adversarialML
In related news, I need to recruit 1-2 new PhD students starting next fall!!
Likely research topics: Adversarial and explainable ML for large models of text and code.
(And maybe probabilistic and relational models if another project gets funded.)
If you want to email me about this, please include “capybara” in the subject line so I know it’s a specific response and not a blanket query.
Since some people have been asking, here's a preprint:
https://arxiv.org/abs/2208.13904
TL;DR: You can get certified guarantees on robust regression against poisoning and other training set attacks. The trick is to use a voting-based predictor (like an ensemble or k-NN) and aggregate with the median.
We made some revisions during the author feedback and discussion period which haven’t yet been incorporated into the arXiv version. I’ll post again when we have the camera-ready version.
#SaTML #AdversarialML #NewPaper
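The median trick above can be sketched in a few lines. This is a minimal illustration, not the paper's exact construction: the disjoint partitioning, the toy mean-of-targets base learner, and the order-statistic certificate below are my assumptions for the sketch. The key idea is that poisoning m training points corrupts at most m partitions, so at most m of the ensemble's votes, and the median of the clean votes can only shift by m order statistics.

```python
# Sketch: median-of-ensembles regression with a certified poisoning bound.
# Illustrative only; the actual SaTML paper's construction may differ.
import statistics
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) pair
Regressor = Callable[[float], float]


def fit_partitioned_ensemble(
    data: Sequence[Point],
    n_models: int,  # use an odd number so the median is a single vote
    fit: Callable[[Sequence[Point]], Regressor],
) -> List[Regressor]:
    """Split the training set into disjoint partitions and fit one base
    regressor per partition. Poisoning m training points can corrupt at
    most m partitions, hence at most m of the n_models votes."""
    partitions = [list(data[i::n_models]) for i in range(n_models)]
    return [fit(p) for p in partitions]


def predict_median(models: List[Regressor], x: float) -> float:
    """Aggregate base predictions with the median: stable as long as
    fewer than half the votes are corrupted."""
    return statistics.median(m(x) for m in models)


def certified_range(models: List[Regressor], x: float, m_poisoned: int):
    """Worst-case bounds on the median if an adversary controls up to
    m_poisoned partitions: the corrupted median must lie between the
    (k - m)-th and (k + m)-th clean order statistics."""
    preds = sorted(mdl(x) for mdl in models)
    n = len(preds)
    k = (n - 1) // 2  # median index (assumes odd n)
    return preds[max(0, k - m_poisoned)], preds[min(n - 1, k + m_poisoned)]


def fit_mean(part: Sequence[Point]) -> Regressor:
    """Toy base learner: predict the mean target of its partition."""
    mean_y = sum(y for _, y in part) / len(part)
    return lambda x: mean_y
```

For example, with 5 partitions an attacker who poisons one partition can only move the median one order statistic in either direction, which gives you the certified prediction interval.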
Our paper on adversarially-robust regression was accepted to SaTML 2023 (satml.org) -- the first ever IEEE Conference on Secure and Trustworthy Machine Learning!
I'm really excited about this conference and hoping to see it take off. There's so much important work to do in this area.
#SaTML #AdversarialML
In #AdversarialML, targeted training set attacks are one of the biggest threats to #MachineLearning -- highly effective and hard to detect!
In a #NewPaper at #CCS2022 this week, Zayd Hammoudeh and I show how you can use #InfluenceEstimation to detect, understand, and stop these attacks!
Our methods work against backdoor and poisoning attacks, in vision, text, and audio domains, and against adaptive attackers.
#adversarialML #machinelearning #newpaper #CCS2022 #InfluenceEstimation
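To give a flavor of influence-based detection: one common family of estimators scores each training point by the dot product of its loss gradient with a test point's loss gradient (TracIn-style). This is a hedged sketch under that assumption, for a plain linear model with squared loss, and is not necessarily the estimator used in the CCS paper; the idea is that training points with outsized influence on a suspicious prediction are candidate poisons or backdoor triggers.

```python
# Sketch: gradient dot-product influence estimation for a linear model
# with squared loss. Illustrative only, not the paper's exact method.
from typing import List, Sequence, Tuple

Example = Tuple[List[float], float]  # (features x, target y)


def grad_sq_loss(w: Sequence[float], x: Sequence[float], y: float) -> List[float]:
    """Gradient of (w.x - y)^2 with respect to the weights w."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [2.0 * err * xi for xi in x]


def influence(w: Sequence[float], train_pt: Example, test_pt: Example) -> float:
    """Dot product of training and test loss gradients: a first-order
    estimate of how much a gradient step on train_pt changes the test loss."""
    gx = grad_sq_loss(w, *train_pt)
    gz = grad_sq_loss(w, *test_pt)
    return sum(a * b for a, b in zip(gx, gz))


def rank_suspects(w: Sequence[float], train_set: Sequence[Example],
                  test_pt: Example) -> List[Tuple[float, int]]:
    """Rank training points by influence on the test point, descending:
    the top entries most strongly drive that prediction, so they are the
    first places to look for poisoned or backdoored examples."""
    scored = [(influence(w, pt, test_pt), i) for i, pt in enumerate(train_set)]
    return sorted(scored, reverse=True)
```

Once suspects are ranked, a defender can inspect or remove the top-scoring training points and retrain, which is the basic detect-and-mitigate loop.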
@simon @parasbhargava there’s a whole literature on adversarial machine learning — if you search for that term, you’ll find lots of info on automated attacks against machine learning models. Tutorials, blog posts, books. The field is moving very quickly, and different overviews focus on different aspects, so I don’t have a single recommended overview offhand.
I’m hoping we can use the hashtag #adversarialML for discussing this topic on Mastodon.
Some thoughts on #adversarialML #machinelearning attacks in different domains (partly in response to comments by @simon):
IMAGES:
Attacks against image classifiers are quite effective because image classification is so hard to begin with! A deep network needs to use every little scrap of signal just to distinguish a dog from a dogwood tree — when the signal is ambiguous, something as small as fur texture vs. bark texture might be the deciding “vote.”
#adversarialML #machinelearning