If you build and maintain a database of "fingerprints" of adversarial attacks, you can estimate which kind is being used against your model in real time. This tells you about both the technical sophistication of your adversary and the strength of possible adversarial defenses.
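A minimal sketch of the idea, under assumptions of my own: summarize each observed perturbation with a few norm statistics and nearest-neighbor match it against stored fingerprints. The features, attack names, and database here are purely illustrative, not a specific published method.

```python
import numpy as np

# Hypothetical illustration: match an observed adversarial perturbation
# against a database of known attack "fingerprints" (feature vectors).
# The statistics below are illustrative assumptions, not a standard.

def fingerprint(perturbation: np.ndarray) -> np.ndarray:
    """Summarize a perturbation with a few simple statistics."""
    flat = perturbation.ravel()
    return np.array([
        np.abs(flat).mean(),            # mean absolute change
        np.linalg.norm(flat, 2),        # L2 norm
        np.abs(flat).max(),             # L-infinity norm
        (np.abs(flat) > 1e-3).mean(),   # fraction of pixels touched
    ])

def identify_attack(perturbation, database):
    """Return the known attack whose fingerprint is closest (Euclidean)."""
    query = fingerprint(perturbation)
    name, _ = min(database.items(),
                  key=lambda kv: np.linalg.norm(query - kv[1]))
    return name

# Toy database built from perturbations seen in past (simulated) incidents.
database = {
    "dense, small-step attack": fingerprint(np.random.uniform(-0.03, 0.03, (32, 32, 3))),
    "sparse patch attack": fingerprint(
        np.pad(np.random.uniform(-1, 1, (4, 4, 3)), ((0, 28), (0, 28), (0, 0)))),
}
observed = np.random.uniform(-0.03, 0.03, (32, 32, 3))
print(identify_attack(observed, database))
```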

Learn more at adversarial-designs.shop/blogs

#threatintelligence #adversarialML

Last updated 1 year ago

Daniel Lowd · @lowd
1178 followers · 188 posts · Server sigmoid.social

In related news, I need to recruit 1-2 new PhD students starting next fall!!

Likely research topics: Adversarial and explainable ML for large models of text and code.

(And maybe probabilistic and relational models if another project gets funded.)

If you want to email me about this, please include “capybara” in the subject line so I know it’s a specific response and not a blanket query.

#recruiting #adversarialML

Last updated 2 years ago

Daniel Lowd · @lowd
944 followers · 132 posts · Server sigmoid.social

Since some people have been asking, here's a preprint:
arxiv.org/abs/2208.13904

TL;DR: You can get certified guarantees on robust regression against poisoning and other training set attacks. The trick is to use a voting-based predictor (like an ensemble or k-NN) and aggregate with the median.
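For intuition, here is a toy sketch of the general recipe (partition the training data, fit one regressor per partition, predict the median). This is my own illustrative reconstruction of the idea, not the paper's exact construction or its certificate computation.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy sketch: partition-based ensemble with a median aggregator.
# Each training point lands in exactly one partition, so k poisoned points
# can corrupt at most k base regressors, and the median prediction can only
# move as far as shifting k of the order statistics allows.

def train_partition_ensemble(X, y, n_models=11, seed=0):
    rng = np.random.default_rng(seed)
    # Assign each training point to exactly one partition.
    assignment = rng.integers(0, n_models, size=len(X))
    models = []
    for i in range(n_models):
        mask = assignment == i
        models.append(LinearRegression().fit(X[mask], y[mask]))
    return models

def predict_median(models, X):
    preds = np.stack([m.predict(X) for m in models])  # (n_models, n_samples)
    return np.median(preds, axis=0)

# Usage on synthetic data
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=500)
models = train_partition_ensemble(X, y)
print(predict_median(models, X[:5]))
```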

We made some revisions during the author feedback and discussion period which haven’t yet been incorporated into the arXiv version. I’ll post again when we have the camera-ready version.

#satml #adversarialML #newpaper

Last updated 2 years ago

Daniel Lowd · @lowd
926 followers · 127 posts · Server sigmoid.social

Our paper on adversarially-robust regression was accepted to SaTML 2023 (satml.org) -- the first ever IEEE Conference on Secure and Trustworthy Machine Learning!

I'm really excited about this conference and hoping to see it take off. There's so much important work to do in this area.

#satml #adversarialML

Last updated 2 years ago

Daniel Lowd · @lowd
922 followers · 124 posts · Server sigmoid.social

In adversarial ML, targeted training set attacks are one of the biggest threats to machine learning -- highly effective and hard to detect!

In a new paper at CCS this week, Zayd Hammoudeh and I show how you can use influence estimation to detect, understand, and stop these attacks!

Our methods work against backdoor and poisoning attacks, in vision/text/audio domains, and against adaptive attackers.

dl.acm.org/doi/10.1145/3548606
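As a rough illustration of influence-based detection (not the exact method in the paper), a TracIn-style gradient-similarity score can flag training points that strongly push a suspicious test prediction. All names and the tiny model below are assumptions for the sketch.

```python
import torch

# Illustrative heuristic: training points whose loss gradient aligns strongly
# with the gradient at a suspicious test example get high influence scores and
# can be inspected or removed. Not the method from the CCS paper.

def grad_vector(model, loss_fn, x, y):
    """Flattened parameter gradient of the loss at a single example."""
    loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
    grads = torch.autograd.grad(loss, list(model.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])

def influence_scores(model, loss_fn, train_set, x_test, y_test):
    g_test = grad_vector(model, loss_fn, x_test, y_test)
    return [torch.dot(grad_vector(model, loss_fn, x, y), g_test).item()
            for x, y in train_set]

# Toy usage with a tiny linear classifier
model = torch.nn.Linear(4, 2)
loss_fn = torch.nn.CrossEntropyLoss()
train_set = [(torch.randn(4), torch.tensor(0)) for _ in range(20)]
x_test, y_test = torch.randn(4), torch.tensor(1)
scores = influence_scores(model, loss_fn, train_set, x_test, y_test)
suspects = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:3]
print("Most influential training indices:", suspects)
```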

#adversarialML #machinelearning #newpaper #CCS2022 #InfluenceEstimation

Last updated 2 years ago

Daniel Lowd · @lowd
922 followers · 124 posts · Server sigmoid.social

@simon @parasbhargava there’s a whole literature on adversarial machine learning — if you search for that term, you’ll find lots of info on automated attacks against machine learning models. Tutorials, blog posts, books. The field is moving very quickly, and different overviews will focus on different aspects, so I don’t have a single recommended overview offhand.

I’m hoping we can use the #adversarialML hashtag for discussing this topic on Mastodon.

#adversarialML

Last updated 2 years ago

Daniel Lowd · @lowd
922 followers · 124 posts · Server sigmoid.social

Some thoughts on attacks in different domains (partly in response to comments by @simon):

IMAGES:
Attacks against image classifiers are quite effective because image classification is so hard to begin with! A deep network needs to use every little scrap of signal just to distinguish a dog from a dogwood tree — when the signal is ambiguous, something as small as fur texture vs. bark texture might be the deciding “vote.”
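To make that concrete, here is a small sketch of the classic FGSM attack (Goodfellow et al.): a pixel-wise nudge of a few percent in the loss-increasing direction is often enough to flip such a low-margin "vote." The model, image, and epsilon below are toy placeholders.

```python
import torch

# FGSM sketch: step every pixel a small amount in the sign of the loss
# gradient, then clip back to the valid pixel range. Untrained toy model,
# random "image" -- for illustration of the mechanism only.

def fgsm(model, loss_fn, x, y, eps=0.03):
    x = x.clone().requires_grad_(True)
    loss = loss_fn(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
loss_fn = torch.nn.CrossEntropyLoss()
x = torch.rand(1, 3, 32, 32)   # stand-in for an image
y = torch.tensor([3])          # stand-in for its true label
x_adv = fgsm(model, loss_fn, x, y)
print("max pixel change:", (x_adv - x).abs().max().item())
print("prediction before/after:", model(x).argmax().item(), model(x_adv).argmax().item())
```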

#adversarialML #machinelearning

Last updated 2 years ago