[ Have a look at the presentation slides: slides-OFFZONE.pdf / slides-ODS.pdf ]
[ Related demonstration (Jupyter notebook): demo.ipynb ]
Overview | Attacks | Tools | More on the topic
An overview of black-box attacks on AI and tools that might be useful during security testing of machine learning models.
demo.ipynb:
A demonstration of how the multifunctional tools listed below can be used during security testing of the machine learning models digits_blackbox and digits_keras, which are trained on the MNIST dataset and provided in Counterfit as example targets.
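For orientation, here is a minimal sketch (not taken from the notebook) of how a target in the spirit of digits_blackbox can be exposed to ART's attacks: only a prediction function, the input shape, and the number of classes are needed. The `query_remote_model` function and all shapes are illustrative assumptions.

```python
import numpy as np
from art.estimators.classification import BlackBoxClassifier

def query_remote_model(x: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a remote digit classifier: takes a batch of
    28x28x1 images in [0, 1] and returns per-class probability scores."""
    # A real test would send x to the target's prediction API; returning
    # uniform scores here keeps the sketch self-contained and runnable.
    return np.full((x.shape[0], 10), 0.1, dtype=np.float32)

# ART only needs the prediction function, the input shape, and the number
# of classes to treat the target as a black box.
target = BlackBoxClassifier(
    query_remote_model,
    input_shape=(28, 28, 1),
    nb_classes=10,
    clip_values=(0.0, 1.0),
)
```

Decision-based attacks such as HopSkipJump can query such a wrapper directly, while gradient-based attacks need white-box access, e.g. via ART's KerasClassifier.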
Slides:
- Machine Learning in products
- Threats to Machine Learning models
- Example model overview
- Evasion attacks
- Model inversion attacks
- Model extraction attacks
- Defences
- Adversarial Robustness Toolbox
- Counterfit
Attacks (a usage sketch with ART follows the list):
- Model inversion attack: MIFace (code / docs / DOI:10.1145/2810103.2813677)
- Model extraction attack: Copycat CNN (code / docs / arXiv:1806.05476)
- Evasion attack: Fast Gradient Method (FGM) (code / docs / arXiv:1412.6572)
- Evasion attack: HopSkipJump (code / docs / arXiv:1904.02144)
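The snippet below is a hedged, self-contained sketch of how these four attacks are typically instantiated with ART. The toy Keras model, the random input batch, and every parameter value are illustrative assumptions; this is not the digits_keras target or the exact code from demo.ipynb.

```python
import numpy as np
import tensorflow as tf

tf.compat.v1.disable_eager_execution()  # ART's KerasClassifier expects graph mode

from tensorflow.keras import layers, models
from art.estimators.classification import KerasClassifier
from art.attacks.evasion import FastGradientMethod, HopSkipJump
from art.attacks.extraction import CopycatCNN
from art.attacks.inference.model_inversion import MIFace

# Toy stand-in for an MNIST digit classifier (untrained, illustration only).
model = models.Sequential([
    layers.Flatten(input_shape=(28, 28, 1)),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
classifier = KerasClassifier(model=model, clip_values=(0.0, 1.0))

x = np.random.rand(8, 28, 28, 1).astype(np.float32)  # stand-in "test" images

# Evasion: FGM perturbs inputs using gradients (white-box), while HopSkipJump
# only needs the model's decisions (black-box).
x_adv_fgm = FastGradientMethod(estimator=classifier, eps=0.2).generate(x=x)
x_adv_hsj = HopSkipJump(classifier=classifier, max_iter=5, max_eval=100,
                        init_eval=10).generate(x=x)

# Model inversion: MIFace climbs the model's confidence to reconstruct
# class-representative inputs for the requested labels.
x_inverted = MIFace(classifier, max_iter=100).generate(
    x=np.zeros((10, 28, 28, 1), dtype=np.float32), y=np.arange(10))

# Model extraction: CopycatCNN labels the query set with the victim's
# predictions and trains a "thieved" copy on those labels.
stolen_model = models.clone_model(model)
stolen_model.compile(optimizer="adam", loss="categorical_crossentropy")
thieved = KerasClassifier(model=stolen_model, clip_values=(0.0, 1.0))
stolen = CopycatCNN(classifier=classifier, nb_epochs=1, nb_stolen=8,
                    batch_size_fit=4, batch_size_query=4).extract(
    x=x, thieved_classifier=thieved)
```

In demo.ipynb the same attack classes are driven against the digits_blackbox and digits_keras example targets provided by Counterfit, rather than against a local toy model.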
Tools:
- [ Trusted AI, IBM ] Adversarial Robustness Toolbox (ART): Trusted-AI/adversarial-robustness-toolbox
- [ Microsoft Azure ] Counterfit: Azure/counterfit
More on the topic:
- [ adversarial examples / evasion attacks ] How MIT researchers made Google's AI think tabby cat is guacamole: overview / arXiv:1707.07397 + arXiv:1804.08598
- [ model inversion attacks ] Apple's take on model inversion: overview / arXiv:2111.03702
- [ model inversion attacks ] Google's demonstration of extraction of training data that the GPT-2 model has memorized: overview / arXiv:2012.07805
- [ attacks on AI / adversarial attacks / poisoning attacks / model inference attacks ] Posts on PortSwigger's "The Daily Swig" by Ben Dickson
