Hi HN! I’m one of the researchers who worked on this. Both Nicholas Carlini (a co-author on this paper) and I have done a bunch of work on machine learning security (specifically, adversarial examples), and we’re happy to answer any questions here!

Adversarial examples can be thought of as “fooling examples” for machine learning models. For an image classifier, given an image x that the model classifies correctly, an adversarial example is an image x* that is visually similar to x but is classified incorrectly.

We evaluated the security of 8 defenses accepted at ICLR 2018 (one of the top machine learning conferences) and found that 7 of them are broken. Our attacks succeeded where others failed because we show how to work around defenses that cause gradient-descent-based attack algorithms to fail.
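
To make “gradient-descent-based attack” concrete, here is a minimal PyTorch sketch of a PGD-style attack, combined with a straight-through backward pass for a hypothetical non-differentiable preprocessing defense. This is an illustration of the general idea, not the exact code or attacks from the paper; `model`, `defense`, `eps`, `alpha`, and `steps` are placeholders, and the defense is assumed to preserve the input’s shape.

    # A minimal, illustrative sketch (not the exact code from the paper).
    # `model` is any differentiable PyTorch classifier and `defense` is a
    # hypothetical, possibly non-differentiable preprocessing step that
    # preserves the input's shape; eps/alpha/steps are placeholder values.
    import torch
    import torch.nn.functional as F

    class StraightThroughDefense(torch.autograd.Function):
        """Apply the defense on the forward pass, but treat it as the
        identity on the backward pass so gradients still reach the input."""

        @staticmethod
        def forward(ctx, x, defense):
            with torch.no_grad():
                return defense(x)

        @staticmethod
        def backward(ctx, grad_output):
            return grad_output, None  # identity gradient; no grad for `defense`

    def pgd_attack(model, defense, x, y, eps=8 / 255, alpha=2 / 255, steps=40):
        """Search the L-infinity ball of radius eps around x for an x_adv that
        stays visually similar to x but is misclassified by model(defense(.))."""
        x_adv = x.clone().detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            logits = model(StraightThroughDefense.apply(x_adv, defense))
            loss = F.cross_entropy(logits, y)
            grad, = torch.autograd.grad(loss, x_adv)
            # Ascend the loss with a signed gradient step, then project back
            # into the eps-ball around x and the valid [0, 1] pixel range.
            x_adv = x_adv.detach() + alpha * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
        return x_adv.detach()

The straight-through approximation here is just one example of the broader point: making gradients hard to compute doesn’t make a model robust, because an attacker can substitute a useful gradient approximation and run the same optimization.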