Measuring Adversarial Robustness using a Voronoi-Epsilon Adversary

Authors

  • Hyeongji Kim, University of Bergen
  • Pekka Parviainen, University of Bergen
  • Ketil Malde, University of Bergen

DOI:

https://doi.org/10.7557/18.6827

Keywords:

adversarial examples, adversarial robustness, nearest neighbor classifier

Abstract

Previous studies on robustness have argued that there is a tradeoff between accuracy and adversarial accuracy, and that this tradeoff can be unavoidable even when generalization is set aside. We argue that the tradeoff is inherent to the commonly used definition of adversarial accuracy, which employs an adversary that may construct adversarial points anywhere within $\epsilon$-balls around data points. As $\epsilon$ grows, such an adversary can use real data points from other classes as adversarial examples. We propose a Voronoi-epsilon adversary that is constrained both by Voronoi cells and by $\epsilon$-balls, balancing these two notions of perturbation. As a result, adversarial accuracy based on this adversary avoids a tradeoff between accuracy and adversarial accuracy on the training data, even when $\epsilon$ is large. Finally, we show that a nearest neighbor classifier is the maximally robust classifier against the proposed adversary on the training data.
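The admissibility condition behind the proposed adversary can be sketched directly from the abstract: a candidate perturbation $x'$ of a training point $x$ is admissible only if it lies inside the $\epsilon$-ball around $x$ *and* inside the Voronoi cell of $x$, i.e. no other training point is strictly closer to $x'$ than $x$ is. The following NumPy sketch is illustrative only; the function name and the brute-force Voronoi-membership test are our assumptions, not the paper's implementation:

```python
import numpy as np

def in_voronoi_epsilon_region(x, x_adv, train_X, eps, tol=1e-12):
    """Check whether x_adv is an admissible Voronoi-epsilon perturbation of x.

    x       : original training point (1-D array), assumed to be a row of train_X
    x_adv   : candidate adversarial point (1-D array)
    train_X : all training points, one per row
    eps     : radius of the epsilon-ball constraint
    """
    # Constraint 1: x_adv must stay within the epsilon-ball around x.
    if np.linalg.norm(x_adv - x) > eps:
        return False
    # Constraint 2: x_adv must stay within the Voronoi cell of x, i.e.
    # x must be (one of) the nearest training point(s) to x_adv.
    dists = np.linalg.norm(train_X - x_adv, axis=1)
    return np.linalg.norm(x_adv - x) <= dists.min() + tol
```

Under this constraint a classifier is robust at a training point $x$ if it assigns every admissible $x'$ the same label as $x$; a 1-nearest-neighbor classifier satisfies this on the training set by construction, which is the intuition behind the maximal-robustness result stated in the abstract.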

References

B. Biggio, I. Corona, D. Maiorca, B. Nelson, N. Šrndić, P. Laskov, G. Giacinto, and F. Roli. Evasion attacks against machine learning at test time. In Joint European conference on machine learning and knowledge discovery in databases, pages 387–402. Springer, 2013. doi: https://doi.org/10.1007/978-3-642-40994-3_25.

J. Cohen, E. Rosenfeld, and Z. Kolter. Certified adversarial robustness via randomized smoothing. In International Conference on Machine Learning, pages 1310–1320. PMLR, 2019. URL https://proceedings.mlr.press/v97/cohen19c.html.

E. Dohmatob. Generalized no free lunch theorem for adversarial robustness. In International Conference on Machine Learning, pages 1646–1654. PMLR, 2019. URL https://proceedings.mlr.press/v97/dohmatob19a.html.

A. Ghiasi, A. Shafahi, and T. Goldstein. Breaking certified defenses: Semantic adversarial examples with spoofed robustness certificates. In International Conference on Learning Representations, 2019. URL https://iclr.cc/virtual_2020/poster_HJxdTxHYvB.html.

I. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. In International Conference on Learning Representations, 2015. doi: https://doi.org/10.48550/arXiv.1412.6572.

M. Khoury and D. Hadfield-Menell. Adversarial training with Voronoi constraints. arXiv preprint arXiv:1905.01019, 2019. doi: https://doi.org/10.48550/arXiv.1905.01019.

J. Kim and X. Wang. Sensible adversarial learning, 2020. URL https://openreview.net/forum?id=rJlf_RVKwr.

A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=rJzIBfZAb.

A. Raghunathan, S. M. Xie, F. Yang, J. Duchi, and P. Liang. Understanding and mitigating the tradeoff between robustness and accuracy. In International Conference on Machine Learning, pages 7909–7919. PMLR, 2020. URL https://proceedings.mlr.press/v119/raghunathan20a.html.

A. S. Suggala, A. Prasad, V. Nagarajan, and P. Ravikumar. Revisiting adversarial risk. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 2331–2339. PMLR, 2019. URL http://proceedings.mlr.press/v89/suggala19a.html.

C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. In International Conference on Learning Representations, 2014. doi: https://doi.org/10.48550/arXiv.1312.6199.

D. Tsipras, S. Santurkar, L. Engstrom, A. Turner, and A. Madry. Robustness may be at odds with accuracy. In International Conference on Learning Representations, 2019. URL https://openreview.net/forum?id=SyxAb30cY7.

E. Wong and Z. Kolter. Provable defenses against adversarial examples via the convex outer adversarial polytope. In International Conference on Machine Learning, pages 5286–5295. PMLR, 2018. URL http://proceedings.mlr.press/v80/wong18a.html.

Z. Wu, H. Gao, S. Zhang, and Y. Gao. Understanding the robustness-accuracy tradeoff by rethinking robust fairness. 2021. URL https://openreview.net/forum?id=bl9zYxOVwa.

Y.-Y. Yang, C. Rashtchian, Y. Wang, and K. Chaudhuri. Robustness for non-parametric classification: A generic attack and defense. In International Conference on Artificial Intelligence and Statistics, pages 941–951. PMLR, 2020a. URL http://proceedings.mlr.press/v108/yang20b.html.

Y.-Y. Yang, C. Rashtchian, H. Zhang, R. R. Salakhutdinov, and K. Chaudhuri. A closer look at accuracy vs. robustness. Advances in Neural Information Processing Systems, 33, 2020b. URL https://proceedings.neurips.cc/paper/2020/hash/61d77652c97ef636343742fc3dcf3ba9-Abstract.html.

H. Zhang, Y. Yu, J. Jiao, E. Xing, L. El Ghaoui, and M. Jordan. Theoretically principled trade-off between robustness and accuracy. In International conference on machine learning, pages 7472–7482. PMLR, 2019. URL https://proceedings.mlr.press/v97/zhang19p.html.

Published

2023-01-23