Robust Deep Interpretable Features for Binary Image Classification




Interpretability, Kernel methods, Generative Modelling


The problem of interpretability for binary image classification is considered through the lens of kernel two-sample tests and generative modeling. A feature extraction framework termed Deep Interpretable Features (DIF) is developed and used in combination with IntroVAE, a generative model capable of high-resolution image synthesis. Experimental results on a variety of datasets, including COVID-19 chest x-rays, demonstrate the benefits of combining deep generative models with ideas from kernel-based hypothesis testing in moving towards more robust, interpretable deep generative models.

