3D Masked Modelling Advances Lesion Classification in Axial T2w Prostate MRI


  • Alvaro Fernandez-Quilez Stavanger University Hospital
  • Christoffer Gabrielsen Andersen University of Stavanger
  • Trygve Eftestøl University of Stavanger
  • Svein Reidar Kjosavik Stavanger University Hospital
  • Ketil Oppedal University of Stavanger




MRI, Self-Supervised, Auto-encoder, Prostate Cancer


Masked Image Modelling (MIM) has been shown to be an efficient self-supervised learning (SSL) pre-training paradigm when paired with transformer architectures and large amounts of unlabelled natural images. In medical imaging, labelled data are difficult to access and obtain while unlabelled data are available, which makes MIM an interesting approach for advancing deep learning (DL) applications based on 3D medical imaging data. Nevertheless, applications of SSL, and of MIM in particular, to medical imaging data remain scarce, and the potential of this learning paradigm in the medical domain is still uncertain. We study MIM in the context of Prostate Cancer (PCa) lesion classification with T2-weighted (T2w) axial magnetic resonance imaging (MRI) data. In particular, we explore the effect of coupling MIM with convolutional neural networks (CNNs) under different conditions, such as different masking strategies, and obtain better results in terms of area under the curve (AUC) than with other pre-training strategies such as ImageNet weight initialization.
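As a rough illustration of what a masking strategy for 3D MIM might look like, the following sketch hides a random subset of non-overlapping 3D patches in a volume and scores a reconstruction only on the hidden voxels. It is a minimal example under stated assumptions, not the configuration used in this work: the function names `random_patch_mask` and `masked_mse`, the patch size, the 50% mask ratio, the zero fill value and the synthetic volume are all hypothetical choices made for illustration.

```python
import numpy as np


def random_patch_mask(shape, patch=(4, 16, 16), mask_ratio=0.5, seed=0):
    """Return a boolean voxel mask (True = hidden) made of non-overlapping 3D patches.

    All parameter values are illustrative assumptions.
    """
    if any(s % p for s, p in zip(shape, patch)):
        raise ValueError("volume shape must be divisible by the patch size")
    grid = tuple(s // p for s, p in zip(shape, patch))
    n_patches = int(np.prod(grid))
    n_masked = int(round(mask_ratio * n_patches))

    # Pick which patches to hide, without replacement.
    rng = np.random.default_rng(seed)
    flat = np.zeros(n_patches, dtype=bool)
    flat[rng.choice(n_patches, size=n_masked, replace=False)] = True
    mask = flat.reshape(grid)

    # Expand patch-level decisions to voxel resolution along each axis.
    for axis, p in enumerate(patch):
        mask = np.repeat(mask, p, axis=axis)
    return mask


def masked_mse(reconstruction, target, mask):
    """Mean squared error computed on the hidden voxels only."""
    diff = (reconstruction - target)[mask]
    return float(np.mean(diff ** 2))


# Synthetic stand-in for a T2w volume laid out as (slices, height, width).
volume = np.random.default_rng(1).random((16, 64, 64)).astype(np.float32)
mask = random_patch_mask(volume.shape, patch=(4, 16, 16), mask_ratio=0.5)
masked_input = np.where(mask, 0.0, volume)  # hidden voxels are set to zero
```

In a pre-training loop, `masked_input` would be fed to an encoder-decoder network and `masked_mse` applied to its output. Varying `patch` and `mask_ratio` is one simple way to compare masking strategies.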

