Contrastive learning for unsupervised medical image clustering and reconstruction

Authors

  • Matteo Ferrante, University of Rome Tor Vergata
  • Tommaso Boccato, University of Rome Tor Vergata
  • Andrea Duggento, University of Rome Tor Vergata
  • Simeon Spasov, University of Cambridge
  • Nicola Toschi, University of Rome Tor Vergata

DOI:

https://doi.org/10.7557/18.6819

Keywords:

contrastive learning, unsupervised learning, patient stratification, deep clustering

Abstract

The scarcity of large, labeled medical imaging datasets, together with substantial inter-individual variability relative to clinically established disease classes, poses significant challenges for exploiting medical imaging information in a precision medicine paradigm, where in principle dense patient-specific data could be employed to formulate individual predictions and/or to stratify patients into finer-grained groups that follow more homogeneous trajectories and therefore empower clinical trials. To efficiently explore the effective degrees of freedom underlying variability in medical images in an unsupervised manner, we propose an unsupervised autoencoder framework augmented with a contrastive loss that encourages high separability in the latent space. The model is validated on (medical) benchmark datasets. Since class labels can be assigned to each example according to its cluster assignment, we compare performance with a supervised transfer-learning baseline. Our method achieves performance similar to the supervised architecture, indicating that the separation in latent space reproduces the labels assigned by expert medical observers. The proposed method could be beneficial for patient stratification, for exploring new subdivisions of larger classes or pathological continua, or, owing to its sampling ability in the variational setting, for data augmentation in medical image processing.
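
To make the idea of an autoencoder "augmented with a contrastive loss" concrete, the following is a minimal sketch, not the authors' implementation: a standard autoencoder whose latent codes for two augmented views of the same image are additionally pulled together by a SimCLR-style NT-Xent term. The architecture, input size (28x28, as in MNIST/MedMNIST-like benchmarks), and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch (PyTorch): autoencoder + contrastive term on the latent space.
# All module names, sizes and hyperparameters are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContrastiveAE(nn.Module):
    def __init__(self, latent_dim: int = 64):
        super().__init__()
        # Encoder: 1x28x28 image -> latent vector
        self.encoder = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        # Decoder: latent vector -> reconstructed image
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, 28 * 28), nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)
        x_hat = self.decoder(z).view_as(x)
        return z, x_hat


def nt_xent(z1, z2, temperature: float = 0.5):
    """NT-Xent loss: two augmented views of the same image are positives,
    all other samples in the batch act as negatives."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)        # (2N, D)
    sim = z @ z.t() / temperature                             # cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))                # exclude self-pairs
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)


def training_step(model, view1, view2, alpha: float = 1.0):
    """One step: reconstruction loss on both views plus the contrastive term."""
    z1, rec1 = model(view1)
    z2, rec2 = model(view2)
    recon = F.mse_loss(rec1, view1) + F.mse_loss(rec2, view2)
    return recon + alpha * nt_xent(z1, z2)


if __name__ == "__main__":
    model = ContrastiveAE()
    x1 = torch.rand(32, 1, 28, 28)   # stand-ins for two augmentations of a batch
    x2 = torch.rand(32, 1, 28, 28)
    loss = training_step(model, x1, x2)
    loss.backward()
    print(f"combined loss: {loss.item():.4f}")
```

In this sketch the total objective is simply the sum of the reconstruction error on both views and an alpha-weighted contrastive term; in a variational setting, as alluded to in the abstract, the encoder would output distribution parameters and a KL term would be added, enabling sampling for data augmentation.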

Published

2023-01-23