Efficient Self-Supervision using Patch-based Contrastive Learning for Histopathology Image Segmentation


  • Nicklas Boserup University of Copenhagen
  • Raghavendra Selvan University of Copenhagen




self-supervision, image segmentation, histopathology, efficient machine learning


Learning discriminative representations of unlabeled data is a challenging task. Contrastive self-supervised learning provides a framework to learn meaningful representations using learned notions of similarity measures from simple pretext tasks. In this work, we propose a simple and efficient framework for self-supervised image segmentation using contrastive learning on image patches, without using explicit pretext tasks or any further labeled fine-tuning. A fully convolutional neural network (FCNN) is trained in a self-supervised manner to discern features in the input images and obtain confidence maps which capture the network's belief about the objects belonging to the same class. Positive and negative patches for contrastive learning are sampled based on the average entropy in the confidence maps. Convergence is assumed when the information separation between positive patches is small and that between positive-negative pairs is large. The proposed model consists only of a simple FCNN with 10.8k parameters and requires about 5 minutes to converge on the high-resolution microscopy datasets; this is orders of magnitude smaller than relevant self-supervised methods that attain similar performance. We evaluate the proposed method on the task of segmenting nuclei in two histopathology datasets and show performance comparable to relevant self-supervised and supervised methods.
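To make the entropy-based patch sampling concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single-channel confidence map with values in [0, 1], non-overlapping square patches, and binary Shannon entropy per pixel. The helper names (`binary_entropy`, `patch_mean_entropy`, `rank_patches`) and the toy confidence values are hypothetical. The abstract does not state how the average entropy decides which patches become positives or negatives, so the sketch only ranks patches and leaves that rule open.

```python
import math


def binary_entropy(p, eps=1e-12):
    """Shannon entropy in bits of a Bernoulli variable with probability p."""
    # Clip to avoid log(0) at fully confident pixels.
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def patch_mean_entropy(conf_map, patch):
    """Average per-pixel entropy of each non-overlapping patch.

    conf_map: list of rows, each a list of confidences in [0, 1].
    patch:    side length of the square patches (assumed value).
    Returns a dict mapping the (row, col) of each patch's top-left
    corner to the mean entropy inside that patch.
    """
    height, width = len(conf_map), len(conf_map[0])
    scores = {}
    for r in range(0, height - patch + 1, patch):
        for c in range(0, width - patch + 1, patch):
            values = [
                binary_entropy(conf_map[i][j])
                for i in range(r, r + patch)
                for j in range(c, c + patch)
            ]
            scores[(r, c)] = sum(values) / len(values)
    return scores


def rank_patches(conf_map, patch):
    """Patch corners ordered from lowest to highest mean entropy."""
    scores = patch_mean_entropy(conf_map, patch)
    return sorted(scores, key=scores.get)


# Toy 4x4 confidence map: the left half is confident, the right half is not.
conf = [
    [0.99, 0.98, 0.50, 0.55],
    [0.97, 0.99, 0.45, 0.50],
    [0.02, 0.01, 0.60, 0.40],
    [0.03, 0.02, 0.50, 0.50],
]
scores = patch_mean_entropy(conf, 2)
ranked = rank_patches(conf, 2)
print(ranked)
```

In this toy map, the patches on the confident left half rank first and the ambiguous right-half patches rank last. A sampler built on top of this ranking could, for example, draw candidates from the low-entropy end, but that selection rule is an assumption here rather than something the abstract specifies.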

