Interactive Scribble Segmentation


  • Mathias Micheelsen Lowes Technical University of Denmark
  • Jakob L. Christensen Technical University of Denmark
  • Bjørn Schreblowski Hansen Technical University of Denmark
  • Morten Rieger Hannemose Technical University of Denmark
  • Anders Bjorholm Dahl Technical University of Denmark
  • Vedrana Dahl Technical University of Denmark



Deep learning, Image analysis, Segmentation, Scribble, Scribble segmentation, Interactive segmentation, Human-in-the-loop


We present a deep learning model for image segmentation that takes weakly supervised input in the form of scribbles. Using a brush tool, a user draws scribbles on an image corresponding to the labels they want segmented. The network segments the image in real time while the scribbles are being drawn, giving the user instant feedback. Mistakes made by the network are easy to correct by adding more scribbles. During training, we use a similar pseudo-interactive, iterative setup so that the network is optimized for the human-in-the-loop inference setting. In contrast, standard scribble segmentation methods do not treat training as an interactive setting and are therefore not suited to interactive inference. Our model is class-agnostic and generalizes across many different data modalities. We compare our model with other weakly supervised methods, such as bounding-box and extreme-point methods, and show that it achieves a better mean Dice score.
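For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how a mean Dice score over binary masks could be computed. It is not the authors' evaluation code. The function names `dice_score` and `mean_dice`, the nested-list mask format, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
from itertools import chain


def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary 2D masks.

    Masks are given as equally sized rows of 0/1 (or False/True) values.
    """
    p = [bool(v) for v in chain.from_iterable(pred)]
    t = [bool(v) for v in chain.from_iterable(target)]
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(a and b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


def mean_dice(preds, targets):
    """Average the per-image Dice score over paired lists of masks."""
    scores = [dice_score(p, t) for p, t in zip(preds, targets)]
    if not scores:
        raise ValueError("at least one mask pair is required")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    prediction = [[1, 1], [0, 0]]
    ground_truth = [[1, 0], [0, 0]]
    print(dice_score(prediction, ground_truth))
```

A score of 1 means the predicted and reference masks coincide, and 0 means they do not overlap at all. Averaging per image, as done here, is one common choice; averaging per class or per dataset would give different numbers.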

