PLM-AS: Pre-trained Language Models Augmented with Scanpaths for Sentiment Classification

Authors

  • Duo Yang University of Copenhagen
  • Nora Hollenstein University of Copenhagen

DOI:

https://doi.org/10.7557/18.6797

Keywords:

Sentiment classification, Eye tracking, Multimodal fusion, Pre-trained language model

Abstract

Recent research has demonstrated that deep neural networks can learn meaningful feature representations from both eye-tracking data and sentences without handcrafted features, achieving competitive performance on cognitive NLP tasks such as sentiment classification over gaze datasets. However, previous work mainly encodes the text and gaze data separately, neither modeling the interaction between the two modalities nor applying large-scale pre-trained models. To address these challenges, we introduce PLM-AS, a novel framework that takes full advantage of textual and eye-tracking features through sequence modeling, fusing the two modalities in a highly interactive way. It is also the first attempt to combine large-scale pre-trained language models with eye-tracking features in a cognitive reading task. We show that PLM-AS captures cognitive signals from eye-tracking data and improves sentiment classification performance within and across three datasets from different domains.

References

M. Barrett and N. Hollenstein. Sequence labelling and sequence classification with gaze: Novel uses of eye-tracking data for natural language processing. Language and Linguistics Compass, 14(11):1–16, sep 2020. doi:10.1111/lnc3.12396.

X. Chen, J. Mao, Y. Liu, M. Zhang, and S. Ma. Investigating human reading behavior during sentiment judgment. International Journal of Machine Learning and Cybernetics, 13(8):2283–2296, mar 2022. doi: 10.1007/s13042-022-01523-9.

J. Cheri, A. Mishra, and P. Bhattacharyya. Leveraging annotators’ gaze behaviour for coreference resolution. In Proceedings of the 7th Workshop on Cognitive Aspects of Computational Language Learning, pages 22–26, Berlin, Aug. 2016. Association for Computational Linguistics. doi:10.18653/v1/W16-1904.

K. Cho, B. van Merrienboer, D. Bahdanau, and Y. Bengio. On the properties of neural machine translation: Encoder–decoder approaches. In Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation. Association for Computational Linguistics, 2014. doi:10.3115/v1/w14-4012.

K. Cho, B. van Merrienboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio. Learning phrase representations using RNN encoder–decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, 2014. doi:10.3115/v1/d14-1179.

R. A. J. de Belen, T. Bednarz, and A. Sowmya. ScanpathNet: A recurrent mixture density network for scanpath prediction. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). IEEE, jun 2022. doi:10.1109/cvprw56347.2022.00549.

J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota, June 2019. Association for Computational Linguistics. doi: 10.18653/v1/N19-1423.

N. Hollenstein and C. Zhang. Entity recognition at first sight: Improving NER with eye movement features. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 2019. doi:10.18653/v1/n19-1001.

N. Hollenstein, J. Rotsztejn, M. Troendle, A. Pedroni, C. Zhang, and N. Langer. ZuCo, a simultaneous EEG and eye-tracking resource for natural sentence reading. Scientific Data, 5(1), dec 2018. doi:10.1038/sdata.2018.291.

N. Hollenstein, E. Chersoni, C. L. Jacobs, Y. Oseki, L. Prévot, and E. Santus. CMCL 2021 shared task on eye-tracking prediction. In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics. Association for Computational Linguistics, 2021. doi:10.18653/v1/2021.cmcl-1.7.

A. Joshi, A. Mishra, N. Senthamilselvan, and P. Bhattacharyya. Measuring sentiment annotation complexity of text. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Association for Computational Linguistics, 2014. doi: 10.3115/v1/p14-2007.

S. Klerke, Y. Goldberg, and A. Søgaard. Improving sentence compression by learning to predict gaze. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 2016. doi:10.18653/v1/n16-1179.

Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov. RoBERTa: A robustly optimized BERT pretraining approach, 2020. doi:10.48550/arXiv.1907.11692.

Y. Long, L. Qin, R. Xiang, M. Li, and C.-R. Huang. A cognition based attention model for sentiment analysis. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2017. doi: 10.18653/v1/d17-1048.

I. Loshchilov and F. Hutter. Decoupled weight decay regularization. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net, 2019. doi:10.48550/arXiv.1711.05101.

S. Mathias, D. Kanojia, A. Mishra, and P. Bhattacharyya. A survey on using gaze behaviour for natural language processing. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI-20, pages 4907–4913. International Joint Conferences on Artificial Intelligence Organization, 2020. doi:10.24963/ijcai.2020/683.

E. S. McGuire and N. Tomuro. Sentiment analysis with cognitive attention supervision. Proceedings of the Canadian Conference on Artificial Intelligence, jun 2021. doi:10.21428/594757db.90170c50.

A. Mishra and P. Bhattacharyya. Predicting readers’ sarcasm understandability by modeling gaze behavior. In Cognitively Inspired Natural Language Processing, pages 99–115. Springer Singapore, 2018. doi:10.1007/978-981-13-1516-9_5.

A. Mishra, K. Dey, and P. Bhattacharyya. Learning cognitive features from gaze data for sentiment and sarcasm classification using convolutional neural network. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, 2017. doi:10.18653/v1/p17-1035.

A. Mishra, S. Tamilselvam, R. Dasgupta, S. Nagar, and K. Dey. Cognition-cognizant sentiment analysis with multitask subjectivity summarization based on annotators’ gaze behavior. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1), 2018. doi:10.1609/aaai.v32i1.12068.

M. Mosbach, M. Andriushchenko, and D. Klakow. On the stability of fine-tuning BERT: Misconceptions, explanations, and strong baselines. In International Conference on Learning Representations, 2021. doi:10.48550/arXiv.2006.04884.

D. Noton and L. Stark. Scanpaths in eye movements during pattern perception. Science, 171(3968):308–311, jan 1971. doi:10.1126/science.171.3968.308.

B. Pang and L. Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05), pages 115–124, Ann Arbor, Michigan, June 2005. Association for Computational Linguistics. doi: 10.3115/1219840.1219855.

B. Plank. Keystroke dynamics as signal for shallow syntactic parsing. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 609–619, Osaka, Japan, Dec. 2016. The COLING 2016 Organizing Committee. doi: 10.48550/arXiv.1610.03321.

K. Rayner. Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124(3):372–422, 1998. doi: 10.1037/0033-2909.124.3.372.

D. E. Rumelhart, G. E. Hinton, and R. J. Williams. Learning representations by back-propagating errors. Nature, 323(6088):533–536, oct 1986. doi:10.1038/323533a0.

T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz, et al. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 38–45, Online, Oct. 2020. Association for Computational Linguistics. doi:10.18653/v1/2020.emnlp-demos.6.

R. Zhang, A. Saran, B. Liu, Y. Zhu, S. Guo, S. Niekum, D. Ballard, and M. Hayhoe. Human gaze assisted artificial intelligence: A review. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence. International Joint Conferences on Artificial Intelligence Organization, jul 2020. doi:10.24963/ijcai.2020/689.

Published

2023-01-23