Infinite Monkeys of Babel – Crowdsourcing for the betterment of OCR language material

Authors

  • Wouter Van Hemel, National Library of Finland, Library Network Services, PL 26 (Teollisuuskatu 23), FI-00014 University of Helsinki
  • Jussi-Pekka Hakkarainen, National Library of Finland, Library Network Services, PL 26 (Teollisuuskatu 23), FI-00014 University of Helsinki

DOI:

https://doi.org/10.7557/5.3469

Abstract

The OCR editor is the National Library of Finland’s most recent foray into the budding phenomenon of crowdsourcing. Under the motto “many hands make light work”, users can swiftly correct the typical mistakes in OCR-scanned text of source materials – often of challenging visual quality – using nothing more than their browser. Improving the quality and availability of the digital text would make it easier to study the original sources directly, and would indirectly benefit other tools that depend on accuracy, such as word list generators and dictionaries.

Published

2015-06-17

How to Cite

Van Hemel, W., & Hakkarainen, J.-P. (2015). Infinite Monkeys of Babel – Crowdsourcing for the betterment of OCR language material. Septentrio Conference Series, (2), 69–74. https://doi.org/10.7557/5.3469