Preparing languages for natural language generation using Wikidata lexicographical data
DOI:
https://doi.org/10.7557/5.5949
Keywords:
Arctic Knot Conference 2021, Wikidata, Natural language, Lexicographical data
Abstract
In the lead-up to the launch of Abstract Wikipedia, a sufficient body of linguistic information, from which its text can be generated for a given language, must be in place so that different sets of functions, some working with concepts and others turning these into word sequences, can work together to produce something natural in that language. Developing that body of information requires more thorough consideration of a number of linguistic aspects sooner rather than later.
This session will thus discuss aspects of language planning with respect to Wikidata lexicographical data and natural language generation, including the compositionality and manipulability of lexical units, the breadth and interconnectedness of units of meaning, and the treatment of variation among a language’s lects, broadly construed. Special reference will be made to the handling of each of these aspects for Bengali and the linguistic varieties often grouped with it.
License
Copyright (c) 2021 Mahir Morshed
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.