Toponym disambiguation in historical documents using semantic and geographic features
Journal
Proceedings of the 2nd International Conference on Digital Access to Textual Cultural Heritage, pages 175-180
Date Issued
2017
Author(s)
Ardanuy, Mariona Coll
DOI
10.1145/3078081.3078099
Abstract
Historians are often interested in the locations mentioned in digitized collections. However, place names are highly ambiguous and may change over time, which makes it especially hard to automatically ground mentions of places in historical texts to their real-world referents. Toponym disambiguation is a challenging problem in natural language processing, and has been approached through two different yet related tasks: toponym resolution and entity linking. In this paper, we propose a weakly-supervised method that combines the strengths of both approaches by exploiting both geographic and semantic features. We tested our method against a historical toponym resolution benchmark and improved the state of the art. We also created five datasets, tested the performance of two state-of-the-art out-of-the-box entity linking methods on them, and improved on their performance when only locations are considered.