Mayr M, Felker A, Maier A, Christlein V (2022)
Publication Type: Conference contribution
Publication year: 2022
Publisher: Springer Science and Business Media Deutschland GmbH
Book Volume: 13237 LNCS
Pages Range: 598-612
Conference Proceedings Title: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Event location: La Rochelle, FRA
ISBN: 9783031065545
DOI: 10.1007/978-3-031-06555-2_40
Automatically extracting targeted information from historical documents is an important task in document analysis and eases the work of historians dealing with huge corpora. In this work, we investigate retrieving the recipient transcriptions from the Nuremberg letterbooks of the 15th century. This task can be approached in fundamentally different ways. First, recipient lines can be detected solely from visual features, without any explicit linguistic feedback. Here, we use a vanilla U-Net and an attention-based U-Net as representatives. Second, linguistic feedback can be used to classify each line. On the one hand, this is done with handwritten text recognition (HTR) that predicts the transcriptions, followed by a lightweight natural language processing (NLP) model that decides whether a line is a recipient line. On the other hand, we adapt a named entity recognition transformer model that jointly performs line transcription and recipient line recognition. To improve performance, we investigated all possible combinations of the different methods. In most cases, the combined output probabilities outperformed the single approaches. On the hard test set, the best combination achieved an F1 score of 80% and a recipient line recognition accuracy of about 96%, while the best single approach reached only about 74% and 94%, respectively.
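The abstract reports that combining the output probabilities of several models usually beat each model alone, but it does not state the fusion rule. The following is a minimal sketch that assumes a plain average of per-line recipient probabilities followed by a 0.5 threshold. The function names, the threshold, and the toy scores are hypothetical illustrations, not the authors' implementation or data.

```python
from typing import List, Sequence, Tuple


def combine_probabilities(per_model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-line recipient probabilities across models.

    Assumption: simple unweighted averaging; the paper may use another rule.
    """
    n_lines = len(per_model_probs[0])
    if any(len(p) != n_lines for p in per_model_probs):
        raise ValueError("all models must score the same number of lines")
    n_models = len(per_model_probs)
    return [sum(p[i] for p in per_model_probs) / n_models for i in range(n_lines)]


def f1_and_accuracy(
    probs: Sequence[float], labels: Sequence[int], threshold: float = 0.5
) -> Tuple[float, float]:
    """Threshold the probabilities, then compute F1 (recipient class) and accuracy."""
    preds = [int(p >= threshold) for p in probs]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    correct = sum(1 for p, y in zip(preds, labels) if p == y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return f1, correct / len(labels)


if __name__ == "__main__":
    # Made-up scores for four text lines; 1 marks a recipient line.
    labels = [1, 0, 0, 0]
    visual = [0.9, 0.2, 0.6, 0.1]      # hypothetical U-Net line scores
    linguistic = [0.7, 0.4, 0.3, 0.2]  # hypothetical HTR + NLP line scores

    combined = combine_probabilities([visual, linguistic])
    print("visual only:", f1_and_accuracy(visual, labels))
    print("combined:   ", f1_and_accuracy(combined, labels))
```

In this toy case the visual model produces one false positive on the third line, and averaging with the linguistic score pulls that line below the threshold. This only illustrates why fusion can help; it says nothing about the reported 80% F1 and 96% accuracy.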
APA:
Mayr, M., Felker, A., Maier, A., & Christlein, V. (2022). Combining Visual and Linguistic Models for a Robust Recipient Line Recognition in Historical Documents. In Seiichi Uchida, Elisa Barney, Véronique Eglin (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 598-612). La Rochelle, FRA: Springer Science and Business Media Deutschland GmbH.
MLA:
Mayr, Martin, et al. "Combining Visual and Linguistic Models for a Robust Recipient Line Recognition in Historical Documents." Proceedings of the 15th IAPR International Workshop on Document Analysis Systems, DAS 2022, La Rochelle, FRA. Ed. Seiichi Uchida, Elisa Barney, Véronique Eglin. Springer Science and Business Media Deutschland GmbH, 2022. 598-612.