Zenk J, Kordon F, Mayr M, Seuret M, Christlein V (2023)
Investigations on Self-supervised Learning for Script-, Font-type, and Location Classification on Historical Documents
Publication Type: Conference contribution
Publication year: 2023
Publisher: Association for Computing Machinery
Pages Range: 97-102
Conference Proceedings Title: ACM International Conference Proceeding Series
ISBN: 9798400708411
In the context of automated classification of historical documents, we investigate three contemporary self-supervised learning (SSL) techniques (SimSiam, DINO, and VICReg) for pre-training on three different document analysis tasks, namely script-type, font-type, and location classification. Our study draws samples from multiple datasets containing images of manuscripts, prints, charters, and letters. The representations derived via pretext training serve as inputs for k-NN classification and a parametric linear classifier. The latter is placed atop the pre-trained backbones so that the entire network can be fine-tuned, further improving classification by exploiting task-specific label data. The networks' final performance is assessed on independent test sets obtained from the ICDAR 2021 Competition on Historical Document Classification. We empirically show that representations learned with SSL are significantly better suited for subsequent document classification than features obtained by commonly used transfer learning on ImageNet.
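The abstract's first evaluation protocol, k-NN classification on frozen SSL embeddings, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy embedding vectors, the cosine-similarity metric, and the script-type labels are all assumptions made for the example.

```python
# Sketch of k-NN evaluation on frozen embeddings: a query embedding is
# labelled by majority vote over its k most similar labelled neighbours.
# Embeddings here are toy 2-D vectors, not real SSL features.
from collections import Counter
import math


def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)


def knn_predict(query, bank, labels, k=3):
    """Label a query by majority vote of its k nearest neighbours in the bank."""
    ranked = sorted(range(len(bank)),
                    key=lambda i: cosine(query, bank[i]),
                    reverse=True)
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy "embedding bank" with two clusters; labels are purely illustrative.
bank = [(1.0, 0.1), (0.9, 0.2), (0.1, 1.0), (0.2, 0.9)]
labels = ["gothic", "gothic", "humanistic", "humanistic"]

print(knn_predict((0.95, 0.15), bank, labels))  # prints "gothic"
```

Because the embeddings stay frozen and no parameters are trained, this protocol directly probes how well the pretext task has organized the representation space; the linear classifier and fine-tuning stages described above then add trainable capacity on top.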
APA:
Zenk, J., Kordon, F., Mayr, M., Seuret, M., & Christlein, V. (2023). Investigations on Self-supervised Learning for Script-, Font-type, and Location Classification on Historical Documents. In ACM International Conference Proceeding Series (pp. 97-102). San Jose, CA, US: Association for Computing Machinery.
MLA:
Zenk, Johan, et al. "Investigations on Self-supervised Learning for Script-, Font-type, and Location Classification on Historical Documents." Proceedings of the 7th International Workshop on Historical Document Imaging and Processing, HIP 2023, held in conjunction with ICDAR 2023, San Jose, CA, Association for Computing Machinery, 2023, pp. 97-102.