Seuret M, Limbach S, Weichselbaumer N, Maier A, Christlein V (2019)
Publication Type: Conference contribution
Publication year: 2019
Publisher: Association for Computing Machinery
Page Range: 1-6
Conference Proceedings Title: ACM International Conference Proceeding Series
ISBN: 9781450376686
Open Access Link: http://www.weichselbaumer.info/s/HIP2019_type_groups_dataset.pdf
Based on contemporary scripts, early printers developed a large variety of different fonts. While fonts may slightly differ from one printer to another, they can be divided into font groups, such as Textura, Antiqua, or Fraktur. The recognition of font groups is important for computer scientists to select adequate OCR models, and of high interest to humanities scholars studying early printed books and the history of fonts. In this paper, we introduce a new, public dataset for the recognition of font groups in early printed books, and evaluate several state-of-the-art CNNs for the font group recognition task. The dataset consists of more than 35 600 page images, each page showing up to five different font groups, of which ten are considered in this dataset.
APA:
Seuret, M., Limbach, S., Weichselbaumer, N., Maier, A., & Christlein, V. (2019). Dataset of pages from early printed books with multiple font groups. In ACM International Conference Proceeding Series (pp. 1-6). Sydney, NSW, AU: Association for Computing Machinery.
MLA:
Seuret, Mathias, et al. "Dataset of pages from early printed books with multiple font groups." Proceedings of the 5th International Workshop on Historical Document Imaging and Processing, HIP 2019, held in conjunction with ICDAR 2019, Sydney, NSW: Association for Computing Machinery, 2019. 1-6.