Wu F, Seuret M, Mayr M, Kordon F, Zöllner J, Wind S, Maier A, Christlein V (2025)
Publication Language: English
Publication Type: Journal article, Original article
Publication year: 2025
URI: https://link.springer.com/article/10.1007/s10032-025-00519-9
DOI: 10.1007/s10032-025-00519-9
Handwritten document layout analysis is a fundamental step in digitizing scanned ancient documents for further processing (e.g., optical character recognition). So far, single-branch fully convolutional networks (FCNs) have dominated this field. However, we contend that the task poses significant challenges, particularly for layouts whose classes differ only semantically rather than in character appearance. For example, in the U-DIADS-Bib dataset, distinguishing the main text from chapter headings can confuse existing FCNs because of similar-looking distractors. It is thus critical to integrate structural layout information into the network's learning process. Moreover, single-branch networks are limited in how well they can model contextual relationships within a document. Therefore, we propose a novel two-branch framework, the lightweight cross-attention-based HookNet (Light-HookNet), for handwritten document layout segmentation. A cross-attention mechanism connects a global context branch and a local target branch so that layout contextual information can interact between them. This enables information enhancement within the target branch and information exchange across both branches. Additionally, the reduced number of network parameters and lower computational cost make the proposed method both lightweight and efficient. Extensive experiments and comparisons with state-of-the-art approaches on the newly proposed U-DIADS-Bib dataset and the popular DIVA-HisDB dataset demonstrate the superiority and effectiveness of the proposed method.
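To make the cross-attention idea in the abstract concrete, the following is a minimal sketch of generic scaled dot-product cross-attention, in which features from a local target branch query features from a global context branch. It is not the authors' implementation: the function names, feature sizes, random projection matrices, and the residual addition are all assumptions chosen for illustration, and the abstract does not specify how Light-HookNet arranges these steps.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(target, context, w_q, w_k, w_v):
    """Generic cross-attention (hypothetical helper, not from the paper).

    target:  (n_t, d) features from the local target branch (queries)
    context: (n_c, d) features from the global context branch (keys/values)
    Returns the attended features (n_t, d_v) and attention weights (n_t, n_c).
    """
    q = target @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


# Assumed toy sizes: two 4x4 feature maps flattened to 16 tokens of width 8.
rng = np.random.default_rng(0)
d = 8
target_tokens = rng.normal(size=(16, d))
context_tokens = rng.normal(size=(16, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

attended, weights = cross_attention(target_tokens, context_tokens, w_q, w_k, w_v)

# Assumed residual connection: context information is added to the target features.
enhanced_target = target_tokens + attended
```

Each row of `weights` sums to one, so every target position receives a convex combination of context values. In a trained network the projection matrices would be learned rather than sampled at random.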
APA:
Wu, F., Seuret, M., Mayr, M., Kordon, F., Zöllner, J., Wind, S., ... Christlein, V. (2025). Lightweight cross-attention-based HookNet for historical handwritten document layout analysis. International Journal on Document Analysis and Recognition. https://doi.org/10.1007/s10032-025-00519-9
MLA:
Wu, Fei, et al. "Lightweight cross-attention-based HookNet for historical handwritten document layout analysis." International Journal on Document Analysis and Recognition (2025).