Internally funded project
Acronym: b194dc-DocTraSeg
Start date: 16.11.2023
End date: 01.12.2024
This project explores the potential of Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) in two fundamental computer vision tasks: handwritten document analysis, and object tracking and segmentation.
Handwritten document analysis aims to analyze and recognize handwritten manuscripts for different purposes, such as text recognition, spotting, layout analysis, text alignment, and writer recognition. Because layout analysis and line segmentation are important first steps in digitizing scanned documents, this project will focus on these two tasks.
Object tracking and segmentation aims to continuously estimate the state of a target object throughout a video sequence, given its initial state in the first frame as a simple bounding box (rectangle) or mask. It is widely used in applications such as surveillance, autonomous driving, and human-computer interaction. Despite the progress made so far, the main challenge remains the limited discriminative power of the classifiers. Trackers are also easily misled by the many distractors that appear in real-world surveillance scenes. This project will investigate state-of-the-art algorithms for accurate and stable object tracking and segmentation.