Contextual Attention Network: Transformer Meets U-Net

Azad R, Heidari M, Wu Y, Merhof D (2022)

Publication Type: Conference contribution

Publication year: 2022

Journal

Lecture Notes in Computer Science Springer Verlag

Publisher: Springer Science and Business Media Deutschland GmbH

Book Volume: 13583 LNCS

Pages Range: 377-386

Conference Proceedings Title: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Event location: Singapore, SGP

ISBN: 9783031210136

DOI: 10.1007/978-3-031-21014-3_39

Abstract

Convolutional neural networks (CNNs) (e.g., UNet) have become the de facto standard and attained immense success in medical image segmentation. However, CNN-based methods fail to build long-range dependencies and global context connections due to the limited receptive field of the convolution operation. Therefore, Transformer variants have been proposed for medical image segmentation tasks due to their innate capability of capturing long-range correlations through the attention mechanism. However, since Transformers are not designed to capture local information, object boundaries are not well preserved, especially in difficult segmentation scenarios with partly overlapping objects. To address this issue, we propose a contextual attention network that includes a boundary representation on top of the CNN and Transformer features. It utilizes a CNN encoder to capture local semantic information and includes a Transformer module to model the long-range contextual dependency. The object-level representation is included by extracting hierarchical features that are then fed to the contextual attention module to adaptively recalibrate the representation space using local information. In this way, informative regions are emphasized while taking into account the long-range contextual dependency derived by the Transformer module. The results show that our approach is among the top-performing methods on the skin lesion segmentation benchmark, and specifically shows its strength on the SegPC challenge benchmark, which also includes overlapping objects. Implementation code in.
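The two ideas the abstract combines, attention over all positions for long-range context and a gate that recalibrates local features, can be illustrated with a small sketch. This is not the authors' implementation: the identity query/key/value projections, the sigmoid channel gate, and the function names `self_attention` and `recalibrate` are assumptions chosen only to keep the example short and dependency-free.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention.

    Assumption: identity projections (Q = K = V = tokens), so every output
    token is a weighted average of all input tokens (long-range context).
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, tokens)) for c in range(d)])
    return out


def recalibrate(local_feats, context_feats):
    """Scale each channel of the local features by a sigmoid gate.

    Assumption: the gate is the sigmoid of the per-channel mean of the
    context features (a hypothetical stand-in for a learned attention module).
    """
    n = len(context_feats)
    d = len(context_feats[0])
    gate = [1.0 / (1.0 + math.exp(-sum(t[c] for t in context_feats) / n)) for c in range(d)]
    return [[x[c] * gate[c] for c in range(d)] for x in local_feats]


# Three positions with two channels each, standing in for CNN encoder features.
local = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = self_attention(local)      # global context per position
fused = recalibrate(local, context)  # local features re-weighted by context
```

In a real model the projections and the gate would be learned, and the features would be 2D maps rather than a short list of vectors; the sketch only shows how a globally computed signal can re-weight locally extracted features.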

Involved external institutions

Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen

Germany (DE)

Iran University of Science and Technology / دانشگاه علم و صنعت ایران

Iran, Islamic Republic of (IR)

How to cite

APA:

Azad, R., Heidari, M., Wu, Y., & Merhof, D. (2022). Contextual Attention Network: Transformer Meets U-Net. In Chunfeng Lian, Xiaohuan Cao, Islem Rekik, Xuanang Xu, Zhiming Cui (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 377-386). Singapore, SGP: Springer Science and Business Media Deutschland GmbH.

MLA:

Azad, Reza, et al. "Contextual Attention Network: Transformer Meets U-Net." Proceedings of the 13th International Workshop on Machine Learning in Medical Imaging, MLMI 2022, held in conjunction with 25th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2022, Singapore, SGP. Ed. Chunfeng Lian, Xiaohuan Cao, Islem Rekik, Xuanang Xu, Zhiming Cui, Springer Science and Business Media Deutschland GmbH, 2022. 377-386.
