Roessle D, Cremers D, Schoen T (2022)
Publication Type: Conference contribution
Publication year: 2022
Publisher: Springer Science and Business Media Deutschland GmbH
Book Volume: 13529 LNCS
Page Range: 599-610
Conference Proceedings Title: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Event location: Bristol, GBR
ISBN: 9783031159183
DOI: 10.1007/978-3-031-15919-0_50
Deep network architectures are usually based on domain-specific assumptions and are specialized to the modalities under consideration. This conceptual behavior also applies to multimodal networks, leading to modality-specific subnetworks. In this paper, we introduce a novel dynamic multi-modal and multi-instance (MM-MI) network based on Perceiver and Hopfield pooling which can learn intrinsic data fusion. We further introduce a novel composite dataset for evaluating MM-MI problems. We successfully show that our proposed architecture outperforms the late fusion baseline in all multi-modal setups by more than 40% accuracy on noisy data. Our simple, generally applicable, yet efficient architecture is a novel generalized approach for data fusion with high potential for future applications.
APA:
Roessle, D., Cremers, D., & Schoen, T. (2022). Perceiver Hopfield Pooling for Dynamic Multi-modal and Multi-instance Fusion. In E. Pimenidis, M. Aydin, P. Angelov, C. Jayne, & A. Papaleonidas (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 599-610). Bristol, GBR: Springer Science and Business Media Deutschland GmbH.
MLA:
Roessle, Dominik, Daniel Cremers, and Torsten Schoen. "Perceiver Hopfield Pooling for Dynamic Multi-modal and Multi-instance Fusion." Proceedings of the 31st International Conference on Artificial Neural Networks, ICANN 2022, Bristol, GBR. Ed. Elias Pimenidis, Mehmet Aydin, Plamen Angelov, Chrisina Jayne, Antonios Papaleonidas. Springer Science and Business Media Deutschland GmbH, 2022. 599-610.