Bueß L, Stollenga M, Schinz D, Wiestler B, Kirschke J, Maier A, Navab N, Keicher M (2024)
Publication Language: English
Publication Type: Conference contribution
Publication year: 2024
Series: Proceedings of Machine Learning Research
Book Volume: Volume 250
Pages Range: 151-167
Conference Proceedings Title: Proceedings of MIDL 2024
URI: https://proceedings.mlr.press/v250/
Open Access Link: https://openreview.net/pdf?id=shuwpLaOJP
Early and accurate diagnosis of vertebral body anomalies is crucial for effectively treating spinal disorders, but the manual interpretation of CT scans can be time-consuming and error-prone. While deep learning has shown promise in automating vertebral fracture detection, improving the interpretability of existing methods is crucial for building trust and ensuring reliable clinical application. Vision Transformers (ViTs) offer inherent interpretability through attention visualizations but are limited in their application to 3D medical images due to reliance on 2D image pretraining. To address this challenge, we propose a novel approach combining the benefits of transfer learning from video-pretrained models and domain adaptation with self-supervised pretraining on a task-specific but unlabeled dataset. Compared to naive transfer learning from Video MAE, our method shows improved downstream task performance by 8.3 in F1 and a training speedup of factor 2. This closes the gap between videos and medical images, allowing a ViT to learn relevant anatomical features while adapting to the task domain. We demonstrate that our framework enables ViTs to effectively detect vertebral fractures in a low data regime, outperforming CNN-based state-of-the-art methods while providing inherent interpretability. Our task adaptation approach and dataset not only improve the performance of our proposed method but also enhance existing self-supervised pretraining approaches, highlighting the benefits of task-specific self-supervised pretraining for domain adaptation.
APA:
Bueß, L., Stollenga, M., Schinz, D., Wiestler, B., Kirschke, J., Maier, A., ... Keicher, M. (2024). Video-CT MAE: Self-supervised Video-CT Domain Adaptation for Vertebral Fracture Diagnosis. In Ninon Burgos, Caroline Petitjean, Maria Vakalopoulou, Stergios Christodoulidis, Pierrick Coupe, Hervé Delingette, Carole Lartizien, Diana Mateus (Eds.), Proceedings of MIDL 2024 (pp. 151-167). Paris, FR.
MLA:
Bueß, Lukas, et al. "Video-CT MAE: Self-supervised Video-CT Domain Adaptation for Vertebral Fracture Diagnosis." Proceedings of the International Conference on Medical Imaging with Deep Learning (MIDL) 2024, Paris. Ed. Ninon Burgos, Caroline Petitjean, Maria Vakalopoulou, Stergios Christodoulidis, Pierrick Coupe, Hervé Delingette, Carole Lartizien, Diana Mateus, 2024. 151-167.