Arias Vergara T, Perez Toro PA, Liu X, Xing F, Stone M, Zhuo J, Prince JL, Schuster M, Nöth E, Woo J, Maier A (2024)
Publication Language: English
Publication Type: Conference contribution
Publication year: 2024
Publisher: International Speech Communication Association
Pages Range: 927-931
Conference Proceedings Title: Interspeech 2024
DOI: 10.21437/Interspeech.2024-2236
Magnetic Resonance Imaging (MRI) allows the analysis of speech production by capturing high-resolution images of the dynamic processes in the vocal tract. In clinical applications, combining MRI with synchronized speech recordings leads to improved patient outcomes, especially if a phonological-based approach is used for assessment. However, when audio signals are unavailable, sounds are recognized less accurately from MRI data alone. We propose a contrastive learning approach to improve the detection of phonological classes from MRI data when acoustic signals are not available at inference time. We demonstrate that frame-wise recognition of phonological classes improves from an F1 score of 0.74 to 0.85 when the contrastive loss is used. Furthermore, we show the utility of our approach in a clinical application, using these phonological classes to assess speech disorders in patients with tongue cancer, with promising results in the recognition task.
APA:
Arias Vergara, T., Perez Toro, P. A., Liu, X., Xing, F., Stone, M., Zhuo, J., ... Maier, A. (2024). Contrastive Learning Approach for Assessment of Phonological Precision in Patients with Tongue Cancer Using MRI Data. In Interspeech 2024 (pp. 927-931). Kos Island, GR: International Speech Communication Association.
MLA:
Arias Vergara, Tomás, et al. "Contrastive Learning Approach for Assessment of Phonological Precision in Patients with Tongue Cancer Using MRI Data." Proceedings of the 25th Interspeech Conference 2024, Kos Island, International Speech Communication Association, 2024. 927-931.