Parra-Gallego LF, Purohit T, Vlasenko B, Orozco-Arroyave JR, Magimai.-Doss M (2024)
Publication Type: Conference contribution
Publication year: 2024
Publisher: International Speech Communication Association
Page Range: 477-481
Conference Proceedings Title: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Event location: Kos Island, GRC
DOI: 10.21437/Interspeech.2024-514
Customer Satisfaction (CS) in call centers influences customer loyalty and the company's reputation. Traditionally, CS evaluations were conducted manually or with classical machine learning algorithms; however, advancements in deep learning have led to automated systems that evaluate CS using speech and text analyses. Previous studies have shown the text approach to be more accurate, but it relies on an external automatic speech recognition (ASR) system for transcription. This study introduces a cross-transfer knowledge technique that distills knowledge from the BERT model into speech encoders such as Wav2Vec2, WavLM, and Whisper. By enriching these encoders with BERT's linguistic information, we improve speech analysis performance and eliminate the need for ASR. In evaluations on a dataset of customer opinions, our methods achieve over 92% accuracy in identifying CS categories, providing a faster and more cost-effective solution than traditional text approaches.
APA:
Parra-Gallego, L.F., Purohit, T., Vlasenko, B., Orozco-Arroyave, J.R., & Magimai.-Doss, M. (2024). Cross-transfer Knowledge between Speech and Text Encoders to Evaluate Customer Satisfaction. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 477-481). Kos Island, GRC: International Speech Communication Association.
MLA:
Parra-Gallego, Luis Felipe, et al. "Cross-transfer Knowledge between Speech and Text Encoders to Evaluate Customer Satisfaction." Proceedings of the 25th Interspeech Conference 2024, Kos Island, GRC, International Speech Communication Association, 2024. 477-481.