Fabiani L, Schlecht SJ, Elvander F (2024)
Publication Type: Conference contribution
Publication year: 2024
Publisher: IEEE Computer Society
Page Range: 1414-1417
Conference Proceedings Title: Conference Record - Asilomar Conference on Signals, Systems and Computers
Event location: Hybrid, Pacific Grove, CA, USA
ISBN: 9798350354058
DOI: 10.1109/IEEECONF60004.2024.10943074
In audio signal processing, having an effective metric for comparing audio data is essential to ensure an accurate understanding of sound properties and attributes. In this work, we formulate two novel approaches for measuring the similarity between audio signals in the time-frequency domain, taking advantage of principles from classical optimal transport problems and sliced Wasserstein distances. Using optimal transport to construct the metric allows for a more robust signal content comparison, considering not only the signals' individual elements but also the global distribution in the signal space. Additionally, the sliced Wasserstein methods expand the use of the distances to high dimensional problems. By integrating both time and frequency aspects into our metrics, we aim for a more comprehensive comparison that can better handle various types of signal distortions. Results show promising behavior in accurately measuring distances for increasing signal differences and avoiding the presence of local minima in the loss curves.
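As a rough illustration of the sliced Wasserstein idea mentioned in the abstract, the sketch below compares two nonnegative time-frequency arrays by projecting their shared grid onto random directions and averaging the resulting one-dimensional transport costs. It is a generic textbook-style construction, not the formulation from the paper: the function names `wasserstein_1d` and `sliced_wasserstein`, the choice of the 1-Wasserstein cost, the unit-square grid scaling, and the number of slices are all assumptions made for this example.

```python
import numpy as np


def wasserstein_1d(positions, a, b):
    """W1 between two weight vectors a and b defined on the same 1-D positions."""
    order = np.argsort(positions)
    x = positions[order]
    # Difference of the two cumulative distributions on each gap between points.
    cdf_gap = np.cumsum(a[order] - b[order])[:-1]
    return float(np.sum(np.abs(cdf_gap) * np.diff(x)))


def sliced_wasserstein(S1, S2, n_slices=64, seed=0):
    """Monte Carlo sliced W1 between two nonnegative arrays of equal shape.

    Assumption: axis 0 is frequency, axis 1 is time, and both axes are
    rescaled to [0, 1]. Each array must have a strictly positive sum.
    """
    assert S1.shape == S2.shape
    rng = np.random.default_rng(seed)
    # Normalise each array so it can be read as a probability distribution.
    a = S1.ravel() / S1.sum()
    b = S2.ravel() / S2.sum()
    f, t = np.meshgrid(
        np.linspace(0.0, 1.0, S1.shape[0]),
        np.linspace(0.0, 1.0, S1.shape[1]),
        indexing="ij",
    )
    grid = np.stack([f.ravel(), t.ravel()], axis=1)
    # Random unit directions in the time-frequency plane.
    angles = rng.uniform(0.0, np.pi, n_slices)
    directions = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    return float(np.mean([wasserstein_1d(grid @ d, a, b) for d in directions]))


# Toy example: a single active bin shifted in time by one and by four frames.
S0 = np.zeros((8, 8)); S0[2, 2] = 1.0
S_near = np.zeros((8, 8)); S_near[2, 3] = 1.0
S_far = np.zeros((8, 8)); S_far[2, 6] = 1.0

d_near = sliced_wasserstein(S0, S_near)
d_far = sliced_wasserstein(S0, S_far)
```

In this toy setting the distance grows with the size of the time shift, which is the kind of monotone behavior the abstract describes. A bin-by-bin comparison would instead give the same value for both shifts, since the active bins do not overlap in either case.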
APA:
Fabiani, L., Schlecht, S.J., & Elvander, F. (2024). Time-Frequency Audio Similarity Using Optimal Transport. In Michael B. Matthews (Ed.), Conference Record - Asilomar Conference on Signals, Systems and Computers (pp. 1414-1417). Hybrid, Pacific Grove, CA, USA: IEEE Computer Society.
MLA:
Fabiani, Linda, Sebastian J. Schlecht, and Filip Elvander. "Time-Frequency Audio Similarity Using Optimal Transport." Proceedings of the 58th Asilomar Conference on Signals, Systems and Computers, ACSSC 2024, Hybrid, Pacific Grove, CA, USA. Ed. Michael B. Matthews, IEEE Computer Society, 2024. 1414-1417.