Herre J, Delgado P (2024)
Publication Language: English
Publication Type: Journal article, Original article
Publication year: 2024
Book Volume: vol. 32
Pages Range: 4661-4675
DOI: 10.1109/TASLP.2024.3477291
Efficient audio quality assessment is vital for streamlining audio codec development. Objective assessment tools have been developed over time to algorithmically predict quality ratings from subjective assessments, the gold standard for quality judgment. Many of these tools use perceptual auditory models to extract audio features that are mapped to a basic audio quality score prediction using machine learning algorithms and subjective scores as training data. However, existing tools struggle with generalization in quality prediction, especially when faced with unknown signal and distortion types. This is particularly evident in the presence of signals coded using non-waveform-preserving parametric techniques. Addressing these challenges, this two-part work proposes extensions to the Perceptual Evaluation of Audio Quality (PEAQ - ITU-R BS.1387-1) recommendation. Part 1 focuses on increasing generalization, while Part 2 targets accurate spatial audio quality measurement in audio coding. To enhance prediction generalization, this paper (Part 1) introduces a novel machine learning approach that uses subjective data to model cognitive aspects of audio quality perception. The proposed method models the perceived severity of audible distortions by adaptively weighting different distortion metrics. The weights are determined using an interaction cost function that captures relationships between distortion salience and cognitive effects. Compared to other machine learning methods and established tools, the proposed architecture achieves higher prediction accuracy on large databases of previously unseen subjective quality scores. The perceptually-motivated model offers a more manageable alternative to general-purpose machine learning algorithms, allowing potential extensions and improvements to multi-dimensional quality measurement without complete retraining.
APA:
Herre, J., & Delgado, P. (2024). Towards Improved Objective Perceptual Audio Quality Assessment - Part 1: A Novel Data-Driven Cognitive Model. IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 32, 4661-4675. https://doi.org/10.1109/TASLP.2024.3477291
MLA:
Herre, Jürgen, and Pablo Delgado. "Towards Improved Objective Perceptual Audio Quality Assessment - Part 1: A Novel Data-Driven Cognitive Model." IEEE/ACM Transactions on Audio, Speech and Language Processing vol. 32 (2024): 4661-4675.