Burton J, Frank D, Saleh M, Navab N, Bear HL (2018). The speaker-independent lipreading play-off; A survey of lipreading machines
Publication Type: Conference contribution
Publication year: 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages Range: 125-130
Conference Proceedings Title: IEEE 3rd International Conference on Image Processing, Applications and Systems, IPAS 2018
Event location: Sophia Antipolis, FRA
ISBN: 9781728102474
DOI: 10.1109/IPAS.2018.8708874
Lipreading is a difficult gesture classification task. One problem in computer lipreading is speaker independence: achieving the same accuracy on test speakers not included in the training set as on speakers within it. The literature on speaker-independent lipreading is limited, and the few reported independent test-speaker accuracy scores are usually aggregated with dependent test-speaker accuracies into an averaged performance figure. This obscures the truly independent results. Here we undertake a systematic survey of experiments on the TCD-TIMIT dataset using both conventional approaches and deep learning methods to provide a series of wholly speaker-independent benchmarks, and show that the best speaker-independent machine scores 69.58% accuracy with CNN features and an SVM classifier. This is lower than state-of-the-art speaker-dependent lipreading machines, but higher than previously reported in speaker-independence experiments.
APA:
Burton, J., Frank, D., Saleh, M., Navab, N., & Bear, H. L. (2018). The speaker-independent lipreading play-off; A survey of lipreading machines. In IEEE 3rd International Conference on Image Processing, Applications and Systems, IPAS 2018 (pp. 125-130). Sophia Antipolis, FRA: Institute of Electrical and Electronics Engineers Inc.
MLA:
Burton, Jake, et al. "The speaker-independent lipreading play-off; A survey of lipreading machines." Proceedings of the IEEE 3rd International Conference on Image Processing, Applications and Systems, IPAS 2018, Sophia Antipolis, FRA, Institute of Electrical and Electronics Engineers Inc., 2018, pp. 125-130.