Gebeşce A, Safa A, Amasya EU, Şahin GG (2026)
Publication Type: Conference contribution
Publication year: 2026
Publisher: Association for Computational Linguistics (ACL)
Page Range: 236-247
Conference Proceedings Title: SIGTURK 2026 - 2nd Workshop on Natural Language Processing for Turkic Languages, Proceedings of the Workshop
Event location: Rabat, MAR
ISBN: 9798891763708
DOI: 10.18653/v1/2026.sigturk-1.20
This paper presents an overview of the SIGTURK 2026 Shared Task on Terminology-Aware Machine Translation for English-Turkish Scientific Texts. We address the critical challenge of terminological accuracy in low-resource settings by constructing the first terminology-rich English-Turkish parallel corpus, comprising 3,300 sentence pairs from STEM domains with 10,157 expert-validated term pairs. The shared task consists of three subtasks: term detection, expert-guided correction, and end-to-end post-editing. We evaluate state-of-the-art baselines (including GPT-5.2 and Claude Sonnet 4.5) alongside participant systems employing diverse strategies from fine-tuning to Retrieval-Augmented Generation (RAG). Our results highlight that while massive generalist models dominate zero-shot detection, smaller, domain-adapted models using Supervised Fine-Tuning and Reinforcement Learning can significantly outperform them in end-to-end post-editing. Furthermore, we find that rigid retrieval pipelines often disrupt fluency, whereas Chain-of-Thought prompting allows models to integrate terminology more naturally. Despite these advances, a significant gap remains between automated systems and human expert performance in strict terminology correction.
APA:
Gebeşce, A., Safa, A., Amasya, E.U., & Şahin, G.G. (2026). Overview of the SIGTURK 2026 Shared Task: Terminology-Aware Machine Translation for English-Turkish Scientific Texts. In Kemal Oflazer, Abdullatif Koksal, Onur Varol (Eds.), SIGTURK 2026 - 2nd Workshop on Natural Language Processing for Turkic Languages, Proceedings of the Workshop (pp. 236-247). Rabat, MAR: Association for Computational Linguistics (ACL).
MLA:
Gebeşce, Ali, et al. "Overview of the SIGTURK 2026 Shared Task: Terminology-Aware Machine Translation for English-Turkish Scientific Texts." Proceedings of the 2nd Workshop on Natural Language Processing for Turkic Languages, SIGTURK 2026, Rabat, MAR. Ed. Kemal Oflazer, Abdullatif Koksal, Onur Varol. Association for Computational Linguistics (ACL), 2026. 236-247.