Overview of the SIGTURK 2026 Shared Task: Terminology-Aware Machine Translation for English-Turkish Scientific Texts

Gebeşce A, Safa A, Amasya EU, Şahin GG (2026)


Publication Type: Conference contribution

Publication year: 2026

Publisher: Association for Computational Linguistics (ACL)

Page range: 236-247

Conference Proceedings Title: SIGTURK 2026 - 2nd Workshop on Natural Language Processing for Turkic Languages, Proceedings of the Workshop

Event location: Rabat, MAR

ISBN: 9798891763708

DOI: 10.18653/v1/2026.sigturk-1.20

Abstract

This paper presents an overview of the SIGTURK 2026 Shared Task on Terminology-Aware Machine Translation for English-Turkish Scientific Texts. We address the critical challenge of terminological accuracy in low-resource settings by constructing the first terminology-rich English-Turkish parallel corpus, comprising 3,300 sentence pairs from STEM domains with 10,157 expert-validated term pairs. The shared task consists of three subtasks: term detection, expert-guided correction, and end-to-end post-editing. We evaluate state-of-the-art baselines (including GPT-5.2 and Claude Sonnet 4.5) alongside participant systems employing diverse strategies from fine-tuning to Retrieval-Augmented Generation (RAG). Our results highlight that while massive generalist models dominate zero-shot detection, smaller, domain-adapted models using Supervised Fine-Tuning and Reinforcement Learning can significantly outperform them in end-to-end post-editing. Furthermore, we find that rigid retrieval pipelines often disrupt fluency, whereas Chain-of-Thought prompting allows models to integrate terminology more naturally. Despite these advances, a significant gap remains between automated systems and human expert performance in strict terminology correction.


How to cite

APA:

Gebeşce, A., Safa, A., Amasya, E.U., & Şahin, G.G. (2026). Overview of the SIGTURK 2026 Shared Task: Terminology-Aware Machine Translation for English-Turkish Scientific Texts. In Kemal Oflazer, Abdullatif Koksal, Onur Varol (Eds.), SIGTURK 2026 - 2nd Workshop on Natural Language Processing for Turkic Languages, Proceedings of the Workshop (pp. 236-247). Rabat, MAR: Association for Computational Linguistics (ACL).

MLA:

Gebeşce, Ali, et al. "Overview of the SIGTURK 2026 Shared Task: Terminology-Aware Machine Translation for English-Turkish Scientific Texts." Proceedings of the 2nd Workshop on Natural Language Processing for Turkic Languages, SIGTURK 2026, Rabat, MAR. Ed. Kemal Oflazer, Abdullatif Koksal, Onur Varol, Association for Computational Linguistics (ACL), 2026. 236-247.
