Keßler F (2026)
Publication Type: Journal article
Publication year: 2026
Original Authors: Florian Keßler
Book Volume: 80
Page Range: 151-183
Issue: 1
With the availability of large corpora that can be queried using computational methods, the importance of considering large amounts of usage data to accurately describe the behavior of a word has been increasingly recognized. The most common method for such research is reading concordances, i.e., compilations of occurrences of a word in their surrounding contexts, sampled from a corpus. This study proposes an unsupervised way of clustering concordances using high-dimensional contextual word embeddings, which capture semantic as well as syntactic information about a word, and tests it on yi 一 (one) in a large corpus of pre-modern Chinese. The results show that the approach yields clusters that can be fruitfully explored in a Construction Grammar framework, but that the clusters may be inconsistent, in the sense that instances of what a researcher would consider one phenomenon can be split across more than one cluster.
APA:
Keßler, F. (2026). “One” in 768 Dimensions: Using Language Models to Study the Usage of yi 一 on a Large Scale. Asiatische Studien - Études Asiatiques, 80(1), 151-183. https://doi.org/10.1515/asia-2025-0022
MLA:
Keßler, Florian. "“One” in 768 Dimensions: Using Language Models to Study the Usage of yi 一 on a Large Scale." Asiatische Studien - Études Asiatiques 80.1 (2026): 151-183.