“One” in 768 Dimensions: Using Language Models to Study the Usage of yi 一 on a Large Scale

Keßler F (2026)


Publication Type: Journal article

Publication year: 2026

Journal: Asiatische Studien - Études Asiatiques

Original Authors: Florian Keßler

Volume: 80

Page Range: 151-183

Issue: 1

DOI: 10.1515/asia-2025-0022

Abstract

With the availability of large corpora that can be queried using computational methods, the importance of considering large amounts of usage data to accurately describe the behavior of a word has increasingly been recognized. The most common method for such research is reading concordances, i.e. compilations of the occurrences of a word in their surrounding contexts, sampled from a corpus. This study proposes an unsupervised method for clustering concordances using high-dimensional contextual word embeddings, which capture semantic as well as syntactic information about a word, and tests it on yi 一 (one) in a large corpus of pre-modern Chinese. The results show that the approach yields clusters that can be fruitfully explored in a Construction Grammar framework. The clusters might be inconsistent, however, in the sense that instances of what a researcher would consider a single phenomenon can be split across more than one cluster.
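The general idea of grouping concordance lines by their contextual embeddings can be illustrated with a minimal sketch. It does not reproduce the article's pipeline: the abstract names neither the language model nor the clustering algorithm, so k-means is used here purely as an assumed stand-in. The four-dimensional vectors and the labels `line_1` to `line_5` are invented toy values; in a real setting each vector would be the 768-dimensional embedding of one occurrence of yi 一 in its context.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def init_centroids(vectors, k):
    """Deterministic farthest-point initialisation (an assumption made
    here so the sketch gives the same result on every run)."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors,
            key=lambda v: min(euclidean(v, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids

def kmeans(vectors, k, max_iter=100):
    """Plain k-means; returns one cluster label per input vector."""
    centroids = init_centroids(vectors, k)
    labels = [-1] * len(vectors)
    for _ in range(max_iter):
        # Assign every vector to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: euclidean(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments are stable: converged
            break
        labels = new_labels
        # Move each centroid to the mean of its current members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = mean_vector(members)
    return labels

# Hypothetical toy data: each key stands for one concordance line, each
# value for the embedding of the target word in that line.
toy_embeddings = {
    "line_1": [0.9, 0.1, 0.0, 0.1],
    "line_2": [0.8, 0.2, 0.1, 0.0],
    "line_3": [0.1, 0.9, 0.8, 0.0],
    "line_4": [0.0, 0.8, 0.9, 0.1],
    "line_5": [0.85, 0.15, 0.05, 0.05],
}

labels = kmeans(list(toy_embeddings.values()), k=2)

# Collect the concordance lines belonging to each cluster for reading.
clusters = {}
for name, label in zip(toy_embeddings, labels):
    clusters.setdefault(label, []).append(name)

print(clusters)  # {0: ['line_1', 'line_2', 'line_5'], 1: ['line_3', 'line_4']}
```

Each resulting cluster is a smaller set of concordance lines that can be read together. The number of clusters `k` has to be chosen in advance in this sketch, and a different choice may split lines that a researcher would treat as one phenomenon, which is the kind of inconsistency the abstract mentions.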

How to cite

APA:

Keßler, F. (2026). “One” in 768 Dimensions: Using Language Models to Study the Usage of yi 一 on a Large Scale. Asiatische Studien - Études Asiatiques, 80(1), 151-183. https://doi.org/10.1515/asia-2025-0022

MLA:

Keßler, Florian. "“One” in 768 Dimensions: Using Language Models to Study the Usage of yi 一 on a Large Scale." Asiatische Studien - Études Asiatiques 80.1 (2026): 151-183.
