Experience of syntactic annotation of Turkic languages

Authors

DOI:

https://doi.org/10.26577/EJPh202519932

Abstract

The article examines approaches to describing the syntactic structure of Turkic languages from the perspective of formal grammar, drawing on modern annotation models. Syntactic annotation is an important tool for formally describing a language’s grammatical system and enabling its automatic processing. Relying on projects such as Universal Dependencies (UD), MaTT (Multilingual Aligned Treebank of Turkic), and the Kazakh Dependency Treebank (KazDT), the study describes the morphological and syntactic features characteristic of Turkic languages.
Various syntactic annotation models, including “phrase structure grammar,” “hybrid,” and “dependency grammar” models, are analyzed in terms of their characteristics and differences, as well as their advantages and disadvantages for Turkic languages. The results demonstrate that an annotation model based on dependency grammar (head-dependent relations) provides an effective means of describing the structure of Turkic languages. The article outlines the theoretical foundations of dependency grammar and the formats and standards for syntactic annotation. It also discusses how the agglutinative morphology and free word order of Turkic languages are accommodated in universal projects such as UD.

In addition, future directions are identified, including the enhancement of annotated corpora for the Kazakh language, automatic parsing, and integration into language education systems. The article aims to scientifically substantiate syntactic annotation as one of the key steps in integrating the Kazakh language into the digital space, building on the syntactic annotation experience of Turkic languages.

Keywords: Turkic languages, syntactic annotation, dependency grammar (head-dependent relations), UD, KazDT, formal models, parsing.

Author Biographies

L. Alimtayeva, Al-Farabi Kazakh National University, Almaty, Kazakhstan

Alimtaeva Lazzat – Candidate of Philological Sciences, Al-Farabi Kazakh National University (Kazakhstan, Almaty, *e-mail: alimtayeva.lazzat@gmail.com);

D. Tokmyrzayev, Al-Farabi Kazakh National University, Almaty, Kazakhstan

Tokmyrzaev Darkhan – Programmer, Al-Farabi Kazakh National University (Almaty, Kazakhstan, e-mail: dark.han@mail.ru);

K. Pirmanova, Al-Farabi Kazakh National University, Almaty, Kazakhstan

Pirmanova Kunsulu – PhD, Postdoctoral Researcher, Al-Farabi Kazakh National University (Kazakhstan, Almaty, e-mail: kunsulu.pirmanova@mail.ru).

How to Cite

Alimtayeva L., Tokmyrzayev D., & Pirmanova K. (2025). Experience of syntactic annotation of Turkic languages. Eurasian Journal of Philology. Science and Education, 199(3), 17–29. https://doi.org/10.26577/EJPh202519932