Digital Resources and AI Technologies
for the Sustainability of the Latvian Language

Project updates


28.01.2026

Researchers from the DigiLATE project at the University of Latvia Institute of Mathematics and Computer Science participated in the international scientific conference “Grammar and Semantics”, organised by the Department of Latvian and Baltic Studies at the Faculty of Humanities, University of Latvia. Gunta Nešpore-Bērzkalne presented the paper “Variants of Multiword Lexemes in the Electronic Dictionary ‘Tēzaurs’”; Baiba Valkovska and Roberts Darģis presented a poster “Word Sketches in the National Corpus Collection”; and doctoral student Agute Klints presented a poster “Hierarchical Relations in the Latvian Lexical Network.”

22.12.2025

The agreement for the implementation of the DigiLATE project was signed.

Project information

The project "Digital Resources and AI Technologies for the Sustainability of the Latvian Language" (DigiLATE) is implemented within the framework of the National Research Programme "Letonika – Fostering a Latvian and European Society".

Project No: VPP-IZM-Letonika-2025/1-0004
Implementation period: 22.12.2025–21.12.2028
Project funding: 1 320 600 EUR
Funded by: Latvian Council of Science of the Ministry of Education and Science

Project partners: Institute of Mathematics and Computer Science of the University of Latvia (leading partner), University of Latvia Faculty of Humanities, Riga Technical University Rēzekne Academy

Project head: Ilze Auziņa (IMCS)

Contacts: [email protected] (IMCS)

Summary

The "Digital Resources and AI Technologies for the Sustainability of the Latvian Language" (DigiLATE) project aims to ensure the long-term sustainability of the Latvian language in a digital environment. This will be achieved by developing essential language resources for Latvian, enhancing digital research infrastructure and creating innovative, inclusive artificial intelligence solutions. Bringing together Latvia's leading institutions in linguistics, computational linguistics and digital humanities, the project intends to create substantial digital resources for Latvian and Latgalian. Specifically, DigiLATE plans to create and analyse speech recognition and synthesis systems; evaluate large language models for Latvian language applications; create new speech, text, and sign language corpora; and improve the main Latvian language resource platforms, Tēzaurs.lv and Korpuss.lv.

To continue developing modern language data, the project also plans to conduct linguistic research on the syntactic and prosodic marking of spontaneous speech, as well as research to advance natural language processing technology for the Latgalian language. DigiLATE's innovations in linguistics and artificial intelligence technologies will strengthen Latvia's position in digital humanities and ensure the development of citizen science and inclusive artificial intelligence solutions for people with special needs. Project results will be made available as open data, in accordance with FAIR principles, and integrated into European research infrastructures, such as CLARIN and DARIAH ERIC.

The tasks of DigiLATE are: (a) research and development of generative AI for use in Latvian-language tasks; (b) development of digital resources for Latvian, ensuring their integration into European language resource repositories; (c) development of Latvian language technology solutions, including solutions for people with disabilities; and (d) development of Latvian sign language technologies.

Research directions:

WP1 Evaluation and Adaptation of AI Models for Latvian
WP2 Development of Latvian Language Resources and Integration into European Multilingual Initiatives and Infrastructures
WP3 Research for the Development of New Language Resources and Tools
WP4 Infrastructure Development of Digital Resources and Tools for Latvian
WP5 Development of Latvian Sign Language Resources and Pilot Solutions



