DOI: 10.18413/2518-1092-2026-11-1-0-3

SYSTEM ARCHITECTURE FOR ASR OF AGGLUTINATIVE LOW-RESOURCE LANGUAGES

Olga Viktorovna Timchenko
Darya Konstantinovna Alekseeva
Zalina Khamidbievna Abregova
Valeriya Andreevna Grechko

The relevance of this research stems from the need to narrow the digital divide, which is particularly acute for low-resource languages. While speakers of widely spoken languages actively use voice assistants, transcription systems, and other speech technologies, small indigenous peoples are left behind by digital progress. This inequality deprives them of access to modern means of communication, education, and information in their native language, which deepens their marginalization and accelerates language extinction. Developing specialized automatic speech recognition (ASR) solutions for low-resource conditions is a key step towards expanding technological accessibility. The article addresses the problem of developing ASR systems for low-resource languages, using Kabardian as the target language. It presents a comprehensive approach that includes adaptation of the Massively Multilingual Speech (MMS) model, data preprocessing, and the development and integration of language models for post-processing. The main focus is on the MMS architecture, which is based on Wav2Vec 2.0, and on its modification with Language-Specific Adapter Heads (LSAH), which enables efficient fine-tuning on limited datasets. The stages of audio and text data preprocessing are described. The architectures of n-gram (3-gram, 5-gram) and neural (mT5-base) language models, and the results of applying them to correct errors in the ASR output, are considered. The practical significance of the work is confirmed by the creation of a functional open-source system with a web interface on the Hugging Face Spaces platform, which demonstrates the feasibility of building effective ASR solutions for minority languages.
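
To illustrate the n-gram post-processing idea mentioned in the abstract, the following is a minimal sketch of rescoring candidate ASR hypotheses with a small n-gram language model. It is not the authors' implementation: the abstract does not specify the tooling, so the function names, the add-one smoothing, the `lm_weight` parameter, the acoustic scores, and the toy English corpus (used here in place of Kabardian text) are all illustrative assumptions.

```python
import math
from collections import Counter


def train_ngram_counts(sentences, n):
    """Count n-grams and their (n-1)-token contexts over whitespace-tokenized text."""
    ngrams, contexts, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] * (n - 1) + sentence.split() + ["</s>"]
        vocab.update(tokens)
        for i in range(n - 1, len(tokens)):
            context = tuple(tokens[i - n + 1:i])
            ngrams[context + (tokens[i],)] += 1
            contexts[context] += 1
    return ngrams, contexts, vocab


def log_prob(sentence, model, n):
    """Add-one smoothed log-probability of a sentence (an assumed smoothing choice)."""
    ngrams, contexts, vocab = model
    tokens = ["<s>"] * (n - 1) + sentence.split() + ["</s>"]
    total = 0.0
    for i in range(n - 1, len(tokens)):
        context = tuple(tokens[i - n + 1:i])
        count = ngrams[context + (tokens[i],)]
        total += math.log((count + 1) / (contexts[context] + len(vocab)))
    return total


def rescore(hypotheses, model, n, lm_weight=0.5):
    """Return the text maximizing acoustic score + lm_weight * LM log-probability."""
    return max(
        hypotheses,
        key=lambda h: h[1] + lm_weight * log_prob(h[0], model, n),
    )[0]


# Toy corpus and hypotheses (hypothetical data, English stand-in for Kabardian).
corpus = ["the model reads text", "the model reads speech", "a model reads text"]
model = train_ngram_counts(corpus, n=3)

# Each hypothesis is (text, acoustic log-score); the misspelled one scores
# slightly higher acoustically but is penalized by the language model.
hypotheses = [("the model reeds text", -4.0), ("the model reads text", -4.2)]
best = rescore(hypotheses, model, n=3)
print(best)
```

In this sketch the language model overrides a small acoustic preference for the misspelled hypothesis because the trigram "the model reeds" never occurs in the training text. A production system would more likely use a dedicated n-gram toolkit and integrate the model into beam search rather than rescoring a fixed list of hypotheses.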

Keywords: automatic speech recognition (ASR), low-resource languages, Kabardian language, MMS (Massively Multilingual Speech), Wav2Vec 2.0, adapters, language models, post-processing, n-grams, mT5.


Information for citation:

Timchenko O.V., Alekseeva D.K., Abregova Z.Kh., Grechko V.A. System Architecture for ASR of Agglutinative Low-Resource Languages // Research result. Information technologies. – Vol. 11, No. 1, 2026. – P. 20-28. DOI: 10.18413/2518-1092-2026-11-1-0-3




Research result. Information technologies (ISSN 2518-1092)

The journal materials and website are licensed under Creative Commons Attribution 4.0 International.
