<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd">
<article article-type="research-article" dtd-version="1.2" xml:lang="ru" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><front><journal-meta><journal-id journal-id-type="issn">2518-1092</journal-id><journal-title-group><journal-title>Научный результат. Информационные технологии</journal-title></journal-title-group><issn pub-type="epub">2518-1092</issn></journal-meta><article-meta><article-id pub-id-type="doi">10.18413/2518-1092-2026-11-1-0-4</article-id><article-id pub-id-type="publisher-id">4098</article-id><article-categories><subj-group subj-group-type="heading"><subject>ИСКУССТВЕННЫЙ ИНТЕЛЛЕКТ И ПРИНЯТИЕ РЕШЕНИЙ</subject></subj-group></article-categories><title-group><article-title>АНАЛИЗ ПРОСОДИЧЕСКИХ ПАРАМЕТРОВ ЭМОЦИОНАЛЬНО ОКРАШЕННОЙ РЕЧИ</article-title><trans-title-group xml:lang="en"><trans-title>ANALYSIS OF PROSODIC PARAMETERS OF EMOTIONALLY COLORED SPEECH</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author"><name-alternatives><name xml:lang="ru"><surname>Балабанова</surname><given-names>Татьяна Николаевна</given-names></name><name xml:lang="en"><surname>Balabanova</surname><given-names>Tatiana Nikolaevna</given-names></name></name-alternatives><email>sozonova@bsuedu.ru</email></contrib><contrib contrib-type="author"><name-alternatives><name xml:lang="ru"><surname>Белов</surname><given-names>Александр Сергеевич</given-names></name><name xml:lang="en"><surname>Belov</surname><given-names>Alexander Sergeevich</given-names></name></name-alternatives><email>belov_as@bsu.edu.ru</email></contrib><contrib contrib-type="author"><name-alternatives><name xml:lang="ru"><surname>Пашков</surname><given-names>Александр Сергеевич</given-names></name><name xml:lang="en"><surname>Pashkov</surname><given-names>Alexander Sergeevich</given-names></name></name-alternatives><email>Pogosad@yandex.ru</email></contrib><contrib
contrib-type="author"><name-alternatives><name xml:lang="ru"><surname>Маматов</surname><given-names>Евгений Михайлович</given-names></name><name xml:lang="en"><surname>Mamatov</surname><given-names>Evgeny Mikhailovich</given-names></name></name-alternatives><email>mamatov@bsuedu.ru</email></contrib></contrib-group><pub-date pub-type="epub"><year>2026</year></pub-date><volume>11</volume><issue>1</issue><fpage>0</fpage><lpage>0</lpage><self-uri content-type="pdf" xlink:href="/media/information/2026/1/НР.ИТ_11.1_4.pdf" /><abstract xml:lang="ru"><p>В работе представлено исследование просодических параметров эмоционально окрашенной речи на русском языке. Целью исследования является выявление наиболее информативных акустических признаков, позволяющих различать эмоциональные состояния говорящего. В качестве экспериментальных данных использовались аудиозаписи из корпуса эмоциональной речи Dusha, включающие четыре эмоциональных состояния: злость, радость, грусть и нейтральную речь. Всего было проанализировано 240 аудиофайлов, содержащих записи мужской и женской речи.

В работе были извлечены и исследованы просодические характеристики речевого сигнала, включающие параметры высоты основного тона, энергетические, темпоральные и фонационные признаки. Для анализа данных применялся комплекс статистических методов и методов машинного обучения, включающий корреляционный анализ, оценку важности признаков с использованием алгоритма Random Forest, а также анализ главных компонент (Principal Component Analysis, PCA).

Результаты эксперимента показали, что наибольшую информативность для распознавания эмоций в речи имеют энергетические и интонационные характеристики сигнала, в частности средняя энергия речи, вариативность частоты основного тона, темп речи и среднее значение F0. Проведённый анализ позволил выделить компактное пространство признаков и выявить характерные акустические профили для различных эмоциональных состояний. Полученные результаты могут быть использованы при разработке систем автоматического распознавания эмоций в речевых сигналах и интеллектуальных речевых интерфейсов.</p></abstract><trans-abstract xml:lang="en"><p>This paper presents a study of prosodic parameters of emotionally colored speech in the Russian language. The aim of the study is to identify the most informative acoustic features that allow distinguishing the emotional state of a speaker. The experimental data consisted of audio recordings from the Dusha emotional speech dataset, including four emotional states: anger, joy, sadness, and neutral speech. In total, 240 audio recordings of both male and female speakers were analyzed.

The study focused on extracting and analyzing prosodic characteristics of speech signals, including pitch-related, energy, temporal, and phonation features. A combination of statistical analysis and machine learning methods was applied, including correlation analysis, feature importance estimation using the Random Forest algorithm, and Principal Component Analysis (PCA).

The experimental results demonstrate that energy and pitch-related characteristics of speech are the most informative features for emotion recognition. In particular, mean signal energy, variability of the fundamental frequency, speech rate, and mean F0 showed the highest contribution to emotion classification. The analysis made it possible to identify a compact feature space and to reveal characteristic acoustic profiles for different emotional states. The obtained results can be used in the development of automatic speech emotion recognition systems and intelligent speech-based human–computer interaction technologies.</p></trans-abstract><kwd-group xml:lang="ru"><kwd>просодические параметры речи</kwd><kwd>эмоциональная речь</kwd><kwd>распознавание эмоций</kwd><kwd>анализ речевых сигналов</kwd><kwd>частота основного тона</kwd><kwd>машинное обучение</kwd><kwd>Random Forest</kwd><kwd>анализ главных компонент</kwd><kwd>акустические признаки</kwd></kwd-group><kwd-group xml:lang="en"><kwd>prosodic parameters</kwd><kwd>emotional speech</kwd><kwd>speech emotion recognition</kwd><kwd>speech signal analysis</kwd><kwd>fundamental frequency</kwd><kwd>machine learning</kwd><kwd>Random Forest</kwd><kwd>principal component analysis</kwd><kwd>acoustic features</kwd></kwd-group></article-meta></front><back><ref-list><title>Список литературы</title><ref id="B1"><mixed-citation>1. Scherer K.R. Vocal communication of emotion: A review of research paradigms // Speech Communication. – 2003. – Vol. 40, № 1–2. – P. 227-256.</mixed-citation></ref><ref id="B2"><mixed-citation>2. Scherer K.R., Wallbott H.G. Evidence for universality and cultural variation of emotional expression in voice // Journal of Cross-Cultural Psychology. – 1994. – Vol. 25, № 1. – P. 
92-110.</mixed-citation></ref><ref id="B3"><mixed-citation>3. Bänziger T., Scherer K.R. The role of intonation in emotional expressions // Speech Communication. – 2005. – Vol. 46. – P. 252-267.</mixed-citation></ref><ref id="B4"><mixed-citation>4. Schuller B., Batliner A. Computational Paralinguistics: Emotion, Affect and Personality in Speech and Language Processing. – Chichester: Wiley, 2014. – 324 p.</mixed-citation></ref><ref id="B5"><mixed-citation>5. Schröder M. Emotional speech synthesis: A review // Proceedings of the European Conference on Speech Communication and Technology. – Geneva, 2003. – P. 561-564.</mixed-citation></ref><ref id="B6"><mixed-citation>6. Busso C., Bulut M., Narayanan S. Toward effective automatic recognition systems of emotion in speech // IEEE Transactions on Audio, Speech, and Language Processing. – 2009. – Vol. 17, № 5. – P. 846-859.</mixed-citation></ref><ref id="B7"><mixed-citation>7. Narayanan S., Busso C. Analysis of emotional speech: A review // IEEE Signal Processing Magazine. – 2011. – Vol. 28, № 5. – P. 98-112.</mixed-citation></ref><ref id="B8"><mixed-citation>8. Ekman P. An argument for basic emotions // Cognition and Emotion. – 1992. – Vol. 6, № 3–4. – P. 169-200.</mixed-citation></ref><ref id="B10"><mixed-citation>9. Cowie R., Douglas-Cowie E. Emotion recognition in human-computer interaction // IEEE Signal Processing Magazine. – 2001. – Vol. 18, № 1. – P. 
32-80.</mixed-citation></ref><ref id="B11"><mixed-citation>10. Ververidis D., Kotropoulos C. Emotional speech recognition: Resources, features and methods // Speech Communication. – 2006. – Vol. 48, № 9. – P. 1162-1181.</mixed-citation></ref><ref id="B12"><mixed-citation>11. Rabiner L., Juang B.-H. Fundamentals of Speech Recognition. – New Jersey: Prentice Hall, 1993. – 507 p.</mixed-citation></ref><ref id="B13"><mixed-citation>12. Murray I., Arnott J. Toward the simulation of emotion in synthetic speech: A review of the literature on human vocal emotion // Journal of the Acoustical Society of America. – 1993. – Vol. 93, № 2. – P. 1097-1108.</mixed-citation></ref><ref id="B14"><mixed-citation>13. Banse R., Scherer K.R. Acoustic profiles in vocal emotion expression // Journal of Personality and Social Psychology. – 1996. – Vol. 70, № 3. – P. 614-636.</mixed-citation></ref><ref id="B15"><mixed-citation>14. El Ayadi M., Kamel M., Karray F. Survey on speech emotion recognition: Features, classification schemes and databases // Pattern Recognition. – 2011. – Vol. 44, № 3. – P. 572-587.</mixed-citation></ref><ref id="B16"><mixed-citation>15. Schuller B. Speech emotion recognition: Two decades in a nutshell, benchmarks and ongoing trends // Communications of the ACM. – 2018. – Vol. 61, № 5. – P. 90-99.</mixed-citation></ref><ref id="B17"><mixed-citation>16. Latif S., Rana R., Qadir J., Epps J. Speech emotion recognition: State-of-the-art review // IEEE Access. – 2021. – Vol. 9. – P. 114509-114539.</mixed-citation></ref><ref id="B18"><mixed-citation>17. Neumann M., Vu N.T. 
Improving speech emotion recognition with unsupervised representation learning on unlabeled speech // IEEE/ACM Transactions on Audio, Speech, and Language Processing. – 2021. – Vol. 29. – P. 2388-2399.</mixed-citation></ref><ref id="B19"><mixed-citation>18. Pepino L., Riera P., Ferrer L. Emotion recognition from speech using wav2vec 2.0 embeddings // Proceedings of the Interspeech Conference. – 2021. – P. 3400-3404.</mixed-citation></ref><ref id="B20"><mixed-citation>19. Wagner J., Triantafyllopoulos A., Schuller B. Deep learning in paralinguistics: Recent trends and perspectives // IEEE Signal Processing Magazine. – 2023. – Vol. 40, № 3. – P. 104-118.</mixed-citation></ref><ref id="B21"><mixed-citation>20. Zhang Z., Deng J., Schuller B. Advances in speech emotion recognition: A survey // IEEE Transactions on Affective Computing. – 2024. – Vol. 15, № 1. – P. 123-139.</mixed-citation></ref></ref-list></back></article>