<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd">
<article article-type="research-article" dtd-version="1.2" xml:lang="ru" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><front><journal-meta><journal-id journal-id-type="issn">2518-1092</journal-id><journal-title-group><journal-title>Research result. Information technologies</journal-title></journal-title-group><issn pub-type="epub">2518-1092</issn></journal-meta><article-meta><article-id pub-id-type="doi">10.18413/2518-1092-2026-11-1-0-4</article-id><article-id pub-id-type="publisher-id">4098</article-id><article-categories><subj-group subj-group-type="heading"><subject>ARTIFICIAL INTELLIGENCE AND DECISION MAKING</subject></subj-group></article-categories><title-group><article-title>&lt;strong&gt;ANALYSIS OF PROSODIC PARAMETERS&amp;nbsp;OF EMOTIONALLY COLORED SPEECH&lt;/strong&gt;</article-title><trans-title-group xml:lang="en"><trans-title>&lt;strong&gt;ANALYSIS OF PROSODIC PARAMETERS&amp;nbsp;OF EMOTIONALLY COLORED SPEECH&lt;/strong&gt;</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author"><name-alternatives><name xml:lang="ru"><surname>Balabanova</surname><given-names>Tatiana Nikolaevna</given-names></name><name xml:lang="en"><surname>Balabanova</surname><given-names>Tatiana Nikolaevna</given-names></name></name-alternatives><email>sozonova@bsuedu.ru</email></contrib><contrib contrib-type="author"><name-alternatives><name xml:lang="ru"><surname>Belov</surname><given-names>Alexander Sergeevich</given-names></name><name xml:lang="en"><surname>Belov</surname><given-names>Alexander Sergeevich</given-names></name></name-alternatives><email>belov_as@bsu.edu.ru</email></contrib><contrib contrib-type="author"><name-alternatives><name xml:lang="ru"><surname>Pashkov</surname><given-names>Alexander Sergeevich</given-names></name><name xml:lang="en"><surname>Pashkov</surname><given-names>Alexander Sergeevich</given-names></name></name-alternatives><email>Pogosad@yandex.ru</email></contrib><contrib contrib-type="author"><name-alternatives><name xml:lang="ru"><surname>Mamatov</surname><given-names>Evgeny Mikhailovich</given-names></name><name xml:lang="en"><surname>Mamatov</surname><given-names>Evgeny Mikhailovich</given-names></name></name-alternatives><email>mamatov@bsuedu.ru</email></contrib></contrib-group><pub-date pub-type="epub"><year>2026</year></pub-date><volume>11</volume><issue>1</issue><fpage>0</fpage><lpage>0</lpage><self-uri content-type="pdf" xlink:href="/media/information/2026/1/НР.ИТ_11.1_4.pdf" /><abstract xml:lang="ru"><p>This paper presents a study of prosodic parameters of emotionally colored speech in the Russian language. The aim of the study is to identify the most informative acoustic features that allow distinguishing the emotional state of a speaker. The experimental data consisted of audio recordings from the Dusha emotional speech dataset, including four emotional states: anger, joy, sadness, and neutral speech. In total, 240 audio recordings of both male and female speakers were analyzed.

The study focused on extracting and analyzing prosodic characteristics of speech signals, including pitch-related, energy, temporal, and phonation features. A combination of statistical analysis and machine learning methods was applied, including correlation analysis, feature importance estimation using the Random Forest algorithm, and Principal Component Analysis (PCA).

The experimental results demonstrate that energy and pitch-related characteristics of speech are the most informative features for emotion recognition. In particular, mean signal energy, variability of the fundamental frequency, speech rate, and mean F0 made the largest contribution to emotion classification. The analysis made it possible to identify a compact feature space and to reveal characteristic acoustic profiles for the different emotional states. The results obtained can be used in the development of automatic speech emotion recognition systems and intelligent speech-based human–computer interaction technologies.</p></abstract><trans-abstract xml:lang="en"><p>This paper presents a study of the prosodic parameters of emotionally colored speech in the Russian language. The aim of the study is to identify the acoustic features that are most informative for distinguishing the emotional state of a speaker. The experimental data consisted of audio recordings from the Dusha emotional speech dataset, covering four emotional states: anger, joy, sadness, and neutral speech. In total, 240 audio recordings of both male and female speakers were analyzed.

The study focused on extracting and analyzing prosodic characteristics of speech signals, including pitch-related, energy, temporal, and phonation features. A combination of statistical analysis and machine learning methods was applied, including correlation analysis, feature importance estimation using the Random Forest algorithm, and Principal Component Analysis (PCA).

The experimental results demonstrate that energy and pitch-related characteristics of speech are the most informative features for emotion recognition. In particular, mean signal energy, variability of the fundamental frequency, speech rate, and mean F0 made the largest contribution to emotion classification. The analysis made it possible to identify a compact feature space and to reveal characteristic acoustic profiles for the different emotional states. The results obtained can be used in the development of automatic speech emotion recognition systems and intelligent speech-based human–computer interaction technologies.</p></trans-abstract><kwd-group xml:lang="ru"><kwd>prosodic parameters</kwd><kwd>emotional speech</kwd><kwd>speech emotion recognition</kwd><kwd>speech signal analysis</kwd><kwd>fundamental frequency</kwd><kwd>machine learning</kwd><kwd>Random Forest</kwd><kwd>principal component analysis</kwd><kwd>acoustic features</kwd></kwd-group><kwd-group xml:lang="en"><kwd>prosodic parameters</kwd><kwd>emotional speech</kwd><kwd>speech emotion recognition</kwd><kwd>speech signal analysis</kwd><kwd>fundamental frequency</kwd><kwd>machine learning</kwd><kwd>Random Forest</kwd><kwd>principal component analysis</kwd><kwd>acoustic features</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title>
<ref id="B1"><mixed-citation>1. Scherer K.R. Vocal communication of emotion: A review of research paradigms // Speech Communication. – 2003. – Vol. 40, № 1–2. – P. 227–256.</mixed-citation></ref>
<ref id="B2"><mixed-citation>2. Scherer K.R., Wallbott H.G. Evidence for universality and cultural variation of emotional expression in voice // Journal of Cross-Cultural Psychology. – 1994. – Vol. 25, № 1. – P. 92–110.</mixed-citation></ref>
<ref id="B3"><mixed-citation>3. Bänziger T., Scherer K.R. The role of intonation in emotional expressions // Speech Communication. – 2005. – Vol. 46. – P. 252–267.</mixed-citation></ref>
<ref id="B4"><mixed-citation>4. Schuller B., Batliner A. Computational Paralinguistics: Emotion, Affect and Personality in Speech and Language Processing. – Chichester: Wiley, 2014. – 324 p.</mixed-citation></ref>
<ref id="B5"><mixed-citation>5. Schröder M. Emotional speech synthesis: A review // Proceedings of the European Conference on Speech Communication and Technology. – Geneva, 2003. – P. 561–564.</mixed-citation></ref>
<ref id="B6"><mixed-citation>6. Busso C., Bulut M., Narayanan S. Toward effective automatic recognition systems of emotion in speech // IEEE Transactions on Audio, Speech, and Language Processing. – 2009. – Vol. 17, № 5. – P. 846–859.</mixed-citation></ref>
<ref id="B7"><mixed-citation>7. Narayanan S., Busso C. Analysis of emotional speech: A review // IEEE Signal Processing Magazine. – 2011. – Vol. 28, № 5. – P. 98–112.</mixed-citation></ref>
<ref id="B8"><mixed-citation>8. Ekman P. An argument for basic emotions // Cognition and Emotion. – 1992. – Vol. 6, № 3–4. – P. 169–200.</mixed-citation></ref>
<ref id="B9"><mixed-citation>9. Cowie R., Douglas-Cowie E. Emotion recognition in human-computer interaction // IEEE Signal Processing Magazine. – 2001. – Vol. 18, № 1. – P. 32–80.</mixed-citation></ref>
<ref id="B10"><mixed-citation>10. Ververidis D., Kotropoulos C. Emotional speech recognition: Resources, features and methods // Speech Communication. – 2006. – Vol. 48, № 9. – P. 1162–1181.</mixed-citation></ref>
<ref id="B11"><mixed-citation>11. Rabiner L., Juang B.-H. Fundamentals of Speech Recognition. – New Jersey: Prentice Hall, 1993. – 507 p.</mixed-citation></ref>
<ref id="B12"><mixed-citation>12. Murray I., Arnott J. Toward the simulation of emotion in synthetic speech: A review of the literature on human vocal emotion // Journal of the Acoustical Society of America. – 1993. – Vol. 93, № 2. – P. 1097–1108.</mixed-citation></ref>
<ref id="B13"><mixed-citation>13. Banse R., Scherer K.R. Acoustic profiles in vocal emotion expression // Journal of Personality and Social Psychology. – 1996. – Vol. 70, № 3. – P. 614–636.</mixed-citation></ref>
<ref id="B14"><mixed-citation>14. El Ayadi M., Kamel M., Karray F. Survey on speech emotion recognition: Features, classification schemes and databases // Pattern Recognition. – 2011. – Vol. 44, № 3. – P. 572–587.</mixed-citation></ref>
<ref id="B15"><mixed-citation>15. Schuller B. Speech emotion recognition: Two decades in a nutshell, benchmarks and ongoing trends // Communications of the ACM. – 2018. – Vol. 61, № 5. – P. 90–99.</mixed-citation></ref>
<ref id="B16"><mixed-citation>16. Latif S., Rana R., Qadir J., Epps J. Speech emotion recognition: State-of-the-art review // IEEE Access. – 2021. – Vol. 9. – P. 114509–114539.</mixed-citation></ref>
<ref id="B17"><mixed-citation>17. Neumann M., Vu N.T. Improving speech emotion recognition with unsupervised representation learning on unlabeled speech // IEEE/ACM Transactions on Audio, Speech, and Language Processing. – 2021. – Vol. 29. – P. 2388–2399.</mixed-citation></ref>
<ref id="B18"><mixed-citation>18. Pepino L., Riera P., Ferrer L. Emotion recognition from speech using wav2vec 2.0 embeddings // Proceedings of the Interspeech Conference. – 2021. – P. 3400–3404.</mixed-citation></ref>
<ref id="B19"><mixed-citation>19. Wagner J., Triantafyllopoulos A., Schuller B. Deep learning in paralinguistics: Recent trends and perspectives // IEEE Signal Processing Magazine. – 2023. – Vol. 40, № 3. – P. 104–118.</mixed-citation></ref>
<ref id="B20"><mixed-citation>20. Zhang Z., Deng J., Schuller B. Advances in speech emotion recognition: A survey // IEEE Transactions on Affective Computing. – 2024. – Vol. 15, № 1. – P. 123–139.</mixed-citation></ref>
</ref-list></back></article>