<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd">
<article article-type="research-article" dtd-version="1.2" xml:lang="ru" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><front><journal-meta><journal-id journal-id-type="issn">2518-1092</journal-id><journal-title-group><journal-title>Research result. Information technologies</journal-title></journal-title-group><issn pub-type="epub">2518-1092</issn></journal-meta><article-meta><article-id pub-id-type="doi">10.18413/2518-1092-2021-6-1-0-2</article-id><article-id pub-id-type="publisher-id">2371</article-id><article-categories><subj-group subj-group-type="heading"><subject>INFORMATION SYSTEM AND TECHNOLOGIES</subject></subj-group></article-categories><title-group><article-title>IMPLEMENTATION OF THE SPEECH ACTIVITY DETECTING ALGORITHM AT CONDUCTING PARALINGUISTIC ANALYSIS</article-title><trans-title-group xml:lang="en"><trans-title>IMPLEMENTATION OF THE SPEECH ACTIVITY DETECTING ALGORITHM AT CONDUCTING PARALINGUISTIC ANALYSIS</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author"><name-alternatives><name xml:lang="ru"><surname>Diachenko</surname><given-names>Anna Vitalievna</given-names></name><name xml:lang="en"><surname>Diachenko</surname><given-names>Anna Vitalievna</given-names></name></name-alternatives><email>ayrimur@mail.ru</email></contrib><contrib contrib-type="author"><name-alternatives><name xml:lang="ru"><surname>Podolsky</surname><given-names>Dmitry Anatolievich</given-names></name><name xml:lang="en"><surname>Podolsky</surname><given-names>Dmitry Anatolievich</given-names></name></name-alternatives><email>podolsky.dmitry94@gmail.com</email></contrib></contrib-group><pub-date pub-type="epub"><year>2021</year></pub-date><volume>6</volume><issue>1</issue><fpage>0</fpage><lpage>0</lpage><self-uri content-type="pdf" xlink:href="/media/information/2021/1/ИТ_2.pdf" /><abstract xml:lang="ru"><p>Speech activity detection algorithms are now widely used.
Such algorithms are used in various tasks: transmitting a human speech stream, compressing audio recordings for storage, recognizing a person's state in paralinguistic analysis, and others. The goal of this work is to develop and implement an algorithm for detecting human speech activity in the Csound software environment. A number of methods for detecting human speech activity already exist, such as the rate determination algorithm, the adaptive multi-rate speech detection method, and the method based on analysis of the spectral shape and subband energy [13, 16, 17]; however, these algorithms have not yet been implemented in the Csound environment. This article categorizes speech features and describes an implemented algorithm for detecting speech activity, specifically for determining pauses in speech audio during paralinguistic analysis. The algorithm uses the Hilbert transform, which reduces its complexity while maintaining its accuracy. The work thus modifies and implements, in the Csound environment, an algorithm that detects speech activity in a room from the speech stream, for use in paralinguistic analysis of human speech.</p></abstract><trans-abstract xml:lang="en"><p>Speech activity detection algorithms are now widely used. Such algorithms are used in various tasks: transmitting a human speech stream, compressing audio recordings for storage, recognizing a person's state in paralinguistic analysis, and others. The goal of this work is to develop and implement an algorithm for detecting human speech activity in the Csound software environment. A number of methods for detecting human speech activity already exist, such as the rate determination algorithm, the adaptive multi-rate speech detection method, and the method based on analysis of the spectral shape and subband energy
[13, 16, 17]; however, these algorithms have not yet been implemented in the Csound environment. This article categorizes speech features and describes an implemented algorithm for detecting speech activity, specifically for determining pauses in speech audio during paralinguistic analysis. The algorithm uses the Hilbert transform, which reduces its complexity while maintaining its accuracy. The work thus modifies and implements, in the Csound environment, an algorithm that detects speech activity in a room from the speech stream, for use in paralinguistic analysis of human speech.</p></trans-abstract><kwd-group xml:lang="ru"><kwd>voice activity detection</kwd><kwd>Csound</kwd><kwd>paralinguistic analysis</kwd><kwd>speech activity</kwd></kwd-group><kwd-group xml:lang="en"><kwd>voice activity detection</kwd><kwd>Csound</kwd><kwd>paralinguistic analysis</kwd><kwd>speech activity</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="B1"><mixed-citation>Ayvazyan O.O. Verbal and non-verbal communication as factors of speech development // Conference "Strategic directions of sustainable development of socio-economic policy of the southern region". Maykop, 2012.</mixed-citation></ref><ref id="B2"><mixed-citation>Basov O.O., Karpov A.A., Saitov I.A. Methodological foundations for the synthesis of polymodal infocommunication systems of public administration. Oryol: Academy of the FSO of Russia, 2015. 271 p.</mixed-citation></ref><ref id="B3"><mixed-citation>Vasilik M.A. Para- and extralinguistic features of non-verbal communication // Elitarium. 2018. URL: www.elitarium.ru/neverbalnoe-obshhenie-temp-rech-golos-informacija-kommunikacija-intonacija-vnimanie (date of access: 16.12.2020).</mixed-citation></ref><ref id="B4"><mixed-citation>Velichko A.N., Budkov V.Yu., Karpov A.A.
Analytical review of computer paralinguistic systems for automatic recognition of lies in human speech // Information and control systems. 2017. No. 5(90). P. 30-41.</mixed-citation></ref><ref id="B5"><mixed-citation>Werderber R., Werderber K. Psychology of communication: Secrets of effective interaction. M.: Prime-EVROZNAK: Olma-Press, 2003. 320 p.</mixed-citation></ref><ref id="B6"><mixed-citation>Karpov A.A., Kaya H., Salah A.A. Current problems and achievements of systems for paralinguistic speech analysis // Scientific and technical bulletin of information technologies, mechanics and optics. 2016. Vol. 16. No. 4. P. 581-592.</mixed-citation></ref><ref id="B7"><mixed-citation>Potapova R.K., Bobrov N.V. Main trends in the development of the interdisciplinary concept "Analysis-synthesis-analysis of speech" // Mathematical methods in engineering and technology. 2019. Vol. 7. P. 124-129.</mixed-citation></ref><ref id="B8"><mixed-citation>Nikiforov S.N., Nikiforov D.S., Vitorsky I.I., Tanyukevich M.S. Practical algorithm for determining the rate of speech for use in contact centers // Speech technologies. 2020. No. 1. P. 6-12.</mixed-citation></ref><ref id="B9"><mixed-citation>Simonchik K.K., Galinina O.S., Kapustin A.I. Algorithm for detecting speech activity based on pitch statistics in the problem of speaker recognition // Scientific and technical bulletins of the St. Petersburg State Polytechnic University. Computer science, telecommunications and management. 2010. No. 4 (103). P. 23-31.</mixed-citation></ref><ref id="B10"><mixed-citation>Chukhrova M.G. The relationship between the psychoemotional state of primary schoolchildren and their voice-speech characteristics // Science and society: Proc. of the All-Russian scientific and practical conf. with int. participation.
March 1, 2018. Novosibirsk: CHUDPO, 2018. P. 99-104.</mixed-citation></ref><ref id="B11"><mixed-citation>Shelukhin O.I., Lukyantsev V.G. Digital processing and transmission of speech. M.: Radio and communication, 2000. 456 p.</mixed-citation></ref><ref id="B12"><mixed-citation>Volchenkov V.A., Vityazev V.V. Methods and algorithms for using speech activity detection // Digital signal processing. 2013. No. 1. P. 54-60.</mixed-citation></ref><ref id="B13"><mixed-citation>Benyassine A., Shlomot E., Su H.-Y., Massaloux D. Silence compression scheme for use with G.729 optimized for V.70 digital simultaneous voice and data applications // IEEE Commun. Mag. 1997. Vol. 35, No. 9. P. 64-73.</mixed-citation></ref><ref id="B14"><mixed-citation>Boulanger R. The Csound Book: Perspectives in Software Synthesis, Sound Design, Signal Processing, and Programming. Cambridge: MIT Press, 2000. 782 p.</mixed-citation></ref><ref id="B15"><mixed-citation>Kondoz A.M. Digital Speech: Coding for Low Bit Rate Communication Systems. John Wiley &amp; Sons, Ltd., 2004. 442 p.</mixed-citation></ref><ref id="B16"><mixed-citation>Prasad R. Comparison of Voice Activity Detection Algorithms for VoIP // Proc. 7th IEEE Symp. on Computer Science. 2005. P. 567-576.</mixed-citation></ref><ref id="B17"><mixed-citation>Sunil Kumar S.B., Sreenivasa Rao K. Voice/non-voice detection using phase of zero frequency filtered speech signal // Speech Communication. 2016. No. 81. P. 90-103.</mixed-citation></ref><ref id="B18"><mixed-citation>Vercoe B. Csound: A Manual for the Audio-Processing System. MIT Media Lab, 1995. 341 p.</mixed-citation></ref></ref-list></back></article>