2518-1092

Research result. Information technologies

10.18413/2518-1092-2021-6-1-0-2

2371

INFORMATION SYSTEM AND TECHNOLOGIES

IMPLEMENTATION OF THE SPEECH ACTIVITY DETECTING ALGORITHM AT CONDUCTING PARALINGUISTIC ANALYSIS

Diachenko

Anna Vitalievna

ayrimur@mail.ru

Podolsky

Dmitry Anatolievich

podolsky.dmitry94@gmail.com

2021

6100

Speech activity detection algorithms are now widely used in a variety of tasks: transmitting a human speech stream, compressing audio recordings for storage, recognizing a person's state in paralinguistic analysis, and others. The goal of this work is to develop and implement an algorithm for detecting human speech activity in the Csound software environment. A number of methods for detecting human speech activity already exist, such as the speed determination algorithm, the adaptive multi-rate speech detection method, and the method based on analysis of the spectral shape and subband energy [13, 16, 17]; however, these algorithms have not yet been implemented in Csound. This article categorizes speech features and describes an implemented speech activity detection algorithm that determines pauses in speech audio for paralinguistic analysis. The algorithm uses the Hilbert transform, which reduces its complexity while maintaining its accuracy. Specifically, an existing algorithm is modified and implemented in Csound to detect speech activity in a room from the speech stream, for use in paralinguistic analysis of human speech.

voice activity detection; Csound; paralinguistic analysis; speech activity
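The abstract names the Hilbert transform as the basis for pause detection but gives no details. One common way to use it, sketched here in Python rather than Csound, is to take the magnitude of the analytic signal as an amplitude envelope and mark low-envelope frames as pauses. This is an illustrative assumption, not the authors' implementation: the function names `analytic_signal` and `detect_pauses`, the 20 ms frame length, and the relative threshold of 0.1 are all hypothetical choices.

```python
import numpy as np


def analytic_signal(x):
    """Analytic signal of a real sequence via the FFT-based Hilbert transform."""
    n = len(x)
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # and zero out negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def detect_pauses(x, sample_rate, frame_ms=20.0, threshold_ratio=0.1):
    """Return one boolean per frame: True where the frame looks like a pause.

    frame_ms and threshold_ratio are assumed values, not taken from the article.
    """
    envelope = np.abs(analytic_signal(x))
    frame_len = int(sample_rate * frame_ms / 1000.0)
    n_frames = len(envelope) // frame_len
    frames = envelope[:n_frames * frame_len].reshape(n_frames, frame_len)
    frame_level = frames.mean(axis=1)
    # A frame counts as a pause if its mean envelope is far below the loudest frame.
    threshold = threshold_ratio * frame_level.max()
    return frame_level < threshold


# Synthetic test signal: a 200 Hz tone with a silent gap in the middle.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 200 * t)
signal[3000:5000] = 0.0

pauses = detect_pauses(signal, sample_rate)
print(np.flatnonzero(pauses))  # indices of frames classified as pauses
```

A fixed relative threshold is the simplest possible decision rule; real recordings with background noise would likely need an adaptive noise-floor estimate and some smoothing of the frame decisions.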

References

Ayvazyan O.O. Verbal and non-verbal communication as factors of speech development // Conference "Strategic directions of sustainable development of socio-economic policy of the southern region". Maykop, 2012.

Basov O.O., Karpov A.A., Saitov I.A. Methodological foundations for the synthesis of polymodal infocommunication systems of public administration. Oryol: Academy of the FSO of Russia, 2015. – 271 p.

Vasilik M.A. Para- and extralinguistic features of non-verbal communication // Elitarium. 2018. URL: www.elitarium.ru/neverbalnoe-obshhenie-temp-rech-golos-informacija-kommunikacija-intonacija-vnimanie (date of access: 16.12.2020).

Velichko A.N., Budkov V.Yu., Karpov A.A. Analytical review of computer paralinguistic systems for automatic recognition of lies in human speech // Information and control systems. 2017. – No. 5(90). – P. 30-41.

Werderber R., Werderber K. Psychology of communication: Secrets of effective interaction. M.: Prime-EVROZNAK: Olma-Press, 2003. – 320 p.

Karpov A.A., Kaya H., Salah A.A. Actual problems and achievements of systems for paralinguistic speech analysis // Scientific and technical bulletin of information technologies, mechanics and optics. 2016. – Vol. 16. – No. 4. – P. 581-592.

Potapova R.K., Bobrov N.V. Main trends in the development of the interdisciplinary concept "Analysis-synthesis-analysis of speech" // Mathematical methods in engineering and technology. 2019. – Vol. 7. – P. 124-129.

Practical algorithm for determining the rate of speech for use in contact centers / Nikiforov S.N., Nikiforov D.S., Vitorsky I.I., Tanyukevich M.S. // Speech technologies. 2020. – No. 1. – P. 6-12.

Simonchik K.K., Galinina O.S., Kapustin A.I. Algorithm for detecting speech activity based on pitch statistics in the problem of speaker recognition // Scientific and technical bulletins of the St. Petersburg State Polytechnic University. Computer science, telecommunications and management. 2010. No. 4 (103). P. 23-31.

Chukhrova M.G. The relationship between the psychoemotional state of primary schoolchildren and their voice-speech characteristics // Science and society: Proceedings of the All-Russian scientific and practical conference with international participation. March 1, 2018. – Novosibirsk: CHUDPO, 2018. – P. 99-104.

Shelukhin O.I., Lukyantsev V.G. Digital processing and transmission of speech. M.: Radio and communication, 2000. – 456 p.

Volchenkov V.A., Vityazev V.V. Methods and algorithms for using speech activity detection // Digital signal processing. – 2013. – No. 1. – P. 54-60.

Benyassine A., Shlomot E., Su H.-Y., Massaloux D. Silence compression scheme for use with G.729 // Digital simultaneous voice and data applications. IEEE Commun. Mag. 1997. – No. 35(9). – P. 64-73.

Boulanger R. The Csound Book: Perspectives in Software Synthesis, Sound Design, Signal Processing, and Programming. Cambridge: MIT Press, 2000. – 782 p.

Kondoz A.M. Digital Speech: Coding for Low Bit Rate Communication Systems. John Wiley & Sons, Ltd., 2004. – 442 p.

Prasad R. Comparison of Voice Activity Detection Algorithms for VoIP // In Proc. 7th IEEE Symp. on Computer Science. 2005. – P. 567-576.

Sunil Kumar S.B., Sreenivasa Rao K. Voice/non-voice detection using phase of zero frequency filtered speech signal // Speech Communication. – 2016. – No. 81. – P. 90-103.

Vercoe B. Csound: A Manual for the Audio-Processing System. MIT Media Lab, 1995. – 341 p.