DOI: 10.18413/2518-1092-2017-2-2-40-48

THE USE OF ORTHOGONAL BASES FOR THE STEGANOGRAPHIC CODING OF INFORMATION IN MULTIMEDIA

Abstract

This article discusses methods for the steganographic encoding of additional information using three different orthogonal bases. The bases are formed by functions that occupy different bandwidths in the spectrum. An approach based on the selection of DCT coefficients is compared with the approaches used in the spread spectrum and subband projection methods. Ways of choosing the coefficients used for embedding are considered; to ensure secrecy, the coefficient is determined adaptively from the energy structure of the speech signal segment. The criteria for evaluating steganographic encoding are the secrecy and the decoding accuracy of the control information, where the control information is a sequence of numbers in binary form. For the proposed adaptation principles, numerical experiments yield estimates that characterize secrecy: mean square error, Itakura-Saito distance, and correlation. The computational experiments also determine the bit error probability at different signal-to-noise ratios. The corresponding experiments were carried out for all of the outlined approaches.


INTRODUCTION

Speech is the most common and natural means of transmitting information between people. To transmit speech over a distance, it is recorded and the recording is converted into a code sequence. Coding methods share a number of characteristic operations, one of which is the removal of redundancy in order to reduce the volume of the transmitted code combinations. With a strong reduction in volume (a high compression ratio), changes may occur that make the reproduced speech differ substantially from the original. Often this does not affect the transmission of information. When the information is important and its authenticity must be ensured, but the channel capacity does not allow redundant information to be transmitted, cryptographic methods can be used to guarantee authenticity.

Redundancy-reduction procedures, like cryptographic methods, are combined with the use of psychoacoustic models. Naturally, to achieve maximum compression, all time-frequency components that carry redundancy, as determined by psychoacoustic models, are removed [1, 2]. This makes it impossible to use redundant time-frequency components for coding additional information. By additional information we mean a digital code that makes it possible to verify the authenticity of speech.

MAIN PART

To solve the problem of coding additional information, it is proposed to use methods based on a mathematical approach different from the one used for compression. It is worth noting that, to ensure robustness, the information must be encoded in one space (hereafter, the coding space) and decoded in another (hereafter, the decoding space).

Ensuring secrecy is one of the requirements imposed on steganographic methods [1-8]. Secrecy is achieved when coding is carried out in the component(s) containing the smallest share of energy (fig. 1).

Fig. 1. Audio signal segments: a) the sound "sh"; b) spectrum of the sound "sh"; c) the sound "o"; d) spectrum of the sound "o"

At the same time, it is no less important to keep the probability of an error in decoding the hidden information close to zero. The probability of a decoding error can be reduced by coding the information in the signal component(s) that hold the overwhelming share of the energy of the synthesized segment (fig. 1). Hence a trade-off between robustness and secrecy arises, and it is resolved by the choice of the component in which coding is carried out. Choosing a fixed threshold, or coding in a predetermined number of components, does not provide the necessary secrecy [9]; fig. 1 illustrates this. As the spectra show (fig. 1, b and d), the sounds "o" and "sh" have different distributions of energy along the frequency axis. The component must therefore be chosen on the basis of the time-frequency characteristics of the segment in which the secret coding is carried out, i.e. the component for coding is chosen adaptively [1, 10]. To achieve high secrecy and a low error probability, it is proposed to adapt to each segment using the average value per component.

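Let us consider one of the widespread steganographic coding methods, which uses the decomposition of an audio signal segment into DCT coefficients of the form [11, 12]:

$$C_k = w_k \sum_{n=0}^{N-1} x_n \cos\frac{\pi (2n+1) k}{2N}, \qquad (1)$$

$$w_0 = \sqrt{\frac{1}{N}}, \quad w_k = \sqrt{\frac{2}{N}}, \quad k = 1, \dots, N-1, \qquad (2)$$

where $x_n$ – value of signal amplitude; $k = 0, \dots, N-1$ – number of the DCT coefficient; $C_k$ – DCT coefficient; $N$ – number of samples in the segment.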

The DCT coefficients calculated for the audio signal segments shown in fig. 1 are given below.

    

Fig. 2. DCT coefficients: a) sound "sh"; b) sound "o"

The method described below is an alternative to the coefficient selection method proposed in [13].

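Among the DCT coefficients $C_k$, $k = 0, \dots, N-1$, calculated according to (1) and (2), the component with number $m$ can be selected by one of the following rules:

– the DCT coefficient having the minimum value:

$$m = \arg\min_{k} C_k^2; \qquad (3)$$

– the DCT coefficient closest to the mean value:

$$m = \arg\min_{k} \left| C_k^2 - \frac{1}{N}\sum_{j=0}^{N-1} C_j^2 \right|; \qquad (4)$$

– the DCT coefficient having the maximum value:

$$m = \arg\max_{k} C_k^2. \qquad (5)$$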
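The three selection rules can be illustrated with a short sketch. It assumes that "value" means the energy (squared value) of a coefficient, which is an interpretation rather than something the text states explicitly; the function name `select_coefficient` and the example numbers are hypothetical.

```python
def select_coefficient(coeffs, rule):
    """Return the index of the DCT coefficient chosen by `rule`.

    `rule` is "min", "mean" or "max". Comparing coefficient energies
    (squared values) is an assumption of this sketch.
    """
    energies = [c * c for c in coeffs]
    indices = range(len(energies))
    if rule == "min":
        return min(indices, key=lambda k: energies[k])
    if rule == "max":
        return max(indices, key=lambda k: energies[k])
    if rule == "mean":
        # Average energy per component; pick the coefficient closest to it.
        mean_energy = sum(energies) / len(energies)
        return min(indices, key=lambda k: abs(energies[k] - mean_energy))
    raise ValueError(f"unknown rule: {rule!r}")


example = [4.0, -0.5, 2.0, -3.0]  # illustrative values, not data from the paper
for rule in ("min", "mean", "max"):
    print(rule, select_coefficient(example, rule))
```

Because all three rules depend only on squared values, a later change of sign of the chosen coefficient does not alter which index the rule selects.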

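The procedure below implements the steganographic coding of one bit in a DCT coefficient.

Input data:

- $b$, the bit of information to be encoded in the segment;

- $N$, the segment duration (in samples);

- the amplitude values of the segment: $x_n$, $n = 0, \dots, N-1$.

Output data:

- the amplitude values of the segment containing the encoded bit: $\tilde{x}_n$, $n = 0, \dots, N-1$.

1. Divide the audio signal into segments of $N$ samples.

2. Calculate the DCT coefficients of the segment according to (1), i.e. perform the forward DCT.

3. Calculate the energy of the segment:

$$E = \sum_{n=0}^{N-1} x_n^2. \qquad (6)$$

4. Using one of the rules (3)-(5), determine the number $m$ of the DCT coefficient in which the coding will be carried out.

5. Encode the bit of information $b$ by changing the sign of the DCT coefficient:

$$\tilde{C}_m = e\,|C_m|, \qquad (7)$$

where $|\cdot|$ – the operation discarding the sign of a number; $C_m$ – value of the DCT coefficient; $e \in \{-1, +1\}$ – the code representation of the bit $b$.

6. Perform the inverse DCT (IDCT):

$$\tilde{x}_n = \sum_{k=0}^{N-1} w_k \tilde{C}_k \cos\frac{\pi (2n+1) k}{2N}, \quad n = 0, \dots, N-1, \qquad (8)$$

where $w_0 = \sqrt{1/N}$ and $w_k = \sqrt{2/N}$ for $k \geq 1$ are the normalizing factors of the forward transform, and $\tilde{C}_k = C_k$ for $k \neq m$.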
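The steps above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes an orthonormal DCT-II, assumes that bit 1 maps to a positive sign and bit 0 to a negative sign, and uses the hypothetical helper names `dct`, `idct`, `embed_bit` and `extract_bit`.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a list of samples (assumed normalization)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)))
    return out


def idct(c):
    """Inverse of `dct` (the transpose of the orthonormal transform)."""
    n = len(c)
    out = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
            total += scale * c[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        out.append(total)
    return out


def embed_bit(segment, bit, index):
    """Encode one bit in the sign of coefficient `index` (steps 2, 5, 6)."""
    coeffs = dct(segment)
    sign = 1.0 if bit else -1.0  # assumed mapping: 1 -> '+', 0 -> '-'
    coeffs[index] = sign * abs(coeffs[index])
    return idct(coeffs)


def extract_bit(segment, index):
    """Decode the bit from the sign of coefficient `index`."""
    return 1 if dct(segment)[index] >= 0 else 0


segment = [0.1, 0.4, -0.2, 0.3, 0.0, -0.5, 0.2, 0.1]  # illustrative samples
coeffs = dct(segment)
m = max(range(len(coeffs)), key=lambda k: abs(coeffs[k]))  # "maximum" rule
marked = embed_bit(segment, 0, m)
print(extract_bit(marked, m))
```

With an orthonormal transform, changing only the sign of one coefficient leaves the segment energy unchanged. A coefficient that is exactly zero cannot carry a sign, so such a coefficient would need separate handling.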

Spread spectrum method

The essence of the spread spectrum method (SSp) is to add a pseudorandom sequence to a segment of the initial speech signal according to the expression [1, 3, 4, 14]:

$$\tilde{\mathbf{x}} = \mathbf{x} + \alpha\, e\, \mathbf{p}, \qquad (9)$$

where $\mathbf{x}$ – an initial segment of speech data; $\mathbf{p}$ – the segment corresponding to the pseudorandom sequence; $\alpha$ – weight coefficient; $e \in \{-1, +1\}$ – the code representation of a binary bit of the hidden message.

 

The weight coefficient determines the secrecy of the system. In [8, 9] it is proposed to choose it equal to:

.                                                              (10)

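A bit of control information is decoded by determining the sign of the scalar product of the data segment and the pseudorandom sequence:

$$\hat{e} = \operatorname{sign}\left(\langle \tilde{\mathbf{x}}, \mathbf{p} \rangle\right), \qquad (11)$$

where $\operatorname{sign}(\cdot)$ – the operation of extracting the sign; $\tilde{\mathbf{x}}$ – the data segment containing the hidden bit; $\mathbf{p}$ – the pseudorandom sequence.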
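The embedding and decoding described above can be sketched as follows. The sketch assumes a bipolar (+1/-1) pseudorandom sequence generated from a shared seed and treats the weight coefficient as a free parameter `alpha`, because the rule for choosing it is not reproduced here; the names `make_sequence`, `ss_embed` and `ss_extract` are hypothetical.

```python
import math
import random


def make_sequence(length, seed):
    """Bipolar pseudorandom sequence; the seed plays the role of a shared key."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def ss_embed(segment, sequence, bit, alpha):
    """Add the weighted pseudorandom sequence, signed by the bit."""
    e = 1.0 if bit else -1.0  # assumed mapping: 1 -> +1, 0 -> -1
    return [x + alpha * e * p for x, p in zip(segment, sequence)]


def ss_extract(segment, sequence):
    """Decode the bit from the sign of the scalar product."""
    projection = sum(y * p for y, p in zip(segment, sequence))
    return 1 if projection >= 0 else 0


segment = [0.5 * math.sin(0.3 * i) for i in range(64)]  # illustrative signal
sequence = make_sequence(len(segment), seed=1)
marked = ss_embed(segment, sequence, bit=1, alpha=1.0)
print(ss_extract(marked, sequence))
```

The scalar product equals the projection of the host segment onto the sequence plus `alpha * e * N`, so decoding is guaranteed only when the second term dominates. In this example the host samples are bounded by 0.5 and `alpha` is 1.0, so the bit is always recovered; a smaller `alpha` improves secrecy at the cost of decoding errors.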

 

Method of subband projections

For comparison, a model of the subband projection method is also considered, which carries out the secret coding of bits of control information in a segment of speech data [6, 14]:

,                                                        (12)

The weight coefficient determines the secrecy of the system. In [6, 14] it is proposed to choose it equal to:

                                                                            (13)

Control information is decoded by determining the signs of the projections onto the eigenvectors of the subband matrix:

, ,                                                  (14)

where  – the symbol decoded by method of subband projections.

 

Secrecy assessment technique

To determine the overall performance of a method, we use indicators that estimate the distortions introduced into an audio signal when it is coded by the proposed approach. The following metrics were computed [1, 2, 7, 8, 15]:

Mean square error, MSE:

,                                                   (15)

where   - value of amplitude of the initial audio signal;  - value of amplitude of the synthesized audio signal.

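Correlation:

$$R = \frac{\sum_{n=0}^{N-1} (x_n - \bar{x})(\tilde{x}_n - \bar{\tilde{x}})}{\sqrt{\sum_{n=0}^{N-1} (x_n - \bar{x})^2 \sum_{n=0}^{N-1} (\tilde{x}_n - \bar{\tilde{x}})^2}}, \qquad (16)$$

where $x_n$ and $\tilde{x}_n$ – amplitude values of the initial and the synthesized audio signals; $\bar{x}$ – the constant component of the initial audio signal; $\bar{\tilde{x}}$ – the constant component of the synthesized audio signal.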

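Besides changes in the time domain, differences in the frequency domain must also be considered. A measure based on the Itakura-Saito distance is used for this purpose [15, 16]:

$$d_{IS} = \frac{1}{R}\sum_{r=1}^{R}\left[\frac{P_r}{\tilde{P}_r} - \ln\frac{P_r}{\tilde{P}_r} - 1\right], \qquad (17)$$

where $P_r$ – energy of the frequency components of the initial data segment; $\tilde{P}_r$ – energy of the frequency components of the data segment containing additional information; $R$ – number of frequency components.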

 

The measure has the meaning of a distance between the spectra of two signals and estimates the discrepancy between the energies of the modified and the initial data segments. When the data segments are equal, the measure is zero.
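The secrecy metrics discussed above can be sketched with textbook definitions. The exact normalizations used in the paper are not reproduced here, so the forms below (plain mean square error, Pearson correlation, and the Itakura-Saito distance averaged over frequency components) are assumptions, and the function names are hypothetical.

```python
import math


def mean_square_error(x, y):
    """Mean of squared sample differences between two segments."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def correlation(x, y):
    """Pearson correlation; the means act as the 'constant components'."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den


def itakura_saito(p, q):
    """Itakura-Saito distance between two sets of positive band energies."""
    return sum(a / b - math.log(a / b) - 1.0 for a, b in zip(p, q)) / len(p)


original = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5]   # illustrative samples
modified = [0.0, 0.5, 0.9, 0.5, 0.1, -0.5]
print(mean_square_error(original, modified))
print(correlation(original, modified))
print(itakura_saito([1.0, 2.0, 0.5], [1.1, 1.9, 0.5]))
```

For identical inputs the mean square error and the Itakura-Saito distance are zero and the correlation is one, which matches the behaviour described in the text. The Itakura-Saito distance is not symmetric in its arguments and requires strictly positive energies.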

,                                         (18)

where  – a subband matrix [5];  – width of a frequency interval.

 

As a tool for computing energies without passing into the frequency domain, it is proposed to use the mathematical apparatus of subband matrices [4, 5]:

,                                                                          (19)

where  – the subband matrix determined by elements:

,,                                 (20)

where  – the row index of a matrix element;  – the column index of a matrix element;  – sampling rate;  – bandwidth (when normalized, respectively );  – centre frequency (when normalized, respectively ).
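A minimal sketch of a subband energy calculation in the time domain is given below. It assumes the commonly used sinc-type subband matrix for a band between normalized angular frequencies `omega1` and `omega2` (radians per sample, from 0 to pi), for which the quadratic form of the matrix with the segment gives the share of the segment energy in that band. The parametrization by band edges and the names `subband_matrix` and `subband_energy` are assumptions, not the notation of the paper.

```python
import math


def subband_matrix(n, omega1, omega2):
    """n-by-n matrix whose quadratic form gives the energy in [omega1, omega2]."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            m = i - k
            if m == 0:
                a[i][k] = (omega2 - omega1) / math.pi
            else:
                a[i][k] = (math.sin(omega2 * m) - math.sin(omega1 * m)) / (math.pi * m)
    return a


def subband_energy(x, a):
    """Quadratic form x^T A x computed directly from the samples."""
    n = len(x)
    return sum(x[i] * a[i][k] * x[k] for i in range(n) for k in range(n))


x = [0.3, -1.2, 0.7, 0.05, -0.4, 0.9]  # illustrative samples
low = subband_energy(x, subband_matrix(len(x), 0.0, math.pi / 2))
high = subband_energy(x, subband_matrix(len(x), math.pi / 2, math.pi))
print(low, high, sum(v * v for v in x))
```

Over the full band from 0 to pi the matrix reduces to the identity, so the quadratic form returns the total segment energy, and the energies of adjacent bands add up to it.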

The mean squared error (MSE) measures the relative difference between the energies of signal segments in the time domain. It reveals differences in the amplitude envelopes of speech signal segments. The fewer changes are made when additional information is introduced, the closer this measure is to zero [15]:

.                                                    (21)

where  – value of amplitude of the initial data segment;  – value of amplitude of the data segment containing additional information; N – the number of samples in the compared signal segments.

Reliability assessment technique

The reliability of the decoded information is assessed by the probability of error () [1, 7]:

.                            (22)

where  – the number of encoded bits;  – the modulo-two addition operation;  – the operation of extracting the sign;  – the decoded bit.
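As an illustration, the error probability can be estimated as the share of decoded bits that differ from the encoded ones, using modulo-two addition (XOR) to mark each mismatch. This is a sketch under that assumption; the function name `bit_error_probability` and the example bit sequences are hypothetical.

```python
def bit_error_probability(encoded, decoded):
    """Share of positions where the decoded bit differs from the encoded bit."""
    if len(encoded) != len(decoded) or not encoded:
        raise ValueError("bit sequences must be non-empty and of equal length")
    # b ^ d is 1 exactly when the two bits differ (addition modulo two).
    return sum(b ^ d for b, d in zip(encoded, decoded)) / len(encoded)


encoded = [1, 0, 1, 1, 0, 0, 1, 0]  # illustrative bits, not experimental data
decoded = [1, 0, 0, 1, 0, 1, 1, 0]
print(bit_error_probability(encoded, decoded))
```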

Results of simulation

To verify the operability of the DCT-based method, audio signal fragments with a sampling rate of 8 kHz and a resolution of 16 bits were used [10]. The total duration of the speech material was 23 minutes, with segments lasting 0.032 s (segments containing no energy, i.e. pauses, were excluded from the material). Segments of a non-repeating pseudorandom sequence (PSP) were taken as noise. In the simulation,  bits were embedded; the results are given in tab. 1.

Table 1

Reliability assessment

| No. | Method | 0.001 | 0.01 | 0.1 | 1 |
|---|---|---|---|---|---|
| 1 | Maximum DCT, (3, 8) | 0 | 0 | 0 | 2.1646×10⁻⁵ |
| 2 | Average DCT, (4, 8) | 0 | 0 | 2.4071×10⁻⁵ | 0.0396 |
| 3 | Minimum DCT, (5, 8) | 0.1181 | 0.1230 | 0.1247 | 0.1252 |
| 4 | SSp, (8, 9, 10) | 0.1285 | 0.1290 | 0.1439 | 0.2133 |
| 5 | SubBand, (8, 10, 12) | 0.0219 | 0.0675 | 0.1803 | 0.3345 |

The secrecy estimates for the embedded information, obtained with the simulation parameters specified above, are given in tab. 2.

Table 2

Secrecy assessment

| No. | Choice of coefficients principle |  |  |  |  |
|---|---|---|---|---|---|
| 1 | Maximum, (3) | 2.427 E-0 | 0.8472 | 5.41 E-01 | 3.712 |
| 2 | Average, (4) | 2.875 E-3 | 0.9960 | 4.28 E-04 | 1.054 |
| 3 | Minimum, (5) | 8.341 E-8 | 0.9999 | 2.31 E-16 | 0.023 |
| 4 | SSp | 1.102 E-3 | 0.9923 | 0.14 E-03 | 0.031 |
| 5 | SubBand | 3.256 E-3 | 0.9931 | 1.21 E-16 | 0.003 |

CONCLUSIONS

The presented algorithm is optimal in terms of accounting for the frequency properties of an audio signal containing a digital representation of speech, since the decision rule takes into account the uneven distribution of energy across the frequency band and human sound perception. Using the DCT coefficient with the average energy value for the secret coding of information reduces the change in the energy of the synthesized audio signal segment by two orders of magnitude [11].

References

  1. Fridrich, J. 2012. Steganography in digital media: Principles, algorithms, and applications. Steganography in Digital Media, P. 1-441.
  2. Furui, Sadaoki. 2000. Digital speech processing, synthesis, and recognition. 2nd ed., rev. and expanded
  3. Cox I. J., Kilian J., Leighton F. T., Shamoon T. Secure spread spectrum watermarking for multimedia // IEEE transactions on image processing. ‒ 1997. ‒ V. 6, № 12. ‒ P. 1673-1687.
  4. Nedeljko Cvejic, Tapio Seppanen. 2004. Spread spectrum audio watermarking using frequency hopping and attack characterization. Signal Processing 84. P. 207 – 213.
  5. Lykholob, P.G. Research of sensitivity of some measures of quality assessment of hidden information in the audio content [Text] // Medvedeva, A.A., Likhogodina, E.S., Mishina, O.O. RESEARCH RESULT. Information technologies. №4. v.1. 2016. pp.21-25 URL: http://rr.bsu.edu.ru/media/information/2016/4/3_it.pdf DOI: 10.18413/2518-1092-2016-1-4-21-24
  6. Zhilyakov E.G., Pashintsev V.P., Belov S.P., Likholob P.G. About the secretive method of encoding control information in the speech data// Infocommunicatsionnye technologii. ‒ Samara, 2015. ‒ V. 13, № 3. ‒ P. 325-333.
  7. Fridrich, J. 2012. Steganography in digital media: Principles, algorithms, and applications. Steganography in Digital Media, P. 1-441.
  8. Furui, Sadaoki. 2000. Digital speech processing, synthesis, and recognition. 2nd ed., rev. and expanded
  9. GOST 16600-72. The transmission of speech by radio communication paths. The requirements for intelligibility of speech and methods of articulation measurements [Sound recording] / GOST 16600-72; isp.: D.I. Biblev. – Belgorod: NIU BelGU, 2016. – 1380 sec. – Access mode: https://www.researchgate.net/publication/312167036_Recording_Gost_16600-72 DOI: 10.13140/RG.2.2.33677.74720
  10. Kisilenko А.V., Likhogodina E.S., Likholob P.G. About choice of the place for hiding information [Text] / Kisilenko А.V., Likhogodina E.S., Likholob P.G. // Sovremennoe obschestvo, obrazovanie i nauka. Sbornik nauchnyh trudov po matherialam Mezhdunarodnoi nauchno-practicheskoi konferencii: v 9 chastyah. – Tambov: ООО "Konsaltingovaya kompaniya Yukom ", 2014. – P. 76-78
  11. Signal processing with lapped transforms. / Malvar H. S. ‒ Boston: Artech House, 1992.
  12. Ahmed N., Natarajan T., Rao K. R. Discrete cosine transform // IEEE transactions on Computers. ‒ 1974. ‒ V. 100, № 1. ‒ P. 90-93.
  13. On uniqueness of determination of identity-relevant frequency bands in the sounds of Russian speech affected by noise [Text] / Zhilyakov E.G., Likholob P.G., Kurlov A.V., Medvedeva A.A. // Nauchnye vedomosti Belgorodskogo gosudarstvennogo universiteta. Seriya: Economica. Informatica. 2016. V. 37. № 2 (223). P. 167-173
  14. Evgeny G.  Zhilyakov, Sergey P.  Belov, Likholob P. G., Pashintsev V. P. On the Steganography in Voice Data // Asian Journal of Information Technology. ‒ 2016. ‒ V. 15, № 12. ‒ P. 1949-1952.
  15. Zhilyakov E.G., Likholob P.G., Medvedeva А.А., Prochorenko Е.I. Research of the sensitivity of certain quality measures to hide information in the speech data // Nauchnye vedomosti Belgorodskogo gosudarstvennogo universiteta. Seriya: Economica. Informatica. ‒ Belgorod, 2016. ‒ V. 9, № 230. ‒ P. 174-179.
  16. Zhilyakov E.G. Optimal subband methods of analysis and synthesis of signals of finite duration. Automation and mechanics. – M.: Akademicheskiy nauchno-izdatelskiy, proizvodstvenno-poligraficheskiy i knigoraspredelitelskiy tsentr Rossiyskoi akademii nauk "Izdatelstvo "Nauka" № 4, 2015. P. 51-66