<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd">
<article article-type="research-article" dtd-version="1.2" xml:lang="ru" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><front><journal-meta><journal-id journal-id-type="issn">2518-1092</journal-id><journal-title-group><journal-title>Research result. Information technologies</journal-title></journal-title-group><issn pub-type="epub">2518-1092</issn></journal-meta><article-meta><article-id pub-id-type="doi">10.18413/2518-1092-2026-11-1-0-9</article-id><article-id pub-id-type="publisher-id">4102</article-id><article-categories><subj-group subj-group-type="heading"><subject>COMPUTER SIMULATION</subject></subj-group></article-categories><title-group><article-title>DATA ANOMALY TAXONOMY AND METHOD SELECTION</article-title><trans-title-group xml:lang="en"><trans-title>DATA ANOMALY TAXONOMY AND METHOD SELECTION</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author"><name-alternatives><name xml:lang="ru"><surname>Kotov</surname><given-names>Dmitry Vasilyevich</given-names></name><name xml:lang="en"><surname>Kotov</surname><given-names>Dmitry Vasilyevich</given-names></name></name-alternatives><email>kotovdv2101@outlook.com</email></contrib></contrib-group><pub-date pub-type="epub"><year>2026</year></pub-date><volume>11</volume><issue>1</issue><fpage>0</fpage><lpage>0</lpage><self-uri content-type="pdf" xlink:href="/media/information/2026/1/НР.ИТ_11.1_9.pdf" /><abstract xml:lang="ru"><p>Anomaly detection has become a core component of modern data analysis due to the growth of data volumes, the increasing complexity of information systems, and the demand for real-time monitoring. Although anomalies are usually rare, they often correspond to high-impact events such as equipment failures, fraud, cyberattacks, and critical medical conditions. 
There is no universal notion of an anomaly: deviations can be absolute, context-dependent, or emergent at the level of groups and sequences. Misclassifying the anomaly type leads to inappropriate modeling assumptions, poorly calibrated thresholds, and a trade-off skewed toward false alarms or missed events. This paper reviews three major anomaly types – global (point), contextual, and collective – and relates them to three detection paradigms: statistical tests, density/distance-based methods, and model-based approaches. For each type, we discuss representative algorithms (Z-score, IQR, Mahalanobis distance, kNN/LOF, Isolation Forest, clustering, time-series and sequence models, LSTM/autoencoders, and variational models), together with their data requirements and practical limitations. We provide a method-selection scheme aligned with data modality (tabular data, time series, streaming data), recommend threshold calibration strategies, and outline evaluation protocols for highly imbalanced settings (PR-AUC, MCC, event-based metrics). Correct anomaly typing is a prerequisite for effective monitoring; in applied scenarios, the most robust solutions are typically cascade and ensemble pipelines that combine interpretable baselines with flexible machine learning models.</p></abstract><trans-abstract xml:lang="en"><p>Anomaly detection has become a core component of modern data analysis due to the growth of data volumes, the increasing complexity of information systems, and the demand for real-time monitoring. Although anomalies are usually rare, they often correspond to high-impact events such as equipment failures, fraud, cyberattacks, and critical medical conditions. There is no universal notion of an anomaly: deviations can be absolute, context-dependent, or emergent at the level of groups and sequences. 
Misclassifying the anomaly type leads to inappropriate modeling assumptions, poorly calibrated thresholds, and a trade-off skewed toward false alarms or missed events. This paper reviews three major anomaly types – global (point), contextual, and collective – and relates them to three detection paradigms: statistical tests, density/distance-based methods, and model-based approaches. For each type, we discuss representative algorithms (Z-score, IQR, Mahalanobis distance, kNN/LOF, Isolation Forest, clustering, time-series and sequence models, LSTM/autoencoders, and variational models), together with their data requirements and practical limitations. We provide a method-selection scheme aligned with data modality (tabular data, time series, streaming data), recommend threshold calibration strategies, and outline evaluation protocols for highly imbalanced settings (PR-AUC, MCC, event-based metrics). Correct anomaly typing is a prerequisite for effective monitoring; in applied scenarios, the most robust solutions are typically cascade and ensemble pipelines that combine interpretable baselines with flexible machine learning models.</p></trans-abstract><kwd-group xml:lang="ru"><kwd>data anomalies</kwd><kwd>anomaly detection</kwd><kwd>outliers</kwd><kwd>context</kwd><kwd>time series</kwd><kwd>autoencoder</kwd><kwd>Isolation Forest</kwd><kwd>LOF</kwd></kwd-group><kwd-group xml:lang="en"><kwd>data anomalies</kwd><kwd>anomaly detection</kwd><kwd>outliers</kwd><kwd>context</kwd><kwd>time series</kwd><kwd>autoencoder</kwd><kwd>Isolation Forest</kwd><kwd>LOF</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="B1"><mixed-citation>1. Larin D.O. Information revolutions and their role in the development of mankind / D.O. Larin // Bulletin of Omsk University. – 2025. – Vol. 30, No. 1. – Pp. 37-50. – DOI 10.24147/1812-3996.2025.1.37-50. 
– EDN GJVYOV.</mixed-citation></ref><ref id="B2"><mixed-citation>2. Shkodyrev V.P. Overview of anomaly detection methods in data streams / V.P. Shkodyrev, K.I. Yagafarov, V.A. Bashtovenko, E.E. Ilyina // Proceedings of the Second Conference on Software Engineering and Information Management (SEIM-2017). – St. Petersburg: SPbSUT, 2017. – Vol. 1864. – Pp. 215-225.</mixed-citation></ref><ref id="B3"><mixed-citation>3. Shorgin S.Ya. Statistics and clusters in search of anomalous inclusions under big data conditions // Informatics and Its Applications. – 2021. – Vol. 15, No. 4. – Pp. 142-151. – DOI: 10.15393/j12.art.2021.7987.</mixed-citation></ref><ref id="B4"><mixed-citation>4. Andrianova E.G., Golovin S.A., Zykov S.V., Les'ko S.A., Chukalina E.R. Review of modern models and methods for analyzing time series dynamics in social, economic, and sociotechnical systems // Russian Technological Journal. – 2020. – Vol. 8, No. 4. – Pp. 7-45. – DOI: 10.32362/2500-316X-2020-8-4-7-45.</mixed-citation></ref><ref id="B5"><mixed-citation>5. Vidishcheva E.V., Kopyrin A.S., Vasilenko M.S. Analysis and refinement of anomaly and outlier classification on economic data // Bulletin of the Altai Academy of Economics and Law. – 2019. – No. 6-1. – Pp. 41-46. – URL: https://vaael.ru/ru/article/view?id=589 (accessed: 22.01.2026).</mixed-citation></ref><ref id="B7"><mixed-citation>6. Andrianova E.G., Zykov S.V., et al. Review of modern models and methods for time series analysis // Russian Technological Journal. – 2020. – Vol. 8, No. 4. – Pp. 7-45.</mixed-citation></ref><ref id="B8"><mixed-citation>7. Bardasova I.A., Volkova E.A. Anomaly detection in emails using machine learning // Bulletin of Science, No. 
5(74), Vol. 4, pp. 1350-1358. 2024. ISSN 2712-8849 // URL: https://www.vesnik-nauki.rf/article/14991 (accessed: 22.01.2026).</mixed-citation></ref><ref id="B9"><mixed-citation>8. Mikhailov A.N. Anomaly detection in network traffic using machine learning methods // Bulletin of Science, No. 12(81), Vol. 3, pp. 1463-1466. 2024. ISSN 2712-8849 // URL: https://www.vesnik-nauki.rf/article/19907 (accessed: 22.01.2026).</mixed-citation></ref><ref id="B10"><mixed-citation>9. Domashkin A.A. Application of two-stage clustering method for anomaly detection: Conference abstract / A.A. Domashkin // International Conference on Computer Systems and Technologies (ICCSS-2024). – Moscow: IPU, 2024. – Pp. 112-120. – URL: https://iccss2024.ipu.ru/proceedings/Домашкин.pdf (accessed: 22.01.2026).</mixed-citation></ref><ref id="B11"><mixed-citation>10. Glukhov K.A. Application of two-stage clustering method based on self-organizing Kohonen map for anomaly detection in synthetic datasets / K.A. Glukhov, A.A. Domashkin // Secure and Information Technologies. – 2023. – Vol. 1, No. 1. – Pp. 1-10. – URL: https://info-secur.ru/index.php/ojs/article/view/482 (accessed: 22.01.2026).</mixed-citation></ref><ref id="B12"><mixed-citation>11. Kraeva Ya.A. Neural network method for anomaly detection in multidimensional streaming time series // Bulletin of SPbPU. Series: Radio Engineering, Telecommunications and Computer Engineering. – 2024. – No. 2. – Pp. 45-58.</mixed-citation></ref><ref id="B13"><mixed-citation>12. Gritsenko A.V. Types of anomalies in video images // Applied Informatics. – 2012. – No. 5. – Pp. 78-92.</mixed-citation></ref><ref id="B14"><mixed-citation>13. Litvinovich A.V., Smirnov S.V. Methods for analyzing multidimensional data in anomaly detection tasks // Software Products and Systems. – 2022. – Vol. 135, No. 3. 
– Pp. 45-52.</mixed-citation></ref><ref id="B15"><mixed-citation>14. Gerasimov M.A., Petrov I.V. Anomaly Detection in Large-Scale Data Using Isolation Forest and an Autoencoder // Bulletin of St. Petersburg State University. Series 15. Computational Mathematics and Informatics. – 2024. – Vol. 20, No. 1. – Pp. 112-125.</mixed-citation></ref><ref id="B16"><mixed-citation>15. Levshun D.A., Popov D.A., Kozlov A.S. Detection and explanation of anomalies in industrial IoT systems based on autoencoders // Software Products and Systems. – 2023. – Vol. 141, No. 4. – Pp. 123-135.</mixed-citation></ref><ref id="B17"><mixed-citation>16. Butusov D.N. Numerical methods for analyzing non-stationary signals in image processing tasks [Doctoral dissertation, Cand. Phys.-Math. Sci., 05.12.04]. SPbGETU "LETI". – St. Petersburg, 2021. – 150 p.</mixed-citation></ref><ref id="B18"><mixed-citation>17. Nosko V.P. Introduction to regression analysis of time series // [study guide]. – Moscow: HSE, 2010. – 120 p.</mixed-citation></ref><ref id="B19"><mixed-citation>18. Brykin D.O. Investigation of time series processing algorithms considering non-stationarity [Doctoral dissertation, Cand. Phys.-Math. Sci., 05.13.18]. MIPT. – Moscow, 2023. – 145 p.</mixed-citation></ref><ref id="B20"><mixed-citation>19. Lyubushin A.A. Analysis of geophysical and engineering monitoring system data. – 3rd ed. – Moscow: Nauka, 2024. – 320 p.</mixed-citation></ref></ref-list></back></article>