DOI: 10.18413/2518-1092-2026-11-1-0-9

DATA ANOMALY TAXONOMY AND METHOD SELECTION

Anomaly detection has become a core component of modern data analysis due to the growth of data volumes, the increasing complexity of information systems, and the demand for real-time monitoring. Although anomalies are usually rare, they often correspond to high-impact events such as equipment failures, fraud, cyberattacks, and critical medical conditions. There is no universal notion of an anomaly: deviations can be absolute, context-dependent, or emergent at the level of groups and sequences. Misclassifying the anomaly type leads to inappropriate modeling assumptions, poorly calibrated thresholds, and a detection trade-off skewed toward false alarms or missed events. This paper reviews three major anomaly types (global/point, contextual, and collective) and relates them to three detection paradigms: statistical tests, density/distance-based methods, and model-based approaches. For each type, we discuss representative algorithms (Z-score, IQR, Mahalanobis distance, kNN/LOF, Isolation Forest, clustering, time-series and sequence models, LSTM/autoencoders, and variational models), together with their data requirements and practical limitations. We provide a method-selection scheme aligned with data modality (tabular data, time series, streaming data), recommend threshold calibration strategies, and outline evaluation protocols for highly imbalanced settings (PR-AUC, MCC, event-based metrics). Correct anomaly typing is a prerequisite for effective monitoring; in applied scenarios, the most robust solutions are typically cascade and ensemble pipelines that combine interpretable baselines with flexible machine learning models.
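To make the method families named above concrete, the following Python sketch (not taken from the paper; synthetic two-dimensional data, scikit-learn's IsolationForest, and illustrative parameter choices are assumptions) contrasts an interpretable statistical baseline (IQR fences) with a model-based detector (Isolation Forest) for global/point anomalies, and scores both with the imbalance-aware metrics mentioned in the abstract (PR-AUC, MCC).

```python
# A minimal sketch, assuming synthetic tabular data with ~1% global (point) anomalies.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import average_precision_score, matthews_corrcoef

rng = np.random.default_rng(0)
X_normal = rng.normal(0, 1, size=(2000, 2))      # nominal observations
X_anom = rng.uniform(4, 6, size=(20, 2))         # far-off global anomalies
X = np.vstack([X_normal, X_anom])
y = np.r_[np.zeros(2000), np.ones(20)]           # 1 = anomaly

# Interpretable statistical baseline: per-feature IQR rule,
# flag a point if any feature falls outside the 1.5*IQR fences.
q1, q3 = np.percentile(X, [25, 75], axis=0)
iqr = q3 - q1
iqr_flag = ((X < q1 - 1.5 * iqr) | (X > q3 + 1.5 * iqr)).any(axis=1).astype(int)

# Model-based detector: Isolation Forest; higher score = more anomalous.
iso = IsolationForest(n_estimators=200, contamination=0.01, random_state=0).fit(X)
iso_score = -iso.score_samples(X)
iso_flag = (iso.predict(X) == -1).astype(int)

# Imbalance-aware evaluation: MCC on hard decisions, PR-AUC on continuous scores.
print("IQR baseline        MCC:", round(matthews_corrcoef(y, iqr_flag), 3))
print("Isolation Forest    MCC:", round(matthews_corrcoef(y, iso_flag), 3))
print("Isolation Forest PR-AUC:", round(average_precision_score(y, iso_score), 3))
```

In an applied cascade of the kind the abstract recommends, the cheap IQR rule would pre-filter obvious deviations, the learned model would score the remainder, and the decision threshold would be calibrated on a validation split rather than fixed at the defaults used here.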
