DOI: 10.18413/2518-1092-2025-10-4-0-6

QUANTIZATION METHOD FOR DETECTION NEURAL NETWORKS ON EMBEDDED SYSTEMS

Model quantization is a key technique for deploying high-performance neural network object detectors on resource-constrained devices. However, standard approaches, including post-training quantization (PTQ), quantization-aware training (QAT), and even mixed-precision methods, allocate bits according to layer sensitivity and ignore the semantics of the task. This causes a significant drop in accuracy when distinguishing between semantically similar classes, which is critical in many practical applications. This article proposes a new approach to mixed-precision quantization that takes task semantics into account. It introduces a semantic significance metric for the network components that contribute most to discriminating hard-to-distinguish classes. Based on this metric, a heterogeneous bit configuration is formed that preserves high precision in the critical parts of the model while allowing aggressive compression of the rest. A plan for experimental validation of the approach on a vehicle type classification task is presented. The approach is expected to give a significantly better trade-off between accuracy and resource consumption than standard quantization techniques.
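To make the idea of a heterogeneous bit configuration concrete, the following is a minimal sketch, not the authors' method. It assumes that a semantic significance score has already been computed for each layer; the abstract does not define how the metric is calculated, so the scores, the layer names, the 8/4-bit choice, and the `keep_fraction` threshold are all hypothetical. The sketch only shows how such scores could be turned into per-layer bit-widths and a rough size estimate.

```python
from typing import Dict


def assign_bit_widths(
    significance: Dict[str, float],
    high_bits: int = 8,
    low_bits: int = 4,
    keep_fraction: float = 0.25,
) -> Dict[str, int]:
    """Give the most semantically significant layers high precision.

    `significance` maps a layer name to a hypothetical score of how much
    that layer contributes to separating hard-to-distinguish classes.
    The top `keep_fraction` of layers (at least one) get `high_bits`;
    all remaining layers get `low_bits`.
    """
    if not significance:
        return {}
    # Rank layer names from most to least significant.
    ranked = sorted(significance, key=significance.get, reverse=True)
    n_high = max(1, round(keep_fraction * len(ranked)))
    protected = set(ranked[:n_high])
    return {
        name: high_bits if name in protected else low_bits
        for name in significance
    }


def average_bits(bits: Dict[str, int], num_params: Dict[str, int]) -> float:
    """Parameter-weighted mean bit-width, a rough proxy for model size."""
    total = sum(num_params[name] for name in bits)
    return sum(bits[name] * num_params[name] for name in bits) / total


# Hypothetical scores and parameter counts for a small detector.
scores = {
    "backbone.conv1": 0.1,
    "backbone.conv2": 0.2,
    "neck.fpn": 0.3,
    "head.cls": 0.9,
}
params = {
    "backbone.conv1": 1000,
    "backbone.conv2": 2000,
    "neck.fpn": 3000,
    "head.cls": 2000,
}

config = assign_bit_widths(scores)
mean_bits = average_bits(config, params)
```

With these illustrative numbers only `head.cls` keeps 8 bits and the other layers drop to 4. A simple top-k threshold is used here for readability; a real allocation would likely be driven by an accuracy or memory budget instead.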
