CAQ: A Context-Aware Adaptive Quantization Framework

Mar 30, 2023

CAQ is a context-aware adaptive quantization framework. It dynamically switches between gates according to the resources available in the model's deployment environment, generating mixed-precision quantization strategies that match the hierarchical structure of the backbone network.

Model Architecture Diagram: [image]

Description: The CAQ algorithm is implemented in Python 3.6+ and depends on libraries such as CUDA, cuDNN, and PyTorch (torch). It takes the deployment context of a deep model as input, for example device power, memory, and computing power, and outputs a quantized model that can be adapted quickly to dynamic runtime contexts. The algorithm supports adaptive quantization of deep learning models on datasets such as CIFAR-10, CIFAR-100, and ImageNet. The implementation draws on third-party resources such as Fractional Skipping and SkipNet.
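To make the idea of context-driven mixed precision concrete, here is a minimal sketch in plain Python. It is not the CAQ implementation: the `DeploymentContext` class, the `select_bit_widths` rule, and the `quantize` helper are all hypothetical names chosen for illustration, and the gating rule is a hand-written heuristic standing in for CAQ's learned gates. It only shows how a resource context could be mapped to one bit-width per layer, and how a weight list could then be quantized to that bit-width.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class DeploymentContext:
    """Hypothetical runtime context; each field is normalized to [0, 1]."""
    power: float
    memory: float
    compute: float


def select_bit_widths(ctx: DeploymentContext, num_layers: int,
                      choices: Sequence[int] = (2, 4, 8)) -> List[int]:
    """Pick one bit-width per layer from the resource budget (illustrative rule)."""
    # The scarcest resource determines the overall budget.
    budget = max(0.0, min(1.0, min(ctx.power, ctx.memory, ctx.compute)))
    widths = []
    for layer in range(num_layers):
        # Assumption: deeper layers give up precision first when resources are tight.
        depth_penalty = 0.5 * layer / max(1, num_layers - 1)
        score = max(0.0, budget - depth_penalty * (1.0 - budget))
        index = min(len(choices) - 1, int(score * len(choices)))
        widths.append(choices[index])
    return widths


def quantize(values: List[float], bits: int) -> List[float]:
    """Symmetric uniform quantization of a weight list; assumes bits >= 2."""
    levels = 2 ** (bits - 1) - 1              # e.g. 127 for 8 bits
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / levels
    return [round(v / scale) * scale for v in values]


if __name__ == "__main__":
    ctx = DeploymentContext(power=0.4, memory=0.7, compute=0.9)
    for layer, bits in enumerate(select_bit_widths(ctx, num_layers=4)):
        print(layer, bits, quantize([0.31, -0.72, 0.05], bits))
```

With every resource at 1.0 this rule assigns 8 bits to every layer, and with any resource at 0.0 it assigns 2 bits to every layer; intermediate contexts yield a mixed-precision assignment that never increases with depth.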

Source Code