Advances in quantization for efficient on-device inference