24 changes: 24 additions & 0 deletions docs/common/ai/_cubie_quant_acc_improve.mdx
@@ -101,6 +101,30 @@ pegasus_quantize.sh MobileNetV2_Imagenet uint8 10

Users can refer to the values in `entropy.txt` to add, remove, or adjust the layers listed under `customized_quantize_layers` in `MODEL_DIR_QUANTIZE.quantize` as appropriate.
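Selecting candidate layers from `entropy.txt` can also be scripted. The sketch below assumes one `layer_name entropy` pair per line; the actual file format produced by the toolchain may differ, so adjust the parsing accordingly:

```python
# Sketch: pick high-entropy layers as candidates for higher-precision
# quantization. Assumes entropy.txt contains one "layer_name entropy"
# pair per line (an assumption -- verify against your generated file).

def high_entropy_layers(lines, threshold=0.9):
    """Return {layer_name: entropy} for entries above `threshold`."""
    candidates = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip malformed or header lines
        name, value = parts
        try:
            entropy = float(value)
        except ValueError:
            continue
        if entropy > threshold:
            candidates[name] = entropy
    return candidates

# Example with made-up entropy values:
sample = ["conv1 0.97", "conv2 0.95", "conv3 0.41", "conv4 0.12"]
print(high_entropy_layers(sample))  # {'conv1': 0.97, 'conv2': 0.95}
```

The layers this returns are the ones worth promoting to int16 in `customized_quantize_layers`.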

**Example of editing the customized_quantize_layers configuration:**

Open the generated `.quantize` file and locate the `customized_quantize_layers` section. You can edit this section by hand to assign different quantization precisions to specific layers. For example, to quantize some layers as int16 while the rest use uint8:

```json
"customized_quantize_layers": {
"conv1": "int16",
"conv2": "int16",
"conv3": "uint8",
"conv4": "uint8"
}
```

**Configuration notes:**

1. **Layer name matching**: make sure the layer names you use exactly match the actual layer names in the model
2. **Data type compatibility**: pay attention to data type compatibility between adjacent layers. For example, if the previous layer outputs int16 and the next layer expects uint8, you may need to insert a conversion layer or confirm that automatic type conversion takes place
3. **Entropy reference**: give priority to assigning higher-precision quantization types (such as int16) to layers with high entropy values (close to 1.0)
4. **Running hybrid quantization**: after editing the `.quantize` file, run the quantization command with the `--hybrid` flag (rather than `--rebuild`); the tool will then read the `customized_quantize_layers` configuration

**Troubleshooting common issues:**

If you run into data type mismatch errors (such as "Inputs/Outputs data type not supported"), check the `customized_quantize_layers` configuration and make sure the data types of adjacent layers are compatible. For complex models such as YOLO, the quantization strategy for each layer may need to be planned more carefully.
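If the `.quantize` file is edited repeatedly, the update can be scripted. This is a minimal sketch, assuming the `.quantize` file is plain JSON (an assumption — verify against your generated file before relying on it):

```python
import json

def merge_layer_overrides(config, layer_dtypes):
    """Merge per-layer dtype overrides into customized_quantize_layers."""
    section = config.setdefault("customized_quantize_layers", {})
    section.update(layer_dtypes)  # e.g. {"conv1": "int16"}
    return config

def set_quantize_layers(path, layer_dtypes):
    """Apply per-layer dtype overrides to a .quantize file in place.

    Assumes the file is JSON; adapt the load/dump calls if the
    toolchain uses a different on-disk format.
    """
    with open(path) as f:
        config = json.load(f)
    with open(path, "w") as f:
        json.dump(merge_layer_overrides(config, layer_dtypes), f, indent=4)

# Hypothetical usage (file name is illustrative):
# set_quantize_layers("MobileNetV2_Imagenet.quantize",
#                     {"conv1": "int16", "conv2": "int16"})
```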

#### Execute the hybrid quantization command

:::tip
@@ -101,6 +101,30 @@ Execution of the quantization script `pegasus_quantize.sh` will generate `MODEL_

Users can refer to the values in `entropy.txt` to add, remove, or adjust the layers listed under `customized_quantize_layers` in `MODEL_DIR_QUANTIZE.quantize` as appropriate.

**Editing customized_quantize_layers configuration example:**

Open the generated `.quantize` file and find the `customized_quantize_layers` section. You can manually edit this section to specify different quantization precision for certain layers. For example, if you need some layers to use int16 quantization while others use uint8:

```json
"customized_quantize_layers": {
"conv1": "int16",
"conv2": "int16",
"conv3": "uint8",
"conv4": "uint8"
}
```

**Configuration considerations:**

1. **Layer name matching**: Ensure the layer names used exactly match the actual layer names in the model
2. **Data type compatibility**: Pay attention to data type compatibility between adjacent layers. For example, if the previous layer outputs int16 and the next layer expects uint8, you may need to add conversion layers or ensure automatic type conversion exists
3. **Entropy reference**: Prioritize specifying higher precision quantization types (like int16) for layers with high entropy values (close to 1.0)
4. **Hybrid quantization execution**: After editing the `.quantize` file, execute the quantization command with the `--hybrid` flag (instead of `--rebuild`), and the system will read the `customized_quantize_layers` configuration

**Common issue resolution:**

If you encounter data type mismatch errors (such as "Inputs/Outputs data type not supported"), check the `customized_quantize_layers` configuration to ensure data type compatibility between adjacent layers. For complex models like YOLO, more careful planning of quantization strategies for each layer may be required.
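Such mismatches can be caught before running the tool with a quick sanity check over the planned per-layer types. The sketch below is a hypothetical helper, not part of the toolchain, and assumes you know the model's layer order:

```python
def find_dtype_transitions(layer_order, layer_dtypes, default="uint8"):
    """Report adjacent layer pairs whose quantization types differ.

    Each transition is a candidate spot for a conversion layer or a
    config adjustment; whether it is actually a problem depends on the
    toolchain's automatic type-conversion support.
    """
    transitions = []
    previous = None
    for name in layer_order:
        dtype = layer_dtypes.get(name, default)
        if previous is not None and dtype != previous[1]:
            transitions.append((previous[0], name, previous[1], dtype))
        previous = (name, dtype)
    return transitions

order = ["conv1", "conv2", "conv3", "conv4"]
plan = {"conv1": "int16", "conv2": "int16"}  # conv3/conv4 fall back to uint8
print(find_dtype_transitions(order, plan))
# [('conv2', 'conv3', 'int16', 'uint8')]
```

Reviewing this list against the entropy values is a cheap way to plan a layered strategy for models like YOLO before each hybrid quantization run.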

#### Execute hybrid quantization command

:::tip