
docs: add detailed configuration guide for customized_quantize_layers#1592

Open
tangzz-radxa wants to merge 1 commit into radxa-docs:main from tangzz-radxa:support/issue-1321-hybrid-quant-config

Conversation

@tangzz-radxa
Contributor

Summary

This PR adds detailed configuration guidance for the customized_quantize_layers section of the ACUITY toolkit's hybrid quantization feature. The update addresses a documentation gap identified in issue #1321, where users need clearer instructions on configuring mixed-precision quantization (e.g., specifying int16 for certain layers while using uint8 for others).

Why

Issue #1321 reported a data type mismatch when quantizing YOLOv26 models with the Allwinner NPU toolkit. The user encountered errors where certain layers were quantized to int16 while subsequent layers expected uint8 input, causing compatibility issues during OpenVX conversion.

The existing documentation mentioned the customized_quantize_layers feature but lacked:

  • Concrete examples of how to edit the configuration
  • Guidance on data type compatibility between layers
  • Troubleshooting advice for common quantization errors
  • Specific instructions for complex models like YOLO

Changes

  1. Added configuration example showing how to specify different quantization types for different layers:

    "customized_quantize_layers": {
      "conv1": "int16",
      "conv2": "int16",
      "conv3": "uint8",
      "conv4": "uint8"
    }
  2. Included configuration considerations covering:

    • Layer name matching requirements
    • Data type compatibility between adjacent layers
    • Using entropy values to identify layers needing higher precision
    • Proper execution of hybrid quantization with the --hybrid flag
  3. Added troubleshooting guidance for common issues like "Inputs/Outputs data type not supported" errors, with specific mention of YOLO architecture considerations.
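The adjacency rule from item 2 can be illustrated with a small sketch. This is a hypothetical helper, not part of the ACUITY toolkit: it scans a customized_quantize_layers mapping in layer order and flags boundaries where the data type changes, which is the int16-to-uint8 situation reported in issue #1321. The layer names and the "same dtype on both sides" rule are illustrative assumptions.

```python
import json

def check_adjacent_dtypes(layer_order, quantize_layers):
    """Return (prev, curr) pairs whose quantization dtypes differ.

    Hypothetical helper for spotting boundaries that may trigger
    'Inputs/Outputs data type not supported' errors during conversion.
    """
    mismatches = []
    for prev, curr in zip(layer_order, layer_order[1:]):
        prev_dtype = quantize_layers.get(prev)
        curr_dtype = quantize_layers.get(curr)
        if prev_dtype and curr_dtype and prev_dtype != curr_dtype:
            mismatches.append((prev, curr))
    return mismatches

# Mirror of the example configuration from the documentation.
config = json.loads("""
{
  "customized_quantize_layers": {
    "conv1": "int16",
    "conv2": "int16",
    "conv3": "uint8",
    "conv4": "uint8"
  }
}
""")

layers = ["conv1", "conv2", "conv3", "conv4"]  # assumed execution order
print(check_adjacent_dtypes(layers, config["customized_quantize_layers"]))
# flags the conv2 -> conv3 boundary, where an int16 output feeds a uint8 layer
```

A check like this only approximates the toolkit's real constraints (some adjacent dtype transitions may be legal), but it makes the mismatch in issue #1321 easy to spot before running hybrid quantization.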

Verification

  • The changes have been applied to both Chinese (docs/) and English (i18n/en/) documentation files
  • The content is consistent with existing ACUITY toolkit documentation
  • The examples follow the same format and style as other configuration examples in the documentation
  • The guidance addresses the specific problem reported in issue #1321 (cubie/a7z/app-dev/npu-dev/cubie-acuity-usage)

Related Issues

Addresses documentation gap identified in #1321

- Add example JSON configuration for mixed precision quantization (int16/uint8)
- Include configuration notes on layer naming, data type compatibility, and entropy-based selection
- Add troubleshooting guidance for data type mismatch errors
- Address issue radxa-docs#1321 regarding hybrid quantization configuration for YOLOv26 models
