Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
196 changes: 33 additions & 163 deletions versal_2ve/examples/tutorials/resnet18_bf16/README.md
Original file line number Diff line number Diff line change
@@ -1,10 +1,3 @@
<table class="sphinxhide" width="100%">
<tr width="100%">
<td align="center"><img src="https://raw.githubusercontent.com/Xilinx/Image-Collateral/main/xilinx-logo.png" width="30%"/><h1>Vitis AI Development</h1>
</td>
</tr>
</table>

# Getting Started with Vitis AI: ResNet-18 End-to-End Flow

## Introduction
Expand Down Expand Up @@ -35,26 +28,27 @@ Before starting Docker, get the tutorial repository and adjust the access permis
chmod -R a+w <path/to/resnet18_bf16>
```

Load the docker image:
Pull the Docker image on a system with Docker installed:

```
docker load -i <docker_image_file>.tgz
docker pull amdih/vitis-ai:versal-2ve-release_v6.2_0612
```

Run `docker images` to verify docker REPOSITORY, IMAGEID and TAG information.

|REPOSITORY | TAG | IMAGE ID | CREATED | SIZE |
|--------------------|-------------------|-------------|---------------|--------|
|vitis_ai_2ve_docker | release_v6.2 | ??????? | xx hours ago | 39.1GB |
|REPOSITORY | TAG | IMAGE ID | CREATED | SIZE |
|--------------------|------------------------------|-------------|---------------|--------|
|amdih/vitis-ai | versal-2ve-release_v6.2_0612 | ?????? | xx hours ago | xx.xGB |

Start the docker:
Star the docker:

```
docker run -it --network host \
-v /path/to/your/license:/usr/licenses \
-v $PWD/resnet18_bf16:/resnet18_bf16 \
--rm vitis_ai_2ve_docker:release_v6.2 "bash"
-v $PWD/resnet50_quark:/resnet50_quark \
--rm amdih/vitis-ai:versal-2ve-release_v6.2_0612 "bash"
```

## Vitis AI Compilation & Deployment Flow

1. Inside the docker, change directory to the tutorial folder, install python packages required by the example, and export the ResNet-18 ONNX model:
Expand Down Expand Up @@ -99,10 +93,10 @@ To get more details about compilation results you can display the content of the
```
--------- Final Summary of VAIML Pass ----------
OS: Linux X64
VAIP commit: 744227ab2a0fddec1eccdfe04ca222afd339f53f
VAIP commit: ......
Model: ....../models/resnet18.a1_in1k.onnx
Model signature: 41d764d4ef1d716a260bc7b2b4e07ff1
Device: ve2-xc2ve3858
Device: ve2
Model data type: float32
Device data type: bfloat16
Number of operators in the model: 49
Expand All @@ -121,17 +115,13 @@ Subgraph vaiml_par_0 stats:
Type: npu
Operators: 49 (100.000%)
GOPs : 3.644 (100.000%) OPs: 3,643,881,552
fp32 ops %: 99.731
```

3. Refer to Vitis AI User Guide for Versal AI Edge Series Gen 2, boot up the AIE-ML_v2 board, and run following commands in the board to setup environment:
3. Refer to Vitis AI User Guide for Versal AI Edge Series Gen 2, boot up the AIE-ML_v2 board, and setup environment:

```
sudo su # To avoid permission issues while creating the hw context
echo 1 > /sys/module/rcupdate/parameters/rcu_cpu_stall_suppress
export XRT_AIARM=true
export LD_LIBRARY_PATH=/usr/lib/python3.12/site-packages/voe/lib/:/usr/lib/python3.12/site-packages/flexmlrt/lib/
export XLNX_ENABLE_CACHE=0
export XRT_ELF_FLOW=1
export LD_LIBRARY_PATH=/usr/lib/python3.12/site-packages/voe/lib:/usr/lib/python3.12/site-packages/flexmlrt/lib:/usr/lib/python3.12/site-packages/onnxruntime/capi
```

4. Run the inference on the board. The working directory can be mounted on the board or copied to the board by scp:
Expand All @@ -150,10 +140,10 @@ The script runs four inferences of the model and displays messages similar to th

```
Running 4 inferences, comparing CPU and NPU outputs
Iteration 1: Max absolute difference = 0.198444, Root mean squared error = 0.081654
Iteration 2: Max absolute difference = 0.154286, Root mean squared error = 0.068371
Iteration 3: Max absolute difference = 0.210051, Root mean squared error = 0.081457
Iteration 4: Max absolute difference = 0.196577, Root mean squared error = 0.063051
Iteration 1: Max absolute difference = 0.228190, Root mean squared error = 0.085001
Iteration 2: Max absolute difference = 0.231800, Root mean squared error = 0.086444
Iteration 3: Max absolute difference = 0.248591, Root mean squared error = 0.092887
Iteration 4: Max absolute difference = 0.177171, Root mean squared error = 0.069008
Inference Done!
```

Expand All @@ -167,20 +157,22 @@ python runmodel.py
The output includes the number of operators offloaded to the NPU and the number of NPU-executed subgraphs:

```
I20250529 19:23:43.187124 1265 stat.cpp:193] [Vitis AI EP] No. of Operators :
I20250529 19:23:43.187179 1265 stat.cpp:204] VAIML 49
I20250529 19:23:43.187194 1265 stat.cpp:213]
I20250529 19:23:43.187206 1265 stat.cpp:218] [Vitis AI EP] No. of Subgraphs :
I20250529 19:23:43.187219 1265 stat.cpp:226] NPU 1
I20250529 19:23:43.187227 1265 stat.cpp:229] Actually running on NPU 1
I20250529 19:23:43.188418 1265 vitisai_compile_model.cpp:1477] AVG CPU Usage 95.4545%
I20250529 19:23:43.188459 1265 vitisai_compile_model.cpp:1478] Peak Working Set size 213.195 MB
[2025-05-29 19:23:43.261] [console] [info] [FLEXMLRT] FlexMLClient.cpp:1269 FlexMLRT Git Hash: 512d4e65
I20260608 01:57:50.007778 1212 stat.cpp:198] [Vitis AI EP] No. of Operators :
I20260608 01:57:50.007839 1212 stat.cpp:198] VAIML 49
I20260608 01:57:50.007958 1212 stat.cpp:198]
I20260608 01:57:50.007978 1212 stat.cpp:198] [Vitis AI EP] No. of Subgraphs :
I20260608 01:57:50.007992 1212 stat.cpp:198] NPU 1
I20260608 01:57:50.008001 1212 stat.cpp:198] Actually running on NPU 1
......
Running 4 inferences, comparing CPU and NPU outputs
Iteration 1: Max absolute difference = 0.193388, Root mean squared error = 0.083394
Iteration 2: Max absolute difference = 0.241203, Root mean squared error = 0.090799
Iteration 3: Max absolute difference = 0.190464, Root mean squared error = 0.080506
Iteration 4: Max absolute difference = 0.217875, Root mean squared error = 0.087615
......
Iteration 1: Max absolute difference = 0.211893, Root mean squared error = 0.075283
......
Iteration 2: Max absolute difference = 0.220449, Root mean squared error = 0.082772
......
Iteration 3: Max absolute difference = 0.169577, Root mean squared error = 0.055290
......
Iteration 4: Max absolute difference = 0.223119, Root mean squared error = 0.077367
Inference Done!
```

Expand All @@ -197,128 +189,6 @@ And then run the inference. The output contains information as follows:
[xrt_xdna] DEBUG: Partition Created with start_col 0 num_columns 4 partition_id 1024
```

## Vitis AI Flow Essential

This section covers some essential concepts in Vitis AI model compilation and inference. By learning the concepts and example codes, the flow can be extended to other ONNX models.

1. Onnx model is used as input to the model compilation, which tries to accelerate the operators in NPU. So, prepare the ONNX model in ML frameworks.

2. Models are compiled for the NPU by creating an ONNX inference session using the Vitis AI Execution Provider (VAI EP). The example python code can be found in `compile.py`.

```
import onnxruntime

provider_options_dict = {
"config_file": 'vitisai_config.json',
"cache_dir": 'my_cache_dir',
"cache_key": 'resnet18.a1_in1k',
"log_level": 'info',
"target": 'VAIML'
}

print(f"Creating ORT inference session for model models/resnet18.a1_in1k.onnx")
session = onnxruntime.InferenceSession(
'models/resnet18.a1_in1k.onnx',
providers=["VitisAIExecutionProvider"],
provider_options=[provider_options_dict]
)
```

The example configuration file `vitisai_config.json` contains options for Vitis AI compiler:

```
{
"passes": [
{
"name": "init",
"plugin": "vaip-pass_init"
},
{
"name": "vaiml_partition",
"plugin": "vaip-pass_vaiml_partition",
"vaiml_config":
{
"device": "ve2-xc2ve3858",
"optimize_level": 2,
"logging_level": "info",
"keep_outputs": true,
"threshold_gops_percent": 20
}
}
],
"target": "VAIML",
"targets": [
{
"name": "VAIML",
"pass": [
"init",
"vaiml_partition"
]
}
]
}
```

The value `ve2-xc2ve3858` for the `device` option selects the VEK385 part on Versal AI Edge Series Gen 2 (AIE-ML_v2) for Vitis AI 6.2 compilation.

3. To execute the compiled model on hardware, transfer the compiled model artifacts and the original ONNX model file to the target board. The compiled ONNX graph is automatically partitioned into multiple subgraphs by the VitisAI Execution Provider (EP). The subgraph(s) containing operators supported by the NPU are executed on the NPU. The remaining subgraph(s) are executed on the CPU. This graph partitioning and deployment technique across CPU and NPU is fully automated by the VAI EP and is totally transparent to the end-user.

Model execution is performed using a Python script that establishes an ONNX Runtime (ORT) inference session. This session is initialized with the target ONNX model and configured to utilize the Vitis AI Execution Provider (EP). Upon execution, the ORT session leverages the Vitis AI EP, which utilizes the compiled model binaries in the specified directory and deploys the ONNX subgraph(s) on the NPU and the CPU.

The example python code for deploying on the hardware can be found in `runmodel.py`. It creates `InferenceSession` for CPU and NPU and runs inferences. And then compute the RMSE (Root Mean Square Error) between the CPU and NPU results:

```
import numpy as np
import onnxruntime as ort

provider_options_dict = {
"config_file": 'vitisai_config.json',
"cache_dir": 'my_cache_dir',
"cache_key": 'resnet18.a1_in1k',
"log_level": 'info',
"target": 'VAIML',
}

print(f"Creating ORT inference session for model models/resnet18.a1_in1k.onnx")

onnx_model="models/resnet18.a1_in1k.onnx"
# CPU session to compute reference values
cpu_session = ort.InferenceSession(
onnx_model,
)
# NPU session
npu_session = ort.InferenceSession(
onnx_model,
providers=["VitisAIExecutionProvider"],
provider_options=[provider_options_dict]
)

num_iter = 4
print(f"Running {num_iter} inferences, comparing CPU and NPU outputs")
for i in range(num_iter):
# Generate random data
input_data = {}
for input in npu_session.get_inputs():
fixed_shape = [1 if isinstance(dim, str) else dim for dim in input.shape]
input_data[input.name] = np.random.rand(*fixed_shape).astype(np.float32)

# Compute CPU results (reference values)
cpu_outputs = cpu_session.run(None, input_data)
# Compute NPU results
try:
npu_outputs = npu_session.run(None, input_data)
except Exception as e:
print(f"Failed to run on NPU: {e}")
sys.exit(1)

# Compare CPU and NPU results
max_diff = np.max(np.abs(cpu_outputs[0] - npu_outputs[0]))
rmse = np.sqrt(np.mean((cpu_outputs[0] - npu_outputs[0]) ** 2))
print(f'Iteration {i+1:3d}: Max absolute difference = {max_diff:.6f}, Root mean squared error = {rmse:.6f}')

print("Inference Done!")
```

## Summary

By completing this tutorial, you learned:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -11,9 +11,11 @@
{
"device": "ve2-xc2ve3858",
"optimize_level": 2,
"logging_level": "info",
"keep_outputs": true,
"threshold_gops_percent": 20
"logging_level": "info",
"threshold_gops_percent": 20,
"dp_size": 1,
"tp_size": 1
}
}
],
Expand Down