This repository was archived by the owner on Feb 20, 2026. It is now read-only.
nlzy/vllm-gfx906
Folders and files
| Name | Name | Last commit date | ||
|---|---|---|---|---|
Repository files navigation
vLLM for gfx906
===================
ARICHIVED
-------------------
Over the past year, almost all open-weight models have grown increasingly large,
far exceeding my capacity for development and testing. Coupled with the rising
price of MI50 GPUs, I am reluctant to keep adding more cards. I have now
archived this repository; everyone is welcome to fork it.
ORIGINAL README
-------------------
This is a modified version of vLLM, works with (and only works with) AMD gfx906
GPUs such as Radeon VII / Radeon Pro VII / Instinct MI50 / Instinct MI60.
This fork was (and still is) just a passion project shared for fun. I won't be
putting much effort into it. Use it at your own risk, especially please don't
use it as a reference for your GPU purchasing decisions.
RUN WITH DOCKER
-------------------
Please install ROCm 6.3 first, only kernel-mode driver is required. Refer to
the official documentation by AMD.
```
docker pull nalanzeyu/vllm-gfx906
docker run -it --rm --shm-size=2g --device=/dev/kfd --device=/dev/dri \
--group-add video -p 8000:8000 -v <YOUR_MODEL_PATH>:/model \
nalanzeyu/vllm-gfx906 vllm serve /model
```
SUPPORT QUANTIZATIONS
-------------------
See #29
GPTQ and AWQ are the first recommended quantization formats.
vLLM's llm-compressor with W4A16 INT format is also recommended. Other formats
in llm-compressor are not support.
All MoE quantization models are significantly slow, and all unquantized models
are slightly slow. Not recommended to use.
BUILD
-------------------
Please install ROCm 6.3 first. You need to install both kernel-mode driver and
ROCm packages. Refer to the official documentation by AMD.
You also need python-venv / python-dev, on Debian / Ubuntu use this command:
$ sudo apt install python3-venv python3-dev
You also need triton-gfx906 v3.5.0+gfx906 see:
https://github.com/nlzy/triton-gfx906/tree/v3.5.0+gfx906
```
cd vllm-gfx906
python3 -m venv vllmenv
source vllmenv/bin/activate
pip3 install torch==2.9 torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.3
pip3 install -r requirements/rocm-build.txt -r requirements/rocm.txt
pip3 install --no-build-isolation --no-deps -v .
```
CREDITS
-------------------
https://github.com/Said-Akbar/vllm-rocm