feat: Make vLLM reasoning tags configurable from YAML#1402
hokuyama0106 wants to merge 5 commits into
Agent-Logs-Url: https://github.com/hokuyama0106/Gym/sessions/4fd844ac-7f43-4ab8-92cd-2ee453b53c9d Co-authored-by: hokuyama0106 <50006381+hokuyama0106@users.noreply.github.com>
Agent-Logs-Url: https://github.com/hokuyama0106/Gym/sessions/67b6eeed-667b-4547-a561-f280274265c0 Co-authored-by: hokuyama0106 <50006381+hokuyama0106@users.noreply.github.com>
|
Did you test this on older vllm? I think these were only introduced in vllm 0.19. Also, have you tried the thinking token budget with these changes? https://docs.vllm.ai/en/latest/features/reasoning_outputs/#thinking-budget-control |
|
Hello, thank you for your reply.
I am using the trl-nemo gym integration, and the vllm server does not come from the pure vllm library. https://github.com/huggingface/trl/blob/v0.28.0/trl/scripts/vllm_serve.py
No, we haven't, but I can implement it in this PR if you want. |
|
The integration with TRL is currently unstable, so I would recommend using Nemo RL or verl. Sorry for the redirection. |
|
Ok, I will try it. https://github.com/NVIDIA-NeMo/Gym/blob/main/responses_api_models/vllm_model/app.py#L626-L630 |
|
Do you have a use case where there are different think tags? I do think it would be good to update for compatibility with vllm 0.19, just need to be careful as this would affect all environments. https://docs.vllm.ai/en/latest/features/reasoning_outputs/#thinking-budget-control cc @bxyu-nvidia |
The current vllm server from the trl integration only outputs the raw output including reasoning tags.
I got it. I fully agree with your suggestion. |
This updates `responses_api_models/vllm_model/app.py` so reasoning wrappers/parsers no longer assume `<think>...</think>`. The tag pair is now configurable via model YAML while preserving the current `<think>` defaults for existing configs.
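For readers following the description above, here is a minimal sketch of what a configurable tag pair could look like. It is not the code from this PR: `ReasoningTags`, `split_reasoning`, and `wrap_reasoning` are hypothetical names chosen for illustration, and the real field names in the model YAML may differ. The only behavior taken from the description is that the tags are configurable and default to `<think>` / `</think>`.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ReasoningTags:
    # Hypothetical config object; defaults preserve the existing <think> behavior.
    start: str = "<think>"
    end: str = "</think>"


def split_reasoning(text: str, tags: ReasoningTags = ReasoningTags()) -> tuple[str, str]:
    """Split raw model output into (reasoning, answer) using the configured tags."""
    # re.escape keeps tags containing regex metacharacters (e.g. "[", "|") literal.
    pattern = re.compile(
        re.escape(tags.start) + r"(.*?)" + re.escape(tags.end), re.DOTALL
    )
    match = pattern.search(text)
    if match is None:
        # No reasoning block found: the whole output is the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[: match.start()] + text[match.end():]).strip()
    return reasoning, answer


def wrap_reasoning(reasoning: str, answer: str, tags: ReasoningTags = ReasoningTags()) -> str:
    """Re-attach reasoning to an answer with the configured tags."""
    if not reasoning:
        return answer
    return f"{tags.start}{reasoning}{tags.end}{answer}"
```

With this shape, an existing config that sets nothing keeps parsing `<think>...</think>`, while a model that emits, say, `<reason>...</reason>` only needs `ReasoningTags("<reason>", "</reason>")` built from its YAML values.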