feat: Make vLLM reasoning tags configurable from YAML by hokuyama0106 · Pull Request #1402 · NVIDIA-NeMo/Gym

hokuyama0106 · 2026-05-24T13:53:38Z

This updates responses_api_models/vllm_model/app.py so reasoning wrappers/parsers no longer assume <think>...</think>. The tag pair is now configurable via model YAML while preserving the current <think> defaults for existing configs.

Agent-Logs-Url: https://github.com/hokuyama0106/Gym/sessions/4fd844ac-7f43-4ab8-92cd-2ee453b53c9d Co-authored-by: hokuyama0106 <50006381+hokuyama0106@users.noreply.github.com>

copy-pr-bot · 2026-05-24T13:53:41Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Agent-Logs-Url: https://github.com/hokuyama0106/Gym/sessions/67b6eeed-667b-4547-a561-f280274265c0 Co-authored-by: hokuyama0106 <50006381+hokuyama0106@users.noreply.github.com>

cmunley1 · 2026-05-24T20:18:46Z

Did you test this on older vllm? I think these were only introduced in vllm 0.19

Also, have you tried thinking token budget with these changes?

https://docs.vllm.ai/en/latest/features/reasoning_outputs/#thinking-budget-control

hokuyama0106 · 2026-05-25T09:29:30Z

Hello, thank you for your reply.

Did you test this on older vllm? I think these were only introduced in vllm 0.19

I am using trl-nemo gym integration and the vllm server is not from the pure vllm liblrary.
The changes of this PR are just parsing the reasoning tags from the server outputs so I think it doesn't conflict any other features.
If you have any points that I should check, please give me those.

https://github.com/huggingface/trl/blob/v0.28.0/trl/scripts/vllm_serve.py
https://huggingface.co/docs/trl/nemo_gym

Also, have you tried thinking token budget with these changes?

No, we haven't, but I can implement it in this PR if you want.

cmunley1 · 2026-05-25T20:47:42Z

The integration with TRL is currently unstable, I would recommend using Nemo RL or verl, sorry for the redirection.

hokuyama0106 · 2026-05-26T01:22:47Z

Ok, I will try it.
Should we temporarily keep the current code and close this PR even if the think tags are hardcoded and there is a TODO comment?

https://github.com/NVIDIA-NeMo/Gym/blob/main/responses_api_models/vllm_model/app.py#L626-L630
https://github.com/NVIDIA-NeMo/Gym/blob/main/responses_api_models/vllm_model/app.py#L916-L917

cmunley1 · 2026-05-26T02:49:44Z

Do you have a use case where there are different think tags?

I do think it would be good to update for compatibility with vllm 0.19, just need to be careful as this would affect all environments. https://docs.vllm.ai/en/latest/features/reasoning_outputs/#thinking-budget-control

cc @bxyu-nvidia

hokuyama0106 · 2026-05-26T04:56:10Z

Do you have a use case where there are different think tags?

The current vllm server from the trl integration only outputs the raw output including reasoning tags.
We are training models having different reasoning tags from <think> with trl so we use the modification as this PR.

I do think it would be good to update for compatibility with vllm 0.19, just need to be careful as this would affect all environments.

I got it. I fully agree with your suggestion.
It does make more sense to implement the thinking-budget-control in this repo and modify the post-processing of the trl vllm server if we want to use trl.
In that case, I will close this PR and try to update a function with the thinking-budget-control.

Copilot AI and others added 4 commits May 24, 2026 13:40

Initial plan

16120d7

Make vllm reasoning tags configurable

c790d41

Agent-Logs-Url: https://github.com/hokuyama0106/Gym/sessions/4fd844ac-7f43-4ab8-92cd-2ee453b53c9d Co-authored-by: hokuyama0106 <50006381+hokuyama0106@users.noreply.github.com>

Address review feedback for reasoning tag config

2a71b86

Agent-Logs-Url: https://github.com/hokuyama0106/Gym/sessions/4fd844ac-7f43-4ab8-92cd-2ee453b53c9d Co-authored-by: hokuyama0106 <50006381+hokuyama0106@users.noreply.github.com>

Finalize configurable reasoning tag update

0388ae0

Agent-Logs-Url: https://github.com/hokuyama0106/Gym/sessions/4fd844ac-7f43-4ab8-92cd-2ee453b53c9d Co-authored-by: hokuyama0106 <50006381+hokuyama0106@users.noreply.github.com>

Add explicit reasoning tags to vLLM configs

c5fac38

Agent-Logs-Url: https://github.com/hokuyama0106/Gym/sessions/67b6eeed-667b-4547-a561-f280274265c0 Co-authored-by: hokuyama0106 <50006381+hokuyama0106@users.noreply.github.com>

hokuyama0106 changed the title ~~Make vLLM reasoning tags configurable from YAML~~ feat: Make vLLM reasoning tags configurable from YAML May 24, 2026

hokuyama0106 marked this pull request as ready for review May 24, 2026 14:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Make vLLM reasoning tags configurable from YAML#1402

feat: Make vLLM reasoning tags configurable from YAML#1402
hokuyama0106 wants to merge 5 commits into
NVIDIA-NeMo:mainfrom
hokuyama0106:update-reasoning-tags-config

hokuyama0106 commented May 24, 2026 •

edited

Loading

Uh oh!

copy-pr-bot Bot commented May 24, 2026

Uh oh!

cmunley1 commented May 24, 2026

Uh oh!

hokuyama0106 commented May 25, 2026 •

edited

Loading

Uh oh!

cmunley1 commented May 25, 2026

Uh oh!

hokuyama0106 commented May 26, 2026 •

edited

Loading

Uh oh!

cmunley1 commented May 26, 2026

Uh oh!

hokuyama0106 commented May 26, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

hokuyama0106 commented May 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

copy-pr-bot Bot commented May 24, 2026

Uh oh!

cmunley1 commented May 24, 2026

Uh oh!

hokuyama0106 commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cmunley1 commented May 25, 2026

Uh oh!

hokuyama0106 commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cmunley1 commented May 26, 2026

Uh oh!

hokuyama0106 commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

hokuyama0106 commented May 24, 2026 •

edited

Loading

hokuyama0106 commented May 25, 2026 •

edited

Loading

hokuyama0106 commented May 26, 2026 •

edited

Loading

hokuyama0106 commented May 26, 2026 •

edited

Loading