refactor: split RL trainer into optional in-repo verifiers-rl package #843
Motivation
- […] `verifiers` by moving RL training/inference code into an optional package.
- […] (`vf-rl`, `vf-train`, `vf-vllm`) while making RL tooling installable on demand.
Description
- […] `packages/verifiers-rl` with its own `pyproject.toml`, `verifiers_rl` module, RL code (`rl.trainer`, `rl.inference`, `scripts`) and entrypoints (`vf-rl`, `vf-train`, `vf-vllm`).
- […] `verifiers_rl.rl.*` and update internal imports in the new package accordingly.
- […] `verifiers/rl/*` that proxy to `verifiers_rl` and emit clear install guidance (`uv add verifiers-rl`) when the optional package is absent.
- […] `verifiers/scripts/*` with thin wrappers that import the new package and show install hints when missing, and update `pyproject.toml` to point `vf-vllm` at the shim wrapper.
- […] `verifiers.__getattr__` lazy-import mapping to point RL symbols to `verifiers_rl` and to raise a package-specific install hint (`verifiers-rl`) for RL-related names.
- […] `rl` extra and RL-specific build settings from core `pyproject.toml` and update documentation to recommend `uv add verifiers-rl`.
- […] `.github/workflows/publish-verifiers-rl.yml` to support programmatic builds/publishing of the new package.
Testing
- […] `uv run ruff check --fix .` which completed with all checks passing.
- […] `import verifiers` works in a fresh virtualenv via `uv run python -c "import verifiers; print('ok')"`.
- […] `verifiers.RLConfig`/`verifiers.RLTrainer` and observing the `AttributeError` messaging directing users to `verifiers-rl`.
- […] `uv build packages/verifiers-rl` successfully (wheel and sdist produced).
- […] `uv pip install -e packages/verifiers-rl` in this environment and the editable install failed while building `flash-attn` in isolation due to a missing `torch` build-time requirement; this is an environment/build isolation issue and not a correctness or lint failure of the refactor.
Codex Task
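The lazy-import behavior described above (RL names resolved through a module-level `__getattr__`, with an `AttributeError` carrying an install hint when `verifiers-rl` is absent) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper name `make_lazy_getattr`, the mapping format, and the exact module paths for `RLConfig` and `RLTrainer` are assumptions.

```python
import importlib

# Hypothetical mapping from public name to "module:attribute".
# The real mapping in verifiers may be organized differently.
RL_LAZY_IMPORTS = {
    "RLConfig": "verifiers_rl.rl.trainer:RLConfig",
    "RLTrainer": "verifiers_rl.rl.trainer:RLTrainer",
}


def make_lazy_getattr(module_name, lazy_imports, install_hint):
    """Build a PEP 562 module-level __getattr__ for optional symbols."""

    def __getattr__(name):
        target = lazy_imports.get(name)
        if target is None:
            # Unknown name: behave like a normal missing attribute.
            raise AttributeError(
                f"module {module_name!r} has no attribute {name!r}"
            )
        module_path, attr = target.split(":")
        try:
            module = importlib.import_module(module_path)
        except ImportError as exc:
            # Optional package is missing: point the user at the install command.
            raise AttributeError(
                f"{name!r} requires an optional package that is not installed. "
                f"Install it with: {install_hint}"
            ) from exc
        return getattr(module, attr)

    return __getattr__
```

In a package `__init__.py` this would be wired up as `__getattr__ = make_lazy_getattr(__name__, RL_LAZY_IMPORTS, "uv add verifiers-rl")`, so that `import verifiers` stays cheap and the RL package is only imported when an RL symbol is first accessed.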
Note
Medium Risk
Moderate risk due to packaging/import path refactors and new release automation; breakage would primarily surface as missing RL symbols/CLIs or incorrect install guidance rather than core runtime behavior.
Overview
Extracts the RL trainer/inference implementation into a new optional package, `packages/verifiers-rl`, with its own `pyproject.toml`, `verifiers_rl` module, scripts (`vf-rl`, `vf-train`, `vf-vllm`), and docs.

Core `verifiers` drops the `rl` extra and rewires `verifiers.__getattr__` lazy imports to target `verifiers_rl`, emitting a dedicated install hint (`uv add verifiers-rl`) when RL symbols are accessed without the optional package. Existing `verifiers/rl/*` modules and `verifiers/scripts/{rl,train}.py` are replaced with thin proxy shims, and `vf-vllm` now points to a new `verifiers/scripts/vllm.py` wrapper.

Adds a GitHub Actions workflow to build/publish `verifiers-rl` on `verifiers-rl-v*` tags (with tag↔version validation), updates style CI to `uv sync` without RL extras, and refreshes training docs to point users at `verifiers-rl` and the new source locations.

Written by Cursor Bugbot for commit 3dbac23.