[infra] Enable CPU-only development: bindings stubs, profiler fix, conftest guard #2
Draft
tburt-nv wants to merge 4 commits into
Signed-off-by: Cursor Agent <cursoragent@cursor.com>
Enable importing tensorrt_llm and running CPU-compatible unit tests on machines without NVIDIA GPU hardware.

Changes:
- tensorrt_llm/bindings/: Python-only stub package providing minimal stand-in types for DataType, executor configs, BuildInfo, etc. Set TRT_LLM_NO_LIB_INIT=1 to skip loading compiled .so plugins.
- tensorrt_llm/profiler.py: Gracefully handle missing NVML driver (pynvml.NVMLError) instead of crashing at import time.
- tests/unittest/conftest.py: Skip CUDA memory cleanup when torch.cuda.is_available() is False.

Signed-off-by: Cursor Agent <cursoragent@cursor.com>
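The profiler change described above can be sketched roughly as follows. This is an illustration, not the actual profiler.py code: the helper name try_init_nvml and the logger name are assumptions, and the sketch also tolerates pynvml being absent so that it runs anywhere.

```python
import logging

logger = logging.getLogger("nvml_sketch")  # hypothetical logger name

try:
    import pynvml
except ImportError:  # pynvml itself may be missing on a CPU-only machine
    pynvml = None


def try_init_nvml() -> bool:
    """Return True if NVML initialised, False if it is unavailable.

    Hypothetical helper: it shows the pattern of catching
    pynvml.NVMLError instead of letting it propagate at import time.
    """
    if pynvml is None:
        return False
    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError as exc:
        # Raised, for example, when the NVIDIA driver library is absent.
        logger.warning("NVML unavailable, GPU profiling disabled: %s", exc)
        return False
    return True


NVML_AVAILABLE = try_init_nvml()
```

Callers would then check NVML_AVAILABLE before querying device memory, so importing the module never fails on a machine without a driver.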
Replace ~300 lines of hand-maintained boilerplate (duplicated enum values, repeated __init__ patterns) with a single 50-line _stubs.py module. Every stub submodule now delegates to the same metaclass that auto-vivifies child classes on attribute access.

Only one project-specific override remains: executor.LookaheadDecodingConfig, whose static default-value methods are called at class-definition time by llm_args.py.

Signed-off-by: Cursor Agent <cursoragent@cursor.com>
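For readers unfamiliar with the technique, here is a minimal sketch of a metaclass that auto-vivifies child classes on attribute access. It is not the contents of _stubs.py: the names StubMeta and the DataType stand-in below are illustrative assumptions.

```python
class StubMeta(type):
    """Metaclass whose classes create missing attributes as child classes."""

    def __getattr__(cls, name):
        # Only called when normal lookup fails. Refuse dunder names so that
        # protocols probing for special attributes see them as missing.
        if name.startswith("__") and name.endswith("__"):
            raise AttributeError(name)
        child = StubMeta(name, (), {"__qualname__": f"{cls.__qualname__}.{name}"})
        setattr(cls, name, child)  # cache so repeated access returns the same object
        return child


class DataType(metaclass=StubMeta):
    """Hypothetical stand-in for a compiled enum-like binding type."""


# Any attribute resolves to a stable placeholder class.
assert DataType.FLOAT is DataType.FLOAT
print(DataType.FLOAT.__qualname__)
```

Because each child is itself created by the same metaclass, nested access such as DataType.A.B also resolves, which is what lets one small module stand in for many hand-written stubs.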
Make the wheel importable on machines without an NVIDIA GPU driver:
- bindings/__init__.py: try importing real compiled _C extension first, fall back to Python stubs on ImportError (no TRT_LLM_NO_LIB_INIT needed)
- _common.py: _init() catches OSError from missing native .so libs and logs a warning instead of crashing
- build_wheel.py: place compiled pybind .so as bindings/_C.cpython-*.so (inside the bindings package) so it coexists with the stub fallback
- setup.py: update package_data glob and extract_from_precompiled to handle both new (bindings/_C.*) and legacy (bindings.*) wheel layouts

Signed-off-by: Cursor Agent <cursoragent@cursor.com>
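The try-real-then-fall-back import described for bindings/__init__.py can be sketched as below. The function load_bindings and the module names passed to it are hypothetical; the real package presumably does this inline with a relative import of _C.

```python
import importlib
import logging
from types import ModuleType
from typing import Tuple

logger = logging.getLogger("bindings_sketch")  # hypothetical logger name


def load_bindings(real_name: str, stub: ModuleType) -> Tuple[ModuleType, bool]:
    """Return (module, is_real): the compiled module if importable, else the stub.

    A compiled extension that is absent, or whose shared-library
    dependencies cannot be loaded, surfaces as ImportError.
    """
    try:
        return importlib.import_module(real_name), True
    except ImportError as exc:
        logger.warning("Falling back to Python stubs: %s", exc)
        return stub, False


# Usage with a module name that does not exist, standing in for a missing _C:
stubs = ModuleType("stub_bindings")
module, is_real = load_bindings("hypothetical_missing_extension_xyz", stubs)
```

Returning the is_real flag alongside the module lets later code decide whether GPU-dependent initialisation should run at all.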
Description
Makes the tensorrt_llm wheel importable on machines without an NVIDIA GPU driver, while preferring the real compiled C++ bindings when available.

Design
The bindings/ directory is now a Python package instead of a single compiled .so module. On import it follows a two-tier strategy:
- _C.so loads → full bindings
- _C.so fails (no libcuda.so) → stubs activate, _init() logs warning
- _C.so not present → stubs activate

Changes
- tensorrt_llm/bindings/__init__.py: try real _C extension, fall back to stubs. No TRT_LLM_NO_LIB_INIT env var needed.
- tensorrt_llm/_common.py: _init() catches OSError from missing native .so libs and logs a warning instead of crashing.
- scripts/build_wheel.py: compiled pybind .so is placed as bindings/_C.cpython-*.so (inside the package) so it coexists with the stub fallback.
- setup.py: updated package_data glob; extract_from_precompiled handles both new (bindings/_C.*) and legacy (bindings.*) wheel layouts by relocating legacy files automatically.
- tensorrt_llm/bindings/_stubs.py: DRY shared metaclass (unchanged from previous commit).
- tensorrt_llm/profiler.py: graceful pynvml.NVMLError handling (unchanged from previous commit).
- tests/unittest/conftest.py: torch.cuda.is_available() guard (unchanged from previous commit).

Test Coverage
35 CPU-compatible unit tests pass — no environment variables required:
unit_tests_final.log
All 21 pre-commit hooks pass.
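For context on why the tests run without a GPU, the conftest guard can be sketched as follows. The function name cleanup_cuda_memory is an assumption, and the real conftest.py may structure its cleanup differently, for example as a pytest fixture.

```python
import gc


def cleanup_cuda_memory() -> bool:
    """Release cached CUDA memory if a GPU is usable.

    Returns True if cleanup ran, False if it was skipped.
    Hypothetical helper illustrating the torch.cuda.is_available() guard.
    """
    try:
        import torch
    except ImportError:
        return False  # no torch at all: nothing to clean up
    if not torch.cuda.is_available():
        return False  # CPU-only machine: skip CUDA calls entirely
    gc.collect()
    torch.cuda.empty_cache()
    return True


ran_cleanup = cleanup_cuda_memory()
```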