[infra] Enable CPU-only development: bindings stubs, profiler fix, conftest guard #2

Draft
tburt-nv wants to merge 4 commits into user/tburt/report_conflict from cursor/dev-env-setup-c406
@tburt-nv tburt-nv commented May 19, 2026

Description

Makes the tensorrt_llm wheel importable on machines without an NVIDIA GPU driver, while preferring the real compiled C++ bindings when available.

Design

The bindings/ directory is now a Python package instead of a single compiled .so module. On import it follows a two-tier strategy:

bindings/__init__.py
  ├── try:  from ._C import *          ← real pybind11 .so (GPU machine)
  └── except ImportError:              ← Python-only stubs  (CPU machine)
          __getattr__ = make_module_getattr(...)
| Environment | What happens |
| --- | --- |
| GPU machine with compiled wheel | Real _C.so loads → full bindings |
| CPU machine with compiled wheel | _C.so fails to load (no libcuda.so) → stubs activate, _init() logs warning |
| CPU machine without wheel libs | No _C.so present → stubs activate |
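
The two-tier import above can be sketched in isolation. This is a minimal illustration, not the PR's actual code: `load_bindings`, the `hypothetical_pkg._C` module name, and the body of `make_module_getattr` are assumptions made for the example. It relies on the module-level `__getattr__` hook from PEP 562, and on the fact that importing an extension module whose shared-library dependencies are missing raises `ImportError`.

```python
import importlib
import types


def make_module_getattr(module_name):
    """Hypothetical stand-in for the PR's helper: build a PEP 562 module __getattr__."""

    def __getattr__(name):
        # Leave dunder lookups alone so introspection sees a normal AttributeError.
        if name.startswith("__"):
            raise AttributeError(name)
        # Create a placeholder class for any other missing attribute.
        return type(name, (), {"__module__": module_name})

    return __getattr__


def load_bindings(real_module="hypothetical_pkg._C"):
    """Return (module, using_stubs): the real extension if importable, else a stub module."""
    try:
        return importlib.import_module(real_module), False
    except ImportError:
        # Covers both "no such module" and "shared library failed to load".
        stub = types.ModuleType("bindings_stub")
        stub.__getattr__ = make_module_getattr(stub.__name__)
        return stub, True
```

On a machine where the extension cannot be imported, `load_bindings()` returns a stub module on which any public attribute, for example `DataType`, resolves to a placeholder class instead of raising.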

Changes

tensorrt_llm/bindings/__init__.py — try real _C extension, fall back to stubs. No TRT_LLM_NO_LIB_INIT env var needed.

tensorrt_llm/_common.py: _init() catches OSError from missing native .so libs and logs a warning instead of crashing.

scripts/build_wheel.py — compiled pybind .so is placed as bindings/_C.cpython-*.so (inside the package) so it coexists with the stub fallback.

setup.py — updated package_data glob; extract_from_precompiled handles both new (bindings/_C.*) and legacy (bindings.*) wheel layouts by relocating legacy files automatically.

tensorrt_llm/bindings/_stubs.py — DRY shared metaclass (unchanged from previous commit).

tensorrt_llm/profiler.py — graceful pynvml.NVMLError handling (unchanged from previous commit).

tests/unittest/conftest.py: torch.cuda.is_available() guard (unchanged from previous commit).
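
The `_init()` change listed above follows a common pattern: attempt to load each native library, and downgrade a load failure from a crash to a warning. The sketch below shows that pattern only. The function name `init_native_libs`, the logger name, and the use of `ctypes.CDLL` are assumptions for illustration; the PR description does not say how `_init()` loads its libraries.

```python
import ctypes
import logging

logger = logging.getLogger("cpu_only_sketch")


def init_native_libs(lib_paths):
    """Try to load each shared library; log a warning and continue on failure."""
    loaded = []
    for path in lib_paths:
        try:
            loaded.append(ctypes.CDLL(path))
        except OSError as exc:
            # A missing file or an unresolved dependency (e.g. no GPU driver)
            # surfaces as OSError; keep going so the package stays importable.
            logger.warning("Skipping native library %s: %s", path, exc)
    return loaded
```

Called with a path that does not exist, the function returns an empty list and emits one warning rather than raising.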

Test Coverage

35 CPU-compatible unit tests pass — no environment variables required:

python3 -m pytest \
  tests/unittest/llmapi/test_reasoning_parser.py \
  tests/unittest/llmapi/test_build_cache.py \
  tests/unittest/others/test_mapping.py \
  tests/unittest/trt/quantization/test_mode.py \
  tests/unittest/others/test_kv_cache_manager.py \
  tests/unittest/others/test_module.py -v

unit_tests_final.log

All 21 pre-commit hooks pass.

Signed-off-by: Cursor Agent <cursoragent@cursor.com>
Enable importing tensorrt_llm and running CPU-compatible unit tests on
machines without NVIDIA GPU hardware. Changes:

- tensorrt_llm/bindings/: Python-only stub package providing minimal
  stand-in types for DataType, executor configs, BuildInfo, etc.
  Set TRT_LLM_NO_LIB_INIT=1 to skip loading compiled .so plugins.

- tensorrt_llm/profiler.py: Gracefully handle missing NVML driver
  (pynvml.NVMLError) instead of crashing at import time.

- tests/unittest/conftest.py: Skip CUDA memory cleanup when
  torch.cuda.is_available() is False.

Signed-off-by: Cursor Agent <cursoragent@cursor.com>
cursor (bot) changed the title from "[infra] Add AGENTS.md with Cursor Cloud development environment instructions" to "[infra] Enable CPU-only development: bindings stubs, profiler fix, conftest guard" on May 19, 2026
Replace ~300 lines of hand-maintained boilerplate (duplicated enum
values, repeated __init__ patterns) with a single 50-line _stubs.py
module.  Every stub submodule now delegates to the same metaclass
that auto-vivifies child classes on attribute access.

Only one project-specific override remains:
  executor.LookaheadDecodingConfig — its static default-value methods
  are called at class-definition time by llm_args.py.

Signed-off-by: Cursor Agent <cursoragent@cursor.com>
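
For readers unfamiliar with the technique named in the commit message above, here is a minimal sketch of a metaclass that auto-vivifies child classes on attribute access. It is not the contents of `_stubs.py`: the names `StubMeta` and the demo `executor` class are assumptions chosen for the example.

```python
class StubMeta(type):
    """Metaclass whose classes create child stub classes on attribute access."""

    def __getattr__(cls, name):
        # Only reached when normal lookup fails. Dunder names still raise so
        # that introspection helpers see an ordinary AttributeError.
        if name.startswith("__") and name.endswith("__"):
            raise AttributeError(name)
        child = StubMeta(name, (), {"__qualname__": f"{cls.__qualname__}.{name}"})
        # Cache the child so repeated access returns the same class object.
        setattr(cls, name, child)
        return child


class executor(metaclass=StubMeta):
    """Hypothetical stub namespace standing in for a bindings submodule."""
```

With this in place, `executor.KvCacheConfig` and `executor.KvCacheConfig.SomeNestedType` both resolve to placeholder classes without any hand-written definitions. A class that needs real behaviour at class-definition time, as the commit message describes for `executor.LookaheadDecodingConfig`, would still have to be written out explicitly.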
Make the wheel importable on machines without an NVIDIA GPU driver:

- bindings/__init__.py: try importing real compiled _C extension first,
  fall back to Python stubs on ImportError (no TRT_LLM_NO_LIB_INIT needed)

- _common.py: _init() catches OSError from missing native .so libs and
  logs a warning instead of crashing

- build_wheel.py: place compiled pybind .so as bindings/_C.cpython-*.so
  (inside the bindings package) so it coexists with the stub fallback

- setup.py: update package_data glob and extract_from_precompiled to
  handle both new (bindings/_C.*) and legacy (bindings.*) wheel layouts

Signed-off-by: Cursor Agent <cursoragent@cursor.com>
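
The legacy-layout handling described for `extract_from_precompiled` amounts to renaming a top-level `bindings.*` shared object to `_C.*` inside the `bindings/` package. A rough sketch of that relocation follows, assuming a `.so` suffix and a flat package directory. The helper name `relocate_legacy_bindings` is hypothetical and the real `setup.py` logic may differ.

```python
import shutil
from pathlib import Path


def relocate_legacy_bindings(pkg_dir):
    """Move legacy <pkg>/bindings.*.so files to <pkg>/bindings/_C.*.so."""
    pkg_dir = Path(pkg_dir)
    target_dir = pkg_dir / "bindings"
    moved = []
    for legacy in sorted(pkg_dir.glob("bindings.*.so")):
        target_dir.mkdir(exist_ok=True)
        # Keep the ABI tag: "bindings.cpython-XY-....so" becomes "_C.cpython-XY-....so".
        new_name = "_C" + legacy.name[len("bindings"):]
        dest = target_dir / new_name
        shutil.move(str(legacy), str(dest))
        moved.append(dest)
    return moved
```

Running the helper a second time is a no-op, because no top-level `bindings.*.so` file remains after the first pass.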