Add Layout Validation by zhx06 · Pull Request #752 · isaac-sim/IsaacLab-Arena

zhx06 · 2026-06-02T22:11:15Z

Summary

Add layout validation with save/load storage

Detailed description

Add structured ValidationReport, frozen and fails-closed, through PlacementResult, ranking, and fallback paths.
Unify acceptance, selection, and diagnostics on an injectable layout_filter; track per-layout use_count.
Add PooledObjectPlacer save/load to reuse solved poses, persisting placement_seed and had_fallbacks for reproducible sampling.
Isolate the JSON schema in layout_pool_serialization with atomic, fail-loud writes and full load-time guards.

greptile-apps · 2026-06-02T22:18:38Z

Greptile Summary

This PR adds structured layout validation via an immutable ValidationReport that replaces the previous bool success flag, an injectable LayoutFilter predicate that unifies acceptance across ranking, storage, and fallback paths, and save/load persistence for PooledObjectPlacer that serializes solved poses to JSON for reuse without re-solving.

ValidationReport is frozen and fails-closed (empty → not passed), with a with_check extension API for post-pool simulation checks; PlacementResult.success is now a derived property.
PooledLayout wraps each stored result with a use_count counter tracked across both consuming and non-consuming draws; _store_env_matched_results now correctly guards against setting _had_fallbacks when env_results is empty.
layout_pool_serialization isolates the JSON schema with atomic fail-loud writes and full structural load-time validation; the assert-based guards in this module (and now also in ValidationReport) are stripped by python -O, a concern already noted in the prior review thread for the serialization module.

Confidence Score: 5/5

The change is safe to merge. Core ranking, fallback, and persistence logic is correct; the only open items are the assert-stripping pattern already flagged in the prior review thread.

The refactoring is thorough and internally consistent. The bug fix in _store_env_matched_results (guarding against empty env_results before setting _had_fallbacks) is correct. The save/load round-trip is well-tested with seed restoration, consumed-layout preservation, heterogeneity mismatch detection, and object-set validation. No logic path leaves objects silently mis-placed or the pool silently empty.

The assert-based validation in placement_result.py (ValidationReport.__post_init__, with_check) and the return-type guard in pooled_object_placer.py (accepts()) follow the same pattern already flagged in layout_pool_serialization.py — all three files warrant the same fix if the team decides to address the optimised-build concern.

Important Files Changed

Filename	Overview
isaaclab_arena/relations/placement_result.py	Introduces `ValidationReport` (frozen, immutable checks map), `LayoutFilter` type alias, and `default_layout_filter`; replaces `PlacementResult.success` bool field with a derived property. Core design is sound but `assert`-based bool enforcement is stripped in optimised builds.
isaaclab_arena/relations/layout_pool_serialization.py	New module owning the JSON schema for pool persistence. Atomic write with OSError cleanup and NaN guard are correct; `assert`-based structural checks (already flagged in prior thread) remain the open concern in debug vs. optimised builds.
isaaclab_arena/relations/pooled_object_placer.py	Adds `PooledLayout` wrapper with `use_count`, injectable `layout_filter`, `accepts()`, `_summarize_rejections`, `stored_layouts`, and `save`/`load`. Bug fix: `_store_env_matched_results` now guards against empty `env_results` before setting `_had_fallbacks`; `accepts()` guard is assert-based and stripped in optimised builds.
isaaclab_arena/relations/object_placer.py	Renames `_validate_placement` → `_validate_geometry` (now returns `ValidationReport`), adds injectable `layout_filter` parameter, and changes `_rank_candidates` from a `@staticmethod` to an instance method using `self._accepts`. All call sites updated correctly.
isaaclab_arena/relations/placement_events.py	Switches the fallback-detection condition from `result.success` to `placement_pool.accepts(result)`, keeping acceptance logic consistent with the pool's filter. All mock pools in tests are updated to expose `accepts()`.
isaaclab_arena/tests/test_layout_pool_serialization.py	New test file covering round-trips, structural guards, NaN rejection, orphan-tmp cleanup, and all serialization edge cases with good parametrised coverage.
isaaclab_arena/tests/test_object_placer_reproducibility.py	Extensive save/load round-trip tests, use_count tracking, stored_layouts post-validation flow, snapshot isolation from refills, and seed-restore coverage added.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant PooledObjectPlacer
    participant ObjectPlacer
    participant ValidationReport
    participant Serialization

    Caller->>PooledObjectPlacer: __init__(objects, placer_params, layout_filter)
    PooledObjectPlacer->>ObjectPlacer: ObjectPlacer(params, layout_filter)
    PooledObjectPlacer->>PooledObjectPlacer: _solve_and_store(pool_size)
    ObjectPlacer->>ObjectPlacer: _validate_geometry(positions, bboxes)
    ObjectPlacer-->>ValidationReport: "ValidationReport(checks={no_overlap, on_relations})"
    ObjectPlacer->>ObjectPlacer: _rank_candidates(self._accepts)
    ObjectPlacer-->>PooledObjectPlacer: "PlacementResult(validation=report, ...)"
    PooledObjectPlacer->>PooledObjectPlacer: accepts(result) via layout_filter(report)
    PooledObjectPlacer->>PooledObjectPlacer: "store as PooledLayout(result, use_count=0)"

    Caller->>PooledObjectPlacer: save(path)
    PooledObjectPlacer->>Serialization: write_pool_document(path, PoolDocument)
    Serialization-->>Caller: JSON file (atomic write)

    Caller->>PooledObjectPlacer: load(path, objects, params)
    PooledObjectPlacer->>Serialization: read_pool_document(path)
    Serialization-->>PooledObjectPlacer: PoolDocument (validated)
    PooledObjectPlacer->>Serialization: deserialize_layout(data, name_to_obj)
    Serialization-->>PooledObjectPlacer: PlacementResult (reconstructed)
    PooledObjectPlacer-->>Caller: PooledObjectPlacer (loaded, seed restored)

    Caller->>PooledObjectPlacer: sample_with_replacement(count)
    PooledObjectPlacer->>PooledObjectPlacer: _draw(rng, pooled_layouts) mark_used()
    PooledObjectPlacer-->>Caller: list[PlacementResult]

_{Reviews (4): Last reviewed commit: "address comments" | Re-trigger Greptile}

isaaclab-review-bot

Review: Add Layout Validation

Good feature addition. The ValidationReport / LayoutFilter abstraction cleanly separates geometry checks from acceptance policy, and the serialization layer is well-guarded. A few items worth addressing before merge:

1. `assert` used for runtime validation in serialization (Medium)

layout_pool_serialization.py uses assert statements for all structural validation of loaded files. In optimized Python (python -O), assertions are stripped entirely, meaning a corrupt file would pass silently and produce wrong poses.

Suggestion: Replace assert with explicit if … raise ValueError(…) (or a dedicated LayoutPoolError) for the user-facing validation in read_pool_document, deserialize_layout, and PoolDocument.from_dict. Keep assert only for internal invariants that should never fire.

2. `write_pool_document` doesn't `fsync` before `os.replace` (Low)

The atomic write pattern (write → os.replace) is correct for crash-consistency on most filesystems, but without os.fsync(fd) on the temp file (or os.fsync on the directory), data can be lost on power failure before the kernel flushes to disk.

Suggestion: If this is used in long-running sim training where a crash mid-write would lose hours of solved poses, consider adding an fsync call:

with open(tmp_path, "w") as f:
    f.write(payload)
    f.flush()
    os.fsync(f.fileno())
os.replace(tmp_path, path)

3. `_draw` marks `use_count` but `sample_with_replacement` doesn't record which slot consumed it (Low)

PooledLayout.use_count increments on every draw (both consuming and non-consuming), which is fine for observability. However, there's no way to distinguish a sample_with_replacement use from a sample_without_replacement / sample_for_envs use. If analytics or debugging later needs that distinction, consider tracking served_count vs consumed_count separately.

4. `PooledObjectPlacer.load()` silently ignores `pool_size` from caller (Low)

load() overrides pool_size with max(1, loaded_count). A caller passing pool_size=200 might expect the loaded pool to refill up to 200 on the next solve, but the loaded count becomes the refill batch size. This is defensible but worth documenting in the docstring.

5. `deserialize_layout` requires `set(orientations) <= set(positions)` but not equality (Nit)

Positions require an exact object-set match against name_to_obj, but orientations only require a subset of positions. This means an object without a saved yaw silently gets orientations = {} for that object on load. The asymmetry is intentional (some objects have no yaw), but a brief comment explaining why the subset check rather than equality would help future readers.

6. Missing `init.py` re-exports for new public API (Low)

ValidationReport, LayoutFilter, default_layout_filter, and the serialization helpers aren't visible from the module-level __init__.py (if one exists). If these are intended as public API for downstream consumers, consider re-exporting them.

7. Test helper `_report` duplicated across test files (Nit)

test_heterogeneous_placement.py defines a local _report(passed) helper. Consider moving it to tests/utils/placement.py (which this PR already introduces) to avoid future duplication.

Summary

The design is solid: ValidationReport is frozen and fails-closed, the injectable LayoutFilter keeps the placer and pool aligned on the same acceptance criterion, and the serialization is thorough with atomic writes and structural guards. The test coverage is excellent.

Verdict: Approve with minor suggestions — the assert-for-validation concern (item 1) is the most impactful; the rest are quality-of-life improvements.

📝 Update (`52c035c`)

Reviewed 6 new commits (c897caf2..52c035cc). Changes include NotNextTo relation/loss strategy, chunked eval dispatch, layout pool save/load, PooledLayout use_count tracking, and stored_layouts accessor.

✅ Item 1 inline (orphan .tmp cleanup): Fixed — write_pool_document now has try/except with unlink(missing_ok=True).
Item 1 (assert-based validation): Still present — not addressed in these commits.
Items 2–7: Unchanged / not addressed (all low priority, fine to defer).

New code looks good overall:

NotNextTo loss strategy is well-designed (min-of-two-escape-routes avoids dead plateaus) and comprehensively tested with solver integration.
Layout pool save()/load() round-trip is thorough with structural guards and 20+ focused tests.
Chunked eval runner is pragmatic for OOM recovery; the TODO for metrics aggregation is appropriate.
The isaaclab_arena_env → task_description simplification is a clean decoupling.

No new blocking issues found. Previous verdict stands: Approve with minor suggestions (item 1 remains the only medium-priority concern).

📝 Update (`3832af0`)

Reviewed 1 new commit (52c035cc..3832af07). Minor wording refinement only:

Warning messages now say "did not meet the pool's acceptance criteria" instead of "failed strict placement validation" — clearer messaging that aligns with the LayoutFilter abstraction.
Test fixtures updated to match the ValidationReport-based API.

No new issues. Previous concerns remain unchanged. Verdict: Approve with minor suggestions.

Update (249ea78): Incremental review of changes since 3832af0.

New: Variation System (`isaaclab_arena/variations/`)

Clean, well-structured two-tier variation framework:

BuildTimeVariationBase — sampled once before env composition (HDR map selection)
RunTimeVariationBase — realized via EventTermCfg on each reset (camera extrinsics)

The sampler hierarchy (SamplerBase → ContinuousSampler/ChoiceSampler) is minimal and extensible. The VariationBaseCfg.sampler_cfg.build() pattern keeps configuration declarative.

Camera extrinsics variation (camera_extrinsics_variation.py): Good design — snapshots nominal pose on first call, reapplies nominal + delta each reset to prevent drift accumulation. The ROS→OpenGL frame conversion is well-documented. Note the Isaac Lab quaternion bug is correctly handled and explicitly tested (test_isaaclab_bug_get_local_poses.py).

New: Layout Pool Serialization (`layout_pool_serialization.py`)

Solid defensive serialization with structural guards at every level:

PoolDocument.from_dict validates types, lengths, and key presence
deserialize_layout rejects non-finite values, wrong-length positions, empty validation maps
Atomic write via temp file + os.replace (previous fsync comment still applies)
Non-finite JSON constants rejected on read (NaN/Infinity literals)

Graph Spec: `cli_override_specs` + Pydantic Migration

cli_override_specs in YAML allows declarative swappable flags (e.g. --object, --embodiment) — nice ergonomic improvement for users running different configs from the same graph
at_pose → at_position and in → on aligns the spec closer to what the solver actually supports
Validation errors now surface as pydantic.ValidationError instead of raw AssertionError — cleaner error messages for malformed YAML

Test Coverage

Thorough test additions:

test_layout_pool_serialization.py — covers round trips, structural rejections, IO errors, leaf-level guards
test_build_time_variations.py, test_run_time_variations.py, test_camera_extrinsics_variation.py — end-to-end through ArenaEnvBuilder
test_variation_base.py — cfg plumbing, enable/disable, apply_cfg sampler rebuild
test_isaaclab_bug_get_local_poses.py — regression guard for the upstream quaternion layout issue
tests/utils/placement.py — shared layout_signature helper reduces duplication

Minor Observations

pydantic>=2.0 in setup.py: This is a significant new dependency. Confirm it doesn't conflict with Isaac Lab's own Pydantic usage (Isaac Lab pins pydantic in some versions).
ChoiceSampler.sample uses torch.randint: This bypasses the per-env RNG seeds (get_rngs). For the HDR variation (called once at build time), this is fine. If ever used in a per-env runtime path, it would break reproducibility. Worth a brief docstring note.
SamplerBaseCfg.build() raises NotImplementedError: The base class's default cfg will error if someone passes a bare VariationBaseCfg() without overriding sampler_cfg. This is acceptable but could be friendlier with an early check in VariationBase.__init__.

Verdict

Approve — well-tested, well-structured addition. The variation system is clean and the serialization layer is defensively written. Previous review items (assert vs raise in serialization, fsync) still apply.

Signed-off-by: zhx06 <zihaox@nvidia.com>

zhx06 requested review from alexmillane, cvolkcvolk, peterd-NV, qianl-nv, viiik-inside and xyao-nv as code owners June 2, 2026 22:11

greptile-apps Bot reviewed Jun 2, 2026

View reviewed changes

Comment thread isaaclab_arena/relations/layout_pool_serialization.py

Comment thread isaaclab_arena/relations/layout_pool_serialization.py

isaaclab-review-bot Bot reviewed Jun 2, 2026

View reviewed changes

zhx06 force-pushed the zxiao/feature/layout_validation branch 2 times, most recently from 52c035c to 3832af0 Compare June 3, 2026 19:37

zhx06 added 5 commits June 4, 2026 12:45

add layout validation with storage

295ec55

Signed-off-by: zhx06 <zihaox@nvidia.com>

polish API

633418d

Signed-off-by: zhx06 <zihaox@nvidia.com>

code clean-up

a6f862c

Signed-off-by: zhx06 <zihaox@nvidia.com>

address comments

fb31c1f

Signed-off-by: zhx06 <zihaox@nvidia.com>

address comments

249ea78

zhx06 force-pushed the zxiao/feature/layout_validation branch from 3832af0 to 249ea78 Compare June 4, 2026 23:18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Layout Validation#752

Add Layout Validation#752
zhx06 wants to merge 5 commits into
mainfrom
zxiao/feature/layout_validation

zhx06 commented Jun 2, 2026

Uh oh!

greptile-apps Bot commented Jun 2, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

isaaclab-review-bot Bot left a comment •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

zhx06 commented Jun 2, 2026

Summary

Detailed description

Uh oh!

greptile-apps Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

Uh oh!

Uh oh!

isaaclab-review-bot Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Review: Add Layout Validation

1. assert used for runtime validation in serialization (Medium)

2. write_pool_document doesn't fsync before os.replace (Low)

3. _draw marks use_count but sample_with_replacement doesn't record which slot consumed it (Low)

4. PooledObjectPlacer.load() silently ignores pool_size from caller (Low)

5. deserialize_layout requires set(orientations) <= set(positions) but not equality (Nit)

6. Missing __init__.py re-exports for new public API (Low)

7. Test helper _report duplicated across test files (Nit)

Summary

📝 Update (52c035c)

📝 Update (3832af0)

New: Variation System (isaaclab_arena/variations/)

New: Layout Pool Serialization (layout_pool_serialization.py)

Graph Spec: cli_override_specs + Pydantic Migration

Test Coverage

Minor Observations

Verdict

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

greptile-apps Bot commented Jun 2, 2026 •

edited

Loading

isaaclab-review-bot Bot left a comment •

edited

Loading

1. `assert` used for runtime validation in serialization (Medium)

2. `write_pool_document` doesn't `fsync` before `os.replace` (Low)

3. `_draw` marks `use_count` but `sample_with_replacement` doesn't record which slot consumed it (Low)

4. `PooledObjectPlacer.load()` silently ignores `pool_size` from caller (Low)

5. `deserialize_layout` requires `set(orientations) <= set(positions)` but not equality (Nit)

6. Missing `init.py` re-exports for new public API (Low)

7. Test helper `_report` duplicated across test files (Nit)

📝 Update (`52c035c`)

📝 Update (`3832af0`)

New: Variation System (`isaaclab_arena/variations/`)

New: Layout Pool Serialization (`layout_pool_serialization.py`)

Graph Spec: `cli_override_specs` + Pydantic Migration