Skip to content

Refactor cartpole tasks, make presets available to direct envs#5930

Merged
ooctipus merged 7 commits into
isaac-sim:developfrom
StafaH:mh/cartpole_overhaul2
Jun 3, 2026
Merged

Refactor cartpole tasks, make presets available to direct envs#5930
ooctipus merged 7 commits into
isaac-sim:developfrom
StafaH:mh/cartpole_overhaul2

Conversation

@StafaH
Copy link
Copy Markdown
Contributor

@StafaH StafaH commented Jun 3, 2026

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context.
List any dependencies that are required for this change.

Fixes # (issue)

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (existing functionality will not work without user modification)
  • Documentation update

Screenshots

Please attach before and after screenshots of the change if applicable.

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

@github-actions github-actions Bot added documentation Improvements or additions to documentation isaac-lab Related to Isaac Lab team labels Jun 3, 2026
Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: PR #5930 - Refactor cartpole tasks, make presets available to direct envs

Thanks for this cleanup, @StafaH! This is a substantial refactor that consolidates the cartpole camera environments and adds proper RSL-RL vision training support. I've reviewed the changes across all 31 files. Here's my assessment:

✅ What I Like

  1. Consolidation is well-executed: Merging CartpoleCameraPresetsEnv into CartpoleCameraEnv and having the camera env inherit from CartpoleEnv eliminates significant code duplication. The reward/termination/reset logic is now shared properly.

  2. Channel-first output is the right call: Switching from [H, W, C] to [C, H, W] output matches what CNN policies (rsl_rl, rl_games, skrl) expect. The changelog documents this breaking change appropriately.

  3. The custom CNNModel fix is clean: The isaaclab_rl.rsl_rl.models.CNNModel override to support image-only observations (no 1D groups) is a minimal, surgical fix for the upstream rsl-rl limitation.

  4. The permute parameter in mdp.image(): Nice addition to let manager-based envs opt into channel-first output at the observation term level.


🔍 Issues & Suggestions

1. Missing newline at EOF (Minor)

# source/isaaclab_tasks/isaaclab_tasks/core/cartpole/agents/rsl_rl_ppo_cfg.py
    )
# ^ File ends without newline

2. Unused cart_pos removal in reward function looks intentional but undocumented

-            self.joint_pos[:, self._cart_dof_idx[0]],
             self.joint_vel[:, self._cart_dof_idx[0]],

The compute_rewards function no longer receives cart_pos, but this parameter was never used in the reward computation anyway (rew_scale_cart_vel uses velocity, not position). This is fine, but worth noting that it's a signature cleanup rather than a behavior change.

3. Potential type annotation inconsistency

In cartpole_direct_camera_env.py, the cfg type annotation:

cfg: CartpoleCameraEnvCfg

But CartpoleCameraEnvCfg is now a PresetCfg wrapper that resolves to BaseCartpoleCameraEnvCfg. After preset resolution, the actual type would be BaseCartpoleCameraEnvCfg. Consider whether the type hint should reflect the resolved type.

4. The showcase cartpole_camera now duplicates base cfg fields

# contrib/cartpole_showcase/cartpole_camera/cartpole_camera_env_cfg.py
class CartpoleCameraEnvCfg(DirectRLEnvCfg):
    # env
    decimation = 2
    episode_length_s = 5.0
    action_scale = 100.0  # [N]
    ...

Since the core CartpoleEnvCfg now shares physics via CartpolePhysicsCfg, the showcase variant could potentially inherit from the refactored base instead of redeclaring everything. Not blocking, but worth considering for future consolidation.

5. Frame stack channel math changed from last-dim to first-dim

The old code:

cfg.observation_space = [*cfg.observation_space[:-1], single_channels * cfg.frame_stack]

The new code:

cfg.observation_space = [single_channels * frame_stack, *cfg.observation_space[1:]]

This is correct for the channel-first change, but the stacking logic in _get_observations should be verified end-to-end (I see the tests were updated, which is good).

6. Attribute access changed from self._cartpole to self.cartpole

# In cartpole_camera_showcase_env.py
-        self._cartpole.set_joint_effort_target_index(...)
+        self.cartpole.set_joint_effort_target_index(...)

This change makes the attribute public. Is this intentional API exposure, or just matching the parent class? Worth documenting if it's meant to be part of the public interface.


📝 Documentation Observations

The changelog entries are thorough and explain the breaking changes well. The tutorial doc update for rsl_rl_manager_ppo_cfgrsl_rl_ppo_cfg is correct.


🧪 Testing

The test updates in test_cartpole_camera_presets_frame_stack.py properly reflect the class rename and the channel-first observation format. The shape assertions were correctly updated:

expected_shape = (env.num_envs, expected_channels, env_cfg.tiled_camera.height, env_cfg.tiled_camera.width)

Summary

This is a solid refactoring PR that improves code organization and adds valuable RSL-RL CNN support. The main concerns are minor (missing newline, potential type annotation refinement). The breaking change to channel-first output is well-documented.

Recommendation: Address the EOF newline and consider the type annotation refinement, but otherwise this looks ready for merge once CI passes.


Review by IsaacLab Review Bot 🤖


🔄 Update (2025-06-03) — Reviewed new commits up to 81254cf

New changes are minor cleanup — no new issues introduced:

  • ✅ Relative imports converted to absolute imports across cartpole modules (consistent style)
  • ✅ Class renamed CartpoleCameraPresetsEnvCfgCartpoleCameraEnvCfg with matching entry point update in __init__.py
  • ✅ Minor formatting improvements (blank lines between code blocks in cartpole_direct_env.py)
  • ✅ Docstring in CartpoleCameraEnv simplified

Previous concerns status: All original suggestions still apply (minor items, non-blocking). No regressions from the new commits.

LGTM on the incremental changes. 👍


🔄 Update (2026-06-03 06:49 UTC) — Reviewed commit 67ef71b

Single new commit adds missing changelog fragments:

  • cartpole-image-permute.minor.rst (isaaclab) — Documents the new permute argument for the mdp.image() observation term. Correctly describes the channel-first layout option.
  • cartpole-overhaul-viz-test.skip (isaaclab_visualizers) — Correctly skips changelog entry for test-only import updates.

These are exactly what the changelog-check CI was flagging. No code changes, no new issues.

Previous concerns status: All original minor items (EOF newline, type annotation refinement) remain non-blocking. LGTM. 👍


🔄 Update: Reviewed new commits (67ef71badd6be969)

New commits move sim_launcher.py from isaaclab_tasks.utils to isaaclab.app (yielding PhysicsCfg from context manager), add RTX scene partitioning across all Kit app files, refactor the manager-based cartpole camera env to a preset-driven pattern, and update TacSL sensor demo to use isaaclab_physx imports. Version bumps across multiple packages (isaaclab 6.2.1, isaaclab_physx 1.1.2, isaaclab_newton 0.14.1, isaaclab_ovphysx 3.0.1, isaaclab_tasks 2.0.1, isaaclab_visualizers 0.1.1).

No new issues found in the incremental changes — these are clean mechanical refactors and import-path migrations.

Previous inline comments remain outstanding (not addressed in this push):

  • P1: In-place mutation of depth sensor buffer
  • P2: experiment_name change breaking checkpoint resumption

🔄 Update (2026-06-03 07:23 UTC) — Reviewed commits dd6be9632c6730

Two files changed — introduces dedicated RSL-RL runner config subclasses for direct envs:

  • P2 Fixed: experiment_name breakage resolved. New CartpoleDirectPPORunnerCfg (experiment_name = "cartpole_direct") and CartpoleCameraDirectPPORunnerCfg (experiment_name = "cartpole_camera_direct") subclasses created, and direct-env registrations in __init__.py now point to them. Existing checkpoints will continue to resolve correctly.

Remaining: P1 (in-place depth buffer mutation) is still outstanding but non-blocking.

No new issues in this push. Clean fix. 👍

StafaH added 2 commits June 2, 2026 23:25
Adds the missing changelog fragments flagged by the changelog-check CI:
a minor-bump entry for the new permute argument on the image observation
in isaaclab, and a skip fragment for the test-only import update in
isaaclab_visualizers.
@StafaH StafaH marked this pull request as ready for review June 3, 2026 06:49
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Jun 3, 2026

Greptile Summary

This PR refactors the cartpole task suite to unify direct and manager-based environments under shared PresetCfg hierarchies for physics backends and camera observation pipelines, eliminating per-variant duplicates (RGB, depth, feature) in favour of runtime-selectable presets.

  • Core/contrib split: The core/cartpole direct-camera env is now NCHW channel-first throughout (config, env, frame-stack buffer, rsl-rl CNN model), while the contrib/cartpole_showcase camera env is intentionally kept NHWC by fully overriding _get_observations.
  • Frame-stack & CNNModel: CartpoleCameraEnv gains frame_stack support via CircularBuffer; a new isaaclab_rl.rsl_rl.models:CNNModel subclass is introduced to support image-only policies without 1D observation groups.

Confidence Score: 4/5

Safe to merge after fixing the in-place depth buffer mutation; the experiment_name change is a usability concern but not a blocker.

One P1 defect (depth tensor in-place mutation corrupts the sensor cache) and one P2 concern (experiment_name change breaks checkpoint resumption for direct-env users). The rest of the refactor is structurally sound: the PresetCfg hierarchy, NCHW channel convention, frame-stack CircularBuffer, and new CNNModel subclass all look correct. The contrib showcase env intentionally stays NHWC with a full _get_observations override, which is self-consistent. Score docked one point for the P1 depth bug.

cartpole_direct_camera_env.py (depth in-place mutation), agents/rsl_rl_ppo_cfg.py (experiment_name)

Important Files Changed

Filename Overview
source/isaaclab_tasks/isaaclab_tasks/core/cartpole/cartpole_direct_camera_env.py New unified camera env with frame-stack support; depth path mutates sensor buffer in-place (P1 bug).
source/isaaclab_tasks/isaaclab_tasks/core/cartpole/cartpole_direct_camera_env_cfg.py Replaces per-datatype cfg subclasses with PresetCfg hierarchy; observation_space now channel-first [C,H,W].
source/isaaclab_rl/isaaclab_rl/rsl_rl/models.py New CNNModel subclass supporting image-only policies (no 1D obs groups) by conditionally skipping MLPModel.get_latent.
source/isaaclab_rl/isaaclab_rl/rsl_rl/rl_cfg.py class_name updated to fully-qualified 'isaaclab_rl.rsl_rl.models:CNNModel' for the new model subclass.
source/isaaclab_tasks/isaaclab_tasks/core/cartpole/agents/rsl_rl_ppo_cfg.py Unified PPO runner configs; experiment_name changed from 'cartpole_direct' to 'cartpole' (breaks existing checkpoint paths).
source/isaaclab_tasks/isaaclab_tasks/core/cartpole/cartpole_direct_env.py Removed dead cart_pos parameter from compute_rewards JIT kernel; CartpoleEnv now shared base for proprioceptive and camera variants.
source/isaaclab/isaaclab/envs/mdp/observations.py Added permute: bool = False to image() to support optional NHWC→NCHW conversion at observation pipeline level.
source/isaaclab_tasks/isaaclab_tasks/contrib/cartpole_showcase/cartpole_camera/cartpole_camera_env.py Fixed self._cartpole → self.cartpole API rename; _get_observations returns NHWC intentionally (showcase variant).
source/isaaclab_tasks/isaaclab_tasks/core/cartpole/cartpole_manager_camera_env_cfg.py Added permute=True to RGB/depth ObsTerms so manager-based camera env also outputs NCHW.
source/isaaclab_tasks/test/core/test_cartpole_camera_presets_frame_stack.py Test updated for new class names and NCHW channel-first shape assertions.

Reviews (1): Last reviewed commit: "Add changelog fragments for cartpole tas..." | Re-trigger Greptile

Comment on lines +109 to 110
elif data_type == "depth":
camera_data[camera_data == float("inf")] = 0
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 In-place mutation of the sensor output buffer for depth

The boolean-mask assignment camera_data[camera_data == float("inf")] = 0 writes directly into the tensor returned by self._tiled_camera.data.output[data_type]. That tensor is owned by the sensor; mutating it corrupts the sensor's internal state for any subsequent reader in the same step (e.g. the write_image_to_file path on line 125, or a second call if the sensor data is cached). The depth branch should clone the data before modifying it.

Suggested change
elif data_type == "depth":
camera_data[camera_data == float("inf")] = 0
elif data_type == "depth":
camera_data = camera_data.clone()
camera_data[camera_data == float("inf")] = 0

Comment on lines +19 to +24
@configclass
class CartpolePPORunnerCfg(RslRlOnPolicyRunnerCfg):
num_steps_per_env = 16
max_iterations = 150
save_interval = 50
experiment_name = "cartpole"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 experiment_name changed for direct-env variants, breaking checkpoint resumption

The deleted rsl_rl_direct_ppo_cfg.py used experiment_name = "cartpole_direct". This file sets experiment_name = "cartpole" for all variants (including the direct-env CartpolePPORunnerCfg). Any user with existing checkpoints under logs/rsl_rl/cartpole_direct/ will need to manually rename their log directory or override the field; the default training script will silently start a fresh run instead of resuming.

@ooctipus ooctipus merged commit 09d897d into isaac-sim:develop Jun 3, 2026
37 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation isaac-lab Related to Isaac Lab team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants