Refactor cartpole tasks, make presets available to direct envs#5930
Conversation
There was a problem hiding this comment.
Code Review: PR #5930 - Refactor cartpole tasks, make presets available to direct envs
Thanks for this cleanup, @StafaH! This is a substantial refactor that consolidates the cartpole camera environments and adds proper RSL-RL vision training support. I've reviewed the changes across all 31 files. Here's my assessment:
✅ What I Like
-
Consolidation is well-executed: Merging
CartpoleCameraPresetsEnvintoCartpoleCameraEnvand having the camera env inherit fromCartpoleEnveliminates significant code duplication. The reward/termination/reset logic is now shared properly. -
Channel-first output is the right call: Switching from
[H, W, C]to[C, H, W]output matches what CNN policies (rsl_rl, rl_games, skrl) expect. The changelog documents this breaking change appropriately. -
The custom
CNNModelfix is clean: Theisaaclab_rl.rsl_rl.models.CNNModeloverride to support image-only observations (no 1D groups) is a minimal, surgical fix for the upstream rsl-rl limitation. -
The
permuteparameter inmdp.image(): Nice addition to let manager-based envs opt into channel-first output at the observation term level.
🔍 Issues & Suggestions
1. Missing newline at EOF (Minor)
# source/isaaclab_tasks/isaaclab_tasks/core/cartpole/agents/rsl_rl_ppo_cfg.py
)
# ^ File ends without newline2. Unused cart_pos removal in reward function looks intentional but undocumented
- self.joint_pos[:, self._cart_dof_idx[0]],
self.joint_vel[:, self._cart_dof_idx[0]],The compute_rewards function no longer receives cart_pos, but this parameter was never used in the reward computation anyway (rew_scale_cart_vel uses velocity, not position). This is fine, but worth noting that it's a signature cleanup rather than a behavior change.
3. Potential type annotation inconsistency
In cartpole_direct_camera_env.py, the cfg type annotation:
cfg: CartpoleCameraEnvCfgBut CartpoleCameraEnvCfg is now a PresetCfg wrapper that resolves to BaseCartpoleCameraEnvCfg. After preset resolution, the actual type would be BaseCartpoleCameraEnvCfg. Consider whether the type hint should reflect the resolved type.
4. The showcase cartpole_camera now duplicates base cfg fields
# contrib/cartpole_showcase/cartpole_camera/cartpole_camera_env_cfg.py
class CartpoleCameraEnvCfg(DirectRLEnvCfg):
# env
decimation = 2
episode_length_s = 5.0
action_scale = 100.0 # [N]
...Since the core CartpoleEnvCfg now shares physics via CartpolePhysicsCfg, the showcase variant could potentially inherit from the refactored base instead of redeclaring everything. Not blocking, but worth considering for future consolidation.
5. Frame stack channel math changed from last-dim to first-dim
The old code:
cfg.observation_space = [*cfg.observation_space[:-1], single_channels * cfg.frame_stack]The new code:
cfg.observation_space = [single_channels * frame_stack, *cfg.observation_space[1:]]This is correct for the channel-first change, but the stacking logic in _get_observations should be verified end-to-end (I see the tests were updated, which is good).
6. Attribute access changed from self._cartpole to self.cartpole
# In cartpole_camera_showcase_env.py
- self._cartpole.set_joint_effort_target_index(...)
+ self.cartpole.set_joint_effort_target_index(...)This change makes the attribute public. Is this intentional API exposure, or just matching the parent class? Worth documenting if it's meant to be part of the public interface.
📝 Documentation Observations
The changelog entries are thorough and explain the breaking changes well. The tutorial doc update for rsl_rl_manager_ppo_cfg → rsl_rl_ppo_cfg is correct.
🧪 Testing
The test updates in test_cartpole_camera_presets_frame_stack.py properly reflect the class rename and the channel-first observation format. The shape assertions were correctly updated:
expected_shape = (env.num_envs, expected_channels, env_cfg.tiled_camera.height, env_cfg.tiled_camera.width)Summary
This is a solid refactoring PR that improves code organization and adds valuable RSL-RL CNN support. The main concerns are minor (missing newline, potential type annotation refinement). The breaking change to channel-first output is well-documented.
Recommendation: Address the EOF newline and consider the type annotation refinement, but otherwise this looks ready for merge once CI passes.
Review by IsaacLab Review Bot 🤖
🔄 Update (2025-06-03) — Reviewed new commits up to 81254cf
New changes are minor cleanup — no new issues introduced:
- ✅ Relative imports converted to absolute imports across cartpole modules (consistent style)
- ✅ Class renamed
CartpoleCameraPresetsEnvCfg→CartpoleCameraEnvCfgwith matching entry point update in__init__.py - ✅ Minor formatting improvements (blank lines between code blocks in
cartpole_direct_env.py) - ✅ Docstring in
CartpoleCameraEnvsimplified
Previous concerns status: All original suggestions still apply (minor items, non-blocking). No regressions from the new commits.
LGTM on the incremental changes. 👍
🔄 Update (2026-06-03 06:49 UTC) — Reviewed commit 67ef71b
Single new commit adds missing changelog fragments:
- ✅
cartpole-image-permute.minor.rst(isaaclab) — Documents the newpermuteargument for themdp.image()observation term. Correctly describes the channel-first layout option. - ✅
cartpole-overhaul-viz-test.skip(isaaclab_visualizers) — Correctly skips changelog entry for test-only import updates.
These are exactly what the changelog-check CI was flagging. No code changes, no new issues.
Previous concerns status: All original minor items (EOF newline, type annotation refinement) remain non-blocking. LGTM. 👍
🔄 Update: Reviewed new commits (67ef71ba → dd6be969)
New commits move sim_launcher.py from isaaclab_tasks.utils to isaaclab.app (yielding PhysicsCfg from context manager), add RTX scene partitioning across all Kit app files, refactor the manager-based cartpole camera env to a preset-driven pattern, and update TacSL sensor demo to use isaaclab_physx imports. Version bumps across multiple packages (isaaclab 6.2.1, isaaclab_physx 1.1.2, isaaclab_newton 0.14.1, isaaclab_ovphysx 3.0.1, isaaclab_tasks 2.0.1, isaaclab_visualizers 0.1.1).
No new issues found in the incremental changes — these are clean mechanical refactors and import-path migrations.
Previous inline comments remain outstanding (not addressed in this push):
- P1: In-place mutation of depth sensor buffer
- P2:
experiment_namechange breaking checkpoint resumption
🔄 Update (2026-06-03 07:23 UTC) — Reviewed commits dd6be96 → 32c6730
Two files changed — introduces dedicated RSL-RL runner config subclasses for direct envs:
- ✅ P2 Fixed:
experiment_namebreakage resolved. NewCartpoleDirectPPORunnerCfg(experiment_name = "cartpole_direct") andCartpoleCameraDirectPPORunnerCfg(experiment_name = "cartpole_camera_direct") subclasses created, and direct-env registrations in__init__.pynow point to them. Existing checkpoints will continue to resolve correctly.
Remaining: P1 (in-place depth buffer mutation) is still outstanding but non-blocking.
No new issues in this push. Clean fix. 👍
Adds the missing changelog fragments flagged by the changelog-check CI: a minor-bump entry for the new permute argument on the image observation in isaaclab, and a skip fragment for the test-only import update in isaaclab_visualizers.
Greptile SummaryThis PR refactors the cartpole task suite to unify direct and manager-based environments under shared
Confidence Score: 4/5Safe to merge after fixing the in-place depth buffer mutation; the experiment_name change is a usability concern but not a blocker. One P1 defect (depth tensor in-place mutation corrupts the sensor cache) and one P2 concern (experiment_name change breaks checkpoint resumption for direct-env users). The rest of the refactor is structurally sound: the PresetCfg hierarchy, NCHW channel convention, frame-stack CircularBuffer, and new CNNModel subclass all look correct. The contrib showcase env intentionally stays NHWC with a full _get_observations override, which is self-consistent. Score docked one point for the P1 depth bug. cartpole_direct_camera_env.py (depth in-place mutation), agents/rsl_rl_ppo_cfg.py (experiment_name) Important Files Changed
Reviews (1): Last reviewed commit: "Add changelog fragments for cartpole tas..." | Re-trigger Greptile |
| elif data_type == "depth": | ||
| camera_data[camera_data == float("inf")] = 0 |
There was a problem hiding this comment.
In-place mutation of the sensor output buffer for depth
The boolean-mask assignment camera_data[camera_data == float("inf")] = 0 writes directly into the tensor returned by self._tiled_camera.data.output[data_type]. That tensor is owned by the sensor; mutating it corrupts the sensor's internal state for any subsequent reader in the same step (e.g. the write_image_to_file path on line 125, or a second call if the sensor data is cached). The depth branch should clone the data before modifying it.
| elif data_type == "depth": | |
| camera_data[camera_data == float("inf")] = 0 | |
| elif data_type == "depth": | |
| camera_data = camera_data.clone() | |
| camera_data[camera_data == float("inf")] = 0 |
| @configclass | ||
| class CartpolePPORunnerCfg(RslRlOnPolicyRunnerCfg): | ||
| num_steps_per_env = 16 | ||
| max_iterations = 150 | ||
| save_interval = 50 | ||
| experiment_name = "cartpole" |
There was a problem hiding this comment.
experiment_name changed for direct-env variants, breaking checkpoint resumption
The deleted rsl_rl_direct_ppo_cfg.py used experiment_name = "cartpole_direct". This file sets experiment_name = "cartpole" for all variants (including the direct-env CartpolePPORunnerCfg). Any user with existing checkpoints under logs/rsl_rl/cartpole_direct/ will need to manually rename their log directory or override the field; the default training script will silently start a fresh run instead of resuming.
Description
Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context.
List any dependencies that are required for this change.
Fixes # (issue)
Type of change
Screenshots
Please attach before and after screenshots of the change if applicable.
Checklist
pre-commitchecks with./isaaclab.sh --formatconfig/extension.tomlfileCONTRIBUTORS.mdor my name already exists there