Add Cosmos3-Nano LIBERO-10 action-policy SFT recipe, config, eval harness, and doc#61
Open
fwd4 wants to merge 9 commits into
…ness, and doc

Mirrors the DROID action-policy counterpart (action_policy_droid_nano + repro toml + launch + doc).

Net-new LIBERO feature:
- experiment config: action_policy_libero_nano
- dataset: LIBEROLeRobotDataset + get_action_libero_sft_dataset (frame_wise_relative rot6d, quantile_rot, concat_view, 20fps); base_dataset tasks.parquet fallback for community LIBERO layouts; resample-on-decode-failure guard (matches i4 behavior)
- closed-loop eval harness (vectorized sim) + batched /predict_batch inference path + single-rank no_dist checkpoint load for the policy server
- canonical recipe action_policy_libero_repro.toml + launch_sft_action_policy_libero.sh (lr 5e-5, warmup 500, cycle 16000, global batch 2048; ~95% libero_10 500-ep eval)
- docs/action_policy_libero_sft.md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
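For readers unfamiliar with the rot6d action encoding named in the dataset item, here is a minimal sketch of the general technique. It is not taken from the PR's code: the function names are hypothetical, the "first two columns" layout is one common convention (some libraries keep the first two rows instead), and reading frame_wise_relative as "rotation between consecutive frames" is an assumption.

```python
import numpy as np

def rotmat_to_rot6d(R):
    # Keep the first two columns of the 3x3 rotation matrix (6 numbers).
    # Assumed convention; the PR's dataset may order the values differently.
    return np.concatenate([R[:, 0], R[:, 1]])

def rot6d_to_rotmat(d6):
    # Recover a valid rotation matrix via Gram-Schmidt orthonormalization.
    a1, a2 = d6[:3], d6[3:]
    b1 = a1 / np.linalg.norm(a1)
    a2 = a2 - np.dot(b1, a2) * b1      # remove the component along b1
    b2 = a2 / np.linalg.norm(a2)
    b3 = np.cross(b1, b2)              # third axis completes a right-handed frame
    return np.stack([b1, b2, b3], axis=1)

def relative_rotation(R_prev, R_next):
    # Rotation from frame t to frame t+1, expressed in frame t (assumed meaning
    # of "frame-wise relative").
    return R_prev.T @ R_next

def rot_z(theta):
    # Helper for the example: rotation about the z axis by theta radians.
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Example: two consecutive end-effector orientations.
R_t, R_t1 = rot_z(0.1), rot_z(0.4)
action_rot6d = rotmat_to_rot6d(relative_rotation(R_t, R_t1))
```

The 6D form avoids the discontinuities of Euler angles and quaternions, and any 6 predicted numbers with two linearly independent halves map back to a valid rotation through the Gram-Schmidt step.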
Trim the toml/config/launch/doc comments (drop SR numbers and experimental detail), and set the canonical recipe to HSDP 2x8 with grad_accum=1 (global batch 2048) instead of single-node grad_accum=2.
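The two layouts in this commit message reach the same global batch if the per-rank micro-batch stays fixed. A quick arithmetic check, assuming "2x8" means 16 data-parallel ranks, "single-node" means 8 GPUs, and a per-rank micro-batch of 128 (a value derived from these numbers, not stated in the PR):

```python
def global_batch(ranks, micro_batch, grad_accum):
    # Samples contributing to one optimizer step.
    return ranks * micro_batch * grad_accum

MICRO_BATCH = 128  # assumption: inferred from 2048 / 16, not taken from the recipe

hsdp_2x8 = global_batch(ranks=2 * 8, micro_batch=MICRO_BATCH, grad_accum=1)
single_node = global_batch(ranks=8, micro_batch=MICRO_BATCH, grad_accum=2)

print(hsdp_2x8, single_node)
```

Under these assumptions both configurations take optimizer steps over 2048 samples; the HSDP layout does it in one forward/backward pass per step instead of two.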
- action_sft_dataset.py: rebuild as origin/main + libero-only (drop the speedup-era ShardedDROIDLeRobotDataset import that broke config load on a clean main).
- remove dataset_reply_action_server.py (GT-replay debug tool, not part of the recipe).
- drop DROID/LoRA references from libero docstrings/comments/doc/launch.
…etions); EGL setup optional in doc
…aunch -> gbs 2048)
What
Adds the Cosmos3-Nano LIBERO-10 action-policy SFT surface, mirroring the existing DROID counterpart (action_policy_droid_nano + action_policy_droid_repro.toml + launch_sft_action_policy_droid.sh + doc).

Feature (net-new)
- action_policy_libero_nano (gen + action heads from the public Cosmos3-Nano GA base).
- LIBEROLeRobotDataset + get_action_libero_sft_dataset: frame_wise_relative rot6d, quantile_rot, concat_view (third-person + wrist), 20 fps.
- base_dataset tasks.parquet fallback for community LIBERO layouts.
- /predict_batch path and single-rank no_dist checkpoint load in the policy server.

Recipe + doc
- examples/toml/sft_config/action_policy_libero_repro.toml + examples/launch_sft_action_policy_libero.sh: lr 5e-5, warmup 500, cycle 16000, global batch 2048 (HSDP 2x8).
- docs/action_policy_libero_sft.md.

Notes
- main.

🤖 Generated with Claude Code
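To make the recipe's optimizer numbers (lr 5e-5, warmup 500, cycle 16000) concrete, the sketch below shows one plausible reading of them. The PR does not state the schedule shape, so linear warmup followed by cosine decay, a cycle measured in optimizer steps, and a floor of 0 are all assumptions; `lr_at` is a hypothetical helper, not a function from this repository.

```python
import math

def lr_at(step, peak_lr=5e-5, warmup=500, cycle=16000, min_lr=0.0):
    # Assumed shape: linear warmup to peak_lr, then cosine decay until `cycle`.
    if step < warmup:
        return peak_lr * (step + 1) / warmup
    # Fraction of the post-warmup cycle completed, clamped to [0, 1].
    progress = min((step - warmup) / max(cycle - warmup, 1), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Inspect a few points along the assumed schedule.
for step in (0, 250, 500, 8250, 16000):
    print(step, lr_at(step))
```

If the actual config uses a different decay (for example a non-zero floor or restarts after each cycle), only the branch after warmup would change.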