[Feature] Implements DreamerV3 (Mastering Diverse Domains in World Models, Hafn… #3621
Open
theap06 wants to merge 2 commits into pytorch:main
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3621
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEV: there is 1 currently active SEV that may affect this PR.
❌ 13 New Failures, 1 Cancelled Job
As of commit 182811b with merge base be89a87, 13 jobs have newly failed and 1 job was cancelled (please retry).
This comment was automatically generated by Dr. CI and updates every 15 minutes.
| Prefix | Label Applied | Example |
|---|---|---|
| [BugFix] | BugFix | [BugFix] Fix memory leak in collector |
| [Feature] | Feature | [Feature] Add new optimizer |
| [Doc] or [Docs] | Documentation | [Doc] Update installation guide |
| [Refactor] | Refactoring | [Refactor] Clean up module imports |
| [CI] | CI | [CI] Fix workflow permissions |
| [Test] or [Tests] | Tests | [Tests] Add unit tests for buffer |
| [Environment] or [Environments] | Environments | [Environments] Add Gymnasium support |
| [Data] | Data | [Data] Fix replay buffer sampling |
| [Performance] or [Perf] | Performance | [Performance] Optimize tensor ops |
| [BC-Breaking] | bc breaking | [BC-Breaking] Remove deprecated API |
| [Deprecation] | Deprecation | [Deprecation] Mark old function |
| [Quality] | Quality | [Quality] Fix typos and add codespell |
Note: Common variations like singular/plural are supported (e.g., [Doc] or [Docs]).
self-contained alongside the other dedicated test files.
Fixes #3613
Implements DreamerV3 (Mastering Diverse Domains through World Models, Hafner et al. 2023) as a set of composable loss modules and latent model components on top of the existing DreamerV1 infrastructure.
New files:
- torchrl/objectives/dreamer_v3.py: DreamerV3ModelLoss, DreamerV3ActorLoss, DreamerV3ValueLoss; symlog/symexp transforms; two_hot_encode/two_hot_decode; categorical_kl_balanced with KL balancing (α=0.8) and free bits
- torchrl/modules/models/model_based_v3.py: RSSMPriorV3, RSSMPosteriorV3, RSSMRolloutV3 with discrete categorical latent and straight-through gradient estimator
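To make the symlog and two-hot utilities concrete, here is a minimal pure-Python sketch on scalars. It is not the code from this PR: the function bodies, the scalar (non-tensor) signatures, and the five-bin grid are assumptions chosen only for illustration, and the PR's versions presumably operate on tensors.

```python
import math
from typing import List, Sequence


def symlog(x: float) -> float:
    """sign(x) * ln(1 + |x|): compresses large magnitudes, roughly identity near 0."""
    return math.copysign(math.log1p(abs(x)), x)


def symexp(x: float) -> float:
    """Inverse of symlog: sign(x) * (exp(|x|) - 1)."""
    return math.copysign(math.expm1(abs(x)), x)


def two_hot_encode(value: float, bins: Sequence[float]) -> List[float]:
    """Spread `value` over its two neighbouring bins; the weights sum to 1."""
    value = min(max(value, bins[0]), bins[-1])  # clamp to the supported range
    probs = [0.0] * len(bins)
    # Index of the first bin that is >= value (bins must be sorted ascending).
    hi = next(i for i, b in enumerate(bins) if b >= value)
    if bins[hi] == value:
        probs[hi] = 1.0  # value sits exactly on a bin
        return probs
    lo = hi - 1
    w_hi = (value - bins[lo]) / (bins[hi] - bins[lo])
    probs[lo], probs[hi] = 1.0 - w_hi, w_hi
    return probs


def two_hot_decode(probs: Sequence[float], bins: Sequence[float]) -> float:
    """Expected bin value under the categorical distribution `probs`."""
    return sum(p * b for p, b in zip(probs, bins))


# Illustrative grid in symlog space (assumed; real configurations use many more bins).
bins = [-2.0, -1.0, 0.0, 1.0, 2.0]
target = two_hot_encode(symlog(3.0), bins)       # categorical training target
recovered = symexp(two_hot_decode(target, bins))  # map a prediction back to raw scale
```

Encoding in symlog space and decoding with symexp lets one fixed bin grid cover targets of very different magnitudes, which is the motivation for pairing the two transforms.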
Updated:
- torchrl/objectives/__init__.py: exports V3 losses and utility functions
- torchrl/modules/models/__init__.py: exports V3 RSSM classes
- test/test_objectives.py: TestDreamerV3 class with 30 tests (all passing)
- docs/source/reference/objectives_other.rst: DreamerV3 API reference section
- docs/source/reference/modules_models.rst: V3 RSSM classes in world models section
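The straight-through gradient estimator mentioned for the V3 RSSM classes can be sketched as follows. This is an illustrative PyTorch snippet, not the PR's implementation: the helper name `straight_through_sample`, the `[..., num_variables, num_classes]` layout, and the 32 x 32 latent size are assumptions.

```python
import torch
import torch.nn.functional as F


def straight_through_sample(logits: torch.Tensor) -> torch.Tensor:
    """Sample one-hot categorical latents with straight-through gradients.

    logits: [..., num_variables, num_classes] (layout is an assumption).
    """
    probs = F.softmax(logits, dim=-1)
    # Draw a hard class index per latent variable; sampling itself has no gradient.
    index = torch.distributions.Categorical(probs=probs).sample()
    one_hot = F.one_hot(index, num_classes=logits.shape[-1]).to(probs.dtype)
    # Forward value is the one-hot sample (up to floating-point rounding);
    # the backward pass uses the gradient of `probs` instead.
    return one_hot + probs - probs.detach()


torch.manual_seed(0)
logits = torch.randn(4, 32, 32, requires_grad=True)  # illustrative sizes
latent = straight_through_sample(logits)
weights = torch.randn(32, 32)        # stand-in for a downstream network
(latent * weights).sum().backward()  # gradients reach `logits` despite the hard sample
```

The trick keeps the latent discrete in the forward pass while still letting the world-model loss train the logits that produced it.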
Motivation and Context
TorchRL supports DreamerV1/V2 but not DreamerV3, which relies on discrete categorical latent states, KL balancing between prior and posterior, symlog-compressed targets, and two-hot categorical value distributions. These changes are intended to improve stability and sample efficiency across diverse domains. This PR adds V3 as a self-contained extension: V1 is untouched and existing users are unaffected.
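As a rough illustration of KL balancing with free bits, the sketch below computes the same KL divergence twice with different stop-gradients and mixes the two terms. It is an assumption-laden example rather than the PR's `categorical_kl_balanced`: the name `kl_balanced_sketch`, the signature, the choice to let `alpha` weight the term that trains the prior, and the clamping of each term separately are all hypothetical.

```python
import torch
from torch.distributions import Categorical, kl_divergence


def kl_balanced_sketch(
    post_logits: torch.Tensor,
    prior_logits: torch.Tensor,
    alpha: float = 0.8,
    free_bits: float = 1.0,
) -> torch.Tensor:
    """Balanced KL between categorical posterior and prior.

    Both inputs: [..., num_variables, num_classes] (layout is an assumption).
    """

    def kl(q_logits: torch.Tensor, p_logits: torch.Tensor) -> torch.Tensor:
        # KL(q || p) per latent variable, summed over the variables.
        return kl_divergence(Categorical(logits=q_logits), Categorical(logits=p_logits)).sum(-1)

    # Term 1: posterior is held fixed, so only the prior receives gradients.
    kl_prior = kl(post_logits.detach(), prior_logits)
    # Term 2: prior is held fixed, so only the posterior receives gradients.
    kl_post = kl(post_logits, prior_logits.detach())
    # Free bits: clamping from below removes the gradient once a term is small enough.
    kl_prior = kl_prior.clamp(min=free_bits)
    kl_post = kl_post.clamp(min=free_bits)
    return (alpha * kl_prior + (1.0 - alpha) * kl_post).mean()


torch.manual_seed(0)
post = torch.randn(4, 32, 32, requires_grad=True)   # illustrative sizes
prior = torch.randn(4, 32, 32, requires_grad=True)
loss = kl_balanced_sketch(post, prior)
loss.backward()  # both tensors receive gradients, scaled by alpha and 1 - alpha
```

Both terms have the same numerical value before clamping; the balancing only changes how strongly the prior and the posterior are each pulled toward the other.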
I have raised an issue to propose this change (required for new features and bug fixes)
Types of changes
New feature (non-breaking change which adds core functionality)
Documentation (update in the documentation)
Checklist
I have read the CONTRIBUTION guide (required)
My change requires a change to the documentation.
I have updated the tests accordingly (required for a bug fix or a new feature).
I have updated the documentation accordingly.