
[Feature] Implements DreamerV3 (Mastering Diverse Domains in World Models, Hafn… #3621

Open
theap06 wants to merge 2 commits into pytorch:main from theap06:feat/dreamer-v3

Conversation

@theap06
Contributor

@theap06 theap06 commented Apr 12, 2026

Fixes #3613

Implements DreamerV3 (Mastering Diverse Domains through World Models, Hafner et al. 2023) as a set of composable loss modules and latent model components on top of the existing DreamerV1 infrastructure.

New files:

- torchrl/objectives/dreamer_v3.py: DreamerV3ModelLoss, DreamerV3ActorLoss, DreamerV3ValueLoss; symlog/symexp transforms; two_hot_encode/two_hot_decode; categorical_kl_balanced with KL balancing (α=0.8) and free bits
- torchrl/modules/models/model_based_v3.py: RSSMPriorV3, RSSMPosteriorV3, RSSMRolloutV3 with discrete categorical latent and straight-through gradient estimator

Updated:

- torchrl/objectives/__init__.py: exports V3 losses and utility functions
- torchrl/modules/models/__init__.py: exports V3 RSSM classes
- test/test_objectives.py: TestDreamerV3 class with 30 tests (all passing)
- docs/source/reference/objectives_other.rst: DreamerV3 API reference section
- docs/source/reference/modules_models.rst: V3 RSSM classes in world models section

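For readers unfamiliar with the symlog and two-hot utilities listed above, here is a minimal scalar sketch of the idea. It is written in plain Python so it runs without dependencies; the PR's own functions operate on tensors, and their exact signatures, bin layout, and clamping behavior are not shown on this page, so everything below (argument names, the five-bin grid) is an assumption for illustration only.

```python
import bisect
import math


def symlog(x: float) -> float:
    """Compress magnitude while keeping the sign: sign(x) * ln(1 + |x|)."""
    return math.copysign(math.log1p(abs(x)), x)


def symexp(x: float) -> float:
    """Inverse of symlog: sign(x) * (exp(|x|) - 1)."""
    return math.copysign(math.expm1(abs(x)), x)


def two_hot_encode(value: float, bins: list[float]) -> list[float]:
    """Put weight on the two bins that bracket `value`.

    The weights sum to 1 and their bin-weighted average equals `value`
    (after clamping `value` to the range covered by `bins`).
    `bins` must be sorted ascending and contain at least two entries.
    """
    value = min(max(value, bins[0]), bins[-1])
    hi = max(1, bisect.bisect_left(bins, value))  # upper neighbour index
    lo = hi - 1                                   # lower neighbour index
    w_hi = (value - bins[lo]) / (bins[hi] - bins[lo])
    weights = [0.0] * len(bins)
    weights[lo] = 1.0 - w_hi
    weights[hi] = w_hi
    return weights


def two_hot_decode(weights: list[float], bins: list[float]) -> float:
    """Expected bin value under the given weights."""
    return sum(w * b for w, b in zip(weights, bins))


# Hypothetical bin grid in symlog space; real setups use many more bins.
bins = [-2.0, -1.0, 0.0, 1.0, 2.0]
target = two_hot_encode(symlog(3.0), bins)        # regression target as a distribution
recovered = symexp(two_hot_decode(target, bins))  # back to the original scale
```

Encoding in symlog space and decoding with symexp keeps the bin grid small even when raw rewards or returns span several orders of magnitude.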
Motivation and Context
TorchRL supports DreamerV1/V2 but not DreamerV3, which introduces several key changes: discrete categorical latent states, KL balancing between prior and posterior, symlog-compressed targets, and two-hot categorical value distributions. According to the paper, these changes improve stability and sample efficiency across diverse domains. This PR adds V3 as a self-contained extension: V1 is untouched and existing users are unaffected.
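To make the KL balancing and free-bits terms concrete, the following PyTorch sketch shows one common formulation: the same categorical KL is computed twice, once with the posterior detached (so only the prior is trained) and once with the prior detached (so only the posterior is regularized), and the two are mixed with weight α. This is not the PR's `categorical_kl_balanced`; the function names, the logits layout `(batch, groups, classes)`, the reduction order, and where the free-bits clamp is applied are all assumptions.

```python
import torch
import torch.nn.functional as F


def categorical_kl(p_logits: torch.Tensor, q_logits: torch.Tensor) -> torch.Tensor:
    """KL(p || q) for categorical distributions given as logits over the last dim."""
    p_log = F.log_softmax(p_logits, dim=-1)
    q_log = F.log_softmax(q_logits, dim=-1)
    return (p_log.exp() * (p_log - q_log)).sum(dim=-1)


def kl_balanced(
    post_logits: torch.Tensor,
    prior_logits: torch.Tensor,
    alpha: float = 0.8,
    free_bits: float = 1.0,
) -> torch.Tensor:
    """Balanced KL with a free-bits floor (illustrative reduction, see lead-in)."""
    # Prior-side term: posterior is a fixed target, gradients reach the prior only.
    dyn = categorical_kl(post_logits.detach(), prior_logits).sum(dim=-1)
    # Posterior-side term: prior is a fixed target, gradients reach the posterior only.
    rep = categorical_kl(post_logits, prior_logits.detach()).sum(dim=-1)
    # Free bits: KL below the floor contributes a constant, hence no gradient.
    dyn = torch.clamp(dyn, min=free_bits)
    rep = torch.clamp(rep, min=free_bits)
    return (alpha * dyn + (1.0 - alpha) * rep).mean()


# Hypothetical shapes: batch of 4, 3 categorical groups, 8 classes each.
torch.manual_seed(0)
post = torch.randn(4, 3, 8, requires_grad=True)
prior = torch.randn(4, 3, 8, requires_grad=True)
loss = kl_balanced(post, prior, free_bits=0.0)
loss.backward()
```

With α=0.8 the prior is pulled toward the posterior more strongly than the posterior is pulled toward the prior, while the free-bits floor stops the loss from pushing an already small KL any lower.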

- I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

- New feature (non-breaking change which adds core functionality)
- Documentation (update in the documentation)

Checklist

- I have read the CONTRIBUTION guide (required)
- My change requires a change to the documentation.
- I have updated the tests accordingly (required for a bug fix or a new feature).
- I have updated the documentation accordingly.

@pytorch-bot

pytorch-bot bot commented Apr 12, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3621

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 13 New Failures, 1 Cancelled Job

As of commit 182811b with merge base be89a87:

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Apr 12, 2026
@github-actions
Contributor

⚠️ PR Title Label Error

PR title must start with a label prefix in brackets (e.g., [BugFix]).

Current title: Implements DreamerV3 (Mastering Diverse Domains in World Models, Hafn…

Supported Prefixes (case-sensitive)

Your PR title must start with exactly one of these prefixes:

| Prefix | Label Applied | Example |
| --- | --- | --- |
| [BugFix] | BugFix | [BugFix] Fix memory leak in collector |
| [Feature] | Feature | [Feature] Add new optimizer |
| [Doc] or [Docs] | Documentation | [Doc] Update installation guide |
| [Refactor] | Refactoring | [Refactor] Clean up module imports |
| [CI] | CI | [CI] Fix workflow permissions |
| [Test] or [Tests] | Tests | [Tests] Add unit tests for buffer |
| [Environment] or [Environments] | Environments | [Environments] Add Gymnasium support |
| [Data] | Data | [Data] Fix replay buffer sampling |
| [Performance] or [Perf] | Performance | [Performance] Optimize tensor ops |
| [BC-Breaking] | bc breaking | [BC-Breaking] Remove deprecated API |
| [Deprecation] | Deprecation | [Deprecation] Mark old function |
| [Quality] | Quality | [Quality] Fix typos and add codespell |

Note: Common variations like singular/plural are supported (e.g., [Doc] or [Docs]).

@github-actions github-actions bot added the Documentation, Objectives, and Modules labels Apr 12, 2026
@theap06 theap06 changed the title from "Implements DreamerV3 (Mastering Diverse Domains in World Models, Hafn…" to "[Feature] Implements DreamerV3 (Mastering Diverse Domains in World Models, Hafn…" Apr 12, 2026
@github-actions github-actions bot added the Feature label Apr 12, 2026
self-contained alongside the other dedicated test files.

Labels

CLA Signed, Documentation, Feature, Modules, Objectives

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature Request] Implementing TD-MPC for a Model-Based Approach

1 participant