-
-
Notifications
You must be signed in to change notification settings - Fork 4.3k
Replace RenderGraph with systems
#22144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
It looks like your PR is a breaking change, but you didn't provide a migration guide. Please review the instructions for writing migration guides, then expand or revise the content in the migration guides directory to reflect your changes. |
|
It looks like your PR has been selected for a highlight in the next release blog post, but you didn't provide a release note. Please review the instructions for writing release notes, then expand or revise the content in the release notes directory to showcase your changes. |
| ) | ||
| .entered(); | ||
|
|
||
| world.run_schedule(schedule); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the "big" reasons I've pushed back against RenderGraph -> Systems in the past is that it prevents us from parallelizing command encoding in cases like this where we need to run specific, potentially very expensive "subgraphs" in a loop.
The ownership and access patterns in the current system allow for that sort of thing (and were explicitly chosen to enable it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few things that I'd like to note:
- With the exception of one caveat I'll note in the PR description, this isn't worse than the existing solution in terms of overall parallelism.
- In the context of cameras, we effectively currently need to be serial because we re-use
ViewTargettextures. Without a higher level resource tracking system, we have to pessimistically assume that cameras need to run in terms of the order they are given due to implicit dependencies. - Something like a "read only" schedule marker has been proposed before to allow running schedules on
&World. ECS devs have suggested to me this is feasible. - We do do mutation in the current graph in many places, but have to resort to hack-y atomics usage. It's not crazy that you want to mutate world sometimes (i.e. to insert some gpu resource to be used by the next frame).
- I'd like to solve issues around parallelism in the context of more generalized ECS patterns. I have a number of more speculative ideas here, but I think there are a few different directions we could go.
But yeah, we obviously need to make sure we don't regress current state performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we also need to keep in mind maintenance. This PR manages to remove a lot of code and makes the render graph much more approachable for people that already know bevy's ECS. It's 3.8k less lines of code to maintain. Of course, line of code isn't a super useful metric for many things but I think it's significant enough that it can't just be ignored. With the current render graph, we almost never see PRs to it and I wouldn't be surprised if a big reason for that is that it looks so foreign in bevy and because it's also a lot of very complex code.
| .add_systems(Render, prepare_cas_pipelines.in_set(RenderSystems::Prepare)) | ||
| .add_systems( | ||
| Core3d, | ||
| cas.after(fxaa) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reviewer note: fxaa is ordered after tonemapping, so the .after tonemapping here is redundant thus removed
| Core3d, | ||
| temporal_anti_alias | ||
| .after(Core3dSystems::StartMainPassPostProcessing) | ||
| .before(Core3dSystems::PostProcessing), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some ordering constraints were lost here
atlv24
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remember to check Hide whitespace in diff view options!
Some notes:
- should we be using fallible systems to more closely match the Result behavior of the original nodes?
- some ordering constraints were removed but i think many were redundant. i noted these
- tracing infra probably is redundant with system traces now (followup)
- can probably use run conditions in a bunch of places too now (followup)
- render_context -> ctx could have been a different pr (and is a matter of taste but i agree with it anyways) its just noise here
- one of the solari files is jumbled and unreviewable
This is an absolutely colossal amount of work with incredible attention to detail
fantastic job, excited to see this merge. Lots of small fixes bundled here and its pretty much all good. There's some comments that were removed but i dont feel very strongly about them.
Very in favor of this merging, approval will come after comments addressed and pr exits draft status.
I'm going to dig into the CI failures soon
| start = end; | ||
| result | ||
| }) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets punt this back down to the bottom of the file
| SmaaPreset::Ultra => "SMAA_PRESET_ULTRA".into(), | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets punt this back down to the bottom of the file
| mut ctx: RenderContext, | ||
| ) { | ||
| run_downsample_depth_system( | ||
| "late_downsample_depth", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the late and early seem to be identical aside from debug string, perhaps we can introduce a const generic and swap the string with it to collapse them into one function
| meshlet_prepass | ||
| .after(early_shadow_pass) | ||
| .before(Core3dSystems::EndPrepasses), | ||
| meshlet_deferred_gbuffer_prepass | ||
| .after(meshlet_prepass) | ||
| .before(Core3dSystems::EndPrepasses), | ||
| meshlet_main_opaque_pass | ||
| .after(Core3dSystems::StartMainPass) | ||
| .before(Core3dSystems::EndMainPass), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not a faithful transcription of the orderings, not sure why its different or if it matters though
| // Early GPU preprocess runs before early prepasses | ||
| early_gpu_preprocess | ||
| .after(clear_indirect_parameters_metadata) | ||
| .before(early_prepass), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this before is redundant, but its fine to leave it for consistency i guess
| render_sky_pass.set_viewport( | ||
| viewport.physical_position.x as f32, | ||
| viewport.physical_position.y as f32, | ||
| viewport.physical_size.x as f32, | ||
| viewport.physical_size.y as f32, | ||
| viewport.depth.start, | ||
| viewport.depth.end, | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure why this is like this now
| let vertex_state = fullscreen_shader.to_vertex_state(); | ||
| let mut desc = RenderPipelineDescriptor { | ||
| label: Some("post_process_pipeline".into()), | ||
| label: Some(format!("fullscreen_material_pipeline<{}>", type_name::<T>()).into()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you this is so much better 😭
| ) | ||
| .add_render_graph_edge(Core3d, Node3d::MsaaWriteback, Node3d::StartMainPass); | ||
| } | ||
| render_app.add_systems(Core3d, msaa_writeback.before(Core3dSystems::EndPrepasses)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this before EndPrepasses now? it used to e before StartMainPass
| prepare::{SolariLightingResources, LIGHT_TILE_BLOCKS, WORLD_CACHE_SIZE}, | ||
| SolariLighting, | ||
| }; | ||
| use crate::scene::RaytracingSceneBindings; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the diff in this file is horrifying, can you reorder the bits to be closer to the order they used to be in so its easier to review?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't have two different solari_lighting systems for dlss/no-dlss.
| let mut schedule = Schedule::new(Self); | ||
|
|
||
| schedule.set_build_settings(ScheduleBuildSettings { | ||
| ..Default::default() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why doesnt 2d get auto_insert_apply_deferred: false,?
render-graph-as-systems
This PR removes the
RenderGraphin favor of using systems.Motivation
The
RenderGraphAPI was originally created when the ECS was significantly more immature. It was also created with the intention of supporting an input/output based slot system for managing resources that has never been used. While resource management is an important potential use of a render graph, current rendering code doesn't make use of any patterns relating to it.Since the ECS has improved, the functionality of
Schedulehas basically become co-extensive with what theRenderGraphAPI is doing, i.e. ordering bits of system-like logic relative to one another and executing them in a big chunk. Additionally, while there's still desire for more advanced techniques like resource management in the graph, it's desirable to implement those in ECS terms rather than creating moreRenderGraphspecific abstraction.In short, this sets us up to iterate on a more ECS based approach, while deleting ~3k lines of mostly unused code.
Implementation
At a high level: We use
Scheduleas our "sub-graph." Rather than running the graph, we run a schedule. Systems can be ordered relative to one another.The render system uses a
RenderGraphschedule to define the "root" of the graph.core_pipelineadds acamera_driversystem that runs the per-camera schedules. This top level schedule provides an extension point for apps that may want to do custom rendering, or non-camera rendering.CurrentView/ViewQueryWhen running schedules per-camera in the
camera_driversystem, we insert aCurrentViewresource that's used to mark the currently iterating view. We also add a new paramViewQuerythat internally uses this resource to execute the query and skip the system if it doesn't match as a convenience.RenderContextThe
RenderContextis now a system param that wraps aDeferredfor tracking the state of the current command encoder and queued buffers.TODO: