[AutoDiff] Adstack max-reducer: parallel multi-axis MaxOverRange dispatch #635
Merged
32 commits
c56d388
[AutoDiff] Stage 1.1: recognize MaxOverRange specs reducible by a par…
duburcqa 5b7c464
[AutoDiff] Stage 1.2: SPIR-V max-reducer shader (option D); body byte…
duburcqa cf1fdd2
[AutoDiff] Stage 1.3: LLVM runtime function for the option-D max redu…
duburcqa 872e2a6
[AutoDiff] Stage 1.4a: AdStackCache max-reducer cache methods + body …
duburcqa 73f1f36
[AutoDiff] Stage 1.6: substitute_precomputed_max_over_range helper (r…
duburcqa 961349f
[AutoDiff] Stage 1.4b: GfxRuntime::dispatch_max_reducers + adstack_ma…
duburcqa 639e1cd
[AutoDiff] Stage 1.4+1.6: launch_kernel wires dispatch_max_reducers a…
duburcqa 0df62e1
[AutoDiff] Hard-require PSB+Int64 at the adstack reverse-mode entry; …
duburcqa a9c9b95
[AutoDiff] Stage 1.5 + comment cleanup: LLVM dispatch_max_reducers_fo…
duburcqa c03016f
[AutoDiff] Adstack max-reducer: dispatch fixes, Metal u32 atomic, cap…
duburcqa f73c157
[AutoDiff] Adstack: short-circuit MaxOverRange walk on cap-hit (avoid…
duburcqa 3e6a03e
[AutoDiff] Adstack: drop LLVM device sizer overflow-flag write to avo…
duburcqa d0b908f
[AutoDiff] Adstack: scope cap-hit tripwire test to backends with expl…
duburcqa 9748cc9
[AutoDiff] Adstack: drop arch restriction on cap-hit tripwire test
duburcqa 98dd82d
[Docs] Document the per-task sizer iteration cap and its parallel-eva…
duburcqa 9a8bc2d
[AutoDiff] Adstack max-reducer: capture nested MaxOverRange chains ac…
duburcqa 47fc8d2
[AutoDiff] Adstack max-reducer: round-based dispatch substitutes capt…
duburcqa df42498
[AutoDiff] Adstack max-reducer: support bound-var-indexed FieldLoad i…
duburcqa f6c146b
[AutoDiff] LLVM adstack lazy-claim: split into stage-grouped subdir (…
duburcqa 91aa148
[Runtime] Split adstack runtime helpers into a separate translation u…
duburcqa 3dc7253
[Docs] Reformat 'What can go wrong' as FAQ-style subsections; tighten…
duburcqa f34db99
[CI] Search $LLVM_DIR/bin for llvm-link so the runtime bitcode link s…
duburcqa 85ceb31
[CI] chmod 0755 LLVM toolchain binaries after extract so the bitcode …
duburcqa d92fee3
[Docs] Reflow three comment blocks in adstack max-reducer files to wr…
duburcqa 279baf6
[Runtime] Revert separate-TU build to single-TU include-cpp; llvm-lin…
duburcqa 1d695c8
[AutoDiff] Skip LLVM max-reducer dispatch on pre-Ampere CUDA where th…
e830d60
Fix CUDA Graph grad for adstack.
duburcqa d9397bb
[AutoDiff] Pin max-reducer dispatch to nullptr stream on CUDA to matc…
duburcqa 157ddef
[AutoDiff] LLVM max-reducer: split CPU serial vs CUDA/AMDGPU parallel…
duburcqa fbe4c6e
[Docs] Reword 'Inner reverse-mode loop with a complex bound' to use c…
duburcqa f19244c
[Perf] Adstack max-reducer: gate per-launch dispatch on captured spec…
duburcqa 9ca862f
[Docs] Reword 'Inner reverse-mode loop with a complex bound' section …
duburcqa
I can't help suspecting this is very likely the original bullet point, just with a reference to Appendix C added :) However, having the link to Appendix C does reduce the burden of being easily understandable, I feel. So, ok :)
It is not!