[Perf] Add cross-GPU subgroup.ballot(predicate) primitive #600

Draft
hughperkins wants to merge 6 commits into hp/cross-gpu-subgroup from hp/cross-gpu-ballot

Conversation

@hughperkins
Collaborator

Implement a portable ballot operation that returns a u32 bitmask where bit i is set if lane i's predicate is non-zero. Works across CUDA (__ballot_sync), AMDGPU (amdgcn_ballot.i32), and SPIR-V/Vulkan (OpGroupNonUniformBallot).
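The bitmask semantics described above can be illustrated with a small pure-Python reference model. This is only a sketch of the stated behaviour (bit i set when lane i's predicate is non-zero), not the PR's implementation and not the quadrants API; `ballot_reference` and `SUBGROUP_WIDTH` are hypothetical names, and the 32-lane width is an assumption.

```python
# Hypothetical reference model of a subgroup ballot; not the quadrants API.
SUBGROUP_WIDTH = 32  # assumed lane count for this sketch


def ballot_reference(predicates):
    """Return a bitmask where bit i is set if predicates[i] is non-zero."""
    if len(predicates) > SUBGROUP_WIDTH:
        raise ValueError("this sketch models at most 32 lanes")
    mask = 0
    for lane, predicate in enumerate(predicates):
        if predicate != 0:  # any non-zero value counts as a "yes" vote
            mask |= 1 << lane
    return mask


# Lanes 0, 2 and 7 vote "yes"; the remaining lanes vote "no".
lane_predicates = [1, 0, 5, 0, 0, 0, 0, 1]
mask = ballot_reference(lane_predicates)
print(f"{mask:#010x}")  # prints 0x00000085
```

On a GPU every lane in the subgroup would receive the same mask; the list here simply stands in for the per-lane predicate values.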

Follows the same cross-backend pattern as subgroup.shuffle: a single Python API (subgroup.ballot) dispatches to the appropriate backend intrinsic at codegen time. On AMDGPU CDNA with 64-wide wavefronts only the low 32 bits are returned, consistent with the u32 return type.
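The 64-wide wavefront caveat can be made concrete with a pure-Python sketch. It assumes the truncation behaves like masking a 64-bit vote to its low 32 bits, as the description states; `full_ballot` and `ballot_low32` are hypothetical helper names, not part of quadrants or any backend.

```python
# Hypothetical model of the u32 truncation on a 64-wide wavefront.
U32_MASK = 0xFFFFFFFF


def full_ballot(predicates):
    """Bitmask over all lanes: bit i is set if predicates[i] is non-zero."""
    mask = 0
    for lane, predicate in enumerate(predicates):
        if predicate != 0:
            mask |= 1 << lane
    return mask


def ballot_low32(predicates):
    """Keep only the votes of lanes 0..31, matching a u32 return type."""
    return full_ballot(predicates) & U32_MASK


# A 64-lane wavefront where lanes 3 and 40 vote "yes".
wave64 = [0] * 64
wave64[3] = 1
wave64[40] = 1

full = full_ballot(wave64)
low = ballot_low32(wave64)
print(bin(full).count("1"), bin(low).count("1"))  # prints 2 1
```

Under this model the vote from lane 40 is not visible in the returned mask, so code that counts set bits or searches for the first voting lane would only see lanes 0 to 31 on such hardware.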

Issue: #

Brief Summary

copilot:summary

Walkthrough

copilot:walkthrough

Collaborator Author

doc looks ok to me

hugh added 3 commits May 9, 2026 21:05
# Conflicts:
#	docs/source/user_guide/subgroup.md

The macOS build was failing because spirv_codegen.cpp accessed
IRBuilder::t_v4_uint_ directly, which is a private member. Add a
public v4_u32_type() accessor, following the existing pattern
(u32_type(), bool_type(), etc.), and use it from the ballot lowering.

@github-actions

github-actions Bot commented May 9, 2026

@hughperkins hughperkins changed the base branch from main to hp/cross-gpu-subgroup May 9, 2026 22:03
…ross-gpu-ballot

# Conflicts:
#	docs/source/user_guide/subgroup.md
#	python/quadrants/lang/simt/subgroup.py
#	quadrants/codegen/amdgpu/codegen_amdgpu.cpp
#	quadrants/codegen/cuda/codegen_cuda.cpp
#	quadrants/codegen/spirv/spirv_codegen.cpp
@github-actions

github-actions Bot commented May 9, 2026
