[Docs] Add user-guide page for qd.algorithms.* device-wide algorithms by hughperkins · Pull Request #642 · Genesis-Embodied-AI/quadrants

hughperkins · 2026-05-07T07:44:53Z

Summary

New user-guide page docs/source/user_guide/algorithms.md documenting qd.algorithms.parallel_sort and qd.algorithms.PrefixSumExecutor.

Companion to the in-flight subgroup (#639), block (#638), atomics (#640), and grid (#641) docs; this is one of a series of tier-by-tier user-guide pages. Independent — no cross-links to unmerged docs.

Includes:

Backend / dtype support matrix surfacing the most-hit limitation: PrefixSumExecutor is i32-only and CUDA + Vulkan only.
Semantics for both ops with explicit constructor / API contracts (length cap, in-place behaviour, stability of parallel_sort).
Worked examples: key-value sort by per-element key; scan-based select / compact via flag-prefix-sum.
Performance notes — O(N log² N) vs radix, allocate-once / run-many for the executor, where the cost actually goes.
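For readers unfamiliar with the "scan-based select / compact via flag-prefix-sum" pattern named above, here is a minimal CPU-only sketch in plain Python. It does not call `qd.algorithms`; the helper names `inclusive_scan` and `compact` are hypothetical and only illustrate how a prefix sum over 0/1 flags yields output slots.

```python
def inclusive_scan(values):
    """Sequential inclusive prefix sum; a stand-in for a device-wide scan."""
    out = []
    running = 0
    for v in values:
        running += v
        out.append(running)
    return out


def compact(items, keep):
    """Return the items for which keep(item) is true, preserving order."""
    # Step 1: one flag per element (1 = keep, 0 = drop).
    flags = [1 if keep(x) else 0 for x in items]
    # Step 2: inclusive scan of the flags gives a 1-based output slot.
    scanned = inclusive_scan(flags)
    total = scanned[-1] if scanned else 0
    # Step 3: scatter each kept element to slot (scanned[i] - 1).
    out = [None] * total
    for i, x in enumerate(items):
        if flags[i]:
            out[scanned[i] - 1] = x
    return out


evens = compact([3, 8, 1, 6, 4, 7], lambda x: x % 2 == 0)
print(evens)
```

On a GPU the flag, scan, and scatter steps would each be a parallel kernel; the scatter is safe to parallelise because every kept element receives a distinct slot.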

Adds a new "Algorithms" caption to the user-guide toctree in index.md.

Test plan

cd docs && make html builds without warnings.
Visual check rendered page on RTD preview.

Made with Cursor

Documents the two ops currently in qd.algorithms: parallel_sort (odd-even merge sort, key or key-value, all backends, not stable) and PrefixSumExecutor (Kogge-Stone hierarchical inclusive scan, i32 only, CUDA + Vulkan only). Covers semantics, the i32 / CUDA + Vulkan limitation that cross-platform code most commonly hits, the allocate-once / run-many pattern, and worked examples (key-value sort, scan-based compact). Adds a new 'Algorithms' caption to the toctree in index.md.
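As background for the Kogge-Stone scan mentioned above, the following is a single-level sketch in plain Python. It is an illustration of the general technique only, not the library's hierarchical implementation, and the function name `kogge_stone_inclusive_scan` is hypothetical.

```python
def kogge_stone_inclusive_scan(values):
    """Inclusive prefix sum using Kogge-Stone doubling strides."""
    data = list(values)
    n = len(data)
    stride = 1
    while stride < n:
        # Every position reads the previous round's buffer, so a fresh list
        # is built each round (this models the per-round synchronisation
        # that parallel threads would need).
        data = [
            data[i] + data[i - stride] if i >= stride else data[i]
            for i in range(n)
        ]
        stride *= 2
    return data


result = kogge_stone_inclusive_scan([1, 2, 3, 4, 5])
print(result)
```

The loop runs about log2(N) rounds, and each round performs up to N additions, which is why this scheme trades extra work for a short dependency chain.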

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3c80d6083f

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-05-07T07:46:49Z

+
+`run(input_arr)`:
+
+- `input_arr` must be a 1-D `qd.field(qd.i32, shape=(L,))` with `L <= length`.


Require the scanned array to match the executor length

This documents length as a maximum and says any L <= length input is valid, but PrefixSumExecutor.run() always copies exactly self.sorting_length elements from and back to input_arr (python/quadrants/algorithms/_algorithms.py:90 and :114) and never switches to the runtime array length. If a user follows this and reuses an executor for a shorter field, the kernel reads/writes past that field's extent rather than scanning only L elements, so the docs should state that the field length must match the constructor length unless the implementation is changed.
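Until the contract is settled, a caller could guard against the mismatch described above with a pre-check. The sketch below is plain Python operating on a shape tuple; `check_scan_input` is a hypothetical helper, not part of `qd.algorithms`.

```python
def check_scan_input(executor_length, input_shape):
    """Raise ValueError unless the input is 1-D with exactly executor_length elements."""
    if len(input_shape) != 1:
        raise ValueError(f"expected a 1-D input, got shape {input_shape}")
    if input_shape[0] != executor_length:
        raise ValueError(
            f"input has {input_shape[0]} elements but the executor was "
            f"constructed with length {executor_length}"
        )


# Matching length passes silently.
check_scan_input(8, (8,))

# A shorter input is rejected before any kernel would run.
try:
    check_scan_input(8, (5,))
except ValueError as err:
    print(err)
```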


github-actions · 2026-05-07T09:14:23Z

Diff coverage: 0% · 0 lines, 0 missing


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0fd07ef95e


chatgpt-codex-connector · 2026-05-07T18:11:37Z

+
+### `qd.algorithms.parallel_sort(keys, values=None)`
+
+In-place sort. Reorders `keys` ascending; if `values` is provided, applies the same permutation to `values` (key-value sort). Both arguments must be 1-D `field` or `ndarray`.


Restrict sort docs to fields

This documents parallel_sort as accepting qd.ndarray, but the implementation passes the argument to sort_stage, which immediately accesses keys.snode.ptr.offset / values.snode.ptr.offset in python/quadrants/_kernels.py; ScalarNdarray stores an arr handle and has no snode attribute. In the scenario where a user follows this new guide and calls qd.algorithms.parallel_sort(qd.ndarray(...)), the call fails during tracing instead of sorting, so the public docs should not advertise ndarray support unless the kernel is changed to handle ndarray arguments.


alanray-tech

Code Review

Two inline findings — both are factual inaccuracies that will mislead users into hitting runtime crashes. Backed by source-code evidence.

alanray-tech · 2026-05-07T18:27:18Z

+
+### `qd.algorithms.parallel_sort(keys, values=None)`
+
+In-place sort. Reorders `keys` ascending; if `values` is provided, applies the same permutation to `values` (key-value sort). Both arguments must be 1-D `field` or `ndarray`.


Must fix — parallel_sort does not support ndarray, only field

This line says:

Both arguments must be 1-D field or ndarray.

But parallel_sort delegates to the sort_stage kernel, which directly accesses keys.snode.ptr.offset and values.snode.ptr.offset (_kernels.py:237–238). ndarray (ScalarNdarray) has no snode attribute — it uses an arr handle with a different storage mechanism. Passing an ndarray will fail during tracing with an AttributeError.

The test suite (test_sort.py) exclusively uses qd.field() and never passes ndarray to parallel_sort.

Suggested fix:

Both arguments must be 1-D field.
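The failure mode can be turned into an early, readable error with a duck-typing check on the `snode` attribute the review points to. The sketch below uses stand-in classes so it runs without the library; `FakeField`, `FakeNdarray`, and `require_field` are all hypothetical names.

```python
class FakeField:
    """Stand-in for a field: exposes an `snode` attribute."""

    def __init__(self):
        self.snode = object()


class FakeNdarray:
    """Stand-in for an ndarray: holds an `arr` handle and has no `snode`."""

    def __init__(self):
        self.arr = object()


def require_field(arg, name):
    """Return arg unchanged if it looks like a field, else raise TypeError."""
    if not hasattr(arg, "snode"):
        raise TypeError(
            f"{name} must be a field (no 'snode' attribute found on "
            f"{type(arg).__name__})"
        )
    return arg


require_field(FakeField(), "keys")  # accepted

try:
    require_field(FakeNdarray(), "keys")
except TypeError as err:
    print(err)
```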

alanray-tech · 2026-05-07T18:27:18Z

+
+`run(input_arr)`:
+
+- `input_arr` must be a 1-D `qd.field(qd.i32, shape=(L,))` with `L <= length`.


Must fix — L <= length contract is incorrect; L must equal length

This line says:

input_arr must be a 1-D qd.field(qd.i32, shape=(L,)) with L <= length.

But run() always uses the constructor's length, not the runtime array size. The relevant code path in _algorithms.py:

    # line 75-76: length is fixed at construction time
    def run(self, input_arr):
        length = self.sorting_length
        # line 90: copies `length` elements FROM input_arr
        blit_from_field_to_field(self.large_arr, input_arr, 0, length)
        # line 114: copies `length` elements BACK to input_arr
        blit_from_field_to_field(input_arr, self.large_arr, 0, length)

If input_arr has L < length elements, both blit_from_field_to_field calls read/write past the field's extent — out-of-bounds access with undefined behavior.

The constructor description at line 51 also says length is "the maximum number of elements", which reinforces the incorrect impression that shorter arrays are safe.

Suggested fix — either:

Change the doc to state that input_arr must have exactly length elements (and update line 51 accordingly), or

Fix the implementation to use min(self.sorting_length, input_arr.shape[0]) so shorter arrays are actually supported.
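To make the second option concrete, here is a list-based simulation of the clamped copy in plain Python. It is a sketch under assumptions: `blit` and `run_clamped` are hypothetical stand-ins, the scan is done sequentially, and Python raises `IndexError` where a device kernel would silently access memory out of bounds.

```python
def blit(dst, src, offset, size):
    """Copy `size` elements from src to dst starting at `offset` (list stand-in)."""
    for i in range(offset, offset + size):
        dst[i] = src[i]  # IndexError here models the out-of-bounds access


def run_clamped(scratch, input_arr, sorting_length):
    """Inclusive scan of input_arr in place, never touching more than it holds."""
    # Clamp to the runtime length, as suggested in the review.
    length = min(sorting_length, len(input_arr))
    blit(scratch, input_arr, 0, length)
    running = 0
    for i in range(length):
        running += scratch[i]
        scratch[i] = running
    blit(input_arr, scratch, 0, length)
    return input_arr


scratch = [0] * 8      # sized for the constructor length
data = [1, 2, 3]       # shorter than the constructor length
print(run_clamped(scratch, data, 8))

# Without the clamp, the copy runs past the end of the shorter input.
try:
    blit([0] * 8, [1, 2, 3], 0, 8)
except IndexError:
    print("unclamped copy went out of bounds")
```

Note that clamping changes the public contract: callers could then legitimately pass shorter inputs, so the documentation would describe `length` as a maximum again.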


github-actions · 2026-05-07T19:54:39Z

Diff coverage: 0% · 0 lines, 0 missing

chatgpt-codex-connector Bot reviewed May 7, 2026


hughperkins mentioned this pull request May 7, 2026

[Docs] Add user-guide page for matrix decompositions and solvers #643

Merged

3 tasks

hughperkins marked this pull request as draft May 7, 2026 08:49

hughperkins added 11 commits May 7, 2026 11:02

[Docs] algorithms: drop namespace-size editorial sentence

c4cf5ff

[Docs] algorithms: drop qipc and CUB references

7f2b1e3

[Docs] algorithms: trim performance bullet to crossover summary

c44a3d9

[Docs] algorithms: drop large-N regression aside from performance bullet

94bc159

[Docs] algorithms: drop roll-your-own-scan aside from backend bullet

ea48dc1

[Docs] algorithms: trim parallel_sort bullet to complexity statement

df7f433

[Docs] algorithms: trim PrefixSumExecutor bullet to limitation statement

440b673

[Docs] algorithms: remove executor-lifecycle bullet

1f1fa10

[Docs] algorithms: remove PrefixSumExecutor backend-coverage bullet (already covered upstream)

e76b31c

[Docs] algorithms: remove parallel_sort complexity bullet

5e595c1

[Docs] algorithms: move 'no fence required' note into PrefixSumExecutor section

0fd07ef

hughperkins marked this pull request as ready for review May 7, 2026 18:10

chatgpt-codex-connector Bot reviewed May 7, 2026


alanray-tech reviewed May 7, 2026


[Docs] algorithms: fix parallel_sort ndarray claim and PrefixSumExecutor length contract (PR #642 review)

e86afb6

hughperkins wants to merge 13 commits into main from hp/docs-algorithms


2 participants

