Generic sparse arrays: CSC/CSR/COO formats, on-device conversions, SpMM/SpMV #745

Draft
maleadt wants to merge 15 commits into main from tb/sparse

@maleadt maleadt commented Jun 23, 2026

Member

Builds out GPUArrays' generic sparse-array support so that a back-end gets formats, conversions, and a handful of linear algebra ops by implementing a smaller interface. Validated on JLArrays (reference) and Metal (WIP).

Formats & interface

  • CSC, CSR, and COO (new), sharing a field-name read contract (colPtr/rowPtr/rowInd/nzVal/…).
  • Back-end interface documented in docs/src/interface.md: storage structs + constructors, similar, undef constructors, the coo_type/csr_type/csc_type format hooks, get_backend, adapt_structure,
    _sptranspose/_spadjoint.
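
For readers less familiar with the three layouts, here is a small host-side sketch that uses only the SparseArrays standard library, not GPUArrays. It shows how one 2×3 matrix maps onto the conventional field names. Deriving the CSR arrays from the CSC storage of the transpose is an illustration trick assumed for this sketch, not how the PR converts formats.

```julia
using SparseArrays

# Example matrix: [10 0 20; 0 30 0]
A = sparse([1, 2, 1], [1, 2, 3], [10, 30, 20], 2, 3)

# CSC: column pointers, row indices, values (SparseArrays' native layout).
colPtr    = SparseArrays.getcolptr(A)
rowVal    = rowvals(A)
nzVal_csc = nonzeros(A)

# COO: one (row, column, value) triplet per stored entry.
rowInd, colInd, nzVal_coo = findnz(A)

# CSR: row pointers, column indices, values. The CSC storage of the
# transpose has exactly this shape, so it is reused here for illustration.
At        = copy(transpose(A))
rowPtr    = SparseArrays.getcolptr(At)
colVal    = rowvals(At)
nzVal_csr = nonzeros(At)
```

The pointer arrays have one more element than the number of columns (CSC) or rows (CSR); entry `j` marks where column or row `j` starts in the index and value arrays.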

Conversions (on-device, no host round-trips)

  • Dense↔sparse via to_sparse/to_dense.
  • Cross-format CSC↔CSR↔COO via generic convert; target format selected with the coo_type/csr_type/csc_type type hooks.
  • Empty/blank allocation via similar, undef constructors, and a spzeros-style helper.
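
To make the cross-format step more concrete, below is a serial host-side sketch of a COO to CSR conversion: sort, count entries per row, then prefix-sum the counts into row pointers. The function name `coo_to_csr_sketch` is hypothetical, and the PR's on-device `convert` may be structured quite differently; this only illustrates the general technique, assuming unique coordinates.

```julia
# Host-side sketch: build CSR arrays from COO triplets with unique coordinates.
function coo_to_csr_sketch(rowInd, colInd, nzVal, nrows)
    # Sort entries by (row, column) so each row's entries are contiguous.
    perm = sortperm(collect(zip(rowInd, colInd)))
    colVal = colInd[perm]
    vals = nzVal[perm]

    # Count entries per row, then prefix-sum into 1-based row pointers.
    counts = zeros(Int, nrows)
    for r in rowInd
        counts[r] += 1
    end
    rowPtr = cumsum(vcat(1, counts))

    return rowPtr, colVal, vals
end

# Unsorted triplets of the matrix [10 0 20; 0 30 0].
rowPtr, colVal, vals = coo_to_csr_sketch([2, 1, 1], [2, 3, 1], [30, 20, 10], 2)
```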

Linear algebra & ops (generic, COO-normalized kernels)

  • SpMV, SpMM, and dense·sparse products via * and 5-arg mul!, including transpose/adjoint operands, Bool α/β, and Float16/ComplexF16 accumulated in Float32.
  • broadcast and mapreduce/reductions (existing).
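
As a rough illustration of a COO-normalized SpMV with 5-arg `mul!` semantics (`y = α*A*x + β*y`), here is a serial host-side sketch. The names `accum_type` and `coo_spmv_sketch!` are hypothetical, and the mapping of `ComplexF16` to a `ComplexF32` accumulator is an assumption of this sketch. On a GPU each stored entry would typically be handled by one thread, with the `+=` replaced by an atomic add.

```julia
# Accumulator type: half-precision values are summed in single precision.
# (The ComplexF16 => ComplexF32 mapping is an assumption of this sketch.)
accum_type(::Type{T}) where {T} = T
accum_type(::Type{Float16}) = Float32
accum_type(::Type{ComplexF16}) = ComplexF32

# y = α*A*x + β*y with A given as COO triplets. On a GPU each `k` would be one
# thread and the `+=` an atomic add; here it is a plain serial loop.
function coo_spmv_sketch!(y, rowInd, colInd, nzVal, x, α=true, β=false)
    T = accum_type(eltype(y))
    acc = zeros(T, length(y))
    for k in eachindex(nzVal)
        acc[rowInd[k]] += T(nzVal[k]) * T(x[colInd[k]])
    end
    for i in eachindex(y)
        # A Bool `false` is a strong zero in Julia, so β == false discards
        # the old y[i] even when it is NaN.
        y[i] = α * acc[i] + β * T(y[i])
    end
    return y
end

# A = [10 0 20; 0 30 0], x = [1, 2, 3]; y starts with garbage in slot 1.
y = Float16[NaN, 1]
coo_spmv_sketch!(y, [1, 2, 1], [1, 2, 3], Float16[10, 30, 20], Float16[1, 2, 3])
```

Passing `β = false` as a `Bool` rather than `0.0` is what lets the old contents of `y` be ignored safely, which is presumably why Bool α/β get a mention in the list above.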

Breaking / notes

  • Sparse back-end interface reworked → major bump (→ 12.0.0); downstream back-ends must update.
  • New dependency: Atomix (atomic scatter in the matmul kernels).

maleadt and others added 14 commits June 22, 2026 15:20
The sparse support required back-ends to implement a cluster of
type-returning functions -- `dense_array_type`, `dense_vector_type`,
`sparse_array_type`, `csc_type`, `csr_type`, `coo_type` -- that generic
code called to name a related type (the dense counterpart, the
unparameterized sparse wrapper, a sibling format). That is unidiomatic:
Julia expresses these with value-level verbs (`float`, `complex`,
`sparse`) and `similar`, not type-level mapping functions.

Replace that machinery:

  * Format conversion is now the value verbs `sparse_csc`/`sparse_csr`/
    `sparse_coo`. Converting to the format already held is the generic
    identity; back-ends implement only the cross-format cases (which they
    already had as constructors). `csc_type`/`csr_type`/`coo_type` are gone.

  * Broadcast and `similar` outputs are built with `similar` dispatched on a
    representative sparse value (plus an internal `_stored_inds` field
    accessor for filling), mirroring `SparseArrays`' `_allocres`. No more
    reconstructing a sparse type from a type, so `sparse_array_type` is gone.

  * `dense_array_type`/`dense_vector_type` had no consumer in the generic
    algorithms (a dense result comes from `similar(nonzeros(A), ...)`); they
    are removed, and the test suite threads the dense array type explicitly.

  * `sparse_from_dense` now yields COO directly (the natural format when
    scanning a dense array); CSR/CSC are derived with the conversion verbs,
    removing the last type-level `coo_type(ST)` use.

A back-end's obligation shrinks to: the storage structs, their
constructors, `similar`, and the three conversion verbs. JLArrays is
updated as the reference; the GPUArrays test suite passes for both JLArray
and Array.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…supported

`mapreduce(abs, …)` over a `Complex{<:Integer}` sparse matrix has result
element type `Float64` (since `abs(::Complex{<:Integer})::Float64`). Back-ends
that can't allocate that type (e.g. Metal, which has no `Float64`) errored on
these cases. Gate the `abs`-reductions on the result type being among the
back-end's supported eltypes; `sum`, which preserves the input eltype, stays
unconditional.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
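
The eltype issue described in the commit message above can be reproduced on the host. The sketch below uses hypothetical helper names (`mapped_eltype`, `can_map_reduce`) and a made-up list of supported eltypes for a back-end without `Float64`; it is not the gating code from the test suite.

```julia
# Element type a map step `f` would produce for inputs of type `T`.
mapped_eltype(f, ::Type{T}) where {T} = typeof(f(zero(T)))

# Whether a back-end that can only allocate `supported` eltypes can hold
# the result of mapping `f` over elements of type `T`.
can_map_reduce(f, ::Type{T}, supported) where {T} = mapped_eltype(f, T) in supported

# Hypothetical eltype list for a back-end without Float64.
supported = (Float16, Float32, Int32, Int64, ComplexF32)

abs_on_complex_int = mapped_eltype(abs, Complex{Int32})          # Float64
ok_complex_int     = can_map_reduce(abs, Complex{Int32}, supported)
ok_complex_f32     = can_map_reduce(abs, ComplexF32, supported)
```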
Spell out the parallel sparse hierarchy, the storage structs and their
conventional field names, and the methods a back-end implements (constructors,
`similar`, and the `sparse_csc`/`sparse_csr`/`sparse_coo` conversion verbs),
plus the generic functionality that comes for free.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Densifying a sparse array had no generic on-device path: back-ends did it with a
host round-trip (`Array(collect(SparseMatrixCSC(x)))`). Add `to_dense(A)`, which
scatters the stored entries into a dense array of the same back-end with a kernel
(allocated via `similar(nonzeros(A), …)`), no host transfer. Coordinates are taken
to be unique -- true for CSC/CSR and for COO produced by conversion.

Rename the existing `sparse_from_dense` to `to_sparse` so the two on-device
conversions form a symmetric, discoverable pair (`to_sparse` needs a target format;
`to_dense` does not). JLArrays uses `to_dense` for `JLArray(::JLSparse…)`, and the
interface docs are updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
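
The scatter described in the commit message above can be sketched on the host as follows. The function names are hypothetical, the loops are serial stand-ins for what would be one kernel launch on a device, and SparseArrays is used only to produce test data and a reference result. As in the commit message, coordinates are assumed to be unique.

```julia
using SparseArrays

# CSC: walk each column's slice of the index/value arrays and scatter.
function to_dense_csc_sketch(colPtr, rowVal, nzVal, dims)
    out = zeros(eltype(nzVal), dims)
    for j in 1:dims[2], k in colPtr[j]:(colPtr[j + 1] - 1)
        out[rowVal[k], j] = nzVal[k]
    end
    return out
end

# COO: every stored entry already carries its full coordinate.
function to_dense_coo_sketch(rowInd, colInd, nzVal, dims)
    out = zeros(eltype(nzVal), dims)
    for k in eachindex(nzVal)
        out[rowInd[k], colInd[k]] = nzVal[k]
    end
    return out
end

A = sparse([1, 2, 1], [1, 2, 3], [10.0, 30.0, 20.0], 2, 3)
D_csc = to_dense_csc_sketch(SparseArrays.getcolptr(A), rowvals(A), nonzeros(A), size(A))
D_coo = to_dense_coo_sketch(findnz(A)..., size(A))
```

With duplicate coordinates the plain assignment would keep whichever write lands last, which is why the uniqueness assumption matters.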
Reorganize the sparse section of the interface docs into three parts -- the
storage types a back-end provides, the methods it implements to integrate them,
and the functionality it gets for free -- and round it out (device-struct
`Adapt` rule, `get_backend`, `_sptranspose`/`_spadjoint`, the `to_sparse`/
`to_dense` conversions, and the full list of generic operations). Also give
`to_sparse` a docstring to match `to_dense`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Also add the missing JLArrays COO identity constructor.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Mandate the undef constructor as the empty-of-shape primitive; similar delegates to it. JLArrays now provides spzeros + undef constructors.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replaces the sparse branch's sparse_csc/csr/coo value-verbs with main's type-level format hooks, used as coo_type(A)(A); keeps the generic on-device convert algorithm as the engine. Re-aligns the conversion API with main.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
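
The `coo_type(A)(A)` pattern mentioned in the commit message above can be illustrated with toy host-side types. Everything here (`ToyCOO`, `ToyCSR`, and this particular `coo_type` definition) is hypothetical and only mimics the shape of the hook; the real GPUArrays types, signatures, and conversion engine may differ.

```julia
# Toy storage structs using the conventional field names.
struct ToyCOO{Tv,Ti}
    rowInd::Vector{Ti}
    colInd::Vector{Ti}
    nzVal::Vector{Tv}
    dims::NTuple{2,Int}
end

struct ToyCSR{Tv,Ti}
    rowPtr::Vector{Ti}
    colVal::Vector{Ti}
    nzVal::Vector{Tv}
    dims::NTuple{2,Int}
end

# Format hook: name the COO sibling of a given array.
coo_type(::ToyCSR) = ToyCOO
coo_type(::ToyCOO) = ToyCOO

# Converting to the format already held is the identity.
ToyCOO(A::ToyCOO) = A

# CSR -> COO: expand the row pointers into one row index per stored entry.
function ToyCOO(A::ToyCSR{Tv,Ti}) where {Tv,Ti}
    rowInd = Vector{Ti}(undef, length(A.nzVal))
    for i in 1:A.dims[1], k in A.rowPtr[i]:(A.rowPtr[i + 1] - 1)
        rowInd[k] = i
    end
    return ToyCOO(rowInd, copy(A.colVal), copy(A.nzVal), A.dims)
end

csr = ToyCSR([1, 3, 4], [1, 3, 2], [10, 20, 30], (2, 3))
coo = coo_type(csr)(csr)   # hook picks the target type, constructor converts
```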
Comment thread docs/src/interface.md

A sparse array can't share the `AbstractGPUArray` supertype — that is a `DenseArray`,
whereas a sparse array must be an `AbstractSparseArray` — so GPUArrays keeps a parallel
sparse hierarchy with its own generic functionality. Integrating a back-end has three

Member

Weird phrasing here, let me think a bit about how to make this more natural

Comment thread docs/src/interface.md
| `AbstractGPUSparseMatrixCSR{Tv,Ti}` | `rowPtr`, `colVal`, `nzVal`, `dims`, `nnz` |
| `AbstractGPUSparseMatrixCOO{Tv,Ti}` | `rowInd`, `colInd`, `nzVal`, `dims`, `nnz` |

The pointer/index/value arrays are the back-end's own dense vector type. Provide only the

Member

Should we give an example (e.g. CuVector for CUDA)

Comment thread docs/src/interface.md

### Interface to implement

* **Constructors** — from component arrays (`MyCSR(rowPtr, colVal, nzVal, dims)`),

Member

What about to/from dense arrays?

JLArray(x::JLSparseVector) = JLArray(collect(SparseVector(x)))
JLArray(x::JLSparseMatrixCSC) = JLArray(collect(SparseMatrixCSC(x)))
JLArray(x::JLSparseMatrixCSR) = JLArray(collect(SparseMatrixCSC(x)))
JLArray(x::JLSparseVector) = GPUArrays.to_dense(x)

Member

should we make this an eval block?

Comment thread src/host/sparse.jl
SparseArrays.SparseMatrixCSC(x::AbstractGPUSparseMatrixCSC) = SparseMatrixCSC(size(x)..., Array(SparseArrays.getcolptr(x)), Array(SparseArrays.rowvals(x)), Array(SparseArrays.nonzeros(x)))

function check_sparse_target(::Type{ST}, ::Type{Tv}, ::Type{Ti}) where {ST,Tv,Ti}
    Tv isa Type && Ti isa Type ||

Member

This looks really funky to me. Shouldn't Ti also check it's a <: Integer?

Comment thread src/host/sparse.jl

function check_sparse_target(::Type{ST}, ::Type{Tv}, ::Type{Ti}) where {ST,Tv,Ti}
    Tv isa Type && Ti isa Type ||
        throw(ArgumentError("sparse target type $ST must specify value and index eltypes"))

Member

Should we include Tv and Ti in the error message?
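
One possible shape for what the two review comments on this function ask for is sketched below: also require `Ti <: Integer`, and report `Tv` and `Ti` in the message. The name `check_sparse_target_sketch` and the untyped signature are assumptions made for illustration; this is not the code in the PR.

```julia
# Hypothetical stricter variant of the check under review.
function check_sparse_target_sketch(ST, Tv, Ti)
    # Both parameters must be concrete enough to be types, and the index
    # eltype must additionally be an integer type.
    Tv isa Type && Ti isa Type && Ti <: Integer ||
        throw(ArgumentError(
            "sparse target type $ST must specify a value eltype and an " *
            "integer index eltype (got Tv=$Tv, Ti=$Ti)"))
    return nothing
end

ok = check_sparse_target_sketch(Dict, Float32, Int32)  # `Dict` is a stand-in target type
```

The short-circuiting `&&` matters here: `Ti <: Integer` is only evaluated once `Ti isa Type` has held, so a non-type `Ti` produces the `ArgumentError` rather than a `TypeError`.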

Comment thread src/host/sparse.jl
@kernel function densify_vector_kernel!(out, iPtr, nzVal)
    k = @index(Global, Linear)
    if k <= length(nzVal)
        @inbounds out[iPtr[k]] = nzVal[k]

Member

it's funny how Claude made this a one liner here but not above

maleadt commented Jun 23, 2026

Member Author

Yeah this hasn't passed my personal sniff test yet, hence draft, but thanks for the review. FWIW, the initial motivation was Metal support + SpMV/SpMM (which happens to be easier to express in COO), with the rest following out of it. The API change, replacing *_array_type / dense_array_type / sparse_array_type with something like similar, feels like a genuine simplification though. (I tried to get rid of the remaining csc_type/csr_type/... too, but that proved too hard.)

kshyatt commented Jun 23, 2026

Member

Ah I can eff off until it's ready if you prefer, sorry!

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

gdalle commented Jun 24, 2026


I'd love to review this one too once it's ready!

3 participants