Generic sparse arrays: CSC/CSR/COO formats, on-device conversions, SpMM/SpMV #745
Draft
maleadt
wants to merge
15
commits into
main
from
tb/sparse
f569596
Add dense GPU sparse conversion
maleadt 6563c12
Add generic COO sparse matmul
maleadt 13ae5b9
Add typed dense sparse conversion
maleadt 5224d42
Rework sparse back-end interface around value verbs and `similar`
maleadt 7e149d0
testsuite: only test sparse `abs`-reductions when the result type is …
maleadt 92bde19
docs: document the sparse array back-end interface
maleadt 38b5dc5
Add on-device `to_dense`; rename `sparse_from_dense` → `to_sparse`
maleadt 637acb0
docs: restructure the sparse interface; document `to_sparse`/`to_dense`
maleadt 22313e1
Add on-device sparse format conversions via `convert`
maleadt 780e855
Adopt generic sparse conversions in JLArrays
maleadt f0be29f
Drop `sparse_coo_type`; test COO via `sparse_types`
maleadt 21b3dd7
Test sparse matmul and conversions generically
maleadt d8a8c60
Allocate empty sparse arrays via spzeros/undef
maleadt ae6812f
Convert sparse formats via coo_type/csr_type/csc_type hooks
maleadt 5115577
Only run the matmul accumulation test on GPU sparse formats
maleadt
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -31,6 +31,81 @@ KernelAbstractions.get_backend(a::CA) where CA <: CustomArray = CustomBackend() | |
|
|
||
| There are numerous examples of potential interfaces for GPUArrays, such as with [JLArrays](https://github.com/JuliaGPU/GPUArrays.jl/blob/main/lib/JLArrays/src/JLArrays.jl), [CuArrays](https://github.com/JuliaGPU/CUDA.jl/blob/main/src/gpuarrays.jl), and [ROCArrays](https://github.com/JuliaGPU/AMDGPU.jl/blob/main/src/gpuarrays.jl). | ||
|
|
||
| ## Sparse arrays | ||
|
|
||
| A sparse array can't share the `AbstractGPUArray` supertype — that is a `DenseArray`, | ||
| whereas a sparse array must be an `AbstractSparseArray` — so GPUArrays keeps a parallel | ||
| sparse hierarchy with its own generic functionality. Integrating a back-end has three | ||
| parts: the storage types it provides, the methods it implements to plug them in, and the | ||
| functionality it then gets for free. | ||
|
|
||
| ### Storage types to provide | ||
|
|
||
| Provide one mutable struct per supported format, subtyping the matching abstract type and using | ||
| the conventional field names (generic code reads them directly): | ||
|
|
||
| | supertype | fields | | ||
| |:--|:--| | ||
| | `AbstractGPUSparseVector{Tv,Ti}` | `iPtr`, `nzVal`, `len`, `nnz` | | ||
| | `AbstractGPUSparseMatrixCSC{Tv,Ti}` | `colPtr`, `rowVal`, `nzVal`, `dims`, `nnz` | | ||
| | `AbstractGPUSparseMatrixCSR{Tv,Ti}` | `rowPtr`, `colVal`, `nzVal`, `dims`, `nnz` | | ||
| | `AbstractGPUSparseMatrixCOO{Tv,Ti}` | `rowInd`, `colInd`, `nzVal`, `dims`, `nnz` | | ||
|
|
||
| The pointer/index/value arrays are the back-end's own dense vector type. Provide only the | ||
|
Member
Should we give an example (e.g. |
||
| formats you need, but note that several generic operations route through COO. | ||
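As a rough illustration of the table above, a CSR storage type might look like the sketch below. `MyCSR` is a hypothetical name, plain `Vector`s stand in for the back-end's dense device vectors, and `SparseArrays.AbstractSparseMatrix` stands in for `AbstractGPUSparseMatrixCSR` so that the snippet runs without GPUArrays. Because of that stand-in supertype, the accessors are defined by hand here.

```julia
using SparseArrays

# Hypothetical back-end type. A real back-end would subtype
# GPUArrays.AbstractGPUSparseMatrixCSR{Tv,Ti} and store device vectors.
mutable struct MyCSR{Tv,Ti} <: AbstractSparseMatrix{Tv,Ti}
    rowPtr::Vector{Ti}    # length m + 1, one-based offsets into colVal/nzVal
    colVal::Vector{Ti}    # column index of each stored entry
    nzVal::Vector{Tv}     # value of each stored entry
    dims::NTuple{2,Int}   # (rows, columns)
    nnz::Ti               # number of stored entries
end

# Constructor from component arrays; nnz is derived from the value array.
function MyCSR(rowPtr::Vector{Ti}, colVal::Vector{Ti}, nzVal::Vector{Tv},
               dims::NTuple{2,Int}) where {Tv,Ti}
    length(rowPtr) == dims[1] + 1 ||
        throw(ArgumentError("rowPtr must have length m + 1"))
    length(colVal) == length(nzVal) ||
        throw(ArgumentError("colVal and nzVal must have equal length"))
    return MyCSR{Tv,Ti}(rowPtr, colVal, nzVal, dims, Ti(length(nzVal)))
end

# Defined by hand only because of the stand-in supertype.
Base.size(A::MyCSR) = A.dims
SparseArrays.nnz(A::MyCSR) = Int(A.nnz)
SparseArrays.nonzeros(A::MyCSR) = A.nzVal

# The 2×3 matrix [1 0 2; 0 3 0] in CSR form.
A = MyCSR([1, 3, 4], [1, 3, 2], [1.0, 2.0, 3.0], (2, 3))
```

The other formats would follow the same pattern with the field names listed in the table.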
|
|
||
| ### Interface to implement | ||
|
|
||
| * **Constructors** — from component arrays (`MyCSR(rowPtr, colVal, nzVal, dims)`), | ||
|
Member
What about to/from dense arrays? |
||
| between formats (`MyCSR(::MyCOO)`, …), and to/from host `SparseArrays` | ||
| (`MyCSC(::SparseMatrixCSC)`, `SparseMatrixCSC(::MyCSC)`). | ||
| * **`undef` constructors** — `MyCSC{Tv,Ti}(undef, dims)` / `MyVec{Tv,Ti}(undef, n)`, | ||
| building a *structurally-empty* array (no stored entries), mirroring dense | ||
| `Array{T}(undef, dims)` and `SparseArrays`' `SparseMatrixCSC{Tv,Ti}(undef, m, n)`. This | ||
| is the empty-of-a-shape allocation primitive. Note there is no uninitialized-structure | ||
| analogue: for a sparse array `undef` means empty, exactly as in `SparseArrays`. | ||
| Implementing these through a `spzeros(Tv, Ti, dims…; fmt=…)` helper (the value-level | ||
| analogue of `SparseArrays.spzeros`, with a format selector) is recommended — it also | ||
| serves as a convenient public, format-polymorphic entry point — but `spzeros` itself is | ||
| not mandated, since its signature is back-end-flavored (format symbols, storage modes) | ||
| whereas the `undef` constructor is uniform. | ||
| * **`Base.similar`** — structure-preserving (`similar(A)`, `similar(A, ::Type)`) and | ||
| empty-of-a-shape (`similar(A, ::Type, dims)`), as for dense arrays; generic code | ||
| allocates its outputs through `similar`, never by naming a type. The empty-of-a-shape | ||
| form just delegates to the `undef` constructor (threading the source's storage mode), | ||
| so the constructor is the real primitive. | ||
| * **Format-conversion hooks** `GPUArrays.coo_type`/`csr_type`/`csc_type` — map any of your | ||
| sparse-matrix types to the *type* of the named sibling format | ||
| (`coo_type(::Type{<:MyCSC}) = MyCOO`); generic code converts with `coo_type(A)(A)`. These | ||
| are type-level hooks rather than plain `convert(Dest, A)` because a format is the | ||
| wrapper's identity (distinct structs), not a type parameter — so, unlike an eltype change, | ||
| there is no generic wrapper→sibling-wrapper operation, and only the back-end knows its | ||
| sibling types. The cross-format `convert` methods above are the engine the resulting | ||
| constructors route through; the identity case (`coo_type(coo)(coo)`) is your identity | ||
| constructor. | ||
| * **`KernelAbstractions.get_backend`** for the sparse types (usually | ||
| `get_backend(nonzeros(A))`). | ||
| * **`Adapt.adapt_structure`** converting each host struct to its device counterpart | ||
| (`GPUArrays.GPUSparseDeviceVector`, `GPUSparseDeviceMatrixCSC`/`CSR`/`COO`), so the | ||
| generic kernels can consume it inside `@kernel`s. | ||
| * **`GPUArrays._sptranspose`/`_spadjoint`** — materialize a (conjugate) transpose; used | ||
| by `kron`/`triu`/`tril` on lazily wrapped operands. | ||
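The constructor, `similar`, and hook items in the list above can be sketched on the CPU as follows. Everything here is an assumption made for illustration: `MyCSC` and `MyCOO` are hypothetical types backed by plain `Vector`s, `AbstractSparseMatrix` stands in for the GPUArrays abstract types, and `coo_type` is a local function that only mimics the shape of the `GPUArrays.coo_type` hook described in this diff.

```julia
using SparseArrays

# Hypothetical CPU-backed stand-ins for two of a back-end's formats.
mutable struct MyCOO{Tv,Ti} <: AbstractSparseMatrix{Tv,Ti}
    rowInd::Vector{Ti}; colInd::Vector{Ti}; nzVal::Vector{Tv}
    dims::NTuple{2,Int}; nnz::Ti
end
mutable struct MyCSC{Tv,Ti} <: AbstractSparseMatrix{Tv,Ti}
    colPtr::Vector{Ti}; rowVal::Vector{Ti}; nzVal::Vector{Tv}
    dims::NTuple{2,Int}; nnz::Ti
end
Base.size(A::Union{MyCOO,MyCSC}) = A.dims

# From a host SparseArrays matrix.
MyCSC(S::SparseMatrixCSC{Tv,Ti}) where {Tv,Ti} =
    MyCSC{Tv,Ti}(copy(SparseArrays.getcolptr(S)), copy(rowvals(S)),
                 copy(nonzeros(S)), size(S), Ti(nnz(S)))

# `undef` constructor: structurally empty, no stored entries.
MyCSC{Tv,Ti}(::UndefInitializer, dims::Dims{2}) where {Tv,Ti} =
    MyCSC{Tv,Ti}(ones(Ti, dims[2] + 1), Ti[], Tv[], dims, zero(Ti))

# Empty-of-a-shape `similar` delegates to the `undef` constructor.
Base.similar(A::MyCSC{<:Any,Ti}, ::Type{T}, dims::Dims{2}) where {T,Ti} =
    MyCSC{T,Ti}(undef, dims)

# Structure-preserving `similar`: same pattern, uninitialized values.
Base.similar(A::MyCSC{<:Any,Ti}, ::Type{T}) where {T,Ti} =
    MyCSC{T,Ti}(copy(A.colPtr), copy(A.rowVal),
                Vector{T}(undef, length(A.nzVal)), A.dims, A.nnz)

# Cross-format constructor: expand column pointers into column indices.
function MyCOO(A::MyCSC{Tv,Ti}) where {Tv,Ti}
    colInd = Vector{Ti}(undef, length(A.rowVal))
    for j in 1:A.dims[2], k in A.colPtr[j]:(A.colPtr[j + 1] - 1)
        colInd[k] = j
    end
    return MyCOO{Tv,Ti}(copy(A.rowVal), colInd, copy(A.nzVal), A.dims, A.nnz)
end
MyCOO(A::MyCOO) = A   # identity case

# Local mimic of the type-level hook: map any sibling type to the COO type.
coo_type(::Type{<:Union{MyCOO,MyCSC}}) = MyCOO
coo_type(A::AbstractSparseMatrix) = coo_type(typeof(A))

S = sparse([1, 2, 2], [1, 1, 3], [10.0, 20.0, 30.0], 2, 3)
A = MyCSC(S)
C = coo_type(A)(A)                 # generic code converts like this
E = similar(A, Float32, (4, 5))    # empty 4×5 array with Float32 values
```

In this sketch, generic code never names `MyCOO` directly: it asks the hook for the sibling type and calls the result as a constructor, which is the pattern the list describes.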
|
|
||
| `SparseArrays`' accessors (`nnz`, `nonzeros`, `nonzeroinds`, `rowvals`, `getcolptr`) come | ||
| for free from the field names. Dense↔sparse conversion is generic and on-device: | ||
| `to_sparse(::Type{ST}, dense)` scans into a sparse array (`ST` a vector or COO type; | ||
| CSR/CSC follow via the format-conversion hooks) and `to_dense(A)` scatters back to a dense array of the | ||
| back-end — so a back-end's `MyArray(::MySparse…)` and dense→sparse constructors can simply | ||
| call them. | ||
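The scan and scatter steps mentioned above can be illustrated sequentially on the CPU. This is only a sketch of the idea, not the on-device implementation: `dense_to_coo` and `coo_to_dense` are hypothetical helper names, and the COO result is a plain `NamedTuple` rather than a back-end type.

```julia
# Scan: collect the coordinates and values of the nonzero entries.
# `findall` visits a Matrix in column-major order.
function dense_to_coo(D::AbstractMatrix)
    idx = findall(!iszero, D)          # one CartesianIndex per stored entry
    rowInd = [I[1] for I in idx]
    colInd = [I[2] for I in idx]
    nzVal = D[idx]
    return (; rowInd, colInd, nzVal, dims = size(D))
end

# Scatter: write each stored value into a zero-filled dense array.
function coo_to_dense(C)
    D = zeros(eltype(C.nzVal), C.dims)
    for k in eachindex(C.nzVal)
        D[C.rowInd[k], C.colInd[k]] = C.nzVal[k]
    end
    return D
end

D = [1.0 0.0 2.0; 0.0 3.0 0.0]
C = dense_to_coo(D)
R = coo_to_dense(C)    # round trip back to dense
```

On a device, the scan would presumably be a parallel count plus prefix sum rather than a sequential `findall`, but the input and output have the same shape.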
|
|
||
| ### Functionality you get | ||
|
|
||
| Broadcasting; `mapreduce` and reductions (`sum`, `norm`, `opnorm`); sparse–dense and | ||
| sparse–vector multiplication (`*`, `mul!`, including transposed/adjoint operands); | ||
| `findnz`, `triu`/`tril`/`kron`/`reshape`/`droptol!`; `iszero`/`issymmetric`/`ishermitian`; | ||
| scalar and slice indexing; `copy`/`copyto!`/`collect`/`Array`; and conversion between | ||
| formats and to/from dense. | ||
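For intuition about the sparse-vector multiplication listed above, here is a minimal sequential CSR matrix-vector product. It is a sketch under stated assumptions, not the generic kernel from this PR: `csr_mul!` is a hypothetical name, and a device kernel would typically assign rows to work-items instead of looping.

```julia
# y = A * x for A stored as CSR (rowPtr, colVal, nzVal); one pass per row.
function csr_mul!(y, rowPtr, colVal, nzVal, x)
    for i in eachindex(y)
        acc = zero(eltype(y))
        # Stored entries of row i live at positions rowPtr[i]:(rowPtr[i+1]-1).
        for k in rowPtr[i]:(rowPtr[i + 1] - 1)
            acc += nzVal[k] * x[colVal[k]]
        end
        y[i] = acc
    end
    return y
end

# The 2×3 matrix [1 0 2; 0 3 0] in CSR form.
rowPtr = [1, 3, 4]
colVal = [1, 3, 2]
nzVal = [1.0, 2.0, 3.0]
x = [1.0, 2.0, 3.0]
y = csr_mul!(zeros(2), rowPtr, colVal, nzVal, x)
```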
|
|
||
| ## Caching Allocator | ||
|
|
||
| ```@docs | ||
|
|
||
Weird phrasing here, let me think a bit about how to make this more natural