Because GPUSparseDeviceMatrixCSC has no specialized getindex methods, we cannot index into it (the previous CuSparseDeviceMatrixCSC could be indexed). As a consequence, we also cannot index into GPUSparseDeviceColumnView. I understand that scalar indexing is not the most efficient approach, since it requires a binary search, but in my use case it is the only option.
MWE in terms of GPUSparseDeviceColumnView:
using CUDA, SparseArrays
A = sprand(10, 10, 0.4)
A_gpu = cu(A)
a_gpu = @view A_gpu[:, 1]
function kernel(out, a)
    acc = 0.0
    i = 1
    while i <= length(a)
        acc += a[i]
        i += 1
    end
    out[1] = acc
    return nothing
end
out = CUDA.zeros(1)
@cuda kernel(out, a_gpu)
The kernel can equivalently be written as follows (see subarray.jl:313).
function kernel(out, a)
    acc = 0.0
    i = 1
    while i <= length(a)
        index = Base.reindex(a.indices, (i,))
        v = a.parent[index...]
        acc += v
        i += 1
    end
    out[1] = acc
    return nothing
end
In this case, reindexing succeeds, i.e. it finds the index within the parent (A_gpu), but since GPUSparseDeviceMatrixCSC lacks appropriate indexing methods, the lookup fails with the following error message (running julia with the -g2 option).
index: (1, 1)
ERROR: a exception was thrown during kernel execution on thread (1, 1, 1) in block (1, 1, 1).
Stacktrace:
[1] error_if_canonical_getindex at ./abstractarray.jl:1357
[2] getindex at ./abstractarray.jl:1341
[3] kernel at ./REPL[7]:6
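For reference, the kind of lookup I have in mind is sketched below. It is only an illustration on the host with a plain SparseMatrixCSC, not the CUDA.jl implementation: `csc_getindex` is a hypothetical helper name, and I am assuming the device type exposes arrays equivalent to the CSC column pointers, row indices, and stored values, with row indices sorted within each column.

```julia
using SparseArrays

# Hypothetical helper (not part of CUDA.jl): read A[i, j] from raw CSC arrays
# with a binary search over the row indices stored for column j.
# Assumes row indices are sorted within each column, as in SparseMatrixCSC.
function csc_getindex(colptr, rowval, nzval, i::Integer, j::Integer)
    lo = Int(colptr[j])          # position of the first stored entry in column j
    hi = Int(colptr[j + 1]) - 1  # position of the last stored entry in column j
    while lo <= hi
        mid = (lo + hi) >>> 1
        r = rowval[mid]
        if r == i
            return nzval[mid]    # found a stored value at row i
        elseif r < i
            lo = mid + 1
        else
            hi = mid - 1
        end
    end
    return zero(eltype(nzval))   # no stored entry: structural zero
end

# Host-side check against the regular SparseMatrixCSC indexing.
A = sprand(10, 10, 0.4)
colptr, rowval, nzval = SparseArrays.getcolptr(A), rowvals(A), nonzeros(A)
matches = all(csc_getindex(colptr, rowval, nzval, i, j) == A[i, j] for i in 1:10, j in 1:10)
```

The loop uses only scalar loads and integer arithmetic and does not allocate, so something along these lines should be usable inside a kernel, at a cost of O(log nnz) per lookup in the column being read.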