Skip to content

Use the cached task-local SYCL queue for oneMKL FFT plans#586

Merged
michel2323 merged 3 commits into
mainfrom
fft-sycl-queue-lifetime
Jun 22, 2026
Merged

Use the cached task-local SYCL queue for oneMKL FFT plans#586
michel2323 merged 3 commits into
mainfrom
fft-sycl-queue-lifetime

Conversation

@michel2323

@michel2323 michel2323 commented Jun 18, 2026

Copy link
Copy Markdown
Member

Summary

_create_descriptor built fresh syclDevice/syclContext/syclQueue objects for every FFT plan. Once those wrappers become garbage, their finalizers (syclQueueDestroy etc.) tear down SYCL runtime state for the still-in-use underlying Level Zero queue, corrupting later DFT commits and crashing at process exit.

This switches _create_descriptor to the cached task-local sycl_queue(global_queue(...)) accessor that every other oneMKL wrapper (BLAS/LAPACK/sparse) already uses, so the plan shares the managed queue lifetime instead of owning a throwaway one.

Notes

Touches only the queue construction in _create_descriptor. The GC.@preserve rooting of lengths/strides already present on main (#582) is left untouched.

@codecov

codecov Bot commented Jun 18, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.90%. Comparing base (e995a63) to head (a1c8ade).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #586      +/-   ##
==========================================
- Coverage   80.92%   80.90%   -0.02%     
==========================================
  Files          48       48              
  Lines        3234     3232       -2     
==========================================
- Hits         2617     2615       -2     
  Misses        617      617              

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

`_create_descriptor` built fresh syclDevice/syclContext/syclQueue
objects for every FFT plan. Once those wrappers become garbage their
finalizers (syclQueueDestroy etc.) tear down SYCL runtime state for the
still-in-use underlying Level Zero queue, corrupting later DFT commits
and crashing at process exit.

Use the cached task-local `sycl_queue(global_queue(...))` accessor that
every other oneMKL wrapper already uses, so the plan shares the managed
queue lifetime instead of owning a throwaway one.
@michel2323 michel2323 force-pushed the fft-sycl-queue-lifetime branch from be0cf3c to 3719b89 Compare June 22, 2026 15:29
@michel2323 michel2323 marked this pull request as ready for review June 22, 2026 15:29
@michel2323 michel2323 enabled auto-merge (squash) June 22, 2026 15:30
@github-actions

github-actions Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Your PR requires formatting changes to meet the project's style guidelines.
Please consider running Runic (git runic main) to apply these changes.

Click here to view the suggested changes.
diff --git a/test/fft.jl b/test/fft.jl
index d441946..21cbc5b 100644
--- a/test/fft.jl
+++ b/test/fft.jl
@@ -80,26 +80,28 @@ end
     end
 end
 
-@testset "shared queue lifetime across plans" begin
-    # Plans must share the single cached task-local SYCL queue rather than each owning a
-    # throwaway one (whose finalizer would tear down shared SYCL/oneMKL state). Assert the
-    # shared handle deterministically, independent of whether a stale queue would crash.
-    cached_handle = Base.unsafe_convert(oneAPI.oneMKL.syclQueue_t,
-        oneAPI.sycl_queue(oneAPI.global_queue(oneAPI.context(), oneAPI.device())))
+    @testset "shared queue lifetime across plans" begin
+        # Plans must share the single cached task-local SYCL queue rather than each owning a
+        # throwaway one (whose finalizer would tear down shared SYCL/oneMKL state). Assert the
+        # shared handle deterministically, independent of whether a stale queue would crash.
+        cached_handle = Base.unsafe_convert(
+            oneAPI.oneMKL.syclQueue_t,
+            oneAPI.sycl_queue(oneAPI.global_queue(oneAPI.context(), oneAPI.device()))
+        )
 
-    dX1 = gpu(rand(ComplexF32, 8))
-    p1 = AbstractFFTs.plan_fft(dX1)
-    @test p1.queue == cached_handle
-    dY1 = p1 * dX1
-    p1i = AbstractFFTs.plan_ifft(dX1)
-    p1i * dY1
+        dX1 = gpu(rand(ComplexF32, 8))
+        p1 = AbstractFFTs.plan_fft(dX1)
+        @test p1.queue == cached_handle
+        dY1 = p1 * dX1
+        p1i = AbstractFFTs.plan_ifft(dX1)
+        p1i * dY1
 
-    GC.gc(true)  # run finalizers of any throwaway per-plan SYCL wrappers
+        GC.gc(true)  # run finalizers of any throwaway per-plan SYCL wrappers
 
-    X2 = rand(ComplexF32, 8, 32)
-    dX2 = gpu(X2)
-    p2 = AbstractFFTs.plan_fft(dX2)
-    @test p2.queue == cached_handle
-    cmp(p2 * dX2, fft(X2))
-end
+        X2 = rand(ComplexF32, 8, 32)
+        dX2 = gpu(X2)
+        p2 = AbstractFFTs.plan_fft(dX2)
+        @test p2.queue == cached_handle
+        cmp(p2 * dX2, fft(X2))
+    end
 end

Add deterministic assertions to the queue-lifetime testset: every plan's
stored queue handle must equal the task-local cached SYCL queue. This fails
on the old throwaway-per-plan code regardless of hardware, unlike the
GC-roundtrip check which only crashes on PVC-class teardown.
@michel2323 michel2323 merged commit a76c36e into main Jun 22, 2026
5 checks passed
@michel2323 michel2323 deleted the fft-sycl-queue-lifetime branch June 22, 2026 22:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant