Description
The README shows the following benchmark:
julia> @btime FFTA.fft(x) setup=(x = @SVector rand(N));
698.611 ns (8 allocations: 2.11 KiB)
julia> @btime FFTW.fft(x) setup=(x = @SVector rand(N));
5.433 μs (34 allocations: 4.70 KiB)
We cannot reproduce this result. On current Julia and FFTW, both calls take about 5 µs.
Reproduction
using BenchmarkTools, FFTW, StaticArrays, FFTA
N = 64
b_ffta = @benchmark FFTA.fft(x) setup=(x = @SVector rand($N))
b_fftw = @benchmark FFTW.fft(x) setup=(x = @SVector rand($N))
println("FFTA: ", BenchmarkTools.prettytime(median(b_ffta).time), " (", b_ffta.allocs, " allocs)")
println("FFTW: ", BenchmarkTools.prettytime(median(b_fftw).time), " (", b_fftw.allocs, " allocs)")
Result
FFTA: 4.922 μs (9 allocs)
FFTW: 4.945 μs (9 allocs)
Both take about 5 µs, so FFTA shows no speedup. The allocation counts also differ from the README (9 vs. 8 for FFTA, 9 vs. 34 for FFTW), which suggests that FFTW's SVector handling has improved since the README was written.
For reference, a planned FFTW transform on a regular Vector takes 422 ns, much faster than both:
plan = FFTW.plan_fft(rand(N))
@btime $plan * x setup=(x = rand($N)) # 422 ns
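To make the three measurements easier to compare side by side, here is a minimal sketch that runs them in one script on a plain Vector input. It first checks that FFTA and FFTW agree numerically, then reports the median time and allocation count of each variant plus the FFTA/FFTW ratio. The variable names (`b_plan`, `results`, `r`) are only illustrative, and the timings it prints will depend on the machine, so none are shown here.

```julia
using BenchmarkTools, FFTW, FFTA

N = 64            # same length as in the reproduction above
x = rand(N)       # plain Vector{Float64}, not an SVector

# Sanity check: both libraries should return the same spectrum.
@assert FFTA.fft(x) ≈ FFTW.fft(x)

# Build the FFTW plan once, outside the timed region.
plan = FFTW.plan_fft(rand(N))

b_ffta = @benchmark FFTA.fft(v) setup=(v = rand($N))
b_fftw = @benchmark FFTW.fft(v) setup=(v = rand($N))
b_plan = @benchmark $plan * v   setup=(v = rand($N))

results = (("FFTA", b_ffta), ("FFTW", b_fftw), ("FFTW planned", b_plan))
for (name, b) in results
    est = median(b)   # TrialEstimate holding median time and allocation count
    println(rpad(name, 14), BenchmarkTools.prettytime(time(est)),
            " (", allocs(est), " allocs)")
end

# A time ratio below 1 means FFTA is faster than unplanned FFTW.
r = ratio(median(b_ffta), median(b_fftw))
println("FFTA / FFTW time ratio: ", round(time(r), digits = 2))
```

If the ratio stays close to 1 here as well, the gap shown in the README is unlikely to come from the SVector input type alone.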
Environment
- Julia: 1.12.4
- FFTA: 0.3.1 (latest main)
- FFTW: 1.10.0
- StaticArrays: 1.9.18
- OS: Linux x86_64