Conversation
|
Performance seems to be either unchanged or, in some cases, consistently worse (roughly the same % across sizes). I wonder if FFI calls have a higher cost on the new version; my first intuition is that the slower tests are the ones that compute keccak hashes. |
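One way to read the "same % across sizes" observation: a one-off cost per run would fade as the input grows, while a cost paid on every operation (for example on every FFI call, if that hypothesis holds) keeps the relative slowdown at a nonzero level. A minimal Python sketch of that reasoning follows; the cost model and every number in it are made up for illustration and are not measured from hevm.

```python
# Hypothetical cost model: time(n) = fixed + per_item * n, in microseconds.
# None of these constants come from the benchmarks; they only illustrate
# how the two kinds of overhead behave as the problem size grows.

def relative_slowdown(n, base_fixed, base_per_item, extra_fixed, extra_per_item):
    """Fractional slowdown of the new build relative to the baseline at size n."""
    baseline = base_fixed + base_per_item * n
    new = baseline + extra_fixed + extra_per_item * n
    return new / baseline - 1.0

sizes = [2 ** k for k in range(1, 15)]  # 2 .. 16384, like the benchmark sizes

# Case A: the new build only pays a one-off extra cost per run.
fixed_only = [relative_slowdown(n, 20.0, 10.0, 5.0, 0.0) for n in sizes]

# Case B: the new build pays extra on every operation (e.g. every call).
per_item = [relative_slowdown(n, 20.0, 10.0, 0.0, 1.5) for n in sizes]

for n, a, b in zip(sizes, fixed_only, per_item):
    print(f"{n:>6}: fixed-only {a:7.3%}   per-item {b:7.3%}")
```

In case A the slowdown shrinks towards zero as the size grows; in case B it settles near the ratio of the extra per-item cost to the baseline per-item cost. A regression that stays at a similar percentage across sizes looks more like case B.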
This is quite unfortunate. |
|
Yeah, I think it'd be good to try and figure out what the extra time is spent on. |
df808f0 to
95911f4
|
I would also be in favor of waiting until we have a clear picture of what is causing this regression. |
|
@elopez wanna have another go at this? Maybe it's better now? I think there have been some changes since that may have improved the situation? |
|
It looks like aeson-optics has not been updated yet 😢 |
|
Looks like we can get it to build on Windows anyway with an extra line in cabal.project :) If any of you can redo the benchmarks, that'd be cool; the branch should now be up to date with main and the latest nixpkgs-unstable. |
|
Worth noting here: static builds for linux (.#redistributable) will use GHC 9.12 (NixOS/nixpkgs#488658) |
|
And static builds seem to be broken 😅 |
Ours? Or what do you mean? |
I mean building .#redistributable with these changes uses GHC 9.12, and ghc iserv seems to fail/crash when building one of the dependencies (at least when I tried on linux aarch64) |
|
Yes, the linux aarch64 architecture was a thing we wanted to add for static builds, but we might need to remove it again. |
I am still seeing similar levels of regression as before:
Comparison with `main` baseline
All
loop
2: OK
29.0 μs ± 1.9 μs, same as baseline
4: OK
42.3 μs ± 1.8 μs, same as baseline
8: OK
67.1 μs ± 4.3 μs, same as baseline
16: OK
121 μs ± 12 μs, same as baseline
32: OK
221 μs ± 13 μs, same as baseline
64: OK
431 μs ± 28 μs, same as baseline
128: OK
854 μs ± 27 μs, same as baseline
256: OK
1.73 ms ± 115 μs, same as baseline
512: OK
3.31 ms ± 243 μs, same as baseline
1024: OK
6.58 ms ± 455 μs, same as baseline
2048: OK
13.1 ms ± 946 μs, same as baseline
4096: OK
26.0 ms ± 1.0 ms, same as baseline
8192: OK
52.4 ms ± 3.6 ms, same as baseline
16384: OK
104 ms ± 9.1 ms, same as baseline
primes
2: OK
97.3 μs ± 5.4 μs, same as baseline
4: OK
142 μs ± 7.0 μs, same as baseline
8: OK
227 μs ± 14 μs, same as baseline
16: OK
437 μs ± 32 μs, same as baseline
32: OK
990 μs ± 98 μs, same as baseline
64: OK
2.09 ms ± 105 μs, same as baseline
128: OK
4.61 ms ± 296 μs, same as baseline
256: OK
10.6 ms ± 877 μs, same as baseline
512: OK
25.3 ms ± 2.0 ms, same as baseline
1024: OK
60.6 ms ± 3.7 ms, same as baseline
2048: OK
149 ms ± 8.1 ms, same as baseline
4096: OK
372 ms ± 17 ms, same as baseline
8192: OK
939 ms ± 87 ms, same as baseline
16384: OK
2.349 s ± 63 ms, same as baseline
hashes
2: OK
41.8 μs ± 3.9 μs, 12% more than baseline
4: OK
67.8 μs ± 3.9 μs, 13% more than baseline
8: OK
120 μs ± 8.7 μs, 14% more than baseline
16: OK
220 μs ± 15 μs, 13% more than baseline
32: OK
425 μs ± 30 μs, 16% more than baseline
64: OK
844 μs ± 58 μs, 16% more than baseline
128: OK
1.70 ms ± 115 μs, 15% more than baseline
256: OK
3.46 ms ± 257 μs, 15% more than baseline
512: OK
6.94 ms ± 562 μs, 14% more than baseline
1024: OK
13.8 ms ± 937 μs, 13% more than baseline
2048: OK
27.3 ms ± 1.7 ms, 14% more than baseline
4096: OK
54.4 ms ± 4.3 ms, 14% more than baseline
8192: OK
108 ms ± 8.1 ms, 15% more than baseline
16384: OK
213 ms ± 14 ms, 13% more than baseline
hashmem
2: OK
65.1 μs ± 4.5 μs, 9% more than baseline
4: OK
106 μs ± 6.7 μs, 14% more than baseline
8: OK
183 μs ± 15 μs, 15% more than baseline
16: OK
340 μs ± 32 μs, 18% more than baseline
32: OK
653 μs ± 58 μs, 18% more than baseline
64: OK
1.38 ms ± 121 μs, 24% more than baseline
128: OK
2.76 ms ± 221 μs, 19% more than baseline
256: OK
5.53 ms ± 436 μs, 17% more than baseline
512: OK
11.2 ms ± 1.1 ms, 17% more than baseline
1024: OK
22.2 ms ± 1.7 ms, 13% more than baseline
2048: OK
44.8 ms ± 3.6 ms, 16% more than baseline
4096: OK
88.3 ms ± 7.2 ms, 15% more than baseline
8192: OK
180 ms ± 8.6 ms, 15% more than baseline
16384: OK
361 ms ± 15 ms, 17% more than baseline
balanceTransfer
2: OK
3.76 ms ± 248 μs, same as baseline
4: OK
3.78 ms ± 233 μs, same as baseline
8: OK
3.83 ms ± 213 μs, same as baseline
16: OK
3.94 ms ± 268 μs, same as baseline
32: OK
4.09 ms ± 244 μs, same as baseline
64: OK
4.43 ms ± 291 μs, same as baseline
128: OK
5.20 ms ± 257 μs, same as baseline
256: OK
6.99 ms ± 437 μs, same as baseline
512: OK
10.6 ms ± 842 μs, same as baseline
1024: OK
17.1 ms ± 876 μs, same as baseline
2048: OK
28.8 ms ± 1.9 ms, same as baseline
4096: OK
52.1 ms ± 4.3 ms, same as baseline
8192: OK
98.1 ms ± 4.7 ms, same as baseline
16384: OK
192 ms ± 14 ms, same as baseline
funcCall
2: OK
47.1 μs ± 3.7 μs, same as baseline
4: OK
64.5 μs ± 4.2 μs, same as baseline
8: OK
102 μs ± 7.5 μs, same as baseline
16: OK
172 μs ± 5.9 μs, same as baseline
32: OK
322 μs ± 29 μs, same as baseline
64: OK
608 μs ± 58 μs, same as baseline
128: OK
1.22 ms ± 70 μs, same as baseline
256: OK
2.32 ms ± 105 μs, same as baseline
512: OK
4.54 ms ± 314 μs, same as baseline
1024: OK
8.90 ms ± 758 μs, same as baseline
2048: OK
17.7 ms ± 1.2 ms, same as baseline
4096: OK
35.2 ms ± 3.4 ms, same as baseline
8192: OK
70.2 ms ± 4.4 ms, same as baseline
16384: OK
140 ms ± 8.9 ms, same as baseline
contractCreation
2: OK
81.7 μs ± 7.5 μs, 10% more than baseline
4: OK
137 μs ± 13 μs, 13% more than baseline
8: OK
242 μs ± 13 μs, 14% more than baseline
16: OK
471 μs ± 29 μs, 13% more than baseline
32: OK
1.01 ms ± 64 μs, 18% more than baseline
64: OK
2.55 ms ± 115 μs, 25% more than baseline
128: OK
6.31 ms ± 423 μs, 21% more than baseline
256: OK
12.4 ms ± 853 μs, 11% more than baseline
512: OK
26.2 ms ± 2.4 ms, same as baseline
1024: OK
55.6 ms ± 4.9 ms, same as baseline
2048: OK
108 ms ± 4.8 ms, 6% more than baseline
4096: OK
229 ms ± 11 ms, 8% more than baseline
8192: OK
466 ms ± 28 ms, 12% more than baseline
16384: OK
893 ms ± 67 ms, same as baseline
contractCreationMem
2: OK
352 μs ± 27 μs, 16% more than baseline
4: OK
662 μs ± 53 μs, 19% more than baseline
8: OK
1.58 ms ± 113 μs, 32% more than baseline
16: OK
3.85 ms ± 258 μs, 22% more than baseline
32: OK
8.30 ms ± 435 μs, 15% more than baseline
64: OK
17.3 ms ± 1.0 ms, 11% more than baseline
128: OK
35.8 ms ± 2.1 ms, 7% more than baseline
256: OK
73.2 ms ± 5.2 ms, same as baseline
512: OK
150 ms ± 15 ms, same as baseline
1024: OK
314 ms ± 8.9 ms, same as baseline
2048: OK
662 ms ± 65 ms, same as baseline
4096: OK
1.267 s ± 74 ms, same as baseline
8192: OK
2.508 s ± 129 ms, same as baseline
16384: OK
5.193 s ± 252 ms, 5% more than baseline
arrayCreationMem
2: OK
176 μs ± 15 μs, 25% more than baseline
4: OK
502 μs ± 38 μs, 35% more than baseline
8: OK
1.81 ms ± 115 μs, 40% more than baseline
16: OK
6.36 ms ± 501 μs, 40% more than baseline
32: OK
24.3 ms ± 1.8 ms, 39% more than baseline
64: OK
95.6 ms ± 4.6 ms, 40% more than baseline
128: OK
376 ms ± 17 ms, 40% more than baseline
256: OK
1.501 s ± 32 ms, 41% more than baseline
512: OK
5.974 s ± 67 ms, 41% more than baseline
mapStorage
2: OK
68.5 μs ± 4.7 μs, 15% more than baseline
4: OK
115 μs ± 6.8 μs, 15% more than baseline
8: OK
212 μs ± 13 μs, 17% more than baseline
16: OK
402 μs ± 29 μs, 17% more than baseline
32: OK
803 μs ± 56 μs, 20% more than baseline
64: OK
1.66 ms ± 108 μs, 21% more than baseline
128: OK
3.41 ms ± 259 μs, 20% more than baseline
256: OK
6.93 ms ± 549 μs, 20% more than baseline
512: OK
13.9 ms ± 889 μs, 18% more than baseline
1024: OK
28.0 ms ± 1.8 ms, 18% more than baseline
2048: OK
56.1 ms ± 4.5 ms, 18% more than baseline
4096: OK
114 ms ± 7.3 ms, 19% more than baseline
8192: OK
229 ms ± 15 ms, 21% more than baseline
16384: OK
459 ms ± 41 ms, 20% more than baseline
swapOperations
2: OK
167 μs ± 14 μs, same as baseline
4: OK
195 μs ± 17 μs, same as baseline
8: OK
247 μs ± 16 μs, same as baseline
16: OK
351 μs ± 28 μs, same as baseline
32: OK
555 μs ± 53 μs, same as baseline
64: OK
1.02 ms ± 54 μs, same as baseline
128: OK
1.82 ms ± 113 μs, same as baseline
256: OK
3.34 ms ± 257 μs, same as baseline
512: OK
6.22 ms ± 476 μs, same as baseline
1024: OK
12.1 ms ± 1.2 ms, same as baseline
2048: OK
23.9 ms ± 1.9 ms, same as baseline
4096: OK
47.4 ms ± 4.6 ms, same as baseline
8192: OK
93.4 ms ± 8.7 ms, same as baseline
16384: OK
187 ms ± 14 ms, same as baseline |
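To see at a glance which groups regress, the comparison output above can be condensed per benchmark group. A rough Python sketch follows, assuming the flattened layout shown above (group name, then `N: OK`, then a timing line). The `summarize` helper and the short `SAMPLE` excerpt are illustrative only, and the handling of "less than baseline" is an assumption, since that wording does not appear in these results.

```python
import re
from statistics import mean

# Short excerpt copied from the results above, used only as sample input.
SAMPLE = """\
hashes
2: OK
41.8 μs ± 3.9 μs, 12% more than baseline
4: OK
67.8 μs ± 3.9 μs, 13% more than baseline
funcCall
2: OK
47.1 μs ± 3.7 μs, same as baseline
4: OK
64.5 μs ± 4.2 μs, same as baseline
"""

SIZE_RE = re.compile(r"^\d+: OK$")
DELTA_RE = re.compile(r"(\d+)% (more|less) than baseline")

def summarize(text):
    """Map each group name to its list of percent changes, one per size.

    'same as baseline' is recorded as 0; 'less than baseline' (assumed
    wording) is recorded as a negative number.
    """
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip().rstrip("|").strip()
        if not line or SIZE_RE.match(line):
            continue  # blank line or a "N: OK" size marker
        if "±" in line and "baseline" in line:
            m = DELTA_RE.search(line)
            pct = 0
            if m:
                pct = int(m.group(1)) * (1 if m.group(2) == "more" else -1)
            groups.setdefault(current, []).append(pct)
        else:
            current = line  # anything else is treated as a group header
            groups.setdefault(current, [])
    return groups

result = summarize(SAMPLE)
# Average change per group, skipping headers that have no measurements.
averages = {g: mean(v) for g, v in result.items() if v}
for group, avg in averages.items():
    print(f"{group}: {avg:+.1f}% on average")
```

Run over the full output, this would list the hash-heavy and memory-heavy groups next to the unchanged ones, which may help narrow down where to profile first.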
Numbers for GHC 9.12.2
Doesn't look very good :( |
Description
This is a WIP PR to prepare for when nixpkgs is ready to build hevm with GHC 9.10.
We would need the following to land on nixpkgs-unstable or be resolved before we can make the switch:
Checklist