Add builtin data structure benchmark framework for booster#4139
Add builtin data structure benchmark framework for booster#4139
Conversation
Benchmark Results (GHC 9.6.5)KMap Operations
KSet Operations
KList Operations
Pipeline Benchmarks
Key observations:
|
|
@jberthold what do you think of this benchmark? It looks good to me, any changes/additions you would make to it? At least, having it as a test in the repo seems like a good thing. |
jberthold
left a comment
There was a problem hiding this comment.
I am generally not opposed to adding benchmarks like these to the code.
There are some details that should be corrected, specifically the test data should definitely not be part of the library code. Other comments are probably minor.
Overall, I am somehow not convinced that optimising the data structures will provide much performance gain. I would imagine the typical map at runtime to not have more than 100 elements (maximum). Maybe the effort is better spent on improving other areas where booster is known to be deficient, such as the implies endpoint.
| if command -v cabal >/dev/null 2>&1; then | ||
| cmd=(cabal bench booster-bench --flags profiling) | ||
| if [[ -n "$benchmark_options" ]]; then | ||
| cmd+=("--benchmark-options=$benchmark_options") | ||
| fi | ||
| cmd+=(-- +RTS) | ||
| cmd+=("${rts_flags[@]}") | ||
| cmd+=(-RTS) | ||
| elif command -v stack >/dev/null 2>&1; then | ||
| bench_args=() | ||
| if [[ -n "$benchmark_options" ]]; then | ||
| bench_args+=("$benchmark_options") | ||
| fi |
There was a problem hiding this comment.
Personally I would give users a choice whether to use cabal or stack (and I would probably run with stack), or at least test for stack first (because I would prefer using stack) but that may be just me.
| ghc-options: | ||
| - -rtsopts | ||
| - -threaded | ||
| - -with-rtsopts=-N |
There was a problem hiding this comment.
This is an unrelated change. The predicates-integration test suite does not need multi-threading.
There was a problem hiding this comment.
I think it would be much better to keep these two modules Data and Ops outside of the library. They are all about testing and should not be part of the main code.
| import Booster.Pattern.Base | ||
|
|
||
| benchmarkSizes :: [Int] | ||
| benchmarkSizes = [10, 100, 1000, 5000, 10000, 50000] |
There was a problem hiding this comment.
Maybe these sizes should be made parameters (via env.variable or argument to the benchmark program). This will allow the benchmarks to be run as unit tests in reasonable time/scale so their code doesn't become stale without us noticing.
| [ bench "insert" $ nf (\(m, k, v) -> runMapUpdate m k v) (mapTerm, insertKey, insertValue) | ||
| , bench "update" $ nf (\(m, k, v) -> runMapUpdate m k v) (mapTerm, existingKey, updateValue) | ||
| , bench "remove" $ nf (\(m, k) -> runMapRemove m k) (mapTerm, existingKey) | ||
| , bench "keys" $ nf runMapKeys mapTerm | ||
| , bench "values" $ nf runMapValues mapTerm |
There was a problem hiding this comment.
Some of these tests are measuring the generation of the large test data structures (such as mapTerm) as well as the operation they are trying to measure. It might be better to evaluate the data structure in full beforehand (using force) - which may require a NFData instance.
See the long comment on the nf function in Test.Tasty.Bench for more information.
|
The |
Hi @jberthold, thanks for reviewing it! Could you please tell me more about these things, which has higher priority to improve? Is there any issues introducing these gaps? |
We have a few github issues around the I also have a branch Probably also a good idea to go through the board and try to close stale issues as well as find work. |
It makes a lot sense. Maybe that's a good start which makes the status of backend easy to track.
Thanks for sharing these! I'll read these issues first. |
|
I still would like to see this benchmark merged, because I think it is useful. So if you have time @Stevengre to address the changes that Jost pointed out, that would be great! |
|
Sure, I'd like to refine this PR when I have time. |
Adds a benchmark framework for profiling booster's builtin K data structure operations (KMap, KSet, KList), including a pipeline benchmark for
matchMaps.What's included:
Booster.Benchmark.Data/Booster.Benchmark.Ops: generate synthetic K terms and benchmark operations (lookup, insert, update, remove, union, intersection, etc.) at various sizes (10–50000)booster-benchexecutable usingtasty-benchwith CSV outputmatchMaps, term ordering (Ordvs hash-first), substitution application, and a full single-rule rewrite pipelinescripts/run-bench-profile.shfor profiling with GHC RTS options