
Added benchmarks vs S2, Boost.Geometry, GeographicLib #14

Merged: gistrec merged 9 commits into master from add-benchmarks on May 19, 2026


gistrec (Owner) commented May 9, 2026

Summary

Adds an opt-in benchmarks/ directory with Google Benchmark micro-benchmarks for:

  • distance_between;
  • heading;
  • contains;
  • area;
  • path_length.

The benchmarks compare geo-utils-cpp against:

  • S2 Geometry;
  • Boost.Geometry with spherical strategy;
  • GeographicLib using WGS84;
  • a naive haversine baseline.
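For readers unfamiliar with the baseline, a naive haversine distance can be sketched roughly as follows. This is an illustration only, not the code in bench_naive: the `LatLng` struct, the function name, and the mean Earth radius value are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical plain lat/lng pair in degrees; not the library's own type.
struct LatLng {
    double lat;
    double lng;
};

// Assumed mean Earth radius in meters; the PR does not say which value it uses.
constexpr double kEarthRadiusMeters = 6371009.0;
constexpr double kPi = 3.14159265358979323846;

constexpr double to_radians(double degrees) { return degrees * kPi / 180.0; }

// Great-circle distance in meters on a sphere, via the haversine formula.
double naive_haversine(const LatLng& a, const LatLng& b) {
    assert(std::abs(a.lat) <= 90.0 && std::abs(b.lat) <= 90.0);
    const double phi1 = to_radians(a.lat);
    const double phi2 = to_radians(b.lat);
    const double sin_dphi = std::sin((phi2 - phi1) / 2.0);
    const double sin_dlambda = std::sin(to_radians(b.lng - a.lng) / 2.0);
    const double h = sin_dphi * sin_dphi +
                     std::cos(phi1) * std::cos(phi2) * sin_dlambda * sin_dlambda;
    // Clamp h to 1 so rounding error cannot push sqrt(h) outside asin's domain.
    return 2.0 * kEarthRadiusMeters * std::asin(std::sqrt(std::fmin(1.0, h)));
}
```

A baseline of this shape does no validation beyond the debug assert and has no abstraction layers, which is what makes it a useful floor for the distance comparison.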

Conversion policy is documented and normalized across libraries. Streams of points are converted inside the timed loop, which is the realistic cost when the input is lat/lng. Long-lived polygons are pre-built once outside the loop, matching the pattern where a geofence is loaded once and queried many times.
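The two policies can be illustrated with a standard-library-only sketch. Everything here is hypothetical: `UnitVec` stands in for a library-native point type, `angle` stands in for the algorithm under test, and `time_ns` is a crude stand-in for the timed region of a benchmark loop. For brevity the sketch uses points in both cases, whereas the PR applies the pre-built policy to polygons.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <vector>

struct LatLng { double lat; double lng; };  // degrees; hypothetical input type
struct UnitVec { double x, y, z; };         // stand-in for a library-native point

// Conversion step whose cost the policy either includes or excludes.
UnitVec to_native(const LatLng& p) {
    assert(p.lat >= -90.0 && p.lat <= 90.0);
    const double k = 3.14159265358979323846 / 180.0;
    const double lat = p.lat * k, lng = p.lng * k;
    return {std::cos(lat) * std::cos(lng), std::cos(lat) * std::sin(lng), std::sin(lat)};
}

// Angle in radians between two unit vectors (the "algorithm" being measured).
double angle(const UnitVec& a, const UnitVec& b) {
    const double cx = a.y * b.z - a.z * b.y;
    const double cy = a.z * b.x - a.x * b.z;
    const double cz = a.x * b.y - a.y * b.x;
    const double dot = a.x * b.x + a.y * b.y + a.z * b.z;
    return std::atan2(std::sqrt(cx * cx + cy * cy + cz * cz), dot);
}

// Runs fn once and returns the elapsed wall time in nanoseconds.
template <typename Fn>
long long time_ns(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// Policy 1: streamed lat/lng input; conversion happens inside the timed work.
double stream_policy(const std::vector<LatLng>& pts, const LatLng& origin) {
    double sum = 0.0;
    for (const LatLng& p : pts) sum += angle(to_native(origin), to_native(p));
    return sum;
}

// Policy 2: long-lived data; the caller converts once, outside the timed work.
double prebuilt_policy(const std::vector<UnitVec>& pts, const UnitVec& origin) {
    double sum = 0.0;
    for (const UnitVec& p : pts) sum += angle(origin, p);
    return sum;
}
```

Wrapping `stream_policy` in `time_ns` charges the conversion to the measurement. For `prebuilt_policy`, the caller builds the `UnitVec` data before starting the timer, so only the algorithm is measured.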

Also adds disk-footprint measurement via benchmarks/size/measure.sh. The script builds minimal consumer programs against each library and reports:

  • stripped binary size;
  • on-disk install size.

Documentation changes:

  • adds docs/benchmarks.md with methodology, full results per operation, and a "when to use which library" guide;
  • adds a short summary table to the README;
  • column winners are highlighted in bold (with co-winners marked when within ~5% noise).
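The bolding rule can be expressed as a small helper. This is a sketch under one assumption that the PR does not spell out: "within ~5% noise" is read as "no more than 5% slower than the fastest entry in the column". The function name is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns one flag per entry: true when that timing is the fastest in the
// column or within `tolerance` (a fraction, e.g. 0.05) of the fastest.
std::vector<bool> mark_winners(const std::vector<double>& ns_per_op,
                               double tolerance = 0.05) {
    assert(tolerance >= 0.0);
    std::vector<bool> winners(ns_per_op.size(), false);
    if (ns_per_op.empty()) return winners;
    const double best = *std::min_element(ns_per_op.begin(), ns_per_op.end());
    for (std::size_t i = 0; i < ns_per_op.size(); ++i) {
        winners[i] = ns_per_op[i] <= best * (1.0 + tolerance);
    }
    return winners;
}
```

With made-up timings of 10.0, 10.4, 12.0 and 30.0 ns, the first two entries would be marked as co-winners and the other two would not.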

Headline numbers on Apple M1, clang 17, -O2 -DNDEBUG:

  • tied with hand-written haversine on distance_between, i.e. the header-only abstraction adds no measurable overhead;
  • tied with Boost.Geometry on distance / heading within ~5% noise; ahead on every polygon operation;
  • about 9–13× faster than Boost.Geometry on contains (depending on polygon size);
  • ahead of S2 on distance, heading, area, path_length, and on contains against ~10-vertex polygons. S2 wins contains from ~100 vertices onward via its bounding-rectangle prefilter (the documented caveat);
  • 36 KB install size vs:
    • 32.8 MB for S2 (with abseil);
    • 12.3 MB for the Boost.Geometry subset of Boost alone;
    • 4.6 MB for GeographicLib.

That makes geo-utils-cpp roughly 130× to 900× smaller, depending on the comparison target.
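The range follows from dividing each competitor's install size by 36 KB. The arithmetic below assumes 1 MB = 1000 KB; with 1 MB = 1024 KB the ratios come out at about 131, 350 and 933, which rounds to the same range.

```latex
\frac{4.6\ \text{MB}}{36\ \text{KB}} \approx 128, \qquad
\frac{12.3\ \text{MB}}{36\ \text{KB}} \approx 342, \qquad
\frac{32.8\ \text{MB}}{36\ \text{KB}} \approx 911
```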

Build / CI hygiene:

  • URL_HASH SHA256=... pinned for the Google Benchmark FetchContent;
  • size-consumer programs use strtod rather than atof, which cannot report parse errors and has undefined behavior when the result is out of range;
  • measure.sh persists failed build logs to build-bench/size-logs/ and reports failure status via a sentinel file (works correctly across command-substitution boundaries);
  • find_package(s2 NAMES s2 S2) accepts both casings of the installed config file.
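To illustrate the strtod point, here is a minimal sketch of checked parsing. The helper name `parse_double` is hypothetical and this is not the code from the size-consumer programs. It only shows the kind of error detection that strtod allows and atof does not.

```cpp
#include <cassert>
#include <cerrno>
#include <cmath>
#include <cstdlib>

// Parses the whole string as a finite double. Returns false (leaving `out`
// untouched) on empty input, trailing garbage, out-of-range or non-finite values.
bool parse_double(const char* text, double& out) {
    assert(text != nullptr);
    errno = 0;
    char* end = nullptr;
    const double value = std::strtod(text, &end);
    if (end == text) return false;    // nothing was converted
    if (*end != '\0') return false;   // trailing characters after the number
    if (errno == ERANGE || !std::isfinite(value)) return false;
    out = value;
    return true;
}
```

With atof, input such as "abc" simply yields 0.0, which is indistinguishable from a legitimate zero. The `end` pointer and `errno` checks above make that case detectable.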

Benchmarks are off by default (GEO_UTILS_CPP_BUILD_BENCHMARKS=OFF) and are not run in CI, since that would require S2, Boost, and GeographicLib installs. A small smoke-build job in CI compiles bench_geo_utils and bench_naive (the dependency-free portion) so the benchmark plumbing can't silently bitrot.

gistrec requested a review from MrHerrn May 9, 2026 01:24
gistrec self-assigned this May 9, 2026
gistrec marked this pull request as draft May 9, 2026 01:32
gistrec marked this pull request as ready for review May 9, 2026 14:20
Comment thread benchmarks/speed/bench_boost.cpp Outdated
Comment thread benchmarks/speed/bench_boost.cpp Outdated
Comment thread benchmarks/speed/bench_geographiclib.cpp
Comment thread docs/benchmarks.md Outdated
gistrec added a commit that referenced this pull request May 10, 2026
Address PR #14 review feedback (@MrHerrn): with native point/geometry
types built fresh inside the benchmark loop, conversion overhead was
being charged to each competitor's per-call cost, inflating the gap on
ops like Boost.Geometry's path_length. Move every native-type build
out of the timed region so the numbers reflect algorithmic cost only.

bench_geographiclib's area / path_length intentionally keep the
PolygonArea + AddPoint work in-loop: AddPoint *is* the per-vertex
geodesic computation, and Compute() only adds the closing-edge
contribution — pre-building outside the loop would measure a cached
double read, not real work. Comment explains this.

Refresh README.md and docs/benchmarks.md result tables and TL;DR to
match the new picture: ties Boost.Geometry on distance / heading /
path_length within noise; wins clearly on area; trails S2 on
distance / path_length / contains algorithmically (S2 still pays
lat/lng->S2Point conversion in real lat/lng workloads, which these
algorithm-only numbers do not count). Install-size advantage of
130-900x is unchanged.
codecov-commenter commented May 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.19%. Comparing base (a28bb64) to head (e828d42).

Additional details and impacted files
@@           Coverage Diff           @@
##           master      #14   +/-   ##
=======================================
  Coverage   98.19%   98.19%           
=======================================
  Files          20       20           
  Lines         499      499           
  Branches       88       88           
=======================================
  Hits          490      490           
  Misses          1        1           
  Partials        8        8           

Comment thread docs/benchmarks.md Outdated
gistrec added the enhancement (New feature or request) label May 13, 2026
gistrec added 9 commits May 19, 2026 03:07
… split smoke CI

- benchmarks/common/constants.hpp + bench_polygon/bench_queries helpers
  in random_data.hpp; bench files use shared constants instead of
  hard-coding (40.0, -74.0, 5.0, 1000) per file.
- Every BENCHMARK macro now ->Repetitions(5)->ReportAggregatesOnly(true).
- regular_polygon() lng-sign fix: vertices are now actually CCW
  (positive signed_area in our convention).
- queries_around() clamps lat/lng to valid domain.
- benchmarks smoke-build moved to its own workflow with paths filter on
  benchmarks/** + CMakeLists.txt + the workflow file; removed redundant
  step from ci.yml (doc-only PRs no longer pay for it).
- docs/benchmarks.md: "Disk footprint" -> "Deployment footprint"; drop
  "fraction of the disk footprint" / "33 MB install" claims as headline
  arguments; lean on dependency-free + lat/lng-native + S2Point-native
  data as the technical criteria. TL;DR compressed to scannable bullets,
  bold rule applied strictly at ~5%, N/A rows unified, polygon-area
  footnote added, naive-haversine baseline disclaimer in distance
  commentary, PolygonArea apples-to-apples note added.
- README.md: merged Features + Why-use into one list, deduped install
  boilerplate (single find_package block), added polygon example
  demonstrating contains/area/path_length/on_path, replaced std::boolalpha
  with inline ternaries, "about 36 KB across 4 headers".
- docs/api.md: merged Conventions + Numerical notes; expanded with
  Earth model + Google Maps reference, thread safety, error handling
  (noexcept story, nullopt/extrapolate semantics), include strategy.
  Added on_path and angle_between examples; offset_origin round-trip
  + nullopt example; readable comments on distance/path_length/area
  examples; geodesic-defaults note at top of Polygon functions.
- benchmarks/README.md: drop Competitors table and Methodology notes
  (already in docs/benchmarks.md); reduced to install + build + measure
  mechanics.
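The regular_polygon() orientation fix and the queries_around() clamping described in the commit above can be illustrated with a small sketch. It is not the code in random_data.hpp: the function names and the `LatLng` struct are hypothetical, and a planar shoelace area on the (lng, lat) plane stands in for the library's own signed_area, whose exact convention is not shown in this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct LatLng { double lat; double lng; };  // degrees; hypothetical type

// Regular n-gon around `center`, emitted counter-clockwise when viewed with
// x = lng (east) and y = lat (north). `radius_deg` is measured on the flat
// lat/lng plane, which is only reasonable for small polygons.
std::vector<LatLng> regular_polygon_ccw(const LatLng& center, double radius_deg,
                                        std::size_t n) {
    assert(n >= 3);
    const double two_pi = 6.28318530717958647692;
    std::vector<LatLng> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double theta = two_pi * static_cast<double>(i) / static_cast<double>(n);
        out.push_back({center.lat + radius_deg * std::sin(theta),
                       center.lng + radius_deg * std::cos(theta)});
    }
    return out;
}

// Shoelace formula on the (lng, lat) plane: positive for CCW rings.
double planar_signed_area(const std::vector<LatLng>& ring) {
    double twice_area = 0.0;
    for (std::size_t i = 0; i < ring.size(); ++i) {
        const LatLng& a = ring[i];
        const LatLng& b = ring[(i + 1) % ring.size()];
        twice_area += a.lng * b.lat - b.lng * a.lat;
    }
    return 0.5 * twice_area;
}

// Keeps a generated query point inside the valid lat/lng domain.
LatLng clamp_to_domain(LatLng p) {
    p.lat = std::min(90.0, std::max(-90.0, p.lat));
    p.lng = std::min(180.0, std::max(-180.0, p.lng));
    return p;
}
```

Flipping the sign of the longitude term reverses the winding, which is the kind of mistake a positive-signed-area check on the generated ring would catch.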
gistrec (Owner, Author) commented May 19, 2026

Fixed, ready for re-review, @MrHerrn

gistrec merged commit abb0da1 into master May 19, 2026
9 checks passed
3 participants