Response cache stats#206
Conversation
|
Hi @acmel, thank you for the PR. Here's gemini's output via the Other than that, the db changes look safe. @rgushchin I am not sure how you handle the database migrations for adding new columns but if they are allowed to be null, then it should be fine. |
|
Thanks for the review, Derek! On the two Gemini findings:
On the migration question: yes, all four new columns ( |
|
I'll force push now with the merge conflicts resolved. |
3617c51 to
19cd4d5
Compare
find_best_review_for_patch_refs() used a broken status-preference heuristic that could select a stale "Reviewed" result from a previous run over the most recent "Failed" retry. Replace with a simple highest-id comparison so the most recent review always wins. rerun_patchset() only reset the patchset-level status but left per-patch statuses (and apply_error) from the previous run. Add a step to reset all patches to "Pending" with apply_error cleared. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Derek Barbosa <debarbos@redhat.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add layered configuration loading so users can place personal overrides (credentials, paths, custom settings) in ~/.config/sashiko/Settings.toml without modifying the repo-level Settings.toml. Settings are loaded in order (last wins): 1. ./Settings.toml — repo-level defaults 2. ~/.config/sashiko/Settings.toml — user overrides (optional) 3. SASHIKO__* env vars — per-invocation overrides The user config path follows the XDG Base Directory Specification: $XDG_CONFIG_HOME/sashiko/Settings.toml if set, else ~/.config/sashiko/. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Derek Barbosa <debarbos@redhat.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…cache_stats setting Extend CacheStats with two new fields: `misses` (requests that missed the cache and triggered an API call) and `tokens_stored` (total tokens in newly created cache entries, representing future savings potential). Add `show_cache_stats` boolean to AiSettings (default false) as the gate for all cache statistics display in CLI, web UI, and daemon logs. Add `fmt_tokens()` helper for human-readable token counts (42.1k, 1.5M). Document the response_cache, response_cache_ttl_days, and show_cache_stats settings in Settings.toml with commented examples. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Derek Barbosa <debarbos@redhat.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add four nullable INTEGER columns to the reviews table via migration: cache_hits, cache_misses, cache_tokens_saved, cache_tokens_stored. Extend complete_review() with an optional ReviewCacheStats parameter to populate these columns. In process_patch_review(), snapshot CacheStats before run_review_tool() and compute the delta after it returns. This gives per-patch cache hit/miss/tokens data without new synchronization — atomic counters only grow, so the difference is the per-review contribution. Gate the patchset-level cache summary log and add per-patch cache log lines behind the show_cache_stats setting. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Derek Barbosa <debarbos@redhat.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Wire show_cache_stats through the API layer: add it to AppState, pass it
from main.rs, and inject it into patchset detail responses alongside the
existing smtp_enabled and dry_run flags.
Extend the reviews SELECT in get_patchset_details() with cache_hits,
cache_misses, cache_tokens_saved, cache_tokens_stored columns.
In the CLI, extend print_patch_line() with a cache_suffix parameter.
When show_cache_stats is enabled, build per-patch suffixes like
"{cache: 7/7 hits, 42.1k tokens saved}" and display a patchset-level
summary line with hit rate and token totals.
Add fmt_tokens() helper for human-readable token counts (k/M suffixes).
Document the response cache feature and previously undocumented CLI
commands (rerun, cancel, local, --watch) in README.md.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Derek Barbosa <debarbos@redhat.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When show_cache_stats is enabled in the API response, the web UI now shows cache statistics via tap/click-to-expand panels that work on both desktop and mobile browsers. Per-patch: each patch row in the summary table shows a ▶ indicator when cache stats are available. Tapping it expands an animated stats panel with a 3-column grid (hits, misses, hit rate, tokens saved, tokens stored, total calls) and a hit-rate progress bar. The row click still scrolls to the patch detail as before. Patchset-level: the Model line in the metadata section gains a ▶ indicator that expands to show aggregate cache stats across all patches in the series. Uses CSS max-height/opacity transitions for smooth animation. Adds fmtTokens() JS helper for human-readable token counts (k/M suffixes). Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Derek Barbosa <debarbos@redhat.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The cancel_patchset endpoint was the only mutation endpoint missing the .to_canonical() call before .is_loopback(), causing 403 Forbidden when a CLI on the same machine connects via IPv4 (127.0.0.1) to a dual-stack server (bound to ::). The IPv4 address arrives as the mapped ::ffff:127.0.0.1, which is_loopback() rejects. The submit, rerun_patch, and rerun_patchset endpoints already use to_canonical() — this was simply missed on cancel. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Derek Barbosa <debarbos@redhat.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
cancel_patchset returned the same "not in a cancellable state" message whether the patchset didn't exist or existed in a non-cancellable status. Change the DB function to return Option<bool> (None = not found, Some(true) = cancelled, Some(false) = wrong status) so the API can return 404 and the CLI can show "Patchset N not found". Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Derek Barbosa <debarbos@redhat.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The reviewer only checked for cancellation after completing all patch reviews, so a cancelled patchset would keep submitting LLM requests until every patch was processed. Add a cancellation check before each process_patch_review call in both the concurrent and main worker loops — when cancellation is detected, the shared job queue is drained so all workers stop promptly. This does not interrupt an in-flight LLM call (that would require threading a CancellationToken through the AI provider), but it stops between patches, which is the useful granularity. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Derek Barbosa <debarbos@redhat.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fetching custom remotes (e.g. pahole) from git.kernel.org over IPv6 can be sluggish, and the sequential ensure_remote() calls in main() blocked the Reviewer service from starting until all fetches completed. This meant no patchsets were picked up while waiting for a slow git fetch. Move the Reviewer::new() + start() before the custom remote loop and spawn the remote fetches in a background task. The reviewer already calls ensure_remote() itself when it encounters a RemoteTarget baseline candidate, so patchsets that need a custom remote will wait for the fetch at that point — while patchsets targeting other repos proceed without delay. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Derek Barbosa <debarbos@redhat.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add an "Opening response cache" message before the DB initialization begins and a "Response cache ready" summary when it completes, reporting the number of cached entries, file size (MB/GB), lifetime tokens saved, and elapsed wall-clock time. This makes it easy to spot slow cache opens and provides useful diagnostics at startup. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Derek Barbosa <debarbos@redhat.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
19cd4d5 to
dccf6e0
Compare
● Response cache statistics, startup improvements, and bug fixes
This series adds observability into the response cache, fixes several
bugs found during testing, and improves daemon startup behavior.
Response cache statistics (patches 3-6)
Track per-review cache hit/miss/token counts in the database and expose
them through the API, CLI (sashiko show), and web UI. The web UI uses
tap-to-expand panels with hit-rate progress bars — available per-patch
and aggregated at the patchset level. All cache stats display is gated
behind the show_cache_stats setting (default off).
Example sashiko-cli show output with cache stats enabled:
Patchset Details:
ID: 61
Subject: perf session: Add minimum event size and alignment validation
Author: Arnaldo Carvalho de Melo acme@redhat.com
Status: Failed
Date: 2026-05-19 14:48:09
Patches (28):
[1] perf session: Add minimum event size and alignment validation (Failed) [Failed] {cache: 502/543 hits, 23.1M tokens saved}
[2] perf tools: Fix event_contains() macro to verify full field extent (Reviewed) [Issues Found] {cache: 413/807 hits, 15.7M tokens saved}
...
[27] perf kwork: Bounds check work->cpu before indexing cpus_runtime[] (Reviewed) [Reviewed] {cache: 3667/4155 hits, 163.0M tokens saved}
[28] perf test: Add truncated perf.data robustness test (Reviewed) [Issues Found] {cache: 3080/3334 hits, 134.8M tokens saved}
Cache: 64296/78526 hits (81.9%), 2841.6M tokens saved, 647.2M tokens stored
Model: gemini-3.1-pro-preview
In the web interface, these stats are available by tapping the ▶ indicator
next to each patch row for per-patch details, and on the Model: line in
the patchset header for aggregate stats across all patches.
Startup and operational improvements (patches 2, 11, 12)
personal credentials and overrides don't live in the repo checkout
immediately instead of blocking on slow git.kernel.org IPv6 fetches
Bug fixes (patches 1, 7-10)
"Failed" retry after patchset resubmission
servers (missing to_canonical() call)
is cancelled
Docs (patch 0)
I'll work now on adding limits to the cache and hopefully some sort of LRU to keep LLM requests that have tons of hits while making space for new entries by purging seldom used ones and then make the request cache a default feature with a given reasonable size based on the experience we gather by exposing the stats using this patch series publicly, which is something I encourage the admin for https:://sashiko.dev to do when upgrading to the next release.
Saving "I know that this is not something introduced by your patch, but there is this pre-existing problem that I analysed and found problematic, can you postpone submitting this series and fixing just this one more issue, pretty please?" cases in some sort of database that gets exposed in the https://sashiko.dev site in one special session of "please help picking up from this cache of query/response tokens cache of problems" is another idea I plan to work on Real Soon Now. :-)
Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com