feat(viewer): reusable run viewer in ra/ (Trace) + kai Findings panel #98
Merged
Lift the rollout viewer out of the benchmark harness into a core kai.viewer package so any pipeline run can be inspected, not just benchmark rollouts.

- trace.py: the causal trace loader, moved verbatim from evaluation/trace_viewer.py
- findings.py: load a run's exploits.json into a Finding view-model, reusing kai.cvss to derive severity and expand the CVSS vector
- html.py: a single self-contained, data-first (Tufte) page with two tabs -- Findings (severity dot + score + 0-10 bar, CVSS breakdown, PoC, +/- patch diff) and Trace (the restyled causal spine); XSS-safe via textContent
- __main__.py: python -m kai.viewer <run_dir> [-o OUT] [--open]
- evaluation/trace_viewer.py: thin shim re-exporting from kai.viewer so cli view / index keep working

Reads exploits.json + rollouts/*.jsonl straight off disk; no live state backend needed.
…g titles

- Move the design tokens + shared component CSS (table, severity dot/score/bar, CVSS/detail blocks, code/diff/output panes) into kai/viewer/style.py. The interactive viewer now composes style.base_css() + its own layout, so the upcoming 'kai report --format html' document can reuse the exact same palette and primitives -- one design system, no drift.
- Cap derived finding titles at ~64 chars on a word boundary so a long first hypothesis sentence stays a scannable headline instead of wrapping across table cells and section titles (improves both viewer and report).
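The title cap described above could look roughly like the sketch below. This is not the code from the PR: the function name `cap_title`, the trailing ellipsis, and the whitespace normalisation are assumptions made for illustration; only the ~64-character limit and the word-boundary cut come from the commit message.

```python
def cap_title(text: str, limit: int = 64) -> str:
    """Shorten `text` to at most `limit` characters, cutting on a word boundary.

    Hypothetical helper: the name, the ellipsis, and the punctuation trimming
    are illustrative assumptions, not the PR's implementation.
    """
    text = " ".join(text.split())  # collapse newlines and runs of spaces
    if len(text) <= limit:
        return text
    head = text[: limit - 1]       # reserve one character for the ellipsis
    cut = head.rfind(" ")          # last word boundary inside the budget
    if cut > 0:
        head = head[:cut]
    return head.rstrip(" ,;:.") + "…"


if __name__ == "__main__":
    hypothesis = (
        "The withdraw function sends ether before updating the balance "
        "so a malicious contract can reenter and drain funds"
    )
    print(cap_title(hypothesis))
```

Cutting at the last space inside the budget keeps the headline from ending mid-word; a single overlong word with no space falls back to a hard cut.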
This was referenced Jun 8, 2026
A finished run's exploits.json keeps merged-away duplicate hypotheses as 'deduplicated' shells (no severity/PoC). They were surfacing as empty 'none'-severity rows and inflating the finding count (a real run showed '10 findings · 9 none'). load_findings now drops them, leaving the confirmed and rejected findings that actually matter.
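A minimal sketch of the filtering step, assuming `exploits.json` is a top-level JSON list of objects that carry a `status` field whose value is `"deduplicated"` for merged-away shells. The real schema is not shown in this PR, so the field name, the list shape, and the function name `load_visible_findings` are all hypothetical.

```python
import json
from pathlib import Path


def load_visible_findings(run_dir):
    """Return the entries of <run_dir>/exploits.json that are not dedup shells.

    Assumes (hypothetically) a top-level list of dicts with a "status" key.
    """
    raw = json.loads((Path(run_dir) / "exploits.json").read_text(encoding="utf-8"))
    # Keep confirmed and rejected findings; drop bookkeeping shells that have
    # no severity or PoC and would otherwise show up as empty rows.
    return [entry for entry in raw if entry.get("status") != "deduplicated"]
```

Filtering at load time, rather than in the renderer, means the finding count and the table rows stay consistent with each other.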
Collaborator
this one lgtm
Pull request overview
This PR introduces a first-class, offline HTML viewer under kai.viewer that can render a normal kai pipeline run (security findings from exploits.json plus the causal agent trace from rollouts/*.jsonl). It also keeps the old evaluation.trace_viewer API working via a compatibility shim.
Changes:
- Add `kai.viewer` package with trace loading, findings loading/normalization, and a self-contained HTML renderer.
- Add `python -m kai.viewer <run_dir> [-o OUT] [--open]` entry point for generating/opening the viewer HTML.
- Replace `evaluation/trace_viewer.py` implementation with re-exports from `kai.viewer` and add viewer-focused tests.
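For orientation, the command-line surface named above (`<run_dir>`, `-o OUT`, `--open`) can be expressed with `argparse` as in the sketch below. Only those three arguments come from the PR text; the help strings, the `output` attribute name, and the absence of a default output path are assumptions.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring `python -m kai.viewer <run_dir> [-o OUT] [--open]`."""
    parser = argparse.ArgumentParser(prog="python -m kai.viewer")
    parser.add_argument("run_dir", type=Path, help="directory of a finished run")
    # dest/metavar are illustrative; the PR only documents the short flag -o.
    parser.add_argument("-o", dest="output", metavar="OUT", type=Path,
                        default=None, help="where to write the HTML file")
    parser.add_argument("--open", action="store_true",
                        help="open the generated page in a browser")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.run_dir, args.output, args.open)
```

A real entry point would then render the page, write it to `args.output`, and, when `args.open` is set, hand the file to something like `webbrowser.open`.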
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/test_viewer.py | New tests for findings loading, HTML rendering, and write-to-disk behavior. |
| src/kai/viewer/trace.py | New causal trace loader that folds per-agent JSONL into a nested run trace model. |
| src/kai/viewer/style.py | Shared CSS tokens/components for consistent styling across HTML surfaces. |
| src/kai/viewer/html.py | Self-contained HTML template + JS rendering for Findings and Trace tabs. |
| src/kai/viewer/findings.py | Findings loader/view-model with CVSS expansion and sorting. |
| `src/kai/viewer/__main__.py` | CLI entry point for rendering and optionally opening the HTML output. |
| `src/kai/viewer/__init__.py` | Public re-exports for the viewer package API. |
| evaluation/trace_viewer.py | Compatibility shim re-exporting viewer functions from kai.viewer. |
Per review: the trace viewer and design system are framework-level (any agent built on ra produces rollouts and can render them), so they belong in ra, not kai.

- ra/viewer/: trace.py + style.py (moved) + html.py, a self-contained, panel-composed viewer with a built-in causal Trace panel and a reusable render_page(data, panels) composer. 'python -m ra.viewer <run_dir>' renders any ra run's trace standalone.
- kai/viewer/: keeps the security domain layer, findings.py (CVSS/exploits) and a Findings panel, and composes it onto ra's Trace panel via ra.viewer.render_page. Findings stay in kai because they're domain concepts; moving them to ra would invert the dependency (ra must not import kai).

evaluation shim and tests updated to import the trace loader from ra.viewer. No behaviour change to 'kai view' (still Findings + Trace, same design).
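To make the layering concrete, here is a toy version of a panel composer. It is not `ra.viewer.render_page`: the `Panel` dataclass, its fields, the markup, and the way data is embedded are all hypothetical; the only things taken from the commit message are the `render_page(data, panels)` shape and the idea that the framework ships a Trace panel while a domain layer adds its own.

```python
import html
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Panel:
    key: str    # tab identifier and DOM id suffix (hypothetical field)
    title: str  # tab label shown to the user
    body: str   # trusted, static markup for the panel container


def render_page(data, panels):
    """Compose a single self-contained page from a data payload and panels."""
    tabs = "".join(
        f'<button data-tab="{html.escape(p.key)}">{html.escape(p.title)}</button>'
        for p in panels
    )
    sections = "".join(
        f'<section id="panel-{html.escape(p.key)}">{p.body}</section>'
        for p in panels
    )
    # Escape "</" so run data can never close the <script> element early.
    payload = json.dumps(data).replace("</", "<\\/")
    return (
        "<!doctype html><html><head><meta charset='utf-8'></head><body>"
        f"<nav>{tabs}</nav>{sections}"
        f'<script id="run-data" type="application/json">{payload}</script>'
        "</body></html>"
    )


# The framework layer would own a panel like this one ...
TRACE = Panel("trace", "Trace", "<div class='spine'></div>")
# ... and a domain layer would define its own and pass both to render_page.
FINDINGS = Panel("findings", "Findings", "<table class='findings'></table>")
```

In this sketch the framework side would call `render_page(data, [TRACE])`, and the domain side `render_page(data, [FINDINGS, TRACE])`, so the dependency only ever points from the domain package to the framework.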
andthattoo previously approved these changes Jun 9, 2026
Review (Copilot, #98):

- _sort_key now orders by (confirmed, severity, score) so a high/critical finding with a label but no usable CVSS score isn't sunk below a low finding that has a numeric score.
- write_html / write_trace_html mkdir the output's parent, so '-o some/new/dir/trace.html' works instead of raising FileNotFoundError.
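The ordering fix can be illustrated with a small tuple key. The sketch assumes findings are plain dicts with `confirmed`, `severity`, and `score` keys and that the list is sorted in descending order; the actual `Finding` model and `_sort_key` in the PR may differ.

```python
# Rank labels so severity compares before the numeric score does.
SEVERITY_RANK = {"critical": 4, "high": 3, "medium": 2, "low": 1, "none": 0}


def sort_key(finding):
    """Key for sorted(..., reverse=True): confirmed, then severity, then score."""
    confirmed = bool(finding.get("confirmed"))
    rank = SEVERITY_RANK.get(str(finding.get("severity", "none")).lower(), 0)
    score = finding.get("score")
    # A missing or unusable score sorts last within its severity band only.
    score = float(score) if isinstance(score, (int, float)) else -1.0
    return (confirmed, rank, score)


if __name__ == "__main__":
    findings = [
        {"id": "low-scored", "confirmed": True, "severity": "low", "score": 3.1},
        {"id": "high-unscored", "confirmed": True, "severity": "high", "score": None},
    ]
    for f in sorted(findings, key=sort_key, reverse=True):
        print(f["id"])
```

Because the severity rank sits ahead of the score in the tuple, a missing score can only reorder findings that share the same confirmation state and severity label.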
aktasbatuhan added a commit that referenced this pull request Jun 9, 2026
Match the viewer fix (#98): 'kai report -o some/new/dir/report.md' (and --format html) now mkdir the parent instead of raising FileNotFoundError.
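The parent-directory fix is a standard `pathlib` pattern. The helper name below is hypothetical and the sketch is not the code from either commit; it only shows the `mkdir(parents=True, exist_ok=True)` call that makes a path like `some/new/dir/report.md` writable.

```python
from pathlib import Path


def write_text_creating_parents(path, text):
    """Write `text` to `path`, creating any missing parent directories first."""
    out = Path(path)
    # parents=True builds the whole chain; exist_ok=True makes reruns harmless.
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(text, encoding="utf-8")
    return out
```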
Member
Author
Addressed both Copilot comments in the latest push (3826e9e):
andthattoo approved these changes Jun 9, 2026
Base: `master`. First of the stack (#98 → #99 → #100 → #101 → #102).

What this does

A self-contained HTML viewer for a run (findings plus the causal agent trace) with no server and no external requests.

Per review feedback, the reusable parts live in `ra/` (the framework), so any ra-based agent can render its runs; the security-specific parts stay in `kai/`:

`ra/viewer/` (framework-level, domain-agnostic):

- `trace.py`: the causal-trace loader (rollouts → spine), moved verbatim.
- `style.py`: the shared Tufte design system (tokens + primitives).
- `html.py`: a panel-composed viewer: a tabbed shell + a built-in Trace panel + `render_page(data, panels)` so a domain layer can add its own panels.
- `python -m ra.viewer <run_dir>` renders any ra run's trace standalone.

`kai/viewer/` (the security domain layer):

- `findings.py`: `exploits.json` → `Finding` (CVSS/severity/PoC/patch). Also hides `deduplicated` bookkeeping shells.
- `html.py`: a Findings panel composed onto `ra`'s Trace panel via `ra.viewer.render_page`.

Why findings stay in kai: CVSS / exploits / patches are domain concepts. Moving them to `ra` would make the framework import `kai.cvss`, inverting the dependency. So the split is: reusable viewer + design system → `ra`; security findings → `kai`.

`kai view` is unchanged for users (Findings + Trace, same Tufte design); `evaluation` and tests now import the trace loader from `ra.viewer`.

Verification

- `ty` / `ruff` clean; `pytest` green (new `test_ra_viewer.py` covers the standalone trace viewer + arbitrary-panel composition with zero kai involvement).
- `kai view` produces the Findings (confirmed Critical reentrancy) + Trace page; `python -m ra.viewer` produces a Trace-only page with no Findings tab; both fully self-contained (no external resource loads).
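One way to spot-check the "no external resource loads" claim is a heuristic scan of the generated HTML, sketched below. This is an assumption about how such a check might be written, not the test in `test_ra_viewer.py`; it only looks at `src`/`href` attributes and CSS `url(...)` / `@import`, so it would miss requests issued from script code and can flag URLs that merely appear inside embedded run data.

```python
import re

# Attribute values that point at an absolute or protocol-relative URL.
_ATTR_EXTERNAL = re.compile(
    r"""(?:src|href)\s*=\s*["']?\s*(?:https?:)?//""", re.IGNORECASE
)
# CSS constructs that can pull in remote stylesheets, fonts, or images.
_CSS_EXTERNAL = re.compile(
    r"""url\(\s*["']?\s*(?:https?:)?//|@import""", re.IGNORECASE
)


def external_references(page: str) -> list[str]:
    """Return the matched snippets that suggest an external resource load."""
    hits = []
    for pattern in (_ATTR_EXTERNAL, _CSS_EXTERNAL):
        hits.extend(m.group(0) for m in pattern.finditer(page))
    return hits


if __name__ == "__main__":
    page = "<html><style>body{color:#111}</style><script>var x = 1;</script></html>"
    print(external_references(page))  # an empty list means nothing was flagged
```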