Skip to content

RFC: enable custom prompts/review-stages, and allow for agnostic git backends#141

Closed
derekbarbosa wants to merge 10 commits into
sashiko-dev:mainfrom
derekbarbosa:feature/custom-prompts
Closed

RFC: enable custom prompts/review-stages, and allow for agnostic git backends#141
derekbarbosa wants to merge 10 commits into
sashiko-dev:mainfrom
derekbarbosa:feature/custom-prompts

Conversation

@derekbarbosa
Copy link
Copy Markdown
Collaborator

@derekbarbosa derekbarbosa commented Apr 30, 2026

Sashiko's review pipeline currently has a fixed structure: hardcoded stage
instructions, a static set of AI tools, and no mechanism for users to inject
domain-specific review guidance.

Those adopting sashiko for their own
kernel trees or non-kernel projects must either:

  1. Fork the repository and modify prompt files directly, creating a
    maintenance burden when upstream evolves.

  2. Accept the default review pipeline as-is, missing project-specific review
    criteria and tooling.

Neither option scales. Someone reviewing storage drivers needs different
guidance someone reviewing networking code, but today both run the
identical 9-stage pipeline with identical prompts.

This patch introduces a configuration-driven customization layer that sits
between the user and the review pipeline. All changes are opt-in; the default
behavior is completely unchanged when no configuration is provided.

The following features are enabled by this PR:

Custom Prompts Directory

Users can point sashiko at their own prompt files via Settings.toml or the
--prompts CLI flag. The directory can be a local path, a Git repository URL
(cloned and cached locally), or an HTTP(S) URL. The default
third_party/prompts/kernel directory continues to work unchanged.

Stage Configuration

A new stages.toml file controls the review pipeline:

  • Enable or disable individual stages
  • Override the instruction file for any stage
  • Specify per-stage supporting documents
  • Add entirely new custom stages beyond the default 9

This allows teams to skip irrelevant stages (e.g., disable the hardware review
stage for a pure-software project) or add domain-specific stages (e.g., a
compliance analysis stage).

Template Variables

Prompt files can contain {{variable}} placeholders that are substituted at
load time from configuration. Built-in variables include {{date}} and
{{year}}. Users define custom variables in [prompts.variables] to inject
organization names, target architectures, or other project-specific context
into prompts without modifying the prompt files themselves.

Tool Filtering

Users can allowlist or denylist which built-in AI tools are available during
review. This is useful for restricting the AI to read-only operations, or
limiting tool access in constrained environments.

Custom Tool Definitions

Users can register shell-based tools that the AI can invoke during review.
Custom tools are defined in Settings.toml with a JSON schema for parameters,
a shell command template, and an optional path allowlist for security. This
enables integration with project-specific linters, static analyzers, or other
external tools without modifying sashiko's source.

Multi-Forge Integration via ForgeProvider Trait

Patches are currently ingested exclusively via NNTP (mailing list).
This patch adds a ForgeProvider trait that abstracts webhook-based
ingestion from code forges, with built-in implementations for GitHub and
GitLab:

  • ForgeProvider trait: validate_event(), parse_payload() — any forge
    that speaks webhooks can implement this interface.
  • ForgeRegistry: runtime registry of providers, pre-loaded with GitHub
    and GitLab. Third-party providers can be registered at startup.
  • ForgeMetadata: normalized struct (repo_url, base_sha, head_sha,
    pr_number, pr_title, pr_url) — backend-agnostic representation of
    a PR/MR event.
  • ForgeSettings in Settings.toml: enabled, provider, webhook_secret,
    api_token.
  • Webhook endpoint (/api/webhook/:provider): receives forge events, validates
    via the registered provider, extracts commit range, and feeds patches into
    the existing review pipeline.
  • DB schema additions: mr_url, mr_title, mr_number, slug columns on
    patchsets table for forge metadata tracking.

The forge layer is entirely opt-in. When [forge] enabled = false (default),
no webhook routes are registered and the NNTP pipeline is unchanged.

This patch:

  • does not change the default review behavior. Without a [prompts] or
    [tools] section in Settings.toml, sashiko operates identically to before.
  • does not add new AI providers or modify the AI interaction protocol.
  • does not change the review output format.
  • does not change the default mailing list/mbox backend

@derekbarbosa
Copy link
Copy Markdown
Collaborator Author

derekbarbosa commented Apr 30, 2026

cc: @jpoimboe @PacheNico @jlelli @acmel

@derekbarbosa derekbarbosa force-pushed the feature/custom-prompts branch 2 times, most recently from 7ea9ee1 to 1769c5e Compare May 1, 2026 00:15
@rgushchin
Copy link
Copy Markdown
Member

I like the idea and was thinking about something similar!
I'll try to look deeper and comment on the details. I'll be on lsf/mm next week, so please expect some delay.
Thank you for the RFC!

@derekbarbosa
Copy link
Copy Markdown
Collaborator Author

No worries. I was noticing some activity lately, so I decided to at least put up a working draft of the RFC now vs later :)

@derekbarbosa derekbarbosa force-pushed the feature/custom-prompts branch from 1769c5e to 771ebc0 Compare May 6, 2026 13:05
@derekbarbosa
Copy link
Copy Markdown
Collaborator Author

V2 -> V3:

  • Fixed case sensitivity for tool calls
  • Removed hardcoded enabled_tools list and use the tool declarations
  • Wrap custom tool output in JSON objects for model consumption
    thank you @jlelli for the testing + fixes

@derekbarbosa derekbarbosa force-pushed the feature/custom-prompts branch from 771ebc0 to 60a7a79 Compare May 6, 2026 13:38
Add slug, mr_url, mr_title, and mr_number columns to the patchsets
table to support forge-based (GitHub/GitLab) merge request tracking.
Extend PatchsetRow, all SELECT queries, and create_fetching_patchset
with the new fields. Add get_patchset_details_by_slug and
get_patchset_summary_by_slug lookup methods. Add mr_url, mr_title,
and mr_number fields to Event::PatchSubmitted and ParsedArticle.

Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
Add configuration structs for multi-forge integration, subsystem
mapping, tool filtering, and custom prompts. Add ProjectSettings,
ForgeSettings, SubsystemsSettings, ToolsSettings, PromptsSettings,
and CustomToolDefinition types. Add Settings::get_prompts_dir()
helper for resolving the prompts directory path.

Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
@derekbarbosa derekbarbosa force-pushed the feature/custom-prompts branch from 60a7a79 to 714d1e8 Compare May 11, 2026 22:02
@derekbarbosa
Copy link
Copy Markdown
Collaborator Author

V3 -> V4:

  • split megacommit into series of bisectable ones
  • rebased onto latest main

@derekbarbosa
Copy link
Copy Markdown
Collaborator Author

@rgushchin Hi Roman, rebasing this has started to get a little hairy. I would appreciate a review whenever you have the time! 😅

@derekbarbosa derekbarbosa force-pushed the feature/custom-prompts branch from 714d1e8 to 5e848d2 Compare May 11, 2026 22:11
Add ForgeProvider trait with GitHubForge and GitLabForge
implementations for webhook handling. Add ForgeRegistry for
provider lookup. Add extract_repo_name_from_url and
extract_repo_name_from_mr_url utility functions. Add bytes
dependency for webhook body parsing.

Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
Extend AppState with Settings and ForgeRegistry. Change run_server
to accept Arc<Settings> instead of ServerSettings. Add GET /api/config
endpoint returning project metadata. Add POST /api/webhook/{provider}
for forge-agnostic webhook handling via ForgeRegistry trait dispatch.
Add slug-based patchset lookup in GET /api/patch handler. Add MR
metadata fields to FetchRequest for forge-sourced patches.

Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
Add gitlab_token and mr_metadata fields to FetchAgent. Thread
mr_url, mr_title, mr_number through extract_patch into
Event::PatchSubmitted. Add commit range splitting for base..head
ranges from forge webhooks. Inject GitLab auth tokens via oauth2
URL format in ensure_remote. Add safe.bareRepository=all to all
git subprocess calls for bare repo compatibility.

Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
… loop

Add conditional ingestor startup (disabled when forge is enabled).
Thread mr_url, mr_title, mr_number through Event::PatchSubmitted
into ParsedArticle and process_parsed_article. Add configurable
subsystem mapping via regex patterns. Add identify_subsystems_from_paths
for file-path-based subsystem detection. Use MR title as patchset
subject for forge-sourced patches. Move email policy loading
outside the DB worker loop.

Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
…validation

Add ToolBox::with_config constructor accepting ToolsSettings for
tool filtering. Add is_tool_enabled allowlist/denylist check.
Add register_custom_tools and execute_custom_tool with path
validation and dangerous pattern blocking. Filter tool declarations
and calls through is_tool_enabled gate.

Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
…riables

Add StagesConfig and StageDefinition for TOML-based stage
definitions. Add PromptRegistry::with_settings async factory
for remote prompt resolution (HTTP, git clone, local dir).
Add substitute_variables for {{variable}} template expansion.
Refactor get_stage_prompt to use fallback chain: stages.toml
config -> file-based stages -> hardcoded stages. Add stage
filtering respecting enabled flag from config. Add md5
dependency for prompt cache hashing.

Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
… tool

Change Reviewer::new to async fn new() -> Result<Self>, replacing
.expect() with ? for error propagation on BaselineRegistry and
create_provider_cached. Update main.rs call site to match on
Result. Add --prompts arg to run_review_tool using
settings.get_prompts_dir(). Update src/bin/review.rs to use
ToolBox::with_config, PromptRegistry::with_settings, and
Option<PathBuf> for --prompts arg.

Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
Add setup guides for GitHub and GitLab forge integration, custom
prompts documentation, tools configuration guide, and settings
reference. Add design document for custom prompts and tools RFC.
Add webhook testing and configuration validation scripts. Update
README with forge integration features and configuration sections.

Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
@derekbarbosa derekbarbosa force-pushed the feature/custom-prompts branch from 5e848d2 to e74a61b Compare May 11, 2026 22:15
@derekbarbosa derekbarbosa changed the title WIP: RFC: enable custom prompts/review-stages, and allow for agnostic git backends RFC: enable custom prompts/review-stages, and allow for agnostic git backends May 11, 2026
@rgushchin
Copy link
Copy Markdown
Member

Hi Derek! Totally understand, it's giant! Overall I like the direction, but I need to merge a similarly big patchset first (and it will mess a bit with stages) and also think how prompts need to be changed to go into the kernel tree. In other words, I can't merge it in the current form in next week or two at least. Which parts are most critical for you short-term, if any?

@derekbarbosa
Copy link
Copy Markdown
Collaborator Author

Hey Roman, thanks for the reply. That's understandable.

The only counter I have is that the we way we handle prompts at the moment is a bit silly with the giant match block -- this would allow us to eventually get rid of the hardcoded stages/prompts if we decide to use the new templating mechanism as the default.

If anything though, it would be nice to have the custom-git-forge backends be merged. This would really help ease some of the experiments we're running. 😅

I would like to have those traits implemented before any requests to support GitHub/GitLab or Gerrit, etc, start popping up.

@rgushchin
Copy link
Copy Markdown
Member

Can you please separate custom-git-forge backends part then? I'll merge this and we'll discuss the rest in a couple of weeks. Thank you!

@derekbarbosa
Copy link
Copy Markdown
Collaborator Author

ACK. Will push up something later this afternoon/evening :)

@derekbarbosa
Copy link
Copy Markdown
Collaborator Author

@rgushchin 1/2 -- this is the PR for custom git forges
#187

@derekbarbosa
Copy link
Copy Markdown
Collaborator Author

2/2 -- this is the RFC for the custom prompts #188

@derekbarbosa
Copy link
Copy Markdown
Collaborator Author

closing RFC

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants