RFC: enable custom prompts/review-stages, and allow for agnostic git backends#141
RFC: enable custom prompts/review-stages, and allow for agnostic git backends#141derekbarbosa wants to merge 10 commits into
Conversation
7ea9ee1 to
1769c5e
Compare
|
I like the idea and was thinking about something similar! |
|
No worries. I was noticing some activity lately, so I decided to at least put up a working draft of the RFC now vs later :) |
1769c5e to
771ebc0
Compare
|
V2 -> V3:
|
771ebc0 to
60a7a79
Compare
Add slug, mr_url, mr_title, and mr_number columns to the patchsets table to support forge-based (GitHub/GitLab) merge request tracking. Extend PatchsetRow, all SELECT queries, and create_fetching_patchset with the new fields. Add get_patchset_details_by_slug and get_patchset_summary_by_slug lookup methods. Add mr_url, mr_title, and mr_number fields to Event::PatchSubmitted and ParsedArticle. Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
Add configuration structs for multi-forge integration, subsystem mapping, tool filtering, and custom prompts. Add ProjectSettings, ForgeSettings, SubsystemsSettings, ToolsSettings, PromptsSettings, and CustomToolDefinition types. Add Settings::get_prompts_dir() helper for resolving the prompts directory path. Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
60a7a79 to
714d1e8
Compare
|
V3 -> V4:
|
|
@rgushchin Hi Roman, rebasing this has started to get a little hairy. I would appreciate a review whenever you have the time! 😅 |
714d1e8 to
5e848d2
Compare
Add ForgeProvider trait with GitHubForge and GitLabForge implementations for webhook handling. Add ForgeRegistry for provider lookup. Add extract_repo_name_from_url and extract_repo_name_from_mr_url utility functions. Add bytes dependency for webhook body parsing. Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
Extend AppState with Settings and ForgeRegistry. Change run_server
to accept Arc<Settings> instead of ServerSettings. Add GET /api/config
endpoint returning project metadata. Add POST /api/webhook/{provider}
for forge-agnostic webhook handling via ForgeRegistry trait dispatch.
Add slug-based patchset lookup in GET /api/patch handler. Add MR
metadata fields to FetchRequest for forge-sourced patches.
Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
Add gitlab_token and mr_metadata fields to FetchAgent. Thread mr_url, mr_title, mr_number through extract_patch into Event::PatchSubmitted. Add commit range splitting for base..head ranges from forge webhooks. Inject GitLab auth tokens via oauth2 URL format in ensure_remote. Add safe.bareRepository=all to all git subprocess calls for bare repo compatibility. Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
… loop Add conditional ingestor startup (disabled when forge is enabled). Thread mr_url, mr_title, mr_number through Event::PatchSubmitted into ParsedArticle and process_parsed_article. Add configurable subsystem mapping via regex patterns. Add identify_subsystems_from_paths for file-path-based subsystem detection. Use MR title as patchset subject for forge-sourced patches. Move email policy loading outside the DB worker loop. Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
…validation Add ToolBox::with_config constructor accepting ToolsSettings for tool filtering. Add is_tool_enabled allowlist/denylist check. Add register_custom_tools and execute_custom_tool with path validation and dangerous pattern blocking. Filter tool declarations and calls through is_tool_enabled gate. Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
…riables
Add StagesConfig and StageDefinition for TOML-based stage
definitions. Add PromptRegistry::with_settings async factory
for remote prompt resolution (HTTP, git clone, local dir).
Add substitute_variables for {{variable}} template expansion.
Refactor get_stage_prompt to use fallback chain: stages.toml
config -> file-based stages -> hardcoded stages. Add stage
filtering respecting enabled flag from config. Add md5
dependency for prompt cache hashing.
Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
… tool Change Reviewer::new to async fn new() -> Result<Self>, replacing .expect() with ? for error propagation on BaselineRegistry and create_provider_cached. Update main.rs call site to match on Result. Add --prompts arg to run_review_tool using settings.get_prompts_dir(). Update src/bin/review.rs to use ToolBox::with_config, PromptRegistry::with_settings, and Option<PathBuf> for --prompts arg. Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
Add setup guides for GitHub and GitLab forge integration, custom prompts documentation, tools configuration guide, and settings reference. Add design document for custom prompts and tools RFC. Add webhook testing and configuration validation scripts. Update README with forge integration features and configuration sections. Signed-off-by: derekbarbosa <derekasobrab@gmail.com>
5e848d2 to
e74a61b
Compare
|
Hi Derek! Totally understand, it's giant! Overall I like the direction, but I need to merge a similarly big patchset first (and it will mess a bit with stages) and also think how prompts need to be changed to go into the kernel tree. In other words, I can't merge it in the current form in next week or two at least. Which parts are most critical for you short-term, if any? |
|
Hey Roman, thanks for the reply. That's understandable. The only counter I have is that the we way we handle prompts at the moment is a bit silly with the giant match block -- this would allow us to eventually get rid of the hardcoded stages/prompts if we decide to use the new templating mechanism as the default. If anything though, it would be nice to have the custom-git-forge backends be merged. This would really help ease some of the experiments we're running. 😅 I would like to have those traits implemented before any requests to support GitHub/GitLab or Gerrit, etc, start popping up. |
|
Can you please separate custom-git-forge backends part then? I'll merge this and we'll discuss the rest in a couple of weeks. Thank you! |
|
ACK. Will push up something later this afternoon/evening :) |
|
@rgushchin 1/2 -- this is the PR for custom git forges |
|
2/2 -- this is the RFC for the custom prompts #188 |
|
closing RFC |
Sashiko's review pipeline currently has a fixed structure: hardcoded stage
instructions, a static set of AI tools, and no mechanism for users to inject
domain-specific review guidance.
Those adopting sashiko for their own
kernel trees or non-kernel projects must either:
Fork the repository and modify prompt files directly, creating a
maintenance burden when upstream evolves.
Accept the default review pipeline as-is, missing project-specific review
criteria and tooling.
Neither option scales. Someone reviewing storage drivers needs different
guidance someone reviewing networking code, but today both run the
identical 9-stage pipeline with identical prompts.
This patch introduces a configuration-driven customization layer that sits
between the user and the review pipeline. All changes are opt-in; the default
behavior is completely unchanged when no configuration is provided.
The following features are enabled by this PR:
Custom Prompts Directory
Users can point sashiko at their own prompt files via
Settings.tomlor the--promptsCLI flag. The directory can be a local path, a Git repository URL(cloned and cached locally), or an HTTP(S) URL. The default
third_party/prompts/kerneldirectory continues to work unchanged.Stage Configuration
A new
stages.tomlfile controls the review pipeline:This allows teams to skip irrelevant stages (e.g., disable the hardware review
stage for a pure-software project) or add domain-specific stages (e.g., a
compliance analysis stage).
Template Variables
Prompt files can contain
{{variable}}placeholders that are substituted atload time from configuration. Built-in variables include
{{date}}and{{year}}. Users define custom variables in[prompts.variables]to injectorganization names, target architectures, or other project-specific context
into prompts without modifying the prompt files themselves.
Tool Filtering
Users can allowlist or denylist which built-in AI tools are available during
review. This is useful for restricting the AI to read-only operations, or
limiting tool access in constrained environments.
Custom Tool Definitions
Users can register shell-based tools that the AI can invoke during review.
Custom tools are defined in
Settings.tomlwith a JSON schema for parameters,a shell command template, and an optional path allowlist for security. This
enables integration with project-specific linters, static analyzers, or other
external tools without modifying sashiko's source.
Multi-Forge Integration via
ForgeProviderTraitPatches are currently ingested exclusively via NNTP (mailing list).
This patch adds a
ForgeProvidertrait that abstracts webhook-basedingestion from code forges, with built-in implementations for GitHub and
GitLab:
ForgeProvidertrait:validate_event(),parse_payload()— any forgethat speaks webhooks can implement this interface.
ForgeRegistry: runtime registry of providers, pre-loaded with GitHuband GitLab. Third-party providers can be registered at startup.
ForgeMetadata: normalized struct (repo_url,base_sha,head_sha,pr_number,pr_title,pr_url) — backend-agnostic representation ofa PR/MR event.
ForgeSettingsinSettings.toml:enabled,provider,webhook_secret,api_token./api/webhook/:provider): receives forge events, validatesvia the registered provider, extracts commit range, and feeds patches into
the existing review pipeline.
mr_url,mr_title,mr_number,slugcolumns onpatchsetstable for forge metadata tracking.The forge layer is entirely opt-in. When
[forge] enabled = false(default),no webhook routes are registered and the NNTP pipeline is unchanged.
This patch:
[prompts]or[tools]section inSettings.toml, sashiko operates identically to before.