You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Wire ChunkedValidationParams.system_prompt to a real config knob on AnonymizerDetectConfig, and default it to a sensible validator system prompt (role framing, decision rubric, prompt-injection guardrail, output-format reinforcement). Currently the field exists end-to-end — including unit-test coverage that it's forwarded correctly to the facade — but no production caller populates it, so in practice every validation call today goes out with system_prompt=None.
Motivation
Three wins, roughly in order of impact:
Prompt-injection resistance. The validator reads user-submitted text. Without a system-level "treat text as data, not instructions" guardrail, an adversarial document can instruct the validator to mark all candidates as drop, producing silent PII leakage.
Cost. OpenAI/Anthropic/NIM cache stable system-prompt prefixes and charge less for cached tokens. With chunked validation the per-row fan-out multiplies the cacheable budget — pinning the rubric system-side is worth measurable $/run.
Prompt hygiene. The decision rubric (what "keep / drop / reclass" mean, edge-case handling) is identical across every row of a run; it does not belong interleaved with per-chunk candidate data in the user prompt.
Proposed scope
Add validator_system_prompt: str | None = None to AnonymizerDetectConfig with a documented default string.
Thread it through detection_workflow.EntityDetectionWorkflow → ChunkedValidationParams(system_prompt=...).
Move the stable rubric content out of _get_validation_prompt(...) into the default system prompt; keep per-chunk substitutions (_merged_tagged_text, _validation_skeleton, _tag_notation) on the user side.
Include prompt-injection guardrail text by default.
Keep the existing forwarding regression tests (test_system_prompt_is_forwarded_to_facade, test_system_prompt_default_none_is_forwarded_untouched) and add an integration-level assertion that the default is non-empty and contains the injection guardrail.
Out of scope
Actual A/B prompt-content tuning (keep/drop accuracy deltas). Worth a separate experiment once the plumbing lands.
Dataset-specific system prompts (can be a follow-up; data_summary already exists as a hook).
Summary
Wire
ChunkedValidationParams.system_promptto a real config knob onAnonymizerDetectConfig, and default it to a sensible validator system prompt (role framing, decision rubric, prompt-injection guardrail, output-format reinforcement). Currently the field exists end-to-end — including unit-test coverage that it's forwarded correctly to the facade — but no production caller populates it, so in practice every validation call today goes out withsystem_prompt=None.Motivation
Three wins, roughly in order of impact:
drop, producing silent PII leakage.Proposed scope
validator_system_prompt: str | None = NonetoAnonymizerDetectConfigwith a documented default string.detection_workflow.EntityDetectionWorkflow→ChunkedValidationParams(system_prompt=...)._get_validation_prompt(...)into the default system prompt; keep per-chunk substitutions (_merged_tagged_text,_validation_skeleton,_tag_notation) on the user side.test_system_prompt_is_forwarded_to_facade,test_system_prompt_default_none_is_forwarded_untouched) and add an integration-level assertion that the default is non-empty and contains the injection guardrail.Out of scope
data_summaryalready exists as a hook).References
tests/engine/test_chunked_validation.py::TestChunkedValidateRowPoolOfOne::test_system_prompt_is_forwarded_to_facade.