fix(sdk): enforce disabled skill model invocation#3114
fix(sdk): enforce disabled skill model invocation#3114Zheng-Lu wants to merge 5 commits intoOpenHands:mainfrom
Conversation
Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
🟢 Good taste - Clean bug fix with solid implementation.
Implementation Quality: The code is well-structured, follows existing patterns, and includes comprehensive test coverage. The logic correctly enforces the documented disable-model-invocation semantics while maintaining backward compatibility.
Eval Risk Flag: This PR changes agent behavior (skill visibility in prompts + tool invocation semantics), which falls under the eval-risk category per repository guidelines. While this is fixing broken functionality and the default behavior is unchanged, I'm flagging this for a human maintainer to decide whether lightweight evals are needed before merging.
Recommendation: The implementation itself is solid and ready. A human maintainer should decide whether to run evals or merge directly given this is a bug fix for a feature that was documented but never worked.
|
[RISK ASSESSMENT] Key Factors:
Rationale: While this is a well-implemented bug fix with no breaking changes, it modifies agent behavior by:
These changes fall under the eval-risk category (prompt templates + tool execution behavior). However, the risk is mitigated by:
Recommendation: Given this is a bug fix for non-functional documented behavior, eval impact is expected to be minimal. A human maintainer should make the final call on whether lightweight evals are needed before merge. |
VascoSch92
left a comment
There was a problem hiding this comment.
Hey @Zheng-Lu
thank you very much for this PR.
There are just a couple of things that need to be change/complete.
The most important one is that the fix is incomplete for skills delivered via the agent-server. Two endpoints don't carry disable_model_invocation:
- Server response model (
openhands-agent-server/openhands/agent_server/skills_router.py:85-94)
class SkillInfo(BaseModel):
name: str
type: Literal["repo", "knowledge", "agentskills"]
content: str
triggers: list[str] = Field(default_factory=list)
source: str | None = None
description: str | None = None
is_agentskills_format: bool = False
# ← no disable_model_invocation- And the constructor at line 160-169 doesn't forward it either.
- Client deserializer (
openhands-sdk/openhands/sdk/workspace/remote/base.py:761-768)
return Skill(
name=skill_data.get("name", "unknown"),
content=skill_data.get("content", ""),
trigger=trigger,
source=skill_data.get("source"),
description=skill_data.get("description"),
is_agentskills_format=skill_data.get("is_agentskills_format", False),
# ← no disable_model_invocation
)So a SKILL.md with disable-model-invocation: true loaded through RemoteWorkspace.load_skills_from_agent_server(...) (the standard remote path) will arrive on the client with disable_model_invocation=False and be fully invocable. The bug the PR fixes is reproduced over
the wire.
Add the field to SkillInfo, populate it in the response converter, and read it in _skill_from_dict. Worth a wire-level test too.
There was a problem hiding this comment.
I think you don't need the changes in invoke_skill tool.
If the skill is not displayed in the system prompt of the model, then the model will not the use the tool on it.
There was a problem hiding this comment.
I kept the invoke_skill check intentionally because hiding the skill from <available_skills> only means the model is not told about it. The model can still guess a skill name, reuse a name from previous context, or a tool call can come from replay/resume behavior. If the executor does not check this field, a disabled skill can still run once invoke_skill is called.
So this PR uses both protections:
- Do not advertise the skill in <available_skills>.
- Reject direct invocation at runtime even if it is attempted.
I think this makes disable-model-invocation a real runtime rule, not just prompt hiding.
Co-authored-by: openhands <openhands@all-hands.dev>
…ng-Lu/software-agent-sdk into fix/disable-model-invocation
Changed as requested |
Why
disable-model-invocationskills should only activate through trigger matching. Before this PR, the SDK ignored that frontmatter field: disabled skills were still listed in<available_skills>and could still be invoked directly throughinvoke_skill.Summary
disable-model-invocation/disable_model_invocationintoSkill.disable_model_invocation.<available_skills>while preserving trigger-based activation.invoke_skillcalls for disabled skills and avoid auto-attachinginvoke_skillwhen only disabled AgentSkills are present.Issue Number
The sub-issue "
disable-model-invocationis not parsed or enforced" mentioned in #3046How to Test
Targeted tests:
Video/Screenshots
I created a temporary AgentSkills-format
SKILL.md:to check:
disable-model-invocationis parsed intoSkill.disable_model_invocation<available_skills>and is therefore visible for directinvoke_skillinvoke_skill(name="trigger-only")is rejectedBefore this PR:

This shows the bug:
disable_attr=MISSING: the field was not parsed by the Skill model.prompt_lists_skill=True: the skill was still advertised in <available_skills>.invoke_error=False: direct invoke_skill succeeded.invoked_skills=['trigger-only']: the SDK recorded the disabled skill as successfully invoked.After this PR:

This shows the fix:
disable_attr=True: the field is now parsed.prompt_lists_skill=False: the skill is no longer advertised for direct model invocation.triggered_skill_names=['trigger-only']andtrigger_content_present=True: trigger-based activation still works.invoke_error=True: direct invoke_skill is rejected.invoked_skills=[]: the rejected direct invocation is not recorded as a successful skill invocation.Pytest result: 35 passed, 1 warning in 0.21s

Pre-commit result:

Type
Notes
uv run emitted an existing warning about parsing exclude-newer = "7 days" in pyproject.toml, but the targeted tests and pre-commit checks completed successfully.