Skip to content

fix(sdk): enforce disabled skill model invocation#3114

Open
Zheng-Lu wants to merge 5 commits intoOpenHands:mainfrom
Zheng-Lu:fix/disable-model-invocation
Open

fix(sdk): enforce disabled skill model invocation#3114
Zheng-Lu wants to merge 5 commits intoOpenHands:mainfrom
Zheng-Lu:fix/disable-model-invocation

Conversation

@Zheng-Lu
Copy link
Copy Markdown
Contributor

@Zheng-Lu Zheng-Lu commented May 7, 2026

  • A human has tested these changes.

Why

disable-model-invocation skills should only activate through trigger matching. Before this PR, the SDK ignored that frontmatter field: disabled skills were still listed in <available_skills> and could still be invoked directly through invoke_skill.

Summary

  • Parse disable-model-invocation / disable_model_invocation into Skill.disable_model_invocation.
  • Hide disabled skills from <available_skills> while preserving trigger-based activation.
  • Reject direct invoke_skill calls for disabled skills and avoid auto-attaching invoke_skill when only disabled AgentSkills are present.

Issue Number

The sub-issue "disable-model-invocation is not parsed or enforced" mentioned in #3046

How to Test

Targeted tests:

uv run pytest tests/sdk/skills/test_agentskills_fields.py tests/sdk/context/test_agent_context.py::TestAgentContext::test_disable_model_invocation_hides_skill_but_preserves_triggers tests/sdk/tool/test_invoke_skill.py -q

Video/Screenshots

I created a temporary AgentSkills-format SKILL.md:

name: trigger-only
description: Hidden skill
disable-model-invocation: true
triggers:
  - hidden-keyword

to check:

  • whether disable-model-invocation is parsed into Skill.disable_model_invocation
  • whether the skill appears in <available_skills> and is therefore visible for direct invoke_skill
  • whether trigger activation still works while direct invoke_skill(name="trigger-only") is rejected

Before this PR:
image

This shows the bug:

  • disable_attr=MISSING: the field was not parsed by the Skill model.
  • prompt_lists_skill=True: the skill was still advertised in <available_skills>.
  • invoke_error=False: direct invoke_skill succeeded.
  • invoked_skills=['trigger-only']: the SDK recorded the disabled skill as successfully invoked.

After this PR:
image

This shows the fix:

  • disable_attr=True: the field is now parsed.
  • prompt_lists_skill=False: the skill is no longer advertised for direct model invocation.
  • triggered_skill_names=['trigger-only'] and trigger_content_present=True: trigger-based activation still works.
  • invoke_error=True: direct invoke_skill is rejected.
  • invoked_skills=[]: the rejected direct invocation is not recorded as a successful skill invocation.

Pytest result: 35 passed, 1 warning in 0.21s
image

Pre-commit result:
image

Type

  • Bug fix
  • Feature
  • Refactor
  • Breaking change
  • Docs / chore

Notes

uv run emitted an existing warning about parsing exclude-newer = "7 days" in pyproject.toml, but the targeted tests and pre-commit checks completed successfully.

Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Good taste - Clean bug fix with solid implementation.

Implementation Quality: The code is well-structured, follows existing patterns, and includes comprehensive test coverage. The logic correctly enforces the documented disable-model-invocation semantics while maintaining backward compatibility.

Eval Risk Flag: This PR changes agent behavior (skill visibility in prompts + tool invocation semantics), which falls under the eval-risk category per repository guidelines. While this is fixing broken functionality and the default behavior is unchanged, I'm flagging this for a human maintainer to decide whether lightweight evals are needed before merging.

Recommendation: The implementation itself is solid and ready. A human maintainer should decide whether to run evals or merge directly given this is a bug fix for a feature that was documented but never worked.

@all-hands-bot
Copy link
Copy Markdown
Collaborator

[RISK ASSESSMENT]

⚠️ Risk Level: 🟡 MEDIUM (Eval Risk)

Key Factors:

  • Security: No security concerns
  • Breaking Changes: None - backward compatible (defaults to False)
  • Code Quality: Clean implementation following existing patterns
  • Test Coverage: Comprehensive tests included
  • ⚠️ Eval Risk: Changes skill visibility in <available_skills> and tool invocation semantics

Rationale: While this is a well-implemented bug fix with no breaking changes, it modifies agent behavior by:

  1. Changing what skills appear in the system prompt (<available_skills> section)
  2. Changing which skills can be invoked via invoke_skill

These changes fall under the eval-risk category (prompt templates + tool execution behavior). However, the risk is mitigated by:

  • This fixes a documented but broken feature
  • Default behavior is unchanged (disable_model_invocation=False)
  • Manual testing confirms expected behavior
  • Unlikely any current benchmarks rely on this broken feature

Recommendation: Given this is a bug fix for non-functional documented behavior, eval impact is expected to be minimal. A human maintainer should make the final call on whether lightweight evals are needed before merge.

Copy link
Copy Markdown
Contributor

@VascoSch92 VascoSch92 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @Zheng-Lu

thank you very much for this PR.

There are just a couple of things that need to be change/complete.

The most important one is that the fix is incomplete for skills delivered via the agent-server. Two endpoints don't carry disable_model_invocation:

  • Server response model (openhands-agent-server/openhands/agent_server/skills_router.py:85-94)
class SkillInfo(BaseModel):                                                                                                                                                                                                                                                  
    name: str                                                                                                                                                                                                                                                                
    type: Literal["repo", "knowledge", "agentskills"]                      
    content: str                                                                                                                                                                                                                                                             
    triggers: list[str] = Field(default_factory=list)                      
    source: str | None = None                                                                                                                                                                                                                                                
    description: str | None = None
    is_agentskills_format: bool = False                                                                                                                                                                                                                                      
    # ← no disable_model_invocation
  • And the constructor at line 160-169 doesn't forward it either.
  • Client deserializer (openhands-sdk/openhands/sdk/workspace/remote/base.py:761-768)
return Skill(                                                                                                                                                                                                                                                                
    name=skill_data.get("name", "unknown"),                                                                                                                                                                                                                                  
    content=skill_data.get("content", ""),                                                                                                                                                                                                                                   
    trigger=trigger,                                                       
    source=skill_data.get("source"),                                                                                                                                                                                                                                         
    description=skill_data.get("description"),                                                                                                                                                                                                                               
    is_agentskills_format=skill_data.get("is_agentskills_format", False),
    # ← no disable_model_invocation                                                                                                                                                                                                                                          
)

So a SKILL.md with disable-model-invocation: true loaded through RemoteWorkspace.load_skills_from_agent_server(...) (the standard remote path) will arrive on the client with disable_model_invocation=False and be fully invocable. The bug the PR fixes is reproduced over
the wire.

Add the field to SkillInfo, populate it in the response converter, and read it in _skill_from_dict. Worth a wire-level test too.

Comment thread openhands-sdk/openhands/sdk/context/agent_context.py
Comment thread openhands-sdk/openhands/sdk/skills/skill.py Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you don't need the changes in invoke_skill tool.

If the skill is not displayed in the system prompt of the model, then the model will not the use the tool on it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept the invoke_skill check intentionally because hiding the skill from <available_skills> only means the model is not told about it. The model can still guess a skill name, reuse a name from previous context, or a tool call can come from replay/resume behavior. If the executor does not check this field, a disabled skill can still run once invoke_skill is called.

So this PR uses both protections:

  1. Do not advertise the skill in <available_skills>.
  2. Reject direct invocation at runtime even if it is attempted.

I think this makes disable-model-invocation a real runtime rule, not just prompt hiding.

@Zheng-Lu
Copy link
Copy Markdown
Contributor Author

Zheng-Lu commented May 8, 2026

Hey @Zheng-Lu

thank you very much for this PR.

There are just a couple of things that need to be change/complete.

The most important one is that the fix is incomplete for skills delivered via the agent-server. Two endpoints don't carry disable_model_invocation:

  • Server response model (openhands-agent-server/openhands/agent_server/skills_router.py:85-94)
class SkillInfo(BaseModel):                                                                                                                                                                                                                                                  
    name: str                                                                                                                                                                                                                                                                
    type: Literal["repo", "knowledge", "agentskills"]                      
    content: str                                                                                                                                                                                                                                                             
    triggers: list[str] = Field(default_factory=list)                      
    source: str | None = None                                                                                                                                                                                                                                                
    description: str | None = None
    is_agentskills_format: bool = False                                                                                                                                                                                                                                      
    # ← no disable_model_invocation
  • And the constructor at line 160-169 doesn't forward it either.
  • Client deserializer (openhands-sdk/openhands/sdk/workspace/remote/base.py:761-768)
return Skill(                                                                                                                                                                                                                                                                
    name=skill_data.get("name", "unknown"),                                                                                                                                                                                                                                  
    content=skill_data.get("content", ""),                                                                                                                                                                                                                                   
    trigger=trigger,                                                       
    source=skill_data.get("source"),                                                                                                                                                                                                                                         
    description=skill_data.get("description"),                                                                                                                                                                                                                               
    is_agentskills_format=skill_data.get("is_agentskills_format", False),
    # ← no disable_model_invocation                                                                                                                                                                                                                                          
)

So a SKILL.md with disable-model-invocation: true loaded through RemoteWorkspace.load_skills_from_agent_server(...) (the standard remote path) will arrive on the client with disable_model_invocation=False and be fully invocable. The bug the PR fixes is reproduced over the wire.

Add the field to SkillInfo, populate it in the response converter, and read it in _skill_from_dict. Worth a wire-level test too.

Changed as requested

@Zheng-Lu Zheng-Lu requested a review from VascoSch92 May 8, 2026 10:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants