fix(tools): percent-decode filename in mcp_resource_to_file #1543

Open
Zawwarsami16 wants to merge 1 commit into anthropics:main from Zawwarsami16:fix/mcp-resource-filename-decoding

@Zawwarsami16

What's happening

mcp_resource_to_file in src/anthropic/lib/tools/mcp.py extracts a filename from the MCP resource's URI using urlparse(uri_str).path. urlparse() does not percent-decode the path component, so a resource URI like file:///docs/my%20notes.txt was returning a filename of "my%20notes.txt" instead of "my notes.txt". The same affects any non-ASCII filename — e.g. file:///%E6%97%A5%E8%A8%98.txt was returning the literal percent-encoded string instead of 日記.txt.
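
The behavior described above can be reproduced with the standard library alone. This is a minimal sketch that uses the two example URIs from this description; the variable names are illustrative and do not come from mcp.py.

```python
from urllib.parse import unquote, urlparse

# urlparse splits the URI but leaves percent-escapes in the path untouched.
raw_path = urlparse("file:///docs/my%20notes.txt").path
raw_name = raw_path.rsplit("/", 1)[-1]

# unquote decodes the escapes, interpreting the bytes as UTF-8 by default.
decoded_name = unquote(raw_name)
unicode_name = unquote(
    urlparse("file:///%E6%97%A5%E8%A8%98.txt").path.rsplit("/", 1)[-1]
)

print(raw_path)      # /docs/my%20notes.txt
print(raw_name)      # my%20notes.txt
print(decoded_name)  # my notes.txt
print(unicode_name)  # 日記.txt
```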

Downstream file-upload paths (the function's whole purpose is to feed files.upload()) generally don't recognize percent-encoded names, so the file ends up with a wrong-looking name on the user's account.

There's also a related smaller issue: when the URI ends with a / (so there is no last path segment), the previous code returned "" rather than None. The function's return-type annotation already declares tuple[str | None, bytes, str | None], so callers expect None as the "no filename" signal — getting "" instead is a footgun that downstream code can treat as a real (empty) filename.
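
To illustrate why "" and None are not interchangeable here, the sketch below takes the last path segment of a trailing-slash URI and passes it to a hypothetical caller that checks only for None. The `describe` function is an assumption made for illustration, not code from the SDK.

```python
from typing import Optional
from urllib.parse import urlparse

# A URI ending in "/" has an empty last path segment.
path = urlparse("file:///some/dir/").path
last_segment = path.rsplit("/", 1)[-1]


def describe(name: Optional[str]) -> str:
    """Hypothetical caller that treats only None as "no filename"."""
    return "no filename" if name is None else f"filename={name!r}"


print(repr(last_segment))      # ''
print(describe(last_segment))  # filename=''
print(describe(None))          # no filename
```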

The fix

path = urlparse(uri_str).path
last_segment = path.rsplit("/", 1)[-1] if path else ""
name = unquote(last_segment) if last_segment else None

That's it. urllib.parse.unquote handles both ASCII percent-encoding (%20 → space) and percent-encoded UTF-8 multi-byte sequences. The empty-string branch correctly collapses to None.
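
For readers who want to run the three lines in isolation, here is a sketch that wraps them in a standalone function. The name `filename_from_uri` is hypothetical; in the PR the logic lives inline in mcp_resource_to_file.

```python
from typing import Optional
from urllib.parse import unquote, urlparse


def filename_from_uri(uri_str: str) -> Optional[str]:
    """Hypothetical helper: return the decoded last path segment, or None."""
    path = urlparse(uri_str).path
    last_segment = path.rsplit("/", 1)[-1] if path else ""
    # Decode only a non-empty segment; an empty one means "no filename".
    return unquote(last_segment) if last_segment else None


print(filename_from_uri("file:///docs/my%20notes.txt"))  # my notes.txt
print(filename_from_uri("file:///some/dir/"))            # None
```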

Tests

Three new tests in tests/lib/tools/test_mcp_tool.py::TestMCPResourceToFile:

  • test_percent_encoded_filename_is_decoded: file:///docs/my%20notes.txt → "my notes.txt"
  • test_percent_encoded_unicode_filename_is_decoded: file:///%E6%97%A5%E8%A8%98.txt → "日記.txt"
  • test_uri_with_trailing_slash_yields_no_filename: file:///some/dir/ → None

The existing tests (which already cover plain ASCII filenames like doc.txt and img.png) keep passing, so this is purely additive on the behavior side.
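
The cases above can be summarized as a small table-driven check. This sketch exercises only the filename logic through a hypothetical stand-in, `extract_name`; the real tests call mcp_resource_to_file with MCP resource objects, which are not reproduced here. The two plain ASCII URIs are assumed examples built around the doc.txt and img.png names mentioned above.

```python
from urllib.parse import unquote, urlparse


def extract_name(uri_str):
    """Hypothetical stand-in for the filename logic in this PR."""
    path = urlparse(uri_str).path
    last_segment = path.rsplit("/", 1)[-1] if path else ""
    return unquote(last_segment) if last_segment else None


# (URI, expected filename) pairs; the last two URIs are assumed examples.
CASES = [
    ("file:///docs/my%20notes.txt", "my notes.txt"),
    ("file:///%E6%97%A5%E8%A8%98.txt", "日記.txt"),
    ("file:///some/dir/", None),
    ("file:///doc.txt", "doc.txt"),
    ("file:///img.png", "img.png"),
]

# Collect any case whose actual result differs from the expected one.
failures = [
    (uri, expected, extract_name(uri))
    for uri, expected in CASES
    if extract_name(uri) != expected
]
print(f"{len(CASES) - len(failures)} of {len(CASES)} cases passed")
```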

$ uv run pytest tests/lib/tools/test_mcp_tool.py
..........................................                              [100%]
42 passed in 1.51s

No production code paths outside mcp_resource_to_file were touched.

urlparse() returns the path component without percent-decoding, so a
resource URI like file:///docs/my%20notes.txt was yielding a filename
of "my%20notes.txt" instead of "my notes.txt". The same affected any
non-ASCII characters in the URI — %E6%97%A5%E8%A8%98.txt would come
back literal instead of as 日記.txt.

Now passes the last path segment through urllib.parse.unquote(). Also
returns None (instead of "") for URIs that have no last segment (e.g.
file:///some/dir/), which matches the documented "no filename" signal
the function already advertises in its return type.

Three new tests in tests/lib/tools/test_mcp_tool.py cover percent-encoded
ASCII, percent-encoded UTF-8, and trailing-slash URIs. All 42 tests in
test_mcp_tool.py pass.
@Zawwarsami16 Zawwarsami16 requested a review from a team as a code owner May 14, 2026 07:55