fix(tools): percent-decode filename in mcp_resource_to_file #1543
Open
Zawwarsami16 wants to merge 1 commit into
urlparse() returns the path component without percent-decoding, so a resource URI like file:///docs/my%20notes.txt was yielding a filename of "my%20notes.txt" instead of "my notes.txt". The same affected any non-ASCII characters in the URI: %E6%97%A5%E8%A8%98.txt would come back literal instead of as 日記.txt.

Now passes the last path segment through urllib.parse.unquote(). Also returns None (instead of "") for URIs that have no last segment (e.g. file:///some/dir/), which matches the documented "no filename" signal the function already advertises in its return type.

Three new tests in tests/lib/tools/test_mcp_tool.py cover percent-encoded ASCII, percent-encoded UTF-8, and trailing-slash URIs. All 42 tests in test_mcp_tool.py pass.
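The behavior described above can be reproduced with the standard library alone. The snippet below is a minimal sketch, not code from this PR: the helper name last_segment is hypothetical, and splitting the path on "/" is only an assumption about how the last segment is taken.

```python
from urllib.parse import unquote, urlparse


def last_segment(uri: str) -> str:
    """Return the raw (still percent-encoded) last path segment of a URI.

    Hypothetical helper for illustration only.
    """
    return urlparse(uri).path.rsplit("/", 1)[-1]


ascii_raw = last_segment("file:///docs/my%20notes.txt")
utf8_raw = last_segment("file:///%E6%97%A5%E8%A8%98.txt")

# urlparse() leaves the percent-escapes in place.
print(ascii_raw)                         # my%20notes.txt
print(utf8_raw)                          # %E6%97%A5%E8%A8%98.txt

# unquote() decodes them (UTF-8 is its default encoding).
print(unquote(ascii_raw))                # my notes.txt
print(unquote(utf8_raw) == "日記.txt")   # True
```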
What's happening
mcp_resource_to_file in src/anthropic/lib/tools/mcp.py extracts a filename from the MCP resource's URI using urlparse(uri_str).path. urlparse() does not percent-decode the path component, so a resource URI like file:///docs/my%20notes.txt was returning a filename of "my%20notes.txt" instead of "my notes.txt". The same affects any non-ASCII filename: for example, file:///%E6%97%A5%E8%A8%98.txt was returning the literal percent-encoded string instead of 日記.txt.

Downstream file-upload paths (the function's whole purpose is to feed files.upload()) generally don't recognize percent-encoded names, so the file ends up with a wrong-looking name on the user's account.

There's also a related smaller issue: when the URI ends with a / (so there is no last path segment), the previous code returned "" rather than None. The function's return-type annotation already declares tuple[str | None, bytes, str | None], so callers expect None as the "no filename" signal; getting "" instead is a footgun that downstream code can treat as a real (empty) filename.

The fix
That's it.
urllib.parse.unquote handles both ASCII percent-encoding (%20 → space) and percent-encoded UTF-8 multi-byte sequences. The empty-string branch correctly collapses to None.

Tests
Three new tests in tests/lib/tools/test_mcp_tool.py::TestMCPResourceToFile:
- test_percent_encoded_filename_is_decoded: file:///docs/my%20notes.txt → "my notes.txt"
- test_percent_encoded_unicode_filename_is_decoded: file:///%E6%97%A5%E8%A8%98.txt → "日記.txt"
- test_uri_with_trailing_slash_yields_no_filename: file:///some/dir/ → None

The existing tests (which already cover plain ASCII filenames like doc.txt and img.png) keep passing, so this is purely additive on the behavior side.

No production code paths outside mcp_resource_to_file were touched.
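
For readers who want to try the decoded-filename behavior in isolation, here is a rough standalone sketch. It is not the PR's diff and not the real mcp_resource_to_file: the function name filename_from_uri is hypothetical, and it only models the filename part of the return tuple, assuming the last path segment is taken by splitting on "/".

```python
from typing import Optional
from urllib.parse import unquote, urlparse


def filename_from_uri(uri_str: str) -> Optional[str]:
    """Hypothetical stand-in for the filename logic described in this PR.

    Takes the last path segment, percent-decodes it, and collapses an
    empty result to None so callers get the "no filename" signal.
    """
    last_segment = urlparse(uri_str).path.rsplit("/", 1)[-1]
    return unquote(last_segment) or None


# The three cases the new tests are described as covering.
cases = {
    "file:///docs/my%20notes.txt": "my notes.txt",
    "file:///%E6%97%A5%E8%A8%98.txt": "日記.txt",
    "file:///some/dir/": None,
}

for uri, expected in cases.items():
    assert filename_from_uri(uri) == expected
```

The `or None` idiom relies on the empty string being falsy, so a trailing-slash URI yields None rather than "".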