Skip to content

Simplify multimodal content handling; pass parts through to providers#94

Merged
adambalogh merged 3 commits into
claude/nifty-knuth-DgpW8from
claude/bold-cori-cQzqn
Jun 5, 2026
Merged

Simplify multimodal content handling; pass parts through to providers#94
adambalogh merged 3 commits into
claude/nifty-knuth-DgpW8from
claude/bold-cori-cQzqn

Conversation

@adambalogh
Copy link
Copy Markdown
Contributor

Summary

Refactor multimodal user message content handling to pass OpenAI-style content parts through to LangChain and providers largely unchanged, rather than converting them to LangChain standard blocks. This simplifies the enclave code path while preserving native multimodal support on the OHTTP route.

Changes

  • Remove design document: Delete docs/native-attachments-design.md (the design phase is complete; implementation details now live in code and tests)

  • Simplify content normalization (tee_gateway/llm_backend.py):

    • Replace _convert_user_content() with _normalize_user_content_parts() that passes text and image parts through as-is in their OpenAI form
    • Only rewrite file / input_file parts into LangChain standard file blocks (required for Anthropic compatibility)
    • Wrap primitive (non-dict) parts as text objects
    • Remove the "collapse text-only lists to string" behavior — preserve list structure so providers receive consistent multimodal input
  • Fix content handling edge cases (tee_gateway/llm_backend.py):

    • Explicitly check for None content values before applying or "" coercion, preserving empty lists and falsy values
  • Update tests (tee_gateway/test/test_tee_core.py):

    • Rename test_user_content_text_only_parts_collapse_to_stringtest_user_content_text_parts_passthrough and verify parts stay as a list
    • Add test_empty_user_content_list_is_preserved to ensure empty multimodal lists are not coerced to strings
    • Update image and remote URL tests to expect OpenAI image_url format (passed through unchanged) instead of LangChain standard blocks
    • Update docstrings to reflect pass-through behavior
  • Add multimodal serialization test (tests/test_structured_outputs.py):

    • New test test_hash_dict_preserves_multimodal_user_content verifies that multimodal content lists serialize consistently for request hashing

Implementation Details

  • No dependency changes: Existing pinned langchain-* versions already handle OpenAI-style image_url and file parts correctly at send time, so no PCR impact
  • Provider compatibility: Each provider's LangChain package translates OpenAI parts to its native API (Anthropic document, OpenAI file_data, etc.) automatically
  • Backward compatibility: Text-only messages still work; empty content lists are preserved rather than coerced to empty strings

https://claude.ai/code/session_01QjmXmpuiSCYrr9Lq5AYUve

dixitaniket and others added 3 commits June 3, 2026 16:43
* testing image format fix

* review fixes

* lint fix
# Conflicts:
#	tee_gateway/controllers/chat_controller.py
#	tee_gateway/llm_backend.py
#	tee_gateway/test/test_tee_core.py
Revert the bespoke image-conversion path in convert_messages to main's raw
pass-through (text/image parts already convert correctly to every provider's
native API, so images keep working untouched). Only file/PDF parts are
rewritten to LangChain standard file blocks, since Anthropic needs a
'document' block and rejects OpenAI's raw file shape.

Capability gating, the per-request size cap, and request-hash canonicalization
are retained. Drop the design doc.
@adambalogh adambalogh marked this pull request as ready for review June 5, 2026 12:57
@adambalogh adambalogh merged commit b3dbd7e into claude/nifty-knuth-DgpW8 Jun 5, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants