-
Notifications
You must be signed in to change notification settings - Fork 271
feat(python): eliminate SSE dependency 'httpx_sse' by hard-forking into core_utilities #9784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…parsing bugs and improved idiomatic class definition.
…SE EventStream object and better handle failed SSE parsing
| # Process any remaining data in buffer | ||
| if buffer.strip(): | ||
| line = buffer.rstrip('\r') | ||
| sse = decoder.decode(line) | ||
| if sse is not None: | ||
| yield sse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The buffer processing logic at the end doesn't handle multi-line content correctly. If the final buffer contains multiple lines without a newline separator (e.g., data received with \r but no \n), only the entire buffer will be processed as a single line. This could cause event data to be incorrectly parsed.
Consider modifying this section to handle potential multi-line content in the remaining buffer, perhaps by splitting on \r characters and processing each line individually, similar to how the main loop processes lines.
| # Process any remaining data in buffer | |
| if buffer.strip(): | |
| line = buffer.rstrip('\r') | |
| sse = decoder.decode(line) | |
| if sse is not None: | |
| yield sse | |
| # Process any remaining data in buffer | |
| if buffer.strip(): | |
| # Split the buffer by \r and process each line | |
| for line in buffer.split('\r'): | |
| if line.strip(): | |
| sse = decoder.decode(line) | |
| if sse is not None: | |
| yield sse |
Spotted by Diamond
Is this helpful? React 👍 or 👎 to let us know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is wrong, and the SSE spec is very explicit about the double newline splitting events
…api/fern into aa/in-house-sse-processing
…use-sse-processing
| # Process any remaining data in buffer | ||
| if buffer.strip(): | ||
| line = buffer.rstrip('\r') | ||
| sse = decoder.decode(line) | ||
| if sse is not None: | ||
| yield sse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The buffer processing logic at the end of the function doesn't properly handle incomplete SSE events. When the stream ends with data that doesn't include a final newline, the code attempts to process it as a complete line, but doesn't signal to the decoder that this is the end of the stream.
Consider adding a final empty line decode after processing any remaining buffer content:
# Process any remaining data in buffer
if buffer.strip():
line = buffer.rstrip('\r')
sse = decoder.decode(line)
if sse is not None:
yield sse
# Signal end of stream to finalize any pending event
final_sse = decoder.decode('')
if final_sse is not None:
yield final_sseThis ensures that any partial event in the buffer gets properly finalized and emitted.
| # Process any remaining data in buffer | |
| if buffer.strip(): | |
| line = buffer.rstrip('\r') | |
| sse = decoder.decode(line) | |
| if sse is not None: | |
| yield sse | |
| # Process any remaining data in buffer | |
| if buffer.strip(): | |
| line = buffer.rstrip('\r') | |
| sse = decoder.decode(line) | |
| if sse is not None: | |
| yield sse | |
| # Signal end of stream to finalize any pending event | |
| final_sse = decoder.decode('') | |
| if final_sse is not None: | |
| yield final_sse |
Spotted by Diamond
Is this helpful? React 👍 or 👎 to let us know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like we should catch this one with a test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually intentional. If the SSE server doesn't send a final chunk that closes out an SSE event. we don't want to force the processing of an event, which is what the graphite suggested logic would do:
# Signal end of stream to finalize any pending event
final_sse = decoder.decode('')
The current version would be a stricter way to handle SSE chunks, where the server must end on /n/n to close out events or our SDK won't process it.
| buffer += text_chunk | ||
|
|
||
| # Process complete lines | ||
| while "\n" in buffer: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you test this with a multiline event (like a JSON blob containing a content field with extra lines)? This doesn't appear to handle that to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested it out with Cohere's bug. The extra lines in the JSON blob must be escaped ("\n\n") and this works fine. If they are regular double new lines ("\n\n"), we will interpret it as a new event.
generators/python/sdk/versions.yml
Outdated
| - version: 4.31.0 | ||
| changelogEntry: | ||
| - summary: | | ||
| Eliminated SSE dependency "httpx-sse" by hard-forking the code into the generator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have it anchored in the user-facing outcome more? I also don't love "hard-forking" language in the changelog. I think the points we are trying to land are:
- Fix SSE handling of events longer than or containing escaped newlines
- Remove external dependency on httpx-sse
…api/fern into aa/in-house-sse-processing
… and added a test file 'tests/utils/test_sse_streaming.py'
…[DONE]] before instead of '[[DONE]]'
…use-sse-processing
…api/fern into aa/in-house-sse-processing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we could include this file (or a version of it) like the other as-is unit test files and generate it for all the customers who have SSE included in their APIs.
I have a moderate preference for doing that, but regardless this looks good :)
…to core_utilities (#9784) * Hard forked httpx_sse, moved into core_utilities, made fixes to line parsing bugs and improved idiomatic class definition. * updated parser to: copy http_sse code into output, use new in-house SSE EventStream object and better handle failed SSE parsing * updated changelog * Automated update of seed files * skip terminator sse events * typing fixes * Fix formatting issues: remove trailing whitespace and apply ruff formatting * Automated update of seed files * improved changelog language * SSE Support arbitrary charsets from the headers - defaulting to utf-8 * added SSE unit tests * properly copy core/http_sse to src/.../core/http_sse * better typing in test_http_sse * formatting * refactored 'server-sent-event-examples' output (seed gen was broken), and added a test file 'tests/utils/test_sse_streaming.py' * Hard forked httpx_sse, moved into core_utilities, made fixes to line parsing bugs and improved idiomatic class definition. * updated parser to: copy http_sse code into output, use new in-house SSE EventStream object and better handle failed SSE parsing * updated changelog * skip terminator sse events * Automated update of seed files * typing fixes * Fix formatting issues: remove trailing whitespace and apply ruff formatting * improved changelog language * Automated update of seed files * SSE Support arbitrary charsets from the headers - defaulting to utf-8 * added SSE unit tests * properly copy core/http_sse to src/.../core/http_sse * better typing in test_http_sse * formatting * refactored 'server-sent-event-examples' output (seed gen was broken), and added a test file 'tests/utils/test_sse_streaming.py' * Updated AsyncGenerator typing * Properly generator stream terminator as a string i.e. was returning [[DONE]] before instead of '[[DONE]]' * updated http_sse's __init__.py generation * manual seed updates to fixture 'server-sent-event-examples' * Automated update of seed files * updated sse tests * formatting fixes --------- Co-authored-by: aditya-arolkar-swe <[email protected]> Co-authored-by: fern-support <[email protected]>
Description
Generator change to yield this output: cohere-ai/cohere-python#700
Fixes FER-6826
Changes Made
httpx_sse, moved intocore_utilities, made fixes to line parsing bugs and improved idiomatic class definition.http_ssecode into output, use new in-house SSEEventStreamclass and better handle failed SSE parsingTesting