Skip to content

s-ingale/sse-from-scratch

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 

Repository files navigation

SSE from Scratch

Build a Server-Sent Events (SSE) client and server from scratch to understand how streaming HTTP actually works — from raw TCP bytes to LLM token streaming.

Most people just call stream=True and move on. This project peels back each layer to show what's really happening at the HTTP level.

Roadmap

Phase 0: Networking Fundamentals

Build the mental model — what actually happens when two computers talk to each other.

  • Understand the client-server model: what is a server, what is a client, how do they find each other
  • Learn what IP addresses and ports are — how a machine knows where to send data
  • Understand TCP: a reliable, ordered pipe between two programs (vs UDP which is fire-and-forget)
  • Learn what a socket is — the programming interface to TCP (think: a file you read/write to, but it's a network connection)
  • Build a basic TCP echo server and client in Python using the socket module
  • Understand HTTP as a text protocol on top of TCP — it's just structured text (request line, headers, body) sent over a socket
  • Build a minimal HTTP server from raw sockets: read the request, send back HTTP/1.1 200 OK with a body

Phase 1: Raw HTTP Chunked Encoding

Understand the foundation — how HTTP streams data without knowing the full response size upfront.

  • Build a raw TCP server (Python socket) that sends chunked HTTP responses
  • Observe the wire format: hex chunk sizes, \r\n delimiters, zero-length terminator
  • Build a raw TCP client that reads and reassembles chunked responses
  • Compare Content-Length vs Transfer-Encoding: chunked behavior

Phase 2: SSE Protocol

Layer the SSE text format on top of chunked HTTP.

  • Implement an SSE server that sends text/event-stream responses
  • Handle SSE fields: data:, event:, id:, retry:
  • Build an SSE client that parses the event stream line by line
  • Implement auto-reconnection with Last-Event-ID
  • Handle multi-line data: fields and named events

Phase 3: LLM Streaming Client

Use everything from Phase 1 & 2 to stream real LLM API responses.

  • Make a raw HTTP POST to an LLM API with stream: true (no SDK)
  • Parse the SSE response and extract token deltas from JSON
  • Handle the [DONE] / message_stop sentinel
  • Compare OpenAI vs Anthropic streaming formats side by side
  • Build a terminal UI that renders tokens as they arrive

Phase 4: Production Concerns

Things that break in the real world.

  • Buffering: understand why proxies (NGINX, CDNs) can kill your stream
  • Timeouts: handle idle connections and keep-alive
  • Error handling: mid-stream failures, malformed events, network drops
  • Backpressure: what happens when the client can't keep up

Learning Resources

Networking Fundamentals (start here)

  1. Networking Tutorial — Ben Eater (YouTube) — Short videos building up from first principles. Best "start from absolute zero" resource.
  2. Socket Programming in Python — Real Python — Hands-on, Python-specific. Goes from "what is a socket?" to a working client-server app.
  3. Socket Programming HOWTO — Python Docs — Short, authoritative reference on what bind, listen, accept, connect actually do.
  4. Build Your Own HTTP Server — Kite Metric — Project-based: build an HTTP server from scratch, learning TCP/IP along the way.

HTTP Chunked Encoding

  1. Deep Dive: Chunked Transfer Encoding — Sahan Serasinghe
  2. HTTP Streaming: Chunked vs Store & Forward — GitHub Gist
  3. What does Transfer-Encoding: Chunked mean? — Fir3net

SSE Protocol

  1. What is SSE? — Bunny.net Academy
  2. Using Server-Sent Events — MDN
  3. SSE vs WebSockets — Ably
  4. Build a Realtime App with SSE — DigitalOcean

LLM Streaming

  1. How Streaming LLM APIs Work — Simon Willison
  2. LLM Streaming with SSE — Daniel Corin
  3. OpenAI SSE Streaming API — Better Programming
  4. Comparing Streaming Structures Across LLM APIs — Percolation Labs

Suggested Reading Order

Phase 0: Read 1 (Ben Eater videos) for the mental model, then 2 (Real Python sockets) to build something. Then 4 (Kite Metric) to build an HTTP server from raw sockets.

Phase 1–3: Read 5 → 8 → 9 → 12, then start building. That gives you enough to write a raw-socket SSE server and client. Come back to the others as reference.

Additional Tips

  • Use curl -N --raw against your own server as you build — seeing the raw bytes land in your terminal makes everything click faster than reading about it.
  • Use Wireshark or tcpdump to inspect the actual TCP packets. Seeing the chunked frames at the network level removes all mystery.
  • Read the source of httpx-sse (~100 lines) — it's a thin SSE parser on top of httpx streaming. One of the best ways to see how little code the protocol actually requires.
  • Try breaking things intentionally — send malformed chunks, kill the server mid-stream, send events without \n\n terminators. Understanding failure modes teaches the protocol better than happy-path examples.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors