Add live transcription WebSocket client with microphone support by arnavmehta7 · Pull Request #20 · Camb-ai/cambai-python-sdk

arnavmehta7 · 2026-05-26T00:31:34Z

Summary

Adds camb.live_transcription — a typed, async WebSocket client wrapping the /apis/transcription/listen endpoint. End-to-end verified against the dev backend.

What ships

import asyncio

from camb.client import CambAI
from camb.live_transcription import Microphone, ServerMessageType


async def main() -> None:
    client = CambAI(api_key=...)  # pass your API key here
    session = await client.live_transcription.connect(
        model="boli-v5",
        language="en-us",
        sample_rate=16000,
    )

    @session.on(ServerMessageType.RESULTS)
    def _(msg):
        print(msg.transcript)   # cumulative interim: replace, don't concat

    # connect() has already opened the session; async with handles cleanup.
    async with session:
        await session.stream_audio(Microphone(sample_rate=16000, chunk_size=1600))


asyncio.run(main())
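The "replace, don't concat" comment matters because each Results message carries the full transcript so far. A minimal sketch of the difference, using made-up interim strings rather than real server output:

```python
# Illustrative interim transcripts; each one repeats everything said so far.
interims = ["Welcome", "Welcome to our", "Welcome to our service."]

replaced = ""       # correct: keep only the latest cumulative text
concatenated = ""   # incorrect: appending duplicates earlier words

for transcript in interims:
    replaced = transcript
    concatenated += transcript

print(replaced)
print(concatenated)
```

Overwriting the displayed text on every Results event keeps the output in step with the server's latest hypothesis.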

Typed events — Ready, Results, Final (reserved), Error, Closed. All defined via a single ServerMessageType enum + Pydantic models + PARSER_REGISTRY. Adding a new event = three edits in one file.
Microphone helper — sounddevice-based capture, now a regular dependency.
File source — FileAudioSource paces a WAV at real-time for tests / CI.
Wildcard handler — session.on_any catches event types the SDK hasn't been updated for yet, so apps stay forward-compatible.
Custom transport — pluggable Transport protocol so tests can inject mocks; default uses websockets.
Error hierarchy — LiveTranscriptionError + …ConnectError / …ProtocolError subclasses.
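To make the enum + model + registry pattern and the wildcard handler concrete, here is a rough, self-contained sketch. It is not the SDK's code: stdlib dataclasses stand in for the Pydantic models, and the JSON wire format, the "type" field, the enum values, and the Dispatcher class are all assumptions for illustration.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict, List


class ServerMessageType(str, Enum):
    READY = "Ready"        # edit 1 of 3 for a new event: add an enum member
    RESULTS = "Results"


@dataclass
class Ready:               # edit 2 of 3: add a payload model
    type: str


@dataclass
class Results:
    type: str
    transcript: str


# Edit 3 of 3: register the parser for the new enum member.
PARSER_REGISTRY: Dict[ServerMessageType, Callable[..., Any]] = {
    ServerMessageType.READY: Ready,
    ServerMessageType.RESULTS: Results,
}


class Dispatcher:
    """Hypothetical dispatcher with typed handlers plus a wildcard hook."""

    def __init__(self) -> None:
        self._handlers: Dict[ServerMessageType, List[Callable[[Any], None]]] = {}
        self._any: List[Callable[[Any], None]] = []

    def on(self, kind: ServerMessageType):
        def register(fn):
            self._handlers.setdefault(kind, []).append(fn)
            return fn
        return register

    def on_any(self, fn):
        self._any.append(fn)
        return fn

    def dispatch(self, raw: str) -> None:
        payload = json.loads(raw)
        try:
            kind = ServerMessageType(payload.get("type"))
        except ValueError:
            # Unknown event type: hand the raw dict to wildcard handlers only.
            event: Any = payload
        else:
            event = PARSER_REGISTRY[kind](**payload)
            for fn in self._handlers.get(kind, []):
                fn(event)
        for fn in self._any:
            fn(event)


dispatcher = Dispatcher()
seen: List[str] = []
received: List[str] = []


@dispatcher.on(ServerMessageType.RESULTS)
def _on_results(msg: Results) -> None:
    seen.append(msg.transcript)


@dispatcher.on_any
def _on_any(event: Any) -> None:
    received.append(type(event).__name__)


dispatcher.dispatch('{"type": "Results", "transcript": "hello"}')
dispatcher.dispatch('{"type": "SpeakerChange", "speaker": 2}')  # not in the enum
```

The second message has a type the enum does not know, so only the wildcard handler sees it, which is the forward-compatibility behaviour the on_any bullet describes.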

Bugs found and fixed during integration testing

WebSocket race — server's Ready was being dropped when the handshake completed before the read loop attached.
wait_until_ready() hang on auth failure: a close arriving before Ready left awaiters hanging; the close path now sets the ready event so they wake up.
Double-open via async with — connect() already opens the session, so async with session: re-ran _open(), spawning a second reader task. Symptom: every event fired twice and the session wedged after a few frames. _open() is now idempotent.

Files

camb/live_transcription/
├── __init__.py          # public exports
├── client.py            # LiveTranscriptionClient (camb.live_transcription)
├── session.py           # LiveTranscriptionSession + module-level connect()
├── transport.py         # Transport protocol + WebsocketsTransport
├── events.py            # ServerMessageType + Pydantic models + PARSER_REGISTRY
├── options.py           # Encoding + ConnectOptions
├── audio_source.py      # AudioSource protocol + FileAudioSource
├── microphone.py        # Microphone (sounddevice)
└── errors.py            # LiveTranscriptionError hierarchy

examples/
├── live_transcription_microphone.py   # mic → session
└── live_transcription_file.py         # WAV → session at real-time pace

Test plan

Smoke test of imports + model parsing in py3125 conda env.
End-to-end: stream WAV → dev endpoint → received Ready + 9 cumulative Results + clean Closed(1000). Final transcript: " Welcome to our service. We're glad to have".
Negative path: invalid key → Closed(1008, "Invalid/Expired API Key") surfaced cleanly, no hang.
Post-fix verification: Ready fires exactly once, no late wedge under async with.
Reviewer: run examples/live_transcription_file.py public_docs/audio/demo-accent-en-us.wav against staging once the route is healthy.
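For test runs that should not touch the network, the pluggable Transport mentioned under "What ships" can be replaced with a fake. The sketch below assumes a send/recv/close shape for the protocol; the real method names in transport.py may differ, and the pump() helper is only a stand-in for a session.

```python
import asyncio
from typing import List, Protocol


class Transport(Protocol):
    """Assumed protocol shape; not copied from the SDK."""

    async def send(self, data: bytes) -> None: ...
    async def recv(self) -> str: ...
    async def close(self) -> None: ...


class FakeTransport:
    """Records outgoing audio and replays scripted server messages."""

    def __init__(self, scripted: List[str]) -> None:
        self._incoming = list(scripted)
        self.sent: List[bytes] = []
        self.closed = False

    async def send(self, data: bytes) -> None:
        self.sent.append(data)

    async def recv(self) -> str:
        if not self._incoming:
            raise EOFError("no more scripted messages")
        return self._incoming.pop(0)

    async def close(self) -> None:
        self.closed = True


async def pump(transport: Transport, chunks: List[bytes]) -> str:
    """Send every chunk, read one reply, then close."""
    for chunk in chunks:
        await transport.send(chunk)
    reply = await transport.recv()
    await transport.close()
    return reply


fake = FakeTransport(['{"type": "Ready"}'])
reply = asyncio.run(pump(fake, [b"\x00\x01", b"\x02\x03"]))
```

Because the fake satisfies the protocol structurally, no inheritance is needed, and assertions can inspect fake.sent and fake.closed afterwards.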

Wraps /apis/transcription/listen behind an async session with a typed event dispatcher (Ready / Results / Final / Error / Closed). New events are added in three edits: enum member, payload model, registry entry. Microphone helper backed by sounddevice as an optional extra: pip install 'camb-sdk[microphone]'.

Without this, an auth failure (server emits Closed(1008) immediately) left wait_until_ready() hanging forever. Set the ready event in _emit_close so awaiters wake up and observe the Closed event with the real reason.
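The fix can be pictured with a plain asyncio.Event. This is a simplified sketch, not the session code itself: the SessionState class is hypothetical, and raising ConnectionError from wait_until_ready() is an assumption about how the close reason might be surfaced.

```python
import asyncio
from typing import Optional, Tuple


class SessionState:
    """Hypothetical stand-in for the session's ready/close bookkeeping."""

    def __init__(self) -> None:
        self._ready = asyncio.Event()
        self.closed: Optional[Tuple[int, str]] = None

    def _emit_ready(self) -> None:
        self._ready.set()

    def _emit_close(self, code: int, reason: str) -> None:
        self.closed = (code, reason)
        # Setting the event here is the fix: awaiters wake even without Ready.
        self._ready.set()

    async def wait_until_ready(self) -> None:
        await self._ready.wait()
        if self.closed is not None:
            code, reason = self.closed
            raise ConnectionError(f"closed before Ready: {code} {reason}")


async def demo() -> str:
    state = SessionState()
    loop = asyncio.get_running_loop()
    # Simulate the server rejecting the key shortly after connect.
    loop.call_later(0.01, state._emit_close, 1008, "Invalid/Expired API Key")
    try:
        await asyncio.wait_for(state.wait_until_ready(), timeout=1.0)
    except ConnectionError as exc:
        return str(exc)
    return "ready"


outcome = asyncio.run(demo())
print(outcome)
```

Without the set() call in _emit_close, the wait_for in the demo would only return by hitting its timeout.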

- examples/live_transcription_file.py streams any local WAV to the WebSocket at real-time pace (no audio device required).
- README now lists Live Transcription in the feature bullets, the new optional [microphone] extra under Installation, an end-to-end example block (Example 5), and a pointer to the file example for environments without a mic.
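Real-time pacing comes down to sleeping for each chunk's audio duration after sending it. A minimal sketch, assuming mono 16-bit PCM; paced_chunks and its parameters are hypothetical and are not FileAudioSource's actual interface. The sleep function is injectable so the pacing can be checked without waiting.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, List, Tuple


async def paced_chunks(
    pcm: bytes,
    sample_rate: int,
    chunk_samples: int,
    sample_width: int = 2,  # bytes per sample; 2 means 16-bit PCM (assumed)
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> AsyncIterator[bytes]:
    chunk_bytes = chunk_samples * sample_width
    chunk_seconds = chunk_samples / sample_rate
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]
        # Wait as long as the chunk would take to play back.
        await sleep(chunk_seconds)


async def demo() -> Tuple[List[int], List[float]]:
    delays: List[float] = []

    async def fake_sleep(seconds: float) -> None:
        delays.append(seconds)

    pcm = bytes(9600)  # 0.3 s of silence at 16 kHz, 16-bit mono
    sizes = [
        len(chunk)
        async for chunk in paced_chunks(pcm, 16000, 1600, sleep=fake_sleep)
    ]
    return sizes, delays


sizes, delays = asyncio.run(demo())
print(sizes, delays)
```

Sleeping a fixed amount per chunk ignores the time spent sending, so a long file would drift slightly behind real time; scheduling against a monotonic clock would avoid that.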

The [microphone] extra was a single line; gating sounddevice behind it meant users got a runtime error on the first Microphone() instead of an install-time failure. Now sounddevice is a regular dep:
- Drop the [project.optional-dependencies] block in pyproject.toml.
- Add sounddevice>=0.4.6 to dependencies and requirements.txt.
- Microphone.__init__ no longer guards the import or raises MicrophoneUnavailableError; the import is top-level.
- Remove MicrophoneUnavailableError from errors.py and __init__.py exports. (Hard dep makes the error unreachable.)
- Remove the [microphone] install hint from README and example docstrings.

The resource client's connect() already opens the session before returning it; users then wrap the returned session in 'async with session:' for cleanup. Without an idempotency guard, __aenter__ re-ran _open, spawning a second reader task on the same WebSocket. Symptoms this fixes:
- Every event (Ready, Results, Closed) was fanned out twice; the Ready log line printed twice on connect.
- After a few seconds of streaming, the dual consumers raced on the underlying websockets async iterator and the session wedged silently.

Now _open early-returns if the reader task is already running, so the standard 'connect() + async with' pattern in the examples no longer double-opens.
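The guard described above can be sketched as follows. This is a simplified model rather than the session implementation: the Session class, the readers_started counter, and the sleeping _read_loop are stand-ins chosen so the example runs without a WebSocket.

```python
import asyncio
from typing import Optional


class Session:
    """Hypothetical session showing an idempotent _open()."""

    def __init__(self) -> None:
        self._reader_task: Optional[asyncio.Task] = None
        self.readers_started = 0

    async def _read_loop(self) -> None:
        await asyncio.sleep(3600)  # stand-in for iterating the WebSocket

    async def _open(self) -> None:
        # Early return when a reader is already running: the idempotency guard.
        if self._reader_task is not None and not self._reader_task.done():
            return
        self.readers_started += 1
        self._reader_task = asyncio.create_task(self._read_loop())

    async def __aenter__(self) -> "Session":
        await self._open()
        return self

    async def __aexit__(self, *exc_info) -> None:
        if self._reader_task is not None:
            self._reader_task.cancel()
            try:
                await self._reader_task
            except asyncio.CancelledError:
                pass


async def connect() -> Session:
    session = Session()
    await session._open()  # connect() hands back an already-open session
    return session


async def demo() -> int:
    session = await connect()
    async with session:  # second _open() call is a no-op
        await asyncio.sleep(0)
    return session.readers_started


started = asyncio.run(demo())
print(started)
```

Removing the early return makes readers_started reach 2, which corresponds to the duplicated events described in the symptoms.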

Arnav Mehta and others added 6 commits May 26, 2026 04:10

Unblock waitUntilReady when session closes pre-Ready

877deaf

reqs bump

d5f9b7a

arnavmehta7 merged commit fd88096 into main May 26, 2026
10 checks passed

arnavmehta7 merged 6 commits into main from feat/live-transcription-mic
