Add live transcription WebSocket client with microphone support #20
Merged
Wraps /apis/transcription/listen behind an async session with a typed event dispatcher (Ready / Results / Final / Error / Closed). New events are added in three edits: enum member, payload model, registry entry. Microphone helper backed by sounddevice as an optional extra: pip install 'camb-sdk[microphone]'.
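The "three edits" pattern described above might look roughly like the following. This is a minimal sketch rather than the SDK's code: it uses stdlib dataclasses in place of Pydantic models so it runs without extra dependencies, and the payload field names (`session_id`, `transcript`, `code`, `reason`), the `Dispatcher` class, and the `(name, payload)` signature of the catch-all handlers are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict, List


# Edit 1: add an enum member for the new event type.
class ServerMessageType(str, Enum):
    READY = "Ready"
    RESULTS = "Results"
    CLOSED = "Closed"


# Edit 2: add a payload model (hypothetical fields, dataclasses stand in for Pydantic).
@dataclass
class ReadyEvent:
    session_id: str


@dataclass
class ResultsEvent:
    transcript: str


@dataclass
class ClosedEvent:
    code: int
    reason: str


# Edit 3: register a parser for the new type.
PARSER_REGISTRY: Dict[ServerMessageType, Callable[[Dict[str, Any]], Any]] = {
    ServerMessageType.READY: lambda d: ReadyEvent(session_id=d["session_id"]),
    ServerMessageType.RESULTS: lambda d: ResultsEvent(transcript=d["transcript"]),
    ServerMessageType.CLOSED: lambda d: ClosedEvent(code=d["code"], reason=d["reason"]),
}


class Dispatcher:
    """Hypothetical dispatcher: typed handlers plus a catch-all."""

    def __init__(self) -> None:
        self._handlers: Dict[ServerMessageType, List[Callable[[Any], None]]] = {}
        self._any: List[Callable[[str, Any], None]] = []

    def on(self, kind: ServerMessageType, handler: Callable[[Any], None]) -> None:
        self._handlers.setdefault(kind, []).append(handler)

    def on_any(self, handler: Callable[[str, Any], None]) -> None:
        self._any.append(handler)

    def dispatch(self, raw: Dict[str, Any]) -> None:
        name = raw.get("type", "")
        try:
            kind = ServerMessageType(name)
        except ValueError:
            kind = None  # unknown type: only catch-all handlers see it
        payload = PARSER_REGISTRY[kind](raw) if kind is not None else raw
        if kind is not None:
            for handler in self._handlers.get(kind, []):
                handler(payload)
        for handler in self._any:
            handler(name, payload)
```

Unknown message types fall through to the catch-all handlers with the raw dict, which is one way to keep applications working when the server adds an event before the client is updated.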
Without this, an auth failure (server emits Closed(1008) immediately) left wait_until_ready() hanging forever. Set the ready event in _emit_close so awaiters wake up and observe the Closed event with the real reason.
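The fix described above can be illustrated with a small asyncio sketch. The `ReadySession` class, its attributes, and the choice to have `wait_until_ready()` return a boolean are assumptions for illustration; the real session may surface the close differently (for example through the Closed event handler).

```python
import asyncio
from typing import Optional, Tuple


class ReadySession:
    """Hypothetical stand-in for the session's ready/close bookkeeping."""

    def __init__(self) -> None:
        self._ready = asyncio.Event()
        self.closed: Optional[Tuple[int, str]] = None

    def _emit_ready(self) -> None:
        self._ready.set()

    def _emit_close(self, code: int, reason: str) -> None:
        self.closed = (code, reason)
        # Also set the ready event so anyone blocked in wait_until_ready()
        # wakes up and can observe the close instead of hanging.
        self._ready.set()

    async def wait_until_ready(self) -> bool:
        await self._ready.wait()
        # True only if we became ready without the server closing first.
        return self.closed is None


async def demo_auth_failure() -> Tuple[bool, Optional[Tuple[int, str]]]:
    session = ReadySession()
    # Simulate the server closing with 1008 before Ready ever arrives.
    asyncio.get_running_loop().call_soon(
        session._emit_close, 1008, "Invalid/Expired API Key"
    )
    ready = await asyncio.wait_for(session.wait_until_ready(), timeout=1.0)
    return ready, session.closed
```

Without the `self._ready.set()` call in `_emit_close`, the `wait_for` in the demo would time out, which mirrors the hang described in the commit message.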
- examples/live_transcription_file.py streams any local WAV to the WebSocket at real-time pace (no audio device required).
- README now lists Live Transcription in the feature bullets, the new optional [microphone] extra under Installation, an end-to-end example block (Example 5), and a pointer to the file example for environments without a mic.
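Streaming a WAV "at real-time pace" could be sketched as below. This is not the code in examples/live_transcription_file.py; the function name `paced_wav_chunks`, the 100 ms default chunk size, and the injectable `sleep` parameter are assumptions chosen to keep the sketch testable.

```python
import asyncio
import wave
from typing import AsyncIterator, Awaitable, Callable


async def paced_wav_chunks(
    source,  # a path or a binary file-like object accepted by wave.open
    chunk_ms: int = 100,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> AsyncIterator[bytes]:
    """Yield raw PCM chunks, sleeping for each chunk's audio duration."""
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        bytes_per_frame = wav.getsampwidth() * wav.getnchannels()
        frames_per_chunk = max(1, rate * chunk_ms // 1000)
        while True:
            chunk = wav.readframes(frames_per_chunk)
            if not chunk:
                break
            yield chunk
            # Wait as long as the chunk would take to play, so the consumer
            # receives audio no faster than a live microphone would send it.
            frames_sent = len(chunk) // bytes_per_frame
            await sleep(frames_sent / rate)
```

A caller would typically write `async for chunk in paced_wav_chunks("audio.wav"): ...` and forward each chunk to the session. Sleeping per chunk ignores the time spent sending, so a long file drifts slightly behind real time; a production version might pace against a monotonic clock instead.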
The [microphone] extra was a single line; gating sounddevice behind it meant users got a runtime error on the first Microphone() instead of an install-time failure. Now sounddevice is a regular dep:
- Drop the [project.optional-dependencies] block in pyproject.toml.
- Add sounddevice>=0.4.6 to dependencies and requirements.txt.
- Microphone.__init__ no longer guards the import or raises MicrophoneUnavailableError; the import is top-level.
- Remove MicrophoneUnavailableError from errors.py and __init__.py exports. (Hard dep makes the error unreachable.)
- Remove the [microphone] install hint from README and example docstrings.
The resource client's connect() already opens the session before returning it; users then wrap the returned session in 'async with session:' for cleanup. Without an idempotency guard, __aenter__ re-ran _open, spawning a second reader task on the same WebSocket.
Symptoms this fixes:
- Every event (Ready, Results, Closed) was fanned out twice; the Ready log line printed twice on connect.
- After a few seconds of streaming, the dual consumers raced on the underlying websockets async iterator and the session wedged silently.
Now _open early-returns if the reader task is already running, so the standard 'connect() + async with' pattern in the examples no longer double-opens.
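The idempotency guard can be sketched as follows. `GuardedSession`, its `open_count` counter, and the sleeping `_read_loop` are hypothetical stand-ins; only the early return in `_open` reflects the fix described above.

```python
import asyncio
from typing import Optional


class GuardedSession:
    """Hypothetical session showing an idempotent _open()."""

    def __init__(self) -> None:
        self._reader_task: Optional[asyncio.Task] = None
        self.open_count = 0  # how many reader tasks were actually spawned

    async def _read_loop(self) -> None:
        await asyncio.sleep(3600)  # stand-in for iterating the WebSocket

    async def _open(self) -> None:
        # Guard: if a reader task is already running, do nothing.
        if self._reader_task is not None and not self._reader_task.done():
            return
        self.open_count += 1
        self._reader_task = asyncio.create_task(self._read_loop())

    async def close(self) -> None:
        if self._reader_task is not None:
            self._reader_task.cancel()
            try:
                await self._reader_task
            except asyncio.CancelledError:
                pass
            self._reader_task = None

    async def __aenter__(self) -> "GuardedSession":
        await self._open()  # no-op when connect() already opened the session
        return self

    async def __aexit__(self, *exc_info) -> None:
        await self.close()


async def connect() -> GuardedSession:
    session = GuardedSession()
    await session._open()
    return session


async def demo() -> int:
    session = await connect()
    async with session:  # would have spawned a second reader without the guard
        await asyncio.sleep(0)
    return session.open_count
```

With the guard in place, `asyncio.run(demo())` spawns a single reader task even though both `connect()` and `__aenter__` call `_open()`.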
Summary
Adds camb.live_transcription, a typed, async WebSocket client wrapping the /apis/transcription/listen endpoint. End-to-end verified against the dev backend.
What ships
- Ready, Results, Final (reserved), Error, Closed. All defined via a single ServerMessageType enum + Pydantic models + PARSER_REGISTRY. Adding a new event = three edits in one file.
- sounddevice-based capture, now a regular dependency.
- FileAudioSource paces a WAV at real-time for tests / CI.
- session.on_any catches event types the SDK hasn't been updated for yet, so apps stay forward-compatible.
- Transport protocol so tests can inject mocks; default uses websockets.
- LiveTranscriptionError + …ConnectError / …ProtocolError subclasses.
Bugs found and fixed during integration testing
- Ready was being dropped when the handshake completed before the read loop attached.
- wait_until_ready hang on auth failure: a close before Ready left awaiters hanging; the close path now sets the ready event so they wake up.
- async with: connect() already opens the session, so async with session: re-ran _open(), spawning a second reader task. Symptom: every event fired twice and the session wedged after a few frames. _open() is now idempotent.
Files
Test plan
- py3125 conda env.
- Ready + 9 cumulative Results + clean Closed(1000). Final transcript: " Welcome to our service. We're glad to have".
- Closed(1008, "Invalid/Expired API Key") surfaced cleanly, no hang.
- Ready fires exactly once, no late wedge under async with.
- examples/live_transcription_file.py public_docs/audio/demo-accent-en-us.wav against staging once the route is healthy.
Related
🤖 Generated with Claude Code