feat(realtime): add realtime speech translation WebSocket client #23

Merged: arnavmehta7 merged 3 commits into main from to-main/feat/realtime-speech-to-speech-translation-integration on May 27, 2026

@arnavmehta7

Collaborator

Implements camb.realtime, a new module that wraps the /v1/realtime WebSocket endpoint. Mirrors the live_transcription module's session, event dispatch, and transport patterns. Key behaviours:

  • connect() opens the WS, sends session.update with auth and config, and returns immediately; callers register handlers then await session.wait_until_ready() (90 s timeout) for session.created
  • Binary frames and JSON response.audio.delta both dispatch to AudioDeltaEvent with uniform data: bytes — handlers see no difference
  • session.starting is a supported event so callers can show a loading indicator during the 30+ s cold-boot of non-iris models
  • stream_audio() accepts any AsyncIterable[bytes], making it compatible with live_transcription.Microphone and FileAudioSource
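
The handshake and dispatch behaviour listed above can be illustrated with a small self-contained sketch. It is not the `camb.realtime` implementation: `SketchSession`, `feed()`, and the assumption that JSON `response.audio.delta` messages carry base64 audio in a `"delta"` field are all hypothetical, chosen only to show how binary and JSON audio can reach handlers in one shape and how a readiness wait with a 90 s timeout might be built.

```python
import asyncio
import base64
import json
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass
class AudioDeltaEvent:
    data: bytes  # raw audio bytes, however the server framed them


class SketchSession:
    """Hypothetical stand-in for a session object; not the camb.realtime class."""

    def __init__(self) -> None:
        self._ready = asyncio.Event()
        self._handlers: Dict[str, List[Callable]] = {}

    def on(self, event_type: str, handler: Callable) -> None:
        # Register a handler; several handlers per event type are allowed.
        self._handlers.setdefault(event_type, []).append(handler)

    def _emit(self, event_type: str, payload) -> None:
        for handler in self._handlers.get(event_type, []):
            handler(payload)

    def feed(self, frame: Union[bytes, str]) -> None:
        """Route one incoming WebSocket frame to the registered handlers."""
        if isinstance(frame, bytes):
            # Binary frame: the payload is already audio.
            self._emit("response.audio.delta", AudioDeltaEvent(data=frame))
            return
        message = json.loads(frame)
        kind = message.get("type")
        if kind == "response.audio.delta":
            # Assumption: JSON deltas carry base64 audio in a "delta" field.
            audio = base64.b64decode(message["delta"])
            self._emit(kind, AudioDeltaEvent(data=audio))
            return
        if kind == "session.created":
            # Unblock anyone awaiting wait_until_ready().
            self._ready.set()
        # session.starting and other events pass through unchanged.
        self._emit(kind, message)

    async def wait_until_ready(self, timeout: float = 90.0) -> None:
        # Raises asyncio.TimeoutError if session.created never arrives.
        await asyncio.wait_for(self._ready.wait(), timeout=timeout)
```

Because handlers are registered before `wait_until_ready()` is awaited, a `session.starting` handler registered this way would fire during a cold boot, which is what makes the loading-indicator case possible.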

Wires client.realtime property onto both CambAI and AsyncCambAI.

AZCamb and others added 3 commits May 26, 2026 18:36
Two runnable examples for the new camb.realtime module:
- realtime_translation_microphone.py: mic in (24kHz PCM16) -> plays
  translated audio out via sounddevice, prints translated text.
- realtime_translation_file.py: streams a 24kHz mono WAV -> writes the
  translated audio to an output WAV (headless-testable).

Both use model="iris": in live testing against realtime.camb.ai it was
the only model that returned translated audio (lilac/orchid connected but
emitted nothing; violet never reached session.created). Audio is PCM16
mono 24kHz in both directions.
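
As a rough illustration of the file-based flow, the sketch below turns a 24 kHz mono PCM16 WAV into an async iterator of raw byte chunks using only the standard library. It is not taken from `realtime_translation_file.py`; the `wav_chunks` name and the 100 ms chunk size are assumptions made for the example.

```python
import asyncio
import wave
from typing import AsyncIterator


async def wav_chunks(path: str, chunk_ms: int = 100) -> AsyncIterator[bytes]:
    """Yield raw PCM16 chunks from a 24 kHz mono WAV file."""
    with wave.open(path, "rb") as wav:
        fmt = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate())
        # Reject input that is not PCM16 (2 bytes/sample), mono, 24 kHz.
        if fmt != (1, 2, 24000):
            raise ValueError("expected PCM16 mono 24 kHz audio")
        frames_per_chunk = wav.getframerate() * chunk_ms // 1000
        while True:
            chunk = wav.readframes(frames_per_chunk)
            if not chunk:
                break
            yield chunk
            # Give other tasks (e.g. a receive loop) a turn between chunks.
            await asyncio.sleep(0)
```

Since an async generator is an `AsyncIterable[bytes]`, an object like this should be the kind of source `stream_audio()` is described as accepting, though the exact call signature is not shown in this PR. The file reads here are blocking; a real example might pace the chunks or read in a thread.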

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
examples: realtime speech-to-speech translation (mic + file)
@arnavmehta7 merged commit 3629dfc into main on May 27, 2026
10 checks passed
@AZCamb deleted the to-main/feat/realtime-speech-to-speech-translation-integration branch on June 7, 2026 00:51