The official Python SDK for interacting with Camb AI's powerful voice and audio generation APIs. Create expressive speech, unique voices, and rich soundscapes with just a few lines of Python.
- Dubbing: Dub your videos into multiple languages with voice cloning!
- Expressive Text-to-Speech: Convert text into natural-sounding speech using a wide range of pre-existing voices.
- Generative Voices: Create entirely new, unique voices from text prompts and descriptions.
- Soundscapes from Text: Generate ambient audio and sound effects from textual descriptions.
- Live Transcription: Stream microphone or file audio over a WebSocket and receive cumulative interim transcripts, word-level timing, and typed events.
- Access to voice cloning, translation, and more (refer to full API documentation).
Install the SDK using pip (requires Python 3.9+):

pip install camb-sdk

Or install directly from GitHub:

pip install git+https://github.com/Camb-ai/cambai-python-sdk

To use the Camb AI SDK, you'll need an API key. Pass it to the client when you create it:

from camb.client import CambAI, AsyncCambAI
# Synchronous Client
client = CambAI(api_key="YOUR_CAMB_API_KEY")
# Asynchronous Client
async_client = AsyncCambAI(api_key="YOUR_CAMB_API_KEY")

To use Baseten as the TTS provider, first deploy the model from Baseten's model library (for example, https://app.baseten.co/deploy/mars8-flash), then configure the client as shown below:

import base64

client_baseten = CambAI(
    tts_provider="baseten",
    provider_params={
        "api_key": "YOUR_BASETEN_API_KEY",
        "mars_url": "YOUR_BASETEN_URL"
    }
)

# Read the reference audio and encode it as base64
with open("audio.wav", "rb") as f:
    reference_audio = base64.b64encode(f.read()).decode("utf-8")

# Call TTS with Baseten
client_baseten.text_to_speech.tts(
    text="Hello World and my dear friends",
    language="en-us",
    speech_model="mars-flash",
    request_options={
        "additional_body_parameters": {
            "reference_audio": reference_audio,  # public/signed URLs are also supported
            "reference_language": "en-us"  # required
        },
        "timeout_in_seconds": 300
    }
)

Other providers, such as Vertex, are configured through the same parameters:

client_with_provider = CambAI(
    tts_provider="vertex",
    provider_params={"project_id": "my-project", "location": "us-central1"}
)

NOTE: For more examples and complete, ready-to-run files, refer to the examples/ directory.
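
The setup snippets above pass the API key as a string literal. A minimal sketch of reading it from the environment instead is shown here. The helper name `load_api_key` is hypothetical and not part of the SDK; the `CAMB_API_KEY` variable name follows the live transcription example later in this README.

```python
import os


def load_api_key(env_var: str = "CAMB_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise a clear error.

    Hypothetical helper, not part of the SDK.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Camb AI API key."
        )
    return key


# Usage (assumes the SDK is installed):
# from camb.client import CambAI
# client = CambAI(api_key=load_api_key())
```

This keeps the key out of source control and fails early with a readable message when the variable is missing.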
Convert text into spoken audio using one of Camb AI's high-quality voices.
| Model Name | Sample Rate | Description |
|---|---|---|
| mars-pro | 48kHz | High-fidelity, professional-grade speech synthesis. Ideal for long-form content and dubbing. |
| mars-8.1-pro-beta | 48kHz | Beta MARS Pro model. Try this model with the same source references, as it may perform much better for pronunciation, expressiveness with high-pitch references, overall prosody, accent control, and coverage. |
| mars-8.1-flash-beta | 48kHz | Beta MARS Pro model with faster speed. Try this model with the same source references, as it may perform much better for pronunciation, expressiveness with high-pitch references, overall prosody, accent control, and coverage. |
| mars-instruct | 22.05kHz | Optimized for instruction-following and nuance control. |
| mars-flash | 22.05kHz | Low-latency model optimized for real-time applications and conversational AI. |
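
If you post-process the generated audio (for example, resampling it or writing raw samples yourself), it can help to have the sample rates from the table above available in code. The following is a small sketch only: the `MODEL_SAMPLE_RATES` mapping and `sample_rate_for` helper are hypothetical names, not SDK features, and the values are copied from the table.

```python
# Sample rates in Hz, copied from the model table above.
MODEL_SAMPLE_RATES = {
    "mars-pro": 48_000,
    "mars-8.1-pro-beta": 48_000,
    "mars-8.1-flash-beta": 48_000,
    "mars-instruct": 22_050,
    "mars-flash": 22_050,
}


def sample_rate_for(model: str) -> int:
    """Look up the documented sample rate for a model name.

    Raises ValueError for names that are not in the table.
    """
    try:
        return MODEL_SAMPLE_RATES[model]
    except KeyError:
        known = ", ".join(sorted(MODEL_SAMPLE_RATES))
        raise ValueError(f"Unknown model {model!r}; known models: {known}") from None


print(sample_rate_for("mars-flash"))
```

The table does not list a sample rate for the `auto` option, so the helper rejects it rather than guessing.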
from camb.client import CambAI, save_stream_to_file
from camb.types.stream_tts_output_configuration import StreamTtsOutputConfiguration

# Initialize client (ensure API key is set)
client = CambAI(api_key="YOUR_CAMB_API_KEY")

response = client.text_to_speech.tts(
    text="Hello from Camb AI! This is a test of our Text-to-Speech API.",
    voice_id=20303,  # Example voice ID; get one from client.voice_cloning.list_voices()
    language="en-us",
    speech_model="mars-8.1-flash-beta",  # options: mars-pro, mars-8.1-pro-beta, mars-8.1-flash-beta, mars-flash, mars-instruct, auto
    output_configuration=StreamTtsOutputConfiguration(
        format="mp3"
    )
)

save_stream_to_file(response, "tts_output.mp3")
print("Success! Audio saved to tts_output.mp3")

You can also stream audio asynchronously using AsyncCambAI.

import asyncio
from camb.client import AsyncCambAI, save_async_stream_to_file
from camb.types.stream_tts_output_configuration import StreamTtsOutputConfiguration

async_client = AsyncCambAI(api_key="YOUR_CAMB_API_KEY")

async def main():
    response = async_client.text_to_speech.tts(
        text="Hello, this is a test of the text to audio streaming capabilities.",
        language="en-us",
        speech_model="mars-8.1-flash-beta",  # options: mars-pro, mars-8.1-pro-beta, mars-8.1-flash-beta, mars-flash, mars-instruct, auto
        voice_id=147319,
        output_configuration=StreamTtsOutputConfiguration(
            format="mp3"
        )
    )
    await save_async_stream_to_file(response, "text_to_audio_output.mp3")
    print("Success! Audio saved to text_to_audio_output.mp3")

asyncio.run(main())

For applications that need faster responses, use mars-flash (22.05kHz).

response = client.text_to_speech.tts(
    text="Hey! I can respond much faster.",
    language="en-us",
    speech_model="mars-flash",
    voice_id=20303,  # Example voice ID; replace with your own
    output_configuration=StreamTtsOutputConfiguration(
        format="wav"
    )
)

You can list available voices to find a voice_id that suits your needs:

voices = client.voice_cloning.list_voices()
print(f"Found {len(voices)} voices:")
for voice in voices[:5]:  # Print first 5 as an example
    # Single quotes inside the f-string keep this valid on Python 3.9 to 3.11
    print(f" - ID: {voice['id']}, Name: {voice['voice_name']}, Gender: {voice['gender']}, Language: {voice['language']}")

Create completely new and unique voices from a textual description of the desired voice characteristics.

from camb.client import CambAI

# Initialize client
client = CambAI(api_key="YOUR_CAMB_API_KEY")

try:
    print("Generating a new voice and speech...")
    # Returns 3 sample URLs
    result = client.text_to_voice.create_text_to_voice(
        text="Crafting a truly unique and captivating voice that carries a subtle air of mystery, depth, and gentle warmth.",
        voice_description="A smooth, rich baritone voice layered with a soft echo, ideal for immersive storytelling and emotional depth.",
    )
    print(result)
except Exception as e:
    print(f"Exception when calling text_to_voice: {e}\n")

Generate sound effects or ambient audio from a descriptive prompt.

from camb.client import save_stream_to_file
import time

response = client.text_to_audio.create_text_to_audio(
    prompt="A gentle breeze rustling through autumn leaves in a quiet forest.",
    duration=10,
    audio_type="sound"
)
task_id = response.task_id

if task_id:
    # Poll until the task succeeds. This loop exits only on SUCCESS,
    # so add failure or timeout handling for production use.
    while True:
        status = client.text_to_audio.get_text_to_audio_status(task_id=task_id)
        if status.status == "SUCCESS":
            result = client.text_to_audio.get_text_to_audio_result(status.run_id)
            save_stream_to_file(result, "sound_effect.mp3")
            print("Success! Sound effect saved to sound_effect.mp3")
            break
        time.sleep(2)

Dub videos into different languages with voice cloning and translation capabilities.

import time
from camb.types.language_enums import Languages

response = client.dub.create_dub(
    video_url="your_accessible_video_url",
    source_language=Languages.EN_US,  # English (or check client.languages.get_source_languages())
    # Pass a list such as [Languages.HI_IN, Languages.FR_FR]; for a single
    # language you can use target_language=Languages.HI_IN instead.
    target_languages=[Languages.HI_IN],
)
task_id = response.task_id
print(f"Dub Task created with ID: {task_id}")

while True:
    status_response = client.dub.get_dubbing_status(task_id=task_id)
    print(f"Current Status: {status_response.status}")
    if status_response.status == "SUCCESS":
        dubbed_run_info = client.dub.get_dubbed_run_info(status_response.run_id)
        print(f"Dubbed Audio URL: {dubbed_run_info.audio_url}")
        print(f"Transcript: {dubbed_run_info.transcript}")
        print(f"Dubbed Video URL: {dubbed_run_info.video_url}")
        break
    time.sleep(5)

Stream audio over a single WebSocket and receive cumulative interim
transcripts, word-level timing, and typed events. The session exposes a
microphone helper, a file source for tests, and the same on(event)
dispatcher in both SDKs.

import asyncio
import os
from camb.client import CambAI
from camb.live_transcription import Microphone, ServerMessageType

async def main():
    client = CambAI(api_key=os.environ["CAMB_API_KEY"])
    session = await client.live_transcription.connect(
        model="boli-v5",
        language="en-us",
        sample_rate=16000,
    )

    @session.on(ServerMessageType.RESULTS)
    def _(msg):
        # Cumulative transcript: replace the previous interim rather
        # than concatenating successive Results events.
        print(f"\r{msg.transcript}", end="", flush=True)

    @session.on(ServerMessageType.CLOSED)
    def _(info):
        print(f"\nClosed: code={info.code} reason={info.reason!r}")

    async with session:
        mic = Microphone(sample_rate=16000, chunk_size=1600)
        await session.stream_audio(mic)

asyncio.run(main())

Prefer streaming a file (no audio device dependency)? See
examples/live_transcription_file.py.
For the full event catalog (Ready, Results, Final, Error,
Closed), configuration options, and extensibility notes, see the
Live Transcription tutorial
and SDK guide.
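
Because interim Results transcripts are cumulative, a consumer should overwrite the previous interim text rather than append to it. The sketch below illustrates that bookkeeping with plain strings. `TranscriptBuffer` is a hypothetical helper, not part of the SDK, and it assumes that a Final event carries the finished text for a segment; check the event catalog before relying on that.

```python
class TranscriptBuffer:
    """Tracks finalized segments plus the latest cumulative interim text.

    Hypothetical helper for illustration; not part of the SDK.
    """

    def __init__(self) -> None:
        self.finalized = []  # finished segments, in arrival order
        self.interim = ""    # latest cumulative interim transcript

    def on_results(self, transcript: str) -> None:
        # Interim results are cumulative, so overwrite instead of appending.
        self.interim = transcript

    def on_final(self, transcript: str) -> None:
        # Assumption: a Final event supersedes the current interim text.
        self.finalized.append(transcript)
        self.interim = ""

    def text(self) -> str:
        """Return finalized segments followed by any pending interim text."""
        parts = self.finalized + ([self.interim] if self.interim else [])
        return " ".join(parts)


buf = TranscriptBuffer()
buf.on_results("hello")
buf.on_results("hello wor")   # replaces "hello", does not append to it
buf.on_final("hello world")
buf.on_results("how are")
print(buf.text())
```

In a real session, the handlers registered with `session.on(...)` could forward `msg.transcript` to `on_results`; the attribute to read for Final events is not shown in this README.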
The Camb AI SDK offers a wide range of capabilities beyond these examples, including:
- Voice Cloning
- Translations
- Translated TTS
- Audio Dubbing
- Transcription (async file/URL jobs)
- Live Transcription (streaming WebSocket — see Example 5 above)
- And more!
Refer to the examples/ directory for runnable examples and to the official Camb AI API documentation for a comprehensive list of features and advanced usage patterns.
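
The sound generation and dubbing examples above both poll a task until its status is "SUCCESS", with no upper bound on how long they wait. A minimal sketch of a reusable polling loop with a timeout follows. `poll_until` is a hypothetical helper, not an SDK function; it takes plain callables, so it makes no assumptions about the SDK beyond the status calls already shown above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], T],
    is_done: Callable[[T], bool],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fetch` every `interval` seconds until `is_done` accepts the result.

    Raises TimeoutError if the next wait would pass the `timeout` deadline.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"Task did not finish within {timeout} seconds")
        sleep(interval)


# Usage with the dubbing example above (assumes `client` and `task_id` exist):
# status = poll_until(
#     lambda: client.dub.get_dubbing_status(task_id=task_id),
#     lambda s: s.status == "SUCCESS",
#     interval=5,
# )
```

The helper only recognizes completion. If the API reports a failure status, `fetch` can raise an exception for it so the loop stops early instead of waiting for the timeout.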
This project is licensed under the MIT License - see the LICENSE file for details.
