vox

Text-to-speech using Voxtral-4B-TTS, powered by MLX for efficient inference on Apple Silicon.

Features real-time streaming audio, multiple languages and voice presets, and both interactive and command-line interfaces.

Prerequisites

macOS with Apple Silicon (M1, M2, M3, etc.)
uv - Python package manager (brew install uv)
FFmpeg - for audio playback (brew install ffmpeg)
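
The prerequisites above can be checked before installing. The following is a minimal sketch, not part of vox itself: the function name missing_prerequisites is hypothetical, and it assumes a native (non-Rosetta) Python, which on Apple Silicon reports the system as "Darwin" and the machine as "arm64".

```python
import platform
import shutil


def missing_prerequisites(system=None, machine=None, which=shutil.which):
    """Return a list of problems; an empty list means every check passed.

    Hypothetical helper for illustration only; vox does not ship it.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    problems = []
    # A native Python on an Apple Silicon Mac reports Darwin / arm64.
    if system != "Darwin" or machine != "arm64":
        problems.append(f"expected macOS on Apple Silicon, found {system}/{machine}")
    # uv and ffmpeg are the two command-line tools listed in the prerequisites.
    for tool in ("uv", "ffmpeg"):
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH (try: brew install {tool})")
    return problems
```

Calling missing_prerequisites() with no arguments inspects the current machine; the parameters exist only so the checks can be exercised on other platforms.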

Quickstart

Clone this repository and sync:

git clone https://github.com/tetsuo/vox.git
cd vox
uv sync

Note that mlx-audio is installed from its source repository, which contains the latest Voxtral TTS support.

Usage

Type uv run vox --help to see all options:

usage: vox [options]

options:
  -h, --help        show this help message and exit
  --voice VOICE     voice preset (default: casual_male)
  --text TEXT       text to speak; use - to read from stdin
  --save PATH       save generated audio to a WAV file
  --save-dir DIR    auto-save each utterance to DIR/ (interactive mode)
  --no-play         generate audio but do not play it
  --list-voices     print available voices and exit
  --chunk-frames N  streaming chunk size in LM frames (default: 25, ~2s per chunk)
  --model MODEL     HuggingFace repo ID or local path
                    (default: mlx-community/Voxtral-4B-TTS-2603-mlx-6bit)
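
For readers who want to see how these options fit together, the help text can be approximated with argparse. This is a sketch reconstructed from the help output alone, not the project's actual source; build_parser and approx_chunk_seconds are hypothetical names, and the 12.5 frames-per-second figure is only inferred from "25 frames, ~2s per chunk".

```python
import argparse

DEFAULT_MODEL = "mlx-community/Voxtral-4B-TTS-2603-mlx-6bit"


def build_parser():
    """Approximate the documented options; the real parser may differ."""
    parser = argparse.ArgumentParser(prog="vox", usage="vox [options]")
    parser.add_argument("--voice", default="casual_male", help="voice preset")
    parser.add_argument("--text", help="text to speak; use - to read from stdin")
    parser.add_argument("--save", metavar="PATH", help="save generated audio to a WAV file")
    parser.add_argument("--save-dir", metavar="DIR", help="auto-save each utterance to DIR/")
    parser.add_argument("--no-play", action="store_true", help="generate audio but do not play it")
    parser.add_argument("--list-voices", action="store_true", help="print available voices and exit")
    parser.add_argument("--chunk-frames", type=int, default=25, metavar="N",
                        help="streaming chunk size in LM frames")
    parser.add_argument("--model", default=DEFAULT_MODEL, help="HuggingFace repo ID or local path")
    return parser


def approx_chunk_seconds(frames, frames_per_second=12.5):
    # Assumption: 25 frames ~ 2 s implies roughly 12.5 frames per second.
    return frames / frames_per_second
```

Under that assumed rate, a smaller --chunk-frames value would shorten each streamed chunk proportionally, for example about one second of audio for 12 or 13 frames.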

Examples

Speak a single phrase and exit:

uv run vox --text "hello world"
uv run vox --voice fr_female --text "bonjour le monde"

Read from STDIN:

echo "Hello world" | uv run vox
uv run vox --text=-
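
The "-" convention for --text can be sketched as follows. This is an illustration of the general pattern, not vox's own code: resolve_text is a hypothetical helper, and it covers only the explicit --text=- form, not the piped form without --text.

```python
import io
import sys


def resolve_text(text_arg, stream=None):
    """Return the text to speak, reading from the stream when text_arg is "-"."""
    if text_arg == "-":
        # Fall back to the process's standard input when no stream is given.
        stream = stream if stream is not None else sys.stdin
        return stream.read().strip()
    return text_arg


# Usage: an in-memory stream stands in for piped input.
piped = io.StringIO("Hello world\n")
spoken = resolve_text("-", piped)
```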

Generate audio without playback and save to WAV:

uv run vox --text "hello" --save output.wav --no-play

Auto-save every utterance in interactive mode:

uv run vox --save-dir ./takes
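
The one-shot examples above can also be driven from a Python script. The sketch below only assembles the documented flags and hands them to subprocess; build_vox_command and speak are hypothetical helpers, and speak assumes it is run from the cloned vox directory with uv on the PATH.

```python
import subprocess


def build_vox_command(text, voice=None, save=None, play=True):
    """Assemble a `uv run vox` command line from the documented flags."""
    # The --text=VALUE form keeps text that starts with "-" unambiguous.
    cmd = ["uv", "run", "vox", "--text=" + text]
    if voice:
        cmd += ["--voice", voice]
    if save:
        cmd += ["--save", str(save)]
    if not play:
        cmd.append("--no-play")
    return cmd


def speak(text, **options):
    # Raises CalledProcessError if vox exits with a non-zero status.
    return subprocess.run(build_vox_command(text, **options), check=True)
```

For example, speak("hello", save="output.wav", play=False) corresponds to the "generate without playback" command shown above.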

Interactive Mode

Start the interactive shell:

uv run vox

Then type text to speak.

Built-in commands:

:voice <name> - Switch voice (e.g., :voice fr_female)
:voices - List all available voices
:help - Show help
:quit or :q - Exit
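
One way to picture how the shell tells commands apart from text to speak is a small line classifier. This is a sketch under assumptions, not the project's implementation: parse_line and the returned tags are hypothetical, and how vox treats unknown commands is not documented here.

```python
def parse_line(line):
    """Classify one line of interactive input as a (kind, argument) pair."""
    stripped = line.strip()
    if not stripped:
        return ("empty", None)
    # Anything that does not start with ":" is text to speak.
    if not stripped.startswith(":"):
        return ("speak", stripped)
    name, _, arg = stripped[1:].partition(" ")
    arg = arg.strip()
    if name == "voice" and arg:
        return ("voice", arg)
    if name == "voices" and not arg:
        return ("voices", None)
    if name == "help" and not arg:
        return ("help", None)
    if name in ("quit", "q") and not arg:
        return ("quit", None)
    # Assumption: unrecognized commands are reported rather than spoken.
    return ("unknown", stripped)
```

A read loop would call parse_line on each input line and act on the returned kind, stopping when it sees "quit".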

Voices

The model supports 20 voices across 9 languages:

Language    Voices
English     casual_male, casual_female, cheerful_female, neutral_male, neutral_female
French      fr_male, fr_female
Spanish     es_male, es_female
German      de_male, de_female
Italian     it_male, it_female
Portuguese  pt_male, pt_female
Dutch       nl_male, nl_female
Arabic      ar_male
Hindi       hi_male, hi_female
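
When scripting around vox, it can help to have the table as data. The mapping below covers only the presets listed in the table above; VOICES and language_of are hypothetical names, and uv run vox --list-voices remains the authoritative list.

```python
# Voice presets copied from the table, keyed by language.
VOICES = {
    "English": ["casual_male", "casual_female", "cheerful_female",
                "neutral_male", "neutral_female"],
    "French": ["fr_male", "fr_female"],
    "Spanish": ["es_male", "es_female"],
    "German": ["de_male", "de_female"],
    "Italian": ["it_male", "it_female"],
    "Portuguese": ["pt_male", "pt_female"],
    "Dutch": ["nl_male", "nl_female"],
    "Arabic": ["ar_male"],
    "Hindi": ["hi_male", "hi_female"],
}


def language_of(voice):
    """Return the language a preset belongs to, or None if it is not listed."""
    for language, names in VOICES.items():
        if voice in names:
            return language
    return None
```

A script could call language_of before invoking vox to reject a misspelled preset early.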

Resources

Voxtral-4B-TTS Model: https://huggingface.co/mistralai/Voxtral-4B-TTS-2603
MLX Quantized Model: https://huggingface.co/mlx-community/Voxtral-4B-TTS-2603-mlx-6bit
Mistral Announcement: https://mistral.ai/news/voxtral-tts
MLX Framework: https://github.com/ml-explore/mlx
