OpenAI TTS GUI

  • Easily use high-quality, inexpensive TTS.
  • Long text is supported.

Features

  • Models: tts-1, tts-1-hd, gpt-4o-mini-tts
  • Voices: All standard OpenAI TTS options (alloy, ash, ballad, cedar, coral, echo, fable, marin, onyx, nova, sage, shimmer, verse)
  • Format/Speed: MP3, Opus, AAC, FLAC, WAV, PCM; 0.25x-4.0x
  • Instructions: Custom voice instructions for gpt-4o-mini-tts
  • Presets: you can save or load your custom instructions
  • Long Text: split into chunks automatically, so there is no overall character limit
  • Feedback: Live character/chunk count for manual cost calculation if desired
  • Themes: Dark or Light (native Qt)
  • API Key: manually set, or reads from your environment variables
  • Retention: retain individual mp3 chunks if desired
  • Sidecar: JSON metadata written next to outputs for reproducibility
  • CLI: openai-tts --in text.txt --out out.mp3 --model tts-1
  • Stream-aware: uses stream_format="audio" for low-memory downloads; request IDs are recorded in sidecars
  • One-click debug: copy request IDs and open log folder from the GUI
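
As a rough illustration of the "Long Text" and "Feedback" items above, the sketch below splits input into chunks at sentence boundaries and reports the character and chunk counts. It is not the app's actual chunking code: the function name split_text and the 4096-character per-request limit are assumptions made for this example.

```python
import re

# Assumed per-request input limit; check the current API documentation.
MAX_CHARS = 4096


def split_text(text: str, max_chars: int = MAX_CHARS) -> list:
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    # Break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        # Hard-wrap a single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "First sentence. Second sentence! Third one?"
    parts = split_text(sample, max_chars=32)
    print(f"{len(sample)} characters in {len(parts)} chunks")
```

The character count is what a manual cost estimate would be based on, since OpenAI TTS pricing depends on input length.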

Requirements

Installation

If installation seems too complicated, paste these instructions into an AI chatbot and ask it to walk you through the steps.

  1. Clone the repository:

    git clone https://github.com/sm18lr88/OpenAI_TTS_GUI.git
    cd OpenAI_TTS_GUI
  2. Set up a virtual environment:

    • Using venv (recommended for most users):
      python -m venv venv
      # Windows
      .\venv\Scripts\activate
      # macOS/Linux
      source venv/bin/activate
    • Using uv (alternative; uv venv creates .venv by default):
      uv venv
      # Windows
      .\.venv\Scripts\activate
      # macOS/Linux
      source .venv/bin/activate
  3. Install dependencies:

    • The dependencies are specified in pyproject.toml. With pip, install the core packages directly:
      pip install -U openai PyQt6 keyring platformdirs
    • If using uv:
      uv sync

Running the Application

One-time quick build (optional)

Create a single-file executable with PyInstaller:

pip install .[dev]
pyinstaller --noconfirm openai_tts.spec

Artifacts will appear in dist/.

  1. Set API Key:
    • Option 1: Set the OPENAI_API_KEY environment variable.
    • Option 2: Launch the app and use API Key -> Set/Update API Key... (stored in your OS keyring when available; otherwise obfuscated in api_key.enc).
    • (Optional) For self-hosted/OpenAI-compatible endpoints, set OPENAI_BASE_URL before launch.
  2. Start the GUI:
    python -m openai_tts_gui
  3. CLI (headless):
    openai-tts --in text.txt --out out.mp3 --model tts-1 --voice alloy --format mp3 --speed 1.0 --log-level INFO

Optional concurrency

Set TTS_PARALLELISM to enable parallel chunk generation (default = 1 / off):

# Example: use up to 4 concurrent requests
# Windows (cmd.exe)
set TTS_PARALLELISM=4
# Windows (PowerShell)
$env:TTS_PARALLELISM = "4"
# macOS/Linux
export TTS_PARALLELISM=4
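
For readers curious how such a setting might be consumed, here is a hedged sketch using a thread pool. Only the TTS_PARALLELISM variable name comes from this README; the function names and the fallback behavior for invalid values are assumptions, and the synthesize callable is a stand-in for a real API request.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def parallelism_from_env(default: int = 1) -> int:
    """Read TTS_PARALLELISM; fall back to the default when unset or invalid."""
    raw = os.environ.get("TTS_PARALLELISM", "")
    try:
        return max(1, int(raw))
    except ValueError:
        return default


def synthesize_all(chunks: Sequence[str],
                   synthesize: Callable[[str], object],
                   workers: int) -> List[object]:
    """Run synthesize over every chunk with at most `workers` threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so chunks can be
        # concatenated afterwards without re-sorting.
        return list(pool.map(synthesize, chunks))


if __name__ == "__main__":
    # str.upper stands in for a network call that returns audio.
    print(synthesize_all(["one", "two", "three"], str.upper, parallelism_from_env()))
```

Threads suit this workload because each request spends most of its time waiting on the network.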

Development & QA

  • Style/lint: uv run ruff check
  • Types: uv run ty check
  • Tests: uv run pytest (uses repo-local .pytest_tmp for temp files)
  • Build (Windows): uv run pyinstaller --noconfirm openai_tts.spec

Troubleshooting

  • FFmpeg: Ensure it's on PATH or configured in config.py. The app performs a preflight check on startup.
  • Keys: OS keyring is used when available; api_key.enc is a fallback.
  • Logs & Data: Logs at the path shown in About; presets and app data live under the per-user data directory.
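
To check FFmpeg yourself before launching, a small script like the following can help. It is an independent sketch, not the app's own preflight check, and the function names are made up for this example.

```python
import shutil
import subprocess
from typing import Optional


def find_ffmpeg(configured_path: Optional[str] = None) -> Optional[str]:
    """Return the resolved ffmpeg executable, or None if it cannot be found.

    configured_path may be a bare command name or a full path.
    """
    return shutil.which(configured_path or "ffmpeg")


def ffmpeg_version(path: str) -> str:
    """Return the first line printed by `ffmpeg -version`."""
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=True
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    location = find_ffmpeg()
    if location is None:
        print("ffmpeg not found on PATH")
    else:
        print(f"{location}: {ffmpeg_version(location)}")
```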

Tips

  • Speed: Adjustments far from 1.0x may impact quality; try Adobe Audition for speed tweaks, or select the gpt-4o-mini-tts model and instruct it to speak faster/slower.
  • Instructions: check examples at openai.fm.
  • Security: api_key.enc is obfuscated, not encrypted. Prefer environment variables or OS keyring storage.
  • ffmpeg: If not found, ensure it’s in PATH or set its path in config.py.
  • Logs: Check tts_app.log for errors or details.

Roadmap

  • Handles long text
  • PyQt6 GUI (formerly PySimpleGUI)
  • Basic API rate limit handling
  • Enhance chunking and concatenation
  • Option to retain chunk files
  • Light and Dark themes
  • Custom API key settings
  • Use environment variable if set
  • Code refactoring
  • Basic API key obfuscation
  • Speed boost: Parallel chunk processing
  • Granular progress reporting
  • Bundle new release into .exe
  • Price estimate

Versions / Notes

  • OpenAI Python SDK ==2.9.0.
  • PyQt6 ==6.10.0.
  • Streams audio via stream_format="audio" (OpenAI Python v2 default).
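
A streaming request with the OpenAI Python SDK might look roughly like the sketch below. It is not taken from this project's source: the helper names are hypothetical, and the parameters (in particular stream_format) should be checked against the SDK version you have installed. Running it requires the openai package and a valid API key.

```python
from typing import Optional


def build_request(text: str,
                  model: str = "gpt-4o-mini-tts",
                  voice: str = "alloy",
                  response_format: str = "mp3",
                  instructions: Optional[str] = None) -> dict:
    """Assemble keyword arguments for audio.speech.create (hypothetical helper)."""
    kwargs = {
        "model": model,
        "voice": voice,
        "input": text,
        "response_format": response_format,
        "stream_format": "audio",  # raw audio bytes rather than server-sent events
    }
    # Voice instructions only apply to gpt-4o-mini-tts, so omit them otherwise.
    if instructions and model == "gpt-4o-mini-tts":
        kwargs["instructions"] = instructions
    return kwargs


def synthesize_to_file(text: str, out_path: str, **options) -> Optional[str]:
    """Stream one chunk of speech to out_path and return the request ID, if any."""
    from openai import OpenAI  # imported lazily so the helper above works without the SDK

    client = OpenAI()  # reads OPENAI_API_KEY and OPENAI_BASE_URL from the environment
    request = build_request(text, **options)
    with client.audio.speech.with_streaming_response.create(**request) as response:
        # Write the body to disk as it arrives instead of buffering it in memory.
        response.stream_to_file(out_path)
        return response.headers.get("x-request-id")


if __name__ == "__main__":
    print(synthesize_to_file("Hello from the sketch.", "hello.mp3"))
```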

Support

Check tts_app.log for issues. Report problems on GitHub. For code help, consult AI models with logs or snippets.
