v1.9.2 by TiTidom-RC · Pull Request #252 · TiTidom-RC/TTSCast

TiTidom-RC · 2026-05-22T16:01:59Z

This pull request introduces support for streaming Gemini TTS (Text-to-Speech) with low-latency playback and improves configuration options, security, and compatibility. The most significant changes are the addition of a streaming mode for Gemini TTS, enhancements to the audio proxy for secure streaming, and updates to the configuration UI and installation scripts.

Gemini TTS Streaming Support:

Added a new "Streaming Gemini TTS" mode, allowing audio playback to start as soon as the first syllables are generated, reducing perceived latency. This includes new configuration options (streamingDefault), test options (ttsTestStreaming), and the necessary backend logic to handle streaming requests.

Audio Proxy Security and Streaming Implementation:

Refactored ttscast.audio.proxy.php to:
- Replace IP-based access restrictions with strict file name validation using regex and unpredictable IDs.
- Add support for streaming audio via named pipes (mkfifo), with validation for pipe existence and parameters.
- Improve error logging and HTTP response handling for invalid parameters, missing files, and exceptions.

Configuration and UI Improvements:

Updated the configuration UI to:
- Enable LINEAR16 audio encoding options for Google TTS.
- Add a checkbox for default streaming mode and an option to test streaming in the Gemini TTS section.
- Update Gemini AI model pricing and add the new Gemini 3.5 Flash model to the selection.

Dependency and Internal Updates:

Upgraded google-genai Python package to version 2.4.0 for improved compatibility.
Minor code and dictionary updates to support new features and terminology, including adding new keywords to .vscode/settings.json and updating the plugin version to 1.9.2.

These changes enable faster, more flexible TTS playback and improve the maintainability and security of the plugin.

Update Translations

Update TTS Cast pluginVersion from 1.9.1 to 1.9.2 and bump google-genai in requirements.txt from 2.3.0 to 2.4.0. This advances the plugin release and brings a newer google-genai dependency for fixes/compatibility.

Introduce streaming mode for Gemini TTS to reduce perceived latency and stream audio to Chromecast devices. Changes include:
- Add streamingDefault config (UI checkbox, install/update defaults, CLI arg) to enable streaming by default.
- Daemon: support a streaming path in getTTS: prefetch the first chunk, create a named pipe (mkfifo) and spawn the geminiTTSStream writer thread; cast uses a live stream_type when appropriate.
- Add a geminiTTS(streaming=True) path that returns a stream iterator and the first chunk for format detection; add geminiTTSStream to write PCM chunks to the pipe and optionally cache a WAV file.
- Audio proxy: add handling for type=stream that validates UUID.l16 names, serves audio/L16 (24k/mono), streams the pipe contents and cleans up the pipe.
- castToGoogleHome: accept a streamType param and propagate it to the media metadata.
- Utils and settings updates: new temporary stream folder, VSCode settings and mime entries updated.

Includes error handling, non-blocking pipe writes with retry, and logging for debugging.

PHP: Replace hardcoded audio/L16 Content-Type with validated rate and channels from query params and remove the static 'l16' MIME entry. This ensures the proxy emits correct L16 headers (allowed rates: 8000,16000,22050,24000,44100,48000; channels: 1 or 2) and avoids malformed responses. Python: Include sampleRate and channels in the generated pipe URL and pass them into the TTS streaming thread. Change geminiTTS result checks from "is not None" to "isinstance(..., bytes)" before writing files to ensure only binary audio is written. These changes improve correctness when streaming different L16 configurations and prevent non-bytes values from being treated as audio.
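The whitelist check described above can be sketched as follows. This is only an illustration in Python: the real validation lives in the PHP proxy, and the function name `l16_content_type` and its return convention (None on rejection) are assumptions, not names from the plugin. The allowed rates and channel counts are the ones listed in the commit message.

```python
# Hypothetical helper: the plugin performs this check in PHP, not Python.
ALLOWED_RATES = {8000, 16000, 22050, 24000, 44100, 48000}
ALLOWED_CHANNELS = {1, 2}


def l16_content_type(rate, channels):
    """Return an audio/L16 Content-Type value, or None if a parameter is invalid.

    Both values typically arrive as query-string text, so they are
    converted to int before being compared against the whitelists.
    """
    try:
        rate_i = int(rate)
        channels_i = int(channels)
    except (TypeError, ValueError):
        return None
    if rate_i not in ALLOWED_RATES or channels_i not in ALLOWED_CHANNELS:
        return None
    return f"audio/L16;rate={rate_i};channels={channels_i}"
```

Rejecting anything outside the whitelist, rather than echoing the query parameters into the header, is what prevents malformed Content-Type values.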

Pass the new ttsTestStreaming flag from the PHP UI to the daemon and add a UI checkbox + new Gemini 3.5 Flash model option. Update daemon socket handling to accept ttsStreaming, extend generateTestTTS signature, and implement a streaming path that creates a FIFO pipe and casts live audio (geminiTTS stream) to devices; keep fallback to file-based generation. Also add http_options timeout to genai.Client creation calls to increase request timeouts and improve reliability.

Move the "Style de voix (Gemini TTS)" text input in plugin_info/configuration.php so it appears after the "Tester avec le mode streaming" option. This is a UI reorder only—the field, its data key (ttsTestGeminiStyle), placeholder and tooltip are unchanged; no functional logic was modified.

Propagate and retain the Gemini TTS stream client to prevent premature socket/HTTPX closure. Prefetch now returns an additional streamClient value; callers unpack it and pass it into geminiTTSStream. geminiTTSStream signature gains a trailing _client parameter (kept alive for httpx sockets), and threading invocations were updated to include this argument. No other functional changes.

Harden the PHP proxy and make streaming more robust and observable. Adds core include, strict filename/UUID regex validation for TTS/stream/sounds, and extensive logging for invalid params, missing pipes, open/read errors and stream lifecycle (start/bytes sent/finished). Replaces previous IP-based restriction with validation-based protection. Implements real-time chunked streaming by disabling PHP output buffers, reading the pipe in loops, setting Content-Encoding: identity to avoid compression, and cleaning up the pipe after use. Also logs and returns proper HTTP codes on errors. In the daemon, catch BrokenPipeError during Gemini TTS streaming, log a warning and return None to handle client disconnects gracefully.
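As an illustration of validation-based protection, the sketch below accepts only names shaped like a lowercase UUID followed by `.l16`. The exact pattern used by ttscast.audio.proxy.php is not shown in this pull request, so the regex and the name `is_valid_stream_name` are assumptions, and the check is written in Python rather than PHP.

```python
import re

# Assumed pattern: lowercase hexadecimal UUID followed by the ".l16" suffix.
STREAM_NAME_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\.l16"
)


def is_valid_stream_name(name):
    """Return True only when the whole name matches the expected shape.

    fullmatch() anchors the pattern at both ends, so path separators,
    "..", or trailing characters can never slip through.
    """
    return isinstance(name, str) and STREAM_NAME_RE.fullmatch(name) is not None
```

Because the name is unpredictable and nothing else is accepted, a client cannot guess or traverse to another file, which is the property the IP-based restriction was replaced with.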

Switch the HTTP stream Content-Type to audio/wav and prepend a WAV RIFF header for Gemini TTS streaming so clients correctly interpret 16-bit little-endian PCM. audio/L16 (RFC 2586) expects big-endian, but Gemini returns PCM LE; using WAV avoids endian mismatch. The daemon builds a minimal WAV header (channels, sampwidth=2, framerate=sampleRate) with the wave module, patches the RIFF and data chunk sizes to 0x7FFFFFFF as placeholders for an unknown/ongoing stream, and writes the header to the pipe before sending audio chunks. Modified files: core/php/ttscast.audio.proxy.php and resources/ttscastd/ttscastd.py.
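A minimal sketch of the header construction described above, assuming 16-bit PCM and the standard 44-byte header that Python's wave module emits for PCM data. The function name `build_streaming_wav_header` is hypothetical; the daemon's actual code may differ in structure.

```python
import io
import struct
import wave

# Placeholder size for a stream whose total length is not known in advance.
STREAM_SIZE_PLACEHOLDER = 0x7FFFFFFF


def build_streaming_wav_header(sample_rate, channels):
    """Build a WAV header for 16-bit little-endian PCM of unknown length."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)           # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"")          # no audio yet: only the header is emitted
    header = bytearray(buf.getvalue())
    if len(header) != 44:
        raise ValueError("unexpected WAV header length: %d" % len(header))
    # Bytes 4..7 hold the RIFF chunk size, bytes 40..43 the data chunk size.
    struct.pack_into("<I", header, 4, STREAM_SIZE_PLACEHOLDER)
    struct.pack_into("<I", header, 40, STREAM_SIZE_PLACEHOLDER)
    return bytes(header)
```

The returned bytes would be written to the pipe once, before the first PCM chunk, so the receiver knows the sample format without needing the final length.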

Replace dynamic 'audio/L16;rate=...;channels=...' MIME strings with a fixed 'audio/wav' MIME type because the proxy delivers a WAV RIFF stream (PCM LE 16-bit). This change updates both the TestTTS prefetch path and the main TTS streaming path in resources/ttscastd/ttscastd.py so the stream is handled with the correct format.

Send HTTP headers and flush output before opening the FIFO to avoid Chromecast timeouts and start streaming immediately. Enable implicit flush and remove chunked transfer header so PCM data isn't buffered/compressed. Add debug logging around pipe open and streaming. Also reduce the ttscastd pipe-writer retry sleep from 50ms to 5ms to decrease startup latency before the first write.

Enable LINEAR16 options in the configuration UI and make TTS generation respect the gCloudAudioEncoding and gCloudSampleRate settings. Updated plugin_info/configuration.php to allow selecting LINEAR16 (24k/48k). Updated resources/ttscastd/ttscastd.py to choose file extension and MIME type based on gCloudAudioEncoding, and to use myConfig.gCloudSampleRate (instead of hardcoded 48000) for Google Cloud TTS AudioConfig and WAV file generation. This re-enables WAV output and allows configurable sample rates for LINEAR16 audio.

Insert info-level timing logs into resources/ttscastd/ttscastd.py to help measure streaming latency and caching. Adds t0_start before calling Gemini streaming TTS, t1_castStart after spawning the streaming thread and before casting to Google Home, and t2_cacheWritten after the streamed WAV is written to cache. These timestamps aid debugging and performance analysis of the Gemini TTS streaming path.

Ensure the FIFO opened for GeminiTTSStream writes is set to blocking mode by calling os.set_blocking(fd, True). This restores blocking behavior after the descriptor was opened non-blocking so writes will block if the pipe buffer is full, providing proper backpressure during streaming. Includes a type-ignore and POSIX/Debian note in the source comment.
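The open-then-switch pattern can be sketched as below. This is an assumption-laden illustration, not the daemon's code: `open_fifo_for_writing`, `demo`, the timeout, and the use of ENXIO as the retry condition are hypothetical, and the sketch only runs on POSIX systems. The 5 ms retry sleep mirrors the value mentioned in the earlier commit.

```python
import errno
import os
import tempfile
import threading
import time


def open_fifo_for_writing(path, retry_sleep=0.005, timeout=5.0):
    """Open a FIFO for writing without hanging forever, then make it blocking."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # With O_NONBLOCK, open() fails with ENXIO until a reader is attached.
            fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
            break
        except OSError as exc:
            if exc.errno != errno.ENXIO or time.monotonic() >= deadline:
                raise
            time.sleep(retry_sleep)
    # Restore blocking mode so a full pipe buffer pauses the writer
    # (backpressure) instead of raising BlockingIOError.
    os.set_blocking(fd, True)
    return fd


def demo():
    """Round-trip a few bytes through a temporary FIFO and return what was read."""
    received = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "stream.fifo")
        os.mkfifo(path)

        def reader():
            with open(path, "rb") as fifo:
                received.append(fifo.read())

        thread = threading.Thread(target=reader)
        thread.start()
        fd = open_fifo_for_writing(path)
        try:
            os.write(fd, b"pcm-bytes")
        finally:
            os.close(fd)   # closing the write end gives the reader EOF
        thread.join(timeout=5.0)
    return received
```

Opening non-blocking avoids a writer thread stuck forever when no client ever connects, while switching back to blocking keeps the write loop simple once the reader is present.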

PHP: Only enable the ttsTestStreaming flag when Gemini test is selected (ttsTestGemini == '1'), forcing it to '0' otherwise to prevent streaming being offered for non-Gemini tests. Python: Improve Gemini streaming timing logs in ttscastd.py by capturing a single timestamp (_t) and logging both the epoch float and a human-readable HH:MM:SS.mmm representation for t0_start, t1_castStart and t2_cacheWritten. This provides more precise and consistent timing info for debugging streaming flows.
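The timing format described for the Python side could look like the sketch below, which captures one timestamp and renders it both as an epoch float and as HH:MM:SS.mmm. The helper name `format_timing`, the exact message layout, and the optional `tz` argument are assumptions made for illustration.

```python
import time
from datetime import datetime, timezone


def format_timing(label, ts, tz=None):
    """Render one timestamp as 'label=<epoch> (HH:MM:SS.mmm)'.

    tz=None uses local time; passing a timezone makes the output reproducible.
    """
    moment = datetime.fromtimestamp(ts, tz=tz)
    # %f yields microseconds (6 digits); dropping the last 3 leaves milliseconds.
    human = moment.strftime("%H:%M:%S.%f")[:-3]
    return f"{label}={ts:.3f} ({human})"


def log_t0_start(logger):
    """Capture the timestamp once, then log both representations of it."""
    _t = time.time()
    logger.info(format_timing("t0_start", _t))
    return _t
```

Taking the timestamp once and formatting it twice guarantees that both representations in a log line refer to the same instant.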

Remove the previous global "Streaming TTS par défaut" form group and add a Gemini-specific "Streaming Gemini TTS par défaut" form group under the "IA & TTS - Gemini" section. The new block uses the same config key (data-l1key="streamingDefault") and retains the existing tooltips about daemon restart and streaming behavior, grouping the streaming option with Gemini TTS settings.

Add a warning in TTSCast when the 'streaming' option is requested but the active TTS engine is not 'geminitts' and myConfig.streamingDefault is false. This logs a clear message after sending plain-text TTS results to inform callers that streaming mode is only supported by the Gemini TTS engine and that they should remove the 'streaming:true' option from the TTS call.

Move the streamingDefault config into the Gemini TTS section and add a clarifying comment ('Gemini TTS uniquement — ignoré pour les autres moteurs'). This consolidates the Gemini-specific setting with related options and clarifies its scope; no functional change intended beyond organization and documentation.

TiTidom-RC added 18 commits May 17, 2026 16:51

Merge pull request #251 from TiTidom-RC/beta

da69e67

Update Translations

Bump plugin version and google-genai

ccb1749


TiTidom-RC merged commit d2769e5 into beta May 22, 2026
1 check passed

TiTidom-RC merged 18 commits into beta from dev
