PocketTTS sessions by danielrothmann · Pull Request #471 · FluidInference/FluidAudio

danielrothmann · 2026-03-30T11:52:17Z

This PR implements a session API for PocketTTS. Closes #465

The goal was to improve reliability of long-running sessions with streaming text input. Previously, each call to synthesizeStreaming() paid the full voice prefill cost (~125 sequential CoreML predictions) and reset Mimi decoder state, causing latency and audio discontinuity between utterances.

PocketTtsSession is a new actor that performs voice prefill once at creation, then accepts streamed text via enqueue(). Each utterance only pays the text prefill cost. Mimi decoder state persists across utterances for audio continuity.
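
As a rough illustration of the cost model described above, here is a minimal sketch of an actor that pays the voice prefill once at creation and only does per-utterance work in enqueue(). Everything in it (SketchSession, SketchCosts, the counters, the one-frame-per-character stand-in) is hypothetical and is not the PocketTtsSession implementation from this PR; the figure of 125 is the approximate number quoted in the description.

```swift
// Hypothetical sketch; names and numbers are illustrative, not FluidAudio's API.
enum SketchCosts {
    // Approximate voice prefill cost quoted in the PR description.
    static let voicePrefillPredictions = 125
}

actor SketchSession {
    private(set) var predictionsSpentOnVoice: Int
    private(set) var utterancesEnqueued = 0
    // Stand-in for decoder state that persists across utterances.
    private(set) var decoderFramesSeen = 0

    init() {
        // Voice prefill is paid exactly once, when the session is created.
        predictionsSpentOnVoice = SketchCosts.voicePrefillPredictions
    }

    // Accepts one utterance and returns its zero-based index.
    // Only per-utterance work happens here; the voice prefill is not repeated.
    @discardableResult
    func enqueue(_ text: String) -> Int {
        let index = utterancesEnqueued
        utterancesEnqueued += 1
        // Pretend one decoder frame per character; the count is never reset.
        decoderFramesSeen += text.count
        return index
    }
}

// Voice prefill cost of the previous behaviour, where every call paid it again.
func perCallVoiceCost(utterances: Int) -> Int {
    utterances * SketchCosts.voicePrefillPredictions
}
```

With two utterances, the sketch spends 125 predictions on voice prefill in total, where the per-call model would spend 250, and the decoder counter carries over from the first utterance to the second.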

Cancellation is awaitable: await session.cancel() blocks until the generation task has fully stopped and the Neural Engine is free, preventing multiple inference loops from stacking up. If the consumer drops the frames stream, generation is cancelled automatically.
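
The awaitable-cancellation idea can be sketched with plain Swift concurrency: cancel the task, then await its value so the caller resumes only after the loop has exited. This is a minimal sketch under that assumption. SketchGenerator and its members are hypothetical names, and the PR's actual mechanism may differ.

```swift
// Hypothetical sketch of awaitable cancellation; not the code from this PR.
actor SketchGenerator {
    private var task: Task<Void, Never>?
    private(set) var framesProduced = 0

    var isRunning: Bool { task != nil }

    func start() {
        // Refuse to start a second loop while one is still alive.
        guard task == nil else { return }
        task = Task {
            // This closure inherits the actor's isolation, so it can
            // mutate actor state directly between suspension points.
            while !Task.isCancelled {
                self.framesProduced += 1  // stand-in for one inference step
                await Task.yield()
            }
        }
    }

    // Returns only after the generation loop has fully stopped.
    func cancel() async {
        guard let running = task else { return }
        running.cancel()
        await running.value  // suspends until the loop has exited
        task = nil
    }
}
```

Because start() refuses to launch a new loop until cancel() has cleared the task, a caller that awaits cancel() before starting again cannot end up with two loops running at once.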

AudioFrame now includes an utteranceIndex field for text synchronisation on the consumer side.
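
On the consumer side, an index like this could be used to look up the text that a frame belongs to. The sketch below uses a hypothetical SketchFrame type rather than the real AudioFrame, and assumes the index is zero-based in enqueue order, which the description does not state.

```swift
// Hypothetical stand-in for a frame; the real AudioFrame may look different.
struct SketchFrame {
    let utteranceIndex: Int  // assumed zero-based, in enqueue order
    let samples: [Float]
}

// Returns the text of the utterance that produced this frame, if known.
func caption(for frame: SketchFrame, in utterances: [String]) -> String? {
    guard utterances.indices.contains(frame.utteranceIndex) else { return nil }
    return utterances[frame.utteranceIndex]
}

// Totals the samples received for each utterance index.
func sampleCounts(of frames: [SketchFrame]) -> [Int: Int] {
    var counts: [Int: Int] = [:]
    for frame in frames {
        counts[frame.utteranceIndex, default: 0] += frame.samples.count
    }
    return counts
}
```

A UI could call caption(for:in:) as each frame is played to highlight the matching text.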

danielrothmann · 2026-03-31T12:11:30Z

@Alex-Wengg the merge checks seem to pass but always stop on "test-tts" - I'm struggling to find more information about what that is or why it's not completing. Any idea?

Alex-Wengg · 2026-03-31T13:32:27Z

@danielrothmann that's fine, we can ignore test-tts.

danielrothmann added 3 commits March 29, 2026 22:36

Add support for PocketTTS voice sessions

fa6fb61

Remove drain functionality that wasn't needed

a43b16a

Fix an issue where generation might proceed after cancellation

fcf452c

danielrothmann added 2 commits March 30, 2026 14:08

Review: Use local utterance chunk index

265d4d6

Review: Switch on ml data type during KV cache clone

50e1c1a

Alex-Wengg approved these changes Mar 30, 2026

Merge branch 'main' into feature/pocket-tts-sessions

c871ad1

danielrothmann wants to merge 6 commits into FluidInference:main from 42futures:feature/pocket-tts-sessions

2 participants
