Sessions: expose turn-detection / server-VAD / barge-in config in @app.session #663

Description

@AbirAbbas

Follow-up from #654.

The session config the control plane sends to OpenAI (createOpenAIRealtimeCall in control-plane/internal/handlers/sessions.go) currently sets only type, model, instructions, audio.output.voice, and tool_choice. A session author has no way to configure turn detection, server-side VAD, or interruption (barge-in), all of which are table stakes for a usable voice UX.
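To make the gap concrete, here is a rough sketch of the payload shape, written in Python for brevity rather than as the actual Go handler. `build_session_config` is a hypothetical helper, the `"realtime"` value for `type` is an assumption, and the `audio.input.turn_detection` placement is my reading of the OpenAI realtime session schema and should be checked against the reference before anyone relies on it.

```python
from typing import Any, Dict, Optional


def build_session_config(
    model: str,
    instructions: str,
    voice: str,
    tool_choice: str = "auto",
    turn_detection: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper mirroring the fields listed in this issue."""
    config: Dict[str, Any] = {
        "type": "realtime",  # assumed value; the issue only says a type is sent
        "model": model,
        "instructions": instructions,
        "audio": {"output": {"voice": voice}},
        "tool_choice": tool_choice,
    }
    if turn_detection is not None:
        # Assumed placement: audio.input.turn_detection.
        # Copy the dict so later caller mutations do not leak into the payload.
        config["audio"]["input"] = {"turn_detection": dict(turn_detection)}
    return config


# Today's behaviour: no turn-detection block is sent at all.
current = build_session_config("example-model", "Be brief.", "example-voice")

# Proposed behaviour: the author's options are forwarded as given.
proposed = build_session_config(
    "example-model",
    "Be brief.",
    "example-voice",
    turn_detection={"type": "server_vad", "interrupt_response": True},
)
```

The only difference between `current` and `proposed` is the extra `audio.input` block, which is the part this issue asks the control plane to start forwarding.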

Why it matters

  • Without server VAD + turn detection config, an app can't tune when the model considers a turn complete, what the silence thresholds are, or whether the user can interrupt the model mid-utterance. These are the difference between a demo and something people will actually talk to.

Acceptance criteria (behavior)

  • @app.session(...) (and the TS/Go equivalents) accept turn-detection / VAD options (e.g. turn_detection type, threshold, silence duration, create_response/interrupt_response) and the control plane forwards them into the OpenAI realtime session config.
  • Defaults produce a sane interruptible voice experience (barge-in works out of the box).
  • Validation rejects unsupported combinations explicitly (consistent with the existing provider/transport validation philosophy — no silent inference).
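One possible shape for the SDK side, as a minimal Python sketch and not a proposed final API. `TurnDetection`, its field names, and the specific rule that `threshold` and `silence_duration_ms` only apply to `server_vad` are illustrative assumptions; the real set of types and per-type options should come from the OpenAI reference. The intent is only to show interruptible defaults and explicit rejection instead of silent inference.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

# Assumed set of supported turn-detection types; verify against the provider docs.
SUPPORTED_TYPES = ("server_vad", "semantic_vad")


@dataclass(frozen=True)
class TurnDetection:
    """Hypothetical options object an SDK could accept in @app.session(...)."""

    type: str = "server_vad"
    threshold: Optional[float] = None          # None means "use the provider default"
    silence_duration_ms: Optional[int] = None  # None means "use the provider default"
    create_response: bool = True
    interrupt_response: bool = True            # barge-in on by default

    def validate(self) -> None:
        if self.type not in SUPPORTED_TYPES:
            raise ValueError(f"unsupported turn_detection type: {self.type!r}")
        vad_only_set = self.threshold is not None or self.silence_duration_ms is not None
        if self.type != "server_vad" and vad_only_set:
            # Reject explicitly instead of dropping the options silently.
            raise ValueError(
                "threshold and silence_duration_ms require type='server_vad'"
            )
        if self.threshold is not None and not 0.0 <= self.threshold <= 1.0:
            raise ValueError("threshold must be between 0.0 and 1.0")
        if self.silence_duration_ms is not None and self.silence_duration_ms < 0:
            raise ValueError("silence_duration_ms must be non-negative")

    def to_payload(self) -> Dict[str, Any]:
        """Validate, then build the dict the control plane would forward."""
        self.validate()
        payload: Dict[str, Any] = {
            "type": self.type,
            "create_response": self.create_response,
            "interrupt_response": self.interrupt_response,
        }
        if self.threshold is not None:
            payload["threshold"] = self.threshold
        if self.silence_duration_ms is not None:
            payload["silence_duration_ms"] = self.silence_duration_ms
        return payload
```

With this shape, `TurnDetection().to_payload()` yields an interruptible server-VAD config without the author setting anything, while `TurnDetection(type="semantic_vad", threshold=0.5).to_payload()` raises `ValueError` rather than quietly ignoring the threshold. Whether validation lives in each SDK, in the control plane, or in both is left open here.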

Ref: OpenAI realtime session turn_detection config.

Metadata

Assignees

No one assigned

    Labels

      • area:ai (AI/LLM integration)
      • area:control-plane (Control plane server functionality)
      • area:sdk (Cross-SDK (Python + Go + TS) parity work)
      • enhancement (New feature or request)
      • help wanted (Extra attention is needed)
      • sdk:go (Go SDK related)
      • sdk:python (Python SDK related)
      • sdk:typescript (TypeScript SDK related)
