Turn any form into a conversational experience. Three ways to fill it out — traditional form, AI-powered chat, or voice.
```bash
npm install talking-forms
```

```tsx
import { TalkingForm } from "talking-forms"

function App() {
  return (
    <TalkingForm
      endpoint="/api/form"
      context="Coffee shop order"
      fields={[
        { name: "name", label: "Name", type: "text", required: true },
        { name: "drink", label: "Drink", type: "select", required: true, options: ["Espresso", "Latte", "Cappuccino"] },
        { name: "size", label: "Size", type: "select", required: true, options: ["Small", "Medium", "Large"] },
        { name: "extras", label: "Extras", type: "textarea" },
      ]}
      onSubmit={(data) => console.log(data)}
    />
  )
}
```

Users see a tabbed interface with three views — Form, Chat, and Voice. All three share state: fill a field in one view and it syncs to the others.
The `endpoint` prop points to a backend route on your server that handles LLM question generation, so API keys never reach the browser.
```ts
// app/api/form/route.ts
import { createHandler } from "talking-forms/server"

const handler = createHandler({
  provider: "openai",
  apiKey: process.env.OPENAI_API_KEY,
})

export const POST = handler
```

```ts
import express from "express"
import { createHandler } from "talking-forms/server"

const app = express()

app.post("/api/form", createHandler({
  provider: "openai",
  apiKey: process.env.OPENAI_API_KEY,
}))
```

| Provider | Value |
|---|---|
| OpenAI | "openai" |
| Anthropic | "anthropic" |
| Google Gemini | "google" |
```ts
createHandler({
  provider: "anthropic",
  apiKey: process.env.ANTHROPIC_API_KEY,
  model: "claude-sonnet-4-5-20250929", // optional
})
```

Voice mode uses ElevenLabs for text-to-speech (reading questions aloud) and speech-to-text (transcribing answers). Add a second server route:
```ts
// app/api/voice/route.ts
import { createVoiceHandler } from "talking-forms/server"

export const POST = createVoiceHandler({
  apiKey: process.env.ELEVENLABS_API_KEY,
  voiceId: "optional-voice-id", // defaults to Rachel
})
```

Then point the component at it:

```tsx
<TalkingForm
  endpoint="/api/form"
  voiceEndpoint="/api/voice"
  fields={[...]}
  onSubmit={...}
/>
```

The voice tab shows a hold-to-speak microphone button. Questions are spoken aloud via TTS, answers are transcribed via STT, and the same field validation and fuzzy matching apply.
```ts
type Field = {
  name: string
  label: string
  type: "text" | "email" | "tel" | "number" | "textarea" | "select" | "date"
  required?: boolean
  placeholder?: string
  options?: string[] // for type: "select"
  validation?: {
    min?: number
    max?: number
    pattern?: string
    message?: string
  }
}
```

`endpoint`: Server route for LLM question generation.
`onSubmit`: Called when the user completes the form from any view.
```tsx
<TalkingForm
  onSubmit={(data) => {
    // data is Record<string, string>
    console.log(data)
  }}
/>
```

`context`: Describes what the form is for. Sent to the LLM so it generates contextually relevant questions.
```tsx
<TalkingForm context="Coffee shop order" />
// LLM asks "What size drink would you like?" instead of "What size do you wear?"
```

`voiceEndpoint`: Server route for ElevenLabs TTS/STT. Required for the voice tab to work.
```tsx
<TalkingForm voiceEndpoint="/api/voice" />
```

`defaultView`: Which tab to show first. Default: `"form"`.
```tsx
<TalkingForm defaultView="chat" />
// or
<TalkingForm defaultView="voice" />
```

`chatGreeting`: Custom first message in chat view.
```tsx
<TalkingForm chatGreeting="Welcome! Let's get your order started." />
```

`theme`: Override colors and border radius.
```tsx
<TalkingForm theme={{ primary: "#6366f1", radius: "0.5rem" }} />
```

`onFieldChange`: Called whenever a field value changes in any view.
```tsx
<TalkingForm onFieldChange={(name, value) => console.log(name, value)} />
```

- Questions are generated by the LLM based on your field definitions and context
- Questions are prefetched one step ahead and cached — so responses feel instant
- User answers are validated client-side (email regex, number ranges, pattern matching)
- Select fields show clickable chips (small sets) or a searchable dropdown (large sets)
- Fuzzy matching handles typos and punctuation (Levenshtein distance)
- Progress bar tracks answered + skipped fields
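The fuzzy matching above can be sketched as follows. This is an illustrative helper, not the library's actual implementation: a classic Levenshtein distance, with case and punctuation normalized away before comparing, and a small distance tolerance that scales with option length.

```typescript
// Edit distance between two strings (single-row dynamic programming).
function levenshtein(a: string, b: string): number {
  const dp: number[] = Array.from({ length: b.length + 1 }, (_, i) => i)
  for (let i = 1; i <= a.length; i++) {
    let prev = dp[0]
    dp[0] = i
    for (let j = 1; j <= b.length; j++) {
      const tmp = dp[j]
      dp[j] = Math.min(
        dp[j] + 1, // deletion
        dp[j - 1] + 1, // insertion
        prev + (a[i - 1] === b[j - 1] ? 0 : 1), // substitution
      )
      prev = tmp
    }
  }
  return dp[b.length]
}

// Pick the closest select option, or null if nothing is a near miss.
function matchOption(input: string, options: string[]): string | null {
  const clean = (s: string) => s.toLowerCase().replace(/[^\p{L}\p{N}]/gu, "")
  const needle = clean(input)
  let best: string | null = null
  let bestDist = Infinity
  for (const opt of options) {
    const dist = levenshtein(needle, clean(opt))
    if (dist < bestDist) {
      best = opt
      bestDist = dist
    }
  }
  // Accept only near misses; tolerance grows with option length.
  if (best === null) return null
  return bestDist <= Math.max(1, Math.floor(clean(best).length / 4)) ? best : null
}

matchOption("capuccino!", ["Espresso", "Latte", "Cappuccino"]) // "Cappuccino"
```

Normalizing before measuring distance is what lets both typos ("capuccino") and stray punctuation ("latte!") resolve to a valid option.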
- Same question flow as chat, but spoken aloud via ElevenLabs TTS
- User holds the mic button to record, releases to send
- Audio is sent to ElevenLabs STT (Scribe v2, English) for transcription
- Transcribed text goes through the same validation + fuzzy matching
- If a select option doesn't match, the voice reads out the available options
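The last fallback could look something like this sketch. The helper name and exact wording are assumptions, not the library's API; it just shows how a list of options might be turned into a spoken re-prompt.

```typescript
// Hypothetical helper: build the TTS re-prompt read out when a transcribed
// answer matches none of a select field's options.
function optionsPrompt(label: string, options: string[]): string {
  const spoken =
    options.length > 1
      ? options.slice(0, -1).join(", ") + ", or " + options[options.length - 1]
      : options[0] ?? ""
  return `Sorry, I didn't catch that. For ${label.toLowerCase()}, you can choose ${spoken}.`
}

optionsPrompt("Size", ["Small", "Medium", "Large"])
// "Sorry, I didn't catch that. For size, you can choose Small, Medium, or Large."
```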
- Standard form with dynamic field rendering based on your config
- Validation on blur with inline error messages
- All field types supported: text, email, tel, number, textarea, select, date
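The client-side validation described above can be sketched like this. It is a hypothetical helper, not the library's internals, covering three of the checks mentioned: the email regex, number ranges, and a custom `pattern` from a field's `validation` config.

```typescript
type Validation = { min?: number; max?: number; pattern?: string; message?: string }

// Returns an error message, or null when the value is valid.
function validateField(
  type: "text" | "email" | "number",
  value: string,
  rules: Validation = {},
): string | null {
  if (type === "email" && !/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value)) {
    return rules.message ?? "Please enter a valid email address."
  }
  if (type === "number") {
    const n = Number(value)
    if (Number.isNaN(n)) return rules.message ?? "Please enter a number."
    if (rules.min !== undefined && n < rules.min) return rules.message ?? `Must be at least ${rules.min}.`
    if (rules.max !== undefined && n > rules.max) return rules.message ?? `Must be at most ${rules.max}.`
  }
  if (rules.pattern && !new RegExp(rules.pattern).test(value)) {
    return rules.message ?? "Invalid format."
  }
  return null
}

validateField("email", "bob@example.com") // null (valid)
validateField("number", "11", { max: 10, message: "Max 10 cups." }) // "Max 10 cups."
```

A `validation.message` overrides the default error text, matching the `message` slot in the `Field` type.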
All three views share the same state via `useTalkingForm`. Fill a field in chat, switch to form view — it's already there. Progress syncs across all tabs.
Select fields adapt their UI based on option count:
- 8 or fewer options — clickable chip buttons in chat, spoken in voice
- More than 8 options — searchable dropdown with type-to-filter
Fuzzy matching handles typos, punctuation, and partial matches automatically.
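The cutoff of 8 comes from the behavior described above; as a tiny illustrative sketch (the helper itself is an assumption, not an exported API):

```typescript
// Decide how a select field renders based on how many options it has.
function selectUI(options: string[]): "chips" | "dropdown" {
  return options.length <= 8 ? "chips" : "dropdown"
}

selectUI(["Small", "Medium", "Large"]) // "chips"
```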
Full type safety out of the box.
```ts
import type { Field, TalkingFormProps, FormData } from "talking-forms"
import type { CreateHandlerOptions, CreateVoiceHandlerOptions } from "talking-forms/server"
```

MIT