docs: document Ollama + record demo against gemma3 #24

Merged
JohnMcLear merged 3 commits into main from docs/demo-gif-improvements on May 3, 2026

Conversation

@JohnMcLear (Member)

Summary

  • README gains a dedicated Ollama (self-hosted, local models) section under "LLM provider examples": config snippet, model-tag note (gemma3:latest, llama3.1:8b, …), the apiKey quirk (Ollama ignores the bearer but the OpenAI client must still send a non-empty string), Docker host networking, and a caveat that 3–4B local models occasionally drift off the JSON edit format.
  • demo.gif / demo.png regenerated against a real local Ollama server running gemma3:latest instead of the previous mock LLM. The chat exchange in the gif now shows gemma3's own reply ("✅ Smoother phrasing") and an edit ("we'd like … during playtime") that's actually generated by the model, not a scripted response.
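The README's actual config snippet isn't reproduced in this PR description. As a rough sketch only (the key names are assumptions about the plugin's settings shape, not copied from the README), the Ollama wiring could look like:

```json
{
  "ep_ai_chat": {
    "baseURL": "http://localhost:11434/v1",
    "apiKey": "ollama",
    "model": "gemma3:latest"
  }
}
```

The `apiKey` value is arbitrary: Ollama ignores the bearer token, but the OpenAI client refuses an empty string, so any non-empty placeholder works. From inside a container, `localhost` won't resolve to the host's Ollama; Docker host networking, or `host.docker.internal` on Docker Desktop, stands in for it.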

The internal _demo-gif build pipeline (kept outside this repo) was rewired in the same pass: pre-flight checks Ollama via /api/tags, warms the model, waits for Etherpad's dev-mode /p/:pad route handler to register before recording, and the steps file now waits on a non-placeholder "AI Assistant" chat reply (instead of the mock's literal "Tightened…" string).
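The pipeline itself lives outside this repo, but the first two pre-flight steps can be sketched in a few lines of shell. This is illustrative, not the real script: the function names are made up, and it assumes `curl` plus Ollama's default port. `/api/tags` lists locally pulled models, and an empty-prompt `/api/generate` call loads a model into memory without producing output.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight sketch; names and JSON shapes are assumptions,
# not taken from the actual _demo-gif pipeline.

# Is Ollama up, and is the model already pulled? A missing model would
# otherwise stall the recording on a cold pull mid-take.
preflight() {
  url="$1"; model="$2"
  curl -fsS --max-time 3 "$url/api/tags" | grep -q "\"$model\""
}

# Warm the model so the first recorded chat turn isn't paying load time.
# An empty prompt makes Ollama load the model and return immediately.
warm() {
  url="$1"; model="$2"
  curl -fsS --max-time 120 "$url/api/generate" \
    -d "{\"model\":\"$model\",\"prompt\":\"\",\"keep_alive\":\"10m\"}" >/dev/null
}

if preflight "http://localhost:11434" "gemma3:latest"; then
  warm "http://localhost:11434" "gemma3:latest"
  echo "ollama ready"
else
  echo "ollama not reachable or model missing" >&2
fi
```

The remaining steps (waiting on Etherpad's `/p/:pad` route and on a non-placeholder chat reply) are browser-driver work and depend entirely on the recorder in use.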

Test plan

  • End-to-end build: _demo-gif/build-ep_ai_chat-demo.sh runs clean against a local Ollama (gemma3:latest) and writes both assets (verified locally — 51 s recording, gemma3 ~13 s for the edit call).
  • README rendered locally — Ollama block reads alongside the existing Anthropic/OpenAI/Gemini/GitHub Models/xAI examples.
  • CI green on this branch.

🤖 Generated with Claude Code


- README: add a dedicated Ollama section under "LLM provider examples"
  with config, model-tag note, apiKey quirk (Ollama ignores the header
  but the OpenAI client must still send a non-empty Bearer string),
  Docker host networking, and a small-model edit caveat.
- demo.gif / demo.png: regenerate against a real local Ollama server
  (gemma3:latest) instead of the previous mock. The chat exchange now
  shows gemma3's own response ("✅ Smoother phrasing") and the edit
  ("we'd like … during playtime") attributed to the AI author colour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@JohnMcLear force-pushed the docs/demo-gif-improvements branch from 86d19cd to d65622a on May 3, 2026 17:01
JohnMcLear added 2 commits on May 3, 2026 18:09
The Ollama+gemma3 take was 49 s end-to-end, but ~28 s of that was
just "Thinking..." while the local model produced a reply — dead
air for a viewer. Re-encode the same recording through ffmpeg's
mpdecimate filter to drop near-duplicate frames (the static
Thinking... segment), repack timestamps with setpts, then append a
2.5 s freeze on the final frame so the loop seam is obvious.

Result: 9.1 s loop, 924 KB GIF (down from 49 s / 2.6 MB). Alice's
typing is still visible at full pace — mpdecimate's default
thresholds drop frames where almost no pixels changed, so the
typing animation survives but the spinner-dwells collapse. Final
still is the post-edit state: surgical attribution clearly visible
("we'd like" and "during" are AI-purple, surrounding words stay in
Alice's colour, Bob's line untouched in pink).

This was a one-shot post-process on the cached webm rather than a
re-record, so it doesn't burn another LLM round-trip. The
build-ep_ai_chat-demo.sh wrapper can fold the same mpdecimate
filter into its ffmpeg invocation as a follow-up.

Replaces the gemma3:4b take with one driven by qwen2.5:1.5b — a
~5× smaller model that responds in ~5 s on CPU vs ~30 s for
gemma3, so subsequent rebuilds will be faster end-to-end. After
the mpdecimate trim the final GIF length is essentially identical
to the gemma3 take (8.7 s vs 9.1 s; mpdecimate eats the dead air
either way), so the speed win is in the developer loop, not the
recorded artefact.

What qwen2.5 chooses to do is different from gemma3, though:
where gemma3 surgically tweaked Alice's body sentence
("we would love" -> "we'd like", "at" -> "during"), qwen2.5
rewrites the title line into a longer formal opener
("Letter to the Football Coach" -> "Dear Coach. Please find below
a proposal for increased football practice at playtime."). The
surgical-attribution story is still visible (Alice's body line and
Bob's sentence stay in their original colours; only the rewritten
title is AI-purple), it's just less of a "look how few words
changed" moment because the title has fewer shared tokens between
FIND and REPLACE.

GIF 874 KB / 8.7 s; PNG 146 KB. Same mpdecimate filter chain as
the previous trim:
  mpdecimate=hi=2304:lo=512:frac=0.05,setpts=N/FRAME_RATE/TB,
  scale=900:-1:flags=lanczos,fps=12,
  tpad=stop_duration=2.5:stop_mode=clone
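Folding that chain into a single `-vf` argument, as the follow-up suggests for build-ep_ai_chat-demo.sh, could look like the sketch below. The input/output names and the omitted palette pass are assumptions, not from the build script; only the filter values come from the commit above.

```shell
# Assemble the trim filter used for the GIF (values from the commit above).
FILTER="mpdecimate=hi=2304:lo=512:frac=0.05"      # drop near-duplicate frames
FILTER="$FILTER,setpts=N/FRAME_RATE/TB"           # repack timestamps after the drops
FILTER="$FILTER,scale=900:-1:flags=lanczos,fps=12" # 900 px wide, GIF frame rate
FILTER="$FILTER,tpad=stop_duration=2.5:stop_mode=clone" # 2.5 s freeze on last frame

echo "$FILTER"

# Hypothetical invocation (file names assumed; a palettegen/paletteuse
# pass would usually follow for better GIF colour quality):
#   ffmpeg -i demo.webm -vf "$FILTER" demo.gif
```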
@JohnMcLear merged commit 60fda13 into main on May 3, 2026
2 checks passed
JohnMcLear added a commit that referenced this pull request on May 3, 2026
The take that landed in PR #24 still showed Bob's "Many of us play..."
sentence visibly indented by one column because the steps file typed
it with a literal leading space. The cleanup that used to strip it was
a property of the ep_ai_chat-specific *mock* LLM, not of the plugin —
once the demo switched to a real LLM (gemma3:4b), nothing was
removing the space, and the visual implied an asymmetry the plugin
isn't actually creating.

Fix is upstream of the AI: drop the leading space from Bob's typed
string in steps-ep_ai_chat.mjs (Alice's Enter already gives a clean
line break, so the defensive space was never necessary). Re-record
on top of that.

Result: Bob's line now sits flush against the line-number gutter as
intended, and the rest of the demo's story is unchanged — gemma3
surgically tightens Alice's sentence ("we would love to play more
football at playtime" -> "we'd appreciate more football time at
playtime"), the changed words are AI-purple, the surrounding words
stay in Alice's colour, Bob's contribution is untouched.

GIF 1.1 MB / 9.5 s; PNG 135 KB. Same mpdecimate trim filter chain
as PR #24's final commit.