feat(web): iOS-style nested cursor model picker (closes #48)#49

Open
heavygee wants to merge 29 commits into main from feat/cursor-picker-ios-nested

Conversation

@heavygee

Owner

Summary

Implements the iOS-style nested config view for the Cursor model picker (issue #48). When a non-Default base is picked, the 29-row model list collapses to Default + selected base + "Change model…" with the Variant section (if any) immediately below instead of buried under the full list.

Why

Cursor deprioritized the standard ACP `configOptions` migration, so HAPI's parameterized-picker UX needs to live with the current data shape long-term. The minority of users who pick a specific model deserve a focused configure experience instead of scrolling past every other model to find the Variant rows.

Follows #47 (Default-highlight fix in flat mode).

Behavior

  • No model selected (Default): Full model list + no Variant section (unchanged)
  • Non-Default base selected: Default radio + selected-base radio + "Change model…" link + Variant section (when applicable)
  • "Change model…" re-expands the full list
  • Picking from the expanded list auto-collapses back to the focused view
  • External session.model change → auto-resets to focused view for the new model (or full list when reset to Default)

Cursor-flavor only. Claude / Codex / Gemini / OpenCode pickers unaffected.

Implementation

  • New pure helper `filterCursorModelOptionsForCompactView` in `web/src/lib/cursorModelOptions.ts`
  • New local state `cursorBrowsingAllModels` in `HappyComposer.tsx`, reset on any `selectedModelBase` change
  • i18n key `misc.changeModel` for en + zh-CN
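
A minimal sketch of how the compact-view helper could behave, assuming an option shape of `{ value, label }` in which `null` stands for Default. The parameter list and the `ModelOption` type are assumptions for illustration; the real signature in `web/src/lib/cursorModelOptions.ts` may differ.

```typescript
// Hypothetical option shape; the real type in cursorModelOptions.ts may differ.
interface ModelOption {
    value: string | null // null stands for "Default" in this sketch
    label: string
}

// Returns the rows to render: the full list while browsing or when Default
// is selected, otherwise only Default plus the selected base.
function filterCursorModelOptionsForCompactView(
    options: ModelOption[],
    selectedBase: string | null,
    browsingAll: boolean
): ModelOption[] {
    if (browsingAll || selectedBase === null) {
        return options
    }
    return options.filter((o) => o.value === null || o.value === selectedBase)
}
```

In this sketch the "Change model…" link would set `browsingAll` to true, and any change of the selected base would reset it to false.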

Test plan

  • `bun typecheck` clean
  • `bunx vitest run src/lib/cursorModelOptions.test.ts` - 15/15 pass (4 new for the helper)
  • `bun run test` from root - 974 / 976 pass; the 2 failures are the known-flaky `cli/runner.integration.test.ts` version-mismatch test (the same failures appear on the #47 base run, and a web-only diff cannot cause them)
  • Dogfood in driver soup once #47 merges - manifest layer queued; operator scheduling permitting
  • Falsification on a live cursor session: pick `claude-fable-5`, confirm list collapses to Default + claude-fable-5 + Change model link, Variant rows immediately below. Click Change model, confirm full list returns. Pick Default, confirm full list returns.

Out of scope

Animations, gesture handling, forward/back navigation stack. All viable as follow-ups if needed; minimal-viable is two render branches gated on local state.

Made with Cursor

heavygee and others added 25 commits June 8, 2026 13:26
* fix(web/markdown): disable single-dollar inline math in remark-math

Default remark-math configuration treats $...$ as inline LaTeX, which
turns any prose containing two currency amounts (e.g. "save $400 vs the
$200 plan") into a KaTeX block — paragraphs collapse, whitespace is
stripped, the running text is re-rendered as math symbols.

Pass `singleDollarTextMath: false` to remarkMath so single $ is plain
text. Block math `$$...$$` (on its own line) still renders, matching
GitHub-flavored markdown semantics.

Single source of truth: MARKDOWN_PLUGINS is shared by MarkdownText,
Reasoning, and MarkdownRenderer — fix lands in all three surfaces.

Adds 3 regression tests that drive the unified pipeline end-to-end:
prose with multiple "$N" amounts produces no `class="katex"` and no
`<math>` element; `$$...$$` block math still does.

Co-authored-by: Cursor <cursoragent@cursor.com>

* chore(web): declare unified/remark-parse/remark-rehype/hast-util-to-html

The new markdown-text regression test imports these directly to drive the
unified pipeline end-to-end. They were resolving via transitive deps from
remark-math and rehype-katex, which is fragile — a future dep upgrade can
remove the transitives and break the test.

Declare them explicitly under devDependencies. No code change; lockfile
records the same versions that were already installed transitively
(unified@11.0.5, remark-parse@11.0.0, remark-rehype@11.1.2,
hast-util-to-html@9.0.5).

Addresses the HAPI Bot review finding on PR tiann#805.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
…tiann#680) (tiann#815)

Replaces hardcoded "Claude Code" strings in voice context injections
with the active session's flavor label (Cursor, Codex, Gemini, etc.)
via getFlavorLabel() from @hapi/protocol. Falls back to "coding agent"
for unknown or missing flavors.

Threads an agentLabel string param through:

- formatMessage
- formatNewSingleMessage
- formatNewMessages
- formatHistory
- formatSessionFull
- formatPermissionRequest
- buildSessionVoiceContextPlan (passes to its internal formatMessage call)

voiceHooks adds a single getAgentLabel(session) helper that reads
session.metadata.flavor and resolves the display label once per call.

Tests updated to cover the new parameter shape and to assert that
"Claude Code" is never substituted regardless of agent flavor.

formatReadyEvent already used a generic "coding agent" phrasing and
needs no signature change.
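
A rough sketch of the label resolution described above. The label table and the `VoiceSessionLike` type are assumptions for illustration; the real `getFlavorLabel()` lives in `@hapi/protocol` and may cover more flavors or use different strings.

```typescript
// Hypothetical label table standing in for getFlavorLabel() from @hapi/protocol.
const FLAVOR_LABELS = new Map<string, string>([
    ['claude', 'Claude Code'],
    ['cursor', 'Cursor'],
    ['codex', 'Codex'],
    ['gemini', 'Gemini'],
])

// Minimal session shape assumed for this sketch.
interface VoiceSessionLike {
    metadata?: { flavor?: string | null } | null
}

// Resolves the display label once per call, falling back to a generic
// phrase for unknown or missing flavors.
function getAgentLabel(session: VoiceSessionLike): string {
    const flavor = session.metadata?.flavor
    if (!flavor) return 'coding agent'
    return FLAVOR_LABELS.get(flavor) ?? 'coding agent'
}
```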

Closes tiann#680

Co-authored-by: Cursor <cursoragent@cursor.com>
… limit) (tiann#823)

* fix(cursor): requeue user message on transient agent exit (auth, rate limit)

cursorLegacyRemoteLauncher.runMainLoop popped a user message off the queue
before spawning `agent` and silently discarded it whenever `agent` exited
non-zero (auth expiry, rate limit, transient network). The wrapper logged
the failure at debug level only, never surfaced it to the web UI, and
emitted `ready` as if a normal turn had ended.

Capture stderr from the spawned process; classify exit-1 with a transient
signature (Authentication required, rate limit, ETIMEDOUT, ECONNRESET,
EAI_AGAIN) as recoverable; re-head the message via `queue.unshift`, surface
a friendly banner via `sendSessionEvent({type:'message',...})`, and back off
~2s before the loop picks it up again. Cap at 5 consecutive transient
failures, after which the message is dropped with a clear "resolve and
resend" event so we never spin forever on a genuinely broken auth.

Non-transient non-zero exits also surface the stderr to the UI now (instead
of only the local ring buffer), so a real crash is visible to the operator.

Backoff is overridable via CURSOR_LEGACY_TRANSIENT_BACKOFF_MS for tests.

Tests cover: success path, transient auth requeue + banner, rate-limit
banner, non-transient crash surfaced without requeue, and the 5-failure
drop cap.
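
The classification and retry cap can be sketched roughly as follows. The function names, the `Decision` type, and the exact regexes are assumptions for illustration, and the sketch reads "cap at 5" as "the fifth consecutive transient failure drops the message"; the launcher's actual bookkeeping may differ.

```typescript
// Stderr signatures treated as transient, taken from the list in this commit.
const TRANSIENT_SIGNATURES: RegExp[] = [
    /authentication required/i,
    /rate limit/i,
    /ETIMEDOUT/,
    /ECONNRESET/,
    /EAI_AGAIN/,
]
const MAX_TRANSIENT_FAILURES = 5

// Only exit code 1 with a matching signature counts as recoverable.
function isTransientExit(exitCode: number | null, stderr: string): boolean {
    if (exitCode !== 1) return false
    return TRANSIENT_SIGNATURES.some((re) => re.test(stderr))
}

type Decision = 'done' | 'requeue' | 'drop' | 'surface-error'

// Decides what to do with the popped message after the agent process exits.
// `priorTransientFailures` counts consecutive transient failures before this one.
function decideAfterExit(
    exitCode: number | null,
    stderr: string,
    priorTransientFailures: number
): Decision {
    if (exitCode === 0) return 'done'
    if (!isTransientExit(exitCode, stderr)) return 'surface-error'
    return priorTransientFailures + 1 >= MAX_TRANSIENT_FAILURES ? 'drop' : 'requeue'
}
```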

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(cursor): preserve slash-command isolation on requeue; wait for stderr flush

Two findings from the cold-review bot on the requeue path:

1. `enqueueCursorUserMessage` uses `pushIsolated` for pass-through slash
   commands (e.g. `/compress`) so they never batch with sibling prompts.
   The transient-requeue path used plain `unshift`, which dropped the
   isolate bit and allowed the next collected batch to merge the slash
   command with a sibling - changing command semantics. Add
   `MessageQueue2.unshiftIsolated` and use it when the popped batch was
   isolated or when `parseCursorSpecialCommand` recognises the message.

2. `runAgentProcess` resolved on `child.on('exit', ...)`. Node may emit
   `exit` while the stderr pipe is still draining, so a fast "auth
   required" error printed-and-exited could be classified as
   non-transient with empty stderr and silently drop the user message -
   the exact bug this PR was supposed to fix. Resolve on `close` instead,
   which waits for stdio streams to flush.

Adds a unit test that requeues `/compress` after a transient auth failure
and asserts the second spawn still receives the slash command alone (not
batched with a sibling).

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(cursor): restrict transient retry to exit code 1 only

Upstream codex review tiann#823 (Minor): the helper treated any non-zero exit
with matching stderr as transient, which could requeue a signal-killed
(SIGTERM 143, SIGKILL 137) or crashed (SIGABRT 134) process whose stderr
happens to contain a keyword like "rate limit". Documented contract is
exit-1-for-transient; tighten the classifier accordingly.

Adds regression test covering exit 143 + rate-limit stderr → no retry.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(cursor): clean up transientBackoff abort listener on timer completion

Upstream codex review tiann#823 (Minor): transientBackoff added an abort
listener with { once: true } but only removed it when the abort fired.
Because the launcher reuses one AbortController, repeated transient
retries accumulated stale listeners until the next abort.

Switch to a single completion path that clears the timer AND removes the
abort listener whichever side wins.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(cursor): cap in-memory stderr capture at 8 KB

Upstream codex review tiann#823 (Minor): runAgentProcess accumulated every
stderr chunk for the full child lifetime. A noisy `agent` failure could
grow CLI process memory without bound even though only the first 400
chars are ever displayed. Cap the retained copy at 8 KB; debug log of the
full stream is unchanged.
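
One way to bound the retained copy, as a sketch only. `appendCapped` is a hypothetical helper, and it counts string length (UTF-16 code units) rather than bytes as a simplification. It keeps the head of the stream, which matches the note that only the first 400 chars are displayed.

```typescript
const STDERR_CAP = 8 * 1024

// Appends a chunk to the retained text, never letting it grow past `cap`.
// Once the cap is reached, later chunks are ignored for the retained copy.
function appendCapped(retained: string, chunk: string, cap: number = STDERR_CAP): string {
    const room = cap - retained.length
    if (room <= 0) return retained
    return retained + chunk.slice(0, room)
}
```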

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
…#825)

* fix(hub): preserve flavor session ids in metadata across archive transitions

When a session ends (terminate, crash, local-launch failure, handoff),
the runner's archive write replaces sessions.metadata wholesale. If the
CLI's locally cached Metadata is null (e.g. Zod parse failed at bootstrap
and api.ts nulled it out) or stale, the spread in archiveAndClose ships
a sparse blob and the resume token (cursorSessionId, codexSessionId,
claudeSessionId, etc.) gets cleared from the row even though the
on-disk chat data is still intact.

Fix at the hub layer because update-metadata is the single chokepoint
for every metadata write surface (CLI, web, future): in the store-level
updateSessionMetadata, read the prior row's metadata inside a
transaction and carry forward a small allowlist of flavor resume tokens
when the incoming write omits them. Explicit overwrites still win.

The allowlist mirrors pickExistingSessionMetadata in sessionFactory.ts
which already preserves the same fields on bootstrap.

Closes tiann#820

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(hub): address cold-review findings on metadata merge

Three bot findings on the initial patch:

1. (P1) Sparse archive payloads still resulted in metadata blobs that
   failed MetadataSchema parse downstream — required `path`/`host` were
   not in the carry-forward set, so even though the resume token
   survived, hub session cache and CLI getSession nulled-out the row
   and resume_unavailable came back. Add PARSE_IDENTITY_FIELDS = `path`,
   `host` to the carry-forward.

2. (P2) Preserving `cursorSessionProtocol` whenever it was omitted
   carried a stale protocol over to a freshly written `cursorSessionId`,
   misrouting a future remote resume. Pair-aware logic: drop the prior
   protocol when next sets a new id; preserve the protocol only when
   next is silent on both id and protocol.

3. (P2) The successful update-metadata broadcast emitted the pre-merge
   payload to other CLIs in the session room, so even though the DB row
   was preserved, peer caches diverged. Switch the broadcast value to
   `result.value` (the persisted merged value) so live caches stay in
   sync with the truth.

Refactor preserveProtocolResumeFields into mergeSessionMetadata with
two tiers (PARSE_IDENTITY_FIELDS + SIMPLE_RESUME_TOKENS) plus the
cursor pair handler. 6 new tests cover the regressions; existing 16
still pass plus 1 new socket-level test for the broadcast.
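
The pair-aware rule from finding 2 might look like the sketch below. `CursorPair` and `mergeCursorPair` are hypothetical names, and the sketch treats any id supplied by `next` as a new id, which is a simplification of whatever comparison the real handler performs.

```typescript
interface CursorPair {
    cursorSessionId?: string
    cursorSessionProtocol?: string
}

// Merges the cursor id/protocol pair so a stale protocol never rides along
// with a freshly written id.
function mergeCursorPair(prior: CursorPair, next: CursorPair): CursorPair {
    const merged: CursorPair = {}
    const id = next.cursorSessionId ?? prior.cursorSessionId
    if (id !== undefined) merged.cursorSessionId = id

    if (next.cursorSessionProtocol !== undefined) {
        // Explicit protocol in next always wins.
        merged.cursorSessionProtocol = next.cursorSessionProtocol
    } else if (next.cursorSessionId === undefined && prior.cursorSessionProtocol !== undefined) {
        // next is silent on both id and protocol: keep the prior protocol.
        merged.cursorSessionProtocol = prior.cursorSessionProtocol
    }
    // Otherwise next set an id without a protocol: the prior protocol is dropped.
    return merged
}
```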

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(hub): preserve flavor + machineId across sparse metadata merges

Bot P2 on the prior fix: PARSE_IDENTITY_FIELDS (path, host) made the
blob parseable and SIMPLE_RESUME_TOKENS preserved the chat-id, but
flavor and machineId were still being dropped by sparse archive
payloads. Consequences:

- flavor: hub/src/web/routes/sessions.ts and sync/syncEngine.ts read
  `metadata?.flavor ?? 'claude'` to pick which session id field to
  resume. With flavor missing, a Cursor/Codex/Gemini session was
  routed as Claude and the preserved cursorSessionId was ignored.

- machineId: telegram/bot.ts and the CLI's resumable listing read
  `metadata?.machineId` to scope sessions to the current host. With
  machineId missing, the row dropped out of the resume picker.

Add a third carry-forward tier ROUTING_FIELDS = [flavor, machineId]
between PARSE_IDENTITY_FIELDS and SIMPLE_RESUME_TOKENS in
mergeSessionMetadata. 3 new tests cover preservation, no-invention,
and explicit override.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(hub,cli): support explicit-clear sentinel for carry-forward fields

Upstream cold-review (Major): the carry-forward semantics introduced
in the prior commits ("omit field → preserve from prior") collide
with cli/src/codex/session.ts resetCodexThread(), which intentionally
clears codexSessionId by deleting it from the metadata blob before
calling updateMetadata. With omit-as-preserve, the cleared id was
restored from the prior row and /clear on a Codex session no longer
dropped the persisted thread.

Add an explicit-clear sentinel: when next sets a carry-forward field
to `null`, the merge drops the key entirely from the persisted blob
(key removed; not stored as null since MetadataSchema fields are
`string().optional()`). `undefined` (key missing from next) keeps its
"carry forward" meaning. The two semantics now compose cleanly:

  - next.field = "x"   → next wins (caller sets a new value)
  - next.field = null  → drop the field (caller intentionally clears)
  - next omits field   → carry forward prior (caller didn't touch it)

Update resetCodexThread() to send `codexSessionId: null` so the
reset actually drops the persisted thread under the new merge.
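
The three cases can be sketched as a small merge function. The field list below is an illustrative subset and `SessionMetadataBlob` is a hypothetical type; the real `mergeSessionMetadata` also has the tiering and cursor pair handling described in the earlier commits.

```typescript
type SessionMetadataBlob = Record<string, unknown>

// Illustrative subset of carry-forward fields; the real allowlist is larger.
const CARRY_FORWARD_FIELDS = ['path', 'host', 'flavor', 'machineId', 'codexSessionId', 'claudeSessionId']

function mergeSessionMetadata(
    prior: SessionMetadataBlob | null,
    next: SessionMetadataBlob
): SessionMetadataBlob {
    const merged: SessionMetadataBlob = { ...next }
    for (const field of CARRY_FORWARD_FIELDS) {
        if (merged[field] === null) {
            // Explicit clear: remove the key instead of storing null.
            delete merged[field]
        } else if (merged[field] === undefined) {
            // Omitted by the caller: carry forward the prior value if there is one.
            const previous = prior?.[field]
            if (previous !== undefined && previous !== null) {
                merged[field] = previous
            } else {
                delete merged[field]
            }
        }
        // Otherwise next supplied a value and it wins as-is.
    }
    return merged
}
```

Under this sketch, a reset that sends `{ codexSessionId: null }` drops the thread id while `path` and `host` still carry forward from the prior row.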

4 new hub tests cover: explicit clear of a single token, clear-one-
preserve-others independence, no-op clear on a never-set field, and
the success-ack value reflects the cleared blob. cli/src/codex tests
(224/224) and hub suite (301/301) green; bun typecheck clean.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
…ive rows (tiann#826)

* feat(hub+web): add POST /sessions/:id/reopen + Reopen button on inactive rows

Archived sessions retain their full transcript and metadata in the DB, but
today there is no path back to them from the web UI; the only way to revive
one is shell access plus sqlite metadata patching plus a manual /resume call.

This change adds a single one-click affordance:

- Hub: new POST /api/sessions/:id/reopen route on the existing sessions
  router. The route delegates to a new engine method `reopenSession` that:
  - is idempotent (active session -> 200 with `resumed:false`),
  - validates Cursor sessions still have a `cursorSessionId` once they have
    any messages (otherwise we cannot resume the agent thread),
  - clears `lifecycleState='archived'`, `archivedBy`, `archiveReason` via a
    versioned metadata update, and stamps `lifecycleStateSince`,
  - defaults `cursorSessionProtocol='stream-json'` for pre-tiann#799 Cursor
    sessions (sessions that have a `cursorSessionId` but no protocol set),
    so routing still reaches the legacy launcher; ACP sessions keep their
    explicit protocol,
  - forwards to the same `resumeSession` path the existing /resume route
    uses, including the `canFreshSpawnNeverStartedSession` fallback.

  422 is returned with `{ missing: [...] }` when the agent metadata needed
  to resume is gone; other engine errors map to 404/409/503/500 with the
  existing shape (mirrors /resume).

- Web: a "Reopen" entry in the SessionActionMenu that appears next to
  "Delete" on inactive sessions only. Wired into both the SessionList rows
  and the SessionHeader more-menu, with a small dismissable error dialog
  for the 422 missing-metadata case.

- Tests: route-level coverage for the four response shapes (200 reopen,
  200 idempotent, 404, 422) plus 409/503 error mappings; sessionCache
  tests for the archive-metadata clear (including the legacy Cursor
  protocol default); React component test for the menu item rendering on
  inactive vs active sessions; mutation hook test for the api wiring and
  the ApiError surface needed by the UI.

Closes tiann#819

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(reopen): address codex review findings on fork PR #33

Four P2 findings from the cold-review bot, three fixed and one explained:

1. Mutation now returns the reopen response so the UI can route to a possibly
   different sessionId. SyncEngine.resumeSession may merge the row into a
   freshly-spawned session id (matching the send-message resume flow); the
   chat view now navigates there, the row list calls onSelect on the new id.
2. reopenSession on the client now goes through `request()` instead of a
   hand-rolled fetch, so 401 + onUnauthorized refresh works the same as
   every other session action. `request()` now throws `ApiError` (with
   status/code/body) on non-401 errors - backward compatible because
   ApiError extends Error.
3. (Reply only) Pre-tiann#799 Cursor protocol propagates correctly without the
   extra plumbing the bot suggested: `clearSessionArchiveMetadata` writes
   `cursorSessionProtocol='stream-json'` to the DB; the CLI's
   `bootstrapExistingSession` preserves it via `pickExistingSessionMetadata`;
   if it's still absent at the launcher, `isLegacyCursorSession` defaults
   to stream-json whenever `cursorSessionId` is present.
4. Archive metadata is now restored when resume fails. `reopenSession`
   captures a snapshot of `lifecycleState`/`archivedBy`/`archiveReason`/
   `lifecycleStateSince` before the clear; if `resumeSession` returns an
   error (no machine online, spawn timeout, etc.), the snapshot is put
   back via the new `SessionCache.restoreSessionArchiveMetadata`. Engine
   test covers both the rollback and the no-rollback-on-success cases.

Error rendering helper moved to `web/src/lib/reopenError.ts` so the chat
header and the session row share one implementation, and gained a unit test.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(web): preserve engine error codes in ApiError.code on /reopen

`/sessions/:id/reopen` returns `{ error, code }` where `code` is the stable
taxonomy (`no_machine_online`, `resume_unavailable`, etc.) and `error` is the
human-readable message. The generic `request()` error path was reading only
`parsed.error`, so `ApiError.code` ended up being a message like
"No machine online" rather than `no_machine_online`, breaking taxonomy-based
branching in web callers.

`parseErrorCode` now prefers `parsed.code` and falls back to `parsed.error`
for legacy routes that only set `error`. Added api/client.test.ts covering
the three response shapes /reopen actually emits (503 with code, 500 without
code, 422 with missing[]).
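
A minimal sketch of the preference order, assuming the parsed body is an arbitrary JSON value; the real `parseErrorCode` may take a different argument type.

```typescript
// Prefers the stable `code` taxonomy and falls back to `error` for legacy
// routes that only set `error`.
function parseErrorCode(parsed: unknown): string | undefined {
    if (typeof parsed !== 'object' || parsed === null) return undefined
    const body = parsed as { code?: unknown; error?: unknown }
    if (typeof body.code === 'string') return body.code
    if (typeof body.error === 'string') return body.error
    return undefined
}
```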

Addresses upstream codex-action review on tiann#826.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(reopen): restore archive metadata exactly on rollback (drop fresh lifecycleStateSince)

For an archived session that predates `lifecycleStateSince` (the field is
absent from its metadata), `clearSessionArchiveMetadata` stamps a fresh
timestamp. If `resumeSession` then fails, the rollback was leaving that
fresh timestamp in place, making the rolled-back row look like it was
just archived rather than preserving the original lifecycle age.

`restoreSessionArchiveMetadata` now does an EXACT restore: when a snapshot
field is undefined the corresponding key on the metadata is deleted, not
left alone. Applies symmetrically to lifecycleState / archivedBy /
archiveReason / lifecycleStateSince. Test updated to assert the deletion
of the fresh timestamp.
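
The exact-restore behavior can be illustrated with a pure snapshot/restore pair. This is a sketch under assumptions: `ArchivableMetadata`, `ArchiveSnapshot`, and `snapshotArchiveMetadata` are hypothetical, and the real `SessionCache.restoreSessionArchiveMetadata` works against a versioned store rather than a plain object.

```typescript
const ARCHIVE_KEYS = ['lifecycleState', 'archivedBy', 'archiveReason', 'lifecycleStateSince'] as const
type ArchiveKey = (typeof ARCHIVE_KEYS)[number]
type ArchiveSnapshot = Partial<Record<ArchiveKey, unknown>>
type ArchivableMetadata = Record<string, unknown>

// Captures the archive fields before they are cleared.
function snapshotArchiveMetadata(metadata: ArchivableMetadata): ArchiveSnapshot {
    const snapshot: ArchiveSnapshot = {}
    for (const key of ARCHIVE_KEYS) {
        if (metadata[key] !== undefined) snapshot[key] = metadata[key]
    }
    return snapshot
}

// Exact restore: keys absent from the snapshot are deleted, not left alone,
// so a timestamp stamped during the clear does not survive the rollback.
function restoreSessionArchiveMetadata(
    metadata: ArchivableMetadata,
    snapshot: ArchiveSnapshot
): ArchivableMetadata {
    const restored: ArchivableMetadata = { ...metadata }
    for (const key of ARCHIVE_KEYS) {
        if (snapshot[key] === undefined) {
            delete restored[key]
        } else {
            restored[key] = snapshot[key]
        }
    }
    return restored
}
```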

Addresses upstream codex-action review on tiann#826.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
…eam-json path (closes tiann#822) (tiann#828)

* fix(cursor): drop timing heuristic from tiann#784 intercept; scan raw payload (tiann#801 follow-up)

PR tiann#801 shipped a two-strategy intercept for the synthetic AskQuestion
skip response in legacy stream-json mode. Real-traffic data from a
post-merge run shows the marker-match strategy never fires (the
converter's `extractToolResult` discards the marker for tool shapes it
does not recognize, returning `{}`) and the timing-signature
defense-in-depth strategy fires only on false positives - notably the
Anthropic Vertex Claude tool calls cursor-agent surfaces in legacy
sessions, which all land as `name=unknown` with the `{}` extracted
result and frequently complete under the 500 ms threshold.

Measured on a single legacy-resumed session (`7b769423`): 1,136
`name=unknown` tool calls, 16 rewritten as `no_input_surface`, zero
actual marker strings stored anywhere in the session. The 16 rewrites
were legitimate fast tool calls (Anthropic Vertex `toolu_vrtx_*` IDs)
mischaracterized as fabricated skip responses.

Changes:
- Remove the timing-signature heuristic and its supporting state
  (started-at map, elapsed-ms calculation, latency threshold, test-only
  state reset).
- Move the marker scan from the post-`extractToolResult` output to the
  raw `tool_call` payload, so it can see the marker on stream-json
  shapes the converter does not specifically recognize. Function-shaped
  tools exclude `function.arguments` from the scan to avoid matching
  agent-controlled input. Other shapes scan the full payload (no
  agent-input field exists at the top level).
- Refresh tests: drop timing-based positive cases, add a marker-in-raw-
  payload positive case for `name=unknown` shapes, and add a regression
  that legitimate fast `name=unknown` tool calls without the marker
  pass through with `status: completed`.
- Document scope: this intercept now lives only on the legacy stream-
  json path, which only resumed pre-ACP sessions hit. New cursor remote
  sessions go through `cursorAcpBackend` and the `cursor/ask_question`
  ACP extension method (tiann#799) - immune to this bug. The intercept
  drains with the legacy session population.

Tracking: tiann#784. Builds on tiann#801, complements tiann#799.

* fix(cursor): exclude agent input from marker scan; surface top-level Anthropic tool names (Codex P2)

Codex flagged a false-positive case on the fork-stage review of this
branch (#35, P2): an Anthropic tool_use shape with a
top-level `name` (e.g. `{id, name: 'TodoWrite', input: { ... }}`) gets
labelled `name=unknown` by the converter and passes the AskQuestion
gate. If the agent's `input` quotes the synthetic-skip marker - which
happens whenever an agent edits or documents this very bug - the
intercept would rewrite a perfectly fine TodoWrite as a fabricated
skip.

Two-part fix:

1. `extractToolName` now reads the top-level `name` field as a final
   fallback. A real `TodoWrite` / `Bash` / `str_replace_based_edit_tool`
   surfaces with its actual name and is rejected by the AskQuestion
   gate before the marker scan runs. The original AskQuestion
   fabrication case still surfaces as `unknown` (per tiann#784 issue body
   the name is stripped in the fabricated payload) and remains
   detectable.

2. Defense in depth: introduce `AGENT_INPUT_KEYS = {input, args,
   arguments}` and exclude these from the non-function shape's marker
   scan. Even if a tool reaches this code path with `name=unknown` and
   the marker buried in its `input`, the intercept won't fire on agent-
   controlled text.

Two new regression tests:

- Anthropic tool_use shape `{id, name: 'TodoWrite', input: {todos: [
  marker]}}` → passes through with `status: 'completed'`.
- `name=unknown` shape with marker only inside `input` → passes through
  with `status: 'completed'`.
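
A sketch of a marker scan that ignores agent-controlled input. The function name is hypothetical and the marker is passed in as a parameter because the actual marker string is not shown here. As a simplification, the sketch skips `input`, `args`, and `arguments` at every nesting depth, which also covers `function.arguments`; the real scan distinguishes function-shaped payloads from other shapes.

```typescript
const AGENT_INPUT_KEYS = new Set(['input', 'args', 'arguments'])

// Recursively looks for `marker` in string values of a raw tool_call payload,
// skipping any subtree that holds agent-controlled input.
function payloadContainsMarker(value: unknown, marker: string): boolean {
    if (typeof value === 'string') return value.includes(marker)
    if (Array.isArray(value)) return value.some((item) => payloadContainsMarker(item, marker))
    if (typeof value === 'object' && value !== null) {
        return Object.entries(value).some(
            ([key, child]) => !AGENT_INPUT_KEYS.has(key) && payloadContainsMarker(child, marker)
        )
    }
    return false
}
```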

All 20/20 tests pass; typecheck clean (cli + web + hub).
tiann#835)

Fixes incomplete cliModelSkus while agent acp holds the CLI lock (tiann#831) and
replaces the single-pid ACP lock with a cross-process refcount (tiann#832). The web
picker merges machine/session catalogs and waits for SKU readiness before showing
variant labels.

Fixes tiann#831
Fixes tiann#832

Co-authored-by: Cursor <cursoragent@cursor.com>
…#836)

* fix(web): dedupe sidebar sessions by flavor resume id

Wire deduplicateSessionsByAgentId into SessionList and resolve cursor
threads via cursorSessionId so resume/archive no longer shows duplicate
inactive rows for the same ACP session.

Fixes tiann#833

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(web): use SessionSummary.agentSessionId for sidebar dedup

SessionList only receives SessionSummary from the API; native ids like
cursorSessionId are already mapped into metadata.agentSessionId by
toSessionSummary. Drop resolveAgentSessionIdFromMetadata to fix typecheck.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(web): hide inactive empty session stubs in sidebar

Filter inactive rows with no agentSessionId and no title signal before
grouping sessions, and expose lifecycleState on SessionSummary for future
sidebar rules. Completes the tiann#833 P0 follow-up alongside agent-id dedup.

Fixes tiann#833

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(web): scope sidebar dedup key by flavor

Prevent cross-flavor collisions when flattened agentSessionId retains a
stale native id. Add regression test and relax claudeRemote CI timeout.
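
A sketch of a flavor-scoped dedup key, assuming a flattened summary shape. `SummaryLike`, `dedupKey`, and `dedupeByFlavorScopedKey` are hypothetical names, and the "prefer the active row" tie-break is an assumption rather than the documented behavior of `deduplicateSessionsByAgentId`.

```typescript
// Hypothetical flattened summary shape for this sketch.
interface SummaryLike {
    id: string
    active: boolean
    flavor: string | null
    agentSessionId: string | null
}

// Scoping the key by flavor keeps a stale native id from one flavor from
// colliding with the same id string under another flavor.
function dedupKey(s: SummaryLike): string {
    return s.agentSessionId
        ? `${s.flavor ?? 'unknown'}:${s.agentSessionId}`
        : `session:${s.id}`
}

function dedupeByFlavorScopedKey(sessions: SummaryLike[]): SummaryLike[] {
    const byKey = new Map<string, SummaryLike>()
    for (const s of sessions) {
        const key = dedupKey(s)
        const existing = byKey.get(key)
        // Assumed tie-break: an active row replaces an inactive duplicate.
        if (!existing || (s.active && !existing.active)) byKey.set(key, s)
    }
    return Array.from(byKey.values())
}
```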

Fixes tiann#833

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
)

Pre-write resume token into session metadata before awaiting session/load
so hub and web see cursorSessionId immediately after resume spawn.

Fixes tiann#834

Co-authored-by: Cursor <cursoragent@cursor.com>
…rimitive (tiann#798)

* feat(web): scratchlist v1.1 — composer-toggle drawer + reusable FUE primitive

The v1 always-visible amber band proved too heavy for what's a 20% feature
in a typical session. v1.1 dials it back to a composer-toggle that opens
an on-demand drawer, paired with a reusable FUE (First-User Experience)
primitive so existing operators get a subtle pulsing dot + on-click
explainer the first time they see the toggle.

UX changes:
- Notepad icon in the composer toolbar (next to schedule-send) toggles
  scratchlist mode. Drawer renders only while mode is on.
- Composer's send button repaints amber and reads "Send to scratchlist"
  while the mode is sticky; submit routes additions into the scratchlist
  instead of the chat. Click the icon again to leave.
- Small entry-counter badge appears on the toggle when entries exist;
  empty-state shows just the icon (no zero-state guilt UI).

New reusable FUE primitive:
- web/src/lib/use-fue.ts: state machine (unseen → engaging → acknowledged)
  with localStorage persistence, namespaced under hapi.fue.v1.<featureId>
  so it can't collide with any future upstream onboarding flow.
- web/src/components/Fue.tsx: <FueDot> (small pulsing badge) and
  <FueCallout> (portal-rendered popover with title/body + "Got it"
  affirmative-action dismiss). No auto-timeout — reading speed varies
  and silent disappearance undercuts user trust.
- AGENTS.md adds an "Adding new web features — consider an FUE" section
  so future contributors discover the primitive.
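
The FUE state machine can be sketched without React by injecting the storage. `KeyValueStore`, `loadFueState`, `engageFue`, and `dismissFue` are hypothetical names, and whether the real hook persists the intermediate `engaging` state is an assumption here; only the key namespace and the three states come from the description above.

```typescript
type FueState = 'unseen' | 'engaging' | 'acknowledged'

// Minimal storage interface; window.localStorage satisfies it in a browser.
interface KeyValueStore {
    getItem(key: string): string | null
    setItem(key: string, value: string): void
}

const fueKey = (featureId: string): string => `hapi.fue.v1.${featureId}`

function loadFueState(store: KeyValueStore, featureId: string): FueState {
    const raw = store.getItem(fueKey(featureId))
    return raw === 'engaging' || raw === 'acknowledged' ? raw : 'unseen'
}

// unseen -> engaging. Idempotent, and a no-op once acknowledged.
function engageFue(store: KeyValueStore, featureId: string): FueState {
    const current = loadFueState(store, featureId)
    if (current !== 'unseen') return current
    store.setItem(fueKey(featureId), 'engaging')
    return 'engaging'
}

// Any state -> acknowledged, only on an explicit "Got it" action.
function dismissFue(store: KeyValueStore, featureId: string): FueState {
    store.setItem(fueKey(featureId), 'acknowledged')
    return 'acknowledged'
}
```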

Refactors:
- ScratchlistPanel.tsx: split rendering into <ScratchlistInventory>
  (presentational list) and <ScratchlistDrawer> (composer-controlled
  drawer with hint copy). Original <ScratchlistPanel> kept exported
  for the existing fixture-based tests.
- SessionChat.tsx: scratchlist state lifted into useScratchlist hook
  so the composer-toolbar counter and the drawer share one source of
  truth. onSend wrapped to route through scratchlist.add when mode
  is on.

Tests:
- 9 useFue hook tests (initial state, engage idempotency, no
  auto-acknowledge, dismiss, featureId switching, post-acknowledged
  engage no-op, resetFue helper).
- 5 placement helper tests (above/below switching, viewport edge
  clamping, visualViewport offset support).
- All 21 existing scratchlist lib tests + 14 ScratchlistPanel tests
  continue to pass.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(scratchlist): prevent cross-session leak in useScratchlist hook

Per upstream review on PR tiann#798 (github-actions[bot] [Major]):

  > useScratchlist persists current `entries` whenever `sessionId`
  > changes. On A -> B navigation, React first commits with B's id
  > and A's entries; after paint, this persist effect can write A's
  > entries to hapi.scratchlist.v1.B before the rehydrate effect
  > loads B. The previous keyed panel existed specifically to avoid
  > this race.

Lifting state out of the v1 panel (which sidestepped the race via
key={props.session.id} forced remount) re-introduced this same data-
loss window. The composer-controlled drawer in v1.1 cannot remount
on session change because its parent SessionChat doesn't either.

Fix: keep the loaded sessionId in state alongside the entries so they
swap atomically, and persist against the LOADED sessionId rather than
the prop. After A->B, the loaded sessionId is still A until rehydrate
runs, so a spurious persist re-writes A's storage with A's entries -
a no-op instead of a corruption.

Tests:
- New use-scratchlist.test.ts with 6 tests:
  - hydrates from localStorage on mount
  - add() persists to current session's storage only
  - rerender to a new session preserves the new session's existing entries
  - after switching, add() targets the new session
  - regression test that spies on Storage.prototype.setItem and asserts
    the rerender lifecycle never produces a (B-key, A-entries) write
  - remove()/move() target the loaded sessionId
- The setItem-spy test correctly fails against the buggy code (verified
  by temporarily reverting the fix) and passes with the fix in place.
- Full web suite: 88 files, 756 tests, all green. Typecheck clean.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(scratchlist): route attachment/scheduled submits to chat instead of dropping them

Per upstream review on PR tiann#798 (github-actions[bot] [Major]):

  > Prevent scratchlist mode from dropping attachments — in scratchlist
  > mode the wrapper returns success after adding only `text`, while
  > HappyComposer still treats composer attachments as sendable input.
  > A text+attachment submit therefore routes through this branch,
  > stores only the text, and silently discards the attachment instead
  > of sending or preserving it.

Same hazard applies to scheduledAt: scratchlist entries are pure-text
notes - they can't represent attachments or schedule metadata - so any
submit carrying either MUST fall through to props.onSend (chat) even
when the scratchlist toggle is on. Otherwise the wrapper short-circuits
to scratchlist.add(text), reports success to the composer, and the
composer dutifully clears attachments + schedule that the user just
queued.

Fix: extracted the routing rule into shouldRouteToScratchlist(mode,
attachments, scheduledAt) - returns true only when mode is on AND the
payload is pure text. onSendForComposer uses it.
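
A sketch of the routing rule. The parameter names come from the description above, while the parameter types are assumptions.

```typescript
// Returns true only when scratchlist mode is on AND the payload is pure text.
// Anything carrying attachments or a schedule must fall through to chat.
function shouldRouteToScratchlist(
    mode: boolean,
    attachments: readonly unknown[] | undefined,
    scheduledAt: number | null | undefined
): boolean {
    const hasAttachments = (attachments?.length ?? 0) > 0
    return mode && !hasAttachments && scheduledAt == null
}
```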

Tests:
- 5 new shouldRouteToScratchlist unit tests (mode off, mode on +
  text-only, mode on + attachments, mode on + schedule, mode on + both)
- All in web/src/components/SessionChat.test.ts (13 tests total now)
- Full web suite: 88 files, 761 tests, all green. Typecheck clean.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(scratchlist): clear pendingSchedule when scratchlist-mode submission falls back to chat

Per upstream review on PR tiann#798 (github-actions[bot] [Major]):

  > The follow-up change correctly falls through to props.onSend when
  > scratchlist mode is on but scheduledAt is present, yet the
  > accepted-send cleanup still checks only !scratchlistMode. That
  > means a scheduled chat send made while the amber scratchlist UI is
  > active is accepted, but pendingSchedule stays set, so the next
  > normal send can accidentally reuse the same schedule.

Fix: handleSend now gates the cleanup branch on the actual route taken
(routedToScratchlist) rather than the scratchlist UI state. Reuses the
same shouldRouteToScratchlist helper so route + cleanup share a single
source of truth.

Tests:
- 2 new tests in SessionChat.test.ts that pin the decision matrix
  handleSend depends on:
  - 'cleanup gate: scheduled chat send while scratchlist toggle is on
     still clears schedule'
  - 'cleanup gate: pure-text scratchlist add does NOT clear schedule'
- Full web suite: 88 files, 763 tests, all green. Typecheck clean.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(scratchlist): UnifiedButton must reflect actual routing, not raw scratchlist toggle

Per upstream review on PR tiann#798 (github-actions[bot] [Major]):

  > Send button advertises scratchlist routing even when the submit
  > will go to chat — shouldRouteToScratchlist correctly falls back
  > to normal chat for attachments or scheduledAt, but UnifiedButton
  > still turns amber and labels the action as "Send to scratchlist"
  > whenever scratchlistMode is true. A scheduled send or attachment
  > send made in that state will be submitted to chat while the UI
  > says it is being stashed, which can send content to the agent
  > unexpectedly.

Fix:
- UnifiedButton's prop renamed `scratchlistMode` -> `routesToScratchlist`
  to make the contract explicit: "this submit really will go to the
  scratchlist", not "the scratchlist toggle is on".
- The call site computes `routesToScratchlist` from
  `scratchlistMode && !hasAttachments && pendingSchedule == null`,
  mirroring SessionChat's shouldRouteToScratchlist exactly. The button
  is now amber + "Send to scratchlist" only when the actual send path
  will hit scratchlist; attachments / pending schedule force a chat-
  style render that matches the real routing.
- UnifiedButton exported so it can be unit-tested directly.

Tests:
- 3 new render tests in ComposerButtons.test.tsx covering:
  - routesToScratchlist=true → amber + "Send to scratchlist"
  - routesToScratchlist=false → black + "Send" (the regression case)
  - omitted prop → defaults to chat-style render
- Full web suite: 89 files, 766 tests, all green. Typecheck clean.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(scratchlist): exit scratchlist mode when promoting an entry to the composer

Per upstream review on PR tiann#798 (HAPI Bot, follow-up after b256fe5):

  > Found one major issue: promoting a scratchlist item to the composer
  > keeps scratchlist mode enabled, so the next send re-adds it to the
  > scratchlist instead of sending to chat.

Promoting an entry to the composer means "I want to send this for real
now". With scratchlist mode still on, the next composer submit routes
back to scratchlist (per the v1.1 modal-mode contract), so the user's
click loop becomes promote -> send -> re-add -> nothing-actually-sent.

Fix: ScratchlistDrawerHost now calls onExitScratchlistMode whenever it
promotes an entry to the composer. Promote-to-queue does NOT exit the
mode (queue path bypasses the wrapper anyway, and the operator may
still be capturing related notes).

Tests:
- Exported ScratchlistDrawerHost so its host-level callbacks can be
  unit-tested in isolation (previously only ScratchlistDrawer was
  testable; the wiring was untested).
- New SessionChat.exit-mode.test.tsx with 2 tests:
  - promote-to-composer fires setText AND onExitScratchlistMode
  - promote-to-queue fires onSend but does NOT exit mode
- Full web suite: 90 files, 768 tests, all green. Typecheck clean.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(scratchlist): Ctrl/Cmd+Shift+S toggles scratchlist mode (v1.1 hotkey)

The v1 always-visible panel had Ctrl/Cmd+Shift+S to expand the panel and
focus the input. v1.1 mounts the drawer only when scratchlistMode is on,
so the v1 listener (inside the panel) is dead code: it can't fire while
the drawer is unmounted, and the user has no way to open the drawer
without clicking the toolbar icon. Re-bind the shortcut at SessionChat
scope so it's always alive and toggles the mode.

Convention matches sibling globals (Ctrl/Cmd-m cycles agent model).
Ctrl/Cmd-Shift-S is unreserved by Chrome / Firefox / Safari (browser
Save As is Ctrl-S / Cmd-S, no Shift), so the user's save-page muscle
memory keeps working. Modifier requirement (Ctrl/Cmd+Shift) means it
can't collide with literal-character typing in any input - no focus
suppression needed.

The matcher is extracted to a pure helper isScratchlistToggleHotkey
so it's unit testable without mounting SessionChat. 6 new tests pin
the modifier matrix:
  - Ctrl+Shift+S (Linux/Windows) -> match
  - Cmd+Shift+S (macOS)          -> match
  - Cmd/Ctrl+S without Shift     -> reject (browser Save reservation)
  - bare S / Shift+S             -> reject (literal typing)
  - Ctrl+Shift+Alt+S             -> reject (avoid OS clashes)
  - other modifier+key combos    -> reject
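
The modifier matrix above could be implemented roughly as follows. This is a sketch only: HotkeyEventLike is a hypothetical structural subset of KeyboardEvent (so the example runs without a DOM), and accepting Ctrl and Cmd held together is an assumption the commit does not specify:

```typescript
// Hypothetical structural subset of KeyboardEvent.
interface HotkeyEventLike {
    key: string;
    ctrlKey: boolean;
    metaKey: boolean;
    shiftKey: boolean;
    altKey: boolean;
}

function isScratchlistToggleHotkey(e: HotkeyEventLike): boolean {
    // With Shift held the browser reports "S", so compare case-insensitively.
    if (e.key.toLowerCase() !== "s") return false;
    // Without Shift this is the browser Save shortcut (or literal typing).
    if (!e.shiftKey) return false;
    // Alt is rejected to avoid OS-level clashes.
    if (e.altKey) return false;
    // Require Ctrl (Linux/Windows) or Cmd (macOS).
    return e.ctrlKey || e.metaKey;
}
```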

Tooltip + FUE body now mention the hotkey so it's discoverable from
the same UI surface that introduces the feature (en + zh-CN).

Web suite 90 files / 774 tests, all green. Typecheck clean.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(scratchlist): hotkey skips dialogs / inputs / contentEditable

Bot finding on PR tiann#798 (PRRT_kwDOQuQOSc6HGtLn): the window-level
Ctrl/Cmd+Shift+S listener fires for every focus target, so the
shortcut can toggle scratchlist mode "behind" an open modal (rename
session, schedule picker, FUE callout, image preview), making the
next composer send route to scratchlist instead of chat. UX bug.

Add isScratchlistHotkeyBlockedTarget(target) and gate the listener
on it. Block targets:

  - any descendant of an open [role="dialog"] (Radix UI's
    DialogContent renders role="dialog"; FueCallout, ScheduleTimePicker,
    ImagePreview also use role="dialog")
  - HTMLInputElement (single-line inputs)
  - HTMLSelectElement
  - any contentEditable host (with attribute-based fallback for jsdom,
    which doesn't implement isContentEditable)

NOT blocked:
  - HTMLTextAreaElement (the composer textarea is the expected focus
    target when the operator presses the hotkey - blocking would
    defeat the shortcut)
  - the document body / unfocused targets

8 new unit tests pin the matrix. Function is exported / pure so
callers can reuse the same blocked-target rule for future global
shortcuts. Suggested fix from the bot applied modulo:

  - Use !== null on closest() result (explicit boolean for return type)
  - Add attribute-based contentEditable fallback for jsdom test env

* feat(scratchlist): copy-to-clipboard action on each entry

Add a per-entry "Copy to clipboard" button between the send-to-queue and
delete actions. On click, write the entry text via the shared
safeCopyToClipboard helper (which already handles the navigator.clipboard
primary path + the execCommand fallback for Safari / non-secure-context
edges); on success, briefly flip the icon to a check and the
aria-label/title to "Copied!" for 1500ms so the operator gets visual +
screen-reader confirmation. Failures (clipboard denied AND execCommand
fallback unavailable) silently no-op rather than throw at the click
handler.

Mirrored across both surfaces:
  - ScratchlistInventory (used by the v1.1 composer-toggle drawer)
  - ScratchlistPanel inline list (the v1 always-visible panel)

A small useCopiedFeedback() hook owns the "which entry just got copied"
state + the 1.5s auto-clear timeout. Pure state machine; the caller
wires safeCopyToClipboard separately so the hook itself stays free of
jsdom clipboard quirks. Cleared on unmount via the standard ref-tracked
timeout pattern, so promote-and-navigate-away can't leak.

Locale keys: scratchlist.action.copy / scratchlist.action.copied (en + zh-CN).

Three new tests:
  - v1 panel happy path: writeText called with the entry text, button
    flips to the "Copied!" label, entry is preserved (copy is non-destructive).
  - v1 panel failure path: writeText rejects AND execCommand returns
    false; button stays in "Copy to clipboard" state — no false success.
  - v1.1 drawer happy path: writeText called, label flips, and crucially
    no other entry handlers (onSend, onDelete, setText, onExitScratchlistMode)
    fire — copy is independent of all the other actions.

Web suite 90 files / 785 tests, all green. Typecheck clean.

* fix(scratchlist): reset all per-session state via keyed wrapper

Bot finding on PR tiann#798 (PRRT_kwDOQuQOSc6HHOsa): when the operator
navigates between sessions on the same route (/sessions/A ->
/sessions/B), React reuses the SessionChat component instance.
Effects run AFTER the first paint, so for a single render window the
new session is rendered with the previous session's scratchlist
entries (useScratchlist's rehydrate-effect) AND drawer-open state
(scratchlistMode reset effect). Visual leak; drawer actions targeting
stale state.

Apply the bot's suggested fix verbatim modulo the type extraction:

    export function SessionChat(props) {
        return <SessionChatInner key={props.session.id} {...props} />
    }

Canonical React idiom for "fully reset state on prop change": the
keyed wrapper unmounts and re-mounts the inner component when
session.id changes, so every hook (useScratchlist's initial-state
factory, useState, useHappyRuntime, ...) starts fresh. This
supersedes the now-redundant effect-based reset:

  - useEffect(() => { setScratchlistMode(false) }, [session.id])  REMOVED

useScratchlist's atomic-loaded-sessionId persistence (added on the
prior PR round) stays as defense-in-depth for any caller that uses
the hook without the keyed-wrapper pattern.

Web suite 90 files / 785 tests, all green. Typecheck clean.

* fix(web): retain composer text on send failure (closes tiann#776)

When the message composer submits and the hub responds with a 4xx/5xx
or the fetch fails outright, assistant-ui clears the composer
synchronously the moment send is invoked. Without intervention the
operator's typed text is destroyed at exactly the moment they most
need it preserved. SessionChat additionally clears any pending
schedule on accept, so a failed scheduled send was also silently
downgrading to immediate on the next attempt.

Behaviour:

- useSendMessage exposes onError({ sessionId, text, scheduledAt, error })
  so the route can hand the input back to the composer. sessionId is
  the resolved target (post-resolveSessionId), so an inactive-session
  resume that resolves a new id, kicks off async navigation, then
  fails the POST restores into the resumed session's composer rather
  than the old one.

- router.tsx stores sendErrors keyed by sessionId. Per-session lookup
  replaces the clear-on-session-change effect, so errors do not bleed
  between sessions and a session-scoped failure persists across
  navigation.

- HappyComposer accepts ComposerSendError, restores text via
  api.composer().setText() once per failure id, and re-establishes any
  pending schedule via onSchedule({ type: 'absolute', ms: scheduledAt }).
  It renders a red ring on the composer wrapper and a role="alert"
  inline message; both clear the moment the operator types or sends.

- onError forks on input.attachments. Text-only sends use the
  composer-restore path (removeOptimisticMessage drops the row so the
  failed bubble does not duplicate the restored text). Attachment
  sends keep the legacy failed-bubble UX (status='failed' + in-thread
  retry button) because the composer-restore path can't reinstate
  uploaded attachment metadata. retryMessage extracts attachments
  from the stored optimistic message via getMessageAttachments so
  failed-bubble retry of an attachment send re-fires with its files.

Acceptance (issue tiann#776):

- Submit -> 500/502/503/network error -> composer text not cleared
- Submit -> 400/401/403 -> composer text not cleared, error inline
- Submit -> 2xx -> composer clears as today
- Operator can edit retained text and retry without re-typing
- Failed scheduled sends restore as scheduled, not as immediate

Tests in web/src/hooks/mutations/useSendMessage.test.tsx cover
text-only 4xx/5xx/network retention, scheduled-send carry-through,
optimistic-row removal on text-only failure, sessionId carry-through
under resolveSessionId, attachment failure fallback, and attachment
retry preservation. Full web suite passes (705 tests). bun typecheck
clean. No SCHEMA_VERSION bump (frontend-only).

* fix(test): correct AttachmentMetadata fixture shape + JSX namespace import

Two pre-existing test-only typecheck failures surfaced once scratchlist v1.1
was stacked into the driver soup.

* SessionChat.test.ts - the attachment() fixture used the legacy schema
  (kind, sizeBytes) instead of the current AttachmentMetadataSchema
  (filename, size, path). Updated to match the live shape so the cast
  is honest.

* ComposerButtons.test.tsx - JSX namespace is no longer global under
  the current TS lib config; switched the helper signature from
  JSX.Element to React's ReactElement (same runtime, named import).

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(scratchlist): soften panel chrome - drop strong amber fill, keep subtle accent border (tiann#812)

Per tiann#812 (and PR 827 from @swear01) the always-visible amber fill
on the scratchlist panel was too loud as a scroll element. This
swaps the warning *fill* for the chat-user-surface tone and uses
neutral text/pills/focus, but keeps the warning *border* as a soft
accent so the panel still reads as a different destination from a
normal user message.

The strong destination signal continues to live on the composer
Send button (it goes amber-500 only while scratchlist mode is
routing) and the active toggle button - those carry the
moment-of-action signal the user actually presses, and ComposerButtons
tests + the FUE copy already depend on that behavior, so they're
unchanged.

Credit to @swear01 (PR 827) for the styling note; this branch
absorbs that restyle and supersedes the Settings-toggle approach
because v1.1 hides the panel by default behind the composer drawer
toggle (no Settings entry needed).

Adds a regression-guard test asserting the panel uses the
chat-user-surface bg + warning-border (not the warning fill).

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat: add Fable model presets for Claude sessions

Claude Code 2.x accepts the fable / fable[1m] model aliases for
Fable 5. hapi passes the model string through verbatim, so adding
the presets to CLAUDE_MODEL_LABELS surfaces them in the new-session
and composer model pickers, labels, and the 1M context-window
heuristic.

* test: update modelOptions full-list assertions for Fable presets

Addresses review feedback on tiann#860: getModelOptionsForFlavor appends
every Claude preset, so the two complete-array expectations must
include the new fable entries.

When the shell exited inside the remote terminal view, the page kept
showing a banner ("Terminal exited with code 0.") and left the user
stranded with no obvious next step. On mobile this is awkward, and it
does not match the muscle memory from native terminal emulators where
typing `exit` closes the tab/window.

Schedule a goBack() shortly after `terminal:exit` fires so the user
briefly sees the exit info, then returns to the session chat (same
destination as the existing back arrow via useAppGoBack).

The auto-close timer is cleared on unmount, on sessionId change, and
when the socket reconnects after a transient drop so a stale exit
event cannot navigate away from a freshly reconnected terminal.

Closes tiann#856

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(opencode): use ACP-reported reasoning effort options

Expose thought_level options from OpenCode ACP to the web UI via RPC/API
instead of hardcoded presets, and validate effort values before setConfigOption.

Fixes tiann#852

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(opencode): sync hub effort after coerced setConfigOption

When resolveThoughtLevelEffort falls back to a different supported value,
roll back session state after a successful ACP update so keepalive and the
web UI do not keep advertising the rejected effort.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
… to ACP (tiann#844)

* feat(cursor): invisible sync-on-open migrator from legacy stream-json to ACP

Closes tiann#824

When the operator reopens a legacy stream-json Cursor session in HAPI,
the hub now transparently transplants its `~/.cursor/chats/<wsh>/<uuid>/store.db`
into `~/.cursor/acp-sessions/<uuid>/`, verifies it loads via `agent acp`,
flips `metadata.cursorSessionProtocol = 'acp'`, and removes the legacy
source - all before `resumeSession` returns. Subsequent opens are pure ACP.

The primary justification is safety, not feature parity. tiann#784 (`cursor-agent`
fabricates `Questions skipped by the user` responses in legacy stream-json
mode) still fires regularly in dogfood despite tiann#801's mitigation: the agent
ships destructive side effects against fabricated consent. Migration to ACP
closes the protocol-level door because the `AskQuestion` tool does not exist
on the ACP side, so there is nothing to fabricate.

working. That tradeoff was reasonable at the time. The accumulated tiann#784
evidence makes legacy sessions actively unsafe; this PR makes the upgrade
path invisible enough that users stop avoiding it.

A pre-PR spike established that legacy and ACP `store.db` files use the
identical SQLite schema; only the directory layout differs. The migrator
therefore:

1. Sanity-checks the source store and pre-flips state (`session.active`,
   `lifecycleState`, on-disk presence, target collision)
2. Optionally archives a stale-running row (`forceArchiveRunning: true` is
   the default for the auto-migrate path because the caller already
   verified `session.active === false`)
3. Atomically creates `~/.cursor/acp-sessions/<uuid>/` with mode `0o700`
4. Copies `store.db` and chmods to `0o600` (multi-user-host hardening)
5. Writes a minimal `meta.json` sidecar (`schemaVersion`, `cwd`, optional
   `title`) with mode `0o600`
6. Spawns `agent acp` under HAPI_HOME isolation and verifies the session
   loads via `session/load`. On long histories the verify also drives a
   trivial single-turn prompt; on short ones load-only is enough
7. Flips `cursorSessionProtocol = 'acp'` AND clears the
   `cursorMigrationState` banner flag in a SINGLE metadata write
8. Removes the legacy source store (only after verify succeeded and the
   protocol flip committed). The legacy `~/.cursor/chats` parent dir is
   left as-is

Every failure leaves the legacy state intact. No `rm` fires without a
verify success AND a committed protocol flip.

The transplant takes 15-20s on long histories (copy a multi-hundred-MB
store, spawn `agent acp`, replay thousands of notifications, tear down
the probe). Without a progress indicator the wait reads as "broken" to a
fresh reviewer. A minimal banner ships alongside the migrator:

- Hub sets `metadata.cursorMigrationState = 'in_progress'` BEFORE the
  long-running transplant. The session-cache refresh emits the existing
  `session-updated` SSE event (no new event type), so the web client
  picks it up in milliseconds. No client-side polling needed.
- Hub clears the flag in the SAME metadata write that flips
  `cursorSessionProtocol` to `'acp'` on success, so the banner disappears
  in the same render tick the chat re-renders as ACP - no flicker window.
- Hub clears the flag explicitly in the auto-migrate helper's `finally`
  on failure/exception, so the banner never gets stuck if migration
  falls back to the legacy launcher.
- Web renders an accessible (role=status, aria-live=polite) banner with
  an indeterminate spinner. Deliberately no fake percentage - we do not
  have phase data and a fake progress bar would lie.

This PR is intentionally sequenced AFTER swear01's three ACP mop-up PRs
(merged today as ad038bb, 8094b50, fa363c2), all of which are
prerequisite for safe concurrent ACP launches.

The verify probe spawns `agent acp` directly via `AcpVerifyProbe` under
HAPI_HOME isolation (the migrator overrides `HOME` to a temp dir for the
verify pass), so it never touches `<real-HAPI_HOME>/locks/agent-acp-active/`
at all. Per swear01's tiann#835 design note, the post-flip ACP launcher claims
the lock through the standard `registerActiveAcpTransport` entry and
behaves like any other concurrent ACP start. The migrator itself never
writes `pid` or `count` files directly.

The auto-migrate path is gated by `HAPI_CURSOR_LEGACY_AUTO_MIGRATE`. Set
to `0`, `false`, `no`, or `off` to suppress it entirely (legacy sessions
keep running through the existing stream-json launcher). Default is on.
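
A hedged sketch of how such a gate might be parsed. The variable name and the four disabling values come from the text above; the helper name isLegacyAutoMigrateEnabled, the trimming, and the case-insensitive match are assumptions:

```typescript
// Values that suppress auto-migration, per the description above.
const DISABLED_VALUES = new Set(["0", "false", "no", "off"]);

// Hypothetical helper: takes the environment as a plain record so it can
// be tested without touching the real process environment.
function isLegacyAutoMigrateEnabled(env: Record<string, string | undefined>): boolean {
    const raw = env["HAPI_CURSOR_LEGACY_AUTO_MIGRATE"];
    if (raw === undefined) return true; // default is on
    // Assumption: surrounding whitespace and letter case are ignored.
    return !DISABLED_VALUES.has(raw.trim().toLowerCase());
}
```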

A REST endpoint at `POST /api/sessions/:id/migrate-to-acp` allows
explicit migration of a single session outside the sync-on-open path
(e.g. for a specific cold archived session a user wants to re-engage).
Bulk migration surfaces (CLI subcommand, web button, bulk REST endpoint,
candidate-listing API) were deliberately stripped per reviewer feedback;
per-session sync-on-open + this escape hatch are the only two paths.

- 343 hub unit tests (4 new for the migration banner flag transitions,
  53 for the migrator core, 32 for the verify probe, 15 for the auto-
  migrate helper guard matrix, plus existing suites)
- 3 integration tests against a real `agent acp` (skipped by default
  unless `HAPI_CURSOR_LEGACY_MIGRATOR_INTEGRATION=1`)
- 10 web unit tests for the banner component (visibility paths + a11y)
- Typecheck clean across cli, web, hub

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(cursor): address upstream Codex review findings on tiann#844 (2 majors)

Finding 1 (Major) — verify probe `agentLookupHome` ignored `metadata.homeDir`
in service-account hub deployments. `migrateOne` resolved the legacy store
under `metadata.homeDir` (the recorded session-owner home) but the default
createProbe factory still set `agentLookupHome` from `this.deps.homeDir()`
(the hub user's home). On a service-account hub, the store lookup
succeeded but `agent acp` discovery fell back to the hub user's
`~/.local/bin`, so verify silently failed and sync-on-open quietly fell
back to legacy.

Fix:
- Widen `CursorLegacyMigratorDeps.createProbe` signature from
  `(env) => AcpVerifyProbe` to `(env, agentLookupHome) => AcpVerifyProbe`
- Default factory uses the passed `agentLookupHome`
- `verifyInTempHome` threads `opts.sourceHome` (already the resolved
  session-owner home) through as the 2nd arg

2 new regression tests pin the contract:
- service-account case (metadata.homeDir != deps.homeDir()): captured
  agentLookupHome MUST equal metadata.homeDir
- legacy session record (no metadata.homeDir): falls back to
  deps.homeDir() correctly

Finding 2 (Major) — `bun.lock` win32-x64 pinned to 0.20.0 while
`cli/package.json` required 0.20.1 (rebase artifact from the v0.20.0 →
v0.20.1 release commit landing in upstream/main between the original
spike and the rebase). Frozen-install Windows users would either get
the wrong native binary or have the lock rejected.

Fix: regenerated bun.lock so the entry resolves
`@twsxtd/hapi-win32-x64@0.20.1`. `bun install --frozen-lockfile` now
passes clean.

Test budget:
- 391 hub unit tests pass (2 new for createProbe agentLookupHome
  contract, +0 regressions)
- Typecheck clean across cli + web + hub
- `bun install --frozen-lockfile` clean

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
… supervision (tiann#814)

* feat(runner): HAPI_DISABLE_VERSION_HANDOFF opt-out for mtime self-restart

The heartbeat in cli/src/runner/run.ts triggers spawnHappyCLI(['runner','start'])
+ process.exit(0) when getInstalledCliMtimeMs() differs from startedWithCliMtimeMs.
The same mtime guard fires in controlClient.isRunnerRunningCurrentlyInstalledHappyVersion
when a fresh CLI invocation inspects the live runner.

For operators who own process supervision (systemd, tmux, custom rebuild
pipelines, etc.), source-file mtimes shift for reasons unrelated to npm
upgrades. The clean exit defeats Restart=on-failure under systemd and
leaves the machine offline.

Setting HAPI_DISABLE_VERSION_HANDOFF=1 in the runner's environment now skips
both checks while keeping the rest of the heartbeat (session pruning, state
file persistence) intact. Default behavior is unchanged for npm consumers.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(runner): preserve original argv across self-restart and verify handoff

The mtime-driven self-restart in cli/src/runner/run.ts spawned
`hapi runner start` with no arguments, then process.exit(0)'d
unconditionally after a 10s sleep. Two failure modes:

1. The forwarded `runner start-sync` lost the operator's --workspace-root
   flags (anything passed at the original invocation). Browse + spawn
   silently degraded to "no workspace roots".
2. If the replacement runner failed to come up at all (build was mid-flight,
   binary missing, etc.) the original runner still exited cleanly. Under
   systemd Restart=on-failure that means no runner is brought back, and
   the machine drops off the hub until manual intervention.

Changes:

- persistence.ts: add startedWithArgv?: string[] to RunnerLocallyPersistedState
- run.ts: snapshot process.argv.slice(2) at startup, persist it on initial
  state write and on every heartbeat, replay it as the new runner's argv
  (default to ['runner','start-sync'] when nothing was captured)
- controlClient.ts: new waitForRunnerHandoff(oldPid, {timeoutMs}) polls
  runner.state.json for a different live PID
- run.ts: only clearInterval + process.exit(0) when handoff is confirmed.
  On spawn failure or 30s timeout, refresh the mtime baseline (so we don't
  respawn-loop on the same drift) and stay alive so the machine keeps
  serving.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(runner): address Codex review findings on tiann#814

Two Major correctness fixes flagged by upstream Codex review on PR tiann#814:

(1) Stale-mtime poisoning on failed handoff (run.ts:854,867)

The previous failure paths assigned
  startedWithCliMtimeMs = installedCliMtimeMs
which the next heartbeat persisted to runner.state.json. Downstream
isRunnerRunningCurrentlyInstalledHappyVersion() then reported the
still-stale runner as current, masking the failure until the *next*
genuine mtime change. Symptom: an mtime change that briefly failed
to hand off would be silently forgotten.

Fix: leave startedWithCliMtimeMs immutable. Gate handoff entry on
a new nextHandoffAttemptAt timestamp; failure paths bump it by
HANDOFF_RETRY_BACKOFF_MS (5 min) via deferHandoffRetry(). The
heartbeat continues to write the honest "still on the old code"
mtime, and the runner naturally re-attempts after the cooldown.
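
The cooldown gate could look roughly like this. The names startedWithCliMtimeMs, nextHandoffAttemptAt, HANDOFF_RETRY_BACKOFF_MS and deferHandoffRetry come from this commit; the HandoffState shape, the shouldAttemptHandoff helper, and passing `now` explicitly are assumptions made to keep the sketch testable:

```typescript
const HANDOFF_RETRY_BACKOFF_MS = 5 * 60 * 1000; // 5 min, as stated above

// Hypothetical state shape for illustration.
interface HandoffState {
    readonly startedWithCliMtimeMs: number; // never rewritten after startup
    nextHandoffAttemptAt: number;           // epoch ms; 0 means no cooldown
}

// Hypothetical helper: a handoff is attempted only when the installed CLI
// mtime has drifted AND the cooldown window has elapsed.
function shouldAttemptHandoff(
    state: HandoffState,
    installedCliMtimeMs: number,
    now: number,
): boolean {
    const drifted = installedCliMtimeMs !== state.startedWithCliMtimeMs;
    return drifted && now >= state.nextHandoffAttemptAt;
}

// Failure paths push the next attempt out instead of rewriting the baseline,
// so the heartbeat keeps reporting the honest "old code" mtime.
function deferHandoffRetry(state: HandoffState, now: number): void {
    state.nextHandoffAttemptAt = now + HANDOFF_RETRY_BACKOFF_MS;
}
```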

(2) HAPI_DISABLE_VERSION_HANDOFF not honored by live runner
    (controlClient.ts:192, persistence.ts)

The env var was only checked in the invoking CLI process. Under the
documented systemd use case the env is set on the service unit but
NOT on the operator's interactive shell - so a shell `hapi runner
start` would still treat mtime drift as stale and kill the supervised
runner during a rebuild. The exact regression this layer was built
to prevent.

Fix: capture HAPI_DISABLE_VERSION_HANDOFF at runner start time into
state.startedWithVersionHandoffDisabled, persisted via the heartbeat.
The controlClient mtime check now OR's the live env var with the
persisted snapshot, so any caller honours the running runner's
opt-out regardless of their own environment.
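
As a rough illustration of the OR described above: the field name startedWithVersionHandoffDisabled is from this commit, while the helper name, the PersistedRunnerState subset, and treating only the value `1` as "set" are assumptions:

```typescript
// Hypothetical subset of the persisted runner state.
interface PersistedRunnerState {
    startedWithVersionHandoffDisabled?: boolean;
}

// Hypothetical helper: the opt-out holds if EITHER the caller's environment
// sets it OR the running runner recorded it at startup.
function isVersionHandoffDisabled(
    env: Record<string, string | undefined>,
    state: PersistedRunnerState | null,
): boolean {
    const liveOptOut = env["HAPI_DISABLE_VERSION_HANDOFF"] === "1";
    const persistedOptOut = state?.startedWithVersionHandoffDisabled === true;
    return liveOptOut || persistedOptOut;
}
```

With this shape, an interactive shell that lacks the env var still respects a supervised runner that was started with it.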

Tests: cli typecheck clean; 14/14 runner unit tests pass.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(runner): address Codex tiann#814 [Major] argv-capture + handoff race

Two additional Major findings on the runner self-restart layer that
were not addressed in a49fc57:

1. run.ts:672 - process.argv.slice(2) returns ['start-sync', ...] in
   compiled binary mode (raw argv is [hapi, runner, start-sync, ...]),
   so the handoff spawned `hapi start-sync ...` which resolveCommand
   treats as an unknown top-level and falls back to Claude. Replaced
   with getCliArgs() (the project's canonical argv normalizer) plus a
   defensive guard that falls back to ['runner', 'start-sync'] if the
   captured argv does not begin with 'runner'.

2. run.ts:892 - waitForRunnerHandoff did not actually keep the old
   runner alive. The child's startRunner() unconditionally called
   stopRunner() before acquiring the lock or writing its own state,
   so the parent's /stop handler resolved shutdown and exited BEFORE
   the child committed. If the child then failed (lock contention,
   auth error, anything between stopRunner and writeRunnerState),
   the machine went offline with no runner at all.

   New handoff protocol:
   - Parent sets HAPI_RUNNER_HANDOFF_FROM_PID=<pid> on the spawned
     child's env, then releases the lock BEFORE entering
     waitForRunnerHandoff (breaks the parent-holds-lock /
     child-needs-lock-to-write-state deadlock).
   - On wait-timeout the parent re-acquires the lock (long-retry, 30s)
     and defers retry; if re-acquire fails (third party took the
     lock) the parent exits cleanly so it does not stay alive without
     the lock invariant.
   - Child detects the env signal; if state.pid matches and that pid
     is alive, this is an authorized handoff: skip stopRunner(),
     skip the version-match early-exit, and acquire the lock with a
     longer retry window (60 attempts x 500ms) so it waits through
     the parent's asynchronous release.
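
The defensive argv guard from item 1 can be sketched as below. The fallback value ['runner', 'start-sync'] is from this commit; resolveHandoffArgv is a hypothetical name, and the real code obtains the captured argv from getCliArgs(), which is not reproduced here:

```typescript
const DEFAULT_RUNNER_ARGV: readonly string[] = ["runner", "start-sync"];

// Hypothetical helper: replay the captured argv only if it is a well-formed
// `runner ...` invocation; otherwise fall back to the safe default so the
// child is never spawned as an unknown top-level command.
function resolveHandoffArgv(captured: readonly string[] | undefined): string[] {
    if (captured !== undefined && captured[0] === "runner") {
        return [...captured];
    }
    return [...DEFAULT_RUNNER_ARGV];
}
```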

CLI typecheck clean. 14/14 runner unit tests still pass. The wider
46/664 failures in the CLI suite are pre-existing in this branch
(unrelated: AppServerEventConverter, cursorEventConverter, hook
server, Query) - baseline before this commit has 42+; my changes do
not regress them.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
…n#868)

Bun's fetch and node:http honor HTTP_PROXY/HTTPS_PROXY env vars, which can
route loopback traffic through a configured proxy (e.g. Surge/Clash). When
NO_PROXY doesn't explicitly exclude localhost, this breaks loopback
communication: SessionStart hooks fail to arrive (transcripts don't sync, web
UI stays empty), runner control client times out, and MCP server connections
fail.

Normalize NO_PROXY at the CLI entrypoint to always cover loopback
(localhost, 127.0.0.1, ::1). Child processes inherit the patched env so their
loopback traffic is covered too. Non-loopback traffic continues using the
configured proxy. Supersedes the runner control client workaround from tiann#563.
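
A minimal sketch of the normalization, assuming a comma-separated NO_PROXY value. The helper name normalizeNoProxy and the exact merge behavior (trimming, case-insensitive de-duplication, appending at the end) are assumptions, not the shipped implementation:

```typescript
// Loopback entries that must always bypass the proxy, per the text above.
const LOOPBACK_HOSTS: readonly string[] = ["localhost", "127.0.0.1", "::1"];

// Hypothetical helper: returns a NO_PROXY value that keeps the operator's
// existing entries and appends any missing loopback hosts.
function normalizeNoProxy(current: string | undefined): string {
    const entries = (current ?? "")
        .split(",")
        .map((entry) => entry.trim())
        .filter((entry) => entry.length > 0);
    const seen = new Set(entries.map((entry) => entry.toLowerCase()));
    for (const host of LOOPBACK_HOSTS) {
        if (!seen.has(host)) entries.push(host);
    }
    return entries.join(",");
}
```

The result would be written back to the environment at the CLI entrypoint so that child processes inherit it. The function is idempotent, so applying it again in a child is harmless.
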
When the terminal page loaded, focus stayed on the page body — the user
had to click/tap into the xterm area before keystrokes were captured.
Add terminal.focus() at the end of handleTerminalMount so focus lands
inside the terminal as soon as the xterm instance is attached to the DOM.
The quick-input buttons already call terminalRef.current?.focus() after
each press; this extends the same pattern to the initial mount.

Closes tiann#875

Co-authored-by: Cursor <cursoragent@cursor.com>

* test: reproduce issue tiann#863

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix: surface ACP stdin write failures to web UI (closes tiann#863)

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
…#844 regression) (tiann#877)

* fix(cursor): migrator path-priority + ambiguity surface (closes tiann#844 regression)

The legacy-to-ACP migrator's `findLegacyChatStore()` walks
`~/.cursor/chats/<workspace-hash>/<cursorSessionId>/store.db` via
`readdirSync()` and returns the FIRST match. When the same cursor
session id exists in more than one workspace-hash drawer (operator
opened the session from a worktree, an old workspace clone, etc.)
the readdir order picks an arbitrary candidate. The migrator then
transplants alien content into the ACP target, deletes the source
drawer, and reports success - because the verify probe only checks
"loads cleanly", not "loaded the right content". Operator session
resurrects with no recall of its real history.

Four-part fix (all four must land together):

1. Path-priority discovery in `findLegacyChatStore(id, home, cwd?)`:
   - Optional 3rd arg = canonical workspace path (caller passes
     `session.metadata.path`).
   - Compute md5(cwd) and check that drawer FIRST.
   - Fall back to readdir scan only if the canonical drawer is empty.
   - If 2+ candidates remain after fallback, throw
     `AmbiguousLegacyStoreError` listing all of them
     (workspaceHash, sizeBytes, mtimeMs).
2. Ambiguity surface in `maybeAutoMigrateLegacyCursorSession`:
   - Catch `ambiguous_legacy_store` / `size_mismatch` refusals and
     promote `cursorMigrationState` from 'in_progress' to a new
     'ambiguous' state instead of silently clearing the banner.
     Operator sees an actionable web-banner.
3. Size sanity check before transplant:
   - Compare HAPI's known message count (new `MessageStore.countMessages`
     + `CursorLegacyMigratorDeps.getHapiMessageCount` dep) against
     the candidate `store.db`'s blob count. If message count > 100
     AND blob count < messageCount/4, refuse with `size_mismatch`.
   - Skipped when message count is 0 (brand-new session) or the dep
     is unwired (unit tests, CLI direct callers).
4. Diagnostic logging on every successful transplant:
   - `[migrator] transplanted` info log capturing cursorSessionId,
     picked workspaceHash, candidate count discovered, sourceBytes,
     sourceBlobCount, targetAcpPath, sourceRemoved, canonical-path
     md5. Future regressions of this bug shape are diagnosable from
     `journalctl -u hapi-hub` without blob-overlap forensics.
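
The path-priority rule in step 1 can be sketched over an already-enumerated candidate list. This is an illustration under assumptions: `pickLegacyStore` and the `LegacyStoreCandidate` shape are hypothetical; only the error name and the fields `workspaceHash`, `sizeBytes`, `mtimeMs` come from the commit message.

```typescript
// Illustrative types; the real migrator's shapes are not shown in the PR text.
interface LegacyStoreCandidate {
    workspaceHash: string;
    sizeBytes: number;
    mtimeMs: number;
}

class AmbiguousLegacyStoreError extends Error {
    candidates: LegacyStoreCandidate[];
    constructor(candidates: LegacyStoreCandidate[]) {
        super(`ambiguous legacy store: ${candidates.map((c) => c.workspaceHash).join(", ")}`);
        this.name = "AmbiguousLegacyStoreError";
        this.candidates = candidates;
    }
}

// Hypothetical selection step: canonical drawer first, then the fallback scan result.
function pickLegacyStore(
    candidates: LegacyStoreCandidate[],
    canonicalHash?: string,
): LegacyStoreCandidate | null {
    if (canonicalHash !== undefined) {
        const canonical = candidates.find((c) => c.workspaceHash === canonicalHash);
        if (canonical) return canonical; // canonical drawer wins over readdir order
    }
    // Two or more candidates with no canonical match: refuse instead of guessing.
    if (candidates.length >= 2) throw new AmbiguousLegacyStoreError(candidates);
    return candidates[0] ?? null;
}
```

The point of throwing rather than picking the largest or newest candidate is that the caller can surface the full candidate list to the operator.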

Tests added in `hub/src/cursor/cursorLegacyMigrator.test.ts`:
  - regression guard for single-drawer discovery
  - canonical-path wins over readdir order
  - ambiguity throws with all candidates listed (3-drawer + 2-drawer
    no-canonical-arg variants)
  - canonical-path resolves ambiguity cleanly
  - listLegacyChatStoreCandidates enumeration
  - workspaceHashFromPath shape
  - migrateOne happy path with canonical workspace + 3 sibling decoys
  - migrateOne refuses with ambiguous_legacy_store (3 drawers, no
    canonical match) and leaves all sources untouched
  - migrateOne proceeds when canonical path resolves
  - size_mismatch refuses tiny candidate when messageCount=6000
  - size_mismatch passes when candidate blob count meets the floor
  - size sanity skipped on messageCount=0, missing dep, throwing dep,
    boundary (messageCount=100)
  - countLegacyStoreBlobs returns counts / null on bad path
And in `hub/src/sync/syncEngineAutoMigrate.test.ts`:
  - cursorMigrationState promoted to 'ambiguous' on
    ambiguous_legacy_store / size_mismatch refusals.
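
The size sanity rule these tests exercise (refuse when message count > 100 and blob count < messageCount/4) can be written as a small predicate. The function name is hypothetical, and treating an unknown count as "skip the check" is an assumption based on the skip cases listed in the commit message.

```typescript
// Hypothetical predicate; the thresholds are the ones stated in the commit message.
function shouldRefuseForSizeMismatch(
    hapiMessageCount: number | null,
    candidateBlobCount: number | null,
): boolean {
    // ASSUMPTION: an unknown count (dep unwired, dep threw, unreadable store) skips the check.
    if (hapiMessageCount === null || candidateBlobCount === null) return false;
    // The check engages only above 100 messages, so 0 and 100 are both skipped.
    if (hapiMessageCount <= 100) return false;
    return candidateBlobCount < hapiMessageCount / 4;
}
```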

Schema:
  - `shared/src/schemas.ts`: cursorMigrationState enum gains 'ambiguous'.
  - `shared/src/apiTypes.ts`: CursorMigrateRefusalReason gains
    'ambiguous_legacy_store' + 'size_mismatch'.

Real-world repro (operator's tooling session, 2026-06-09): three legacy
drawers contained the same cursor session id - one with the real 21k-blob
history, two with stale 19/568-blob diagnostic snapshots. The migrator
silently transplanted the 568-blob alien content; the resurrected session
had no memory of prior history. Manual rescue completed; this fix
prevents recurrence and surfaces the ambiguity to the operator instead.

* fix(cursor): address cold review on migrator path-priority fix

Self-review against the cold-PR rubric surfaces four polish items on
the previous commit; all four addressed in-loop before push.

- Major: `migrator:transplanted` candidate count was captured AFTER
  the source rm, so for the dominant single-candidate happy path the
  log reported `candidateCount=0, sourceRemoved=true`. Useless for
  diagnosing a future regression of the bug shape this PR is fixing.
  Snapshot candidates + source-side size + source-side blob count
  BEFORE any destructive step and use those for the log.
- Minor: `sourceBytes` and `sourceBlobCount` were read from the
  destination path (acpSessionDir/store.db). The cp guarantees they
  match, but the field names imply source-side measurement. Now they
  measure the source directly.
- Minor: `setCursorMigrationStateAmbiguous` silently returned false on
  cache miss / repeated version mismatch / write failure, letting the
  finally{} block clear the banner without any log. Now emits a
  warn-level log so the gap is diagnosable from journalctl.
- Minor: `findLegacyChatStore` is exported public API and used as a
  free function in unit tests. An out-of-band caller bypassing
  preflightSession could pass `..` or `/etc/passwd` and have the inner
  `join(chatsRoot, wsh, id, 'store.db')` resolve to an arbitrary on-
  disk path. The probe is read-only `statSync` so blast radius is
  small, but enforce the same CURSOR_SESSION_ID_RE at the function
  boundary as a defence-in-depth. New unit test locks the behaviour.
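
A boundary guard of the kind described in the last item might look like the sketch below. The real `CURSOR_SESSION_ID_RE` is not shown in the PR, so the pattern here is a stand-in, and `legacyStorePath` is a hypothetical helper.

```typescript
// ASSUMPTION: stand-in pattern; the actual CURSOR_SESSION_ID_RE is not shown in the PR.
const SESSION_ID_RE_SKETCH = /^[A-Za-z0-9_-]{1,128}$/;

// Hypothetical helper: validate the id before it is ever joined into a path.
function legacyStorePath(chatsRoot: string, workspaceHash: string, sessionId: string): string {
    if (!SESSION_ID_RE_SKETCH.test(sessionId)) {
        throw new Error(`invalid cursor session id: ${JSON.stringify(sessionId)}`);
    }
    // Plain string join keeps the sketch dependency-free; real code would use path.join.
    return [chatsRoot, workspaceHash, sessionId, "store.db"].join("/");
}
```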

Hub test suite: 414 pass, 0 fail. Typecheck clean across cli/web/hub.

* fix(cursor): cold-review polish on migrator path-priority (tiann#873)

- Web `CursorMigrationBanner` now renders a "Manual review needed"
  state for `cursorMigrationState === 'ambiguous'` (Major #1: caller
  was promoting the metadata flag but no UI surfaced it).
- Pin the md5-fixture contract for `workspaceHashFromPath`: raw,
  no-normalization, trailing-slash-distinct hashes computed via
  `printf '%s' <path> | md5sum` (Major #2: prevents algorithm drift
  that would silently revert path-priority discovery to fallback).
- Snapshot full candidate set BEFORE the canonical fast-path resolves
  a single drawer so the `migrator:transplanted` log reports the
  decision-time count, not a post-rm undercount (Minor #1).
- Warn log when canonical-path drawer is missing but readdir hands
  back exactly one candidate - regression-equivalent behaviour, but
  the size mismatch warrants a journalctl trail (path-normalization
  corner case the maintainer can grep for).
- Boundary test: `messageCount = 101` (first value above the skip
  threshold) engages the size sanity check, pinning the cutoff
  contract (Nit).
- Schema docstring on `cursorMigrationState` enum spelling out the
  banner contract per value (Nit).
- syncEngine `getHapiMessageCount` warn-logs `countMessages` throws
  instead of silently downgrading to 0 (would chronically disable
  the floor).

Drafted with claude-4.6-sonnet-thinking via Cursor; reviewed and
tested by the operator. tiann#873.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(cursor): correct log-search strings in ambiguous banner copy

The en/zh-CN locale strings told users to grep for
'migrator:ambiguous_legacy_store' and 'migrator:size_mismatch'
but the hub emits '[migrator] ambiguous legacy store; refusing
transplant' and '[migrator] size sanity check refused transplant'.

Fix both locale files to quote the actual log prefix so the
journalctl grep the operator is directed to actually hits.

Addresses tiann#877 bot finding (Minor).

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(cursor): address tiann#877 bot Minor findings (trim + boundary guard)

- Remove .trim() from canonical path before hashing: Cursor hashes
  raw workspace-path bytes; trimming a POSIX path with leading/
  trailing spaces would hash to the wrong drawer, causing a false
  canonical miss and potential ambiguity refusal.

- Add CURSOR_SESSION_ID_RE guard to listLegacyChatStoreCandidates:
  the function was exported without the same traversal-ID boundary
  check present in findLegacyChatStore. A future direct caller
  bypassing findLegacyChatStore could stat paths outside the intended
  <wsh>/<cursorSessionId>/store.db shape.

- Move CURSOR_SESSION_ID_RE declaration above both functions that
  reference it so there is no temporal-dead-zone hazard.
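
The "hash the raw bytes" point in the first item is easy to see in a few lines. The function name matches the PR, but the body is an assumption: it presumes a plain md5 hex digest of the unmodified path string, consistent with the `printf '%s' <path> | md5sum` fixture mentioned earlier.

```typescript
import { createHash } from "node:crypto";

// Sketch only: assumes the drawer name is the md5 hex digest of the raw path string.
function workspaceHashFromPath(path: string): string {
    // No trim() and no normalization: a trailing slash or space changes the hash.
    return createHash("md5").update(path, "utf8").digest("hex");
}
```

Because `/repo` and `/repo/` hash to different drawers, any normalization applied on only one side would turn a canonical hit into a miss.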

Addresses tiann#877 bot review Minor findings.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
…iann#879)

Add 'auto' as a first-class HAPI permission mode for claude-flavored sessions,
enforced by Claude's classifier rather than emulated in canCallTool. Includes
mode configuration, CLI respawn on auto transitions, plan-exit targeting, API
extensions, and documentation updates.

When a non-Default base is selected in the Cursor model picker, collapse the
29-row model list to just Default + the selected base, with a "Change model…"
toggle to re-expand. The Variant section (when present) is now the next thing
under the user's eye instead of being buried below the full base list.

Cursor de-prioritized the standard ACP configOptions migration, so the
parameterized-picker UX needs to live with the current data shape long-term.
Anyone picking a non-Default model is already in the minority - this gives
that minority a focused configure experience instead of a long scroll.

- New helper `filterCursorModelOptionsForCompactView` in cursorModelOptions.ts
  collapses the option list to Default + selected base, with passthrough for
  Default/null selection. Unit tested (4 cases).
- HappyComposer.tsx adds `cursorBrowsingAllModels` local state. Resets to
  compact whenever `selectedModelBase` changes (external session update,
  Default reset, or user picking a different base in the expanded list).
  Picking any row also collapses back so the Variant section stays in view.
- i18n: `misc.changeModel` for en + zh-CN.
- No change to Claude / Codex / Gemini / OpenCode pickers (gated on
  `agentFlavor === 'cursor'`).
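
A minimal sketch of the compact-view helper, under stated assumptions: the PR names `filterCursorModelOptionsForCompactView` but does not show its signature, so the `ModelOption` shape and the use of `'auto'` as the Default value are guesses for illustration.

```typescript
// Hypothetical option shape; the real type lives in cursorModelOptions.ts and is not shown.
interface ModelOption {
    value: string;
    label: string;
}

// ASSUMPTION: 'auto' is the wire value of the Default row.
const DEFAULT_BASE = "auto";

function filterCursorModelOptionsForCompactView(
    options: ModelOption[],
    selectedBase: string | null,
): ModelOption[] {
    // Default or no selection: pass the full list through unchanged.
    if (selectedBase === null || selectedBase === DEFAULT_BASE) return options;
    // Otherwise keep only the Default row and the selected base.
    return options.filter((o) => o.value === DEFAULT_BASE || o.value === selectedBase);
}
```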

Follows #47 (Default-highlight fix in flat mode). Web-only change.

Co-authored-by: Cursor <cursoragent@cursor.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f688b80b26

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".


if (resumeSessionId && backend.supportsLoadSession()) {
// Register pending cursorSessionId before awaiting session/load (Zed PR #54431).
session.onSessionFoundWithProtocol(resumeSessionId, 'acp');

[P1] Defer cursor session registration until load succeeds

When a resumed Cursor ACP session/load fails (missing/corrupt store, transient ACP error, or a legacy id routed here), this preemptive onSessionFoundWithProtocol has already persisted the cursorSessionId/cursorSessionProtocol to the new HAPI session. The hub deduplicates sessions by that native id, so an older inactive session with the same Cursor id can be merged/deleted before the load is proven usable, leaving the user on a failed ACP session instead of preserving the original resumable record. Register after loadSession succeeds, or roll back/suppress dedupe on failure.

this.clearCursorMigrationState(session.id, namespace)
}
}
return session

[P2] Stop resume after auto-migration cannot fall back

If automatic legacy Cursor migration fails, returning the unchanged session lets resumeSession continue and spawn a runner, but that remote spawn creates a fresh HAPI session rather than bootstrapping the original metadata. The child therefore does not receive the old cursorSessionProtocol: 'stream-json' marker and defaults to the ACP launcher, so transient verify failures or manual-review cases do not actually fall back to the legacy launcher and can produce a broken ACP duplicate. Return a resume error here or pass the original session metadata/id through the runner path.

{t('misc.model')}
</div>
{modelOptions.map((option) => {
{visibleModelOptions.map((option) => {

[P2] Recompute cursor picker after browsing toggle

For Cursor sessions with a selected non-default base, this overlay is memoized but the dependency list below still tracks modelOptions instead of visibleModelOptions/cursorBrowsingAllModels. Clicking the new “Change model” row only flips that omitted browsing state, so the memoized overlay does not recompute and the full model list never opens until some unrelated dependency changes. Include the derived visible options or browsing flag in the memo deps.

Comment thread cli/src/runner/run.ts

// Connect to server
apiMachine.connect();
scheduleCursorModelsPrewarm();

[P2] Avoid mutating Cursor state during runner prewarm

On machines with Cursor Agent installed and an empty Cursor model cache, this unconditional startup prewarm calls listCursorModels(), whose cache-miss path runs runCursorAcpModelProbe() and issues ACP session/new against the operator's real environment just to read model metadata. Merely restarting the runner can therefore create an empty Cursor ACP session/store even when the user has not opened a Cursor session. Gate the prewarm behind an existing cache or use a non-mutating/isolated catalog probe.

dzshzx and others added 3 commits June 17, 2026 10:26
Co-authored-by: dzshzx <22311806+dzshzx@users.noreply.github.com>
…iann#898) (tiann#904)

* test: reproduce issue tiann#898 (Codex fast mode service tier)

* fix(codex): add Fast mode (service tier) toggle and /fast command (closes tiann#898)

* feat(codex+web): Fast mode UI toggle with full persistence

Wires the Codex Fast mode (service tier) end-to-end so it can be toggled
from the web composer and survives reload/handoff:

- shared: serviceTier on Session/SessionPatch, session-alive payload,
  resume target, and a SessionServiceTierRequest schema
- cli: AgentSessionBase carries serviceTier through keepAlive; runCodex
  syncs it to the session instance
- hub: service_tier column (schema v10 + migration), store setter,
  sessionCache + syncEngine plumbing, POST /sessions/:id/service-tier
- web: api.setServiceTier + mutation, a Fast/Standard toggle in the
  composer settings (gated to Codex GPT-5.5/5.4), and StatusBar now
  reflects the real tier instead of the effort heuristic

Refs tiann#898

* fix(codex): preserve unset/persisted service tier on startup keepalive

Addresses HAPI Bot [Major] on PR tiann#904: applyCurrentConfigToSession ran
setServiceTier(currentServiceTier ?? null) on wrapper-ready, collapsing the
untouched `undefined` state into explicit Standard. The immediate
setCollaborationMode keepalive then persisted serviceTier: null, silently
downgrading resumed Fast sessions and disabling account-default Fast.

- Seed currentServiceTier from the persisted session (sessionInfo.serviceTier),
  so a resumed Fast thread keeps running Fast.
- Only call setServiceTier when the tier is explicit (!== undefined), preserving
  the three-state omit semantics at the keepalive boundary.
- Add regression tests: persisted Fast is re-asserted; untouched omits the tier.

* feat(codex+web): gate Fast toggle on catalog-advertised service tier

The Fast toggle was gated on a model-name regex (gpt-5.5/5.4), which still
showed a no-op control to API-key users — Fast credits only apply with
ChatGPT login. Codex's model/list catalog advertises the service tiers
actually available for each model in the current auth/plan context, so gate
on that instead:

- cli: capture serviceTiers (ids) per model in ModelListItem + normalizeModel
- shared: CodexModelSummary.serviceTiers (flows through the existing
  getSessionCodexModels pass-through; no hub change needed)
- web: codexModelAdvertisesFastTier(sessionModel, models) replaces the regex;
  SessionChat gates the toggle on it (hidden while the catalog is
  loading/errored). The toggle now only appears when toggling it will
  actually take effect.

Refs tiann#898

* fix(codex): make explicit Standard service tier sticky across resume

Addresses HAPI Bot [Major] (round 2): a single persisted null conflated
"untouched" with "explicit Standard". A user who turned Fast off persisted
null, but startup mapped null -> undefined (untouched) and omitted serviceTier,
so an account/thread-default Fast could silently return after restart/resume.

Introduce a distinct stored representation:
- 'fast' / 'standard' are explicit user choices; null/undefined = untouched.
- Translate 'standard' -> Codex app-server serviceTier: null ONLY when building
  thread/turn params (toAppServerServiceTier); untouched omits the field.
- /fast off now stores 'standard'; the web Standard option sends 'standard'.
- Tighten SessionServiceTierRequest to enum(['fast','standard']) so stray tier
  strings are never forwarded.

Tests: sticky-Standard-on-resume regression; turn/thread params translate
'standard'->null and omit on untouched; hub route applies fast/standard and
rejects unsupported values + local sessions.

Refs tiann#898

* fix(codex): recognize real Fast tier (id 'priority', name 'Fast') in catalog gate

Live E2E against an authed Codex session revealed the model catalog advertises
the Fast tier with id 'priority' and display name 'Fast' (not id 'fast'), so the
/fast/i gate — which only saw tier ids — wrongly hid the toggle for valid
ChatGPT users on gpt-5.5/gpt-5.4. Capture both the tier id and name as
lowercased tokens so the existing name-based match recognizes 'Fast'. The sent
value stays 'fast' (the documented service_tier value / raw additionalSpeedTiers
request tier). Verified end-to-end: gpt-5.5/gpt-5.4 gate on, gpt-5.4-mini off.
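
The catalog gate after this fix can be sketched as follows. The function name and the `/fast/i` match come from the commit messages; the `CatalogModel` and `CatalogServiceTier` shapes are assumptions made so the example is self-contained.

```typescript
// Hypothetical catalog shapes; the real ModelListItem / CodexModelSummary are not shown.
interface CatalogServiceTier {
    id: string;
    name?: string;
}
interface CatalogModel {
    id: string;
    serviceTiers?: CatalogServiceTier[];
}

// Collect both the tier id and its display name as lowercased tokens.
function tierTokens(tier: CatalogServiceTier): string[] {
    const tokens = [tier.id.toLowerCase()];
    if (tier.name) tokens.push(tier.name.toLowerCase());
    return tokens;
}

function codexModelAdvertisesFastTier(
    sessionModel: string | null,
    models: CatalogModel[] | undefined,
): boolean {
    // Catalog still loading or errored: keep the toggle hidden.
    if (!sessionModel || !models) return false;
    const model = models.find((m) => m.id === sessionModel);
    if (!model) return false;
    return (model.serviceTiers ?? []).some((tier) =>
        tierTokens(tier).some((token) => /fast/i.test(token)),
    );
}
```

Matching on the name as well as the id is what lets a tier advertised as id `priority`, name `Fast` pass the gate.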

Refs tiann#898

* fix(codex): preserve service tier across session resume

Resuming a Codex session spawns a fresh session (serviceTier null) and merges
the old one in. Unlike model/effort/permissionMode, serviceTier was neither
threaded through the resume spawn nor preserved in mergeSessionData, so a
resumed Fast (or explicit Standard) session silently reverted to the account
default.

Thread serviceTier through the spawn path like its siblings:
- hub: resumeSession passes session.serviceTier to spawnSession; rpcGateway +
  syncEngine carry it in the spawn RPC payload; mergeSessionData preserves it
  old->new (safety net).
- cli: SpawnSessionOptions.serviceTier; apiMachine forwards it; buildCliArgs
  emits --service-tier for codex; the codex command parses it; runCodex seeds
  currentServiceTier from the spawn override first (opts.serviceTier ??
  sessionInfo.serviceTier), so a resumed thread immediately runs the right tier.

Verified end-to-end: set Fast -> kill process -> reopen -> resumed session (new
id) still runs Fast. Tests: buildCliArgs --service-tier (codex only), runCodex
spawn-override seed, mergeSessionData service-tier preservation.

Refs tiann#898

* fix(codex): send advertised 'priority' tier id for Fast, not 'fast'

The model catalog advertises the Fast tier with request id 'priority' (display
name 'Fast'), and OpenAI docs confirm service_tier='fast' maps to the request
value 'priority'. The app-server serviceTier override is a raw request value
that does not validate unknown strings (a live probe accepted 'bogus-xyz'), so
sending 'fast' risks being silently ignored — no Fast applied.

Translate the stored 'fast' state to app-server 'priority' at the thread/turn
param boundary (toAppServerServiceTier); the stored/UI/command representation
stays 'fast'/'standard'. Verified live: a turn with serviceTier='priority' runs
and consumes the Fast-tier rate budget.
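
Taken together with the earlier sticky-Standard commit, the boundary translation has three outcomes. The sketch below assumes a return shape that is spread into the thread/turn params; the real `toAppServerServiceTier` signature is not shown in the PR.

```typescript
type StoredServiceTier = "fast" | "standard";

// ASSUMPTION: returning a partial params object is an illustrative choice, not the PR's API.
function toAppServerServiceTier(
    stored: StoredServiceTier | null | undefined,
): { serviceTier?: "priority" | null } {
    if (stored === "fast") return { serviceTier: "priority" }; // advertised request id for Fast
    if (stored === "standard") return { serviceTier: null }; // explicit Standard
    return {}; // untouched: omit the field so the account/thread default applies
}
```

Keeping the translation at this one boundary means the stored, UI, and command representations can stay `'fast'`/`'standard'`.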

Addresses HAPI Bot [Major]. Refs tiann#898

* fix(codex): validate --service-tier CLI value (fast|standard)

Addresses HAPI Bot [Minor]: the internal --service-tier spawn arg accepted any
non-empty string, unlike the web /service-tier enum, so a malformed value could
be seeded into currentServiceTier and persisted via keepalive. Parse it to
'fast'|'standard' and reject anything else, matching the web endpoint.
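
A strict parser for the spawn argument might look like this; `parseServiceTierArg` is a hypothetical name and the error text is illustrative.

```typescript
// Hypothetical parser: accept exactly the two values the web endpoint's enum allows.
function parseServiceTierArg(raw: string): "fast" | "standard" {
    if (raw === "fast" || raw === "standard") return raw;
    throw new Error(`invalid --service-tier value ${JSON.stringify(raw)} (expected fast|standard)`);
}
```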

Refs tiann#898

When a multi-variant Cursor base is picked, the settings overlay stays open
and swaps to a variant-only step instead of dismissing. The default variant
is applied immediately, listed first, and pre-selected; picking any variant
(including re-confirming default) dismisses the overlay. Single-variant bases
and Default still pick-and-dismiss in one tap.

- resolveDefaultCursorVariantWire + buildCursorEffortPickerOptionsWithDefaultFirst
- Multi-variant base change applies default wire (was base-only, no apply)
- HappyComposer cursorDrillDownBase local state; back link returns to full list
- SessionChat passes resolveModelVariantsForBase for pre-render variant rows

Co-authored-by: Cursor <cursoragent@cursor.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5e24dc9ad8

&& modelEffortOptions
&& modelEffortOptions.length > 0
)
&& !cursorVariantDrillDownActive

[P2] Render compact Cursor base list after selection

When a Cursor session already has a non-Default base and the settings menu is opened, cursorVariantDrillDownActive is false, so this leaves showModelSettings true and the overlay still renders the full modelOptions list before the Variant section; the new compact-view helper/misc.changeModel path is never wired in. The intended collapsed view therefore never appears for existing selected models, so variants remain buried under the full catalog.

Comment on lines +57 to +61
return {
ok: true,
wireId: defaultWire,
nextSelectedBase: value,
shouldApply: defaultWire !== null

[P2] Avoid resetting variants when reselecting a base

When the user is already on a non-default variant for a multi-variant Cursor base and clicks that selected base row to enter the new drill-down, this plan returns the base default and handleCursorBaseModelChange immediately applies it. That means merely opening/configuring the current base can silently replace composer-2.5[fast=false] with the default variant before the user chooses anything; keep the existing wire when the clicked base matches the current base, or drill down without applying.
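
One way to read this suggestion is sketched below. The plan fields (`ok`, `wireId`, `nextSelectedBase`, `shouldApply`) come from the quoted diff; the function name `planBaseChange` and its parameters are hypothetical.

```typescript
interface BaseChangePlan {
    ok: true;
    wireId: string | null;
    nextSelectedBase: string;
    shouldApply: boolean;
}

// Hypothetical planner: re-selecting the current base drills down without applying anything.
function planBaseChange(
    clickedBase: string,
    currentBase: string | null,
    currentWire: string | null,
    defaultWire: string | null,
): BaseChangePlan {
    if (clickedBase === currentBase) {
        // Keep the existing wire so a non-default variant is not silently replaced.
        return { ok: true, wireId: currentWire, nextSelectedBase: clickedBase, shouldApply: false };
    }
    // A different base: apply its default variant when one resolves.
    return { ok: true, wireId: defaultWire, nextSelectedBase: clickedBase, shouldApply: defaultWire !== null };
}
```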

Comment on lines +396 to +399
if (!selectedModelBase || selectedModelBase === 'auto') {
setCursorDrillDownBase(null)
setCursorDrillDownDefaultVariant(null)
}

[P2] Reset Cursor drill-down when the selected base changes

If the settings overlay is in drill-down for base A and the session model changes to another non-Default base B (for example from another client or a failed apply rolling back), this effect does not clear the drill-down because B is not auto. The overlay then keeps hiding the model list and shows A's variant rows while B is selected, so the next variant click can switch the wrong base; clear the drill-down whenever selectedModelBase differs from cursorDrillDownBase, not only for Default.
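
The suggested reset rule reduces to a single comparison. The sketch below is a pure function with a hypothetical name; in the component it would presumably run inside the existing effect that watches `selectedModelBase`.

```typescript
// Hypothetical reducer for the drill-down state described in the comment above.
function nextDrillDownBase(
    selectedModelBase: string | null,
    drillDownBase: string | null,
): string | null {
    if (drillDownBase === null) return null; // not drilled down: nothing to clear
    // Stay drilled down only while the session's base is still the one being configured.
    return selectedModelBase === drillDownBase ? drillDownBase : null;
}
```

This also covers the Default case, since `'auto'` or `null` never equals a drilled-down base.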


5 participants