Provider/model runtime identity can diverge across config reload, session restore, UI, model refresh, and reload recovery #380

@wtry

Summary
Jcode currently appears to maintain provider/model state across multiple independent sources without one authoritative runtime identity.

Observed state sources include:

• active Agent.provider
• connection-level/shared provider
• session.provider_key
• session.model
• session.route_api_method
• config defaults from config.toml
• provider memory caches
• persisted provider model catalog caches
• TUI remote model catalog cache
• remote history/bootstrap payloads
• reload recovery continuation state
• UI header / welcome / model picker state

These sources can diverge. When they do, different code paths may use different provider/model identities, causing confusing behavior such as:

• config edits not affecting the active runtime until killall jcode
• header/welcome showing the wrong provider
• session restore using stale provider/model/route metadata
• model refresh targeting one provider while the UI snapshot comes from another
• model picker showing stale catalog entries
• reload recovery continuing with an unintended provider/model
• custom OpenAI-compatible profiles being displayed or routed incorrectly

This is especially visible with custom OpenAI-compatible providers, where the provider slot, profile id, display name, API base, auth env key, route API method, selected model, and catalog namespace are all separate concepts.

──────────────────────────────────────────────────────────────────

Confirmed code observations

  1. Config reload does not rebuild the active provider runtime
    config() reloads the config cache when the config fingerprint changes.

Startup currently registers lightweight reload listeners such as:

• AuthStatus::invalidate_cache
• Bus::global().publish_models_updated()

There does not appear to be a config reload listener that rebuilds or reconciles the current Agent.provider.

So editing config.toml may update future config reads while leaving the active provider runtime unchanged.
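A minimal sketch of what a reconciling reload listener could decide, assuming the active runtime can be summarized as a small comparable value. `RuntimeIdentity`, `ReloadOutcome`, and `reconcile_on_config_reload` are hypothetical names for illustration and do not exist in Jcode:

```rust
// Hypothetical types for illustration; none of these names come from Jcode.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RuntimeIdentity {
    pub provider_key: String,
    pub model: String,
}

/// What a config-reload listener could decide for an already-running session.
#[derive(Debug, PartialEq, Eq)]
pub enum ReloadOutcome {
    /// Config and active runtime already agree; nothing to do.
    Unchanged,
    /// The session is idle, so the runtime can be rebuilt from the new config.
    Rebuild(RuntimeIdentity),
    /// A turn is in flight; keep the old runtime and report that it is pinned.
    Pinned {
        active: RuntimeIdentity,
        configured: RuntimeIdentity,
    },
}

pub fn reconcile_on_config_reload(
    active: &RuntimeIdentity,
    configured: &RuntimeIdentity,
    turn_in_progress: bool,
) -> ReloadOutcome {
    if active == configured {
        ReloadOutcome::Unchanged
    } else if turn_in_progress {
        ReloadOutcome::Pinned {
            active: active.clone(),
            configured: configured.clone(),
        }
    } else {
        ReloadOutcome::Rebuild(configured.clone())
    }
}
```

The `Pinned` variant is there so the UI has something explicit to show when the runtime cannot be swapped mid-turn.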

  2. Model refresh uses a connection-level provider, while the update event is generated from the agent
    In client_lifecycle.rs:

┌─ rust
│ Request::RefreshModels { id } => {
│     handle_refresh_models(id, &provider, &agent, &client_event_tx).await;
│ }
└─

In handle_refresh_models:

┌─ rust
│ let refresh_future = provider_clone.refresh_model_catalog();
└─

After refresh succeeds, the UI event is generated from the agent:

┌─ rust
│ let event = available_models_updated_event(&agent_clone).await;
└─

So if the connection-level provider and the active agent provider diverge, refresh may target provider A while the UI snapshot is generated from provider B.
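One way to rule out the mismatch is to take both the refresh and the snapshot from the same handle. The sketch below is synchronous and uses hypothetical stand-ins (`Provider`, `Agent`, `ModelsUpdated`, `StaticProvider`, `refresh_models_for_agent`); the real Jcode types and the async `handle_refresh_models` differ:

```rust
use std::sync::Arc;

// Hypothetical stand-ins; the real Jcode provider and agent types differ
// (the real refresh is async, for example).
pub trait Provider {
    fn name(&self) -> String;
    fn refresh_model_catalog(&self) -> Vec<String>;
}

pub struct Agent {
    pub provider: Arc<dyn Provider>,
}

/// Payload for the UI: provider name and models come from one source.
pub struct ModelsUpdated {
    pub provider_name: String,
    pub models: Vec<String>,
}

/// Refresh and snapshot through the agent's own provider handle,
/// so the refresh target and the UI snapshot cannot disagree.
pub fn refresh_models_for_agent(agent: &Agent) -> ModelsUpdated {
    let provider = Arc::clone(&agent.provider);
    let models = provider.refresh_model_catalog();
    ModelsUpdated {
        provider_name: provider.name(),
        models,
    }
}

/// Fixed-catalog provider used only to exercise the sketch.
pub struct StaticProvider {
    pub name: &'static str,
    pub models: &'static [&'static str],
}

impl Provider for StaticProvider {
    fn name(&self) -> String {
        self.name.to_string()
    }

    fn refresh_model_catalog(&self) -> Vec<String> {
        self.models.iter().map(|m| m.to_string()).collect()
    }
}
```

No separate connection-level provider is passed in, which is the point of the sketch: there is only one handle to get wrong.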

  3. History fallback can display provider/model from persisted session metadata
    send_history_from_persisted_session uses:

┌─ rust
│ provider_name: history_provider_name_from_session(&session)
│     .or_else(|| Some(provider.name().to_string())),
│ provider_model: session.model.clone().or_else(|| Some(provider.model())),
└─

history_provider_name_from_session() maps persisted keys such as:

┌─ rust
│ "gemini" => "Gemini"
│ "openrouter" => "OpenRouter"
│ "openai" => "OpenAI"
└─

So if a persisted session contains stale provider_key = "gemini", the UI can display Gemini even if the intended/current runtime is a custom OpenAI-compatible profile.

  4. Session restore depends on persisted provider/model/route metadata
    Both Agent::new_with_session and restore_session use:

┌─ rust
│ MultiProvider::model_switch_request_for_session_route(
│     &model,
│     session.provider_key.as_deref(),
│     session.route_api_method.as_deref(),
│ )
└─

This means stale persisted session metadata is not only cosmetic. It participates in restoring model/provider route state.
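A rough sketch of a reconciliation check that could run before the persisted route is applied. `Route`, `RestoreDecision`, and `decide_restore` are hypothetical, and "is this provider key still configured" is only one possible validity test:

```rust
// Hypothetical types; the fields mirror the persisted session fields named above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Route {
    pub provider_key: Option<String>,
    pub model: String,
    pub api_method: Option<String>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum RestoreDecision {
    /// The persisted provider key is still configured; restore as recorded.
    UsePersisted(Route),
    /// The persisted route is unusable; use the current route and say why.
    FallBackToCurrent { current: Route, reason: String },
}

pub fn decide_restore(
    persisted: &Route,
    current: &Route,
    configured_keys: &[&str],
) -> RestoreDecision {
    match persisted.provider_key.as_deref() {
        Some(key) if configured_keys.contains(&key) => {
            RestoreDecision::UsePersisted(persisted.clone())
        }
        Some(key) => RestoreDecision::FallBackToCurrent {
            current: current.clone(),
            reason: format!("persisted provider_key {key:?} is no longer configured"),
        },
        None => RestoreDecision::FallBackToCurrent {
            current: current.clone(),
            reason: "session has no persisted provider_key".to_string(),
        },
    }
}
```

The `reason` string gives the UI something to warn with instead of switching routes silently.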

  5. TUI remote state and model catalog cache can rehydrate old provider/model state
    History and AvailableModelsUpdated events create a ModelCatalogSnapshot and call:

┌─ rust
│ replace_remote_model_catalog_snapshot(...)
│ persist_remote_model_catalog_cache()
└─

The TUI later hydrates from remote_model_catalog_cache.json if remote model options are empty.

So stale remote catalog cache can repopulate the model picker with old provider/model/routes.
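A sketch of a namespaced cache, assuming the namespace is the provider key plus an optional profile id. `CatalogNamespace` and `RemoteCatalogCache` are hypothetical, and the sketch is in-memory only; the real cache is persisted to remote_model_catalog_cache.json:

```rust
use std::collections::HashMap;

// Hypothetical cache key; which fields belong in it is a design decision.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct CatalogNamespace {
    pub provider_key: String,
    pub profile_id: Option<String>,
}

/// In-memory stand-in for the TUI remote model catalog cache.
#[derive(Default)]
pub struct RemoteCatalogCache {
    entries: HashMap<CatalogNamespace, Vec<String>>,
}

impl RemoteCatalogCache {
    /// Store a snapshot under the namespace it was produced for.
    pub fn store(&mut self, namespace: CatalogNamespace, models: Vec<String>) {
        self.entries.insert(namespace, models);
    }

    /// Hydrate only from the matching namespace; a miss yields an empty list
    /// rather than another provider's entries.
    pub fn hydrate(&self, namespace: &CatalogNamespace) -> &[String] {
        self.entries
            .get(namespace)
            .map(|models| models.as_slice())
            .unwrap_or(&[])
    }
}
```

With this shape, switching from profile A to profile B leaves the picker empty until fresh routes arrive, instead of showing profile A's models.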

──────────────────────────────────────────────────────────────────

Problematic cases
Case 1: Config file changed, active runtime remains old
Steps

  1. Start Jcode with provider/model A.
  2. Edit ~/.jcode/config.toml to provider/model B.
  3. Continue using the already-running Jcode process.
  4. Open the model picker or send a request.

Possible result

• Config reads may see provider/model B.
• Active Agent.provider may still be provider/model A.
• UI may partially update because publish_models_updated() fires.
• Actual request runtime may not match the newly edited config.

Expected

Either the active provider runtime should be rebuilt/reconciled, or the UI should clearly state that the current session is still pinned to the previous runtime.

──────────────────────────────────────────────────────────────────

Case 2: Header/welcome shows Gemini while using a custom OpenAI-compatible provider
Steps

  1. Use a custom OpenAI-compatible profile.
  2. Restore or attach to a session whose persisted provider_key is stale, for example "gemini".
  3. Trigger history fallback or startup history loading.

Possible result

• UI receives provider_name = "Gemini" from persisted session metadata.
• Actual runtime may be a custom OpenAI-compatible provider.
• Header/welcome/model display shows the wrong provider.

Expected

Header/welcome should display the active runtime identity, not stale persisted session metadata.

──────────────────────────────────────────────────────────────────

Case 3: Model refresh targets one provider but UI snapshot comes from another
Steps

  1. Start with provider A at connection level.
  2. Switch or restore the active session to provider B.
  3. Run model refresh.

Possible result

• provider.refresh_model_catalog() refreshes provider A.
• AvailableModelsUpdated is generated from agent, possibly provider B.
• User sees stale, mismatched, or confusing model picker results.

Expected

Model refresh should target the active agent runtime/provider/profile.

──────────────────────────────────────────────────────────────────

Case 4: Restored session forces stale provider route
Steps

  1. Create a session with provider/model/route metadata.
  2. Later change config/provider setup.
  3. Restore the old session.

Possible result

• Restore uses persisted session.model, session.provider_key, and session.route_api_method.
• The restored route may no longer match current config or intended provider.
• UI and runtime may disagree about which provider/model is active.

Expected

Session restore should reconcile persisted metadata with current runtime identity, or clearly warn when historical session routing is being used.

──────────────────────────────────────────────────────────────────

Case 5: Custom OpenAI-compatible profile identity is split across several fields
Steps

  1. Configure a custom OpenAI-compatible profile.
  2. Use a model from that profile.
  3. Switch providers/models or restore a session.
  4. Refresh the model list.

Possible result

Different code paths may refer to the provider as:

• openrouter
• OpenAI-compatible runtime
• profile id
• custom display name
• API base
• route API method
• persisted provider_key
• old route provider
• cached catalog namespace

This can lead to wrong display names, wrong model picker entries, stale refresh behavior, or wrong session restoration.

Expected

Custom OpenAI-compatible profiles should have a stable runtime identity that includes provider key, profile id, display name, API base, model, route method, and catalog namespace.

──────────────────────────────────────────────────────────────────

Case 6: TUI model picker rehydrates stale remote catalog cache
Steps

  1. Use remote/shared server mode.
  2. Receive a model catalog snapshot for provider/model A.
  3. Later switch to provider/model B.
  4. Open the model picker before fresh remote routes arrive.

Possible result

• TUI hydrates remote_model_catalog_cache.json.
• Old provider/model/routes are restored into the picker.
• Model picker shows stale entries.

Expected

Remote model catalog cache should be invalidated or namespaced by provider/profile/runtime identity.

──────────────────────────────────────────────────────────────────

Case 7: Deleting one cache does not clear all model state
Steps

  1. Observe a stale model list.
  2. Delete one apparent cache location.
  3. Restart or refresh model picker.

Possible result

Stale model information can still come from another layer:

• provider memory cache
• provider persisted catalog
• OpenAI/Anthropic scoped catalog cache
• OpenRouter/OpenAI-compatible disk cache
• TUI remote model catalog cache
• persisted session model/provider metadata

Expected

There should be a clear cache hierarchy, a single invalidation path, or a debug command showing where the current model list came from.
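A sketch of a lookup that reports which layer supplied the model list, assuming the layers are consulted in a fixed priority order. `CatalogSource` and `resolve_catalog` are hypothetical, and the three variants are a subset of the layers listed above:

```rust
// Hypothetical labels for a few of the layers listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CatalogSource {
    ProviderMemory,
    ProviderDisk,
    TuiRemoteCache,
}

/// Return the first non-empty layer together with its source label.
/// `layers` is expected in priority order, highest priority first.
pub fn resolve_catalog(
    layers: &[(CatalogSource, Vec<String>)],
) -> Option<(CatalogSource, &[String])> {
    layers
        .iter()
        .find(|(_, models)| !models.is_empty())
        .map(|(source, models)| (*source, models.as_slice()))
}
```

A debug command could print the returned `CatalogSource` next to the model list, which would make it clear which cache to delete.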

──────────────────────────────────────────────────────────────────

Case 8: Busy agent fallback sends persisted startup snapshot
Steps

  1. Attach to a session while the agent is busy.
  2. handle_get_history fails to lock the agent.
  3. Server falls back to persisted session snapshot.

Possible result

• UI receives provider/model from persisted session metadata.
• That snapshot may be stale relative to the live runtime.
• Header/welcome can temporarily or permanently show the wrong provider/model.

Expected

Fallback history should either be marked as stale/non-authoritative or later reconciled with live runtime identity.
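A sketch of how a client could treat fallback payloads as non-authoritative, assuming each history payload is tagged with where its identity came from. `IdentityOrigin`, `HistoryIdentity`, and `apply_history_identity` are hypothetical names:

```rust
// Hypothetical tag carried by each history payload.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IdentityOrigin {
    LiveRuntime,
    PersistedSession,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct HistoryIdentity {
    pub provider_name: String,
    pub model: String,
    pub origin: IdentityOrigin,
}

/// Client-side merge rule: a persisted (fallback) identity never overwrites
/// one that came from the live runtime; anything else replaces the current value.
pub fn apply_history_identity(
    current: Option<HistoryIdentity>,
    incoming: HistoryIdentity,
) -> HistoryIdentity {
    match current {
        Some(existing)
            if existing.origin == IdentityOrigin::LiveRuntime
                && incoming.origin == IdentityOrigin::PersistedSession =>
        {
            existing
        }
        _ => incoming,
    }
}
```

A later live payload still replaces a fallback one, so a wrong header would be temporary rather than permanent.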

──────────────────────────────────────────────────────────────────

Case 9: Reload recovery continuation resumes with wrong or stale provider/model
Steps

  1. Start a task with provider/model A.
  2. Server reload happens while the turn is in progress.
  3. Jcode reconnects and shows:

┌─ text
│ Reload complete - continuing because a recovery directive was pending.
└─

  4. Continuation request is sent.

Possible result

The continuation may be sent using a stale or unintended provider/model.

Example observed failure:

┌─ text
│ OpenAI-compatible chat request failed
│ endpoint: https://ark.cn-beijing.volces.com/api/coding/v3/chat/completions
│ model: volcengine:ark-code-latest
│ auth: ARK_API_KEY
│ status: 404 Not Found
│ response: UnsupportedModel: The requested model does not support the coding plan feature
└─

The user did not explicitly request this provider/model for the resumed task, but reload recovery continued automatically and repeatedly retried the failing request.

Expected

Reload recovery should preserve and validate the exact runtime identity used by the interrupted turn, or require explicit confirmation if provider/model identity changed across reload. It should not silently continue with a mismatched provider/model.
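A sketch of a pre-continuation check, assuming the recovery directive records the identity of the interrupted turn and an attempt counter. `RecoveryDirective`, `RecoveryAction`, `plan_recovery`, and the limit of 3 attempts are all hypothetical:

```rust
// Hypothetical types; Jcode's recovery directive may carry different data.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RuntimeIdentity {
    pub provider_key: String,
    pub profile_id: Option<String>,
    pub model: String,
}

pub struct RecoveryDirective {
    /// Identity that the interrupted turn was using.
    pub identity: RuntimeIdentity,
    /// How many automatic continuations have already been attempted.
    pub attempts: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum RecoveryAction {
    Continue,
    AskUser { reason: String },
    GiveUp,
}

/// Arbitrary cap chosen for the sketch.
pub const MAX_RECOVERY_ATTEMPTS: u32 = 3;

pub fn plan_recovery(directive: &RecoveryDirective, active: &RuntimeIdentity) -> RecoveryAction {
    // Stop automatic retries once the cap is reached.
    if directive.attempts >= MAX_RECOVERY_ATTEMPTS {
        return RecoveryAction::GiveUp;
    }
    // Require confirmation when the identity changed across the reload.
    if &directive.identity != active {
        return RecoveryAction::AskUser {
            reason: format!(
                "interrupted turn used {}/{}, active runtime is {}/{}",
                directive.identity.provider_key,
                directive.identity.model,
                active.provider_key,
                active.model
            ),
        };
    }
    RecoveryAction::Continue
}
```

The attempt cap addresses the repeated retries in the observed failure; the identity comparison addresses the silent provider/model switch.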

──────────────────────────────────────────────────────────────────

Case 10: Auto-poke can continue with unintended provider/model
Steps

  1. Auto-poke is enabled.
  2. There are incomplete todos.
  3. Jcode sends an automatic follow-up:

┌─ text
│ Auto-poking: 3 incomplete todos. /poke off to stop.
└─

  4. The follow-up is sent through the currently active provider/model.

Possible result

If active provider/model state has drifted, auto-poke can trigger a request through an unintended provider/model. This can surface as provider-specific failures unrelated to the user’s actual task.

Expected

Auto-poke should use the same validated active runtime identity as the current session, and should not silently continue if the provider/model identity is stale or ambiguous.

──────────────────────────────────────────────────────────────────

Expected behavior
There should be one authoritative active runtime identity for each session, for example:

┌─ rust
│ struct ProviderRuntimeIdentity {
│     provider_key: String,
│     profile_id: Option<String>,
│     display_name: String,
│     api_method: String,
│     api_base: Option<String>,
│     catalog_namespace: String,
│     model: String,
│ }
└─

All of the following should derive from that identity:

• actual request runtime
• header/welcome display
• current model display
• model picker routes
• model catalog refresh target
• persisted session metadata
• remote history payload
• remote catalog cache key
• reload recovery continuation
• auto-poke follow-up runtime

──────────────────────────────────────────────────────────────────

Suggested fix direction

  1. Make active Agent.provider / runtime identity the authoritative source for active sessions.
  2. Make RefreshModels refresh the active agent provider/profile, not a separate connection-level provider.
  3. Reconcile or invalidate persisted session.provider_key, session.model, and session.route_api_method when config/provider identities change.
  4. Namespace TUI remote model catalog cache by provider/profile/runtime identity.
  5. Include runtime identity in reload recovery directives.
  6. Validate reload recovery identity before automatic continuation.
  7. Add diagnostics showing:
    • active runtime provider
    • active runtime model
    • provider display name
    • profile id
    • API base
    • session provider key
    • session model
    • route API method
    • config default provider/model
    • model catalog source
    • remote cached provider/model

──────────────────────────────────────────────────────────────────

Acceptance criteria
• Header/welcome always matches the active runtime provider/model.
• Refreshing model list refreshes the active provider/profile.
• Restoring a session cannot silently display one provider while using another.
• Custom OpenAI-compatible profiles have stable display names and isolated catalog caches.
• Editing config either updates/reconciles the active runtime or clearly states that the running session is still using the previous runtime.
• Reload recovery does not continue with a mismatched provider/model.
• Auto-poke does not silently send follow-up requests through a stale or unintended provider/model.
• Users should not need killall jcode or manual cache deletion to recover provider/model consistency.

Labels: bug (Something isn't working), priority: medium (P2 - normal priority), triage: needs-decision (Needs maintainer decision/design thought)
