ci: read integration test default models from checked-out branch #3109

Closed
xingyaoww wants to merge 1 commit into main from fix-integration-default-model-source

xingyaoww (Collaborator) commented May 7, 2026

Problem

The integration-runner.yml workflow uses pull_request_target, which always evaluates workflow-level env: vars from the default branch (main). When a release branch updates the default model list before those changes are merged to main, the integration tests still run with the old model set.

This is what happened with the v1.21.0 release PR #3103: the integration-test label was applied before the model update PR #3102 was merged to main, so CI ran against kimi-k2-thinking and deepseek-v3.2-reasoner instead of the intended kimi-k2.6 and deepseek-v4-flash.
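The ordering issue described above can be sketched as a tiny timeline simulation. This is only an illustrative model, not code from the repository: the event names `apply_label` and `merge_model_update` are hypothetical, and the two lists contain only the models named above (the real default list may be longer).

```python
from typing import List

# Only the models named in the description; the real lists may contain more.
OLD_DEFAULTS = ["kimi-k2-thinking", "deepseek-v3.2-reasoner"]
NEW_DEFAULTS = ["kimi-k2.6", "deepseek-v4-flash"]


def run_timeline(events: List[str]) -> List[List[str]]:
    """Return the model list seen by each CI run, in order.

    Hypothetical events:
      "merge_model_update" - the model update PR lands on main
      "apply_label"        - the integration-test label triggers a CI run
    """
    main_defaults = list(OLD_DEFAULTS)
    runs = []
    for event in events:
        if event == "merge_model_update":
            main_defaults = list(NEW_DEFAULTS)
        elif event == "apply_label":
            # The workflow-level env var is evaluated from main, so the run
            # sees whatever main contains at this moment.
            runs.append(list(main_defaults))
        else:
            raise ValueError(f"unknown event: {event}")
    return runs


# Label applied first: CI runs with the old models.
label_first = run_timeline(["apply_label", "merge_model_update"])
# Model update merged first: CI runs with the intended models.
merge_first = run_timeline(["merge_model_update", "apply_label"])
```

In this model the outcome depends only on the order of the two events, which is the situation the release PR ran into.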

Fix

Move the default model list from a workflow-level env: var (DEFAULT_MODEL_IDS) into a Python constant (DEFAULT_INTEGRATION_MODEL_IDS) in .github/run-eval/resolve_model_config.py.

Since the setup-matrix step already checks out the PR branch before running resolve_model_config.py, the Python constant is always read from the correct branch — regardless of what main contains.
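A minimal sketch of what such a constant and its fallback logic might look like. The `MODELS` registry shape, the display names, and the `resolve_model_ids` helper are assumptions made for illustration; the actual contents of resolve_model_config.py are not shown in this PR page.

```python
from typing import Dict, List, Optional

# Assumed shape of the MODELS registry; the real structure may differ.
MODELS: Dict[str, Dict[str, str]] = {
    "kimi-k2.6": {"display_name": "Kimi K2.6"},                  # hypothetical entry
    "deepseek-v4-flash": {"display_name": "DeepSeek V4 Flash"},  # hypothetical entry
}

# Defaults live in Python, so they are read from whichever branch is checked out.
DEFAULT_INTEGRATION_MODEL_IDS: List[str] = ["kimi-k2.6", "deepseek-v4-flash"]


def resolve_model_ids(requested: Optional[str]) -> List[str]:
    """Parse a comma-separated ID string, falling back to the defaults.

    Hypothetical helper: an empty or missing input selects
    DEFAULT_INTEGRATION_MODEL_IDS; unknown IDs raise ValueError.
    """
    ids = [part.strip() for part in (requested or "").split(",") if part.strip()]
    if not ids:
        ids = list(DEFAULT_INTEGRATION_MODEL_IDS)
    unknown = [model_id for model_id in ids if model_id not in MODELS]
    if unknown:
        raise ValueError(f"unknown model IDs: {', '.join(unknown)}")
    return ids
```

With this layout the workflow only needs to pass through an explicit override (if any) and leave the default selection to the script.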

Changes

| File | Change |
| --- | --- |
| .github/run-eval/resolve_model_config.py | Add DEFAULT_INTEGRATION_MODEL_IDS list |
| .github/workflows/integration-runner.yml | Remove DEFAULT_MODEL_IDS env var; read defaults from Python |
| tests/cross/test_resolve_model_config.py | Add test that all default IDs exist in MODELS |
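The consistency test listed above could look roughly like the following. This is a sketch under assumptions: the stand-in `MODELS` dict, the `missing_ids` helper, and the test name are hypothetical, and a real test would import the constants from resolve_model_config.py rather than redefine them.

```python
from typing import Dict, Iterable, List

# Stand-ins so the sketch runs on its own; a real test would import these.
MODELS: Dict[str, dict] = {"kimi-k2.6": {}, "deepseek-v4-flash": {}}
DEFAULT_INTEGRATION_MODEL_IDS: List[str] = ["kimi-k2.6", "deepseek-v4-flash"]


def missing_ids(default_ids: Iterable[str], models: Dict[str, dict]) -> List[str]:
    """Return the default IDs that have no entry in the registry."""
    return [model_id for model_id in default_ids if model_id not in models]


def test_default_integration_models_exist() -> None:
    # A typo or a removed model in the default list fails here instead of
    # surfacing later as a broken integration run.
    missing = missing_ids(DEFAULT_INTEGRATION_MODEL_IDS, MODELS)
    assert not missing, f"defaults missing from MODELS: {missing}"
```

A pytest-style function like this needs no fixtures, so it can run alongside the existing tests for the script.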

How to update default models going forward

Edit the DEFAULT_INTEGRATION_MODEL_IDS list in .github/run-eval/resolve_model_config.py. Changes take effect immediately on any branch — no need to wait for main to be updated.


This PR was created by an AI agent (OpenHands) on behalf of the user.


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
| --- | --- | --- | --- |
| java | amd64, arm64 | eclipse-temurin:17-jdk | Link |
| python | amd64, arm64 | nikolaik/python-nodejs:python3.13-nodejs22-slim | Link |
| golang | amd64, arm64 | golang:1.21-bookworm | Link |

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:d220ba0-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-d220ba0-python \
  ghcr.io/openhands/agent-server:d220ba0-python

All tags pushed for this build

ghcr.io/openhands/agent-server:d220ba0-golang-amd64
ghcr.io/openhands/agent-server:d220ba0-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:d220ba0-golang-arm64
ghcr.io/openhands/agent-server:d220ba0-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:d220ba0-java-amd64
ghcr.io/openhands/agent-server:d220ba0-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:d220ba0-java-arm64
ghcr.io/openhands/agent-server:d220ba0-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:d220ba0-python-amd64
ghcr.io/openhands/agent-server:d220ba0-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:d220ba0-python-arm64
ghcr.io/openhands/agent-server:d220ba0-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:d220ba0-golang
ghcr.io/openhands/agent-server:d220ba0-java
ghcr.io/openhands/agent-server:d220ba0-python
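The tag names above appear to follow a regular pattern. The sketch below reproduces that pattern as inferred from the list alone; it is not taken from the build scripts, and the function names `sanitize_base_image` and `build_tags` are hypothetical.

```python
from typing import List, Sequence

REPO = "ghcr.io/openhands/agent-server"


def sanitize_base_image(base_image: str) -> str:
    """Make a base image reference usable inside a tag.

    Inferred from the tags above: "/" becomes "_s_" and ":" becomes "_tag_".
    """
    return base_image.replace("/", "_s_").replace(":", "_tag_")


def build_tags(
    short_sha: str,
    variant: str,
    base_image: str,
    arches: Sequence[str] = ("amd64", "arm64"),
) -> List[str]:
    """List the tags for one variant: two per architecture plus the manifest tag."""
    tags = []
    for arch in arches:
        tags.append(f"{REPO}:{short_sha}-{variant}-{arch}")
        tags.append(f"{REPO}:{short_sha}-{sanitize_base_image(base_image)}-{arch}")
    # The bare variant tag is the multi-arch manifest.
    tags.append(f"{REPO}:{short_sha}-{variant}")
    return tags


golang_tags = build_tags("d220ba0", "golang", "golang:1.21-bookworm")
```

Under these assumptions, each variant yields five tags, which matches the fifteen tags listed for the three variants.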

About Multi-Architecture Support

  • Each variant tag (e.g., d220ba0-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., d220ba0-python-amd64) are also available if needed

The integration-runner workflow uses pull_request_target, which always
evaluates workflow-level env vars from the default branch (main).  When
a release branch updates DEFAULT_MODEL_IDS before those changes are
merged to main, the integration tests still run with the old model set.

Move the default model list into DEFAULT_INTEGRATION_MODEL_IDS in
resolve_model_config.py.  Because the setup-matrix step already checks
out the PR branch, the Python constant is read from the correct branch
regardless of what main contains.

Co-authored-by: openhands <openhands@all-hands.dev>

github-actions (Bot, Contributor) commented May 7, 2026

Python API breakage checks — ✅ PASSED

Result: PASSED

github-actions (Bot, Contributor) commented May 7, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: PASSED

xingyaoww (Collaborator, Author) commented

Closing — this was over-engineered. The workflow env var is the right place for the default model list. The original issue was a one-time timing problem (label applied before the model update was merged to main), not a systemic issue that needs a code fix.

xingyaoww closed this May 7, 2026
xingyaoww deleted the fix-integration-default-model-source branch May 7, 2026 19:48