feat(lucebox): autotune sweep + profile/smoke diagnostics (stacked on #335) #5
easel wants to merge 1 commit into
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 39087d8adb
      local cmd="$1"; shift
      case "$cmd" in
    -   config|models|check|print-run|print-serve-argv)
    +   config|smoke|models|check|profile|print-run|print-serve-argv)
Keep profile off the exec fast path
When the server container is already running, this routes `lucebox profile` through `cmd_exec_in_container`. But `profile.run_profile` immediately shells out to `docker inspect`/`docker exec` to run `luce-bench` in the server container, and the long-running server container is started without the Docker socket mounted. In the normal “server running” case, the nested profile command therefore cannot see Docker and exits before taking a snapshot. Route `profile` through the docker-run orchestrator, like `autotune --sweep`.
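The routing rule this comment asks for can be modelled in a few lines. This is only an illustrative sketch in Python; the real wrapper is a shell script, and the `route` function, the `EXEC_FAST_PATH` set, and the assumption that everything falls back to the docker-run path when the server is not running are hypothetical names and behaviour, not taken from the PR.

```python
# Hypothetical model of the wrapper's routing decision (the real wrapper is shell).
# Verbs assumed safe to exec into the already-running server container.
# "profile" is deliberately absent: it needs Docker access, which the
# server container does not have.
EXEC_FAST_PATH = frozenset(
    {"config", "smoke", "models", "check", "autotune", "print-run", "print-serve-argv"}
)


def route(cmd: str, args: list[str], server_running: bool) -> str:
    """Return "exec" (run inside the server container) or "docker-run"."""
    if not server_running:
        # Assumption: with no running container there is nothing to exec into.
        return "docker-run"
    if cmd == "autotune" and "--sweep" in args:
        # A sweep restarts the server, so it must not run inside it.
        return "docker-run"
    if cmd in EXEC_FAST_PATH:
        return "exec"
    # Everything else, including "profile", goes through the orchestrator.
    return "docker-run"
```

With this shape, `route("profile", [], server_running=True)` lands on the docker-run path, the same as `autotune --sweep`, while plain `autotune` stays on the exec fast path.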
    if rc is not None:
        return rc

    cfg = config_mod.load() or config_mod.live_config()
Preserve scalar env overrides during sweeps
When `config.toml` exists, this raw load ignores the `LUCEBOX_PORT`, `LUCEBOX_CONTAINER`, and `LUCEBOX_MODELS` values that the wrapper exports and the systemd unit uses. With a non-default port or container, the sweep restarts the right service but then waits on `cfg.port` or profiles `cfg.container_name` using the stale or default file values, so every cell fails readiness or targets the wrong container. Use the same env overlay path as the CLI before building candidates and probing the server.
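A minimal sketch of what such an env overlay could look like, assuming a simple dataclass config. The `Config` class, its default values, the `models` field name, and `with_env_overrides` are hypothetical; only the three environment variable names and the `port` / `container_name` attributes come from the comment above.

```python
import os
from dataclasses import dataclass, replace
from typing import Mapping, Optional


@dataclass(frozen=True)
class Config:
    # Hypothetical defaults standing in for values read from config.toml.
    port: int = 8080
    container_name: str = "lucebox"
    models: str = "/models"


def with_env_overrides(cfg: Config, env: Optional[Mapping[str, str]] = None) -> Config:
    """Return a copy of cfg with LUCEBOX_* scalars from env layered on top."""
    env = os.environ if env is None else env
    updates: dict = {}
    if env.get("LUCEBOX_PORT"):
        updates["port"] = int(env["LUCEBOX_PORT"])
    if env.get("LUCEBOX_CONTAINER"):
        updates["container_name"] = env["LUCEBOX_CONTAINER"]
    if env.get("LUCEBOX_MODELS"):
        updates["models"] = env["LUCEBOX_MODELS"]
    # Unset or empty variables leave the file values untouched.
    return replace(cfg, **updates)
```

The sweep would then build candidates and probe readiness from the overlaid config rather than from the raw file load, so the port it waits on matches the one the restarted service actually listens on.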
39087d8 to 9f63eea
Stacked on the core lucebox CLI (Luce-Org#335). Restores the second-order tuning and diagnostics surface deferred from that PR:

- autotune.py: the empirical sweep machinery, i.e. candidate_configs, the Profile registry + per-arch coding-agent-loop brackets, get_profile. (recommend_preset stays in download.py from the core split.)
- sweep.py: the per-cell config sweep driver (config set -> restart -> luce-bench snapshot -> winner pick).
- profile.py / smoke.py + the `autotune`, `profile`, `smoke` CLI commands.
- Wrapper: the `autotune --sweep` exec-routing special case (sweeps must stay on docker run, not exec into the container they'd restart) and the autotune/profile/smoke entries in usage + completion + the exec set.

Tests: sweep / profile / smoke / autotune-cli / candidate-configs suites and the Profile/bracket tests restored; test_cli guards that the client launcher verbs are still deferred while autotune/profile/smoke register. lucebox 122 passed, wrapper 55 passed, ruff + mypy + shellcheck clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
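The per-cell loop named in the commit message (config set -> restart -> luce-bench snapshot -> winner pick) can be sketched roughly as follows. This is not the code in sweep.py: `run_sweep` and its injected callables are hypothetical, and treating a higher snapshot score as better is an assumption made for illustration.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

C = TypeVar("C")  # a candidate configuration, whatever shape it takes


def run_sweep(
    candidates: Iterable[C],
    apply_config: Callable[[C], None],      # stands in for "config set"
    restart: Callable[[], bool],            # True once the server is ready again
    snapshot: Callable[[], Optional[float]],  # benchmark score, None on failure
) -> Optional[Tuple[C, float]]:
    """Run one cell per candidate and return the best (candidate, score)."""
    best: Optional[Tuple[C, float]] = None
    for cand in candidates:
        apply_config(cand)
        if not restart():
            continue  # cell failed readiness; skip it
        score = snapshot()
        if score is None:
            continue  # benchmark produced no usable result
        # Assumption: higher score is better (e.g. a throughput figure).
        if best is None or score > best[1]:
            best = (cand, score)
    return best
```

Injecting the three steps as callables keeps the loop testable without Docker, which is one plausible reason to separate the driver from the commands it shells out to.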
9f63eea to a9a1f30
Stacked on Luce-Org#335 (core lucebox CLI). Base is `feat/lucebox-cli`, so this PR's diff is only the tuning/diagnostics cluster. Review/merge after Luce-Org#335.
Summary
Restores the second-order tuning + diagnostics surface deferred from Luce-Org#335:
Tests
sweep / profile / smoke / autotune-cli / candidate-configs suites + the Profile/bracket tests restored. `test_cli` guards that the client launcher verbs stay deferred while autotune/profile/smoke register.
lucebox 122 passed (123 with the inherited recommend_preset test), wrapper 55 passed, ruff + mypy + shellcheck clean.
🤖 Generated with Claude Code