Merged
80 commits
6fe0105
chore: start 0.31.1 development
steipete May 28, 2026
4c9e6a8
feat: add provider search
steipete May 29, 2026
b3c0d57
docs: update changelog for provider search
steipete May 29, 2026
01528f5
fix: add Opus 4.8 Claude pricing fallback
devYRPauli May 29, 2026
9674523
fix: preserve Codex web credits-only refresh
soumikbhatta May 29, 2026
06b7de1
fix: bound serve requests and coalesce cache misses
enieuwy May 29, 2026
72716d9
fix: bound OpenAI WebKit refresh lifecycle
steipete May 30, 2026
5ce38b6
fix: retry startup status after offline launch
steipete May 30, 2026
e6d61a8
Harden menu bar status item placement
pdurlej May 29, 2026
8545c76
fix: preserve menu bar placement on upgrade
steipete May 30, 2026
d7db992
fix: refresh cold-start menu readiness
steipete May 30, 2026
4a2ef3a
fix: parse updated Augment auggie status
bcharleson May 30, 2026
a509408
docs: update changelog for Augment parser fix
steipete May 30, 2026
cdd7e34
fix: require HTTPS for provider redirect cookies
Hinotoi-agent May 30, 2026
190d883
docs: update changelog for redirect cookie fix
steipete May 30, 2026
1cc54a0
fix: preserve Claude web usage on auth flakes
LeoLin990405 May 30, 2026
3631312
docs: update changelog for Claude web resilience
steipete May 30, 2026
5dec44e
fix: filter noisy Antigravity model quotas
guhyun9454 May 30, 2026
d2d1fc3
docs: update changelog for Antigravity quota filtering
steipete May 30, 2026
dbc944d
fix: harden CLI installer privilege boundary
Hinotoi-agent May 30, 2026
c566197
docs: update changelog for CLI installer hardening
steipete May 30, 2026
e7d9326
fix: isolate notarization temp files
Hinotoi-agent May 30, 2026
c28e3bb
docs: update changelog for notarization temp hardening
steipete May 30, 2026
482f1da
Fix menu tracking background rebuild stalls (#1233)
steipete May 30, 2026
ddb8054
chore: prepare 0.32.0 release
steipete May 30, 2026
d7a8b38
style: apply SwiftFormat to CLI server
steipete May 31, 2026
041bf4a
test: stabilize release precheck
steipete May 31, 2026
44e4d16
chore: normalize widget project package name
steipete May 31, 2026
1351961
docs: update appcast for 0.32.0
steipete May 31, 2026
6106ca0
chore: open 0.32.1 development
steipete May 31, 2026
d5a5796
fix: defer menu refresh until close
steipete May 31, 2026
e7a96dc
fix: cache codex account reconciliation
steipete May 31, 2026
3488587
fix: preserve Claude CLI token ownership
RajvardhanPatil07 May 31, 2026
07ed3fa
fix: reduce CodexBar menu refresh work
steipete May 31, 2026
07ee69b
chore: finalize 0.32.1 changelog
steipete May 31, 2026
37bc49f
docs: update appcast for 0.32.1
steipete May 31, 2026
af37ac8
chore: start 0.32.2 development
steipete May 31, 2026
460975a
fix: improve compact menu card padding
steipete May 31, 2026
db6eb87
chore: add live QA skill
steipete May 31, 2026
8784d1e
fix: harden live QA skill
steipete May 31, 2026
03827eb
fix: fail QA matrix on parser errors
steipete May 31, 2026
1e03bca
fix: align QA provider discovery
steipete May 31, 2026
0dc51f9
perf: cap automatic codex token scans
steipete Jun 1, 2026
b54a9e6
fix: enforce codex scan budget on refresh
steipete Jun 1, 2026
917fc72
perf: speed up codex token scanning
steipete Jun 1, 2026
6fa9423
chore: update widget project package group
steipete Jun 1, 2026
c778cf7
chore: finalize 0.32.2 changelog
steipete Jun 1, 2026
a6a538e
test: stabilize release precheck
steipete Jun 1, 2026
3f41906
docs: update appcast for 0.32.2
steipete Jun 1, 2026
bd921a6
chore: normalize widget project package reference
steipete Jun 1, 2026
4756ba0
chore: start 0.32.3 development
steipete Jun 1, 2026
ffd8d75
fix: handle Copilot token-billing unavailable quotas
devYRPauli Jun 1, 2026
51a8e23
fix: stop OpenAI WebView route reload loop
ProspectOre Jun 1, 2026
dc4e483
docs: update changelog for Copilot and OpenAI fixes
steipete Jun 1, 2026
085319c
fix: defer closed menu rebuilds during refresh
ProspectOre Jun 2, 2026
440aeb1
perf: cache provider brand icons
steipete Jun 2, 2026
d8b7619
chore: normalize widget package reference
steipete Jun 2, 2026
55c0a10
fix: clear bad status item placement
steipete Jun 2, 2026
1154f55
test: stabilize TTY runner harness
steipete Jun 2, 2026
b9b05d5
chore: finalize 0.32.3 changelog
steipete Jun 2, 2026
743c3c4
docs: update appcast for 0.32.3
steipete Jun 2, 2026
d57f180
chore: normalize widget package reference
steipete Jun 2, 2026
9e6557c
chore: start 0.32.4 development
steipete Jun 2, 2026
0be735b
Avoid redundant menu-open refreshes (#1277)
hhh2210 Jun 2, 2026
0ebb504
docs: add menu refresh changelog entry
steipete Jun 2, 2026
edf3d8d
chore: finalize 0.32.4 changelog
steipete Jun 2, 2026
723734e
docs: update appcast for 0.32.4
steipete Jun 2, 2026
0efa836
docs(026): v0.32.x upstream-sync research set + autonomous loop drive…
o1xhack Jun 3, 2026
6d3e54d
Merge tag 'v0.32.4' into upstream-sync/v0.32.4-mobile.1.11.0
o1xhack Jun 3, 2026
5d8f616
fix(cost): roll cache invalidation for v0.32.x Codex parser merge (pa…
o1xhack Jun 3, 2026
cd64162
docs(026): R1 — G1 merge + G2 parser cache invalidation done (2/10)
o1xhack Jun 3, 2026
d9b746f
test: add Keychain-prompt safety guidance to AGENTS.md (fixes pre-exi…
o1xhack Jun 4, 2026
4a088b3
docs(026): R2 — G3/G6/G7/G9 verified (6/10); next G4 iOS scope decision
o1xhack Jun 4, 2026
3d59278
feat(026): iOS 1.11.0 — v0.32.4 sync release notes (4-lang) + version…
o1xhack Jun 4, 2026
bf42bee
docs(026): R3 — G4/G5/G8 done, iOS 1.11.0 staged (9/10); G10 release …
o1xhack Jun 4, 2026
811f9c4
feat(ios): provider search on the Usage tab (filter 20+ providers) — …
o1xhack Jun 5, 2026
19b5b5e
docs(026): R4 — iOS Usage provider search added (build 149, CR SHIP)
o1xhack Jun 5, 2026
6be4d1d
docs(026): R5 — G10 partial (Mac 0.32.4.1 draft+install, iOS 149 Test…
o1xhack Jun 5, 2026
934e462
ci: fetch-depth 0 for lint-build-test so parser-version audit can com…
o1xhack Jun 6, 2026
8f9db73
ci: run on PR + mobile-dev/main pushes only, not every feature-branch…
o1xhack Jun 6, 2026
118 changes: 118 additions & 0 deletions .agents/skills/qa-test/SKILL.md
@@ -0,0 +1,118 @@
---
name: qa-test
description: "CodexBar live QA/e2e testing: run provider usage matrix checks, validate real app config, use Peekaboo for menu proof, use Browser Use/official docs for API spec or logged-in dashboard checks, and handle 1Password credentials safely."
---

# CodexBar Live QA

Use for live provider testing, release smoke tests, menu verification, or debugging “provider works/fails” reports.

## Rules

- Work from the CodexBar repo checkout.
- Use the packaged CLI first: `CodexBar.app/Contents/Helpers/CodexBarCLI`.
- Do not use `CodexBar.app/Contents/MacOS/codexbar`; that is the app binary and may appear to hang as a CLI.
- Never run broad `env`, `set`, or secret regex dumps.
- Use `$one-password` for secrets: all `op` commands inside one persistent tmux session, service account first, no raw secret output.
- Treat browser-cookie/keychain flows as prompt-risky. Prefer CLI/API-token checks and `KeychainNoUIQuery`-safe tests unless the user explicitly requested live UI.
- For current API behavior, browse official provider docs only.

## CLI Matrix

Run the bundled script:

```bash
.agents/skills/qa-test/scripts/live_provider_matrix.sh --enabled
```

Useful modes:

```bash
.agents/skills/qa-test/scripts/live_provider_matrix.sh --provider all
.agents/skills/qa-test/scripts/live_provider_matrix.sh --providers openai,zai,deepseek
.agents/skills/qa-test/scripts/live_provider_matrix.sh --default
```

Interpretation:

- `--enabled` asks `CodexBarCLI config providers` for enabled providers, honoring `CODEXBAR_CONFIG` and default toggles.
- `--default` runs the app-facing default command with no provider override.
- `--provider all` forces every registered provider and is expected to fail for providers without sessions/keys.
- A green app config requires clean `--enabled` and `--default` runs; `--provider all` is only a discovery/triage tool.
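
The green-config rule above can be sketched as a small wrapper. This is a minimal sketch and not part of the skill: `matrix_is_green` is a hypothetical helper, and it assumes the matrix script exits 0 only when every case passed.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the skill): a config counts as green only
# when both the --enabled and the --default matrix runs exit 0.
# $1 is the command to run; it defaults to the skill's matrix script.
matrix_is_green() {
  local runner="${1:-.agents/skills/qa-test/scripts/live_provider_matrix.sh}"
  local mode failed=0
  for mode in --enabled --default; do
    if "$runner" "$mode"; then
      echo "$mode: clean"
    else
      # Inside the else branch, $? still holds the runner's exit status.
      echo "$mode: failed (exit $?)"
      failed=1
    fi
  done
  return "$failed"
}
```

Calling `matrix_is_green` with no argument runs the bundled script from the repo root. Both modes always run, so one failure does not hide the other.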

## Config QA

Validate config:

```bash
CodexBar.app/Contents/Helpers/CodexBarCLI config validate
stat -f '%Lp %N' "$HOME/.codexbar/config.json"
```

Redact config shape:

```bash
jq '(.providers // []) |= map(.apiKey = (if .apiKey then "<redacted>" else .apiKey end) |
  .secretKey = (if .secretKey then "<redacted>" else .secretKey end) |
  .cookieHeader = (if .cookieHeader then "<redacted>" else .cookieHeader end) |
  (if .id == "stepfun" and has("region") then .region = "<redacted>" else . end) |
  .tokenAccounts = (if .tokenAccounts then (.tokenAccounts | .accounts = (.accounts | map(.token = "<redacted>"))) else .tokenAccounts end))' \
  "$HOME/.codexbar/config.json"
```

Before editing config, make a backup:

```bash
cp "$HOME/.codexbar/config.json" "$HOME/.codexbar/config.pre-qa-$(date +%Y%m%d%H%M%S).json"
chmod 600 "$HOME/.codexbar"/config.pre-qa-*.json
```
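
The `cp` followed by `chmod 600` leaves a short window in which the backup can carry the source file's permissions. A minimal sketch of an alternative, assuming a hypothetical `backup_config` helper that is not part of the skill, creates the copy under a restrictive umask instead:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the skill): copy a config file to a
# timestamped backup that is never readable by group/other, even briefly.
# Running cp under umask 077 inside a subshell leaves the caller's umask intact.
backup_config() {
  local src="$1"
  local dest="${src%.json}.pre-qa-$(date +%Y%m%d%H%M%S).json"
  ( umask 077 && cp "$src" "$dest" ) || return 1
  # Print the backup path so the caller can record or restore it.
  echo "$dest"
}
```

For example, `backup_config "$HOME/.codexbar/config.json"` prints the path of the new backup. The sketch relies on `cp` creating the destination with the source mode masked by the umask, so it only protects newly created files.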

## Live Menu QA

Use Peekaboo after CLI checks:

```bash
pkill -x CodexBar || pkill -f 'CodexBar.app/Contents/MacOS/CodexBar' || true
open -n "$PWD/CodexBar.app"
peekaboo menu list-all --json | rg -i 'codexbar'
peekaboo menu click-extra --title codexbar-merged --json
screencapture -x /tmp/codexbar-live-menu.png
```

Crop top-right menu if needed:

```bash
sips --cropToHeightWidth 900 340 --cropOffset 20 2650 /tmp/codexbar-live-menu.png \
  --out /tmp/codexbar-live-menu-crop.png >/dev/null
```

Verify visually with `view_image`. Confirm provider tabs/rows match enabled config and no failing provider dominates the first screen.

## Browser Use

Use `$browser-use` only when a logged-in dashboard, API key page, or provider docs need browser/profile state.

Existing Chrome path:

```bash
mcporter call chrome-devtools.list_pages --args '{}' --output text
mcporter call chrome-devtools.navigate_page --args '{"url":"https://provider.example"}' --output text
mcporter call chrome-devtools.take_snapshot --args '{}' --output text
```

If Browser Use is unavailable, say so and use web search for public official docs; do not substitute isolated Playwright for login/profile-dependent pages.

## Fix Triage

- Missing auth/session: configure key/session if available; otherwise leave provider disabled or report blocked auth.
- Wrong provider API/spec: inspect official docs, then patch fetcher/settings/tests.
- Provider key exists but live API rejects it: keep key stored if useful, disable provider if the menu would show a persistent error.
- User-facing behavior changes need `CHANGELOG.md`.
- Code fixes need focused tests, `make check`, `$autoreview`, and live CLI proof before landing.

## Known CodexBar QA Notes

- For OpenAI usage, the Admin API key is the useful provider key. Project `OPENAI_API_KEY` values can fail the legacy credit-balance fallback with 403.
- Deepgram usage requires a key/project with Management API permissions; transcription-only keys can return 403.
- Groq usage uses the Prometheus metrics API, not ordinary inference endpoints.
- MiniMax pay-as-you-go API keys and Token Plan/Coding Plan keys are different; wrong key kind can leave usage unavailable.
4 changes: 4 additions & 0 deletions .agents/skills/qa-test/agents/openai.yaml
@@ -0,0 +1,4 @@
interface:
  display_name: "CodexBar QA Test"
  short_description: "Run live CodexBar CLI and menu QA safely."
  default_prompt: "Run CodexBar live QA with CLI, Peekaboo, browser docs, and 1Password-safe credential checks."
10 changes: 10 additions & 0 deletions .agents/skills/qa-test/references/api-specs.md
@@ -0,0 +1,10 @@
# API Spec Pointers

Use current official docs for provider API behavior. Prefer these searches/pages before patching fetchers:

- MiniMax: `https://platform.minimax.io/docs/llms.txt`; key types differ between pay-as-you-go API keys and Token Plan/Coding Plan keys.
- Deepgram: `https://developers.deepgram.com/llms.txt`; usage/project APIs require Management permissions and project-scoped keys.
- Groq: `https://console.groq.com/docs/prometheus-metrics`; usage metrics use `https://api.groq.com/v1/metrics/prometheus`.
- LLM Proxy/LiteLLM: `https://docs.litellm.ai/`; CodexBar expects an LLM-API-Key-Proxy compatible `/v1/quota-stats` endpoint plus base URL.

When citing docs in a user-facing answer, browse the current page and include source links.
183 changes: 183 additions & 0 deletions .agents/skills/qa-test/scripts/live_provider_matrix.sh
@@ -0,0 +1,183 @@
#!/usr/bin/env bash
set -u

ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../../../.." && pwd)"
CLI="${CODEXBAR_CLI:-$ROOT/CodexBar.app/Contents/Helpers/CodexBarCLI}"
TIMEOUT_BIN="${TIMEOUT_BIN:-$(command -v gtimeout || command -v timeout || true)}"
WEB_TIMEOUT="${CODEXBAR_QA_WEB_TIMEOUT:-12}"
CASE_TIMEOUT="${CODEXBAR_QA_CASE_TIMEOUT:-60}"

usage() {
  cat <<'USAGE'
Usage:
  live_provider_matrix.sh --enabled
  live_provider_matrix.sh --default
  live_provider_matrix.sh --provider all
  live_provider_matrix.sh --providers openai,zai,deepseek

Environment:
  CODEXBAR_CLI=/path/to/CodexBarCLI
  CODEXBAR_CONFIG=/path/to/config.json
  CODEXBAR_QA_WEB_TIMEOUT=12
  CODEXBAR_QA_CASE_TIMEOUT=60
USAGE
}

if [[ ! -x "$CLI" ]]; then
  echo "missing CodexBarCLI at $CLI" >&2
  exit 2
fi
if [[ -z "$TIMEOUT_BIN" ]]; then
  echo "missing timeout command (install coreutils for gtimeout)" >&2
  exit 2
fi
if ! command -v node >/dev/null 2>&1; then
  echo "missing node" >&2
  exit 2
fi

mode="${1:-}"
shift || true

providers=()
case "$mode" in
  --enabled)
    provider_status="$(mktemp)"
    provider_err="$(mktemp)"
    provider_list="$(mktemp)"
    if ! "$CLI" config providers --format json --json-only >"$provider_status" 2>"$provider_err"; then
      rm -f "$provider_status" "$provider_err" "$provider_list"
      echo "failed to list providers via CodexBarCLI config providers" >&2
      exit 2
    fi
    if ! node - "$provider_status" >"$provider_list" <<'NODE'; then
const fs = require("fs");
const path = process.argv[2];
const raw = fs.readFileSync(path, "utf8").trim();
const payload = JSON.parse(raw);
if (!Array.isArray(payload)) {
  throw new Error("config providers output is not an array");
}
for (const item of payload) {
  if (item && item.enabled === true && typeof item.provider === "string" && item.provider) {
    console.log(item.provider);
  }
}
NODE
      rm -f "$provider_status" "$provider_err" "$provider_list"
      echo "failed to parse CodexBarCLI config providers output" >&2
      exit 2
    fi
    while IFS= read -r provider; do
      [[ -n "$provider" ]] && providers+=("$provider")
    done <"$provider_list"
    rm -f "$provider_status" "$provider_err" "$provider_list"
    if [[ "${#providers[@]}" -eq 0 ]]; then
      echo "no enabled providers found via CodexBarCLI config providers" >&2
      exit 2
    fi
    ;;
  --default)
    providers=("__default__")
    ;;
  --provider)
    if [[ -z "${1:-}" ]]; then
      echo "missing provider" >&2
      exit 2
    fi
    providers=("${1:-}")
    ;;
  --providers)
    if [[ -z "${1:-}" ]]; then
      echo "missing providers" >&2
      exit 2
    fi
    IFS=',' read -r -a providers <<< "${1:-}"
    ;;
  -h|--help|"")
    usage
    exit 0
    ;;
  *)
    echo "unknown mode: $mode" >&2
    usage >&2
    exit 2
    ;;
esac

redact_node='
const redact = s => String(s || "")
  .replace(/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+/g, "<email>")
  .replace(/sk-[A-Za-z0-9_-]{12,}/g, "sk-REDACTED")
  .replace(/gsk_[A-Za-z0-9_-]{12,}/g, "gsk_REDACTED")
  .replace(/[A-Za-z0-9_-]{32,}/g, m => /[A-Za-z]/.test(m) && /[0-9]/.test(m) ? "<redacted-token>" : m);
'

run_one() {
  local name="$1"
  shift
  local out err start end elapsed st node_status
  out="$(mktemp)"
  err="$(mktemp)"
  start="$(date +%s)"
  "$TIMEOUT_BIN" "$CASE_TIMEOUT" "$CLI" usage "$@" --format json --json-only --web-timeout "$WEB_TIMEOUT" >"$out" 2>"$err"
  st=$?
  end="$(date +%s)"
  elapsed=$((end - start))
  node - "$name" "$st" "$elapsed" "$out" "$err" <<NODE
const fs = require("fs");
$redact_node
const [name, st, elapsed, outPath, errPath] = process.argv.slice(2);
const raw = fs.readFileSync(outPath, "utf8").trim();
const err = fs.readFileSync(errPath, "utf8").trim();
let rows = [];
let formatterFailed = false;
try {
  const payload = raw ? JSON.parse(raw) : [];
  const arr = Array.isArray(payload) ? payload : [payload];
  for (const p of arr) {
    rows.push(
      \`\${p.provider || name}:\${p.error ? "fail" : "ok"}:source=\${p.source || "unknown"}\` +
      (p.account ? \`,account=\${redact(p.account)}\` : "") +
      (p.usage ? ",usage=yes" : "") +
      (p.credits ? ",credits=yes" : "") +
      (p.error ? \`,error=\${redact(p.error.message).slice(0, 180)}\` : "")
    );
  }
} catch (error) {
  formatterFailed = true;
  rows.push(\`\${name}:parse-fail:error=\${redact(error.message)} stdout=\${redact(raw).slice(0, 200)} stderr=\${redact(err).slice(0, 200)}\`);
}
if (!rows.length) {
  formatterFailed = true;
  rows.push(\`\${name}:empty:stderr=\${redact(err).slice(0, 200)}\`);
}
console.log(\`TEST \${name} exit=\${st} elapsed=\${elapsed}s :: \${rows.join(" | ")}\`);
if (formatterFailed) process.exit(1);
NODE
  node_status=$?
  rm -f "$out" "$err"
  if [[ "$node_status" -ne 0 ]]; then
    return 1
  fi
  return "$st"
}

overall=0
ran=0
for provider in "${providers[@]}"; do
  [[ -z "$provider" ]] && continue
  ran=$((ran + 1))
  if [[ "$provider" == "__default__" ]]; then
    run_one default || overall=1
  elif [[ "$provider" == "all" ]]; then
    run_one all --provider all || overall=1
  else
    run_one "$provider" --provider "$provider" || overall=1
  fi
done
if [[ "$ran" -eq 0 ]]; then
  echo "no provider cases ran" >&2
  exit 2
fi
exit "$overall"
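
To see what the script's `redact` helper does to a line of output, the first three patterns can be approximated in plain shell. This is a rough sketch using `sed -E`, not the script's own code: the shell function name `redact` is reused only for illustration, and the generic rule for tokens of 32 or more characters containing both letters and digits is left out.

```bash
#!/usr/bin/env bash
# Approximation of the script's Node `redact` helper using sed -E.
# Covers the email, sk-, and gsk_ patterns only; the generic
# "32+ characters containing both letters and digits" rule is omitted.
redact() {
  sed -E \
    -e 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+/<email>/g' \
    -e 's/sk-[A-Za-z0-9_-]{12,}/sk-REDACTED/g' \
    -e 's/gsk_[A-Za-z0-9_-]{12,}/gsk_REDACTED/g'
}

# prints: account=<email> key=sk-REDACTED
printf '%s\n' 'account=dev@example.com key=sk-abcdefghijklmnop' | redact
```

Prefixed keys shorter than 12 characters after the prefix pass through unchanged, which matches the `{12,}` bound in the original patterns.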
17 changes: 13 additions & 4 deletions .github/workflows/ci.yml
@@ -1,11 +1,12 @@
 name: CI

 on:
+  # CI runs on PRs (the merge gate) and on pushes to the long-lived branches
+  # only — NOT on every feature-branch commit. A feature branch gets its CI
+  # through the PR it opens (pull_request: opened/synchronize), so intermediate
+  # work-in-progress commits no longer each spawn a (often red) CI run.
   push:
-    # `["*"]` only matches single-level branch names; anything with a slash
-    # (e.g. `feature/…`, `release/…`) is silently skipped. `["**"]` matches
-    # arbitrary-depth branches so every push triggers CI.
-    branches: ["**"]
+    branches: [mobile-dev, main]
   pull_request:

 concurrency:
@@ -23,6 +24,14 @@ jobs:
     timeout-minutes: 70
     steps:
       - uses: actions/checkout@v6
+        # Full history so `lint.sh audit_parser_version` can compute the
+        # `origin/mobile-dev...HEAD` merge-base. A shallow (depth-1) checkout
+        # makes the parser-version audit false-fail on any PR that touches the
+        # cost-usage parser (it can't see the parserLogicVersion bump in the
+        # diff). See Scripts/lint.sh: "In CI, ensure your checkout fetches
+        # origin/mobile-dev (e.g. fetch-depth: 0)."
+        with:
+          fetch-depth: 0

       - name: Select Xcode 26.1.1 (if present) or fallback to default
         run: |
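
The reason for `fetch-depth: 0` can be illustrated with a small guard. This is a sketch only and is not taken from `Scripts/lint.sh`: `merge_base_or_fail` is a hypothetical helper that refuses to compute a merge-base in a shallow clone, where the common ancestor may be missing from history.

```bash
#!/usr/bin/env bash
# Hypothetical pre-audit guard (not taken from Scripts/lint.sh): refuse to
# compute a merge-base in a shallow clone, where history may be missing.
# $1 is the base ref, e.g. origin/mobile-dev.
merge_base_or_fail() {
  local base_ref="$1"
  if [ "$(git rev-parse --is-shallow-repository)" = "true" ]; then
    echo "shallow checkout; fetch full history (fetch-depth: 0) first" >&2
    return 1
  fi
  # Prints the commit that a three-dot diff (base...HEAD) is taken against.
  git merge-base "$base_ref" HEAD
}
```

In a full checkout, `merge_base_or_fail origin/mobile-dev` prints the commit hash of the common ancestor; in a depth-1 checkout it fails early with a clear message instead of producing a misleading audit result.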
2 changes: 1 addition & 1 deletion .mac-release.env
@@ -15,7 +15,7 @@ MAC_RELEASE_ARTIFACT_PREFIX='CodexBar-macos-[A-Za-z0-9_+-]+-'
 MAC_RELEASE_FEED_URL='https://raw.githubusercontent.com/steipete/CodexBar/main/appcast.xml'
 MAC_RELEASE_DOWNLOAD_URL_PREFIX='https://github.com/steipete/CodexBar/releases/download/v${MARKETING_VERSION}/'

-MAC_RELEASE_PRECHECK='swiftformat Sources Tests >/dev/null && swiftlint --strict && swift test --parallel'
+MAC_RELEASE_PRECHECK='swiftformat Sources Tests >/dev/null && swiftlint --strict && swift test --enable-xctest --disable-swift-testing && swift test --enable-swift-testing --disable-xctest --no-parallel'
 MAC_RELEASE_PACKAGE_CMD='Scripts/sign-and-notarize.sh'
 MAC_RELEASE_TAG_SIGNED=1
 MAC_RELEASE_TAG_FORCE=1
1 change: 1 addition & 0 deletions AGENTS.md
@@ -73,6 +73,7 @@ Full status definitions and index are in `CodexBarMobile/Research/README.md`.
 - Build with `xcodebuild` to verify compilation
 - Run unit tests if applicable
 - Verify on simulator or real device as needed
+- Never run tests/checks or ad-hoc validation that can display macOS Keychain prompts. Live provider probes, browser-cookie imports, `codexbar usage` against real accounts, and real SecItem reads must be explicitly requested; otherwise use parser tests, stubs, test stores, or `KeychainNoUIQuery`.

 ## Step 5 — Documentation
