[codex] Recover browser image artifact downloads #252

Draft
derekszen wants to merge 3 commits into steipete:main from derekszen:codex/devtools-fetch-rootcause

Conversation

@derekszen

Contributor

Summary

  • retry manual-login DevTools tab creation on fresh browser launches, not only reused Chrome sessions
  • fall back to authenticated browser-context fetch when Node fetch cannot download ChatGPT generated image artifacts
  • add focused regression coverage for transient DevTools failures and browser-context image artifact recovery
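The fallback in the second bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `downloadWithFallback`, `ArtifactFetcher`, and the two injected fetchers are hypothetical names, and the real browser-context fetch would run inside the authenticated page rather than as a plain function.

```typescript
// Hypothetical type: anything that can turn an artifact URL into raw bytes.
type ArtifactFetcher = (url: string) => Promise<Uint8Array>;

type DownloadResult = { bytes: Uint8Array; source: "node" | "browser" };

// Try the Node-side fetcher first; only if it throws, try the browser-context fetcher.
async function downloadWithFallback(
  url: string,
  nodeFetch: ArtifactFetcher,
  browserFetch: ArtifactFetcher,
): Promise<DownloadResult> {
  try {
    return { bytes: await nodeFetch(url), source: "node" };
  } catch (nodeError) {
    try {
      return { bytes: await browserFetch(url), source: "browser" };
    } catch (browserError) {
      // Report both failures so neither root cause is hidden.
      throw new Error(
        `Node fetch failed (${String(nodeError)}); ` +
          `browser-context fetch failed (${String(browserError)})`,
      );
    }
  }
}
```

Injecting both fetchers keeps the sketch testable without a live browser; the `source` field records which path produced the bytes.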

Root Cause

Long-running manual-login browser image runs could fail in two adjacent ways. Fresh manual-login Chrome launches still used zero retries when opening an isolated DevTools tab, so a transient ECONNREFUSED surfaced immediately. Separately, generated image URLs could be visible in the ChatGPT page but fail from Node-side fetch; fetching inside the authenticated browser context recovers those artifacts.
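The retry behavior described above might look something like the sketch below. It assumes the failure carries an `ECONNREFUSED` code or message; `withConnectionRetry`, `RetryOptions`, and `isConnectionRefused` are hypothetical names for illustration, not functions from this repository.

```typescript
// Hypothetical options: how many extra attempts to make and how long to wait between them.
type RetryOptions = { retries: number; delayMs: number };

// Treat an error as transient when its code or message mentions ECONNREFUSED.
function isConnectionRefused(error: unknown): boolean {
  if (typeof error !== "object" || error === null) return false;
  const e = error as { code?: unknown; message?: unknown };
  return (
    e.code === "ECONNREFUSED" ||
    (typeof e.message === "string" && e.message.includes("ECONNREFUSED"))
  );
}

// Run the operation, retrying only on connection-refused errors.
// With retries = 0 the first failure surfaces immediately.
async function withConnectionRetry<T>(
  operation: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  let attempt = 0;
  for (;;) {
    try {
      return await operation();
    } catch (error) {
      if (attempt >= options.retries || !isConnectionRefused(error)) throw error;
      attempt += 1;
      await new Promise<void>((resolve) => setTimeout(resolve, options.delayMs));
    }
  }
}
```

Under this sketch, the fresh-launch failure corresponds to calling the helper with `retries: 0`, and the fix corresponds to passing a non-zero retry count on that path as well.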

Validation

  • pnpm exec vitest run tests/browser/chromeLifecycle.test.ts tests/browser/chatgptImages.test.ts
  • pnpm run build
  • pnpm run format:check
  • pnpm run lint
  • local browser smoke: the patched CLI recovered from a Node "fetch failed" error and saved a valid PNG
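One way to check the "valid PNG" claim in a smoke test is to compare the first eight bytes of the saved file against the fixed PNG signature. The sketch below does only that; `hasPngSignature` is a hypothetical helper, and a signature match does not prove the rest of the file decodes.

```typescript
// The fixed 8-byte signature that starts every PNG file.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// Returns true when the buffer is long enough and starts with the PNG signature.
function hasPngSignature(bytes: Uint8Array): boolean {
  if (bytes.length < PNG_SIGNATURE.length) return false;
  return PNG_SIGNATURE.every((value, index) => bytes[index] === value);
}
```

A stricter check would also parse the IHDR chunk for width and height, which would cover the "type and size" evidence a reviewer might ask for.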

@clawsweeper

clawsweeper Bot commented Jun 12, 2026

Contributor

Codex review: needs real behavior proof before merge. Reviewed June 12, 2026, 12:15 AM ET / 04:15 UTC.

Summary
Review failed before ClawSweeper could summarize the requested change.

Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path.

Review metrics: none identified.

Merge readiness
Overall: 🌊 off-meta tidepool
Proof: 🌊 off-meta tidepool
Patch quality: 🌊 off-meta tidepool
Result: rating does not apply to this item.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
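The "weaker of" rule can be illustrated with a small sketch. It assumes the six ranked tiers are ordered as in the rank legend in this comment and leaves out the "off-meta tidepool" case, whose handling is not described here; `RANKS` and `overallRank` are hypothetical names, not ClawSweeper's code.

```typescript
// Ranked tiers from weakest to strongest (ordering assumed from the rank legend).
const RANKS = [
  "unranked krab",
  "silver shellfish",
  "gold shrimp",
  "platinum hermit",
  "diamond lobster",
  "challenger crab",
] as const;

type Rank = (typeof RANKS)[number];

// Overall readiness is whichever of the two inputs sits lower in the ordering.
function overallRank(proof: Rank, patchQuality: Rank): Rank {
  return RANKS.indexOf(proof) <= RANKS.indexOf(patchQuality) ? proof : patchQuality;
}
```

For example, `overallRank("unranked krab", "challenger crab")` yields "unranked krab", which is how missing proof caps an otherwise strong patch.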

Risk before merge

  • [P1] No close action taken because the review did not complete.

Maintainer options:

  1. Decide the mitigation before merge
    Retry the Codex review after fixing the execution failure.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

  • [P1] Review did not complete, so no work-lane recommendation was made.
Review details

Best possible solution:

Retry the Codex review after fixing the execution failure.

Do we have a high-confidence way to reproduce the issue?

Unclear. The review failed before ClawSweeper could establish a reproduction path.

Is this the best way to solve the issue?

Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction.

AGENTS.md: unclear because the file could not be read completely.

Codex review notes: model internal, reasoning high; reviewed against b1c0755507ed.

Label changes

Label changes:

  • add rating: 🌊 off-meta tidepool: Overall readiness is 🌊 off-meta tidepool; proof is 🌊 off-meta tidepool and patch quality is 🌊 off-meta tidepool.
  • remove status: 📣 needs proof: Current PR status no longer selects a status label.
  • remove rating: 🦪 silver shellfish: Current PR rating is rating: 🌊 off-meta tidepool, so this older rating label is no longer current.
  • remove P2: Current review triage priority is none.

Label justifications:

  • rating: 🌊 off-meta tidepool: Overall readiness is 🌊 off-meta tidepool; proof is 🌊 off-meta tidepool and patch quality is 🌊 off-meta tidepool.
Evidence reviewed

What I checked:

  • failure reason: retryable codex transport failure (capacity)
  • codex failure detail: Codex review failed for this PR with exit 1.
  • codex stderr: ],"bestSolution":"Require a redacted real browser run that forces Node fetch failure and proves the browser-context path saves a valid representative image, while confirming the broader warning and timed-turn behavior or splitting it.","triagePriority":"P2","impactLabels":[],"mergeRiskLabels":["merge-risk: 🚨 availability"],"mergeRiskOptions":[{"title":"Add representative live recovery proof","body":"Show a real forced Node-fetch failure, browser-context recovery logs, and a valid saved image with type and size before merge.","category":"fix_before_merge","recommended":true,"automergeInstruction":""},{"title":"Split the broader browser behavior","body":"Move warning-modal and timed-turn parsing into separate PRs if they cannot be demonstrated as part of the same image recovery scenario.","category":"pause_or_close","recommended":false,"automergeInstruction":""}],"reviewMetrics":[{"label":"Change breadth","value":"4 runtime files, 3 test files","reason":"The branch affects four distinct browser behaviors, so one claimed smoke without inspectable evidence is not enough."}],"labelJustifications":[{"label":"P2","reason":"This is a normal-priority intermittent browser reliability repair with focused scope and no demonstrated widespread outage."},{"label":"merge-risk: 🚨 availability","reason":"The new fallback transfers complete image payloads through a timed DevTools evaluation, which could stall or fail real browser run.
  • codex stdout: No stdout captured.

Likely related people:

  • unknown: Codex failed before it could trace repository history. (role: review did not complete; confidence: low)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper Bot added the label "rating: 🌊 off-meta tidepool" (PR readiness rating does not apply to this item.) on Jun 12, 2026
@clawsweeper Bot added the labels "rating: 🦪 silver shellfish" (Thin PR readiness signal; proof, validation, or implementation needs work.), "status: 📣 needs proof" (The PR needs real behavior proof before ClawSweeper can clear the contributor ask.), and "P2" (Normal priority bug or improvement with limited blast radius.), and removed the label "rating: 🌊 off-meta tidepool" on Jun 12, 2026
@clawsweeper Bot added the label "rating: 🌊 off-meta tidepool" and removed the labels "rating: 🦪 silver shellfish" and "status: 📣 needs proof" on Jun 12, 2026
@steipete

Owner

Maintainer update: I could not push the verified fixup commit to this fork branch; GitHub rejected dedene/codex/devtools-fetch-rootcause with permission denied from this environment.

I preserved this PR's contributor commits and pushed the repaired stack to steipete/oracle as draft replacement #253.

Keeping this PR open per maintainer instruction.

Labels

  • P2: Normal priority bug or improvement with limited blast radius.
  • rating: 🌊 off-meta tidepool: PR readiness rating does not apply to this item.
