perf(graphql): tighten resolver execute hot path#8498
Conversation
Three O(depth) string operations run per resolver call against the
current Object-keyed `rootCtx.fields`: `pathToArray` +
`path.join('.')` in `assertField`, and `lastIndexOf('.')` + `slice`
in `getParentField`. The collapse-dedupe in `resolve.js#start` then
walks the path a third time only to early-return for every collapsed
duplicate. On a deep author/posts/comments workload that is a
measurable share of total CPU spent allocating and discarding path
strings.
Switch `rootCtx.fields` to a `Map<Path, FieldCtx>` keyed on the
graphql-js Path linked-list node identity (`{ prev, key, typename }`
is fresh per `addPath` and never collides). Four savings ride on the
swap:
1. `assertField` is one `Map.get(path)` lookup; no path array, no
string join, no `Object.create` per execute.
2. `getParentField` walks `path.prev` and reads each level via
`Map.get` (O(1) lookup, no substring).
3. The collapse dedupe runs up front in `start()` against a Set
keyed on the lazily-computed path string; duplicates return before
any parent lookup or span construction.
4. `finishResolvers` iterates `Map.values()` and skips field-ctxes
that never started a span. `Span#finish` is idempotent, so the
old loop silently double-published every collapse-dup onto the
currently-active span; trimming those publishes is the cleanup
the earlier dedupe earns.
The path string is computed once, lazily, inside `start()` when a
span is actually created, and shared between the collapse-Set key
and the `graphql.field.path` tag.
Microbench (deep 5x5x4 query, 2000 iters x 7 trials, drop best+worst,
Node 24.15 / V8 13.6):
* baseline 753.7 us/op, stddev 14.2 us
* patched 447.1 us/op, stddev 3.2 us
-306.6 us/op (-40.7 %) at the execute boundary, and the stddev drops
4.4x as the per-call path-array allocations stop driving GC.
Drive-by fix:
* Pin `shouldInstrument`'s non-collapse depth branch with a new "with
collapsing disabled and a depth >=1" describe. The existing spec
exercises depth filtering only with default `collapse: true`.
`field.ctx.info = info` in `resolveAsync`'s completion callback pinned the entire `GraphQLResolveInfo` object (schema, fragments, rootValue, operation, variableValues, fieldNodes, parentType, returnType, path) on every published field-ctx for the request's lifetime. The only consumer reads `info` synchronously inside the resolve plugin's `start()` while building span tags / collecting resolver vars / driving AppSec's `resolverInfo` channel; IAST's subscriber to `apm:graphql:resolve:start` reads `data.rootCtx` and `data.args`, never `info`. Nothing reads it from the post-start ctx, so the write was dead retention only the GC paid for.
…luck info Two coordinated pieces ride on the same per-field-ctx in `assertField`: 1. Collapse dedupe and depth gating move from `GraphQLResolvePlugin#start` into `assertField`. Depth-skipped and collapse-duplicate paths return early before the `startResolveCh` publish, so the resolve plugin's start no longer runs for them, nor does `updateField` fire for them. The depth-count and length-count linked-list walks fold into one combined walk that captures both totals. `GraphQLExecutePlugin#bindStart` writes `ctx.collapse` and `ctx.depth` onto the root ctx so the instrumentation layer can do the gating without reaching into plugin config. Subscribers other than the resolve plugin (IAST taint tracking on `apm:graphql:resolve:start`) see one publish per tracked field instead of one per resolver call -- collapse-duplicate and depth-skipped fields stop showing up there. 2. The per-field-ctx no longer retains `GraphQLResolveInfo`. The resolve plugin reads `info.fieldName`, `info.returnType`, `info.fieldNodes[0]`, and `info.variableValues` at start time and nothing else; plucking the four into the ctx drops per-request retention of `schema`, `fragments`, `rootValue`, `operation`, `parentType`, which heap-snapshot analysis attributes 30-60 % of per-request peak retention. Microbench (deep 5x5x4 author/posts/comments query, 5000 iters x 11 trials, drop best+worst, Node 24.15 / V8 13.6): * baseline 444.8 us/op, stddev 9.3 us * patched 321.2 us/op, stddev 25.1 us -123.6 us/op (-27.8 %) at the `execute()` boundary. `finishResolvers` drops the `!fieldCtx.currentStore` skip and `updateField` drops its `shouldInstrument` check; both guarded against the depth- and collapse-skipped paths that the new `assertField` no longer adds to `rootCtx.fields`.
`assertField` walked `info.path` twice per resolver call and joined a fresh segment array into a path string -- the same path string for every sibling of a collapsed list field. Store the path string in a `Map<Path, string>` on the rootCtx so siblings reuse the parent's cached string and only pay `parent + '.' + key`. The map dies with the rootCtx when execute finishes, so no manual cleanup. V8's ConsString keeps the per-call concat cheap, and the depth-count walk now lives behind `if (depth >= 0)` so the default `depth: -1` config skips it entirely. Microbench (deep 5x5x4 author/posts/comments query, 5000 iters x 11 trials, drop best+worst, Node 24.15 / V8 13.6, two alternated baseline/patched pairs): * baseline 285.7 us/op median, stddev 2.8 us * patched 276.4 us/op median, stddev 3.9 us -9.3 us/op (-3.3 %) at the `execute()` boundary, on top of the existing path-identity-keyed resolver state.
…d list Moving resolve gating into `assertField` made collapse-duplicate paths return `undefined`, so the resolve plugin's `start` and `updateField` handlers no longer fired for them. The plugin's `updateField` is what writes `field.finishTime`, so the side effect is that for a collapsed `friends.*.name` resolving a two-element list, the span closes with the *first* sibling's finish time -- the span's reported duration is the first resolution only, not the work the field actually did. Change `rootCtx.collapsedFields` from `Set<pathString>` to `Map<pathString, field>` and have `assertField` return the existing shared field on a dedupe hit. Subsequent siblings still skip `startResolveCh` (the span is already started) and skip `rootCtx.fields.set` (one map entry per pathString), but they now publish on `updateFieldCh`, so the handler advances `finishTime` on each invocation and the span closes with the last sibling's time. Microbench cost on the deep 5x5x4 author/posts/comments query (5000 iters x 11 trials, drop best+worst, Node 24.15 / V8 13.6, three runs): * before fix 276.4 us/op median, stddev 3.9 us * after fix 302.3 us/op median, stddev 3.3 us +25.9 us/op (+9.4 %) at the `execute()` boundary -- the cost of one extra channel publish per sibling that the regression silently saved. Regression test in `index.spec.js` subscribes to `apm:graphql:resolve:updateField` and asserts a two-element collapsed list produces two publishes for `'friends.*.name'`. The test fails on the pre-fix state and passes after.
…callInAsyncScope
`resolveAsync` forwarded to `callInAsyncScope` with `arguments`,
which materialised an Arguments object on every resolver call only
so `callInAsyncScope` could `apply()` through it. Inline the abort
check, try/catch, and `.then` chain into `resolveAsync`, call the
user's resolver with named params (`source`, `args`, `contextValue`,
`info`) and skip the materialisation. The duplicated
publish-finish branches share one `publishResolverFinish` helper so
the body stays scannable.
`callInAsyncScope` is now only called from `wrapExecute`, which
always passes a `cb` and always has an `AbortController` set on the
ctx. The `cb = cb || (() => {})` and
`abortController?.signal.aborted` defensive slack go away.
Resolve-hot-path microbench (10 trials x 2000 iters, Node 24.15 /
darwin arm64):
* trimmedMedianUs 291.373 -> 289.088 (-0.8 %)
* trimmedMeanUs 289.936 -> 288.659 (-0.4 %)
* stddevUs 6.458 -> 2.979 (-54 %)
The median drop is small; the run-to-run stddev halving is the
clearer signal -- dropping the per-call Arguments materialisation
and an extra function frame makes the resolver path consistent
across trials.
Overall package sizeSelf size: 5.86 MB Dependency sizes| name | version | self size | total size | |------|---------|-----------|------------| | import-in-the-middle | 3.0.1 | 82.56 kB | 817.39 kB | | opentracing | 0.14.7 | 194.81 kB | 194.81 kB | | dc-polyfill | 0.1.11 | 25.74 kB | 25.74 kB |🤖 This report was automatically generated by heaviest-objects-in-the-universe |
|
BenchmarksBenchmark execution time: 2026-05-24 17:33:55 Comparing candidate commit 2d50a6f in PR branch Found 23 performance improvements and 0 performance regressions! Performance is the same for 1476 metrics, 94 unstable metrics. scenario:plugin-graphql-long-with-depth-and-collapse-off-24
scenario:plugin-graphql-long-with-depth-and-collapse-on-20
scenario:plugin-graphql-long-with-depth-and-collapse-on-22
scenario:plugin-graphql-long-with-depth-and-collapse-on-24
scenario:plugin-graphql-long-with-depth-on-max-20
scenario:plugin-graphql-long-with-depth-on-max-22
scenario:plugin-graphql-long-with-depth-on-max-24
|
IAST taint-tracking and the AppSec WAF subscribe to apm:graphql:resolve:start and run on every resolver call so taint from the request body flows into resolver arguments. The depth gate in `assertField` short-circuited before the publish, so arguments at depths below `config.depth` silently stopped getting tainted. Move the gate into `GraphQLResolvePlugin#start` so the publish fires for every resolver; `updateField` and `finishResolvers` skip the span-only work when `start` short-circuited (`fieldCtx.currentStore === undefined`). Drive-by fix: * Drop the `field !== undefined` guards in `resolveAsync`. `assertField` no longer returns `undefined`.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b36b93a191
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
`taintObject` mutates the args object in place, so IAST needs a publish on `apm:graphql:resolve:start` per resolver call, not per unique collapsed path. graphql-js builds a fresh args object for every list sibling, and the prior shape returned the shared field before publishing, so siblings 2..N of a `friends.*.name` resolver never had their args tainted and a sink reached through them missed the taint. The publish now fires before `assertField` consults `collapsedFields`, so every call surfaces on the channel. Span dedupe moves to the resolve plugin's `start`, which short-circuits when a previous sibling already owns the collapsed span -- the same shape the depth gate already uses. Refs: #8498 (comment)
AppSec aborts execution by calling `ctx.abortController.abort()` on the published execute / resolve channels to block a malicious request. The abort branches in `wrapResolve` and `callInAsyncScope` had no coverage, so the WAF-blocking contract -- "an aborted ctx throws `AbortError` out of the next resolver / executor entry" -- was not regression-tested. Two specs pin it: 1. Aborting on `apm:graphql:execute:start` makes `callInAsyncScope` throw `AbortError` before the executor runs and skips every resolver. 2. Aborting from `apm:graphql:resolve:updateField` after the first resolver finishes makes the next resolver hit `resolveAsync`'s own signal check and surface the `AbortError` through `result.errors`.
|
A new linting rule just landed in |
Co-authored-by: Ruben Bridgewater <ruben@bridgewater.de>
Co-authored-by: Ruben Bridgewater <ruben@bridgewater.de>
Summary
Second round of
perf(graphql): trim per-resolver allocations(#8309). Three O(depth) string operations still ran per resolver call against the Object-keyedrootCtx.fields—pathToArray+path.join('.')inassertField,lastIndexOf('.')+sliceingetParentField, plus the collapse-dedupe inresolve.js#startwalking the path a third time only to early-return for collapsed duplicates. Five coordinated pieces cut the work:rootCtx.fieldsswitches toMap<Path, TrackedField>keyed on the graphql-js Path linked-list node identity ({ prev, key, typename }is fresh peraddPathand never collides).assertFieldis oneMap.set(path, …);getParentFieldwalkspath.prevand reads each level viaMap.get— O(1) per step, no substring, no array allocation.field.ctx.info = infois dropped fromresolveAsync's completion path. Nothing read it post-start, but the assignment pinned the entireGraphQLResolveInfo(schema, fragments, rootValue, operation, variableValues, fieldNodes, parentType, returnType, path) on every field-ctx for the request's lifetime — heap-snapshot analysis attributed 30–60 % of per-request peak retention to it.assertFieldbuilds thefieldCtxwith the four fields the plugin actually reads —fieldName,returnType,fieldNodes[0],variableValues— so the plugin drops its remaining reach intoinfooncestartreturns.shouldInstrumentfolds its depth-count and length-count path walks into one linked-list traversal, and the plugin'scollapseflag rides onrootCtxso the instrumentation-layer path-string builder picks the same*segments the plugin's dedupe will key on.Map<Path, string>onrootCtx, so siblings of a collapsed list reuse the parent's cached string and only payparent + '.' + key. V8'sConsStringkeeps the concat cheap, and thedepth: -1default skips the depth-count walk entirely.resolveAsyncinlines the abort check,try/catch, and.thenchain that used to forward tocallInAsyncScopethrougharguments. Each resolver call drops one Arguments materialisation and one function frame;callInAsyncScopeis now only called fromwrapExecute(always with acband anAbortController), so itscb || (() => {})andabortController?.signal.aborteddefensive slack go too.Why
Three behaviour-preserving fixes ride on top so the perf rework cannot regress IAST / AppSec:
finishTime(7abeebb). Moving collapse dedupe intoassertFieldinitially made siblings 2..N short-circuit beforeupdateFieldfired, so the span closed with the first sibling's finish time on a list of N.collapsedFieldsis nowMap<pathString, field>and siblings get the shared field, soupdateFieldadvancesfinishTimeon every call and the span tracks the last sibling's completion.b36b93a). The depth gate sits in the resolve plugin'sstart; the IASTapm:graphql:resolve:startpublish lives inassertFieldand fires for every call. Resolver-arguments taint that used to silently stop flowing atdepth >= config.depthis preserved.4ed0165). The publish moves aboveassertField'scollapsedFieldslookup, so siblings 2..N of afriends.*.nameresolver still get their freshly-built args object surfaced on the channel. IAST taint that depends on per-call args mutation is preserved across collapsed lists.A
test(graphql)commit (04575b8) pins the AppSec-abort contract forwrapResolve's andcallInAsyncScope'sAbortErrorbranches — both previously uncovered.Per-step microbench numbers (Node 24.15 / V8 13.6, deep 5x5x4 author/posts/comments query, methodology in each commit body):
a39f889: 753.7 → 447.1 us/op at theexecute()boundary (-40.7 %); stddev 14.2 → 3.2 us as path-array allocations stop driving GC.f3826ce: 444.8 → 321.2 us/op (-27.8 %).313da65: 285.7 → 276.4 us/op median (-3.3 %).7abeebb: 276.4 → 302.3 us/op median (+9.4 %), the cost of restoring per-siblingupdateFieldpublishes.5182e89: 291.4 → 289.1 us/op median (-0.8 %); run-to-run stddev 6.5 → 3.0 us (-54 %).Drive-by
shouldInstrument's non-collapse depth branch with a new "with collapsing disabled and a depth >=1" describe. The existing spec exercised depth filtering only with defaultcollapse: true.Test plan
CI alone is sufficient; the existing graphql plugin specs plus the new ones (
updateFieldChper collapsed sibling,startResolveChper depth-gated call,startResolveChper collapsed sibling, twoAbortErrorcases) cover the new branches end-to-end.Refs: #8309