Summary
On any deployment with TALE_AUDIT_SIGNING_KEY configured, every GDPR erasure deterministically breaks audit-chain verification and raises the critical "hash chain broken at log <id>" tamper alert. The erasure pipeline's pass-2 PII scrub blanks admin-authored rows about the erased subject, but the verifier's scrub-trust window is keyed on the row's actorId — which for pass-2 rows is the admin, not the subject the signed checkpoint attests. The verifier therefore recomputes the hash over the blanked body and reports a mismatch.
No tampering is involved at any step; this is the product's own Art 17 feature tripping its own tamper detection.
All line references at 08ca62581 (main).
Mechanism
scrubSubjectAuditLogs (services/platform/convex/audit_logs/internal_mutations.ts:301-449) runs two passes:
- Pass 1 (:327-367): rows where the subject is the actor (by_organizationId_and_actorId). Blanks actorEmail/actorRole/ipAddress/userAgent/previousState/newState/metadata/actorEmailHash/actorIpHash, sets piiScrubbed: true.
- Pass 2 (:369-403): rows where the subject is the resource (by_org_resourceType_resourceId, resourceType: 'user'). Blanks resourceName/previousState/newState/errorMessage/metadata, sets piiScrubbed: true. The actor on these rows is the admin who performed the action and is deliberately left intact (:369-374).
Both passes leave the original integrityHash in place and rely on a signed pii_scrub checkpoint (:420-441, scrubbedSubjectId = args.userId) to tell the verifier the divergence is intentional.
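The two passes can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the `AuditRow` shape, the `scrubSubject` helper, and the rule that `maxTimestamp` is the newest scrubbed row's timestamp are hypothetical simplifications, not the real Convex mutation, and only a subset of the blanked fields is modelled.

```typescript
// Hypothetical, simplified row shape; field names follow the issue text.
interface AuditRow {
  id: string;
  timestamp: number;
  actorId: string;
  actorEmail?: string;
  resourceType: string;
  resourceId: string;
  resourceName?: string;
  newState?: string;
  integrityHash: string;
  piiScrubbed?: boolean;
}

// Hypothetical stand-in for the signed pii_scrub checkpoint (signature omitted).
interface ScrubCheckpoint {
  scrubbedSubjectId: string;
  maxTimestamp: number;
}

function scrubSubject(rows: AuditRow[], subjectId: string): ScrubCheckpoint {
  let maxTimestamp = 0;
  for (const row of rows) {
    const isPass1 = row.actorId === subjectId; // subject is the actor
    const isPass2 = row.resourceType === "user" && row.resourceId === subjectId; // subject is the resource
    if (!isPass1 && !isPass2) continue;
    if (isPass1) {
      row.actorEmail = undefined;
      row.newState = undefined;
    }
    if (isPass2) {
      // The admin's actorId is deliberately left intact on pass-2 rows.
      row.resourceName = undefined;
      row.newState = undefined;
    }
    row.piiScrubbed = true; // integrityHash is NOT recomputed
    // Assumption: the window covers everything up to the newest scrubbed row.
    maxTimestamp = Math.max(maxTimestamp, row.timestamp);
  }
  return { scrubbedSubjectId: subjectId, maxTimestamp };
}
```

Note that after the call, a pass-2 row still carries the admin's `actorId` while the checkpoint names only the subject, which is the mismatch the rest of this issue turns on.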
The verifier builds its trust windows keyed by scrubbedSubjectId and looks them up by the row's actorId only (services/platform/convex/audit_logs/verify_integrity.ts:332-343, :403-408):
const windows = subjectScrubWindows.get(actorId);
if (windows && windows.some((w) => entry.timestamp <= w.maxTimestamp)) {
isScrubbed = true;
} else if (!hasSigningKey) { ... }
For a pass-2 row, actorId (admin) ≠ scrubbedSubjectId (subject) → no window matches. On a signed deployment the !hasSigningKey legacy branch is unreachable, so isScrubbed stays false, the hash is recomputed over the blanked body (:478-495), mismatches the stored integrityHash, and verifyAuditChain returns valid: false with firstBrokenAt = the scrubbed row.
All five blanked pass-2 fields are hash-covered (buildAuditRecordHashInput, audit_logs/helpers.ts:149-187); only piiScrubbedAt (in EXCLUDED_FIELDS) and piiScrubbed (destructured out by the verifier) are hash-neutral, so the mismatch is guaranteed whenever pass 2 blanked anything.
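To make the lookup miss concrete, here is a toy reproduction for a signed deployment. It is a sketch, not the real verifier: `isScrubbedCurrent` and `hashInput` are hypothetical stand-ins (the latter for buildAuditRecordHashInput), the IDs and timestamps are made up, and the elided legacy !hasSigningKey branch is left out entirely.

```typescript
// Hypothetical, simplified entry shape; only two hash-covered PII fields are modelled.
interface Entry {
  actorId: string;
  resourceType: string;
  resourceId: string;
  timestamp: number;
  resourceName?: string;
  newState?: string;
}

interface ScrubWindow {
  maxTimestamp: number;
}

// Windows are keyed by scrubbedSubjectId, as described in the issue.
const subjectScrubWindows = new Map<string, ScrubWindow[]>([
  ["subject-1", [{ maxTimestamp: 2000 }]],
]);

// Current behaviour: the window is looked up by the row's actorId only.
function isScrubbedCurrent(entry: Entry): boolean {
  const windows = subjectScrubWindows.get(entry.actorId);
  return !!windows && windows.some((w) => entry.timestamp <= w.maxTimestamp);
}

// Stand-in for the hash input: serialises the hash-covered fields in a fixed order.
function hashInput(entry: Entry): string {
  return JSON.stringify([
    entry.actorId,
    entry.resourceType,
    entry.resourceId,
    entry.timestamp,
    entry.resourceName ?? null,
    entry.newState ?? null,
  ]);
}

// Pass-1 style row: the subject is the actor.
const pass1Row: Entry = {
  actorId: "subject-1", resourceType: "document", resourceId: "doc-9", timestamp: 1000,
};
// Pass-2 style row: the admin is the actor, the subject is the resource.
const pass2Original: Entry = {
  actorId: "admin-1", resourceType: "user", resourceId: "subject-1", timestamp: 1500,
  resourceName: "Alice Example", newState: "erasure requested",
};
const pass2Scrubbed: Entry = { ...pass2Original, resourceName: undefined, newState: undefined };

const pass1Trusted = isScrubbedCurrent(pass1Row);      // window found under "subject-1"
const pass2Trusted = isScrubbedCurrent(pass2Scrubbed); // no window under "admin-1"
const hashInputChanged = hashInput(pass2Original) !== hashInput(pass2Scrubbed);
```

In this model the pass-2 row is not trusted as scrubbed, and its hash input differs from the one the stored integrityHash was computed over, so any collision-resistant hash over it would mismatch.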
Trigger — guaranteed, not probabilistic
The erasure flow itself always creates at least one pass-2 row: requestGdprErasure writes a gdpr_erasure_requested audit row with actorId = <admin>, resourceType: 'user', resourceId = <subject>, with resourceName and newState populated (governance/erasure.ts:573-592). The erasure processor then calls scrubSubjectAuditLogs (erasure.ts:2140-2147), whose pass 2 selects that very row and blanks it. The next 02:00 UTC integrity cron reports the break. Other admin-authored rows about the subject (member invites, password resets via users/set_member_password.ts, role changes) widen the blast radius.
The whole scrub (both passes + checkpoint) commits in one mutation, so this is a permanent steady state, not a transient.
Test gap
No test covers the interaction between pii_scrub and verification: integrity_check.test.ts has zero scrub coverage, and append_only.test.ts only exercises sequential appends and out-of-band tampering. That is why this shipped.
Suggested fix
Make the verifier's coverage test mirror the scrub's own selection criteria. A row is covered by a signed scrub window when either
- actorId === scrubbedSubjectId (pass 1), or
- resourceType === 'user' && resourceId === scrubbedSubjectId (pass 2),
in both cases with entry.timestamp <= window.maxTimestamp. The signed checkpoint already binds scrubbedSubjectId into the HMAC (signature v2), so this widens trust only to rows the scrub actually attested, with no new forgery surface beyond what pass 1 already accepts. Add a regression test: erase a user on a signed deployment, then assert verifyAuditChain returns valid: true.
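One possible shape for the widened coverage check is sketched below. The `isCoveredByScrubWindow` name and the `Entry`/`ScrubWindow` types are hypothetical, and the real verifier would also need to have verified the checkpoint signature before a window is admitted to the map; that step is assumed here rather than shown.

```typescript
// Hypothetical, simplified shapes; only the fields the coverage test needs.
interface Entry {
  actorId: string;
  resourceType: string;
  resourceId: string;
  timestamp: number;
}

interface ScrubWindow {
  scrubbedSubjectId: string;
  maxTimestamp: number;
}

// windowsBySubject is keyed by scrubbedSubjectId and is assumed to contain
// only windows whose signed checkpoint has already been verified.
function isCoveredByScrubWindow(
  entry: Entry,
  windowsBySubject: Map<string, ScrubWindow[]>,
): boolean {
  // Pass 1: the scrubbed subject is the actor.
  const candidateSubjects = [entry.actorId];
  // Pass 2: the scrubbed subject is the 'user' resource the row is about.
  if (entry.resourceType === "user") {
    candidateSubjects.push(entry.resourceId);
  }
  return candidateSubjects.some((subjectId) => {
    const windows = windowsBySubject.get(subjectId);
    return !!windows && windows.some((w) => entry.timestamp <= w.maxTimestamp);
  });
}
```

A regression test in this spirit would feed an admin-authored row about the erased subject through the check and expect it to be covered, while an admin-authored row about a different user, or one newer than maxTimestamp, stays uncovered.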
Adjacent gap noticed while reading (can be split out if preferred): scrubSubjectAuditLogs performs no legal-hold check — every other per-table eraser re-checks holds mid-flight (countOrSkip, erasure.ts:980-994), but a custodian hold placed between erasure scheduling and the processor run still gets its audit-log PII blanked.
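If the hold gap is addressed, the guard could look roughly like the sketch below. Everything here is hypothetical: `scrubWithHoldCheck`, the `isUnderLegalHold` callback, and the `ScrubResult` type are illustrative names, and the real fix would presumably reuse whatever hold lookup countOrSkip already relies on.

```typescript
// Hypothetical result type: either the scrub ran, or it was skipped due to a hold.
type ScrubResult =
  | { status: "scrubbed"; rows: number }
  | { status: "skipped"; reason: "legal_hold" };

// Re-checks the hold at the moment the scrub runs, not when erasure was scheduled.
function scrubWithHoldCheck(
  subjectId: string,
  isUnderLegalHold: (id: string) => boolean, // assumed hold lookup
  scrub: (id: string) => number,             // assumed to return the number of rows blanked
): ScrubResult {
  if (isUnderLegalHold(subjectId)) {
    return { status: "skipped", reason: "legal_hold" };
  }
  return { status: "scrubbed", rows: scrub(subjectId) };
}
```

Passing the hold lookup in as a callback keeps the sketch independent of how holds are actually stored.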
How this was found
Investigating a production "Audit log integrity check failed" notification. Related umbrella: #1803. Sibling issues from the same investigation: #1842, #1844, #1845, #1846.