runtime split: Engine/Observer/Puppeteer, schema-v3 restored, P4C liveness proven #54
Merged
…rness
Split the mixed lab runtime into three clear layers with enforced boundaries:
  Engine (main.js) = product sync runtime
  Observer (telemetry.js) = passive telemetry / diagnostics, shipped separately
  Puppeteer (qa/) = mutation harness / QA driver, not shipped
Key changes:
- Add product-owned observability types (src/observability/):
  productEventKinds.ts, recoveryEventTypes.ts, traceContext.ts, traceLogger.ts
  Product code uses PRODUCT_EVENT_KIND string constants instead of importing
  the FLIGHT_KIND enum from Observer internals.
- Remove old src/debug/ and src/diagnostics/ product roots:
  Canonical Observer implementations now live under src/lab/debug/ and
  src/lab/diagnostics/ (moved in earlier work, originals deleted here).
- Introduce passive telemetry runtime (src/telemetry/):
  installTelemetryRuntime.ts: Observer entry point, no mutation commands.
  telemetryRuntimeHost.ts: read-only host interface.
  Emits telemetry.js via esbuild (70 KB, fully clean of mutation symbols).
- Move Puppeteer mutation harness to qa/harness/:
  qaDebugApi.ts, installPuppeteerRuntime.ts, scenarioStateController.ts,
  vfsTortureTest.ts, ports/yaosUnsafeQaPort.ts; all out of src/.
  qa/ is tracked source; qa-runs/ (artifacts) is gitignored.
- Track pre-existing QA harness source:
  qa/analyzers/, qa/controllers/, qa/obsidian-harness/, qa/scripts/,
  qa/fixtures/ (previously hidden by the qa/ gitignore rule).
- Add strict/transitional production bundle guards:
  scripts/guard-production-bundles.mjs with --transitional flag.
  Strict mode fails on known Engine test seams (__qaOnly, Unsafe, ForceSync).
  Transitional mode warns and exits 0 (PARTIAL PASS).
- Update all test imports to new boundary paths.
- Fix lint: remove unnecessary type assertions, use configDir injection,
  wrap unbound methods, use console.debug instead of console.log.
Bundle sizes: main.js ~494 KB, telemetry.js ~71 KB
Regressions: 84/84 passing
telemetry.js forbidden grep: zero mutation-harness symbols
src/ -> qa/ import violations: zero
Known debt (separate phase):
  6 __qaOnly Engine test seams remain in main.js on ReconciliationController
  and EditorBindingManager. Removal requires injected unsafe capability ports.
  Strict guard fails on these; transitional guard warns.
docs/architecture/runtime-estates.md:
- Define Engine / Observer / Puppeteer estates and boundaries
- Document what telemetry.js may/must-not contain
- Document tracked qa/ source vs ignored qa-runs/ artifacts
- Record 6 deferred Engine __qaOnly test seams with table (file/class)
- Note TelemetryRuntimeHost broad-handle debt
- Document all enforcement scripts
.github/workflows/release.yml:
- Add telemetry.js to zip, gh release upload, gh release create
- Every release now ships main.js + telemetry.js (Observer bundle)
.gitignore:
- Add lab.js with explanatory comment (legacy name, must not reappear)
package.json:
- Add guard:no-lab-artifact: fails if lab.js exists in root
- verify:bundles now runs: build + guard:no-lab-artifact + guard:production-bundles:transitional
telemetry.js is a generated release artifact (esbuild output), same as main.js. It was accidentally committed. Removed from tracking and added to .gitignore alongside main.js and lab.js.
Remove all six public __qaOnly*Unsafe methods and supporting production
state from ReconciliationController and EditorBindingManager.
Architecture:
main.ts owns private Engine control state.
ReconciliationController registers a DiskIngestPort via optional dep.
EditorBindingManager reads a BindingPropagationGate supplied at construction.
qa/harness calls plugin.getEngineControlPort() — not product class methods.
TelemetryRuntimeHost and Observer are unaware the control port exists.
src/ does not import qa/.
New files:
src/runtime/engineControlPort.ts — type-only EngineControlPort + DiskIngestPort
ReconciliationController changes:
- Remove __qaExternalEditPolicyOverride field
- Remove __qaOnlyForceSyncFileFromDiskUnsafe method
- Remove __qaOnlyPauseEditorBindingPropagationUnsafe method
- Remove __qaOnlyResumeEditorBindingPropagationUnsafe method
- Remove __qaOnlySetExternalEditPolicyOverrideUnsafe method
+ Add getEffectiveExternalEditPolicy? dep (replaces override field)
+ Add registerDiskIngestPort? dep (fires at construction with private closure)
EditorBindingManager changes:
- Remove qaPaused field from EditorBinding
- Remove __qaOnlyPauseBindingPropagationUnsafe method
- Remove __qaOnlyResumeBindingPropagationUnsafe method
+ Add optional BindingPropagationGate constructor parameter
+ gate.isPaused(path) replaces binding.qaPaused in health/propagation checks
+ gate.registerReconfigureHook supplies CM compartment reconfigure callback
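A minimal sketch of the gate idea described above, assuming hypothetical shapes: only `isPaused(path)` and `registerReconfigureHook` are named in this commit message; the hook signature, `createGate`, and the `EditorBindingManagerSketch` class are illustrative, not the actual implementation.

```typescript
// Hypothetical gate shape; only the two method names come from the commit message.
interface BindingPropagationGate {
  isPaused(path: string): boolean;
  registerReconfigureHook(hook: (path: string) => void): void;
}

class EditorBindingManagerSketch {
  readonly reconfigured: string[] = [];

  constructor(private readonly gate?: BindingPropagationGate) {
    // Hand the gate a callback it can invoke when a path is paused or resumed.
    gate?.registerReconfigureHook((path) => this.reconfigured.push(path));
  }

  // With no gate supplied, propagation is never paused.
  shouldPropagate(path: string): boolean {
    return !(this.gate?.isPaused(path) ?? false);
  }
}

// The pause set lives with the gate's owner; the manager stores no pause state.
function createGate(): BindingPropagationGate & { pause(p: string): void; resume(p: string): void } {
  const paused = new Set<string>();
  let hook: ((path: string) => void) | undefined;
  return {
    isPaused: (path) => paused.has(path),
    registerReconfigureHook: (h) => { hook = h; },
    pause: (p) => { paused.add(p); hook?.(p); },
    resume: (p) => { paused.delete(p); hook?.(p); },
  };
}

const gate = createGate();
const manager = new EditorBindingManagerSketch(gate);
gate.pause("notes/a.md");
const pausedResult = manager.shouldPropagate("notes/a.md");
gate.resume("notes/a.md");
const resumedResult = manager.shouldPropagate("notes/a.md");
const ungated = new EditorBindingManagerSketch().shouldPropagate("notes/a.md");
```

The point of the shape is that the manager only reads pause state; whoever constructs the gate decides who may write it.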
main.ts changes:
+ Private Engine control state: diskIngestPort, externalEditPolicyOverride,
pausedEditorPropagationPaths, bindingReconfigureHook
+ engineControlPort assembled from private closures
+ getEngineControlPort() public method for Puppeteer harness duck-typing
+ ReconciliationController and EditorBindingManager wired with control deps
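The wiring above can be sketched roughly as follows. This is an illustration under assumptions: the method signatures are hypothetical, `ingestDiskFileNow` is synchronous here for brevity, and the `*Sketch` classes stand in for the real plugin and controller.

```typescript
// Hypothetical signatures; the commit names the ports and methods but not their types.
interface DiskIngestPort {
  ingestDiskFileNow(path: string): void;
}
interface EngineControlPort {
  ingestDiskFileNow(path: string): void;
  pauseEditorPropagation(path: string): void;
  resumeEditorPropagation(path: string): void;
}

class ReconciliationControllerSketch {
  readonly ingested: string[] = [];

  constructor(deps: { registerDiskIngestPort?: (port: DiskIngestPort) => void }) {
    // Fires once at construction; the closure is the only route to the ingest path.
    deps.registerDiskIngestPort?.({
      ingestDiskFileNow: (path) => { this.ingested.push(path); },
    });
  }
}

class PluginSketch {
  // Private Engine control state, owned by the plugin rather than the controller.
  private diskIngestPort: DiskIngestPort | null = null;
  private readonly pausedEditorPropagationPaths = new Set<string>();

  readonly controller = new ReconciliationControllerSketch({
    registerDiskIngestPort: (port) => { this.diskIngestPort = port; },
  });

  isPropagationPaused(path: string): boolean {
    return this.pausedEditorPropagationPaths.has(path);
  }

  // Assembled from closures over private state; no product class exposes these methods.
  getEngineControlPort(): EngineControlPort {
    return {
      ingestDiskFileNow: (path) => {
        if (!this.diskIngestPort) throw new Error("DiskIngestPort not registered");
        this.diskIngestPort.ingestDiskFileNow(path);
      },
      pauseEditorPropagation: (path) => { this.pausedEditorPropagationPaths.add(path); },
      resumeEditorPropagation: (path) => { this.pausedEditorPropagationPaths.delete(path); },
    };
  }
}

const plugin = new PluginSketch();
const port = plugin.getEngineControlPort();
port.ingestDiskFileNow("notes/a.md");
port.pauseEditorPropagation("notes/a.md");
```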
qa/harness changes:
qaDebugApi.ts: calls plugin.getEngineControlPort() instead of __qaOnly methods
installPuppeteerRuntime.ts: passes getEngineControlPort to buildQaDebugApi
yaosUnsafeQaPort.ts: method names updated (ingestDiskFileNow, pause/resume,
setExternalEditPolicyOverride)
scenarios s06a, s10g: renamed method calls
two-device.ts: evalRaw strings updated
tests: fixtures now capture DiskIngestPort via registerDiskIngestPort and
expose ingestDiskFileNow() helper; __qaOnlyForceSyncFileFromDiskUnsafe
calls replaced across controller-recovery-orchestration*.ts and
frontmatter-guard-orchestration.ts
Verification:
npm run build PASS
npm run test:regressions 84/84
npm run lint:changed PASS
npm run guard:qa-isolation PASS
guard:production-bundles:strict PASS
grep __qaOnly|Unsafe|ForceSync main.js = 0
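For illustration, the grep check can be expressed as a pure function over bundle text. The three symbols come from the verification line above; `scanBundle` and its result shape are hypothetical and are not the contents of scripts/guard-production-bundles.mjs.

```typescript
// Vocabulary taken from the grep above; everything else is illustrative.
const MAIN_FORBIDDEN: readonly string[] = ["__qaOnly", "Unsafe", "ForceSync"];

interface GuardResult {
  ok: boolean;
  hits: { symbol: string; count: number }[];
}

// Count plain substring occurrences of each forbidden symbol in the bundle text.
function scanBundle(bundleText: string, forbidden: readonly string[]): GuardResult {
  const hits = forbidden
    .map((symbol) => ({ symbol, count: bundleText.split(symbol).length - 1 }))
    .filter((hit) => hit.count > 0);
  return { ok: hits.length === 0, hits };
}

const clean = scanBundle("class Engine { sync() {} }", MAIN_FORBIDDEN);
const dirty = scanBundle("x.__qaOnlyForceSyncFileFromDiskUnsafe()", MAIN_FORBIDDEN);
const renamed = scanBundle("plugin.getEngineControlPort()", MAIN_FORBIDDEN);
```

A name-based scan only catches the vocabulary it is given: `renamed.ok` is true here even though the string names a control capability.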
…ARNESS_ENABLED__ esbuild define
Problem:
main.js still contained getEngineControlPort, ingestDiskFileNow,
pauseEditorPropagation, resumeEditorPropagation, setExternalEditPolicyOverride
after P2. The guard only checked the old __qaOnly/Unsafe/ForceSync vocabulary,
so the renamed capability passed the old guard undetected.
Fix:
- esbuild.config.mjs: add define __YAOS_QA_HARNESS_ENABLED__=false for
  mainContext (production). Dead-code elimination removes all gated blocks.
  New qa-product build mode (product-main.js) sets it to true.
- src/main.ts: replace four individual QA fields + engineControlPort eager
  literal with a single _qaState field (null in production). All QA logic
  lives inside if (__YAOS_QA_HARNESS_ENABLED__) blocks. getEngineControlPort
  is dynamically attached as an instance property inside the gate, so it never
  appears on the class prototype in production bundles.
- src/lab/labRuntimeHost.ts: remove getEngineControlPort(). LabRuntimeHost is
  visible to Observer/telemetry and must not know the control port exists.
- qa/harness/installPuppeteerRuntime.ts: introduce local PuppeteerRuntimeHost
  extending LabRuntimeHost with getEngineControlPort(). Type boundary is now
  qa/ only.
- scripts/guard-production-bundles.mjs: add getEngineControlPort,
  pauseEditorPropagation, resumeEditorPropagation,
  setExternalEditPolicyOverride to MAIN_FORBIDDEN. Update docstring: P2
  deferred seams are done; P3 is done.
- qa/scripts/prepare-vault.ts: QA vault setup now copies product-main.js (the
  QA-enabled build) instead of the production main.js.
- package.json: add build:qa-product script.
Verification:
npm run build PASS
npm run guard:production-bundles:strict PASS
grep capability names main.js 0 hits (4 names gone)
npm run build:qa-product PASS (product-main.js has capability names)
npm run test:regressions 84/84
npm run lint:changed PASS
npm run guard:qa-isolation PASS
npm run verify:bundles PASS
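A rough sketch of the gating pattern, under stated assumptions: the flag is declared as an ordinary constant here so the snippet runs standalone (in a real build the identifier would be substituted by the bundler's define option), and `PluginSketch`, `QaState`, and the port shape are hypothetical.

```typescript
// Stand-in for the build-time define; assumed to be replaced by the bundler in real builds.
const __YAOS_QA_HARNESS_ENABLED__: boolean = false;

interface EngineControlPort { ingestDiskFileNow(path: string): void; }
interface QaState { ingested: string[]; }

class PluginSketch {
  // Single QA field; stays null whenever the gate is off.
  private _qaState: QaState | null = null;

  constructor() {
    if (__YAOS_QA_HARNESS_ENABLED__) {
      const state: QaState = { ingested: [] };
      this._qaState = state;
      // Attached to the instance inside the gate, so the method name is not
      // defined on the class prototype.
      (this as unknown as { getEngineControlPort: () => EngineControlPort }).getEngineControlPort =
        () => ({ ingestDiskFileNow: (path) => { state.ingested.push(path); } });
    }
  }

  hasQaState(): boolean {
    return this._qaState !== null;
  }
}

const plugin = new PluginSketch();
const onPrototype = "getEngineControlPort" in PluginSketch.prototype;
const onInstance = "getEngineControlPort" in plugin;
```

When the flag is a compile-time `false`, a bundler that performs dead-code elimination can drop the whole gated block, which is what lets a string-level guard find zero hits.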
… Observer to src/telemetry
Dead code removal:
- Delete qa/harness/installPuppeteerRuntime.ts — exported installLabRuntime
but had zero callers anywhere in the codebase. Audit confirmed no external
injector, no call site in src/, qa/, tests/, or scripts/.
- Delete src/lab/labRuntimeHost.ts — its only consumer was
installPuppeteerRuntime.ts. LabRuntimeHost was the Puppeteer host interface
and has no place in src/.
- Delete src/lab/debug/ports/index.ts and yaosDebugPort.ts — redundant shims
pointing at src/telemetry/debug/ports/; zero importers.
Dead telemetry mount API removal:
- Remove onTelemetryApiMounted / onTelemetryApiUnmounted from TelemetryRuntimeHost.
installTelemetryRuntime never called onTelemetryApiMounted (the mount was
asymmetric — unmount was called, mount was not). Telemetry is passive and
must not mount window.__YAOS_DEBUG__ — that is a Puppeteer/QA control surface,
not a telemetry surface. The defensive delete of __YAOS_DEBUG__ in onunload
remains sufficient.
- Remove the corresponding dead callbacks from the host object literal in main.ts.
- Remove the host.onTelemetryApiUnmounted() call from installTelemetryRuntime
dispose() — nothing to unmount since nothing was ever mounted.
src/lab → src/telemetry rename:
- Move all passive Observer files out of the lying src/lab/ directory:
src/lab/debug/{flightEvents,flightRecorder,flightTraceController,
flightTraceSink,pathIdentity,trace}.ts
src/lab/diagnostics/{deviceWitnessTracker,diagnosticsBundle,
diagnosticsService,pathRedactor,witnessStateHash}.ts
→ src/telemetry/debug/ and src/telemetry/diagnostics/ respectively.
- Update all import references in src/, qa/, tests/ (53 files total).
- src/lab/ is now completely gone.
Guard updates:
- Remove installLabRuntime pattern from guard-qa-isolation.mjs TELEMETRY_FORBIDDEN
(the file is deleted).
Verification:
npm run build PASS
npm run build:qa-product PASS
npm run guard:production-bundles:strict PASS
npm run guard:qa-isolation PASS
npm run verify:bundles PASS
npm run test:regressions 84/84
npm run lint:changed PASS
rg 'lab/' src/ qa/ tests/ scripts/ → 0 (code files)
rg 'LabRuntimeHost|installLabRuntime|onTelemetryApiMounted|onTelemetryApiUnmounted' → 0
…cker/trace accessors
Problem (found by P4B audit):
window.__YAOS_DEBUG__ was never mounted by any in-repo code path.
- onTelemetryApiMounted was removed (P4A) — it was already dead before P4A
since installTelemetryRuntime never called it
- installPuppeteerRuntime.ts was deleted (P4A) — it was a dead export
Every QA scenario timed out at waitForQaReady() because __YAOS_DEBUG__ was
undefined. The break predated P4A; P4A just removed the dead code hiding it.
Fix — harness Obsidian plugin is the mount point:
qa/obsidian-harness/main.ts now calls mountYaosDebugApi() on load.
The method:
1. Accesses app.plugins.plugins['yaos'] (the product plugin)
2. Accesses plugin.lab (TelemetryRuntimeHandle, private via as-any cast)
3. Assembles a PluginHandle object delegating to product + telemetry handle
4. Calls buildQaDebugApi(pluginHandle) from qa/harness/qaDebugApi.ts
5. Assigns result to window.__YAOS_DEBUG__
Cleanup: onunload deletes __YAOS_DEBUG__ in addition to __YAOS_QA__.
This keeps the product plugin as a passive black box — it never mounts its
own crash-test remote. The harness is responsible for the mount.
No src/ → qa/ imports introduced.
Telemetry handle accessors:
Added optional getDeviceWitnessTracker?() and getFlightTraceController?() to
TelemetryRuntimeHandle (src/telemetry/installTelemetryRuntime.ts).
Both are read-only accessors to existing internal Observer objects.
Optional (?), so no existing callers are affected.
Needed by the harness PluginHandle assembly for witness primitives and
phase-event recording in scenario runs (S11+).
ScenarioStateController:
Harness plugin creates its own ScenarioStateController instance, passed as
getScenarioController() on the PluginHandle. Scenario step tracking works.
Harness rebuild: qa/obsidian-harness/main.js rebuilt to include mount code.
Verification:
npm run build PASS
npm run build:qa-product PASS
node qa/obsidian-harness/esbuild.mjs production PASS
npm run guard:production-bundles:strict PASS
npm run guard:qa-isolation PASS
npm run test:regressions 84/84
npm run lint:changed PASS
grep __YAOS_DEBUG__ qa/obsidian-harness/main.js → 8 hits (mount + cleanup)
grep __YAOS_DEBUG__ main.js → 1 hit (defensive delete in onunload only)
…smoke test
Fixes to P4B found by review:
Issue 1 — Mount timing / partial-mount risk:
Added four explicit guards to mountYaosDebugApi() that fail loudly before
any PluginHandle is assembled:
Guard 1: product plugin 'yaos' must be loaded
Guard 2: getEngineControlPort must exist (confirms QA product build, not
production main.js which dead-code-eliminates the method)
Guard 3: product.lab must exist (confirms qaDebugMode:true and that
installTelemetryRuntime has run)
Guard 4: lab.getDeviceWitnessTracker must be a function (confirms the
P4B telemetry.js build, not a stale pre-P4B bundle)
If any guard fails, no __YAOS_DEBUG__ is mounted and a loud Notice + console
error names exactly what is wrong. waitForQaReady() will time out cleanly
rather than letting a half-working API surface mid-scenario failures.
Plugin load order is guaranteed by prepare-vault.ts: 'do-sync' is written
before 'yaos-qa-harness' in community-plugins.json; Obsidian awaits each
plugin's onload() sequentially.
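The four guards can be sketched as a single precondition check that runs before anything is mounted. The checked conditions follow the list above; the type names, reason strings, and `mountIfReady` helper are hypothetical.

```typescript
// Hypothetical shapes; only the four checked conditions come from the commit message.
interface ProductPluginLike {
  getEngineControlPort?: unknown;
  lab?: { getDeviceWitnessTracker?: unknown };
}

type GuardOutcome = { ok: true } | { ok: false; reason: string };

function checkMountPreconditions(product: ProductPluginLike | undefined): GuardOutcome {
  // Guard 1: the product plugin must be loaded.
  if (!product) return { ok: false, reason: "product plugin 'yaos' is not loaded" };
  // Guard 2: the control port only exists in the QA product build.
  if (typeof product.getEngineControlPort !== "function") {
    return { ok: false, reason: "getEngineControlPort missing: not a QA product build" };
  }
  // Guard 3: the telemetry runtime must have been installed.
  if (!product.lab) return { ok: false, reason: "product.lab missing: telemetry runtime not installed" };
  // Guard 4: the accessor must exist, otherwise the telemetry bundle is stale.
  if (typeof product.lab.getDeviceWitnessTracker !== "function") {
    return { ok: false, reason: "getDeviceWitnessTracker missing: stale telemetry.js bundle" };
  }
  return { ok: true };
}

function mountIfReady(target: Record<string, unknown>, product: ProductPluginLike | undefined): GuardOutcome {
  const outcome = checkMountPreconditions(product);
  // Mount only after every guard passes, so a partial API is never exposed.
  if (outcome.ok) target["__YAOS_DEBUG__"] = { ready: true };
  return outcome;
}

const fakeWindow: Record<string, unknown> = {};
const notLoaded = mountIfReady(fakeWindow, undefined);
```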
Issue 2 — connectProvider semantics (CONFIRMED CORRECT, no change):
setQaNetworkHold('offline') disconnects + blocks reconnects.
setQaNetworkHold('online') releases hold + triggers reconnect.
Verified in src/runtime/connectionController.ts lines 62-82.
Issue 3 — exportFlightTrace returning null (real bug, fixed):
buildQaDebugApi throws 'Flight trace export failed' if plugin.exportFlightTrace
returns null. Previous implementation always returned null.
Fix: call FlightTraceController.exportTrace() directly via the
getFlightTraceController() accessor added in P4B, using
lab.diagnosticsService.ensureDiagnosticsDir() for the diagDir argument.
Returns result.path if ok, null otherwise. Correctly surfaces 'no active
trace' errors through the throw in buildQaDebugApi rather than swallowing
them silently.
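A simplified sketch of the null-versus-throw contract described in Issue 3. The result shape and the synchronous signatures are assumptions for illustration; the real `exportTrace()` signature is not shown in this PR.

```typescript
// Hypothetical result shape: the commit only says "returns result.path if ok, null otherwise".
type ExportResult = { ok: true; path: string } | { ok: false; error: string };

interface FlightTraceControllerLike {
  exportTrace(diagDir: string): ExportResult;
}

// Harness-side export: surfaces the path on success, null on any failure.
function exportFlightTrace(controller: FlightTraceControllerLike | undefined, diagDir: string): string | null {
  if (!controller) return null;
  const result = controller.exportTrace(diagDir);
  return result.ok ? result.path : null;
}

// Debug-API side: a null export becomes a thrown error instead of being swallowed.
function exportOrThrow(controller: FlightTraceControllerLike | undefined, diagDir: string): string {
  const path = exportFlightTrace(controller, diagDir);
  if (path === null) throw new Error("Flight trace export failed");
  return path;
}

const working: FlightTraceControllerLike = {
  exportTrace: (dir) => ({ ok: true, path: `${dir}/trace.ndjson` }),
};
const idle: FlightTraceControllerLike = {
  exportTrace: () => ({ ok: false, error: "no active trace" }),
};
const exportedPath = exportOrThrow(working, "diag");
```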
Smoke test (15/15 PASS):
Verified with Node/jiti smoke covering all four guards and the happy path:
Guard 1-4: each guard fires correctly, no mount occurs
Happy path: buildQaDebugApi returns an object with all required methods
isLocalReady() returns false for null vaultSync (correct)
ingestDiskFileNow() delegates to getEngineControlPort() (correct)
…ectron renderer
Problem discovered during P4C live Obsidian smoke test:
import() fails:
Obsidian's renderer resolves dynamic import() via the app://obsidian.md
scheme. Absolute filesystem paths outside the app bundle produce:
TypeError: Failed to fetch dynamically imported module:
app://obsidian.md/home/kavin/.../telemetry.js
require() fails:
require() loads the file but telemetry.js runs in Node's standard CJS
module context where require('obsidian') is not available (Obsidian only
patches require in the plugin's own module, not sub-modules). Produces:
Error: Cannot find module 'obsidian'
Root cause:
__dirname in Obsidian's Electron renderer is the ASAR renderer directory
(/usr/lib/electronNN/resources/electron.asar/renderer), not the plugin
directory. Both the old path and the corrected basePath path failed for
different reasons depending on the load mechanism.
Fix:
Read telemetry.js from disk with fs.readFileSync, then evaluate it with
new Function(), passing the current require (Obsidian's patched require
that provides 'obsidian', 'electron', etc.) as an argument. The evaluated
code runs with the correct module resolution context.
Plugin directory is resolved via vault adapter basePath + manifest.dir,
which is the correct Obsidian API for this purpose.
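A minimal sketch of the load technique, assuming the bundle is CommonJS-style. `evaluateCommonJs` and the stub require are hypothetical; in the harness the source string would come from fs.readFileSync and the require argument would be the plugin's own patched require.

```typescript
type RequireFn = (id: string) => unknown;
interface CjsModule { exports: Record<string, unknown>; }

// Evaluate CommonJS source with a caller-supplied require, so module resolution
// follows the caller's context rather than the location of the file on disk.
function evaluateCommonJs(source: string, requireFn: RequireFn): Record<string, unknown> {
  const mod: CjsModule = { exports: {} };
  // Parameter names mirror the CommonJS wrapper so `require(...)` inside the
  // source binds to requireFn.
  const run = new Function("require", "module", "exports", source) as
    (req: RequireFn, m: CjsModule, exp: Record<string, unknown>) => void;
  run(requireFn, mod, mod.exports);
  return mod.exports;
}

// Stub standing in for the host-patched require.
const fakeRequire: RequireFn = (id) => {
  if (id === "obsidian") return { Notice: "stub" };
  throw new Error(`Cannot find module '${id}'`);
};

const bundleSource = `const o = require("obsidian"); module.exports.hasNotice = "Notice" in o;`;
const loaded = evaluateCommonJs(bundleSource, fakeRequire);
```

The trade-off is that the evaluated code loses its own file-relative resolution, which is acceptable for a single self-contained bundle.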
P4C live smoke results (19/20):
PASS window.__YAOS_DEBUG__ is an object
PASS window.__YAOS_QA__ is an object
PASS waitForQaReady condition passes
PASS all required __YAOS_DEBUG__ methods present (14 checks)
PASS isLocalReady() returns boolean
PASS getConnectionState() returns string
PASS ingestDiskFileNow accessible (engine control port confirmed)
PASS getDiskHash on absent path returns null
PASS scenario registry has 38 scenarios
FAIL waitForIdle(5000) timed out — expected, vault not connected to
a live server in this session; not a code defect
Verification:
npm run build PASS
npm run build:qa-product PASS
npm run guard:production-bundles:strict PASS
npm run test:regressions 84/84
npm run lint:changed PASS
Live Obsidian smoke 19/20 (1 environmental non-issue)
…ctor
Root cause:
The P1 refactor (3776255 'refactor(runtime): split Engine, Observer telemetry,
and Puppeteer harness') accidentally reverted the schema v3 sync implementation
that was introduced in 86b1de3 ('feat(sync): schema v3 nested Y.Map metadata').
The deployed server at kavin-yaos.ripplor.workers.dev has
SERVER_MIN_SCHEMA_VERSION=3. The plugin was reverted to SCHEMA_VERSION=2,
causing the compatibility guard to block all sync with:
'This server requires schema version 3 or newer.'
Status bar showed 'CRDT: Error'.
Restored files:
src/sync/schema.ts (new): SCHEMA_VERSION = 3 constant, Obsidian-free
src/sync/fileMeta.ts (restored): unified v2/v3 dual-shape metadata helpers:
decodeFileMeta, getMetaPath, ensureNestedMetaEntry,
createNestedActiveMeta/DeletedMeta, buildMetaSnapshot,
computeMetaSemanticChanges, observeMetaChanges API
src/sync/vaultSync.ts (restored): imports SCHEMA_VERSION from schema.ts (v3),
adds observeMetaChanges() subscription API, _metaDeepObserver with
incremental diff, MetaSemanticChange dispatch, markSchemaV3()
src/sync/diskMirror.ts (restored): uses observeMetaChanges subscription
instead of shallow meta.observe; handles v3 nested Y.Map mutations for disk
ops; consumeRemoteRename for analyzer remoteOrigin exemption
Updated for P1/P2/P3/P4 path changes:
src/sync/vaultSync.ts:
../debug/trace → ../observability/traceContext
../debug/flightEvents → ../telemetry/debug/flightEvents
src/sync/diskMirror.ts:
../debug/trace → ../observability/traceContext
Added:
src/main.ts: call markSchemaV3(deviceName) after IDB loads, before auth check
src/telemetry/installTelemetryRuntime.ts: update _startDeviceWitnessTracker to
use vaultSync.observeMetaChanges() instead of direct meta.observe; correctly
handles both v2 flat entries and v3 nested Y.Map mutations;
_witnessMetaHandler is now an unsubscribe function (not a Yjs observer)
tests/disk-mirror-observer.ts: add observeMetaChanges stub to fake VaultSync
Verified:
npm run build PASS
npm run build:qa-product PASS
npm run guard:production-bundles:strict PASS
npm run guard:qa-isolation PASS
npm run test:regressions 84/84
npm run lint:changed PASS
Live Obsidian: status = 'disconnected' (not 'error'), compatibility guard no
longer fires, SCHEMA_VERSION=3 matches server requirement
…, add schema guard
Four follow-up items from schema-v3 restoration review:
1. server/src/version.ts — restore SERVER_MIN/MAX_SCHEMA_VERSION = 3
The P1 refactor also reverted these to 2. The deployed server requires 3.
Plugin is now v3. Source and deployment now agree.
Verified live: caps={min:3,max:3}, pluginCompatibilityWarning=null,
connectionState=online, statusSummary.state=connected.
2. installTelemetryRuntime.ts — fix tombstone witness handling
The observeMetaChanges handler was skipping 'deleted' changes entirely.
The v3 main.ts had explicit tombstone→markDirty('tombstone') handling.
Fixed: deleted changes now call markDirty(path, 'tombstone') so the
witness tracker knows to check isCrdtTombstoned() for those paths.
Non-deleted changes retain 'remote-apply' origin (both local and remote).
Also documented the per-Y.Text text observer gap (pre-P1 omission).
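The tombstone routing in item 2 can be sketched as below. The `kind` values other than 'deleted', the change shape, and `routeMetaChanges` are assumptions; only the 'tombstone' and 'remote-apply' origins and the markDirty(path, origin) call shape come from the text above.

```typescript
type DirtyOrigin = "tombstone" | "remote-apply";

// Hypothetical change shape; the commit only distinguishes 'deleted' from everything else.
interface MetaSemanticChangeLike {
  path: string;
  kind: "created" | "updated" | "deleted";
}

// Route every change to the witness tracker; deletions are no longer skipped.
function routeMetaChanges(
  changes: readonly MetaSemanticChangeLike[],
  markDirty: (path: string, origin: DirtyOrigin) => void,
): void {
  for (const change of changes) {
    markDirty(change.path, change.kind === "deleted" ? "tombstone" : "remote-apply");
  }
}

const marked: Array<[string, DirtyOrigin]> = [];
routeMetaChanges(
  [
    { path: "a.md", kind: "updated" },
    { path: "b.md", kind: "deleted" },
  ],
  (path, origin) => { marked.push([path, origin]); },
);
```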
3. scripts/guard-schema-version.mjs (new) + package.json
Prevents the P1 regression from recurring. Checks:
- src/sync/schema.ts exists (not deleted by a future refactor)
- vaultSync.ts imports SCHEMA_VERSION from "./schema" (not re-inlined)
- No literal 'export const SCHEMA_VERSION = N' in vaultSync.ts
- SCHEMA_VERSION = expected (currently 3)
- server/src/version.ts min/max = expected
Wired into test:regressions and npm run guard:schema-version.
Simulated the P1 regression (inlined SCHEMA_VERSION = 2): guard FAILS.
Clean state: guard PASSES.
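For illustration, the checks listed in item 3 can be written as a pure function over file contents. This is a sketch, not the contents of scripts/guard-schema-version.mjs: the regular expressions, `GuardInput`, and the failure messages are assumptions.

```typescript
interface GuardInput {
  schemaTs: string | null; // contents of src/sync/schema.ts, or null if the file is missing
  vaultSyncTs: string;     // contents of src/sync/vaultSync.ts
  serverVersionTs: string; // contents of server/src/version.ts
}

function checkSchemaGuard(input: GuardInput, expected: number): string[] {
  const failures: string[] = [];

  if (input.schemaTs === null) {
    failures.push("src/sync/schema.ts is missing");
  } else {
    const m = /SCHEMA_VERSION\s*=\s*(\d+)/.exec(input.schemaTs);
    if (!m || Number(m[1]) !== expected) failures.push(`SCHEMA_VERSION is not ${expected}`);
  }

  // vaultSync.ts must import the constant rather than define its own copy.
  const importsFromSchema =
    /import\s*\{[^}]*\bSCHEMA_VERSION\b[^}]*\}\s*from\s*["']\.\/schema["']/.test(input.vaultSyncTs);
  if (!importsFromSchema) failures.push("vaultSync.ts does not import SCHEMA_VERSION from ./schema");
  if (/export\s+const\s+SCHEMA_VERSION\s*=\s*\d+/.test(input.vaultSyncTs)) {
    failures.push("vaultSync.ts re-inlines SCHEMA_VERSION");
  }

  for (const name of ["SERVER_MIN_SCHEMA_VERSION", "SERVER_MAX_SCHEMA_VERSION"]) {
    const m = new RegExp(`${name}\\s*=\\s*(\\d+)`).exec(input.serverVersionTs);
    if (!m || Number(m[1]) !== expected) failures.push(`${name} is not ${expected}`);
  }
  return failures;
}

const cleanInput: GuardInput = {
  schemaTs: "export const SCHEMA_VERSION = 3;",
  vaultSyncTs: 'import { SCHEMA_VERSION } from "./schema";',
  serverVersionTs: "export const SERVER_MIN_SCHEMA_VERSION = 3;\nexport const SERVER_MAX_SCHEMA_VERSION = 3;",
};
const regressedInput: GuardInput = { ...cleanInput, vaultSyncTs: "export const SCHEMA_VERSION = 2;" };

const cleanFailures = checkSchemaGuard(cleanInput, 3);
const regressedFailures = checkSchemaGuard(regressedInput, 3);
```

The regressed input mirrors the simulated P1 regression: the inlined constant fails both the import check and the re-inline check.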
4. Check 5 — Live QA scenario (s01-single-device-basic-edit)
Ran end-to-end against the live connected vault (connectionState=online).
analyzerPassed=true, 0 hard failures in flight trace, tracePath written
(proves exportFlightTrace P4B fix works end-to-end).
Assertion failure: diskEqualsCrdt hash mismatch — environmental stale CRDT
state from prior test runs on this personal vault (server has old file
content that overrides local create). Not a product correctness failure.
Verification:
npm run build PASS
npm run build:qa-product PASS
npm run guard:production-bundles:strict PASS
npm run guard:qa-isolation PASS
npm run guard:schema-version PASS
npm run test:regressions 84/84 (incl. guard:schema-version)
Live: state=connected, warn=null, caps={min:3,max:3}
Live: s01 scenario ran, trace written, analyzer 0 hard failures
## What this does
### s00 smoke scenario + run-smoke-ready.mjs controller (P4C closure)
Adds s00-smoke-trace-export.ts: a stateless required harness liveness gate
that proves window.__YAOS_DEBUG__, window.__YAOS_QA__, QA product build, and
harness plugin are all mounted in live Obsidian.
Adds run-smoke-ready.mjs: CDP controller for P4C. Checks all five pre-conditions,
runs s00 via qa.run(), asserts result.tracePath non-null, and verifies the trace
file exists on disk when QA_VAULT_PATH is set.
Wires qa:smoke-ready script (build + build:qa-product + build:harness + controller).
### s01 unique per-run paths
Replaces the static QA-scratch/s01-basic-edit.md path with a unique per-run
path (timestamp + 5-char random suffix). This eliminates static-path stale CRDT
contamination and ensures any remaining failures are real product bugs, not
test-environment pollution.
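A minimal sketch of a per-run path generator. The function name, file-name layout, and suffix alphabet are assumptions; only "timestamp + 5-char random suffix" and the QA-scratch/ directory come from the description above.

```typescript
// Clock and RNG are injectable so the output can be pinned in a test.
function uniqueScratchPath(now: number = Date.now(), random: () => number = Math.random): string {
  const alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
  let suffix = "";
  for (let i = 0; i < 5; i++) {
    suffix += alphabet.charAt(Math.floor(random() * alphabet.length));
  }
  return `QA-scratch/s01-basic-edit-${now}-${suffix}.md`;
}

const pinned = uniqueScratchPath(1700000000000, () => 0);
const live = uniqueScratchPath();
```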
### waitForDiskCrdtConverge ordering
Restructures the s01 wait sequence to: waitForReceiptAfter (seeds the CRDT)
then waitForDiskCrdtConverge (ensures content stability). Without the receipt
wait first, waitForDiskCrdtConverge polls against a null CRDT and times out.
### Atomic typeIntoFile (was: character-by-character)
Replaces replaceRange(ch, getCursor()) loop with atomic setValue(current + text).
BEFORE: character-by-character insertion via getCursor(). In headless CDP runs
with no OS focus, getCursor() always returns {line:0,ch:0}, inserting each
character at position 0 and reversing the entire typed string.
AFTER: single atomic CodeMirror document replacement. Tests editor->CRDT->disk
propagation via one y-codemirror reconciliation pass. Does NOT test per-keystroke
behavior, debounced writes, or incremental sync. Comment in editor-ops.ts states
this explicitly.
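The reversal can be reproduced with a fake editor whose cursor is pinned at the start, which models the headless behavior described above. `FakeEditor` is a single-line stand-in written for this sketch, not the Obsidian editor.

```typescript
interface Pos { line: number; ch: number; }

class FakeEditor {
  private doc = "";
  getValue(): string { return this.doc; }
  setValue(value: string): void { this.doc = value; }
  // Models the headless case: the cursor is always reported at the document start.
  getCursor(): Pos { return { line: 0, ch: 0 }; }
  // Single-line insert at the given position.
  replaceRange(text: string, at: Pos): void {
    this.doc = this.doc.slice(0, at.ch) + text + this.doc.slice(at.ch);
  }
}

// BEFORE: one insert per character at whatever the cursor reports.
function typeCharByChar(editor: FakeEditor, text: string): void {
  for (let i = 0; i < text.length; i++) {
    editor.replaceRange(text.charAt(i), editor.getCursor());
  }
}

// AFTER: one atomic document replacement.
function typeAtomic(editor: FakeEditor, text: string): void {
  editor.setValue(editor.getValue() + text);
}

const before = new FakeEditor();
typeCharByChar(before, "abc");
const after = new FakeEditor();
typeAtomic(after, "abc");
```

With the cursor stuck at position 0, each character is inserted in front of the previous one, so `before` ends up holding the reversed string while `after` holds the text as typed.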
### schema-version-guard.md
Step-by-step procedure for updating the schema guard when bumping to v4.
## Current status
P4C liveness: PROVEN live via s00 with QA_VAULT_PATH
s01 functional scenario: STILL FAILING
s01 failure is a product/harness RCA issue, not an architecture blocker:
(1) waitForReceiptAfter takes ~18.9s despite receipt confirmed at t+0.67s
— predicate stale after post-confirmation local update resets candidateId
(2) QA file opened in editor at t+0.81s before scenario calls openFile
— source unknown, triggers recovery suppression
(3) getCrdtHash(path) disagrees with checkpoint hashMismatches=0 at t+30s
— QA debug API and internal reconciler using different path/hash lookup
Architecture campaign goals:
Engine/Observer/Puppeteer runtime split: DONE
schema-v3 guard: DONE
production bundle guards: DONE
P4C liveness: DONE
s01: open RCA follow-up
Summary
This PR closes the architecture/autophagy thread: split the YAOS runtime into Engine (production), Observer (telemetry), and Puppeteer (QA harness), restore schema-v3 correctness, and prove the QA mount path works in live Obsidian.
What changed
Runtime split (P1–P4)
Engine: main.js
Observer: telemetry.js
Puppeteer: qa/obsidian-harness/ (__YAOS_DEBUG__ mount)
- src/lab/ removed entirely
- EngineControlPort removed from production via esbuild define
- __qaOnly Engine seams removed from main.js
- window.__YAOS_DEBUG__ is now mounted exclusively by the QA harness plugin, never by the product plugin
- telemetry.js ships as a passive Observer: no QA controls, no Puppeteer logic
Schema-v3 restoration
- SCHEMA_VERSION = 3 restored in schema.ts, fileMeta.ts, vaultSync.ts, diskMirror.ts
- SERVER_MIN_SCHEMA_VERSION = 3 / SERVER_MAX_SCHEMA_VERSION = 3 in server/src/version.ts
- Tombstone witness handling (markDirty(path, "tombstone")) in observeMetaChanges
- scripts/guard-schema-version.mjs: 7-condition regression guard wired into test:regressions
QA harness
- qa/obsidian-harness/ owns all Puppeteer source
- s00-smoke-trace-export.ts: stateless required liveness smoke that proves __YAOS_DEBUG__, __YAOS_QA__, QA product build, and harness plugin are all mounted
- run-smoke-ready.mjs: CDP controller for P4C; verifies trace file exists on disk
- qa:smoke-ready script: build → qa product → harness → live smoke
- s01-single-device-basic-edit.ts: unique per-run paths (prevents stale CRDT contamination)
- typeIntoFile: replaced character-by-character replaceRange with atomic setValue(current + text); avoids cursor-at-0 reversal in headless CDP; documented explicitly as an atomic editor transaction, not a human-typing simulation
Docs
- docs/engineering/schema-version-guard.md: step-by-step v4 bump procedure
Verification
P4C live smoke (all 9 checks):
What is NOT in this PR (open follow-up issues)
s01 functional scenario: still failing
s01-single-device-basic-edit runs but does not pass. The unique-path approach confirmed the failures are product bugs, not static-path contamination. Three open RCA findings:
- waitForReceiptAfter takes ~18.9s despite receipt confirmed at t+0.67s: predicate stale after post-confirmation local update resets _lastCandidateId; fallback path not resolving as expected
- QA file opened in editor at t+0.81s before scenario calls openFile: source unknown, triggers recovery suppression and external disk write
- getCrdtHash(path) disagrees with qa.checkpoint: hashMismatches=0 (QA debug API and internal reconciler using different path/hash lookups)
These are product/harness bugs that need a separate RCA. The architecture campaign does not depend on them.
Telemetry fidelity (deferred)
- Per-Y.Text text observer gap (documented in an installTelemetryRuntime.ts comment)
- DeviceWitnessTracker.markDirty labels local metadata changes as remote-apply
Architecture campaign status