
CBG-5460: Re-enable Windows topology tests #8375

Open
gregns1 wants to merge 14 commits into main from CBG-5460

Conversation

gregns1 commented Jun 17, 2026

Contributor

CBG-5460

Changes Windows builds to obtain HLC time from GetSystemTimePreciseAsFileTime, which reads the high-resolution system clock live on every call.

The Hybrid Logical Clock derives a version's physical component from the wall clock, then clears the low 16 logical bits, which makes each physical "slot" 2^16 ns (~65 µs) wide. On Linux/macOS, time.Now() resolves to ~1 ns, so this is a non-issue. On Windows, time.Now() is backed by the coarse system timer (~0.5–15 ms). In topology tests, peers writing in quick succession produced identical HLC versions because many writes fell inside a single coarse tick.
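
The tie described above can be reproduced with a small sketch. This is not the Sync Gateway implementation: the helper names `physicalSlot` and `coarsen`, the 15 ms tick, and the example timestamp are assumptions chosen for illustration.

```go
package main

import "fmt"

// logicalBits is the number of low bits reserved for the logical counter
// (16, per the description above).
const logicalBits = 16

// physicalSlot clears the low 16 bits of a nanosecond timestamp, leaving
// only the physical component. The name is hypothetical.
func physicalSlot(unixNanos uint64) uint64 {
	return unixNanos &^ (1<<logicalBits - 1)
}

// coarsen rounds a timestamp down to a timer tick of the given width,
// imitating a clock that only advances once per tick.
func coarsen(unixNanos, tickNanos uint64) uint64 {
	return unixNanos - unixNanos%tickNanos
}

func main() {
	const tick = 15_000_000                // assumed 15 ms coarse tick, in nanoseconds
	a := uint64(1_750_000_000_000_000_000) // arbitrary example timestamp
	b := a + 1_000_000                     // a second write 1 ms later

	// With nanosecond resolution the two writes land in different slots.
	fmt.Println("fine clock ties:  ", physicalSlot(a) == physicalSlot(b))
	// With a coarse tick both writes read the same time, so the slots tie.
	fmt.Println("coarse clock ties:", physicalSlot(coarsen(a, tick)) == physicalSlot(coarsen(b, tick)))
}
```

Two writes 1 ms apart are far wider than one ~65 µs slot, so they only collide when the clock itself fails to advance between them.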

The initial fix used QueryPerformanceCounter (QPC): snapshot a (QPC ticks, time.Now()) anchor pair once at init(), then compute anchorNanos + elapsedTicks. This gives ~100 ns resolution, which removes the coarse-tick collisions.

However, the two clocks ended up with a fixed, never-correcting offset of up to ~15 ms between them, and because the offset depends on startup timing, it is random per process run. This constantly tripped the cv.ver <= cas re-stamp path and produced intermittent topology-test failures.
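
A portable model of the anchor arithmetic shows why the offset never corrects. The type `anchoredClock`, its fields, the 10 MHz counter frequency, and the 7 ms staleness are all assumptions for illustration; the real code would read the counter through the Windows API rather than take it as a parameter.

```go
package main

import "fmt"

// anchoredClock is a hypothetical model of the anchor approach: one
// (ticks, wall-clock nanos) pair captured at startup plus the counter frequency.
type anchoredClock struct {
	anchorTicks int64 // counter reading at startup
	anchorNanos int64 // wall-clock reading at startup, in Unix nanoseconds
	ticksPerSec int64 // counter frequency
}

// nanosAt converts a later counter reading into Unix nanoseconds.
func (c anchoredClock) nanosAt(ticks int64) int64 {
	elapsed := ticks - c.anchorTicks
	// Split into whole seconds and a remainder so the multiplication
	// cannot overflow for long uptimes.
	secs := elapsed / c.ticksPerSec
	rem := elapsed % c.ticksPerSec
	return c.anchorNanos + secs*1_000_000_000 + rem*1_000_000_000/c.ticksPerSec
}

func main() {
	const trueStart = int64(1_750_000_000_000_000_000) // arbitrary example time
	const staleBy = int64(7_000_000)                   // assumed: coarse clock was 7 ms behind at startup

	clock := anchoredClock{
		anchorTicks: 1000,
		anchorNanos: trueStart - staleBy, // anchor inherits the coarse clock's error
		ticksPerSec: 10_000_000,          // assumed 10 MHz counter (100 ns per tick)
	}

	// One hour later the counter has advanced precisely, but the error
	// captured in the anchor is still there.
	ticks := int64(1000) + 3600*10_000_000
	trueNow := trueStart + 3600*1_000_000_000
	fmt.Println("offset after one hour (ns):", trueNow-clock.nanosAt(ticks))
}
```

Because the anchor is read only once, whatever error the coarse clock had at that instant is baked into every later reading.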

Switched to GetSystemTimePreciseAsFileTime, which fixes the issue: it offers the same precision, but both base and rosmar now read the same system clock.
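
GetSystemTimePreciseAsFileTime returns a FILETIME, which counts 100 ns intervals since 1601-01-01 UTC, so the value has to be rebased to the Unix epoch before it can feed the HLC. The sketch below shows only that conversion and runs on any platform; the function name `filetimeToUnixNanos` is hypothetical, and the actual Windows call is left out.

```go
package main

import (
	"fmt"
	"time"
)

// filetimeEpochOffset is the number of 100 ns intervals between the FILETIME
// epoch (1601-01-01 UTC) and the Unix epoch (1970-01-01 UTC).
const filetimeEpochOffset = 116444736000000000

// filetimeToUnixNanos converts a 64-bit FILETIME value to Unix nanoseconds.
// The function name is hypothetical.
func filetimeToUnixNanos(ft uint64) int64 {
	return (int64(ft) - filetimeEpochOffset) * 100
}

func main() {
	// Build a FILETIME value from a known instant, then convert it back.
	ts := time.Date(2026, time.June, 17, 0, 0, 0, 0, time.UTC)
	ft := uint64(ts.UnixNano()/100) + filetimeEpochOffset

	fmt.Println(time.Unix(0, filetimeToUnixNanos(ft)).UTC())
}
```

Since every call reads the system clock directly, there is no startup anchor and therefore no per-process offset to carry forward.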

Sync Gateway's HLC and Couchbase Server's CAS are genuinely independent clocks on separate nodes, and the system is designed to converge under that skew — the CAS re-stamp exists precisely for it. When the topology test failed, the data still converged correctly (every peer agreed on the same cv and pv); only the test harness's predicted version was wrong.

Pre-review checklist

  • Removed debug logging (fmt.Print, log.Print, ...)
  • Logging sensitive data? Make sure it's tagged (e.g. base.UD(docID), base.MD(dbName))
  • Updated relevant information in the API specifications (such as endpoint descriptions, schemas, ...) in docs/api

Dependencies (if applicable)

Integration Tests

Copilot AI review requested due to automatic review settings June 17, 2026 15:16

Copilot AI left a comment

Contributor

Pull request overview

This PR aims to make HLV-based topology tests deterministic on Windows by spacing peer writes to avoid ties caused by Windows’ lower wall-clock precision (independent HLCs producing identical versions).

Changes:

  • Add Windows-only time.Sleep delays before peer document mutations (create/update/delete) to help ensure unique HLV versions across peers.
  • Introduce runtime/time usage in the topology HLV test helpers to gate this behavior by OS.

Comment thread topologytest/hlv_test.go Outdated
Comment thread topologytest/hlv_test.go Outdated
Comment thread topologytest/hlv_test.go Outdated
Comment thread topologytest/hlv_test.go Outdated
@gregns1 gregns1 self-assigned this Jun 17, 2026
@torcolvin

Collaborator

As discussed offline, see if we can use QueryPerformanceCounter to get a high-resolution timestamp for use in HybridLogicalClock. https://learn.microsoft.com/en-us/windows/win32/sysinfo/acquiring-high-resolution-time-stamps

@gregns1 gregns1 assigned torcolvin and unassigned gregns1 Jun 19, 2026

3 participants