HOLD FOR TESTING - fix(audio): reduce Windows USB audio latency for RX and CW sidetone (#3193) #3194

Closed
M7HNF-Ian wants to merge 1 commit into aethersdr:main from M7HNF-Ian:fix/3193-wasapi-usb-latency

Conversation

@M7HNF-Ian
Contributor

Problem

On Windows with class-compliant USB audio interfaces (Scarlett Solo Gen4, Focusrite, etc.), two independent code paths contribute to audible latency:

1. RX audio — AudioEngine::startRxStream() (AudioEngine.cpp)

QAudioSink is constructed and start() called with no prior setBufferSize(). WASAPI shared mode then applies the device's default ring buffer, which for USB class-compliant interfaces is typically 100–300 ms. This stacks on top of m_rxBufferCapMs (200 ms), producing 500 ms+ speaker latency.

2. CW sidetone — CwSidetonePortAudioSink (CwSidetonePortAudioSink.cpp)

Pa_GetDefaultOutputDevice() on Windows returns an MME device (the first enumerated host API), which carries 50–150 ms OS-level buffering. When a user selects a specific output and the name is matched by substring against PortAudio's device list, the same physical device appears under MME, DirectSound, and WASAPI — the old code treated this as an ambiguous match and fell back to paNoDevice. The result is CW timing jitter on fast keying.

Fix

AudioEngine.cpp — add setBufferSize(100 ms) before start() for both the 48 kHz path and the 24 kHz fallback path inside #ifdef Q_OS_WIN, mirroring the explicit 50 ms buffer already set in CwSidetoneQAudioSink.
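
As a rough check on the buffer math, the sketch below converts a duration in milliseconds into a byte count for interleaved PCM. The helper name `bufferBytesForMs` is hypothetical and not taken from the patch, and it assumes both paths use Float32 stereo; in the real code the resulting value would be what gets passed to `setBufferSize()` before `start()`.

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): bytes needed to hold `ms`
// milliseconds of interleaved PCM audio.
constexpr std::int64_t bufferBytesForMs(int sampleRateHz, int channels,
                                        int bytesPerSample, int ms)
{
    // Frames in the window; each frame carries one sample per channel.
    const std::int64_t frames =
        static_cast<std::int64_t>(sampleRateHz) * ms / 1000;
    return frames * channels * bytesPerSample;
}
```

Under those assumptions, `bufferBytesForMs(48000, 2, 4, 100)` gives 38400 bytes for the 48 kHz path and `bufferBytesForMs(24000, 2, 4, 100)` gives 19200 bytes for the 24 kHz fallback.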

CwSidetonePortAudioSink.cpp — two changes:

  • defaultPortAudioOutputDevice(): prefer the WASAPI host API on Windows (mirrors the existing Linux JACK preference).
  • findPortAudioOutputDevice(): collect all partial-match candidates; when multiple host APIs expose the same device, resolve to the WASAPI candidate instead of returning paNoDevice with an "ambiguous" warning. Falls back to the existing warning if multiple WASAPI matches exist.
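
The resolution rule described above can be sketched in plain C++ roughly as follows. This is an illustration only: `Candidate`, `HostApi`, `kNoDevice`, and `resolvePartialMatches` are hypothetical stand-ins rather than names from `CwSidetonePortAudioSink.cpp`, and the substring match is assumed to be case-sensitive.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for PortAudio's host API type and device index.
enum class HostApi { MME, DirectSound, WASAPI, Unknown };
constexpr int kNoDevice = -1;  // plays the role of paNoDevice

struct Candidate {
    int deviceIndex;
    HostApi api;
    std::string name;
};

// Collect every device whose name contains `wanted`, then resolve:
//   0 matches         -> kNoDevice
//   1 match           -> that device
//   several matches   -> the single WASAPI entry, if exactly one exists
//   otherwise         -> kNoDevice (still ambiguous, caller logs a warning)
int resolvePartialMatches(const std::vector<Candidate>& devices,
                          const std::string& wanted)
{
    std::vector<Candidate> partials;
    for (const auto& d : devices) {
        if (d.name.find(wanted) != std::string::npos)
            partials.push_back(d);
    }

    if (partials.empty()) return kNoDevice;
    if (partials.size() == 1) return partials.front().deviceIndex;

    int wasapiIndex = kNoDevice;
    int wasapiCount = 0;
    for (const auto& p : partials) {
        if (p.api == HostApi::WASAPI) {
            wasapiIndex = p.deviceIndex;
            ++wasapiCount;
        }
    }
    return wasapiCount == 1 ? wasapiIndex : kNoDevice;
}
```

With one physical interface listed under MME, DirectSound, and WASAPI, the function returns the WASAPI index; if two WASAPI entries match the same substring, it keeps the old ambiguous outcome.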

Testing

Tested locally on macOS, where both changes compile out behind #ifdef Q_OS_WIN, so this confirms only that non-Windows builds do not regress. Windows testing with a Scarlett Solo Gen4 is needed to confirm the latency improvement; the reporter in #3193 should be able to verify.

Fixes #3193

…ethersdr#3193)

Two root causes contributed to audible latency on class-compliant USB
interfaces (Scarlett Solo Gen4 and similar) on Windows:

1. AudioEngine::startRxStream() created the QAudioSink without calling
   setBufferSize(), so WASAPI shared mode applied its default ring buffer
   (100–300 ms for USB devices), stacking on top of m_rxBufferCapMs and
   producing 500 ms+ speaker latency. Fix: cap the WASAPI ring to 100 ms
   (≈38 KB at 48 kHz Float32 stereo) before start(), mirroring the
   explicit 50 ms buffer already set in CwSidetoneQAudioSink.

2. CwSidetonePortAudioSink picked devices by first-match, which on Windows
   resolves to an MME device (50–150 ms OS buffering) rather than WASAPI
   (≈10 ms). This caused CW timing jitter on fast keying.
   Fix: defaultPortAudioOutputDevice() now prefers the WASAPI host API on
   Windows (mirroring the Linux JACK preference already in place), and
   findPortAudioOutputDevice() resolves ambiguous multi-API partial matches
   by selecting the WASAPI candidate over MME/DirectSound alternatives.

Fixes aethersdr#3193
@M7HNF-Ian requested a review from a team as a code owner on May 26, 2026 17:33
@jensenpat changed the title from "fix(audio): reduce Windows USB audio latency for RX and CW sidetone (#3193)" to "HOLD FOR TESTING - fix(audio): reduce Windows USB audio latency for RX and CW sidetone (#3193)" on May 26, 2026
@jensenpat
Collaborator

Thanks, Ian. Since this touches the audio stack for all users on Windows, I want to give this some soak time for testing.

@jensenpat marked this pull request as draft on May 26, 2026 19:09
@jensenpat
Collaborator

jensenpat commented May 26, 2026

@AetherClaude please provide a defensive review of this fix with full CW and audio sink history on Windows. Include a table of previous issues and PRs. Because of the audio stack impact, I'm especially concerned about fix 1 and its impact on system and USB devices.

@aethersdr-agent
Contributor

Defensive review — Windows audio sink history & risk assessment

Pat, here's a defensive look at PR #3194 against the full Win audio/CW history. TL;DR: Fix #2 (PortAudio sidetone) is low-risk and well-scoped. Fix #1 (RX WASAPI ring cap) is correct in intent but has device-specific failure modes worth shaking out before merge.

Historical context — Windows RX / CW sink issues & PRs

| # | Title | Date | Outcome | Touches |
|---|---|---|---|---|
| #1149 | Crash: UAF in onTxAudioReady after long runtime (macOS) | Apr 2026 | closed | inspired the WASAPI silent-stop watchdog referenced at AudioEngine.cpp:1073-1078 |
| #1303 | Audio Loss After Idle (Win) | Apr 2026 | closed | original WASAPI IdleState restart logic |
| #1405 | No audio in PC Audio v0.8.12.2 | Apr 2026 | closed | removed IdleState restart loop that broke Win playback; replaced with zombie-sink watchdog |
| #1505 | RX buffer cap | merged | shipped | introduced m_rxBufferCapMs (default 200 ms, 50–1000 ms); this is the engine-side ring, separate from the WASAPI device ring this PR targets |
| #1705 | Telephony (HFP/SCO) output substitution | merged | shipped | macOS-only telephony heuristic at lines 991-1033; unaffected here |
| #2120 | NR artifact on Win 24 kHz WASAPI | merged | shipped | the existing #ifdef Q_OS_WIN 48 kHz preference block this PR edits |
| #2528 | Ver 0.9.8 TX→RX delay when stop sending CW (Win) | May 10 | open | overlaps symptoms with #3193's "blip on TX→RX"; this PR may partially address it |
| #2583 | Audio device hotplug | merged | shipped | every device change re-runs startRxStream(), so the new setBufferSize cap applies on every hotplug |
| #2629 | CWX keying: no sidetone monitor heard (Win/VB-Cable) | May 13 | closed | introduced the Int16 fallback in CwSidetoneQAudioSink:53; relevant precedent for Win sink negotiation oddities |
| #2694 | CWX sidetone latency + distorted tone, FLEX-8400 (Win) | May 15 | open | distortion + latency; this PR may help latency, will not fix distortion |
| #2703 | USB CW not working on Mac | May 15 | closed | mac-only |
| #2899 / #2901 | Route CW sidetone to selected audio output | May 21 | merged | introduced the PortAudio sidetone sink and the substring-matching device resolver this PR fixes |
| #2929 | WASAPI mono-only USB PnP mic silent-open recovery | merged | shipped | precedent for Win-specific WASAPI quirk handling; pattern is "guard with #ifdef Q_OS_WIN, fall back cleanly", which this PR follows |
| #2973 | Default audio summary logging | merged | shipped | new logging surface that should now report the 100 ms cap so we can verify field behaviour |
| #3071 | No PC sidetone for HWCW source | May 24 | closed | sidetone routing edge case |
| #3193 | External USB audio interface + Win 11 latency (Scarlett Solo Gen4) | May 26 | open | the root report this PR addresses |

Fix #1 (RX setBufferSize(100ms)) — concerns

The change is correctly placed before start() (Qt requires this) and is correctly inside the existing #ifdef Q_OS_WIN block. Buffer math is right for both 48 kHz and 24 kHz paths. But:

  1. setBufferSize is a hint, not a contract. Qt's WASAPI backend may round it up to the device period or ignore it for devices that report a fixed minimum. The worst case is therefore a no-op rather than a regression, but it also means we cannot guarantee the latency win advertised in the PR description without per-device testing.

  2. 100 ms is aggressive for the long tail of Windows audio endpoints. WASAPI shared-mode period is typically 10 ms on Win10/11 (10× headroom is reasonable), but several endpoint classes can struggle:

    • Bluetooth A2DP outputs (LDAC/aptX) routinely need 150–250 ms.
    • Virtual cables (VB-Cable, VoiceMeeter): these were the cause of the Int16 fallback added for #2629 (CWX keying with PC Audio: No CW sidetone monitor heard). They should tolerate 100 ms but warrant explicit testing.
    • USB 1.1 / UAC1 class-compliant devices (older USB-audio dongles, some headsets) often need 200 ms+ to avoid dropouts.
    • Realtek HDA onboard — typically OK at 100 ms but varies by chipset/driver.

    If any of these underrun, the user gets choppy RX with no obvious diagnostic.

  3. No fallback / no exposure to the existing user-tunable cap. m_rxBufferCapMs is user-adjustable (50–1000 ms, default 200) precisely because users on VPN/SmartLink need to raise it. The new WASAPI ring cap is a hard-coded 100 ms with no escape hatch. If a user raises m_rxBufferCapMs to 500 ms hoping to fix dropouts, the 100 ms WASAPI ring underneath may still underrun. Worth considering tying the WASAPI cap to a fraction of m_rxBufferCapMs, or at least making it adjustable.

  4. No regression for the #1303 (Audio Loss After Idle) / #1405 (No audio in PC Audio V0.8.12.2) / idle-watchdog path. The state-changed handler at line 1079+ is intact; setBufferSize doesn't change idle/sleep behaviour. Good.

  5. 24 kHz fallback path: math correct, but the fallback exists for a reason — devices that reject 48 kHz are by definition fussy. Applying a 100 ms cap to them as well is the riskier of the two new calls.

Recommended pre-merge test matrix:

Fix #2 (PortAudio sidetone WASAPI preference) — concerns

Much lower risk. Mirrors the existing Linux JACK preference cleanly. Two minor notes:

  • paInDevelopment is used as an "unknown" sentinel for apiType when Pa_GetHostApiInfo returns null. Functionally fine (it will never equal paWASAPI), but a bit opaque — a comment or explicit sentinel would read better.
  • The partials.size() - 1 count in the qCInfo log is safe in the only branch that reaches it (size ≥ 2 by construction).
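
One way the sentinel note could look, as a sketch only: an empty `std::optional` expresses "host API unknown" directly instead of borrowing `paInDevelopment`. The enum and struct below are local stand-ins that mirror the relevant PortAudio values so the snippet compiles without the library; `apiTypeOf` and `isWasapi` are hypothetical names, and C++17 is assumed.

```cpp
#include <optional>

// Local stand-ins mirroring PortAudio's PaHostApiTypeId values, so the
// sketch builds without the library. Not the PR's actual types.
enum HostApiTypeId { kInDevelopment = 0, kMME = 2, kWASAPI = 13 };
struct HostApiInfo { HostApiTypeId type; };

// Returns the host API type, or an empty optional when the info pointer
// is null (the case where the PR currently substitutes paInDevelopment).
std::optional<HostApiTypeId> apiTypeOf(const HostApiInfo* info)
{
    if (!info) return std::nullopt;
    return info->type;
}

// An empty optional never compares equal to kWASAPI, so unknown APIs
// are rejected without a magic value.
bool isWasapi(const HostApiInfo* info)
{
    return apiTypeOf(info) == kWASAPI;
}
```

Behaviour is the same as the current code; the only difference is that the "unknown" case is spelled out in the type.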

This change should also help #2528 (TX→RX delay on CW stop) and the latency component of #2694, since both flow through the same sidetone path. Worth cross-linking in the PR body.

Recommendation

Fix #2: ship it after the minor logging tidy.

Fix #1: I'd hold for the test matrix above, and consider whether the WASAPI cap should be derived from m_rxBufferCapMs (e.g., min(m_rxBufferCapMs / 2, 100ms)) rather than hard-coded — that way users on jittery links can still buy themselves headroom by raising the existing knob. The hard-coded 100 ms is fine for the Scarlett/Focusrite class the reporter is on, but the audio stack serves a much broader hardware population than that.
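
To make the suggested derivation concrete, here is a minimal sketch of the `min(m_rxBufferCapMs / 2, 100ms)` form. The function and constant names are hypothetical, and the 50–1000 ms input range is the one quoted above for `m_rxBufferCapMs`.

```cpp
#include <algorithm>

// Hypothetical ceiling for the device-side (WASAPI) ring, in milliseconds.
constexpr int kMaxDeviceRingMs = 100;

// Derive the device ring from the user-tunable engine cap, read literally
// from the formula in the review: half the engine cap, never above 100 ms.
constexpr int deviceRingMsFor(int rxBufferCapMs)
{
    return std::min(rxBufferCapMs / 2, kMaxDeviceRingMs);
}
```

Read literally, this gives 25 ms at the 50 ms minimum, 100 ms at the 200 ms default, and still 100 ms at 500 ms or 1000 ms. In other words, the `min()` form only ever shrinks the ring below 100 ms; if the goal is for users to gain device-side headroom by raising the knob, the ceiling would need to be higher or the formula would need a different shape (for example a plain fraction of `m_rxBufferCapMs`).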

73, Jeremy KK7GWY & Claude (AI dev partner)


🤖 aethersdr-agent · cost: $7.2190 · model: claude-opus-4-7

@jensenpat
Collaborator

Ian, please roll up CW fix 2 in a standalone PR and I will approve.

I am not going to apply fix 1 for now because it impacts buffer sizes for all interface types, not just USB.

On these sensitive audio stack changes, it's important to roll separate PRs as changes to the audio stack require extensive testing.

Thanks for the fixes!

@jensenpat closed this on May 28, 2026
ten9876 pushed a commit that referenced this pull request May 28, 2026
…3241)

Standalone split of Fix #2 from #3194, as requested by @jensenpat.

## Problem

`CwSidetonePortAudioSink` picks audio output devices via PortAudio, but
on Windows:

1. **`defaultPortAudioOutputDevice()`** falls straight through to
`Pa_GetDefaultOutputDevice()`, which returns an MME device (the first
host API PortAudio enumerates). MME carries 50–150 ms OS-level
buffering, producing audible CW timing jitter on fast keying.

2. **`findPortAudioOutputDevice()`** matches devices by name substring.
On Windows, the same physical device is enumerated three times under
MME, DirectSound, and WASAPI. The old code treated this as an ambiguous
match and returned `paNoDevice` — causing the sidetone to fall back to
`CwSidetoneQAudioSink` and lose low-latency timing. When it did resolve,
it picked whichever host API appeared first (MME).

## Fix

**`defaultPortAudioOutputDevice()`** — add a WASAPI preference pass
before the `Pa_GetDefaultOutputDevice()` fallback, mirroring the
existing Linux JACK preference exactly. WASAPI shared mode runs at ~10
ms, vs 50–150 ms for MME.
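
A minimal model of that preference pass, for illustration only. The types and the
`pickDefaultOutput` name are hypothetical; in the real code the loop would walk
`Pa_GetHostApiCount()` entries and read `type` and `defaultOutputDevice` from
`Pa_GetHostApiInfo()`, with `Pa_GetDefaultOutputDevice()` as the fallback value.

```cpp
#include <vector>

// Stand-ins for PortAudio concepts; none of these names come from the PR.
enum class HostApi { MME, DirectSound, WASAPI };
constexpr int kNoDevice = -1;  // plays the role of paNoDevice

struct HostApiEntry {
    HostApi type;
    int defaultOutputDevice;  // kNoDevice when the API has no output
};

// Preference pass: return the default output of the first WASAPI host API
// that has one; otherwise keep the library-wide default.
int pickDefaultOutput(const std::vector<HostApiEntry>& hostApis,
                      int globalDefault)
{
    for (const auto& api : hostApis) {
        if (api.type == HostApi::WASAPI &&
            api.defaultOutputDevice != kNoDevice) {
            return api.defaultOutputDevice;
        }
    }
    return globalDefault;
}
```

If no WASAPI host API is present, or it reports no default output, the function
returns the same device as before the change.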

**`findPortAudioOutputDevice()`** — collect all partial-match
candidates, then resolve to the WASAPI entry when exactly one exists.
Logs a clear `qCInfo` message showing which host API won. Falls back to
the existing "matched multiple" warning only if multiple WASAPI entries
are somehow present (pathological case).

## Scope

`CwSidetonePortAudioSink.cpp` only. No changes to `AudioEngine.cpp`, no
RX path changes, no buffer size changes.

## Related

- Partially addresses #3193 (CW sidetone latency component)
- May also help #2528 (TX→RX delay when stopping CW) and the latency
component of #2694
- Sibling of the held #3194 Fix #1 (RX WASAPI ring cap — separate review
track)