fw/services/hrm: drop events when subscriber queue is full #1377

Merged: jplexer merged 1 commit into coredevices:main from teslabs:hrm-drop-on-full-queue on May 26, 2026

gmarull commented May 25, 2026

hrm_manager_new_data_cb asserted that every HRM sample was delivered to every subscriber via PBL_ASSERTN(prv_event_put(...)). For an app/worker subscriber, prv_event_put uses xQueueSendToBack() with a zero timeout, so it returns false the instant the app's event queue is full. A subscribed app that fails to drain its event queue fast enough therefore panicked the whole firmware (observed as an assert at prv_populate_hrm_event in a coredump, due to the noreturn assert's return address landing in the next inlined block).

Drop the sample and bump dropped_events instead of asserting, mirroring the existing behavior for the KernelBG circular buffer. The zero timeout is kept on purpose: the callback holds s_manager_state.lock, so blocking here would risk a deadlock or a watchdog timeout. The expiring-subscription event now sets sent_expiration_event only on success, so a failed send is retried rather than lost.
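
For readers without the diff at hand, the drop-on-full pattern described above can be sketched roughly as follows. This is a standalone illustration, not the code from this PR: `MockQueue`, `MockSubscriber`, `mock_event_put`, `mock_deliver_sample`, `mock_send_expiration`, `QUEUE_CAPACITY` and `MOCK_EVENT_EXPIRING` are all hypothetical names, and the fixed-size array only stands in for a FreeRTOS queue written to with a zero timeout. Only `dropped_events` and `sent_expiration_event` are names taken from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_CAPACITY 4u            /* hypothetical queue depth */
#define MOCK_EVENT_EXPIRING 0xFFFFu  /* hypothetical "subscription expiring" marker */

/* Stand-in for a subscriber's event queue (assumption: not the real type). */
typedef struct {
  uint32_t items[QUEUE_CAPACITY];
  size_t count;
} MockQueue;

typedef struct {
  MockQueue queue;
  uint32_t dropped_events;     /* bumped whenever a sample cannot be queued */
  bool sent_expiration_event;  /* set only after a successful send */
} MockSubscriber;

/* Non-blocking enqueue: returns false immediately when the queue is full,
 * which is how a zero-timeout queue send behaves. */
static bool mock_event_put(MockQueue *q, uint32_t event) {
  assert(q != NULL);
  if (q->count == QUEUE_CAPACITY) {
    return false;
  }
  q->items[q->count++] = event;
  return true;
}

/* Deliver one sample. A full queue is counted as a drop, not asserted on,
 * so a slow subscriber cannot take down the caller. */
static void mock_deliver_sample(MockSubscriber *sub, uint32_t bpm) {
  if (!mock_event_put(&sub->queue, bpm)) {
    sub->dropped_events++;
  }
}

/* Send the expiration notice at most once. The flag is set only when the
 * send succeeded, so a failed attempt is retried on the next call. */
static void mock_send_expiration(MockSubscriber *sub) {
  if (sub->sent_expiration_event) {
    return;
  }
  if (mock_event_put(&sub->queue, MOCK_EVENT_EXPIRING)) {
    sub->sent_expiration_event = true;
  }
}
```

In this sketch the caller never blocks: a full queue costs the subscriber one sample and increments its counter, while the expiration notice stays pending until a later call finds room in the queue.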

Re-enable test_hrm_manager (it builds and passes) and add a regression test covering the full-queue drop path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Gerard Marull-Paretas <gerard@teslabs.com>
@gmarull gmarull requested a review from jplexer as a code owner May 25, 2026 21:06
@jplexer jplexer merged commit 181f844 into coredevices:main May 26, 2026
39 checks passed