Skip to content

Fix ESP32 RSA watchdog timeout during key generation#3018

Open
valientegaston wants to merge 8 commits into
toitlang:masterfrom
valientegaston:fix-esp32-rsa-deadlock
Open

Fix ESP32 RSA watchdog timeout during key generation#3018
valientegaston wants to merge 8 commits into
toitlang:masterfrom
valientegaston:fix-esp32-rsa-deadlock

Conversation

@valientegaston
Copy link
Copy Markdown
Contributor

Problem
On ESP32, generating an RSA 2048-bit key takes longer than 10 seconds, causing the watchdog timer to kill the process, mistakenly believing it is hung.

Root cause
The mbedtls RSA operations (rsa_generate, rsa_sign, rsa_encrypt, rsa_decrypt) were passing NULL as the context to the rsa_rng callback. The scheduler watchdog monitors the run_timestamp of each process and kills it if it hasn't been updated within the timeout window. Since the RSA key generation is a long-running blocking operation, the timestamp was never updated during its execution.

Fix
The rsa_rng callback is called many times by mbedtls during RSA operations — in particular during key generation, it is called repeatedly to generate prime candidates and random bases for Miller-Rabin primality tests. This makes it a natural and low-overhead place to reset the watchdog by updating the process run_timestamp.
The fix passes the Process* as the context to all rsa_rng calls, and inside rsa_rng the scheduler mutex is taken to safely update the timestamp.

Safety
We verified that primitives execute inside an Unlocker block in Scheduler::run_process, meaning the scheduler mutex is not held during RSA operations. Therefore, taking it inside rsa_rng is safe and will not cause a deadlock.

@valientegaston
Copy link
Copy Markdown
Contributor Author

I also noticed that CONFIG_MBEDTLS_HARDWARE_MPI is explicitly set to n across all ESP32 toolchain configurations (esp32, esp32s2, esp32s3, esp32c3, esp32c6). Was this intentional? I understand it wouldn't help with key generation (as the prime generation and Miller-Rabin tests run in software regardless), but it could potentially speed up everyday RSA operations like signing, verification, and encryption/decryption.

@floitsch
Copy link
Copy Markdown
Member

floitsch commented Apr 6, 2026

Primitives shouldn't be blocking. We want to run other tasks when waiting for a primitive. As such, the scheduler revealed a real issue.
I think the best way is to run the key generation in a separate thread.
Have a look at src/event_sources/async_posix.cc and src/resources/spi_linux.cc (which uses it). With a bit of luck the same approach works on the ESP32.
It would be a bit more complicated:

  • create a "RsaGeneration" resource that contains the thread, and then do the creation of the key using that resource.

I'm guessing we set the MBEDTLS_HARDWARE_MPI to 'n' to reduce the image size. We can definitely enable it again.
I don't remember anymore if the RSA-key generation is behind an sdkconfig flag or not. I think we already have a TOIT_CRYPTO_EXTRA flag, though. So that's probably enough. The new resource should definitely be behind that one.

@valientegaston
Copy link
Copy Markdown
Contributor Author

Hi! I have implemented the RSA key generation following your suggestions:

Non-blocking: The primitives are now non-blocking. The key generation runs in a separate thread, and the Toit Task is suspended while waiting for the resource event.

RsaGeneration Resource: I created a new RsaGeneration resource (contained in StructTag as RsaGenerationResourceTag and RsaGenerationResourceGroupTag) to manage the lifecycle of the asynchronous task.

Architecture: I followed the approach in async_posix.cc, using the AsyncEventThread to coordinate the background generation and the completion event.

Flags & Configuration: The new resource and primitives are correctly guarded by the CONFIG_TOIT_CRYPTO_EXTRA sdkconfig flag as you proposed. I've also updated the sdkconfig.defaults for ESP32 targets to enable this by default.

Binary Compatibility: I made sure to append the new tags at the end of the StructTag enum to avoid any issues with existing bootstrap snapshots.

Please let me know if there's any part of the implementation you'd like me to refine further. Thanks!

- Add RsaGenerationResource_ class with finalizer/close pattern to
  ensure proper cleanup of resource group and state on exceptions
  (e.g. with-timeout).
- Add rsa-generate-close primitive to tear down the resource group.
- Fix dangling proxy in rsa_generate_finish error paths by adding
  missing clear_external_address calls.
- Add no-op Thread::cancel() for ESP32 (required since async_posix
  is now compiled for ESP32).
@floitsch
Copy link
Copy Markdown
Member

floitsch commented Apr 7, 2026

Opened a fix PR on your fork: valientegaston#1

It addresses a few resource cleanup issues:

  • The resource group and state were not cleaned up on exceptions (e.g. with-timeout wrapping the generate call). Added a RsaGenerationResource_ class with the standard finalizer/close pattern.
  • Missing resource_proxy->clear_external_address() on error paths in rsa_generate_finish (dangling pointer).
  • Missing Thread::cancel() on ESP32 (linker error since async_posix is now compiled for ESP32).

floitsch and others added 2 commits April 7, 2026 22:25
Comment thread src/primitive_crypto.cc Outdated
free(prv); free(pub);
} else {
// mbedtls writes from the end. Move to start.
unsigned char* prv_start = (unsigned char*)malloc(prv_len);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't have enough memory here, we would have to throw away everything we did so far and start again.
Also: I'm not even sure things would behave correctly: By marking the error as MBEDTLS_ERR_PK_ALLOC_FAILED the rsa_generate_finish would return a "malloc failed". This would indicate to the system that the primitive ran out of memory and that we should do a GC, then try again. However, the rsa_generate_finish would just fail now as the resource_proxy just has been cleared.

-> We need to allocate the memory in the main-thread (that has called the primitive) and not inside the event-thread. Unfortunately, this means that we need to know the prv_len und pub_len already before we actually call mbedtls_pk_write_key_der.

That said: it looks to me like a resize is the better option anyway. The prv_len/pub_len just seem to say how much of the given buffer was written too.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced the two extra malloc calls with memmove on the same buffer to shift the data to the start, followed by realloc to shrink it to the exact size. If realloc fails it returns null without altering the original block, so the original pointer is preserved and passed to set_results correctly — no memory leak or dangling pointer.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same still applies to the 'malloc' in line 986/987. (the ret goes to the rsa_generate_finish, leading to an unrecoverable OOM). The mallocs need to be done in the main thread. Here we just use constants (RSA_PRV_DER_MAX_BYTES), so that's relatively easy to fix.

Comment thread src/scheduler.h Outdated
bool is_locked() const { return OS::is_locked(mutex_); }
bool is_boot_process(Process* process) const { return boot_process_ == process; }

Mutex* mutex() const { return mutex_; }
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Necessary?
If yes, we should try to find a better approach than to make the scheduler's mutex public.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer necessary. In the previous approach, the rsa_rng callback needed to take the scheduler mutex to update the process run_timestamp. With the new async design, the Toit process is suspended during key generation, so there's no need to update the timestamp from the RNG callback at all. The mutex() accessor has been removed.

CONFIG_MBEDTLS_X509_TRUSTED_CERT_CALLBACK=y
CONFIG_MBEDTLS_ECP_RESTARTABLE=y
CONFIG_MBEDTLS_HARDWARE_MPI=n
CONFIG_MBEDTLS_HARDWARE_MPI=y
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any downside to enabling MPI?

For some esp32 variants we were fighting the partition sizes already. So if this increases the image size, we need to make sure we aren't too big now.

Copy link
Copy Markdown
Contributor Author

@valientegaston valientegaston Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I measured the binary size impact on ESP32:

  • Without HARDWARE_MPI: 0x1561f0 bytes (18% free of 0x1a0000 partition)
  • With HARDWARE_MPI: 0x156390 bytes (18% free of 0x1a0000 partition)

The difference is only ~416 bytes.

I only tested on ESP32 so far — still checking the other variants.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ESP32S3:

Without HARDWARE_MPI: 0x14d5c0 bytes (20% free)
With HARDWARE_MPI: 0x14d840 bytes (20% free)
Difference: ~640 bytes

Still checking the remaining variants (esp32c3, esp32c6, esp32s2).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ESP32S2:

Without HARDWARE_MPI: 0x116a50 bytes (33% free)
With HARDWARE_MPI: 0x116ce0 bytes (33% free)
Difference: ~656 bytes

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ESP32C3:

Without HARDWARE_MPI: 0x163e00 bytes (14% free)
With HARDWARE_MPI: 0x164560 bytes (14% free)
Difference: ~1888 bytes

ESP32C6:

Without HARDWARE_MPI: 0x1818c0 bytes (11% free)
With HARDWARE_MPI: 0x182020 bytes (11% free)
Difference: ~1888 bytes

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks.

@floitsch
Copy link
Copy Markdown
Member

@valientegaston do you think you can address the feedback?

@valientegaston
Copy link
Copy Markdown
Contributor Author

Hi! Yes, I'll be addressing the feedback in the coming days. We've been quite busy lately but I'll make sure to go through all the comments and make the necessary changes. Thanks for your patience!

Copy link
Copy Markdown
Member

@floitsch floitsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still some subtle issues.
Not completely sure what the best way for the resource handling is.
Maybe one way is to keep track of whether the parallel thread is running and store that in the resource. When the resource gets the signal to destroy itself, it could then delay that action until the thread returns. Alternatively (if possible) killing the thread first, would also work.
We do something slightly similar in the BLE code.

Comment thread src/primitive_crypto.cc Outdated
}

if (ret == 0) {
unsigned char* prv = (unsigned char*)malloc(RSA_PRV_DER_MAX_BYTES);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When do we free the memory again? Shouldn't that happen in rsa_generate_finish?
But that's more complicated:

  • what if nothing ever calls 'rsa_generate_finish'? This means that the memory needs to be attached to the resource.
  • However, that immediately reveals another race condition (which was already present): what if the program stops while we are running in parallel. The res resource could already be freed at that point.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch on both points. Here's how they've been addressed:

Buffer ownership: prv_buf_ and pub_buf_ are now members of RsaGenerationResource, allocated in the main thread before launching the background thread. The resource destructor always frees them, so memory is never leaked even if rsa_generate_finish is never called.

Race condition: Addressed with a thread_running_ / destroy_when_done_ pattern protected by the EventSource mutex. If the resource is destroyed while the thread is still running, delete_or_mark_for_deletion sets destroy_when_done_ = true instead of freeing immediately. The actual delete this happens in on_thread_done once the thread finishes — and crucially, after releasing the mutex, to avoid use-after-free on the Locker destructor itself.

Comment thread src/primitive_crypto.cc Outdated
free(prv); free(pub);
} else {
// mbedtls writes from the end. Move to start.
unsigned char* prv_start = (unsigned char*)malloc(prv_len);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same still applies to the 'malloc' in line 986/987. (the ret goes to the rsa_generate_finish, leading to an unrecoverable OOM). The mallocs need to be done in the main thread. Here we just use constants (RSA_PRV_DER_MAX_BYTES), so that's relatively easy to fix.

Comment thread src/primitive_crypto.cc Outdated
memmove(prv, prv + RSA_PRV_DER_MAX_BYTES - prv_len, prv_len);
memmove(pub, pub + RSA_PUB_DER_MAX_BYTES - pub_len, pub_len);

unsigned char* prv_resized = (unsigned char*)realloc(prv, prv_len);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add the following information:
RSA_PRV_DER_MAX_BYTES is typically ~550 bytes.
Typically, keys only need ~300 bytes.
As such the realloc is useful.
However, if it fails, it's not the end of the world either.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Added the following comment above the realloc calls:

// RSA_PRV_DER_MAX_BYTES is typically ~550 bytes, but generated keys
// usually only need ~300 bytes. The realloc shrinks the buffer to the
// actual size to recover the unused memory. If realloc fails, it returns
// null without freeing the original block, so the original pointer is
// preserved and no memory is leaked — it just means we keep the larger
// buffer, which is acceptable.

A second best-effort realloc is also attempted in rsa_generate_finish in case the one in the generation thread failed due to heap fragmentation at that point in time.

Comment thread src/primitive_crypto.cc Outdated
}

if (ret == 0) {
unsigned char* prv = (unsigned char*)malloc(RSA_PRV_DER_MAX_BYTES);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use unvoid_cast instead of C-style casts. Here and in all the other places.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced all C-style casts in the new and modified code with unvoid_cast. Resource type conversions use static_cast as is the convention in the rest of the codebase.

Comment thread src/primitive_crypto.cc Outdated
}

ByteArray* prv_der = process->allocate_byte_array(resource->prv_len());
ByteArray* pub_der = process->allocate_byte_array(resource->pub_len());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option is to use an external ByteArray and use the existing buffers. That would decrease the peak amount of memory.
I think for 250+ bytes it's ok to have an external byte-array.
However, at that point I would do another realloc, just in case the one in the other thread didn't succeed. If it still doesn't succeed, you can do a MALLOC_FAILED.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach was explored but ran into a safety issue with GC retries: if prv_der allocation succeeds but pub_der fails, the runtime does GC and retries the primitive — but prv_buf_ ownership has already been transferred to the external ByteArray, leaving the resource in an inconsistent state where the private key buffer is gone and cannot be recovered on retry.
Given this complexity and the fact that RSA key generation is a one-time operation, the allocate_byte_array + memcpy pattern was kept instead. A best-effort realloc is attempted in rsa_generate_finish before copying to reduce the buffer sizes as much as possible before the peak occurs. The original buffers are freed immediately after copying on the success path.

Comment thread src/primitive_crypto.cc Outdated
Comment on lines +1039 to +1040
resource->resource_group()->unregister_resource(resource);
resource_proxy->clear_external_address();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a bit too eager.
For OOMs the primitive will do a GC and then try again. If you destroy the resource the second attempt wouldn't work anymore.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. unregister_resource and clear_external_address are now only called on the success path and on non-recoverable mbedtls errors. All ALLOCATION_FAILED paths return early without touching the resource, keeping it intact for GC retry.

@valientegaston
Copy link
Copy Markdown
Contributor Author

Still some subtle issues. Not completely sure what the best way for the resource handling is. Maybe one way is to keep track of whether the parallel thread is running and store that in the resource. When the resource gets the signal to destroy itself, it could then delay that action until the thread returns. Alternatively (if possible) killing the thread first, would also work. We do something slightly similar in the BLE code.

Implemented the first approach — tracking whether the background thread is running via thread_running_ and destroy_when_done_ flags, protected by the EventSource mutex. If the resource receives the signal to destroy itself while the thread is still active, destruction is deferred until on_thread_done is called. The BLE code was used as a reference for the general pattern.

Comment thread src/primitive_crypto.cc
int error_ = 0;
int bits() const { return bits_; }

bool thread_running_ = false;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these fields should not be public.

Comment thread src/primitive_crypto.cc
mbedtls_pk_free(&pk);
res->set_error(ret);
res->error_ = ret;
return (word)1; // Indicate done.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need that cast? If yes, use 'static_cast'.

Comment thread src/primitive_crypto.cc
ARGS(RsaGenerationResource, resource);
if (resource->error() != 0) {
int err = resource->error();
if (resource->error_ != 0) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't expose the private fields.

Comment thread src/primitive_crypto.cc
int err = resource->error_;
resource->resource_group()->unregister_resource(resource);
resource_proxy->clear_external_address();
return tls_error(null, process, err);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably check whether this is an oom error. Not sure that can actually happen, but if it can, then just returning the oom error here would not do what we want.
Instead, we should replace the oom error with something recognizable, and then do a retry-loop from the Toit code. (with a GC in between).
That said: getting an OOM from mbedtls is maybe extremely rare (since we pass in the buffers), so maybe OOMing at that point is ok. But that would still require a few changes. Remember: if a primitive returns OOM the system runs the GC and tries again (up to 3?) times. With the current code that would yield an error as we cleared the resource-proxy. -> we can't repeat the primitive call.

# CONFIG_TOIT_INTERPRETER_HELPERS_IN_IRAM is not set
CONFIG_TOIT_CRYPTO=y
# CONFIG_TOIT_CRYPTO_EXTRA is not set
CONFIG_TOIT_CRYPTO_EXTRA=y
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect a change in sdkconfig.defaults as well.

@floitsch
Copy link
Copy Markdown
Member

floitsch commented May 7, 2026

@CodeRabbit review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 7, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 7, 2026

Review Change Stack

Walkthrough

This PR adds asynchronous RSA key generation capabilities to the Toit language. RsaKey.generate() now attempts an async resource-based path using event-source threading and falls back to synchronous generation if unavailable. Updates include async platform support for ESP32, extended crypto error routing through Process*, new resource types and tags, and build-time enablement of the feature across all ESP32 variants via CONFIG_TOIT_CRYPTO_EXTRA and CONFIG_MBEDTLS_HARDWARE_MPI.

Changes

Async RSA Key Generation Feature

Layer / File(s) Summary
Resource Types & Tags
src/tags.h
New struct tags RsaGenerationResourceTag and RsaGenerationResourceGroupTag are added, and MbedTlsSocket is included in the NON_BLE_RESOURCE_CLASSES_DO macro.
Primitive Declarations & Argument Unpacking
src/primitive.h
MODULE_CRYPTO is extended with rsa_generate_init, rsa_generate_start, rsa_generate_finish, and rsa_generate_close primitives; argument-unpacking macros _A_T_RsaGenerationResourceGroup and _A_T_RsaGenerationResource are added.
Platform Threading & Async Support
src/event_sources/async_posix.h, src/event_sources/async_posix.cc, src/os_esp32.cc
Async event source headers and implementations are broadened to compile on both POSIX and ESP32 via TOIT_POSIX || TOIT_ESP32 guard; Thread::cancel() is implemented as a no-op on ESP32.
Crypto Implementation & Error Routing
src/primitive_crypto.cc
RNG callback rsa_rng is simplified; rsa_parse_key_from_blob is extended with Process* parameter for error routing via tls_error; async RSA generation subsystem is introduced under CONFIG_TOIT_CRYPTO_EXTRA with DER buffer allocation, event-threaded key generation, and safe GC-aware copying on finish. All RSA operations (sign, verify, encrypt, decrypt, DER export) are updated to pass Process* for error propagation.
Compiler Type Propagation
src/compiler/propagation/type_primitive_crypto.cc
Type primitive registrations for the rsa_generate_* family are added.
Toit Library API & Resource Management
lib/crypto/rsa.toit
ResourceState_ monitor is imported; RsaKey.generate() prefers async resource-based generation (init/start/wait/finish with cleanup) and falls back to sync when async is unavailable; new RsaGenerationResource_ class wraps the generation lifecycle and guarantees cleanup via finalizer.
Build Configuration
toolchains/esp32*/sdkconfig, toolchains/esp32*/sdkconfig.defaults
All five ESP32 targets (esp32, esp32c3, esp32c6, esp32s2, esp32s3) have CONFIG_TOIT_CRYPTO_EXTRA enabled and CONFIG_MBEDTLS_HARDWARE_MPI switched from disabled to enabled, with corresponding MPI interrupt settings configured where applicable.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • toitlang/toit#2984: Introduces foundational RSA support that this PR builds upon by adding asynchronous key-generation primitives, resource types, and updated crypto error handling for async operations.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 7.69% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title directly addresses the main problem solved by this PR: fixing the ESP32 RSA watchdog timeout during key generation by implementing async RSA operations.
Description check ✅ Passed The description clearly explains the problem, root cause, and the chosen solution with safety considerations, and is fully related to the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/tags.h`:
- Around line 145-147: RsaGenerationResourceTag and
RsaGenerationResourceGroupTag are declared outside the Resource/ResourceGroup
tag ranges so MAKE_UNPACKING_MACRO's strict checks (using ARGS(Resource, ...)
and ARGS(ResourceGroup, ...)) will reject RSA objects with WRONG_OBJECT_TYPE;
either confirm RSA tags must remain separate or move them inside the proper
ranges to be accepted by generic handlers. Fix by relocating the
RsaGenerationResourceTag and RsaGenerationResourceGroupTag declarations into the
ResourceMinTag..ResourceMaxTag and ResourceGroupMinTag..ResourceGroupMaxTag
ranges (or alternatively add explicit handling in MAKE_UNPACKING_MACRO for these
symbols), then run tests around primitive_events.cc usages (read_state,
register_monitor_notifier) to ensure RSA resources are accepted or intentionally
rejected.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4394fb89-3cfb-432c-8de4-692fe86ee959

📥 Commits

Reviewing files that changed from the base of the PR and between 6de7f07 and d5ab232.

📒 Files selected for processing (18)
  • lib/crypto/rsa.toit
  • src/compiler/propagation/type_primitive_crypto.cc
  • src/event_sources/async_posix.cc
  • src/event_sources/async_posix.h
  • src/os_esp32.cc
  • src/primitive.h
  • src/primitive_crypto.cc
  • src/tags.h
  • toolchains/esp32/sdkconfig
  • toolchains/esp32/sdkconfig.defaults
  • toolchains/esp32c3/sdkconfig
  • toolchains/esp32c3/sdkconfig.defaults
  • toolchains/esp32c6/sdkconfig
  • toolchains/esp32c6/sdkconfig.defaults
  • toolchains/esp32s2/sdkconfig
  • toolchains/esp32s2/sdkconfig.defaults
  • toolchains/esp32s3/sdkconfig
  • toolchains/esp32s3/sdkconfig.defaults

Comment thread src/primitive_crypto.cc
Comment on lines +1055 to +1074
PRIMITIVE(rsa_generate_finish) {
ARGS(RsaGenerationResource, resource);
if (resource->error_ != 0) {
int err = resource->error_;
resource->resource_group()->unregister_resource(resource);
resource_proxy->clear_external_address();
return tls_error(null, process, err);
}

// Attempt a best-effort final realloc in case the one in the generation
// thread failed due to heap fragmentation at that point in time.
// If this also fails, we harmlessly keep the original (larger) buffer.
if (resource->prv_buf_ != null) {
unsigned char* p = unvoid_cast<unsigned char*>(realloc(resource->prv_buf_, resource->prv_len_));
if (p != null) resource->prv_buf_ = p;
}
if (resource->pub_buf_ != null) {
unsigned char* p = unvoid_cast<unsigned char*>(realloc(resource->pub_buf_, resource->pub_len_));
if (p != null) resource->pub_buf_ = p;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard rsa_generate_finish against unfinished work.

If this primitive is called before the worker has completed, prv_len_/pub_len_ are still 0, so the best-effort realloc(..., 0) can free the buffers while the background thread is still writing into them. Please reject or block unfinished resources before touching error_ or the DER buffers.

Suggested fix
 PRIMITIVE(rsa_generate_finish) {
   ARGS(RsaGenerationResource, resource);
+  {
+    Locker locker(resource->resource_group()->event_source()->mutex());
+    if (resource->thread_running_) FAIL(INVALID_STATE);
+  }
+
   if (resource->error_ != 0) {
     int err = resource->error_;
     resource->resource_group()->unregister_resource(resource);
     resource_proxy->clear_external_address();
     return tls_error(null, process, err);

Comment thread src/tags.h
Comment on lines +145 to +147
ChannelTag,
RsaGenerationResourceTag,
RsaGenerationResourceGroupTag
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "1) Tag declarations and direct references"
rg -n "RsaGenerationResource(Tag|GroupTag)|ResourceMinTag|ResourceMaxTag|ResourceGroupMinTag|ResourceGroupMaxTag" src

echo
echo "2) Range-based resource checks that could exclude appended RSA tags"
rg -nP ">=\\s*ResourceMinTag.*<=\\s*ResourceMaxTag|>=\\s*ResourceGroupMinTag.*<=\\s*ResourceGroupMaxTag|is_resource|is_resource_group" src

echo
echo "3) RSA generation resource class/tag declarations"
rg -n "class\\s+RsaGenerationResource|class\\s+RsaGenerationResourceGroup|TAG\\(RsaGenerationResource\\)|TAG\\(RsaGenerationResourceGroup\\)" src

Repository: toitlang/toit

Length of output: 560


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== 1) Context around tag declarations ==="
sed -n '120,150p' src/tags.h

echo
echo "=== 2) Search for enum/tag validation functions ==="
rg -n "is_resource|is_resource_group|ResourceTag|resource_tag" src --type-list | head -20

echo
echo "=== 3) Usage of RsaGenerationResourceTag ==="
rg -n "RsaGenerationResourceTag" src -B 2 -A 2

echo
echo "=== 4) Resource type validation patterns ==="
rg -n "switch.*tag|case.*ResourceTag|case.*Resource" src | head -20

Repository: toitlang/toit

Length of output: 1209


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== 1) Search for resource type validation/checking ==="
rg -n "is_resource|is_resource_group|ResourceMinTag|ResourceMaxTag" src -A 2 -B 2

echo
echo "=== 2) Usage context for RsaGenerationResourceTag ==="
rg -n "RsaGenerationResourceTag\|RsaGenerationResourceGroupTag" src -B 3 -A 3

echo
echo "=== 3) Resource class definitions and lifecycles ==="
rg -n "class.*Resource\|RsaGenerationResource" src --type-list=cpp --type-list=h | head -30

Repository: toitlang/toit

Length of output: 754


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== 1) Search for RsaGenerationResource class (without Tag) ==="
rg -n "RsaGenerationResource" src -B 2 -A 5

echo
echo "=== 2) Files that import or reference tags.h ==="
rg -l "tags.h|#include.*tags" src | head -10

echo
echo "=== 3) Check what happens with Misc section tags ==="
rg -n "FontTag|ImageOutputStreamTag|ChannelTag" src | head -20

echo
echo "=== 4) Search for resource tag enum usage patterns ==="
rg -n "case.*Tag:|switch.*tag" src | head -30

Repository: toitlang/toit

Length of output: 8882


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== 1) Find TAG() macro definition ==="
rg -n "^#define TAG|^TAG(" src -B 2 -A 5 | head -50

echo
echo "=== 2) Find ARGS macro and how it validates types ==="
rg -n "^#define ARGS\|MAKE_UNPACKING_MACRO" src -B 2 -A 8 | head -80

echo
echo "=== 3) Search for any validation using ResourceMinTag/ResourceMaxTag ==="
rg -n "ResourceMinTag|ResourceMaxTag|ResourceGroupMinTag|ResourceGroupMaxTag" src -B 3 -A 3

echo
echo "=== 4) Check resource.h for type checking logic ==="
grep -n "class Resource\|class ResourceGroup\|tag\|Tag" src/resource.h | head -40

Repository: toitlang/toit

Length of output: 201


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== 1) Find TAG macro definition ==="
rg -n "#define TAG" src -A 5

echo
echo "=== 2) Type validation in Resource classes ==="
cat src/resource.h | head -100

echo
echo "=== 3) Search for where resource types are validated/checked ==="
rg -n "class_tag\|is_resource\|get_tag" src | head -30

echo
echo "=== 4) Check if any code switches on tag ranges ==="
rg -n "ResourceMinTag\|ResourceMaxTag" src -B 2 -A 2

Repository: toitlang/toit

Length of output: 5390


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== 1) Search for actual usage of tag_min/tag_max ==="
rg -n "tag_min\|tag_max" src -B 2 -A 2 | head -60

echo
echo "=== 2) Look for proxy validation in objects.h ==="
rg -n "is_resource\|class_tag\|get_tag" src/objects.h | head -40

echo
echo "=== 3) Check how MAKE_UNPACKING_MACRO validates types ==="
rg -n "MAKE_UNPACKING_MACRO" src/primitive.h -B 5 -A 15 | head -80

Repository: toitlang/toit

Length of output: 111


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== 1) Find MAKE_UNPACKING_MACRO definition ==="
rg -n "MAKE_UNPACKING_MACRO" src/primitive.h -B 3 -A 20 | head -100

echo
echo "=== 2) Search for class_tag checks in objects.h ==="
rg -n "class_tag\|get_tag" src/objects.h -B 2 -A 2 | head -60

echo
echo "=== 3) Find proxy validation (how ARGS validates) ==="
rg -n "PROXY_CLASS\|proxy->class_tag" src -B 2 -A 2 | head -50

Repository: toitlang/toit

Length of output: 9191


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== 1) Find all ARGS(Resource, usages ==="
rg -n "ARGS.*Resource" src -B 2 -A 2 | grep -E "ARGS\(Resource|RsaGeneration"

echo
echo "=== 2) Check if Resource base class is used generically ==="
rg -n "Resource\*|Resource::tag" src | head -30

echo
echo "=== 3) Verify tag value assignments in tags.h ==="
sed -n '120,150p' src/tags.h

Repository: toitlang/toit

Length of output: 4102


🏁 Script executed:

#!/bin/bash
set -euo pipefixo

echo "=== 1) Check primitive_events.cc line 28 context ==="
sed -n '20,40p' src/primitive_events.cc

echo
echo "=== 2) Find all Resource base class usage in lifecycle ==="
rg -n "register_resource\|unregister_resource" src | head -20

echo
echo "=== 3) Check if RsaGenerationResource registers with base Resource handling ==="
rg -n "register_resource.*Resource\|resource_group\(\)" src/primitive_crypto.cc -B 2 -A 2

echo
echo "=== 4) Verify Resource class tag range ==="
rg -n "TAGS(Resource)" src/tags.h -B 5 -A 5

Repository: toitlang/toit

Length of output: 111


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== 1) Check primitive_events.cc line 28 context ==="
sed -n '20,40p' src/primitive_events.cc

echo
echo "=== 2) Find all Resource base class usage in lifecycle ==="
rg -n "register_resource\|unregister_resource" src | head -20

echo
echo "=== 3) Check if RsaGenerationResource registers with base Resource handling ==="
rg -n "register_resource.*Resource\|resource_group\(\)" src/primitive_crypto.cc -B 2 -A 2

echo
echo "=== 4) Verify Resource class tag range ==="
rg -n "TAGS.Resource" src/tags.h -B 5 -A 5

Repository: toitlang/toit

Length of output: 714


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== 1) Check if RsaGenerationResource goes through events/monitoring ==="
rg -n "read_state\|register_monitor\|event_source\|on_event" src/primitive_crypto.cc -B 2 -A 2

echo
echo "=== 2) Look at RsaGenerationResourceGroup base class ==="
sed -n '898,920p' src/primitive_crypto.cc

echo
echo "=== 3) Check what ResourceGroup::event_source returns ==="
rg -n "class.*ResourceGroup\|event_source\(\)" src/resource.h -B 2 -A 5 | head -60

echo
echo "=== 4) Verify actual tag enum values ==="
python3 << 'EOF'
# Count tag positions to understand if RsaGenerationResourceTag is truly outside ResourceMaxTag
tags = [
    "RawByteTag",
    "NullStructTag",
    "MappedFileTag",
    "ResourceMinTag",
    # NON_BLE expanded (need to check)
    "BleResourceMinTag",
    "BleCallbackResourceMinTag",
    # BLE expanded
    "BleReadWriteElementMinTag",
    # BLE_READ_WRITE expanded
    "BleReadWriteElementMaxTag",
    "BleCallbackResourceMaxTag",
    "BleResourceMaxTag",
    "ResourceMaxTag",
    "ResourceGroupMinTag",
    # RESOURCE_GROUP expanded
    "ResourceGroupMaxTag",
    "FontTag",
    "ImageOutputStreamTag",
    "ChannelTag",
    "RsaGenerationResourceTag",
    "RsaGenerationResourceGroupTag"
]
for i, tag in enumerate(tags):
    print(f"{i}: {tag}")
EOF

Repository: toitlang/toit

Length of output: 130


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== 1) Look at ResourceGroup implementation ==="
sed -n '100,150p' src/resource.h

echo
echo "=== 2) Check RsaGenerationResourceGroup constructor and event source ==="
sed -n '898,910p' src/primitive_crypto.cc

echo
echo "=== 3) Search for event monitoring of RSA resources ==="
rg -n "RsaGenerationResource\|event_source" src/primitive_crypto.cc | head -30

echo
echo "=== 4) Direct check: Does RsaGenerationResourceGroup have an event source? ==="
rg -n "EventSource\|event_source\|on_event" src/primitive_crypto.cc -B 1 -A 1

Repository: toitlang/toit

Length of output: 2083


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== 1) Find RsaGenerationEventSource ==="
rg -n "RsaGenerationEventSource" src -B 3 -A 10

echo
echo "=== 2) Check if ResourceGroup::register_resource is called with RsaGenerationResource ==="
rg -n "register_resource" src/primitive_crypto.cc -B 2 -A 2

echo
echo "=== 3) Look for event dispatch on RSA resources ==="
rg -n "dispatch\|notify.*resource" src -B 2 -A 2 | grep -A 4 -B 4 "Rsa\|crypto"

echo
echo "=== 4) See the actual tag values ==="
sed -n '120,160p' src/tags.h | cat -n

Repository: toitlang/toit

Length of output: 5764


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== 1) Check tag range ordering in enum ==="
python3 << 'EOF'
# Trace through the enum to understand actual numeric values
lines = """  RawByteTag = 0,
  NullStructTag = 1,
  MappedFileTag = 2,

  // Resource subclasses.
  ResourceMinTag,
  NON_BLE_RESOURCE_CLASSES_DO(MAKE_ENUM)
  BleResourceMinTag,
  BleCallbackResourceMinTag,
  BLE_CLASSES_DO(MAKE_ENUM)
  BleReadWriteElementMinTag,
  BLE_READ_WRITE_CLASSES_DO(MAKE_ENUM)
  BleReadWriteElementMaxTag,
  BleCallbackResourceMaxTag,
  BleResourceMaxTag,
  ResourceMaxTag,

  // ResourceGroup subclasses.
  ResourceGroupMinTag,
  RESOURCE_GROUP_CLASSES_DO(MAKE_ENUM)
  ResourceGroupMaxTag,

  // Misc.
  FontTag,
  ImageOutputStreamTag,
  ChannelTag,
  RsaGenerationResourceTag,
  RsaGenerationResourceGroupTag"""

print("Key observation:")
print("- ResourceMaxTag marks end of Resource subtypes")
print("- ResourceGroupMaxTag marks end of ResourceGroup subtypes")
print("- RsaGenerationResourceTag is in Misc section AFTER both ranges")
print("\nIf Resource::tag_max = ResourceMaxTag, then:")
print("  RsaGenerationResourceTag > ResourceMaxTag")
print("  → ARGS(Resource, rsa_res) would reject RsaGenerationResource instances")
EOF

echo
echo "=== 2) See if there are any generic Resource usages that could hit RSA ==="
rg -n "read_state\|register_monitor" src -B 1 -A 5

Repository: toitlang/toit

Length of output: 527


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== 1) Confirm Resource base class tag validation mechanism ==="
sed -n '1061,1068p' src/primitive.h

echo
echo "=== 2) Check if there are any other misc-section resources and how they're handled ==="
rg -n "FontTag\|ImageOutputStreamTag\|ChannelTag" src -B 2 -A 2 | head -50

echo
echo "=== 3) Search for how these misc types bypass resource validation ==="
rg -n "Font\|ImageOutputStream\|Channel" src/primitive.h | grep "_A_T_" 

echo
echo "=== 4) Final check: any code path where Resource base validation might filter RSA ==="
rg -n "is_byte_array\|has_external_address\|external_tag" src -B 1 -A 1 | grep -v "third_party"

Repository: toitlang/toit

Length of output: 745


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== 1) Check how Font, ImageOutputStream, Channel are handled ==="
rg -n "_A_T_Font\|_A_T_ImageOutputStream\|_A_T_Channel" src/primitive.h -B 1 -A 1

echo
echo "=== 2) Verify if Font/ImageOutputStream/Channel are also in Misc section ==="
rg -n "FontTag\|ImageOutputStreamTag\|ChannelTag" src/tags.h -B 3 -A 3

echo
echo "=== 3) Check if these misc types have unpacking macros (type validation) ==="
rg -n "_A_T_Font\|_A_T_ImageOutputStream\|_A_T_Channel\|_A_T_RsaGenerationResource" src/primitive.h

echo
echo "=== 4) Summary: Are RSA tags truly outside Resource range validation? ==="
python3 << 'EOF'
print("VALIDATION LOGIC:")
print("================")
print()
print("1. Resource class:")
print("   TAG macro: TAGS(Resource)")
print("   → tag_min = ResourceMinTag")
print("   → tag_max = ResourceMaxTag")
print()
print("2. RsaGenerationResource class:")
print("   TAG macro: TAG(RsaGenerationResource)")
print("   → tag_min = RsaGenerationResourceTag")
print("   → tag_max = RsaGenerationResourceTag")
print()
print("3. Enum ordering (from tags.h):")
print("   - ResourceMaxTag is the LAST Resource subtype")
print("   - RsaGenerationResourceTag is in Misc section AFTER")
print("   → RsaGenerationResourceTag > ResourceMaxTag numerically")
print()
print("4. Type checking at runtime:")
print("   ARGS(Resource, obj) validates: ResourceMinTag <= obj.tag <= ResourceMaxTag")
print("   → RsaGenerationResourceTag FAILS this check")
print()
print("5. Consequence:")
print("   If RsaGenerationResource passed to any generic Resource handler")
print("   (e.g., read_state, register_monitor_notifier from primitive_events.cc)")
print("   → WRONG_OBJECT_TYPE error")
print()
print("VERIFICATION RESULT: The review comment's concern is VALID.")
print("The tags ARE outside the ranges, and there ARE type-check paths.")
EOF

Repository: toitlang/toit

Length of output: 123


Verify RSA generation tags are covered by resource-type checks.

RsaGenerationResourceTag and RsaGenerationResourceGroupTag are appended in the misc block, outside ResourceMinTag..ResourceMaxTag and ResourceGroupMinTag..ResourceGroupMaxTag. The tag validation in MAKE_UNPACKING_MACRO enforces strict range checking: any code using ARGS(Resource, ...) or ARGS(ResourceGroup, ...) will reject RSA instances with WRONG_OBJECT_TYPE. This includes event system primitives like read_state and register_monitor_notifier (primitive_events.cc:28, 35). While RSA resources are currently isolated to crypto-specific primitives, the type system doesn't prevent them from being passed to generic handlers. Confirm this separation is intentional or move the RSA tags inside the appropriate ranges.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/tags.h` around lines 145 - 147, RsaGenerationResourceTag and
RsaGenerationResourceGroupTag are declared outside the Resource/ResourceGroup
tag ranges so MAKE_UNPACKING_MACRO's strict checks (using ARGS(Resource, ...)
and ARGS(ResourceGroup, ...)) will reject RSA objects with WRONG_OBJECT_TYPE;
either confirm RSA tags must remain separate or move them inside the proper
ranges to be accepted by generic handlers. Fix by relocating the
RsaGenerationResourceTag and RsaGenerationResourceGroupTag declarations into the
ResourceMinTag..ResourceMaxTag and ResourceGroupMinTag..ResourceGroupMaxTag
ranges (or alternatively add explicit handling in MAKE_UNPACKING_MACRO for these
symbols), then run tests around primitive_events.cc usages (read_state,
register_monitor_notifier) to ensure RSA resources are accepted or intentionally
rejected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants