Fix ESP32 RSA watchdog timeout during key generation#3018
Fix ESP32 RSA watchdog timeout during key generation#3018valientegaston wants to merge 8 commits into
Conversation
|
I also noticed that CONFIG_MBEDTLS_HARDWARE_MPI is explicitly set to n across all ESP32 toolchain configurations (esp32, esp32s2, esp32s3, esp32c3, esp32c6). Was this intentional? I understand it wouldn't help with key generation (as the prime generation and Miller-Rabin tests run in software regardless), but it could potentially speed up everyday RSA operations like signing, verification, and encryption/decryption. |
|
Primitives shouldn't be blocking. We want to run other tasks when waiting for a primitive. As such, the scheduler revealed a real issue.
I'm guessing we set the MBEDTLS_HARDWARE_MPI to 'n' to reduce the image size. We can definitely enable it again. |
… MPI support across ESP32 targets.
|
Hi! I have implemented the RSA key generation following your suggestions: Non-blocking: The primitives are now non-blocking. The key generation runs in a separate thread, and the Toit Task is suspended while waiting for the resource event. RsaGeneration Resource: I created a new RsaGeneration resource (contained in StructTag as RsaGenerationResourceTag and RsaGenerationResourceGroupTag) to manage the lifecycle of the asynchronous task. Architecture: I followed the approach in async_posix.cc, using the AsyncEventThread to coordinate the background generation and the completion event. Flags & Configuration: The new resource and primitives are correctly guarded by the CONFIG_TOIT_CRYPTO_EXTRA sdkconfig flag as you proposed. I've also updated the sdkconfig.defaults for ESP32 targets to enable this by default. Binary Compatibility: I made sure to append the new tags at the end of the StructTag enum to avoid any issues with existing bootstrap snapshots. Please let me know if there's any part of the implementation you'd like me to refine further. Thanks! |
- Add RsaGenerationResource_ class with finalizer/close pattern to ensure proper cleanup of resource group and state on exceptions (e.g. with-timeout). - Add rsa-generate-close primitive to tear down the resource group. - Fix dangling proxy in rsa_generate_finish error paths by adding missing clear_external_address calls. - Add no-op Thread::cancel() for ESP32 (required since async_posix is now compiled for ESP32).
|
Opened a fix PR on your fork: valientegaston#1 It addresses a few resource cleanup issues:
|
Fix RSA resource cleanup and ESP32 Thread::cancel
| free(prv); free(pub); | ||
| } else { | ||
| // mbedtls writes from the end. Move to start. | ||
| unsigned char* prv_start = (unsigned char*)malloc(prv_len); |
There was a problem hiding this comment.
If we don't have enough memory here, we would have to throw away everything we did so far and start again.
Also: I'm not even sure things would behave correctly: By marking the error as MBEDTLS_ERR_PK_ALLOC_FAILED the rsa_generate_finish would return a "malloc failed". This would indicate to the system that the primitive ran out of memory and that we should do a GC, then try again. However, the rsa_generate_finish would just fail now as the resource_proxy just has been cleared.
-> We need to allocate the memory in the main-thread (that has called the primitive) and not inside the event-thread. Unfortunately, this means that we need to know the prv_len und pub_len already before we actually call mbedtls_pk_write_key_der.
That said: it looks to me like a resize is the better option anyway. The prv_len/pub_len just seem to say how much of the given buffer was written too.
There was a problem hiding this comment.
Replaced the two extra malloc calls with memmove on the same buffer to shift the data to the start, followed by realloc to shrink it to the exact size. If realloc fails it returns null without altering the original block, so the original pointer is preserved and passed to set_results correctly — no memory leak or dangling pointer.
There was a problem hiding this comment.
The same still applies to the 'malloc' in line 986/987. (the ret goes to the rsa_generate_finish, leading to an unrecoverable OOM). The mallocs need to be done in the main thread. Here we just use constants (RSA_PRV_DER_MAX_BYTES), so that's relatively easy to fix.
| bool is_locked() const { return OS::is_locked(mutex_); } | ||
| bool is_boot_process(Process* process) const { return boot_process_ == process; } | ||
|
|
||
| Mutex* mutex() const { return mutex_; } |
There was a problem hiding this comment.
Necessary?
If yes, we should try to find a better approach than to make the scheduler's mutex public.
There was a problem hiding this comment.
No longer necessary. In the previous approach, the rsa_rng callback needed to take the scheduler mutex to update the process run_timestamp. With the new async design, the Toit process is suspended during key generation, so there's no need to update the timestamp from the RNG callback at all. The mutex() accessor has been removed.
| CONFIG_MBEDTLS_X509_TRUSTED_CERT_CALLBACK=y | ||
| CONFIG_MBEDTLS_ECP_RESTARTABLE=y | ||
| CONFIG_MBEDTLS_HARDWARE_MPI=n | ||
| CONFIG_MBEDTLS_HARDWARE_MPI=y |
There was a problem hiding this comment.
Is there any downside to enabling MPI?
For some esp32 variants we were fighting the partition sizes already. So if this increases the image size, we need to make sure we aren't too big now.
There was a problem hiding this comment.
I measured the binary size impact on ESP32:
- Without
HARDWARE_MPI: 0x1561f0 bytes (18% free of 0x1a0000 partition) - With
HARDWARE_MPI: 0x156390 bytes (18% free of 0x1a0000 partition)
The difference is only ~416 bytes.
I only tested on ESP32 so far — still checking the other variants.
There was a problem hiding this comment.
ESP32S3:
Without HARDWARE_MPI: 0x14d5c0 bytes (20% free)
With HARDWARE_MPI: 0x14d840 bytes (20% free)
Difference: ~640 bytes
Still checking the remaining variants (esp32c3, esp32c6, esp32s2).
There was a problem hiding this comment.
ESP32S2:
Without HARDWARE_MPI: 0x116a50 bytes (33% free)
With HARDWARE_MPI: 0x116ce0 bytes (33% free)
Difference: ~656 bytes
There was a problem hiding this comment.
ESP32C3:
Without HARDWARE_MPI: 0x163e00 bytes (14% free)
With HARDWARE_MPI: 0x164560 bytes (14% free)
Difference: ~1888 bytes
ESP32C6:
Without HARDWARE_MPI: 0x1818c0 bytes (11% free)
With HARDWARE_MPI: 0x182020 bytes (11% free)
Difference: ~1888 bytes
|
@valientegaston do you think you can address the feedback? |
|
Hi! Yes, I'll be addressing the feedback in the coming days. We've been quite busy lately but I'll make sure to go through all the comments and make the necessary changes. Thanks for your patience! |
…d remove unused scheduler mutex accessor
floitsch
left a comment
There was a problem hiding this comment.
Still some subtle issues.
Not completely sure what the best way for the resource handling is.
Maybe one way is to keep track of whether the parallel thread is running and store that in the resource. When the resource gets the signal to destroy itself, it could then delay that action until the thread returns. Alternatively (if possible) killing the thread first, would also work.
We do something slightly similar in the BLE code.
| } | ||
|
|
||
| if (ret == 0) { | ||
| unsigned char* prv = (unsigned char*)malloc(RSA_PRV_DER_MAX_BYTES); |
There was a problem hiding this comment.
When do we free the memory again? Shouldn't that happen in rsa_generate_finish?
But that's more complicated:
- what if nothing ever calls 'rsa_generate_finish'? This means that the memory needs to be attached to the resource.
- However, that immediately reveals another race condition (which was already present): what if the program stops while we are running in parallel. The
resresource could already be freed at that point.
There was a problem hiding this comment.
Good catch on both points. Here's how they've been addressed:
Buffer ownership: prv_buf_ and pub_buf_ are now members of RsaGenerationResource, allocated in the main thread before launching the background thread. The resource destructor always frees them, so memory is never leaked even if rsa_generate_finish is never called.
Race condition: Addressed with a thread_running_ / destroy_when_done_ pattern protected by the EventSource mutex. If the resource is destroyed while the thread is still running, delete_or_mark_for_deletion sets destroy_when_done_ = true instead of freeing immediately. The actual delete this happens in on_thread_done once the thread finishes — and crucially, after releasing the mutex, to avoid use-after-free on the Locker destructor itself.
| free(prv); free(pub); | ||
| } else { | ||
| // mbedtls writes from the end. Move to start. | ||
| unsigned char* prv_start = (unsigned char*)malloc(prv_len); |
There was a problem hiding this comment.
The same still applies to the 'malloc' in line 986/987. (the ret goes to the rsa_generate_finish, leading to an unrecoverable OOM). The mallocs need to be done in the main thread. Here we just use constants (RSA_PRV_DER_MAX_BYTES), so that's relatively easy to fix.
| memmove(prv, prv + RSA_PRV_DER_MAX_BYTES - prv_len, prv_len); | ||
| memmove(pub, pub + RSA_PUB_DER_MAX_BYTES - pub_len, pub_len); | ||
|
|
||
| unsigned char* prv_resized = (unsigned char*)realloc(prv, prv_len); |
There was a problem hiding this comment.
Maybe add the following information:
RSA_PRV_DER_MAX_BYTES is typically ~550 bytes.
Typically, keys only need ~300 bytes.
As such the realloc is useful.
However, if it fails, it's not the end of the world either.
There was a problem hiding this comment.
Done. Added the following comment above the realloc calls:
// RSA_PRV_DER_MAX_BYTES is typically ~550 bytes, but generated keys
// usually only need ~300 bytes. The realloc shrinks the buffer to the
// actual size to recover the unused memory. If realloc fails, it returns
// null without freeing the original block, so the original pointer is
// preserved and no memory is leaked — it just means we keep the larger
// buffer, which is acceptable.
A second best-effort realloc is also attempted in rsa_generate_finish in case the one in the generation thread failed due to heap fragmentation at that point in time.
| } | ||
|
|
||
| if (ret == 0) { | ||
| unsigned char* prv = (unsigned char*)malloc(RSA_PRV_DER_MAX_BYTES); |
There was a problem hiding this comment.
Use unvoid_cast instead of C-style casts. Here and in all the other places.
There was a problem hiding this comment.
Replaced all C-style casts in the new and modified code with unvoid_cast. Resource type conversions use static_cast as is the convention in the rest of the codebase.
| } | ||
|
|
||
| ByteArray* prv_der = process->allocate_byte_array(resource->prv_len()); | ||
| ByteArray* pub_der = process->allocate_byte_array(resource->pub_len()); |
There was a problem hiding this comment.
Another option is to use an external ByteArray and use the existing buffers. That would decrease the peak amount of memory.
I think for 250+ bytes it's ok to have an external byte-array.
However, at that point I would do another realloc, just in case the one in the other thread didn't succeed. If it still doesn't succeed, you can do a MALLOC_FAILED.
There was a problem hiding this comment.
This approach was explored but ran into a safety issue with GC retries: if prv_der allocation succeeds but pub_der fails, the runtime does GC and retries the primitive — but prv_buf_ ownership has already been transferred to the external ByteArray, leaving the resource in an inconsistent state where the private key buffer is gone and cannot be recovered on retry.
Given this complexity and the fact that RSA key generation is a one-time operation, the allocate_byte_array + memcpy pattern was kept instead. A best-effort realloc is attempted in rsa_generate_finish before copying to reduce the buffer sizes as much as possible before the peak occurs. The original buffers are freed immediately after copying on the success path.
| resource->resource_group()->unregister_resource(resource); | ||
| resource_proxy->clear_external_address(); |
There was a problem hiding this comment.
That's a bit too eager.
For OOMs the primitive will do a GC and then try again. If you destroy the resource the second attempt wouldn't work anymore.
There was a problem hiding this comment.
Fixed. unregister_resource and clear_external_address are now only called on the success path and on non-recoverable mbedtls errors. All ALLOCATION_FAILED paths return early without touching the resource, keeping it intact for GC retry.
Implemented the first approach — tracking whether the background thread is running via thread_running_ and destroy_when_done_ flags, protected by the EventSource mutex. If the resource receives the signal to destroy itself while the thread is still active, destruction is deferred until on_thread_done is called. The BLE code was used as a reference for the general pattern. |
| int error_ = 0; | ||
| int bits() const { return bits_; } | ||
|
|
||
| bool thread_running_ = false; |
There was a problem hiding this comment.
these fields should not be public.
| mbedtls_pk_free(&pk); | ||
| res->set_error(ret); | ||
| res->error_ = ret; | ||
| return (word)1; // Indicate done. |
There was a problem hiding this comment.
Do we really need that cast? If yes, use 'static_cast'.
| ARGS(RsaGenerationResource, resource); | ||
| if (resource->error() != 0) { | ||
| int err = resource->error(); | ||
| if (resource->error_ != 0) { |
There was a problem hiding this comment.
don't expose the private fields.
| int err = resource->error_; | ||
| resource->resource_group()->unregister_resource(resource); | ||
| resource_proxy->clear_external_address(); | ||
| return tls_error(null, process, err); |
There was a problem hiding this comment.
we should probably check whether this is an oom error. Not sure that can actually happen, but if it can, then just returning the oom error here would not do what we want.
Instead, we should replace the oom error with something recognizable, and then do a retry-loop from the Toit code. (with a GC in between).
That said: getting an OOM from mbedtls is maybe extremely rare (since we pass in the buffers), so maybe OOMing at that point is ok. But that would still require a few changes. Remember: if a primitive returns OOM the system runs the GC and tries again (up to 3?) times. With the current code that would yield an error as we cleared the resource-proxy. -> we can't repeat the primitive call.
| # CONFIG_TOIT_INTERPRETER_HELPERS_IN_IRAM is not set | ||
| CONFIG_TOIT_CRYPTO=y | ||
| # CONFIG_TOIT_CRYPTO_EXTRA is not set | ||
| CONFIG_TOIT_CRYPTO_EXTRA=y |
There was a problem hiding this comment.
I would expect a change in sdkconfig.defaults as well.
|
@CodeRabbit review |
✅ Actions performedReview triggered.
|
WalkthroughThis PR adds asynchronous RSA key generation capabilities to the Toit language. ChangesAsync RSA Key Generation Feature
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/tags.h`:
- Around line 145-147: RsaGenerationResourceTag and
RsaGenerationResourceGroupTag are declared outside the Resource/ResourceGroup
tag ranges so MAKE_UNPACKING_MACRO's strict checks (using ARGS(Resource, ...)
and ARGS(ResourceGroup, ...)) will reject RSA objects with WRONG_OBJECT_TYPE;
either confirm RSA tags must remain separate or move them inside the proper
ranges to be accepted by generic handlers. Fix by relocating the
RsaGenerationResourceTag and RsaGenerationResourceGroupTag declarations into the
ResourceMinTag..ResourceMaxTag and ResourceGroupMinTag..ResourceGroupMaxTag
ranges (or alternatively add explicit handling in MAKE_UNPACKING_MACRO for these
symbols), then run tests around primitive_events.cc usages (read_state,
register_monitor_notifier) to ensure RSA resources are accepted or intentionally
rejected.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4394fb89-3cfb-432c-8de4-692fe86ee959
📒 Files selected for processing (18)
lib/crypto/rsa.toitsrc/compiler/propagation/type_primitive_crypto.ccsrc/event_sources/async_posix.ccsrc/event_sources/async_posix.hsrc/os_esp32.ccsrc/primitive.hsrc/primitive_crypto.ccsrc/tags.htoolchains/esp32/sdkconfigtoolchains/esp32/sdkconfig.defaultstoolchains/esp32c3/sdkconfigtoolchains/esp32c3/sdkconfig.defaultstoolchains/esp32c6/sdkconfigtoolchains/esp32c6/sdkconfig.defaultstoolchains/esp32s2/sdkconfigtoolchains/esp32s2/sdkconfig.defaultstoolchains/esp32s3/sdkconfigtoolchains/esp32s3/sdkconfig.defaults
| PRIMITIVE(rsa_generate_finish) { | ||
| ARGS(RsaGenerationResource, resource); | ||
| if (resource->error_ != 0) { | ||
| int err = resource->error_; | ||
| resource->resource_group()->unregister_resource(resource); | ||
| resource_proxy->clear_external_address(); | ||
| return tls_error(null, process, err); | ||
| } | ||
|
|
||
| // Attempt a best-effort final realloc in case the one in the generation | ||
| // thread failed due to heap fragmentation at that point in time. | ||
| // If this also fails, we harmlessly keep the original (larger) buffer. | ||
| if (resource->prv_buf_ != null) { | ||
| unsigned char* p = unvoid_cast<unsigned char*>(realloc(resource->prv_buf_, resource->prv_len_)); | ||
| if (p != null) resource->prv_buf_ = p; | ||
| } | ||
| if (resource->pub_buf_ != null) { | ||
| unsigned char* p = unvoid_cast<unsigned char*>(realloc(resource->pub_buf_, resource->pub_len_)); | ||
| if (p != null) resource->pub_buf_ = p; | ||
| } |
There was a problem hiding this comment.
Guard rsa_generate_finish against unfinished work.
If this primitive is called before the worker has completed, prv_len_/pub_len_ are still 0, so the best-effort realloc(..., 0) can free the buffers while the background thread is still writing into them. Please reject or block unfinished resources before touching error_ or the DER buffers.
Suggested fix
PRIMITIVE(rsa_generate_finish) {
ARGS(RsaGenerationResource, resource);
+ {
+ Locker locker(resource->resource_group()->event_source()->mutex());
+ if (resource->thread_running_) FAIL(INVALID_STATE);
+ }
+
if (resource->error_ != 0) {
int err = resource->error_;
resource->resource_group()->unregister_resource(resource);
resource_proxy->clear_external_address();
return tls_error(null, process, err);| ChannelTag, | ||
| RsaGenerationResourceTag, | ||
| RsaGenerationResourceGroupTag |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "1) Tag declarations and direct references"
rg -n "RsaGenerationResource(Tag|GroupTag)|ResourceMinTag|ResourceMaxTag|ResourceGroupMinTag|ResourceGroupMaxTag" src
echo
echo "2) Range-based resource checks that could exclude appended RSA tags"
rg -nP ">=\\s*ResourceMinTag.*<=\\s*ResourceMaxTag|>=\\s*ResourceGroupMinTag.*<=\\s*ResourceGroupMaxTag|is_resource|is_resource_group" src
echo
echo "3) RSA generation resource class/tag declarations"
rg -n "class\\s+RsaGenerationResource|class\\s+RsaGenerationResourceGroup|TAG\\(RsaGenerationResource\\)|TAG\\(RsaGenerationResourceGroup\\)" srcRepository: toitlang/toit
Length of output: 560
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Context around tag declarations ==="
sed -n '120,150p' src/tags.h
echo
echo "=== 2) Search for enum/tag validation functions ==="
rg -n "is_resource|is_resource_group|ResourceTag|resource_tag" src --type-list | head -20
echo
echo "=== 3) Usage of RsaGenerationResourceTag ==="
rg -n "RsaGenerationResourceTag" src -B 2 -A 2
echo
echo "=== 4) Resource type validation patterns ==="
rg -n "switch.*tag|case.*ResourceTag|case.*Resource" src | head -20Repository: toitlang/toit
Length of output: 1209
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Search for resource type validation/checking ==="
rg -n "is_resource|is_resource_group|ResourceMinTag|ResourceMaxTag" src -A 2 -B 2
echo
echo "=== 2) Usage context for RsaGenerationResourceTag ==="
rg -n "RsaGenerationResourceTag\|RsaGenerationResourceGroupTag" src -B 3 -A 3
echo
echo "=== 3) Resource class definitions and lifecycles ==="
rg -n "class.*Resource\|RsaGenerationResource" src --type-list=cpp --type-list=h | head -30Repository: toitlang/toit
Length of output: 754
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Search for RsaGenerationResource class (without Tag) ==="
rg -n "RsaGenerationResource" src -B 2 -A 5
echo
echo "=== 2) Files that import or reference tags.h ==="
rg -l "tags.h|#include.*tags" src | head -10
echo
echo "=== 3) Check what happens with Misc section tags ==="
rg -n "FontTag|ImageOutputStreamTag|ChannelTag" src | head -20
echo
echo "=== 4) Search for resource tag enum usage patterns ==="
rg -n "case.*Tag:|switch.*tag" src | head -30Repository: toitlang/toit
Length of output: 8882
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Find TAG() macro definition ==="
rg -n "^#define TAG|^TAG(" src -B 2 -A 5 | head -50
echo
echo "=== 2) Find ARGS macro and how it validates types ==="
rg -n "^#define ARGS\|MAKE_UNPACKING_MACRO" src -B 2 -A 8 | head -80
echo
echo "=== 3) Search for any validation using ResourceMinTag/ResourceMaxTag ==="
rg -n "ResourceMinTag|ResourceMaxTag|ResourceGroupMinTag|ResourceGroupMaxTag" src -B 3 -A 3
echo
echo "=== 4) Check resource.h for type checking logic ==="
grep -n "class Resource\|class ResourceGroup\|tag\|Tag" src/resource.h | head -40Repository: toitlang/toit
Length of output: 201
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Find TAG macro definition ==="
rg -n "#define TAG" src -A 5
echo
echo "=== 2) Type validation in Resource classes ==="
cat src/resource.h | head -100
echo
echo "=== 3) Search for where resource types are validated/checked ==="
rg -n "class_tag\|is_resource\|get_tag" src | head -30
echo
echo "=== 4) Check if any code switches on tag ranges ==="
rg -n "ResourceMinTag\|ResourceMaxTag" src -B 2 -A 2Repository: toitlang/toit
Length of output: 5390
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Search for actual usage of tag_min/tag_max ==="
rg -n "tag_min\|tag_max" src -B 2 -A 2 | head -60
echo
echo "=== 2) Look for proxy validation in objects.h ==="
rg -n "is_resource\|class_tag\|get_tag" src/objects.h | head -40
echo
echo "=== 3) Check how MAKE_UNPACKING_MACRO validates types ==="
rg -n "MAKE_UNPACKING_MACRO" src/primitive.h -B 5 -A 15 | head -80Repository: toitlang/toit
Length of output: 111
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Find MAKE_UNPACKING_MACRO definition ==="
rg -n "MAKE_UNPACKING_MACRO" src/primitive.h -B 3 -A 20 | head -100
echo
echo "=== 2) Search for class_tag checks in objects.h ==="
rg -n "class_tag\|get_tag" src/objects.h -B 2 -A 2 | head -60
echo
echo "=== 3) Find proxy validation (how ARGS validates) ==="
rg -n "PROXY_CLASS\|proxy->class_tag" src -B 2 -A 2 | head -50Repository: toitlang/toit
Length of output: 9191
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Find all ARGS(Resource, usages ==="
rg -n "ARGS.*Resource" src -B 2 -A 2 | grep -E "ARGS\(Resource|RsaGeneration"
echo
echo "=== 2) Check if Resource base class is used generically ==="
rg -n "Resource\*|Resource::tag" src | head -30
echo
echo "=== 3) Verify tag value assignments in tags.h ==="
sed -n '120,150p' src/tags.hRepository: toitlang/toit
Length of output: 4102
🏁 Script executed:
#!/bin/bash
set -euo pipefixo
echo "=== 1) Check primitive_events.cc line 28 context ==="
sed -n '20,40p' src/primitive_events.cc
echo
echo "=== 2) Find all Resource base class usage in lifecycle ==="
rg -n "register_resource\|unregister_resource" src | head -20
echo
echo "=== 3) Check if RsaGenerationResource registers with base Resource handling ==="
rg -n "register_resource.*Resource\|resource_group\(\)" src/primitive_crypto.cc -B 2 -A 2
echo
echo "=== 4) Verify Resource class tag range ==="
rg -n "TAGS(Resource)" src/tags.h -B 5 -A 5Repository: toitlang/toit
Length of output: 111
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Check primitive_events.cc line 28 context ==="
sed -n '20,40p' src/primitive_events.cc
echo
echo "=== 2) Find all Resource base class usage in lifecycle ==="
rg -n "register_resource\|unregister_resource" src | head -20
echo
echo "=== 3) Check if RsaGenerationResource registers with base Resource handling ==="
rg -n "register_resource.*Resource\|resource_group\(\)" src/primitive_crypto.cc -B 2 -A 2
echo
echo "=== 4) Verify Resource class tag range ==="
rg -n "TAGS.Resource" src/tags.h -B 5 -A 5Repository: toitlang/toit
Length of output: 714
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Check if RsaGenerationResource goes through events/monitoring ==="
rg -n "read_state\|register_monitor\|event_source\|on_event" src/primitive_crypto.cc -B 2 -A 2
echo
echo "=== 2) Look at RsaGenerationResourceGroup base class ==="
sed -n '898,920p' src/primitive_crypto.cc
echo
echo "=== 3) Check what ResourceGroup::event_source returns ==="
rg -n "class.*ResourceGroup\|event_source\(\)" src/resource.h -B 2 -A 5 | head -60
echo
echo "=== 4) Verify actual tag enum values ==="
python3 << 'EOF'
# Count tag positions to understand if RsaGenerationResourceTag is truly outside ResourceMaxTag
tags = [
"RawByteTag",
"NullStructTag",
"MappedFileTag",
"ResourceMinTag",
# NON_BLE expanded (need to check)
"BleResourceMinTag",
"BleCallbackResourceMinTag",
# BLE expanded
"BleReadWriteElementMinTag",
# BLE_READ_WRITE expanded
"BleReadWriteElementMaxTag",
"BleCallbackResourceMaxTag",
"BleResourceMaxTag",
"ResourceMaxTag",
"ResourceGroupMinTag",
# RESOURCE_GROUP expanded
"ResourceGroupMaxTag",
"FontTag",
"ImageOutputStreamTag",
"ChannelTag",
"RsaGenerationResourceTag",
"RsaGenerationResourceGroupTag"
]
for i, tag in enumerate(tags):
print(f"{i}: {tag}")
EOFRepository: toitlang/toit
Length of output: 130
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Look at ResourceGroup implementation ==="
sed -n '100,150p' src/resource.h
echo
echo "=== 2) Check RsaGenerationResourceGroup constructor and event source ==="
sed -n '898,910p' src/primitive_crypto.cc
echo
echo "=== 3) Search for event monitoring of RSA resources ==="
rg -n "RsaGenerationResource\|event_source" src/primitive_crypto.cc | head -30
echo
echo "=== 4) Direct check: Does RsaGenerationResourceGroup have an event source? ==="
rg -n "EventSource\|event_source\|on_event" src/primitive_crypto.cc -B 1 -A 1Repository: toitlang/toit
Length of output: 2083
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Find RsaGenerationEventSource ==="
rg -n "RsaGenerationEventSource" src -B 3 -A 10
echo
echo "=== 2) Check if ResourceGroup::register_resource is called with RsaGenerationResource ==="
rg -n "register_resource" src/primitive_crypto.cc -B 2 -A 2
echo
echo "=== 3) Look for event dispatch on RSA resources ==="
rg -n "dispatch\|notify.*resource" src -B 2 -A 2 | grep -A 4 -B 4 "Rsa\|crypto"
echo
echo "=== 4) See the actual tag values ==="
sed -n '120,160p' src/tags.h | cat -nRepository: toitlang/toit
Length of output: 5764
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Check tag range ordering in enum ==="
python3 << 'EOF'
# Trace through the enum to understand actual numeric values
lines = """ RawByteTag = 0,
NullStructTag = 1,
MappedFileTag = 2,
// Resource subclasses.
ResourceMinTag,
NON_BLE_RESOURCE_CLASSES_DO(MAKE_ENUM)
BleResourceMinTag,
BleCallbackResourceMinTag,
BLE_CLASSES_DO(MAKE_ENUM)
BleReadWriteElementMinTag,
BLE_READ_WRITE_CLASSES_DO(MAKE_ENUM)
BleReadWriteElementMaxTag,
BleCallbackResourceMaxTag,
BleResourceMaxTag,
ResourceMaxTag,
// ResourceGroup subclasses.
ResourceGroupMinTag,
RESOURCE_GROUP_CLASSES_DO(MAKE_ENUM)
ResourceGroupMaxTag,
// Misc.
FontTag,
ImageOutputStreamTag,
ChannelTag,
RsaGenerationResourceTag,
RsaGenerationResourceGroupTag"""
print("Key observation:")
print("- ResourceMaxTag marks end of Resource subtypes")
print("- ResourceGroupMaxTag marks end of ResourceGroup subtypes")
print("- RsaGenerationResourceTag is in Misc section AFTER both ranges")
print("\nIf Resource::tag_max = ResourceMaxTag, then:")
print(" RsaGenerationResourceTag > ResourceMaxTag")
print(" → ARGS(Resource, rsa_res) would reject RsaGenerationResource instances")
EOF
echo
echo "=== 2) See if there are any generic Resource usages that could hit RSA ==="
rg -n "read_state\|register_monitor" src -B 1 -A 5Repository: toitlang/toit
Length of output: 527
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Confirm Resource base class tag validation mechanism ==="
sed -n '1061,1068p' src/primitive.h
echo
echo "=== 2) Check if there are any other misc-section resources and how they're handled ==="
rg -n "FontTag\|ImageOutputStreamTag\|ChannelTag" src -B 2 -A 2 | head -50
echo
echo "=== 3) Search for how these misc types bypass resource validation ==="
rg -n "Font\|ImageOutputStream\|Channel" src/primitive.h | grep "_A_T_"
echo
echo "=== 4) Final check: any code path where Resource base validation might filter RSA ==="
rg -n "is_byte_array\|has_external_address\|external_tag" src -B 1 -A 1 | grep -v "third_party"Repository: toitlang/toit
Length of output: 745
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== 1) Check how Font, ImageOutputStream, Channel are handled ==="
rg -n "_A_T_Font\|_A_T_ImageOutputStream\|_A_T_Channel" src/primitive.h -B 1 -A 1
echo
echo "=== 2) Verify if Font/ImageOutputStream/Channel are also in Misc section ==="
rg -n "FontTag\|ImageOutputStreamTag\|ChannelTag" src/tags.h -B 3 -A 3
echo
echo "=== 3) Check if these misc types have unpacking macros (type validation) ==="
rg -n "_A_T_Font\|_A_T_ImageOutputStream\|_A_T_Channel\|_A_T_RsaGenerationResource" src/primitive.h
echo
echo "=== 4) Summary: Are RSA tags truly outside Resource range validation? ==="
python3 << 'EOF'
print("VALIDATION LOGIC:")
print("================")
print()
print("1. Resource class:")
print(" TAG macro: TAGS(Resource)")
print(" → tag_min = ResourceMinTag")
print(" → tag_max = ResourceMaxTag")
print()
print("2. RsaGenerationResource class:")
print(" TAG macro: TAG(RsaGenerationResource)")
print(" → tag_min = RsaGenerationResourceTag")
print(" → tag_max = RsaGenerationResourceTag")
print()
print("3. Enum ordering (from tags.h):")
print(" - ResourceMaxTag is the LAST Resource subtype")
print(" - RsaGenerationResourceTag is in Misc section AFTER")
print(" → RsaGenerationResourceTag > ResourceMaxTag numerically")
print()
print("4. Type checking at runtime:")
print(" ARGS(Resource, obj) validates: ResourceMinTag <= obj.tag <= ResourceMaxTag")
print(" → RsaGenerationResourceTag FAILS this check")
print()
print("5. Consequence:")
print(" If RsaGenerationResource passed to any generic Resource handler")
print(" (e.g., read_state, register_monitor_notifier from primitive_events.cc)")
print(" → WRONG_OBJECT_TYPE error")
print()
print("VERIFICATION RESULT: The review comment's concern is VALID.")
print("The tags ARE outside the ranges, and there ARE type-check paths.")
EOFRepository: toitlang/toit
Length of output: 123
Verify RSA generation tags are covered by resource-type checks.
RsaGenerationResourceTag and RsaGenerationResourceGroupTag are appended in the misc block, outside ResourceMinTag..ResourceMaxTag and ResourceGroupMinTag..ResourceGroupMaxTag. The tag validation in MAKE_UNPACKING_MACRO enforces strict range checking: any code using ARGS(Resource, ...) or ARGS(ResourceGroup, ...) will reject RSA instances with WRONG_OBJECT_TYPE. This includes event system primitives like read_state and register_monitor_notifier (primitive_events.cc:28, 35). While RSA resources are currently isolated to crypto-specific primitives, the type system doesn't prevent them from being passed to generic handlers. Confirm this separation is intentional or move the RSA tags inside the appropriate ranges.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/tags.h` around lines 145 - 147, RsaGenerationResourceTag and
RsaGenerationResourceGroupTag are declared outside the Resource/ResourceGroup
tag ranges so MAKE_UNPACKING_MACRO's strict checks (using ARGS(Resource, ...)
and ARGS(ResourceGroup, ...)) will reject RSA objects with WRONG_OBJECT_TYPE;
either confirm RSA tags must remain separate or move them inside the proper
ranges to be accepted by generic handlers. Fix by relocating the
RsaGenerationResourceTag and RsaGenerationResourceGroupTag declarations into the
ResourceMinTag..ResourceMaxTag and ResourceGroupMinTag..ResourceGroupMaxTag
ranges (or alternatively add explicit handling in MAKE_UNPACKING_MACRO for these
symbols), then run tests around primitive_events.cc usages (read_state,
register_monitor_notifier) to ensure RSA resources are accepted or intentionally
rejected.
Problem
On ESP32, generating an RSA 2048-bit key takes longer than 10 seconds, causing the watchdog timer to kill the process, mistakenly believing it is hung.
Root cause
The mbedtls RSA operations (rsa_generate, rsa_sign, rsa_encrypt, rsa_decrypt) were passing NULL as the context to the rsa_rng callback. The scheduler watchdog monitors the run_timestamp of each process and kills it if it hasn't been updated within the timeout window. Since the RSA key generation is a long-running blocking operation, the timestamp was never updated during its execution.
Fix
The rsa_rng callback is called many times by mbedtls during RSA operations — in particular during key generation, it is called repeatedly to generate prime candidates and random bases for Miller-Rabin primality tests. This makes it a natural and low-overhead place to reset the watchdog by updating the process run_timestamp.
The fix passes the Process* as the context to all rsa_rng calls, and inside rsa_rng the scheduler mutex is taken to safely update the timestamp.
Safety
We verified that primitives execute inside an Unlocker block in Scheduler::run_process, meaning the scheduler mutex is not held during RSA operations. Therefore, taking it inside rsa_rng is safe and will not cause a deadlock.