🐛🎨Alignment of required/computed resources when running dynamic services#9237

Draft
sanderegg wants to merge 4 commits into ITISFoundation:master from sanderegg:uniformize-dynamic-services-resources-allocation

Conversation

@sanderegg sanderegg commented Jun 11, 2026

Member

…able resources

What do these changes do?

Analysing https://github.com/ITISFoundation/private-issues/issues/455 showed that autoscaling and director-v2/webserver do not compute resources for dynamic services in the same way, which led to the issue described there.

The director-v2/webserver is responsible for computing how many resources a user service needs, including the resources of the dynamic-sidecar and of the dynamic-proxy. These are then set on the services.
The autoscaling service reads these resources and takes the larger of the docker reservations and limits. If this exceeds what is available on the selected EC2 machine type, it raises an error, which is correct.
The issue above appears with small EC2 machine types: the director/webserver uses a flooring mechanism that does not account for the machine type being too small to accommodate the service. Instead of failing fast, the service is started and fails later with confusing errors, or with none at all.
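
The autoscaling-side check described above could be sketched roughly as follows. This is only an illustration, not the code in this PR: `Resources`, `InsufficientResourcesError`, `required_resources` and `ensure_fits` are hypothetical names, and the nano-CPU/byte units are an assumption borrowed from the way docker expresses reservations and limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    # Hypothetical container type; units follow docker's conventions (assumption)
    nano_cpus: int
    memory_bytes: int


class InsufficientResourcesError(Exception):
    """Hypothetical error raised when a service cannot fit on a machine type."""


def required_resources(reservation: Resources, limit: Resources) -> Resources:
    # Take the larger of reservation and limit, resource by resource
    return Resources(
        nano_cpus=max(reservation.nano_cpus, limit.nano_cpus),
        memory_bytes=max(reservation.memory_bytes, limit.memory_bytes),
    )


def ensure_fits(required: Resources, machine: Resources) -> None:
    # Fail fast instead of starting a service that cannot be accommodated
    if (
        required.nano_cpus > machine.nano_cpus
        or required.memory_bytes > machine.memory_bytes
    ):
        msg = f"{required} does not fit on machine with {machine}"
        raise InsufficientResourcesError(msg)
```

Running the same check on the director-v2/webserver side would surface the problem when the service is requested, rather than after it has started.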

This PR aims to align both subsystems by providing a single, clear location where the resources of the dynamic-sidecar and dynamic-proxy services are computed.
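
As a rough sketch of what such a single location might look like (assuming a shared helper that both subsystems import; the names and the overhead values below are placeholders for illustration, not taken from this PR):

```python
from dataclasses import dataclass

MIB = 1024**2


@dataclass(frozen=True)
class Resources:
    # Hypothetical container type; units follow docker's conventions (assumption)
    nano_cpus: int
    memory_bytes: int

    def __add__(self, other: "Resources") -> "Resources":
        return Resources(
            nano_cpus=self.nano_cpus + other.nano_cpus,
            memory_bytes=self.memory_bytes + other.memory_bytes,
        )


# Placeholder overheads, chosen only for illustration
DYNAMIC_SIDECAR_OVERHEAD = Resources(nano_cpus=100_000_000, memory_bytes=256 * MIB)
DYNAMIC_PROXY_OVERHEAD = Resources(nano_cpus=100_000_000, memory_bytes=128 * MIB)


def total_dynamic_service_resources(user_service: Resources) -> Resources:
    # One place that adds the sidecar and proxy overheads to the user service,
    # so every caller ends up with the same total
    return user_service + DYNAMIC_SIDECAR_OVERHEAD + DYNAMIC_PROXY_OVERHEAD
```

With one helper of this kind, the value set on the services and the value autoscaling compares against the machine type would come from the same computation.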

Related issue/s

How to test

Dev-ops

@sanderegg sanderegg added this to the RokosBasilisk milestone Jun 11, 2026
@sanderegg sanderegg self-assigned this Jun 11, 2026
@sanderegg sanderegg added a:webserver webserver's codebase. Assigning the area is particularly useful for bugs a:director-v2 issue related with the director-v2 service a:autoscaling autoscaling service in simcore's stack labels Jun 11, 2026
@github-actions github-actions Bot added a:infra+ops maintenance of infrastructure or operations (discussed in retro) a:services-library issues on packages/service-libs labels Jun 11, 2026
codecov Bot commented Jun 11, 2026

Codecov Report

❌ Patch coverage is 72.72727% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.08%. Comparing base (569e9a8) to head (c119980).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9237      +/-   ##
==========================================
- Coverage   87.04%   86.08%   -0.96%     
==========================================
  Files        2019     2059      +40     
  Lines       79080    81709    +2629     
  Branches     1502     1502              
==========================================
+ Hits        68832    70340    +1508     
- Misses       9837    10958    +1121     
  Partials      411      411              
Flag Coverage Δ
integrationtests 63.60% <50.00%> (-0.03%) ⬇️
unittests 84.93% <72.72%> (-0.97%) ⬇️
Components Coverage Δ
pkg_aws_library 95.29% <ø> (ø)
pkg_celery_library 77.72% <ø> (ø)
pkg_dask_task_models_library 79.65% <ø> (ø)
pkg_models_library 92.69% <ø> (ø)
pkg_notifications_library ∅ <ø> (∅)
pkg_postgres_database 89.91% <ø> (ø)
pkg_service_integration 72.97% <ø> (ø)
pkg_service_library 69.92% <33.33%> (-0.02%) ⬇️
pkg_settings_library 91.10% <100.00%> (+0.07%) ⬆️
pkg_simcore_sdk 85.92% <ø> (ø)
agent 93.43% <ø> (ø)
api_server 91.45% <ø> (ø)
autoscaling 95.35% <100.00%> (∅)
catalog 92.16% <ø> (ø)
clusters_keeper 98.61% <ø> (ø)
dask_sidecar 91.74% <ø> (ø)
datcore_adapter 97.95% <ø> (ø)
director 79.06% <ø> (ø)
director_v2 91.34% <100.00%> (+0.02%) ⬆️
dynamic_scheduler 95.85% <ø> (-0.21%) ⬇️
dynamic_sidecar 88.04% <ø> (ø)
efs_guardian 89.77% <ø> (ø)
invitations 91.63% <ø> (ø)
payments 92.44% <ø> (ø)
resource_usage_tracker 91.90% <ø> (-0.47%) ⬇️
storage 86.86% <ø> (-0.05%) ⬇️
webclient ∅ <ø> (∅)
webserver 82.57% <33.33%> (-4.23%) ⬇️

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 569e9a8...c119980. Read the comment docs.

@sanderegg sanderegg force-pushed the uniformize-dynamic-services-resources-allocation branch from 3cc7b25 to c119980 Compare June 11, 2026 11:12
@sanderegg sanderegg removed this from the RokosBasilisk milestone Jun 11, 2026