Emit job usage data through IBM Cloud Event Streams instances by ismaelRozasRamallal · Pull Request #2209 · Qiskit/qiskit-serverless

ismaelRozasRamallal · 2026-06-09T15:33:19Z

Summary

Fleets jobs now report their classical compute usage to downstream consumers in real time. From the moment a job starts running until it completes, consumers receive a continuous stream of usage events they can use to track how long the job has been executing and react when it finishes. This enables billing, monitoring, and any other business logic that depends on knowing a job's runtime to operate without polling the API.

Note that Ray-based jobs are not affected by this change: event publishing is scoped exclusively to Fleets (Code Engine) jobs.

🔨 Related to https://github.ibm.com/IBM-Q-Software/qiskit-serverless/issues/1507

Details and comments

Add job usage event publishing to IBM Cloud Event Streams for Fleets (Code Engine) jobs.
When a job transitions to RUNNING, a job_started event is published; subsequent scheduler iterations emit job_in_progress events with the elapsed classical time in nanoseconds; and when the job reaches a terminal state, a job_ended event is published.
Events follow the CloudEvents 1.0 spec and are published to a topic named quantum.{ENVIRONMENT}.function-usage.v1. Publishing is gated by an EVENT_STREAMS_ENABLED feature flag — when disabled, a no-op client logs at DEBUG level instead.
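
As a rough illustration of the envelope and topic naming described above, here is a minimal sketch. The required CloudEvents 1.0 attributes (`specversion`, `id`, `source`, `type`) come from the spec; the `source` value, the `data` field names, and the helper names `usage_topic` and `build_usage_event` are assumptions made for this example, not the PR's actual code.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Optional


def usage_topic(environment: str) -> str:
    """Build the topic name from the pattern quoted in the PR description."""
    return f"quantum.{environment}.function-usage.v1"


def build_usage_event(
    event_type: str, job_id: str, elapsed_ns: Optional[int] = None
) -> dict:
    """Wrap usage data in a CloudEvents 1.0 envelope (structured JSON form)."""
    data = {"job_id": job_id}  # hypothetical payload field name
    if elapsed_ns is not None:
        data["elapsed_ns"] = elapsed_ns  # hypothetical payload field name
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": "/qiskit-serverless/gateway",  # assumed value, not from the PR
        "type": event_type,
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": data,
    }


event = build_usage_event("job_in_progress", "job-123", elapsed_ns=5_000_000_000)
payload = json.dumps(event).encode("utf-8")  # bytes ready for a Kafka producer
topic = usage_topic("staging")
```
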
Publish failures block the DB status update and are retried on the next scheduler iteration.
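
The publish-then-commit ordering can be sketched as follows. This is a simplified model, assuming a plain dict in place of the Django model and a hypothetical `apply_transition` helper; it only shows that a failed publish leaves the stored status untouched so the next scheduler iteration tries again.

```python
from typing import Callable


def apply_transition(
    job: dict, new_status: str, publish: Callable[[dict, str], None]
) -> bool:
    """Publish first; update the stored status only if publishing succeeded."""
    try:
        publish(job, new_status)
    except Exception:
        # Leave the job untouched so the next scheduler iteration retries.
        return False
    job["status"] = new_status  # stands in for the real DB write
    return True


attempts = []


def flaky_publish(job: dict, status: str) -> None:
    """Test double for a broker that is unavailable on the first call only."""
    attempts.append(status)
    if len(attempts) == 1:
        raise ConnectionError("broker unavailable")


job = {"id": "job-123", "status": "PENDING"}
first = apply_transition(job, "RUNNING", flaky_publish)   # publish fails
status_after_first = job["status"]
second = apply_transition(job, "RUNNING", flaky_publish)  # retry succeeds
```

The trade-off of this ordering is at-least-once delivery: if the process dies after a successful publish but before the status write, the same event is published again on the next iteration.
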
⚠️ Consumers should handle duplicate events keyed on job_id + event_type, particularly for job_ended, which can be re-emitted if the scheduler restarts between publish and DB commit. Idempotent consumers are a basic principle of event-driven architectures anyway, but it is worth calling out explicitly here.
There is a DB migration that adds a running_started_at field. We need to know when the job started running so we can use it as the base for the elapsed-time delta published as usage. As a consequence, each job will save a timestamp of when it changed to RUNNING status.
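
On the consumer side, duplicate handling keyed on job_id + event_type could look like the sketch below. The `UsageConsumer` class and the `data.job_id` / `data.elapsed_ns` field names are hypothetical. Treating job_in_progress as a "latest value" update instead of a one-shot event is also an assumption of this example, since that event type is expected to repeat.

```python
class UsageConsumer:
    """Idempotent handler: one-shot events are processed at most once per job."""

    ONE_SHOT = {"job_started", "job_ended"}

    def __init__(self) -> None:
        self.seen = set()       # (job_id, event_type) pairs already handled
        self.elapsed_ns = {}    # latest known elapsed time per job
        self.handled = []       # order in which one-shot events were accepted

    def handle(self, event: dict) -> bool:
        """Return True if the event changed state, False if it was a duplicate."""
        job_id = event["data"]["job_id"]
        event_type = event["type"]
        if event_type in self.ONE_SHOT:
            key = (job_id, event_type)
            if key in self.seen:
                return False  # re-emitted event, safe to ignore
            self.seen.add(key)
            self.handled.append(key)
            return True
        # Repeating progress events: keep the largest elapsed value seen so far.
        previous = self.elapsed_ns.get(job_id, 0)
        current = event["data"].get("elapsed_ns", 0)
        self.elapsed_ns[job_id] = max(previous, current)
        return current > previous


consumer = UsageConsumer()
ended = {"type": "job_ended", "data": {"job_id": "job-123"}}
first_result = consumer.handle(ended)
duplicate_result = consumer.handle(ended)  # e.g. after a scheduler restart
```

In a real consumer the `seen` set would need to live in durable storage; an in-memory set is used here only to keep the example short.
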
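
For reference, the elapsed time in nanoseconds can be derived from the stored start timestamp roughly as follows. The function name `elapsed_ns` is hypothetical; the sketch uses integer arithmetic on the `timedelta` fields to avoid the rounding that `total_seconds()` would introduce as a float.

```python
from datetime import datetime, timedelta, timezone


def elapsed_ns(running_started_at: datetime, now: datetime) -> int:
    """Nanoseconds between the RUNNING transition and `now`."""
    delta = now - running_started_at
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


started = datetime(2026, 6, 9, 12, 0, 0, tzinfo=timezone.utc)
later = started + timedelta(seconds=90, microseconds=250)
usage = elapsed_ns(started, later)
```

Python's `datetime` only has microsecond resolution, so values computed this way are always a multiple of 1,000 ns.
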

More context

Other events are still to be published. This PR covers only the Kafka integration and the most basic publishing; let's take it as a reference for the next iterations.

It also does not add an instance to docker-compose for acceptance tests. I left it out at first to avoid making the PR even bigger.

Adds a Kafka producer client that publishes CloudEvents 1.0 usage events (job_started, job_in_progress, job_ended) to IBM Cloud Event Streams for Fleets jobs, configured via environment variables. Generated with AI Co-Authored-By: AI <ai@example.com>

Wires IBMEventStreamsClient into UpdateFleetsJobsStatuses: emit_job_started before PENDING→RUNNING DB write, emit_job_ended before terminal DB write, and emit_job_in_progress for already-RUNNING jobs each scheduler tick. All emit calls are wrapped in try/except so broker failures never block DB updates. Generated with AI Co-Authored-By: AI <ai@example.com>

Extracts IBMEventStreamsClient from gateway/core/ibm_cloud/clients.py into its own gateway/core/ibm_cloud/event_streams/event_streams_client.py module, following the same pattern as the cos/ and code_engine/ submodules. Updates all imports and moves the corresponding tests to a matching test subpackage.

Generated with AI Co-Authored-By: AI <ai@example.com>

marceloamaral

Fleets will also need the compute profile (actually the allocated resources, number of CPU, RAM and GPUs) to calculate the cost. Even though this might be out of scope for this PR, it would be good to start thinking about how this information will be added later...

ismaelRozasRamallal · 2026-06-11T14:58:57Z

> Fleets will also need the compute profile (actually the allocated resources, number of CPU, RAM and GPUs) to calculate the cost. Even though this might be out of scope for this PR, it would be good to start thinking about how this information will be added later...

Thanks for the comment, Marcelo! You're right, there is a potential metric there that would be good to use. The compute profile could also be taken as a metric for complexity, but we need to figure out whether we can use it, since it's easy to "cheat": the profile is decided when adding a function, so we need to prevent somebody from picking the highest compute profile to force users to pay more for their function executions.

ismaelRozasRamallal and others added 8 commits June 8, 2026 12:59

feat: add running_started_at field to Job for usage tracking

75431f4

Records the timestamp when a Fleets job transitions to RUNNING status, enabling computation of job duration for IBM Cloud Event Streams usage events. Generated with AI Co-Authored-By: AI <ai@example.com>

feat: add function_id to event data payload

fd0f79b

Generated with AI Co-Authored-By: AI <ai@example.com>

fix: raise on publish failure to block DB update and retry next itera…

1ce4cff

…tion Generated with AI Co-Authored-By: AI <ai@example.com>

refactor: move emit_job_in_progress into update_job_status

c70e0ce

Generated with AI Co-Authored-By: AI <ai@example.com>

feat: add EVENT_STREAMS_ENABLED feature flag with no-op client fallback

d5a14d4

Generated with AI Co-Authored-By: AI <ai@example.com>

ismaelRozasRamallal requested a review from a team as a code owner June 9, 2026 15:33

ismaelRozasRamallal changed the title ~~Integrate kafka~~ Emit job usage data through IBM Cloud Event Streams instances Jun 9, 2026

Merge branch 'main' into integrate-kafka

15ebf4b

marceloamaral reviewed Jun 9, 2026

ismaelRozasRamallal and others added 5 commits June 10, 2026 09:37

style: apply Black formatting

1e6de33

Generated with AI Co-Authored-By: AI <ai@example.com>

style: fix pylint docstring and implicit string concat warnings

3fcb579

Generated with AI Co-Authored-By: AI <ai@example.com>

Merge branch 'main' into integrate-kafka

ef7a9bf

style: apply Black formatting

d3d2053

Generated with AI Co-Authored-By: AI <ai@example.com>

Merge branch 'main' into integrate-kafka

bd230f5

Merge branch 'main' into integrate-kafka

ff1425c

ismaelRozasRamallal wants to merge 15 commits into main from integrate-kafka
