Feature: Add on_group_complete callback for streaming eval results. #632

d42me · 2025-12-14T05:14:14Z

Description

Add on_group_complete callback for streaming eval results.

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Test improvement

Testing

All existing tests pass when running uv run pytest locally.
New tests have been added to cover the changes

Checklist

My code follows the style guidelines of this project as outlined in AGENTS.md
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
Any dependent changes have been merged and published

…ai#755)

Replace three separate callbacks (on_start, on_progress, on_log) with a single event-based system using TypedDict events with Literal discriminators.

- Remove: StartCallback, ProgressCallback, LogCallback
- Add 7 new event TypedDicts with type discriminator:
  - StartEvent: emitted at start with resolved counts
  - ProgressEvent: emitted after each rollout/group completes
  - GroupCompleteEvent: for PR PrimeIntellect-ai#632, includes State objects
  - LogEvent: for log messages with level/source/timestamp
  - LogStreamEvent: infrastructure for PrimeIntellect-ai#753 log streaming
  - SaveEvent: when results are saved (intermediate or final)
  - CompleteEvent: when generation finishes
- Add EvalEvent union type and EventHandler callback type
- LogStreamFileWriter: writes log_stream events to files for tailing

- Update generate() and evaluate() signatures: single on_event parameter
- Add _emit_event() helper using maybe_await()
- Add _run_group_with_states() to preserve State objects
- Emit events at all key points:
  - StartEvent with accurate num_examples calculation
  - ProgressEvent after each completion
  - GroupCompleteEvent for non-independent scoring (with States)
  - SaveEvent for intermediate and final saves
  - LogEvent for important messages
  - CompleteEvent at the end

- Update run_evaluation() signature to use on_event
- Rewrite run_evaluations_tui() with match/case event handler
- Preserve all metric accumulation logic
- Update comments to reflect new event system
- Add TODO for full log_streams implementation

- Event type structure tests
- LogStreamFileWriter tests (creation, appending, custom paths)
- All 35 eval-related tests pass

Removed callback parameters from:
- Environment.generate()
- Environment.evaluate()
- run_evaluation()

Replaced with: on_event: EventHandler

- Issue PrimeIntellect-ai#755: Consolidate callbacks for eval TUI
- PR PrimeIntellect-ai#632: GroupCompleteEvent with State objects (infrastructure)
- Issue PrimeIntellect-ai#753: Log streaming infrastructure (LogStreamEvent + writer)

Add on_group_complete callback for streaming eval results.

3593992

estsauver mentioned this pull request Feb 6, 2026

Consolidate eval callbacks into unified event system (#755) #837

Open

14 tasks
