Conversation

Collaborator

@d42me d42me commented Dec 14, 2025

Description

Add on_group_complete callback for streaming eval results.
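
As a rough illustration of what a per-group completion callback enables, the sketch below runs several groups concurrently and invokes the callback as soon as each one finishes, so a consumer can stream results instead of waiting for the whole evaluation. Only the name `on_group_complete` comes from this PR; `run_group`, `evaluate_groups`, and the callback's arguments are hypothetical stand-ins, not the library's actual API.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical callback shape: receives the group index and that group's scores.
GroupCallback = Callable[[int, list[float]], Awaitable[None]]


async def run_group(index: int) -> list[float]:
    """Stand-in for scoring one group of rollouts."""
    await asyncio.sleep(0)
    return [float(index), float(index) + 0.5]


async def evaluate_groups(
    num_groups: int, on_group_complete: GroupCallback
) -> list[list[float]]:
    """Run all groups concurrently and report each one as soon as it finishes."""

    async def run_and_report(index: int) -> list[float]:
        scores = await run_group(index)
        await on_group_complete(index, scores)  # stream this group's result now
        return scores

    # gather() returns results in submission order, whatever the finish order.
    return await asyncio.gather(*(run_and_report(i) for i in range(num_groups)))


streamed: list[tuple[int, list[float]]] = []


async def collect(index: int, scores: list[float]) -> None:
    """Example consumer: record each group as it arrives."""
    streamed.append((index, scores))


results = asyncio.run(evaluate_groups(3, collect))
```

The callback fires once per group, while the final return value still contains every group's scores in order.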

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Test improvement

Testing

  • All existing tests pass when running uv run pytest locally.
  • New tests have been added to cover the changes

Checklist

  • My code follows the style guidelines of this project as outlined in AGENTS.md
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

estsauver pushed a commit to estsauver/verifiers that referenced this pull request Feb 5, 2026
…ai#755)

Replace three separate callbacks (on_start, on_progress, on_log) with a
single event-based system using TypedDict events with Literal discriminators.

- Remove: StartCallback, ProgressCallback, LogCallback
- Add 7 new event TypedDicts with type discriminator:
  - StartEvent: emitted at start with resolved counts
  - ProgressEvent: emitted after each rollout/group completes
  - GroupCompleteEvent: for PR PrimeIntellect-ai#632, includes State objects
  - LogEvent: for log messages with level/source/timestamp
  - LogStreamEvent: infrastructure for PrimeIntellect-ai#753 log streaming
  - SaveEvent: when results are saved (intermediate or final)
  - CompleteEvent: when generation finishes
- Add EvalEvent union type and EventHandler callback type
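
For readers unfamiliar with the pattern, here is a minimal sketch of TypedDict events tagged with a Literal discriminator, plus a union type and a handler type. The class names mirror the list above, but the field names (`num_examples`, `completed`, `total`) and the discriminator strings are assumptions for illustration; only three of the seven events are shown.

```python
from typing import Awaitable, Callable, Literal, TypedDict, Union


class StartEvent(TypedDict):
    type: Literal["start"]
    num_examples: int  # assumed field name


class ProgressEvent(TypedDict):
    type: Literal["progress"]
    completed: int  # assumed field name
    total: int  # assumed field name


class CompleteEvent(TypedDict):
    type: Literal["complete"]


# Union of all event shapes; a handler may be sync or async.
EvalEvent = Union[StartEvent, ProgressEvent, CompleteEvent]
EventHandler = Callable[[EvalEvent], Union[None, Awaitable[None]]]


def describe(event: EvalEvent) -> str:
    """Branch on the "type" key; type checkers narrow the union per branch."""
    if event["type"] == "start":
        return f"starting {event['num_examples']} examples"
    if event["type"] == "progress":
        return f"{event['completed']}/{event['total']} done"
    return "finished"
```

Because each event carries a distinct literal in `type`, a single handler can accept the whole union and still get precise field types in each branch.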

- LogStreamFileWriter: writes log_stream events to files for tailing
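
A writer of this kind could look roughly like the sketch below, which appends each event's payload to a per-stream file so the file can be followed with `tail -f`. The class name `SketchLogStreamWriter`, the `stream_id` and `data` fields, and the `<stream_id>.log` naming are all assumptions, not the actual LogStreamFileWriter interface.

```python
import tempfile
from pathlib import Path
from typing import Literal, TypedDict


class LogStreamEvent(TypedDict):
    type: Literal["log_stream"]
    stream_id: str  # assumed field name
    data: str  # assumed field name


class SketchLogStreamWriter:
    """Hypothetical stand-in: one append-only file per stream."""

    def __init__(self, base_dir: Path) -> None:
        self.base_dir = base_dir
        self.base_dir.mkdir(parents=True, exist_ok=True)

    def path_for(self, stream_id: str) -> Path:
        return self.base_dir / f"{stream_id}.log"

    def write(self, event: LogStreamEvent) -> Path:
        path = self.path_for(event["stream_id"])
        # Append mode keeps earlier output, so a tailing process sees new data.
        with path.open("a", encoding="utf-8") as handle:
            handle.write(event["data"])
        return path


# Usage: two events for the same stream end up in the same file.
with tempfile.TemporaryDirectory() as tmp:
    writer = SketchLogStreamWriter(Path(tmp) / "logs")
    writer.write({"type": "log_stream", "stream_id": "env-0", "data": "first line\n"})
    log_path = writer.write(
        {"type": "log_stream", "stream_id": "env-0", "data": "second line\n"}
    )
    contents = log_path.read_text(encoding="utf-8")
    file_name = log_path.name
```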

- Update generate() and evaluate() signatures: single on_event parameter
- Add _emit_event() helper using maybe_await()
- Add _run_group_with_states() to preserve State objects
- Emit events at all key points:
  - StartEvent with accurate num_examples calculation
  - ProgressEvent after each completion
  - GroupCompleteEvent for non-independent scoring (with States)
  - SaveEvent for intermediate and final saves
  - LogEvent for important messages
  - CompleteEvent at the end
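
The emit-helper idea can be sketched as follows: a small coroutine calls the handler and awaits the result only if it is awaitable, so both sync and async handlers work. The `maybe_await` defined here is a local stand-in written for this example (the real helper's signature is not shown in this commit message), and `emit_event`, `generate`, and the event fields are likewise illustrative.

```python
import asyncio
import inspect
from typing import Any, Awaitable, Callable, Optional, Union

Event = dict[str, Any]
Handler = Callable[[Event], Union[None, Awaitable[None]]]


async def maybe_await(func: Callable[..., Any], *args: Any) -> Any:
    """Call func and await the result only when it is awaitable."""
    result = func(*args)
    if inspect.isawaitable(result):
        return await result
    return result


async def emit_event(on_event: Optional[Handler], event: Event) -> None:
    """No-op when no handler is registered."""
    if on_event is not None:
        await maybe_await(on_event, event)


async def generate(num_rollouts: int, on_event: Optional[Handler] = None) -> None:
    """Toy loop that emits start, per-rollout progress, and complete events."""
    await emit_event(on_event, {"type": "start", "num_examples": num_rollouts})
    for done in range(1, num_rollouts + 1):
        await emit_event(on_event, {"type": "progress", "completed": done})
    await emit_event(on_event, {"type": "complete"})


seen_sync: list[str] = []
seen_async: list[str] = []


def sync_handler(event: Event) -> None:
    seen_sync.append(event["type"])


async def async_handler(event: Event) -> None:
    seen_async.append(event["type"])


asyncio.run(generate(2, sync_handler))
asyncio.run(generate(2, async_handler))
```

Both handlers observe the same event sequence, which is the point of routing every emission through one helper.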

- Update run_evaluation() signature to use on_event
- Rewrite run_evaluations_tui() with match/case event handler
- Preserve all metric accumulation logic

- Update comments to reflect new event system
- Add TODO for full log_streams implementation

- Event type structure tests
- LogStreamFileWriter tests (creation, appending, custom paths)
- All 35 eval-related tests pass

Removed callback parameters from:
- Environment.generate()
- Environment.evaluate()
- run_evaluation()

Replaced with: on_event: EventHandler

- Issue PrimeIntellect-ai#755: Consolidate callbacks for eval TUI
- PR PrimeIntellect-ai#632: GroupCompleteEvent with State objects (infrastructure)
- Issue PrimeIntellect-ai#753: Log streaming infrastructure (LogStreamEvent + writer)
estsauver pushed a commit to estsauver/verifiers that referenced this pull request Feb 6, 2026
…ai#755)
