Skip to content

Conversation

@devin-ai-integration
Copy link
Contributor

@devin-ai-integration devin-ai-integration bot commented Nov 10, 2025

Overview

Fixes a flaky test (test_flow_run_lateness) that occasionally fails in CI due to timing sensitivity. The test compares client-side request_time against server-side sa.func.now() (evaluated during query execution), and the original tolerance calculation didn't properly account for this drift.

Root Cause

The test was calculating tolerance as (now - start_of_test), which included unrelated test setup time. When API calls took longer due to CI runner contention, the tolerance became insufficient. Additionally, since the test sums lateness for 2 scheduled runs, the request→DB "now" drift applies per run and must be multiplied by 2.

Changes

  1. Changed tolerance basis: From (now - start_of_test) to (now - request_time) to measure actual API latency
  2. Consolidated constant: Extracted 2.0 into fudge_seconds variable used in both tolerance calculation and expected value adjustment
  3. Account for per-run drift: Multiply test_elapsed by 2 since the test sums lateness for 2 scheduled runs
  4. Applied ruff formatting: Wrapped long line to comply with line length limits

Testing

  • ✅ Test passes locally with the fix
  • ⏳ Waiting for CI to verify fix works across all Python versions and databases

Review Focus

⚠️ Key assumption: The hardcoded 2 multiplier in tolerance = fudge_seconds + 2 * test_elapsed assumes this test always sums exactly 2 scheduled runs. If the test changes in the future, this will need updating. Consider whether this should be derived from interval["states"][2]["count_runs"] instead.

  • Verify the logic that drift applies per-run is correct
  • Confirm the fix is appropriate for both SQLite and PostgreSQL
  • Check if the tolerance is now too loose or still appropriate

Link to Devin run: https://app.devin.ai/sessions/bf389bc601d2448cbf9140fd612d43da
Requested by: Nate Nowack ([email protected]) / @zzstoatzz

Checklist

  • This pull request references any related issue by including "closes <link to issue>"
  • If this pull request adds new functionality, it includes unit tests that cover the changes
    • N/A - This is a test fix, not new functionality
  • If this pull request removes docs files, it includes redirect settings in mint.json.
    • N/A
  • If this pull request adds functions or classes, it includes helpful docstrings.
    • N/A

…tency

The test was calculating tolerance based on time since test start, which
included unrelated test setup time. This caused flakes when the API call
took longer than expected due to CI runner contention.

The fix:
- Changes tolerance calculation to use (now - request_time) instead of
  (now - start_of_test), which directly measures API request latency
- Consolidates the 2-second SQLite microsecond fudge factor into a single
  variable used in both the tolerance and expected value calculations
- Updates comments to clarify the tolerance accounts for API latency

This is a minimal, targeted fix that makes the test resilient to timing
variations without being heavy-handed.

Co-Authored-By: Nate Nowack <[email protected]>
@devin-ai-integration
Copy link
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

devin-ai-integration bot and others added 3 commits November 10, 2025 17:29
The test sums lateness for 2 scheduled runs. The drift between request_time
and sa.func.now() (evaluated during query execution) applies per run, so the
tolerance must multiply test_elapsed by 2 to account for both runs.

This fixes the CI failure where actual difference (2.045s) exceeded the
tolerance (2.031s) by properly accounting for the doubled drift.

Co-Authored-By: Nate Nowack <[email protected]>
Ruff format wrapped the long line with inline comment into multi-line format
to comply with line length limits.

Co-Authored-By: Nate Nowack <[email protected]>
@github-actions
Copy link
Contributor

This pull request is stale because it has been open 14 days with no activity. To keep this pull request open remove stale label or comment.

@github-actions
Copy link
Contributor

github-actions bot commented Dec 8, 2025

This pull request was closed because it has been stale for 14 days with no activity. If this pull request is important or you have more to add feel free to re-open it.

@github-actions github-actions bot closed this Dec 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants