Problem
Entire currently stores committed checkpoints in one append-only Git branch, entire/checkpoints/v1. That has a few problems:
- The Git repository keeps growing without clear cleanup boundaries.
- GitHub treats entire/checkpoints/v1 like a normal branch and may show “open a pull request” prompts after checkpoint pushes.
- In new repositories, GitHub has sometimes selected entire/checkpoints/v1 as the default branch because no other branch existed yet.
- Pushes are broader than necessary because one branch represents all checkpoint history.
The proposed direction is to store each checkpoint under its own Git ref.
Related Work
This should build on the checkpoint-store refactor tracked in #1433.
Ideally the pluggable store boundary lands before the ref-store work starts. In practice, this work may happen in parallel across 2-4 people. The ref-store design should follow the direction of that issue and avoid introducing a second abstraction that would need to be reconciled later.
The migration approach can reuse ideas from #1397, but production checkpoint refs should point to commits, not raw tree objects.
Storage Model
Each checkpoint gets one stable ref: refs/entire/checkpoints/<shard>/<checkpoint-id>.
The ref points to a checkpoint commit containing the checkpoint tree. The first commit for a migrated checkpoint can be an orphan commit that reuses the existing entire/checkpoints/v1 subtree. Later updates can advance the same ref and preserve per-checkpoint history.
The ref should not point directly to a raw tree object. Commits work better with existing Git tooling and let us keep using Git commit signing.
For new CLI versions, the ref store is authoritative. If a checkpoint is written to refs, the CLI should read it from refs so ref-store issues are visible during rollout.
Metadata Versioning
Root checkpoint metadata should include both cli_version and checkpoints_version.
cli_version records the CLI that wrote the checkpoint. checkpoints_version records the checkpoint storage format, for example refs-1. Session metadata does not need a separate checkpoint storage version.
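As a rough illustration of keeping the two fields separate, root metadata could be built and read along these lines. Only cli_version, checkpoints_version, and the refs-1 example value come from this proposal; the function names, the checkpoint_id field, and treating a missing checkpoints_version as "legacy" are assumptions made for the sketch.

```python
import json

# "refs-1" is the example value from this proposal; the final name may differ.
REFS_FORMAT = "refs-1"


def build_root_metadata(checkpoint_id: str, cli_version: str) -> dict:
    """Return root checkpoint metadata with both version fields set."""
    return {
        "checkpoint_id": checkpoint_id,      # hypothetical field, for illustration
        "cli_version": cli_version,          # CLI release that wrote the checkpoint
        "checkpoints_version": REFS_FORMAT,  # storage format, independent of the CLI release
    }


def storage_format(metadata: dict) -> str:
    """Read the storage format; mapping a missing field to "legacy" is an assumption."""
    return metadata.get("checkpoints_version", "legacy")


meta = build_root_metadata("01KVBJCWYA4YW6J5M9GP655HZN", "1.2.3")
print(json.dumps(meta, indent=2))
```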
Checkpoint IDs
Existing checkpoint IDs cannot change because they are already referenced from Git history. Legacy IDs should keep the current prefix-shard layout. A commit-to-checkpoint mapping could be encoded in the ref structure instead, but that would add complexity.
Future checkpoint IDs are expected to be ULIDs. New ULID refs should use the last two ULID characters as the shard, for example refs/entire/checkpoints/ZN/01KVBJCWYA4YW6J5M9GP655HZN. The trailing characters of a ULID come from its random component, while the leading characters encode the timestamp, so sharding on the suffix spreads checkpoints evenly across shards while the checkpoint IDs themselves stay lexicographically sortable.
The ref resolver should support both legacy IDs and ULIDs for the time being.
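A minimal sketch of such a resolver follows. The ULID branch follows the rule above. The legacy branch assumes that legacy IDs are lowercase hex and that "prefix-shard" means the first two characters of the ID. Both are assumptions to check against the current layout, and the function name is hypothetical.

```python
import re

REF_PREFIX = "refs/entire/checkpoints"

# ULIDs are 26 characters of Crockford base32 (no I, L, O, or U).
ULID_RE = re.compile(r"^[0-9A-HJKMNP-TV-Z]{26}$")
# Assumption: legacy IDs are lowercase hex strings.
LEGACY_RE = re.compile(r"^[0-9a-f]{4,64}$")


def checkpoint_ref(checkpoint_id: str) -> str:
    """Map a checkpoint ID to its ref name, for both ULIDs and legacy IDs."""
    if ULID_RE.match(checkpoint_id):
        shard = checkpoint_id[-2:]  # random suffix, so shards fill evenly
    elif LEGACY_RE.match(checkpoint_id):
        shard = checkpoint_id[:2]   # assumed prefix-shard layout
    else:
        raise ValueError(f"unrecognized checkpoint ID: {checkpoint_id!r}")
    return f"{REF_PREFIX}/{shard}/{checkpoint_id}"


print(checkpoint_ref("01KVBJCWYA4YW6J5M9GP655HZN"))
```

Checking the ULID pattern first keeps the two formats unambiguous in this sketch, since the assumed legacy pattern only accepts lowercase characters.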
Rollout Control
Rollout should use strategy_options, consistent with existing settings in .entire/settings.json.
A possible setting is:
{
  "strategy_options": {
    "checkpoint_store": "refs-v1-mirror"
  }
}
Expected modes:
- unset: current entire/checkpoints/v1 behavior
- refs-v1-mirror: refs are authoritative, and entire/checkpoints/v1 is still written as a temporary compatibility mirror
- refs-v1: refs only, after the ref store has proven stable
The exact names can change, but rollout mode should stay separate from checkpoint metadata versioning.
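One way to picture the three modes is as a small lookup from the setting to a read/write plan. This is only a sketch: the StorePlan type, its field names, and rejecting unknown values are assumptions, and the mode names are the provisional ones listed above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class StorePlan:
    write_refs: bool       # write per-checkpoint refs
    write_v1_branch: bool  # write entire/checkpoints/v1
    read_from: str         # "refs" or "v1-branch"


MODES = {
    None: StorePlan(write_refs=False, write_v1_branch=True, read_from="v1-branch"),
    "refs-v1-mirror": StorePlan(write_refs=True, write_v1_branch=True, read_from="refs"),
    "refs-v1": StorePlan(write_refs=True, write_v1_branch=False, read_from="refs"),
}


def plan_from_settings(settings_text: str) -> StorePlan:
    """Resolve the checkpoint store plan from the contents of .entire/settings.json."""
    settings = json.loads(settings_text)
    mode = settings.get("strategy_options", {}).get("checkpoint_store")
    if mode not in MODES:
        raise ValueError(f"unknown checkpoint_store mode: {mode!r}")
    return MODES[mode]


print(plan_from_settings('{"strategy_options": {"checkpoint_store": "refs-v1-mirror"}}'))
```

Note that in mirror mode the plan still reads from refs, which matches the requirement that the ref store is authoritative whenever it is written.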
Initial Rollout
During the initial rollout, the CLI should dual-write:
- authoritative per-checkpoint refs
- a temporary entire/checkpoints/v1 compatibility mirror
This is based on prior rollout experience where some repositories moved to a new storage mode too early and reverse-migration tooling had to be built afterwards to avoid data loss.
The entire/checkpoints/v1 mirror should be removed once the ref store has proven stable.
Migration And Backfill
The first version should create one commit per existing checkpoint by reusing the checkpoint subtree from entire/checkpoints/v1. It does not need to reconstruct the full entire/checkpoints/v1 update history.
Migration should preserve useful ordering and timestamps where possible, because some listing flows sort by checkpoint or commit time.
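The per-checkpoint step could look roughly like the sketch below, which uses standard Git plumbing: git rev-parse to resolve the existing subtree, git commit-tree without -p to create an orphan commit, and git update-ref with an empty old value so an existing ref is never overwritten. The function names, the commit message, the subtree path argument, and the ISO 8601 timestamp input are assumptions. Signing is left out of the sketch.

```python
import os
import subprocess

V1_BRANCH = "entire/checkpoints/v1"


def commit_tree_call(tree_oid: str, checkpoint_id: str, timestamp: str):
    """Build argv and env for an orphan commit that reuses an existing tree."""
    # No -p argument, so the new commit has no parents.
    argv = ["git", "commit-tree", tree_oid, "-m", f"Migrate checkpoint {checkpoint_id}"]
    # Carry over the original checkpoint time so time-sorted listings keep their order.
    env = {"GIT_AUTHOR_DATE": timestamp, "GIT_COMMITTER_DATE": timestamp}
    return argv, env


def update_ref_argv(ref: str, commit_oid: str) -> list[str]:
    """Build argv that creates the ref and fails if it already exists."""
    return ["git", "update-ref", ref, commit_oid, ""]


def run(argv, env=None) -> str:
    """Run a Git command in the current repository and return its trimmed stdout."""
    merged = {**os.environ, **(env or {})}
    result = subprocess.run(argv, check=True, capture_output=True, text=True, env=merged)
    return result.stdout.strip()


def migrate_checkpoint(subtree_path: str, checkpoint_id: str, ref: str, timestamp: str) -> str:
    """Create one commit for an existing checkpoint and point its ref at it."""
    # subtree_path is the checkpoint's directory inside the v1 branch (layout assumed).
    tree = run(["git", "rev-parse", "--verify", f"{V1_BRANCH}:{subtree_path}"])
    argv, env = commit_tree_call(tree, checkpoint_id, timestamp)
    commit = run(argv, env)
    run(update_ref_argv(ref, commit))
    return commit
```

Because the commit reuses the tree object already stored for entire/checkpoints/v1, the migration adds one small commit object per checkpoint rather than copying file contents.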
Migration commits should use normal checkpoint signing behavior by default. The operator’s Git identity should be the default identity. The migration command may also allow an explicit author, such as Entire Checkpoint Migration <checkpoints@entire.io>, and an explicit signing key via --signing-key. To simplify the migration command, skipping checkpoint signing entirely is also an option.
Migration should signal refs for push through the same mechanism as normal checkpoint writes.
Push Discovery
The CLI should not push every local checkpoint ref. That would push refs fetched for local reads, and deleting local refs after push would make normal local workflows worse.
Instead, checkpoint writes should enqueue refs that need to be pushed. A simple flock-protected JSONL queue in the Git common directory is enough for the first version. Batch pushes should be used so migrated or newly written refs are not pushed one by one.
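A minimal sketch of such a queue is shown below, assuming a Unix host (fcntl.flock) and a hypothetical file name and record shape. The Git common directory can be found with git rev-parse --git-common-dir. Draining empties the queue before the push runs, so a real implementation would need to re-enqueue on failure. That behavior is out of scope for this proposal.

```python
import fcntl
import json
from pathlib import Path

QUEUE_NAME = "entire-push-queue.jsonl"  # hypothetical file name


def enqueue(common_dir: Path, ref: str) -> None:
    """Append one ref to the queue while holding an exclusive lock."""
    with open(common_dir / QUEUE_NAME, "a", encoding="utf-8") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        f.write(json.dumps({"ref": ref}) + "\n")
        f.flush()
        # The lock is released when the file is closed.


def drain(common_dir: Path) -> list[str]:
    """Return all queued refs (deduplicated, in order) and empty the queue."""
    with open(common_dir / QUEUE_NAME, "a+", encoding="utf-8") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        f.seek(0)
        refs = [json.loads(line)["ref"] for line in f if line.strip()]
        f.seek(0)
        f.truncate(0)
    return list(dict.fromkeys(refs))


def push_argv(remote: str, refs: list[str]) -> list[str]:
    """Build a single git push that covers every queued ref."""
    return ["git", "push", remote, *[f"{ref}:{ref}" for ref in refs]]
```

Because only enqueued refs are pushed, refs that were fetched purely for local reads never end up in a push.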
Reads And Listings
Branch-scoped flows should continue to use code commit history (as opposed to checkpoint commit history) to find relevant checkpoint IDs, then resolve those IDs to checkpoint refs.
If needed refs are missing locally, the CLI should fetch those refs directly and efficiently rather than asking users to run Git commands.
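Fetching only what is missing might be sketched as follows: compare the refs a flow needs against what git for-each-ref reports locally, then issue one git fetch with explicit refspecs. The function names and the default ref prefix argument are assumptions. Note that git fetch fails if a named ref does not exist on the remote, so callers would need to handle that case.

```python
import subprocess

CHECKPOINT_PREFIX = "refs/entire/checkpoints/"


def list_local_refs(prefix: str = CHECKPOINT_PREFIX) -> set[str]:
    """List local refs under the checkpoint prefix in the current repository."""
    result = subprocess.run(
        ["git", "for-each-ref", "--format=%(refname)", prefix],
        check=True, capture_output=True, text=True,
    )
    return set(result.stdout.split())


def fetch_argv(remote: str, needed_refs, local_refs):
    """Build one git fetch for the refs that are needed but not present locally."""
    missing = sorted(set(needed_refs) - set(local_refs))
    if not missing:
        return None  # everything is already local, no network round trip needed
    return ["git", "fetch", "--no-tags", remote, *[f"{ref}:{ref}" for ref in missing]]


needed = ["refs/entire/checkpoints/ZN/01KVBJCWYA4YW6J5M9GP655HZN",
          "refs/entire/checkpoints/a3/a3b2c4d5e6f7"]
have = {"refs/entire/checkpoints/a3/a3b2c4d5e6f7"}
print(fetch_argv("origin", needed, have))
```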
Storage-level cleanup and maintenance operations can list local checkpoint refs only for now. No local checkpoint index is needed in the first version.
Out Of Scope
This proposal does not decide:
- exact retry or partial-failure behavior
- queue compaction or repair mechanics
- remote pruning behavior
- local indexing
- detailed implementation sequencing
- session-level refs
- imported-transcript association refs
Future Extensions
Later work may add refs such as refs/entire/sessions/... and refs/entire/commits/....
Session refs could expose session-level checkpoint data directly. Commit association refs could link imported transcripts to existing code commits without rewriting Git history.
Those are useful future directions, but they should not complicate the checkpoint-ref rollout.
Review Checklist
Before moving this proposal into implementation planning, we should confirm that it addresses the problems with the current monolithic branch design:
- Checkpoint storage has cleanup boundaries instead of one ever-growing branch.
- GitHub no longer treats entire/checkpoints/v1 as a normal branch that invites pull requests.
- New repositories cannot accidentally get entire/checkpoints/v1 selected as the default branch.
- Ref storage is authoritative during rollout, so new-storage bugs are visible.
- The temporary entire/checkpoints/v1 mirror exists only for rollback and downgrade safety.