Milestones

0.2.0
Unified KV cache (epic): radix-tree-indexed paged attention with shared physical KV blocks, vLLM/SGLang-style. Merge the existing prompt-prefix radix cache and the paged block pool into one refcounted, copy-on-write storage system, then add a real block-pool tensor layout and a gathering attention path.
No due date
2/11 issues closed (18% complete): 9 open, 2 closed
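
To make the "refcounted, copy-on-write" part of the epic concrete, here is a minimal sketch of a block pool in which several sequences can point at the same physical block and a write to a shared block triggers a copy. This is an illustration only, not the project's implementation: `BlockPool` and its methods are hypothetical names, plain Python lists stand in for KV tensors, and the radix-tree index and the gathering attention path are left out.

```python
from dataclasses import dataclass, field


@dataclass
class BlockPool:
    """Toy pool of fixed-size KV blocks with refcounts (hypothetical names)."""
    num_blocks: int
    block_size: int
    refcount: list = field(init=False)
    data: list = field(init=False)
    free: list = field(init=False)

    def __post_init__(self):
        self.refcount = [0] * self.num_blocks
        # Each block holds up to block_size entries; lists stand in for tensors.
        self.data = [[] for _ in range(self.num_blocks)]
        # Reversed so that pop() hands out block 0 first.
        self.free = list(range(self.num_blocks - 1, -1, -1))

    def alloc(self):
        """Take a free block and give it a single owner."""
        if not self.free:
            raise MemoryError("block pool exhausted")
        bid = self.free.pop()
        self.refcount[bid] = 1
        self.data[bid] = []
        return bid

    def share(self, bid):
        """Add another owner to an existing block (e.g. a shared prefix)."""
        self.refcount[bid] += 1
        return bid

    def release(self, bid):
        """Drop one owner; return the block to the free list at zero."""
        self.refcount[bid] -= 1
        if self.refcount[bid] == 0:
            self.free.append(bid)

    def append(self, bid, entry):
        """Append to a block, copying it first if it is shared.

        Returns the id of the block that was actually written, which
        differs from bid when a copy-on-write happened.
        """
        if len(self.data[bid]) >= self.block_size:
            raise ValueError("block full")
        if self.refcount[bid] > 1:
            new = self.alloc()
            self.data[new] = list(self.data[bid])
            self.release(bid)
            bid = new
        self.data[bid].append(entry)
        return bid


pool = BlockPool(num_blocks=4, block_size=4)
seq_a = pool.alloc()
pool.append(seq_a, "k0")
pool.append(seq_a, "k1")
seq_b = pool.share(seq_a)         # second sequence reuses the same prefix block
seq_b = pool.append(seq_b, "k2")  # write to a shared block: copy, then append
```

After the last call, the two sequences own separate blocks: the original keeps its two entries and the copy holds three. A real block pool would copy tensor slices rather than lists, and the radix tree would decide which blocks a new prompt can share.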