
lance-context

Multimodal, versioned context storage for agentic workflows built on top of Lance.

Lance Context gives AI agents a durable memory that can store text, binary payloads (images, Arrow tables, etc.), and semantic embeddings in a single columnar table. Every append produces a new Lance dataset version, so you can time-travel to prior checkpoints, branch off experiments, or reproduce conversations. The project ships with both a Rust API and a thin, Pythonic wrapper that integrates easily with orchestration frameworks.

Why another context store?

Key motivations inspired by the broader Lance roadmap:

  • Multimodal first – store text, images, and structured data together, keeping the original bytes plus typed metadata.
  • Version aware – each append creates an immutable snapshot, enabling time-travel, branching, and auditability for long-running agents.
  • Searchable semantics – embeddings are managed alongside content so you can run Lance vector search without leaving the dataset.
  • Columnar performance – backed by the Lance file format, giving fast analytics, compaction, and cloud-friendly storage.
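The "searchable semantics" point means embeddings live next to the content they describe. As a rough, dependency-free illustration (not the Lance Context API, which delegates this to Lance's native vector index), vector search boils down to ranking stored embeddings by similarity to a query vector:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy records: (id, embedding) pairs standing in for stored context rows.
records = [
    ("rec-1", [1.0, 0.0]),
    ("rec-2", [0.6, 0.8]),
    ("rec-3", [0.0, 1.0]),
]
query = [1.0, 0.1]

# Rank records by similarity to the query, most similar first.
ranked = sorted(records, key=lambda r: cosine(query, r[1]), reverse=True)
print([rid for rid, _ in ranked])  # -> ['rec-1', 'rec-2', 'rec-3']
```

Keeping embeddings in the same columnar table as the payloads is what lets Lance run this kind of search with an index instead of a full scan.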

Features

  • Unified schema for agent messages (ContextRecord) with optional embeddings and metadata.
  • Automatic versioning via Lance manifests with checkout(version) support.
  • Background compaction to optimize storage and read performance.
  • Remote persistence on any object_store backend (S3, GCS, Azure Blob, ...) via the generic storage_options dict, aligned with lance and lance-graph.
  • Python API (lance_context.api.Context) aligned with the Rust implementation.
  • Integration tests that exercise real persistence, image serialization, and version rollbacks.

Project layout

crates/lance-context-core  # Pure Rust context engine (no Python deps)
crates/lance-context       # Re-export crate consumed by downstream clients/bindings
python/                    # PyO3 bindings, wheel build, and pytest suite
python/tests/              # High-level integration tests

Getting started

Install the Python package (wheel publishing is coming soon; until then, install from source via make install as described in Contributing):

pip install lance-context

Then follow the usage examples below to create a Context, append entries, and time-travel through versions.

Usage

Python

from pathlib import Path
from lance_context.api import Context

uri = Path("context.lance").as_posix()
ctx = Context.create(uri)

# Add multimodal entries
ctx.add("user", "Where should I travel in spring?")

from PIL import Image
image = Image.new("RGB", (2, 2), color="teal")
ctx.add("assistant", image)

print("Current version:", ctx.version())

# Time-travel to prior state
first_version = ctx.version()
ctx.add("assistant", "Let me fetch suggestions…")
ctx.checkout(first_version)

print("Entries after checkout:", ctx.entries())

# Remote persistence on any object_store backend uses a generic `storage_options`
# dict, matching the conventions used by `lance` and `lance-graph`.
#
# Amazon S3 (and S3-compatible endpoints like MinIO / moto):
ctx = Context.create(
    "s3://my-bucket/context.lance",
    storage_options={
        "aws_access_key_id": "minioadmin",
        "aws_secret_access_key": "minioadmin",
        "aws_region": "us-east-1",
        "aws_endpoint_url": "http://localhost:9000",  # optional
        "aws_allow_http": "true",                      # optional
    },
)
# Environment variables (AWS_ACCESS_KEY_ID, ...) are picked up by lance when
# `storage_options` isn't provided; pass overrides only when you need them.

# Google Cloud Storage:
ctx = Context.create(
    "gs://my-bucket/context.lance",
    storage_options={
        # Pick one: inline service-account JSON, path to the JSON file, or ADC.
        "google_service_account_key": service_account_json,
        # "google_service_account_path": "/path/to/sa.json",
        # "google_application_credentials": "/path/to/adc.json",
    },
)

# Azure Blob Storage:
ctx = Context.create(
    "az://my-container/context.lance",
    storage_options={
        "azure_storage_account_name": "...",
        "azure_storage_account_key": "...",
    },
)

# Background Compaction - optimize storage and read performance
ctx = Context.create(
    "context.lance",
    enable_background_compaction=True,  # Enable automatic compaction
    compaction_interval_secs=300,       # Check every 5 minutes
    compaction_min_fragments=10,        # Trigger when 10+ fragments exist
    quiet_hours=[(22, 6)],              # Skip compaction 10pm-6am
)

# Manual compaction control
for i in range(100):
    ctx.add("user", f"message {i}")  # Creates many small fragments

# Check compaction status
stats = ctx.compaction_stats()
print(f"Fragments: {stats['total_fragments']}")

# Manually trigger compaction
metrics = ctx.compact()
print(f"Compaction removed {metrics['fragments_removed']} fragments")
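The quiet_hours windows above are hour ranges that may wrap past midnight, e.g. (22, 6) covers 10pm through 6am. A minimal sketch of how such a check could work (the helper name and half-open-interval semantics are assumptions for illustration, not the library's actual implementation):

```python
def in_quiet_hours(hour, quiet_hours):
    """Return True if `hour` (0-23) falls inside any (start, end) window.

    Windows are treated as half-open [start, end) on a 24h clock and may
    wrap past midnight, e.g. (22, 6) covers 22:00 through 05:59.
    """
    for start, end in quiet_hours:
        if start <= end:
            # Same-day window, e.g. (13, 17).
            if start <= hour < end:
                return True
        else:
            # Window wraps midnight, e.g. (22, 6).
            if hour >= start or hour < end:
                return True
    return False

print(in_quiet_hours(23, [(22, 6)]))  # True  (inside wrap-around window)
print(in_quiet_hours(12, [(22, 6)]))  # False (midday, outside window)
```

With a check like this, the background compactor can simply skip its periodic run whenever the current hour lands in a quiet window.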

Rust

use lance_context::{ContextStore, ContextRecord, StateMetadata};
use chrono::Utc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut store = ContextStore::open("context.lance").await?;
    let record = ContextRecord {
        id: "run-1-1".into(),
        run_id: "run-1".into(),
        created_at: Utc::now(),
        role: "user".into(),
        state_metadata: Some(StateMetadata {
            step: Some(1),
            active_plan_id: None,
            tokens_used: None,
            custom: None,
        }),
        content_type: "text/plain".into(),
        text_payload: Some("hello world".into()),
        binary_payload: None,
        embedding: None,
    };
    store.add(&[record]).await?;
    println!("Current version {}", store.version());
    Ok(())
}

Testing

  • make test – Python pytest suite (including persistence integration tests).
  • cargo test --manifest-path crates/lance-context-core/Cargo.toml – Rust unit tests.
  • python/.venv/bin/ruff check python/ and python/.venv/bin/pyright – linting/type checks.

Roadmap

We are tracking future enhancements as GitHub issues.

Contributions are welcome; feel free to comment on existing issues or open your own proposals.

Contributing

  1. Fork and clone the repository.
  2. Create a feature branch off main.
  3. Set up the development environment:
    make venv      # creates python/.venv using uv
    make install   # installs the package in editable mode with test extras
    make test      # runs pytest (python/tests/)
    cargo test --manifest-path crates/lance-context-core/Cargo.toml
  4. Run linting/type checks: python/.venv/bin/ruff check python/, python/.venv/bin/pyright, and ~/.cargo/bin/cargo fmt -- --check.
  5. Open a Pull Request with a clear summary of the change.

License

Licensed under the Apache License, Version 2.0. See LICENSE for details.