Shared working memory for collaborative AI agents
- Concept
- Architecture
- Prerequisites
- Installation
- Configuration
- Getting Started
- MCP Tools
- Graph Bridge
- Web Interface
- MCP Integration
- CLI and Shell
- Tests
- Security
- Project Structure
- Troubleshooting
Live Memory is an MCP (Model Context Protocol) server that provides Memory Bank as a Service for AI agents. Multiple agents collaborate on the same project by sharing a common working memory.
graph-memory = LONG-TERM memory (documents → Knowledge Graph → Vector RAG)
live-memory = WORKING memory (live notes → LLM → Structured Memory Bank)
| Mode | Description | Analogy |
|---|---|---|
| 🔴 Live | Real-time notes (observations, decisions, todos...) append-only | Shared whiteboard |
| 📘 Bank | LLM consolidation into structured Markdown files based on rules | Structured project log |
| Problem | Live Memory Solution |
|---|---|
| Agents lose context between sessions | bank_read_all → complete context in 1 call |
| Multi-agent collaboration is impossible | Append-only notes, no conflicts, cross-visibility |
| Manual consolidation is tedious | LLM transforms raw notes into structured documentation |
| Memory scattered in local files | Central S3 point, accessible from everywhere |
| No link with long-term memory | 🌉 Graph Bridge pushes the bank into a knowledge graph |
Recent research on LLM-based multi-agent systems (Tran et al., 2025 — Multi-Agent Collaboration Mechanisms: A Survey of LLMs) identifies shared memory as a fundamental component. In their formal framework, a multi-agent system is defined by agents (A), a shared environment (E), and collaboration channels (C). The authors emphasize that LLMs are inherently isolated algorithms, not designed to collaborate — they need a shared memory infrastructure to coordinate their actions.
Live Memory + Graph Memory directly implements this architecture:
┌─────────────────────────────────────────────────────────────┐
│ Shared Environment E │
│ │
│ ┌──────────────────┐ LLM ┌──────────────────────┐ │
│ │ Live │ ──────► │ Bank │ │
│ │ Real-time notes │ consolid│ Structured working │ │
│ │ (append-only) │ -ates │ memory │ │
│ └──────────────────┘ └──────────┬───────────┘ │
│ │ │
│ graph_push │
│ (MCP Streamable HTTP) │
│ │ │
│ ┌──────────▼───────────┐ │
│ │ 🌐 Graph Memory │ │
│ │ Knowledge Graph │ │
│ │ (entities, relations│ │
│ │ embeddings, RAG) │ │
│ └──────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
| Level | Service | Duration | Content | Usage |
|---|---|---|---|---|
| Working Memory | Live Memory | Session / project | Raw notes + consolidated Markdown bank | Operational context, daily coordination |
| Long-term Memory | Graph Memory | Permanent | Entities + relations + vector embeddings | Searchable knowledge base in natural language |
The Graph Bridge (graph_push) is the collaboration channel between these two levels. Following the late-stage collaboration pattern described in the literature (sharing consolidated outputs as inputs to another system), it transforms working documentation (Markdown) into structured knowledge (an entity/relation graph).
Why two levels? One level is not enough:
- Working memory alone is ephemeral — it disappears when the project ends
- Knowledge graph alone is too heavy for quick daily notes
- The bridge between the two allows agents to work fast (live notes) while capitalizing knowledge (graph)
Specifically, agents can:
- Write quickly without friction (live-memory, append-only, ~50ms)
- Automatically consolidate via LLM into structured documentation (bank, ~15s)
- Persist knowledge in a searchable graph (graph-memory, ~2min)
- Query the graph in natural language to retrieve information from past projects
Agent Cline Agent Claude Agent X
│ │ │
└────────┬──────────┘ │
│ │
▼ MCP Protocol (Streamable HTTP) ▼
┌────────────────────────────────────────┐
│ Caddy WAF (Coraza CRS) │
│ Rate Limiting • TLS • OWASP CRS │
└────────────┬───────────────────────────┘
│
┌────────────┴───────────────────┐
│ Live Memory MCP (:8002) │
│ 43 tools • Auth Bearer │
│ LLM Consolidation │
└──────┬──────────┬──────┬───────┘
│ │ │
┌──────┴──┐ ┌────┴───┐ │
│ S3 │ │ LLMaaS │ │ MCP Streamable HTTP
│Dell ECS │ │ CT API │ │ (optional)
└─────────┘ └────────┘ │
┌───────────┴────────────┐
│ Graph Memory │
│ (long-term memory) │
│ Neo4j + Qdrant │
└────────────────────────┘
Minimal Stack: S3 + LLM. No local database. Optional: connection to Graph Memory for long-term memory (knowledge graph).
- Docker >= 24.0 + Docker Compose v2
- Python 3.11+ (for CLI, optional)
- A compatible S3 storage (Cloud Temple Dell ECS, AWS, MinIO)
- An OpenAI API compatible LLM (Cloud Temple LLMaaS, OpenAI, etc.)
git clone https://github.com/Cloud-Temple/live-memory.git
cd live-memory
cp .env.example .env

Edit .env with your values (see Configuration).
# Build images (WAF + MCP server)
docker compose build
# Start services
docker compose up -d
# Check status
docker compose ps
# Health check
curl -s http://localhost:8080/health

# Install dependencies
uv pip install -e .
# Run server
python -m live_mem

uv pip install -e .
# Health check via CLI
python scripts/mcp_cli.py health
# Or full E2E test (creates space, writes notes, consolidates)
python scripts/test_recette.py

| Service | Port | Description |
|---|---|---|
| WAF | 8080 | Only exposed port (Caddy WAF → Live Memory) |
| MCP Server | 8002 | Internal Docker network only |
Edit .env. All variables are documented in .env.example.
| Variable | Description | Example |
|---|---|---|
| S3_ENDPOINT_URL | S3 endpoint URL | https://takinc5acc.s3.fr1.cloud-temple.com |
| S3_ACCESS_KEY_ID | S3 access key | AKIA... |
| S3_SECRET_ACCESS_KEY | S3 secret key | wJal... |
| S3_BUCKET_NAME | Bucket name | live-mem |
| S3_REGION_NAME | S3 region | fr1 |
| LLMAAS_API_URL | LLM API URL (must include /v1) | https://api.ai.cloud-temple.com/v1 |
| LLMAAS_API_KEY | LLM API key | sk-... |
| ADMIN_BOOTSTRAP_KEY | Admin bootstrap key (≥ 32 chars) | my-secret-key-change-me |
The consolidator uses an LLM (OpenAI-compatible API) to transform live notes into structured bank files.
| Variable | Default | Description |
|---|---|---|
| LLMAAS_MODEL | qwen3.5:27b | LLM model name as exposed by the provider |
| LLMAAS_CONTEXT_WINDOW | 131072 | TOTAL context window of the model (input + output combined, in tokens). Qwen3 235B = 128K |
| LLMAAS_MAX_TOKENS | 16384 | Max OUTPUT tokens per request. The consolidator adjusts dynamically: output = min(MAX_TOKENS, CONTEXT_WINDOW - input) |
| LLMAAS_TEMPERATURE | 0.3 | LLM creativity (0.0 = deterministic, 1.0 = very creative) |
| PROXY_URL | (none) | Outbound HTTP proxy (e.g. http://10.0.0.1:3128). Custom variable (not HTTP_PROXY), injected manually into boto3 (S3) and httpx (LLM). Not supported for Graph Memory connections. |
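The dynamic adjustment described for LLMAAS_MAX_TOKENS can be sketched in a few lines. This is a minimal illustration, not the consolidator's actual code: the function name `output_budget`, the pre-computed `input_tokens` argument, and the clamp at zero are assumptions; only the `min(MAX_TOKENS, CONTEXT_WINDOW - input)` rule and the default values come from the table above.

```python
def output_budget(input_tokens: int,
                  max_tokens: int = 16384,
                  context_window: int = 131072) -> int:
    """Output token budget: min(MAX_TOKENS, CONTEXT_WINDOW - input)."""
    remaining = context_window - input_tokens
    # Clamping at zero is an assumption, so an oversized prompt
    # never produces a negative budget.
    return max(0, min(max_tokens, remaining))


print(output_budget(10_000))   # small prompt: capped by LLMAAS_MAX_TOKENS
print(output_budget(120_000))  # large prompt: capped by the remaining window
```

With the defaults, a 120,000-token prompt leaves 131072 - 120000 = 11072 tokens for the answer, which is below the 16384 cap.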
| Variable | Default | Description |
|---|---|---|
| MCP_SERVER_PORT | 8002 | MCP server listening port |
| MCP_SERVER_DEBUG | false | Detailed logs (full error messages) |
| CONSOLIDATION_TIMEOUT | 600 | Timeout per LLM call (seconds) |
| CONSOLIDATION_MAX_NOTES | 200 | Max notes per consolidation |
| CONSOLIDATION_BATCH_SIZE | 5 | Notes per LLM batch (small = precise, large = faster) |
| CONSOLIDATION_COOLDOWN_SECONDS | 60 | Per-space anti-spam cooldown for bank_consolidate (0 disables) |
| CONSOLIDATION_VALIDATION_ENABLED | false | Optional post-consolidation check for unattributed claims |
| CONSOLIDATION_VALIDATION_MAX_EXAMPLES | 20 | Max examples returned by the validation pass |
| COMPACT_THRESHOLD | 0.6 | Auto-compaction trigger (0.6 = compact if bank > 60% of budget) |
| BANK_FILE_MAX_SIZE | 15360 | Max size per bank file (bytes, 15 KB). Above = compaction candidate |
| RESPONSE_MAX_BYTES | 524288 | Max non-MCP response body size before truncation |
| API_TOOL_MAX_BODY_BYTES | 1048576 | Max request body accepted by /api/tool |
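To make the interaction between CONSOLIDATION_MAX_NOTES and CONSOLIDATION_BATCH_SIZE concrete, here is a small sketch of how a note list could be capped and then split into LLM batches. The helper `make_batches` is hypothetical, and taking the first `max_notes` notes in list order is an assumption; the real consolidator may select and order notes differently.

```python
from typing import Iterator


def make_batches(notes: list[str],
                 max_notes: int = 200,
                 batch_size: int = 5) -> Iterator[list[str]]:
    """Cap the notes at max_notes, then yield them in groups of batch_size."""
    selected = notes[:max_notes]  # assumption: keep the first max_notes notes
    for start in range(0, len(selected), batch_size):
        yield selected[start:start + batch_size]


notes = [f"note-{i}" for i in range(12)]
print([len(batch) for batch in make_batches(notes)])
```

Under these assumptions, a full consolidation of 200 notes with the default batch size results in 40 LLM calls, which is why a larger batch size is faster but gives the model more to reconcile per call.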
docker compose up -d
docker compose ps # Check status
docker compose logs -f live-mem-service --tail 50 # Logs

43 tools exposed via the MCP protocol (Streamable HTTP), divided into 7 categories.
| Tool | Parameters | Description |
|---|---|---|
| system_health | (none) | Health status (S3, LLMaaS, number of spaces) |
| system_whoami | (none) | 👤 Current token identity (name, permissions, spaces) |
| system_about | (none) | Service identity (version, tools, capabilities) |
| Tool | Parameters | Description |
|---|---|---|
| space_create | space_id, description, rules, owner? | Creates a space with its rules (bank structure) |
| space_update | space_id, description?, owner? | Updates description and/or owner |
| space_update_rules | space_id, rules | 📜 Updates space rules (admin only) |
| space_list | (none) | Lists spaces accessible by current token |
| space_info | space_id | Detailed info (notes, bank, consolidation) |
| space_rules | space_id | Reads immutable space rules |
| space_summary | space_id | Complete summary: rules + bank + stats (agent startup) |
| space_export | space_id | tar.gz export in base64 |
| space_delete | space_id, confirm | Deletes the space ( |
| Tool | Parameters | Description |
|---|---|---|
| live_note | space_id, category, content, tags? | Writes a timestamped note (agent = token name). Categories: observation, decision, todo, insight, question, progress, issue |
| live_read | space_id, limit?, category?, agent? | Reads live notes (optional filters) |
| live_search | space_id, query, limit? | Full-text search in notes |
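As an illustration of the live_note parameters listed above, the following sketch builds the argument dictionary for a call and rejects unknown categories before anything is sent. The helper `build_live_note` and the client-side check are assumptions added for clarity (the server performs its own validation), and the exact format expected for `tags` is not specified here, so it is passed through as an opaque string.

```python
from typing import Optional

# Categories listed in the live_note description.
ALLOWED_CATEGORIES = {
    "observation", "decision", "todo", "insight",
    "question", "progress", "issue",
}


def build_live_note(space_id: str, category: str, content: str,
                    tags: Optional[str] = None) -> dict:
    """Return the arguments for a live_note call, validating the category."""
    if category not in ALLOWED_CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    args = {"space_id": space_id, "category": category, "content": content}
    if tags is not None:
        args["tags"] = tags  # optional parameter, format assumed
    return args


print(build_live_note("my-project", "decision", "Use S3 for storage"))
```

The resulting dictionary can be passed as the second argument of `session.call_tool("live_note", ...)` in an MCP client session.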
| Tool | Parameters | Description |
|---|---|---|
| bank_read | space_id, filename | Reads a bank file (supports subfolders: personaProfiles/buyer.md) |
| bank_read_all | space_id | Reads entire bank in one request (🚀 agent startup) |
| bank_list | space_id | Lists bank files with relative paths (without content) |
| bank_consolidate | space_id, agent? | 🧠 Enqueues async LLM consolidation. Call once; do not watch/poll unless explicitly requested |
| bank_consolidation_status | job_id | Manual-only status check for a job returned by bank_consolidate |
| bank_consolidation_queues | space_ids? | Read-only summary of consolidation lanes by space |
| bank_stale_spaces | min_notes?=5, min_age_days?=5, space_ids? | 🚨 Lists spaces with ≥N unconsolidated notes whose oldest is ≥D days old (supervision) |
| bank_compact | space_id, dry_run? | 🔧 Compacts oversized bank files via LLM. dry_run=True by default (admin) |
| bank_repair | space_id, dry_run? | 🔧 Repairs corrupted filenames (Unicode, parasitic prefixes). dry_run=True by default (admin) |
| bank_write | space_id, filename, content | ✏️ Writes/replaces a bank file directly, bypassing LLM consolidation (admin) |
| bank_delete | space_id, filename | 🗑️ Deletes a bank file + its Unicode duplicates (admin, irreversible) |
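The staleness rule behind bank_stale_spaces (at least N unconsolidated notes, the oldest at least D days old) can be expressed as a simple predicate. This is a sketch under assumptions: the function `is_stale` and its list-of-timestamps input are hypothetical, and the server may compute note age differently.

```python
from datetime import datetime, timedelta, timezone


def is_stale(note_timestamps: list[datetime], now: datetime,
             min_notes: int = 5, min_age_days: int = 5) -> bool:
    """True if there are >= min_notes pending notes and the oldest is >= min_age_days old."""
    if not note_timestamps or len(note_timestamps) < min_notes:
        return False
    oldest = min(note_timestamps)
    return now - oldest >= timedelta(days=min_age_days)


now = datetime(2025, 1, 10, tzinfo=timezone.utc)
stamps = [now - timedelta(days=d) for d in (6, 3, 2, 1, 0)]
print(is_stale(stamps, now))
```

Both conditions must hold: a space with many recent notes, or with a single very old note, is not reported under the default thresholds.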
| Tool | Parameters | Description |
|---|---|---|
| graph_connect | space_id, url, token, memory_id, ontology? | Connects a space to Graph Memory. Tests connection, creates memory if needed. Default ontology: general |
| graph_push | space_id | Synchronizes bank → graph. Smart delete + re-ingest, orphan cleanup. ~30s/file |
| graph_status | space_id | Connection status + graph stats (documents, entities, relations, top entities, documents list) |
| graph_disconnect | space_id | Disconnects (data remains in graph) |
| Tool | Parameters | Description |
|---|---|---|
| backup_create | space_id, description? | Creates a full snapshot on S3 |
| backup_list | space_id? | Lists available backups |
| backup_restore | backup_id | Restores a backup (space must not exist) |
| backup_download | backup_id | Download as tar.gz base64 |
| backup_delete | backup_id | Deletes a backup |
| Tool | Parameters | Description |
|---|---|---|
| admin_create_token | name, permissions, space_ids?, expires_in_days?, email? | Creates a token ( |
| admin_list_tokens | (none) | Lists active tokens |
| admin_revoke_token | token_hash | Revokes a token (makes it unusable) |
| admin_delete_token | token_hash | Physically deletes a token from the registry ( |
| admin_purge_tokens | revoked_only? | Bulk purge: revoked only (default) or all tokens |
| admin_update_token | token_hash, space_ids, action | Modifies token spaces (add/remove/set) |
| admin_bulk_update_tokens | filters, delta, confirm? | Bulk token update with filters and add/remove/set operations |
| admin_gc_notes | space_id?, max_age_days?, confirm?, delete_only? | Garbage Collector: cleans orphaned notes |
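The add/remove/set actions of admin_update_token map naturally onto set operations. The sketch below shows the intended effect on a token's space list; `apply_space_action` is a hypothetical helper, and the sorted output is only a convenience for display, not a documented behavior of the server.

```python
def apply_space_action(current: list[str], space_ids: list[str],
                       action: str) -> list[str]:
    """Return the token's new space list after an add, remove, or set action."""
    existing, delta = set(current), set(space_ids)
    if action == "add":
        result = existing | delta
    elif action == "remove":
        result = existing - delta
    elif action == "set":
        result = delta
    else:
        raise ValueError(f"unknown action: {action!r}")
    return sorted(result)


print(apply_space_action(["alpha", "beta"], ["gamma"], "add"))
print(apply_space_action(["alpha", "beta"], ["beta"], "remove"))
print(apply_space_action(["alpha", "beta"], ["gamma"], "set"))
```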
Live Memory can push its Memory Bank into a Graph Memory instance for long-term memory. The knowledge graph extracts entities, relations, and embeddings from bank files.
1. graph_connect(space_id, url, token, memory_id, ontology="general")
└─ Tests connection, creates Graph Memory if needed
2. bank_consolidate(space_id)
└─ Queues async consolidation; call once and do not watch/poll unless explicitly requested
3. graph_push(space_id)
├─ Lists documents in Graph Memory
├─ For each modified bank file:
│ ├─ document_delete (removes orphaned entities)
│ └─ memory_ingest (complete graph recalculation)
├─ Cleans deleted bank documents
└─ Updates metrics (last_push, push_count)
4. graph_status(space_id)
└─ Stats: 79 entities, 61 relations, top entities, documents...
Each push is a complete refresh of the graph for that file. Existing files are deleted then re-ingested so Graph Memory recalculates entities, relations, and embeddings with up-to-date content.
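The push logic in step 3 can be pictured as a small planning function that decides which bank files to delete and re-ingest and which graph documents to clean up. This is only a sketch: detecting a "modified" file by comparing SHA-256 content hashes is an assumption, and `plan_push` and its inputs are hypothetical names, not part of the Live Memory code base.

```python
import hashlib


def plan_push(bank_files: dict[str, str],
              graph_hashes: dict[str, str]) -> dict[str, list[str]]:
    """Compare bank content with what the graph already holds.

    bank_files:   filename -> current Markdown content
    graph_hashes: filename -> hash recorded at the last push (assumed)
    """
    reingest = []
    for name, content in bank_files.items():
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if graph_hashes.get(name) != digest:
            reingest.append(name)  # new or modified: delete, then re-ingest
    # Documents still in the graph whose bank file no longer exists.
    cleanup = [name for name in graph_hashes if name not in bank_files]
    return {"reingest": sorted(reingest), "cleanup": sorted(cleanup)}


unchanged = hashlib.sha256(b"# Context").hexdigest()
plan = plan_push(
    {"context.md": "# Context", "progress.md": "# Progress v2"},
    {"context.md": unchanged, "progress.md": "stale-hash", "old.md": "x"},
)
print(plan)
```

In this example only progress.md is re-ingested, context.md is skipped because its content is unchanged, and old.md is cleaned from the graph.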
| Ontology | Usage |
|---|---|
| general (default) | Versatile: FAQ, specs, certifications, CSR |
| legal | Legal documents, contracts |
| cloud | Cloud infrastructure, product sheets |
| managed-services | Managed services, outsourcing |
| presales | Pre-sales, RFP/RFI, proposals |
Live Memory exposes a web interface on /live to visualize memory spaces in real-time.
http://localhost:8080/live
| Zone | Content |
|---|---|
| 📊 Dashboard (left) | Space info, consolidation (date + counters), live/bank stats, colored agents, categories with %, Markdown rules, Graph Memory |
| 🔴 Live Timeline (top-right) | Live notes grouped by date (Today/Yesterday/date), cards with agent + category + Markdown |
| 📘 Bank Viewer (bottom-right) | Consolidated file tabs, Markdown rendering with marked.js |
┌──────────────┬────────────────────────────┐
│ 📊 Dashboard│ 🔴 Live Timeline │
│ (info, │ (auto-refresh, date group)│
│ agents, ├────────────────────────────┤
│ rules...) │ 📘 Bank (Markdown tabs) │
└──────────────┴────────────────────────────┘
- Configurable: 3s / 5s / 10s / 30s / manual
- Anti-flicker: only re-renders DOM if data has changed
- Pulsing green dot with last refresh timestamp
- Space selection → immediate loading (no button needed)
| Endpoint | Description |
|---|---|
| GET /api/spaces | List of spaces |
| GET /api/space/{id} | Complete info (meta + rules + stats + graph-memory) |
| GET /api/live/{id} | Live notes (filters: ?agent=, ?category=, ?limit=) |
| GET /api/bank/{id} | Bank file list |
| GET /api/bank/{id}/{filename} | Bank file content |
/api/* endpoints require a Bearer Token. /live page and /static/* files are public.
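For scripts that read the same data as the web interface, a request to these endpoints only needs the Bearer header and optional query filters. The sketch below builds such a request with the standard library; `build_live_request` is a hypothetical helper, the base URL reuses the local example from this README, and the shape of the response body is not documented here, so it is not parsed.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_live_request(base_url: str, space_id: str, token: str,
                       **filters) -> Request:
    """Build a GET /api/live/{id} request with optional agent/category/limit filters."""
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    url = f"{base_url}/api/live/{quote(space_id, safe='')}"
    if query:
        url += f"?{query}"
    return Request(url, headers={"Authorization": f"Bearer {token}"})


req = build_live_request("http://localhost:8080", "my-project",
                         "lm_YOUR_TOKEN", agent="agent-cline", limit=20)
print(req.full_url)
```

The request can then be sent with `urllib.request.urlopen(req)` once the stack is running and the token is valid.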
A full administration console is available at /admin, exposing all 43 MCP tools through a web interface:
http://localhost:8080/admin
| Section | Features |
|---|---|
| 📊 Dashboard | Health status (clickable → service details), spaces count, active tokens, version/uptime, identity bar |
| 📂 Spaces | CRUD, info/rules modals, explore link, delete with confirmation |
| 🔑 Tokens | Create/update/revoke/delete, visual space chips with delta calculation |
| 🔍 Explorer | Live notes + bank files side-by-side for any space |
| 💾 Backups | Create/restore/delete, "Backup All", dynamic columns |
| 🌉 Graph Bridge | Status check, push, disconnect per space |
| 🧹 Maintenance | Consolidate, compact, repair, GC, purge — single space selector, compact action list |
- Auth: requires a valid token (same as /live), session via HttpOnly cookie
- CSP-safe: zero inline handlers, all via data-action + event delegation
- Upload Rules: file picker (.md) or paste directly from the Rules modal
📖 Full Guide: See GUIDE_INTEGRATION_CLINE.md for the step-by-step guide (Cline configuration, custom instructions, workflow, multi-agents, troubleshooting).
In Cline's MCP settings (cline_mcp_settings.json):
{
"mcpServers": {
"live-memory": {
"url": "http://localhost:8080/mcp",
"headers": {
"Authorization": "Bearer lm_YOUR_TOKEN"
}
}
}
}

To configure the Custom Instructions for your agent, copy the clinerules.md file into your Cline global Custom Instructions (or into a .clinerules/ directory in your project). You only need to change two values:
- The MCP server name (as configured in cline_mcp_settings.json, e.g. my-live-mem)
- The name of your memory space (the ID passed to space_create, e.g. my-project)
The agent name is auto-detected from the authentication token — nothing else to configure.
💡 Ready-to-use template: clinerules.md (copy and customize the 2 bold values)

📖 Detailed guide: Cline Integration & Custom Instructions Guide
In claude_desktop_config.json:
{
"mcpServers": {
"live-memory": {
"url": "http://localhost:8080/mcp",
"headers": {
"Authorization": "Bearer lm_YOUR_TOKEN"
}
}
}
}

import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client


async def example():
    headers = {"Authorization": "Bearer your_token"}
    # The transport yields a read stream, a write stream, and a session-id getter
    async with streamablehttp_client("http://localhost:8080/mcp", headers=headers) as (r, w, _):
        async with ClientSession(r, w) as session:
            await session.initialize()
            # Load all context
            result = await session.call_tool("bank_read_all", {
                "space_id": "my-project"
            })
            # Write a note
            await session.call_tool("live_note", {
                "space_id": "my-project",
                "category": "observation",
                "content": "Build passing in CI"
            })


asyncio.run(example())

pip install click rich prompt-toolkit "mcp[cli]>=1.8.0"
export MCP_URL=http://localhost:8080
export MCP_TOKEN=your_token

python scripts/mcp_cli.py health
python scripts/mcp_cli.py whoami # Current token identity
python scripts/mcp_cli.py about
python scripts/mcp_cli.py space list
python scripts/mcp_cli.py space create my-project --rules-file rules.md
python scripts/mcp_cli.py live note my-project observation "Build OK"
python scripts/mcp_cli.py bank consolidate my-project
python scripts/mcp_cli.py bank read-all my-project
python scripts/mcp_cli.py token create agent-cline read,write
python scripts/mcp_cli.py graph connect my-project URL TOKEN MEM-ID -o general
python scripts/mcp_cli.py graph push my-project
python scripts/mcp_cli.py graph status my-project
python scripts/mcp_cli.py graph disconnect my-project

python scripts/mcp_cli.py shell

Autocomplete, history, Rich display. See scripts/README.md for full reference.
Unified test script with 4 selectable suites via --suite:
docker compose up -d # Prerequisite
# All suites (44 tests, ~60s)
python scripts/test_recette.py --url http://localhost:8080
# Single suite
python scripts/test_recette.py --suite recette # Agent pipeline (7 tests)
python scripts/test_recette.py --suite isolation # Multi-tenant (18 tests)
python scripts/test_recette.py --suite qualite # MCP tools (19 tests)
# Graph Memory suite (optional, requires running graph-memory)
python scripts/test_recette.py --suite graph \
--graph-url http://host.docker.internal:8080 \
--graph-token your_token
# List available suites
python scripts/test_recette.py --list
# Step-by-step + verbose
python scripts/test_recette.py --suite isolation -v --step --no-cleanup

| Suite | Tests | Description |
|---|---|---|
| recette | 7 | Full pipeline: token → notes → LLM consolidation → bank |
| isolation | 18 | Multi-tenant isolation v0.7.1: cross-space access, backup filtering, auto-add token |
| qualite | 19 | MCP tools regression testing: system, admin, space, live, bank, backup, GC |
| graph | ~8 | Graph Memory bridge: connect, push, status, disconnect (optional) |
- Bearer Token mandatory on all MCP requests
- Bootstrap key to create the first admin token
- SHA-256 Tokens stored on S3 (never in clear text)
- 3 levels: read, write, admin
- Space scope: a token can be limited to specific spaces
- OWASP CRS: SQL/XSS injection, path traversal, SSRF
- Rate Limiting: 200 MCP/min (Streamable HTTP)
- Automatic TLS: Let's Encrypt in production (SITE_ADDRESS=domain.com)
- Non-root container: mcpuser
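Storing only SHA-256 hashes means the server can verify a presented token without ever keeping it in clear text. The sketch below shows the general technique; the function names are hypothetical and the actual registry layout on S3 is not described here, so treat this as an illustration of the principle rather than the project's implementation.

```python
import hashlib
import hmac


def hash_token(token: str) -> str:
    """Return the hex SHA-256 digest that would be stored instead of the token."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def verify_token(presented: str, stored_hash: str) -> bool:
    """Hash the presented token and compare it to the stored digest in constant time."""
    return hmac.compare_digest(hash_token(presented), stored_hash)


stored = hash_token("lm_example_token")
print(verify_token("lm_example_token", stored))
print(verify_token("lm_wrong_token", stored))
```

This also explains why tools such as admin_revoke_token take a token_hash parameter: the hash is the only identifier the registry retains.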
live-memory/
├── src/live_mem/ # Source code (43 MCP tools + web interface)
│ ├── server.py # FastMCP server + middlewares
│ ├── config.py # pydantic-settings configuration
│ ├── auth/ # Authentication
│ │ ├── middleware.py # Auth + Logging + StaticFiles
│ │ └── context.py # check_access, check_write, check_admin
│ ├── static/ # /live web interface
│ │ ├── live.html # SPA (Dashboard + Live + Bank)
│ │ ├── css/live.css # Styles (Cloud Temple theme)
│ │ ├── js/ # 7 JS modules (config, api, app, dashboard, timeline, bank, sidebar)
│ │ └── img/ # Cloud Temple SVG Logo
│ ├── core/ # Business services
│ │ ├── storage.py # S3 dual SigV2/SigV4 (Dell ECS)
│ │ ├── space.py # Memory spaces CRUD
│ │ ├── live.py # Live notes (append-only)
│ │ ├── consolidator.py # LLM Pipeline (4 steps)
│ │ ├── graph_bridge.py # 🌉 Link to Graph Memory
│ │ ├── tokens.py # SHA-256 tokens management
│ │ ├── backup.py # S3 snapshots
│ │ ├── gc.py # Garbage Collector
│ │ ├── locks.py # asyncio locks per space
│ │ └── models.py # Pydantic models
│ └── tools/ # MCP Tools (7 modules)
│ ├── system.py # 3 tools (health, whoami, about)
│ ├── space.py # 9 tools (spaces CRUD)
│ ├── live.py # 3 tools (notes)
│ ├── bank.py # 11 tools (bank + consolidation + supervision + maintenance)
│ ├── graph.py # 4 tools (Graph Bridge)
│ ├── backup.py # 5 tools (snapshots)
│ └── admin.py # 8 tools (tokens + GC + purge + bulk)
├── scripts/ # CLI + Shell + Tests
├── waf/ # Caddy + Coraza WAF
├── clinerules.md # 📋 Cline Custom Instructions template (copy + customize)
├── DESIGN/live-mem/ # 9 architecture documents
├── docker-compose.yml
├── Dockerfile
├── pyproject.toml # Dependencies & project config (uv)
├── uv.lock # uv lockfile
├── VERSION # 2.4.0
├── CHANGELOG.md
└── FAQ.md
docker compose logs live-mem-service --tail 50
docker compose logs waf --tail 20

- Check your token: Authorization: Bearer YOUR_TOKEN
- Bootstrap key is not a token: create a token first via admin_create_token

- Check LLMaaS credentials in .env
- Default timeout is 600s; increase CONSOLIDATION_TIMEOUT if needed
- bank_consolidate returns an async job acknowledgement (running or queued) with next_action="return_to_user_without_polling"; call it once and do not watch/poll unless explicitly requested
- bank_consolidation_status(job_id) remains available for manual status checks only
| Project | Description | Link |
|---|---|---|
| graph-memory | Long-term memory (Knowledge Graph + RAG) | github.com/Cloud-Temple/graph-memory |
Apache License 2.0
Cloud Temple — cloud-temple.com
Developed by Christophe Lesur.
Live Memory v2.4.0 — Shared working memory for collaborative AI agents