CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Common Development Commands

Build

# Build all packages
uv build --all-packages

# Build specific package
uv build --package embcli-core

Run Tests

# Run tests for specific package
uv run --package embcli-core pytest packages/embcli-core/tests

# Run a single test
uv run --package embcli-core pytest packages/embcli-core/tests/embcli_core/test_cli_embed.py::test_function_name

Linting and Formatting

# Run linter and formatter on all packages
uv run ruff check --fix
uv run ruff format

# Run on specific package
uv run ruff check --fix packages/embcli-core
uv run ruff format packages/embcli-core

Type Checking

# Run type checker for specific package
uv run --package embcli-core pyright packages/embcli-core

Documentation

# Build documentation
uv run mkdocs build -f docs/mkdocs.yml

High-Level Architecture

embcli is a plugin-based CLI for working with embedding models and vector stores. The architecture follows these key principles:

Core Components

embcli-core: The main package, containing:
- CLI (cli.py) - Command-line interface built with Click
- Plugin system (plugins.py) - Uses pluggy for plugin management
- Base classes (models.py, vector_stores.py) - Abstract base classes for embedding models and vector stores
- Document handling (document.py, document_loader.py) - Document types and loading utilities

Plugin Architecture:
- Each embedding model is a separate package (e.g., embcli-openai, embcli-gemini)
- Models register themselves via setuptools entry points
- Plugin discovery happens through pluggy.PluginManager
- Hooks are defined in hookspecs.py

Model Types:
- EmbeddingModel: Base class for text embedding models
- MultimodalEmbeddingModel: Extends EmbeddingModel with image support
- LocalEmbeddingModel: For models that run locally (e.g., sentence-transformers)

Vector Store Integration:
- The abstract VectorStore base class defines the interface
- LanceDB is currently supported as the default implementation, via vector_store/lancedb.py
- Vector stores handle document ingestion and similarity search
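
The relationship between an abstract model class, an abstract vector store, and a concrete store can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in models.py or vector_stores.py: the method names (embed, ingest, search), ToyModel, and InMemoryStore are all hypothetical, and the real base classes may have different signatures.

```python
from __future__ import annotations

import math
from abc import ABC, abstractmethod


class EmbeddingModel(ABC):
    """Hypothetical stand-in for the text embedding base class."""

    @abstractmethod
    def embed(self, text: str) -> list[float]:
        """Return one vector for one input string."""


class ToyModel(EmbeddingModel):
    """Toy model for illustration: counts vowels, consonants, and digits."""

    def embed(self, text: str) -> list[float]:
        lowered = text.lower()
        vowels = sum(ch in "aeiou" for ch in lowered)
        letters = sum(ch.isalpha() for ch in lowered)
        digits = sum(ch.isdigit() for ch in lowered)
        return [float(vowels), float(letters - vowels), float(digits)]


class VectorStore(ABC):
    """Hypothetical stand-in for the vector store interface."""

    @abstractmethod
    def ingest(self, model: EmbeddingModel, documents: list[str]) -> None:
        """Embed and store the given documents."""

    @abstractmethod
    def search(self, model: EmbeddingModel, query: str, top_k: int = 1) -> list[str]:
        """Return the top_k stored documents most similar to the query."""


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryStore(VectorStore):
    """Assumed in-memory store; the project itself uses LanceDB by default."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, list[float]]] = []

    def ingest(self, model: EmbeddingModel, documents: list[str]) -> None:
        for doc in documents:
            self._rows.append((doc, model.embed(doc)))

    def search(self, model: EmbeddingModel, query: str, top_k: int = 1) -> list[str]:
        query_vec = model.embed(query)
        ranked = sorted(self._rows, key=lambda row: cosine(query_vec, row[1]), reverse=True)
        return [doc for doc, _ in ranked[:top_k]]
```

The store only depends on the abstract model interface, so any model plugin could be passed to ingest and search without the store knowing which vendor it comes from.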

Key Design Patterns

- Plugin Registration: Models and vector stores register via the embedding_model() and vector_store() hooks
- Batch Processing: All models support batch embedding generation for efficiency
- Model Aliases: Each model can have multiple aliases for user convenience
- Environment Variables: API keys and configuration are loaded from .env files
- Separation of Concerns: Core functionality is kept separate from vendor-specific implementations
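
Two of the patterns above, batch processing and model aliases, can be illustrated with a short sketch. Everything here is an assumption made for illustration: the helper names (batched, embed_all, resolve_model_id), the alias table, and the default batch size are hypothetical and do not come from embcli-core.

```python
from __future__ import annotations

from typing import Callable, Iterator


def batched(items: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_all(
    texts: list[str],
    embed_batch: Callable[[list[str]], list[list[float]]],
    batch_size: int = 2,
) -> list[list[float]]:
    """Call embed_batch once per chunk instead of once per text."""
    vectors: list[list[float]] = []
    for chunk in batched(texts, batch_size):
        vectors.extend(embed_batch(chunk))
    return vectors


# Hypothetical alias table: canonical model id -> accepted short names.
ALIASES: dict[str, list[str]] = {"toy-length-v1": ["toy", "length"]}


def resolve_model_id(name: str) -> str:
    """Map a canonical id or any of its aliases to the canonical id."""
    for model_id, aliases in ALIASES.items():
        if name == model_id or name in aliases:
            return model_id
    raise KeyError(f"unknown model: {name}")
```

Batching matters most for remote APIs, where one request carrying several inputs usually costs less than several single-input requests.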

Adding New Models

To add a new embedding model:

1. Create a new package under packages/embcli-<vendor>
2. Implement the EmbeddingModel or MultimodalEmbeddingModel interface
3. Register the model via setuptools entry point in pyproject.toml
4. Add appropriate tests following the existing pattern
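
As a rough sketch of steps 2 and 4, a vendor package might look something like the code below. The vendor name "acme", the class AcmeEmbeddingModel, its attributes, the embed_batch method, and the return shape of embedding_model() are all hypothetical. In the real project the model would subclass a base class from embcli-core and the hook function would carry the hook implementation marker defined alongside hookspecs.py; neither is shown here because their exact signatures are not given in this file.

```python
from __future__ import annotations


class AcmeEmbeddingModel:
    """Hypothetical vendor model; a real one would subclass EmbeddingModel."""

    vendor = "acme"
    # Assumed shape: (canonical model id, list of aliases).
    model_aliases = [("acme-small", ["small"])]

    def __init__(self, model_id: str) -> None:
        self.model_id = model_id

    def embed_batch(self, texts: list[str]) -> list[list[float]]:
        # Placeholder for a vendor API call: character count and word count.
        return [[float(len(t)), float(t.count(" ") + 1)] for t in texts]


def embedding_model():
    """Hypothetical hook body: return the model class and a factory for it."""

    def create(model_id: str) -> AcmeEmbeddingModel:
        return AcmeEmbeddingModel(model_id)

    return AcmeEmbeddingModel, create


def test_acme_embed_batch() -> None:
    """A pytest-style test exercising the factory and the batch method."""
    model_cls, create = embedding_model()
    model = create("acme-small")
    assert isinstance(model, model_cls)
    assert model.embed_batch(["hello world"]) == [[11.0, 2.0]]
```

Following the single-test command shown earlier in this file, a test like this would be run by passing its file path and `::test_acme_embed_batch` to pytest for the new package.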
