
Refactor BaseDoclingAgent from Pydantic BaseModel to Abstract Base Class (ABC) #40

@ceberam

Background

Currently, BaseDoclingAgent inherits from Pydantic's BaseModel, which creates a design inconsistency between the intended use of Pydantic (data validation and serialization) and the actual purpose of agent classes (behavioral objects that execute actions).

Current Implementation:

# docling_agent/agent/base.py
from pydantic import BaseModel, ConfigDict

class BaseDoclingAgent(BaseModel):
    model_config = ConfigDict(arbitrary_types_allowed=True)
    
    agent_type: DoclingAgentType
    backend: BaseBackend
    tools: list
    max_iteration: int = 16

Agent Subclasses:
All agent classes (DoclingRAGAgent, DoclingEnrichingAgent, DoclingEditingAgent, etc.) inherit from BaseDoclingAgent and define Pydantic fields, but then override __init__ to manually set these fields, defeating Pydantic's purpose.

Example from DoclingRAGAgent:

class DoclingRAGAgent(BaseDoclingAgent):
    max_iterations: Annotated[int, Field(description="...")] = 5
    verbose: Annotated[bool, Field(description="...")] = False
    # ... more fields
    
    def __init__(self, *, tools, backend=None, max_iterations=5, verbose=False, ...):
        super().__init__(
            agent_type=DoclingAgentType.DOCLING_DOCUMENT_RAG,
            backend=backend or self.default_backend(),
            tools=tools,
        )
        # Manually setting fields that are already defined as Pydantic fields
        self.max_iterations = max_iterations
        self.verbose = verbose
        # ...

Problem Statement

  1. Semantic Mismatch: Pydantic is designed for data models (validation, serialization, deserialization), not behavioral objects
  2. Confusing Initialization: Mixing Pydantic field definitions with manual __init__ assignments
  3. Unnecessary Overhead: Pydantic validation/serialization features are never used for agents
  4. Misleading API: Developers might expect Pydantic features (model_dump, model_validate) to work meaningfully
  5. Code Duplication: Field definitions + manual assignments in __init__

Motivation

Agents are behavioral objects, not data models:

  • They execute actions and maintain state
  • They have methods that "do things" with natural language instructions
  • They don't need serialization, validation, or schema generation
  • They are never persisted or transmitted as data

Pydantic is appropriate for:

  • ✅ Configuration models (AgentTask, BackendConfig, ModelConfig)
  • ✅ LLM output models (SectionSelection, AnswerAttempt, RAGResult)
  • ✅ Operation validation models (UpdateContentOperation, RewriteContentOperation)
  • ✅ Data persistence models (DocLibraryEntry, DocLibraryIndex)

Pydantic is NOT appropriate for:

  • ❌ Agent classes (behavioral objects)

Proposed Solution

Refactor BaseDoclingAgent to use Python's Abstract Base Class (ABC) pattern.
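
A minimal sketch of what the refactored base class and one subclass could look like. This is illustrative only: `DoclingAgentType` and `BaseBackend` are replaced by stand-ins so the snippet runs on its own, and the abstract `run` method is a hypothetical placeholder, since the real method names are not listed in this issue.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Any


class DoclingAgentType(Enum):
    # Stand-in for the project's real enum; only one member is shown.
    DOCLING_DOCUMENT_RAG = "rag"


class BaseBackend:
    """Stand-in for the project's real backend class."""


class BaseDoclingAgent(ABC):
    """Behavioral base class: plain attributes, no Pydantic fields."""

    def __init__(
        self,
        *,
        agent_type: DoclingAgentType,
        backend: BaseBackend,
        tools: list[Any],
        max_iteration: int = 16,
    ) -> None:
        # Former Pydantic fields become ordinary instance attributes.
        self.agent_type = agent_type
        self.backend = backend
        self.tools = tools
        self.max_iteration = max_iteration

    @abstractmethod
    def run(self, task: str) -> Any:
        """Hypothetical entry point; the real method names may differ."""


class DoclingRAGAgent(BaseDoclingAgent):
    def __init__(self, *, tools, backend=None, max_iterations=5, verbose=False):
        super().__init__(
            agent_type=DoclingAgentType.DOCLING_DOCUMENT_RAG,
            backend=backend or self.default_backend(),
            tools=tools,
        )
        # Set once here; no duplicate field declarations on the class body.
        self.max_iterations = max_iterations
        self.verbose = verbose

    @staticmethod
    def default_backend() -> BaseBackend:
        # Assumed helper, mirroring the default_backend() call in the issue.
        return BaseBackend()

    def run(self, task: str) -> str:
        return f"answered: {task}"
```

With this shape, instantiating `BaseDoclingAgent` directly raises `TypeError`, while concrete agents are constructed exactly as before.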

Benefits

  1. Clearer Design Intent: ABC clearly signals "interface/behavior" vs "data model"
  2. Simpler Initialization: Standard Python __init__, no Pydantic confusion
  3. Less Overhead: No Pydantic validation runs when an agent is constructed
  4. Easier Maintenance: No mixing of Pydantic fields with manual assignments
  5. More Pythonic: Standard OOP patterns for behavioral objects
  6. Better Documentation: Field documentation can use docstrings instead of Pydantic Field
  7. Type Safety: Type hints remain available to static checkers; only Pydantic's runtime validation of constructor arguments is dropped

Required Changes

Files to Modify:

  1. docling_agent/agent/base.py

    • Change BaseDoclingAgent from BaseModel to ABC
    • Remove model_config
    • Convert fields to __init__ parameters
    • Keep all methods unchanged
  2. All Agent Classes:

    • docling_agent/agent/rag.py - DoclingRAGAgent
    • docling_agent/agent/enricher.py - DoclingEnrichingAgent
    • docling_agent/agent/editor.py - DoclingEditingAgent
    • docling_agent/agent/writer.py - DoclingWritingAgent
    • docling_agent/agent/extractor.py - DoclingExtractingAgent
    • docling_agent/agent/orchestrator.py - DoclingOrchestratorAgent

    For each:

    • Remove Pydantic field definitions
    • Keep __init__ methods (already correct)
    • Update super().__init__() calls if needed
    • Remove any Pydantic-specific code
  3. Tests:

    • Update any tests that rely on Pydantic features
    • Verify all agent instantiation still works
    • Check that no code uses model_dump, model_validate, etc. on agents
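
The last check could be partly automated with a small helper that reports Pydantic-style attributes still present on a class. This is only a sketch: `pydantic_leftovers`, `PlainAgent`, and `LegacyAgent` are hypothetical names, and the two classes are stand-ins so the snippet does not depend on the project or on Pydantic being installed. It only inspects class attributes; it does not find call sites elsewhere in the codebase.

```python
# Attribute names taken from the Pydantic features mentioned in this issue.
PYDANTIC_ATTRS = ("model_dump", "model_validate", "model_fields", "model_config")


def pydantic_leftovers(cls: type) -> list[str]:
    """Return the Pydantic-style attribute names that `cls` still exposes."""
    return [name for name in PYDANTIC_ATTRS if hasattr(cls, name)]


class PlainAgent:
    """Stand-in for an agent after the refactoring."""

    def __init__(self, tools: list) -> None:
        self.tools = tools


class LegacyAgent:
    """Stand-in for an agent that still carries Pydantic-style members."""

    model_config = {"arbitrary_types_allowed": True}

    def model_dump(self) -> dict:
        return {}
```

In a test suite, one might assert `pydantic_leftovers(cls) == []` for each of the agent classes listed above.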

Backward Compatibility

Breaking Changes:

  • Any code that treats agents as Pydantic models will break
  • This includes code that calls model_dump() or model_validate() on agents, or reads model_fields

Mitigation:

  • Search codebase for Pydantic method calls on agents (likely none exist)
  • Add deprecation warnings if needed
  • Document changes in CHANGELOG
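
If deprecation warnings turn out to be needed, one possible approach is a temporary mixin that keeps `model_dump()` callable while warning callers. This is a sketch under assumptions, not a planned API: `DeprecatedPydanticAPIMixin` and `LegacyCompatAgent` are hypothetical names, and returning a shallow copy of the instance attributes is only a rough approximation of what Pydantic's `model_dump()` would have produced.

```python
import warnings


class DeprecatedPydanticAPIMixin:
    """Hypothetical shim: keeps model_dump() callable for one release."""

    def model_dump(self) -> dict:
        warnings.warn(
            "Agents are no longer Pydantic models; model_dump() will be removed.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        # Shallow copy of instance attributes, not a validated Pydantic dump.
        return dict(vars(self))


class LegacyCompatAgent(DeprecatedPydanticAPIMixin):
    """Stand-in agent used only to demonstrate the shim."""

    def __init__(self, tools: list) -> None:
        self.tools = tools
```

The mixin can be deleted once no caller triggers the warning.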

Additional Notes

  • This refactoring does NOT affect data models (AgentTask, BackendConfig, etc.) which correctly use Pydantic
  • LLM output models (SectionSelection, AnswerAttempt, etc.) should remain Pydantic models
  • The change is primarily architectural - functionality remains the same

Labels: architecture, enhancement, refactoring