feat: Enhanced Ingestion Pipeline with Resume/Checkpointing #64

@maximilien

Summary

Enhance the ingestion pipeline with resilience features for production workloads.

Scope

  • Add the ability to resume after a failure
  • Implement checkpointing
  • Add deduplication
  • Add validation before insert
  • Track batch metrics
  • Improve error reporting
  • Add dry-run mode
  • Export metrics to JSON/YAML
  • Test resilient ingestion
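
To make the resume, checkpointing, and deduplication items concrete, here is a minimal sketch in Python. It is not the weave implementation: the checkpoint file layout, the `ingest` function, and the `insert` callback are all hypothetical names chosen for illustration. It assumes documents are keyed by ID and that duplicates are detected by a SHA-256 hash of the content.

```python
import hashlib
import json
import os
import tempfile


def content_hash(text: str) -> str:
    """Fingerprint used for deduplication (assumption: exact-content match)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def load_checkpoint(path: str) -> dict:
    """Return saved progress, or an empty state if no checkpoint exists."""
    if not os.path.exists(path):
        return {"done": [], "hashes": []}
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def save_checkpoint(path: str, state: dict) -> None:
    """Write to a temp file, then rename, so a crash never leaves a torn file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def ingest(docs: dict, insert, checkpoint_path: str) -> dict:
    """Ingest docs (ID -> text), skipping finished IDs and duplicate content.

    `insert` is a hypothetical callback that may raise; when it does, the
    checkpoint still reflects every document completed before the failure.
    """
    state = load_checkpoint(checkpoint_path)
    done = set(state["done"])
    hashes = set(state["hashes"])
    stats = {"inserted": 0, "skipped_done": 0, "skipped_duplicate": 0}

    for doc_id in sorted(docs):
        if doc_id in done:
            stats["skipped_done"] += 1
            continue
        digest = content_hash(docs[doc_id])
        if digest in hashes:
            stats["skipped_duplicate"] += 1
        else:
            insert(doc_id, docs[doc_id])  # may raise; progress so far is saved
            hashes.add(digest)
            stats["inserted"] += 1
        done.add(doc_id)
        save_checkpoint(
            checkpoint_path,
            {"done": sorted(done), "hashes": sorted(hashes)},
        )
    return stats
```

Calling `ingest` a second time with the same checkpoint path picks up where the failed run stopped. Checkpointing after every document keeps the sketch simple; a real pipeline would more likely checkpoint per batch to reduce disk writes.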

Commands

weave stack ingest Docs data/ --resume
weave stack ingest Docs data/ --dry-run
weave stack ingest Docs data/ --validate
weave stack ingest Docs data/ --report metrics.json
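
The behavior behind `--validate`, `--dry-run`, and `--report` is not specified in this issue. As one possible reading, the Python sketch below validates each document before insert, skips the write when running dry, and records per-batch metrics that can be written to a JSON file. The validation rules, the report fields, and the function names are assumptions for illustration only. YAML export is left out because it would need a third-party library.

```python
import json
import time


def validate(doc: dict) -> list:
    """Return a list of problems; an empty list means the doc is insertable."""
    errors = []
    if not isinstance(doc.get("id"), str) or not doc["id"]:
        errors.append("missing id")
    if not isinstance(doc.get("text"), str) or not doc["text"].strip():
        errors.append("empty text")
    return errors


def ingest_batches(docs: list, insert, batch_size: int = 2, dry_run: bool = False) -> dict:
    """Validate and (unless dry_run) insert docs, collecting batch metrics."""
    report = {"dry_run": dry_run, "batches": [], "errors": []}
    for start in range(0, len(docs), batch_size):
        batch = docs[start:start + batch_size]
        began = time.monotonic()
        valid = 0
        for doc in batch:
            problems = validate(doc)
            if problems:
                # Report the rejection with enough detail to fix the input.
                report["errors"].append({"id": doc.get("id"), "problems": problems})
                continue
            if not dry_run:
                insert(doc)  # hypothetical write to the vector store
            valid += 1
        report["batches"].append({
            "index": start // batch_size,
            "size": len(batch),
            "valid": valid,
            "rejected": len(batch) - valid,
            "seconds": round(time.monotonic() - began, 6),
        })
    return report


def write_report(report: dict, path: str) -> None:
    """Export the collected metrics as JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(report, f, indent=2)
```

In this sketch a dry run still performs validation and produces a full report, so it can be used to preview how many documents would be rejected before anything is written.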

Success Criteria

  • Ingestion recovers from failures and resumes without starting over
  • Checkpointing works correctly
  • Metrics export to JSON/YAML works

Priority: Phase 2 - Week 2

Estimated: 2-3 days
Assignee: TBD

Related

Can be developed in parallel with cloud deployments.

Metadata

Labels: enhancement, feature, p2, phase-2