Description
We have hundreds of tools mentioned across podcasts, Zoomcamp lectures, workshops, and other videos.
At the moment, these tool mentions are scattered across transcripts and YouTube videos.
We want to build a full tools catalog that includes every tool (open-source or proprietary) that appears in our content, and optionally use Cognee (or similar graph tooling) to generate connections between tools and the videos where they were mentioned.
A Cognee demo was part of this video: https://www.youtube.com/live/MNt_KK32gys?si=1Pb_rShW5UYfPPUH
Goal of the task
Build a system that reads video transcripts from our YouTube content, extracts mentions of tools, generates structured tool pages, and builds a central catalog on the website.
The system may also use Cognee to visualize relationships between tools and videos where they were mentioned.
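As a starting point for the extraction step, here is a minimal sketch that scans timestamped transcript segments for known tool names. It assumes the transcript is already split into segments with a start offset, and that a seed list of tool names exists; `Segment`, `Mention`, and `find_mentions` are hypothetical names, not part of any existing codebase. A real implementation would likely need an LLM or NER step to discover tools that are not in the seed list.

```python
import re
from dataclasses import dataclass


@dataclass
class Segment:
    start_seconds: float  # offset of the segment within the video
    text: str


@dataclass
class Mention:
    tool: str
    video_id: str
    start_seconds: float
    snippet: str


def find_mentions(video_id, segments, tool_names):
    """Return one Mention per (segment, tool) pair where the tool name appears.

    Matching is case-insensitive and requires that the name is not
    embedded inside a longer word (so "dbt" does not match "dbtx").
    """
    patterns = {
        name: re.compile(r"(?<!\w)" + re.escape(name) + r"(?!\w)", re.IGNORECASE)
        for name in tool_names
    }
    mentions = []
    for segment in segments:
        for name, pattern in patterns.items():
            if pattern.search(segment.text):
                mentions.append(
                    Mention(name, video_id, segment.start_seconds, segment.text)
                )
    return mentions
```

Plain name matching will miss misspellings from automatic captions and will produce false positives for tools with common-word names, so its output is better treated as candidates for review than as final catalog data.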
What the catalog should include
Each tool should have:
- name,
- category / purpose,
- a short description,
- all mentions found across our content (episode, timestamp, speaker, quote),
- optional metadata like homepage, GitHub repo, docs, etc.
This catalog is broader than the open-source demo catalog (#91). It includes any tool mentioned in any Zoomcamp, workshop, or podcast.
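The per-tool fields listed above could be modeled roughly as follows. This is only a sketch: the class names, field names, and the lowercase-name grouping rule are assumptions for illustration, and the real schema may differ (for example, tool aliases would need explicit handling).

```python
from dataclasses import dataclass, field


@dataclass
class ToolMention:
    episode: str    # title or ID of the podcast, lecture, or workshop
    timestamp: str  # position in the video, e.g. "00:12:30"
    speaker: str
    quote: str


@dataclass
class ToolEntry:
    name: str
    category: str = ""
    description: str = ""
    mentions: list = field(default_factory=list)
    links: dict = field(default_factory=dict)  # optional: homepage, GitHub repo, docs


def build_catalog(raw_mentions):
    """Group (tool_name, ToolMention) pairs into one ToolEntry per tool.

    Names are grouped case-insensitively; the first spelling seen is
    kept as the display name.
    """
    catalog = {}
    for tool_name, mention in raw_mentions:
        key = tool_name.strip().lower()
        entry = catalog.setdefault(key, ToolEntry(name=tool_name.strip()))
        entry.mentions.append(mention)
    return catalog
```

Each `ToolEntry` could then be rendered into one tool page, with `category`, `description`, and `links` filled in by a later enrichment step.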
The system should produce a dataset of:
tool → source video → timestamp → context snippet.
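One simple way to store this dataset is a flat CSV with one row per mention. The sketch below assumes each mention arrives as a dict with `tool`, `video_id`, `start_seconds`, and `snippet` keys; those key names and the column layout are illustrative choices, not a fixed format.

```python
import csv
import io

FIELDS = ["tool", "video_id", "timestamp", "snippet"]


def format_timestamp(seconds):
    """Convert a second offset into an HH:MM:SS string (fractions are dropped)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def mentions_to_csv(rows):
    """Serialize mention dicts into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(
            {
                "tool": row["tool"],
                "video_id": row["video_id"],
                "timestamp": format_timestamp(row["start_seconds"]),
                "snippet": row["snippet"],
            }
        )
    return buffer.getvalue()
```

A flat file like this is easy to review by hand and to load into the website build or into a graph tool later.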
Outcome
A dynamic tools encyclopedia that consolidates all tool mentions across the entire DataTalks.Club content ecosystem.
This is extremely valuable for learners, for SEO, and for potential partnerships or sponsors who want their tools represented.