Automate Slack Discussions to Website FAQ Sync (Daily Export Pipeline) #90

@kavaivaleri

Description

We want to build an automated system that extracts frequently asked questions from our Slack workspace and syncs them to the website's FAQ section. The goal is to run this process daily, pull new questions and answers, and automatically update the existing FAQ files or create new entries.

We already have an early prototype of an agent that fetches Slack data, but it’s not fully integrated and isn't running in an automated or reliable way. This hackathon task is to improve, extend, or rewrite that agent so that it becomes a production-ready component of our documentation workflow.

What Needs to Be Built

The system should connect to Slack, scan selected channels (for example #course-ml-zoomcamp, #data-engineering, #llm-zoomcamp), identify messages that represent user questions and authoritative answers, and export them into structured files.
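One way to approach the "question plus authoritative answer" step is to group messages by thread and keep only threads whose root looks like a question and that contain a reply from a trusted user. The sketch below is a minimal illustration, not the existing agent's logic. It assumes messages are plain dicts with the `ts`, `thread_ts`, `user`, and `text` fields that Slack returns for channel history; the function name `pair_questions_with_answers`, the `trusted_users` set, and the "contains a question mark" heuristic are all hypothetical choices.

```python
from collections import defaultdict


def pair_questions_with_answers(messages, trusted_users):
    """Pair each question-like thread root with the first reply from a trusted user."""
    threads = defaultdict(list)
    for msg in messages:
        # Replies carry the root's timestamp in thread_ts; a root without
        # replies only has its own ts, so fall back to that.
        threads[msg.get("thread_ts", msg["ts"])].append(msg)

    pairs = []
    for root_ts, thread in threads.items():
        thread.sort(key=lambda m: float(m["ts"]))
        root = thread[0]
        # Skip threads whose root is missing from the export, and roots that
        # are not phrased as a question (a deliberately crude heuristic).
        if root["ts"] != root_ts or "?" not in root.get("text", ""):
            continue
        answers = [m for m in thread[1:] if m.get("user") in trusted_users]
        if answers:
            pairs.append({
                "id": root_ts,
                "question": root["text"],
                "answer": answers[0]["text"],
            })
    return pairs
```

A classifier or an LLM could replace the question-mark check later without changing the shape of the output records.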

The exported content should match the format required by our website FAQ engine. Ideally, the system should distinguish between new items and updates to existing ones.
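Distinguishing new items from updates could be done by deriving a stable key from each question and comparing it against an index of what is already published. The following is a sketch under assumptions: the `question_key` and `classify` helpers are hypothetical, and `existing` is assumed to be a dict mapping a question key to the answer text currently on the website.

```python
import hashlib
import re


def question_key(question):
    """Derive a stable key that ignores case, punctuation, and spacing."""
    normalized = re.sub(r"[^a-z0-9]+", " ", question.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:12]


def classify(items, existing):
    """Split items into new, updated, and unchanged relative to `existing`."""
    result = {"new": [], "updated": [], "unchanged": []}
    for item in items:
        key = question_key(item["question"])
        if key not in existing:
            result["new"].append(item)
        elif existing[key].strip() != item["answer"].strip():
            result["updated"].append(item)
        else:
            result["unchanged"].append(item)
    return result
```

Only the `new` and `updated` buckets would need to end up in the daily commit or pull request.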

The implementation can reuse the existing agent logic, build new functionality on top of it, or replace it entirely if the team finds a better approach.

Expected Functionality

The solution should run daily, ideally via GitHub Actions or another CI system. It should pull new content from Slack, clean and normalize the text, and produce a commit or a pull request with the updated FAQ files.
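For the "clean and normalize" step, Slack message text contains its own markup: user mentions such as `<@U123>`, channel references such as `<#C123|name>`, links such as `<https://example.com|label>`, and HTML-escaped `&`, `<`, and `>`. A minimal sketch of a cleaner is shown below. The function name `clean_slack_text` and the output conventions (for example replacing mentions with a generic `@user`) are assumptions, and collapsing all whitespace is a simplification that would flatten multi-line code snippets.

```python
import html
import re


def clean_slack_text(text):
    """Convert Slack-flavoured markup into plain, single-line text."""
    # Labelled links: <https://example.com|label> -> label (https://example.com)
    text = re.sub(r"<(https?://[^|>]+)\|([^>]+)>", r"\2 (\1)", text)
    # Bare links: <https://example.com> -> https://example.com
    text = re.sub(r"<(https?://[^>]+)>", r"\1", text)
    # Channel references: <#C123|name> -> #name
    text = re.sub(r"<#[A-Z0-9]+\|([^>]+)>", r"#\1", text)
    # User mentions: <@U123> -> @user (resolving real names needs an extra lookup)
    text = re.sub(r"<@[A-Z0-9]+>", "@user", text)
    # Decode &amp;, &lt;, &gt; only after the angle-bracket markup is handled
    text = html.unescape(text)
    # Collapse runs of whitespace, including newlines
    return re.sub(r"\s+", " ", text).strip()
```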

The system must handle basic formatting, avoid duplicates, and store questions in a consistent, searchable format.
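Duplicate avoidance can start with a simple string-similarity check before moving to anything semantic. The sketch below uses `difflib.SequenceMatcher` from the Python standard library; the helper name `drop_near_duplicates` and the 0.9 threshold are assumptions that would need tuning on real data, and the pairwise comparison is quadratic, so it is only suitable for modest batch sizes.

```python
from difflib import SequenceMatcher


def _norm(question):
    # Lowercase and collapse whitespace so trivial differences do not matter
    return " ".join(question.lower().split())


def drop_near_duplicates(questions, threshold=0.9):
    """Keep the first occurrence of each group of near-identical questions."""
    kept = []
    for q in questions:
        is_new = all(
            SequenceMatcher(None, _norm(q), _norm(k)).ratio() < threshold
            for k in kept
        )
        if is_new:
            kept.append(q)
    return kept
```

Embedding-based matching could later replace the similarity function while keeping the same keep-first loop.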

Integration With the Website

The website already supports FAQ content through Markdown or HTML blocks. The task includes making sure the exported data is compatible with our existing FAQ structure and that the integration is smooth.
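Since the exact FAQ file layout is not specified here, the following is only a sketch of what a Markdown export step might look like. The front matter fields (`id`, `source_channel`), the second-level heading for the question, and the function name `render_faq_markdown` are all assumptions and would need to be aligned with the website's actual FAQ structure.

```python
def render_faq_markdown(entry):
    """Render one FAQ entry as a Markdown document with simple front matter."""
    lines = [
        "---",
        f"id: {entry['id']}",
        f"source_channel: {entry['channel']}",
        "---",
        "",
        f"## {entry['question'].strip()}",
        "",
        entry["answer"].strip(),
        "",  # ensures the file ends with a newline
    ]
    return "\n".join(lines)
```

Keeping the rendering in one small function makes it easy to swap the layout if the FAQ format is extended or adjusted.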

If needed, the team can extend or adjust the FAQ format to make the pipeline easier or more robust.

Participation

Teams may choose to:

  • improve the existing agent,
  • completely rewrite it using a more reliable approach,
  • or build additional intelligence (classification, semantic matching, deduplication, etc.).

All improvements are welcome as long as the final result is automated, reliable, and easy to maintain.

Outcome

A working daily pipeline that automatically extracts FAQ items from Slack and updates our website with fresh, high-quality entries.

This solution will significantly reduce manual effort, improve learner experience, and ensure that frequently asked questions across all Zoomcamps remain accurate and up to date.
