Team Omni
Prajith Ravisankar, Srijan Ravisankar
ThreadFlow is a visual pipeline builder for analyzing Reddit discussions. Users drag and drop nodes onto a canvas to build data processing workflows—similar to how n8n or Node-RED work, but focused on social media analysis.
The tool connects to a local database of Reddit comments and uses Google's Gemini AI to perform analysis tasks like sentiment detection, bot identification, and content summarization.
Analyzing large amounts of social media comments manually is time-consuming. A single Reddit thread can have thousands of comments, and popular subreddits accumulate millions over time.
Researchers, journalists, or anyone trying to understand public sentiment on a topic faces two options:
- Read comments one by one (not practical at scale)
- Write custom scripts for each analysis task (requires programming knowledge)
ThreadFlow provides a middle ground: a visual interface where non-programmers can build analysis workflows by connecting nodes.
Data sources:
- Loads Reddit comments from a local DuckDB database
- Supports search queries to find relevant discussions
- Filters by comment score or keywords
AI analysis nodes:
- Sentiment Analysis: Classifies comments as positive, negative, or neutral
- Bot Detection: Flags potentially automated or suspicious accounts
- Evidence Extraction: Identifies factual claims in discussions
- Summarization: Generates summaries of comment threads
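A sentiment node like the one above might be wired to Gemini roughly as follows. This is a hedged sketch assuming the `google-generativeai` client: the prompt wording, the model name, and the `GEMINI_API_KEY` variable are illustrative choices, not the project's actual ones.

```python
import os

# Prompt wording is an assumption; the real node's prompt may differ.
PROMPT_TEMPLATE = (
    "Classify the sentiment of this Reddit comment as exactly one word: "
    "positive, negative, or neutral.\n\nComment: {comment}"
)

def build_prompt(comment: str) -> str:
    return PROMPT_TEMPLATE.format(comment=comment)

def normalize_label(raw: str) -> str:
    """Map a free-form model reply onto the three expected labels."""
    text = raw.strip().lower()
    for label in ("positive", "negative", "neutral"):
        if label in text:
            return label
    return "neutral"  # fall back when the model answers off-script

def classify(comment: str) -> str:
    # Assumed client usage: pip install google-generativeai
    import google.generativeai as genai
    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    model = genai.GenerativeModel("gemini-1.5-flash")
    reply = model.generate_content(build_prompt(comment))
    return normalize_label(reply.text)
```

Normalizing the reply matters in practice: models often answer with extra words ("The sentiment is negative."), so downstream nodes should never assume a bare label.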
Visualization nodes:
- Data tables showing filtered results
- 3D visualizations: Canada map (geographic distribution), political party breakdown, bar charts, pie charts
Workflow:
- Drag-and-drop node interface
- Connect nodes to create processing pipelines
- Run pipelines with a single click
- View results directly in the nodes
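Conceptually, running a pipeline means applying the connected nodes in order, each transforming the data produced by the previous one. A minimal sketch of that execution model, with made-up node names (the real app orchestrates this between React Flow and FastAPI):

```python
# Each "node" is a function from comments to comments; a pipeline is an
# ordered list of connected nodes. Node names and data are illustrative.
def load_node(_):
    return [
        {"body": "I love this policy", "score": 30},
        {"body": "spam spam spam", "score": -4},
    ]

def min_score_node(comments, threshold=0):
    return [c for c in comments if c["score"] >= threshold]

def run_pipeline(nodes, data=None):
    for node in nodes:
        data = node(data)
    return data

result = run_pipeline([load_node, min_score_node])
# result keeps only the comment with score 30
```

Modeling nodes as plain data-in/data-out functions is what makes them freely recombinable on the canvas.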
Frontend:
- Next.js 15 with React
- React Flow for the node-based canvas
- Three.js for 3D visualizations
- TailwindCSS for styling
Backend:
- FastAPI (Python)
- DuckDB for querying ~1.3GB of Reddit data locally
- Google Gemini API for AI analysis
Dataset:
- r/Canada subreddit comments and threads (CSV format, ingested into DuckDB)
Out of scope:
- Does not scrape Reddit in real-time (uses a static dataset)
- Does not deploy to the cloud (runs entirely on localhost)
- Does not store user data or require authentication
- Does not perform vector search or semantic similarity (uses keyword-based full-text search)
- Does not guarantee AI accuracy (Gemini results depend on prompt quality and model limitations)
Quick start:
- Set up the backend: install Python dependencies, add a Gemini API key to `.env`, and run `python ingest.py` to build the DuckDB database
- Start the backend: `uvicorn main:app --port 8000`
- Start the frontend: `npm install && npm run dev`
- Open `http://localhost:3000`
Full setup instructions are in readme.md.
Known limitations:
- Dataset is limited to the r/Canada subreddit
- Gemini API has rate limits (15 requests/minute on free tier)
- Large datasets may slow down the browser
- 3D visualizations require WebGL support
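The free-tier rate limit noted above can be handled with a simple client-side throttle that spaces calls at least 60/15 = 4 seconds apart. The sketch below is an illustrative helper, not code from the project; the 15 requests/minute figure comes from the limitation above and may change with Google's quotas.

```python
import time

class RateLimiter:
    """Blocks so that successive calls stay under a per-minute quota.
    Clock and sleep are injectable for testing."""

    def __init__(self, per_minute=15, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute  # seconds between allowed calls
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = 0.0

    def wait(self):
        now = self.clock()
        if now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval
```

Calling `limiter.wait()` before each Gemini request keeps a batch of analysis nodes from tripping the quota mid-pipeline.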
Project structure:
backend/ → FastAPI server, DuckDB queries, Gemini integration
frontend/ → Next.js app, React Flow canvas, 3D components
archive/ → Source CSV files for Reddit data
AI Collective Hackathon 2026