A Retrieval-Augmented Generation (RAG)-based AI teaching assistant that helps students find specific content within video lectures using semantic search and natural language queries.
- Automatic video subtitle extraction and processing
- Semantic search using BGE-M3 embeddings
- Context-aware answers with precise video timestamps
- Integration with both local LLMs (Ollama) and OpenAI GPT models
- Efficient vector similarity search using cosine similarity
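The cosine similarity at the heart of the retrieval step is compact enough to show inline. A minimal NumPy sketch (the function name is illustrative, not from this repo):

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two embedding vectors:
    1.0 = same direction (very similar), 0.0 = orthogonal (unrelated)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```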
- Python: Core programming language
- Embeddings: BGE-M3 (via Ollama; see the sketch after this list)
- Vector Search: Cosine Similarity, NumPy
- LLM: Llama 3.2 (Ollama) or GPT-4 (OpenAI, highly recommended)
- Libraries: Pandas, Scikit-learn, Joblib, Requests
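Embedding a chunk of text through Ollama is one HTTP call. A minimal sketch, assuming a local Ollama server on its default port (11434) with the bge-m3 model already pulled; the helper name is illustrative:

```python
import requests

def embed_text(text: str) -> list[float]:
    """Embed one piece of text with BGE-M3 via a local Ollama server."""
    resp = requests.post(
        "http://localhost:11434/api/embed",
        json={"model": "bge-m3", "input": text},
    )
    resp.raise_for_status()
    # Ollama returns one embedding per input string.
    return resp.json()["embeddings"][0]
```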
1. Video Processing: Convert video files to MP3 audio format (steps 1-2 are sketched below)
2. Transcription: Extract subtitles/transcripts to JSON format
3. Embedding Generation: Create vector embeddings for each subtitle chunk
4. Query Processing: Convert user questions to embeddings (steps 4-6 are sketched below)
5. Retrieval: Find top-5 most relevant video segments using cosine similarity
6. Response Generation: LLM generates contextual answers with timestamps
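Steps 1 and 2 are not pinned to specific tools in this README. Here is one possible sketch using ffmpeg for audio extraction and openai-whisper for transcription; both tool choices are assumptions, and the function names are illustrative:

```python
import json
import subprocess
from pathlib import Path

import whisper  # pip install openai-whisper; an assumed ASR choice

# Step 1: strip the video track and re-encode the audio as MP3.
def video_to_mp3(video: Path, mp3: Path) -> None:
    subprocess.run(
        ["ffmpeg", "-i", str(video), "-vn", "-q:a", "2", str(mp3)],
        check=True,
    )

# Step 2: transcribe the MP3 into timestamped subtitle chunks.
def mp3_to_json(mp3: Path, out: Path) -> None:
    model = whisper.load_model("base")
    result = model.transcribe(str(mp3))
    chunks = [
        {"start": s["start"], "end": s["end"], "text": s["text"]}
        for s in result["segments"]
    ]
    out.write_text(json.dumps(chunks))
```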
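Steps 4-6 might look roughly like this at query time. A sketch, not the repo's exact code: it assumes embedding.joblib holds a pandas DataFrame with title, start, end, text, and embedding columns, and that Ollama serves both bge-m3 and llama3.2 locally:

```python
import joblib
import numpy as np
import requests
from sklearn.metrics.pairwise import cosine_similarity

def embed_text(text: str) -> list[float]:
    resp = requests.post("http://localhost:11434/api/embed",
                         json={"model": "bge-m3", "input": text})
    resp.raise_for_status()
    return resp.json()["embeddings"][0]

def answer(question: str, top_k: int = 5) -> str:
    df = joblib.load("embedding.joblib")

    # Step 4: embed the question with the same model used for the chunks.
    q_vec = np.array(embed_text(question)).reshape(1, -1)

    # Step 5: rank every subtitle chunk by cosine similarity, keep the top-5.
    chunk_matrix = np.vstack(df["embedding"].to_numpy())
    scores = cosine_similarity(q_vec, chunk_matrix)[0]
    top = df.iloc[np.argsort(scores)[::-1][:top_k]]

    # Step 6: let the LLM answer from the retrieved chunks and timestamps.
    context = "\n".join(
        f"[{r.title} {r.start:.0f}s-{r.end:.0f}s] {r.text}"
        for r in top.itertuples()
    )
    prompt = ("Answer using only the lecture excerpts below, citing the "
              "video and timestamps.\n\n"
              f"{context}\n\nQuestion: {question}")
    resp = requests.post("http://localhost:11434/api/generate",
                         json={"model": "llama3.2", "prompt": prompt,
                               "stream": False})
    resp.raise_for_status()
    return resp.json()["response"]
```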
1. Move all your video files to the videos folder.
2. Run video_to_mp3.py to convert the video files to MP3 audio.
3. Run mp3_to_json.py to generate the subtitle JSON files.
4. Run preprocess_json.py to convert the JSON files to vector embeddings and save them as embedding.joblib (a sketch of this step follows the list).
5. Execute the main script to start querying your video content.
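What preprocess_json.py does can be sketched as follows. This is an assumption about the actual script: it presumes each subtitle JSON file is a list of chunks with start, end, and text fields, and that the files live in a jsons folder:

```python
import json
from pathlib import Path

import joblib
import pandas as pd
import requests

def embed_batch(texts: list[str]) -> list[list[float]]:
    """Embed a batch of subtitle chunks with BGE-M3 via local Ollama."""
    resp = requests.post("http://localhost:11434/api/embed",
                         json={"model": "bge-m3", "input": texts})
    resp.raise_for_status()
    return resp.json()["embeddings"]

rows = []
for path in Path("jsons").glob("*.json"):  # folder name is an assumption
    chunks = json.loads(path.read_text())
    for chunk, emb in zip(chunks, embed_batch([c["text"] for c in chunks])):
        rows.append({"title": path.stem, "start": chunk["start"],
                     "end": chunk["end"], "text": chunk["text"],
                     "embedding": emb})

# One DataFrame on disk means queries never have to re-embed the chunks.
joblib.dump(pd.DataFrame(rows), "embedding.joblib")
```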
Ask a Question: How to create a responsive navbar?
Thinking...
This topic is covered in Video 15: "Building Navigation Bar", between timestamps 3:45 and 8:20...