The notebook demonstrates the core building blocks behind modern LLM workflows:
- installing the required libraries
- using a Hugging Face token to access a gated model
- tokenizing text with a tokenizer
- generating embeddings with the OpenAI API
- converting prompts into PyTorch tensors
- loading a causal language model
- running a forward pass
- generating new tokens from a prompt
- decoding generated token IDs back into readable text
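The tokenize/decode steps in this list can be illustrated with a minimal, pure-Python sketch. This is a toy word-level vocabulary built here for illustration only; in the notebook this step is handled by Hugging Face's AutoTokenizer, which uses subword units rather than whole words:

```python
# Toy word-level tokenizer: maps words to integer IDs and back.
# A conceptual stand-in for a real tokenizer, not the Gemma tokenizer.

def build_vocab(corpus):
    """Assign an integer ID to each unique word in the corpus."""
    return {w: i for i, w in enumerate(sorted(set(corpus.split())))}

def encode(text, vocab):
    """Convert text into a list of token IDs."""
    return [vocab[w] for w in text.split()]

def decode(ids, vocab):
    """Convert token IDs back into readable text."""
    inv = {i: w for w, i in vocab.items()}
    return " ".join(inv[i] for i in ids)

vocab = build_vocab("the cat sat on the mat")
ids = encode("the cat sat", vocab)
print(ids)                 # [4, 0, 3]
print(decode(ids, vocab))  # "the cat sat"
```

Real tokenizers return these IDs as tensors ready for the model, but the round trip (text to IDs and back) is the same idea.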
LLM_Understanding_Commented.ipynb: commented notebook with explanations added throughout
This notebook is meant to help you understand the end-to-end flow of an LLM system:
- Input text
- Tokenization
- Model processing
- Token prediction / generation
- Decoding back to text
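The five steps above can be sketched end to end with a toy bigram "model": a word-pair frequency table standing in for the neural network (the notebook uses Gemma via Transformers instead). The corpus and prompt here are made up for illustration:

```python
from collections import Counter, defaultdict

# Toy end-to-end flow: input text -> tokenize -> "model" -> predict -> decode.
corpus = "the cat sat on the mat the cat ran"
tokens = corpus.split()

# "Training": count which token most often follows each token.
bigrams = defaultdict(Counter)
for a, b in zip(tokens, tokens[1:]):
    bigrams[a][b] += 1

def generate(prompt, n_tokens=3):
    out = prompt.split()                      # steps 1-2: input + tokenization
    for _ in range(n_tokens):
        nxt = bigrams.get(out[-1])            # step 3: "model processing"
        if not nxt:
            break
        out.append(nxt.most_common(1)[0][0])  # step 4: greedy next-token prediction
    return " ".join(out)                      # step 5: decode back to text

print(generate("the cat"))  # "the cat sat on the"
```

A real LLM replaces the frequency table with a neural network that scores every token in its vocabulary at each step, but the loop structure (predict one token, append, repeat, decode) is the same.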
It also introduces embeddings as a separate but important idea for semantic understanding.
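Embeddings are just vectors, and semantic similarity between them is typically measured with cosine similarity. A minimal sketch with made-up 3-dimensional vectors (real embeddings from the OpenAI API have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up vectors for illustration; in the notebook these would come
# from the OpenAI Embeddings API.
cat = [0.9, 0.1, 0.2]
kitten = [0.85, 0.15, 0.25]
car = [0.1, 0.9, 0.3]

print(cosine_similarity(cat, kitten))  # close to 1.0 (similar meaning)
print(cosine_similarity(cat, car))     # noticeably lower
```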
- Python
- Google Colab / Jupyter Notebook
- Hugging Face Transformers
- Gemma (google/gemma-3-1b-it)
- OpenAI Embeddings API
- PyTorch
- Open the notebook in Google Colab or Jupyter Notebook
- Install the required libraries from the first cells
- Add your own:
- Hugging Face token
- OpenAI API key
- Run the cells in order
Do not upload real API keys or tokens to GitHub.
Use environment variables, Colab secrets, or another secure secret-management method instead.
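One way to do this, assuming the secrets were exported as environment variables named HF_TOKEN and OPENAI_API_KEY (example names chosen here, not required by any library):

```python
import os

# Read secrets from environment variables instead of hard-coding them.
# HF_TOKEN and OPENAI_API_KEY are example variable names; use whatever
# matches your setup.
hf_token = os.environ.get("HF_TOKEN")
openai_api_key = os.environ.get("OPENAI_API_KEY")

if hf_token is None:
    print("HF_TOKEN is not set; gated models like Gemma will not load.")
```

In Colab, values stored in the Secrets panel can be read with `from google.colab import userdata` and `userdata.get("HF_TOKEN")` instead.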
- move secrets to Colab Secrets
- add temperature, top-k, and top-p generation examples
- compare tokenization across different prompts
- visualize embedding similarity between multiple sentences
- add a short section on attention and logits
This project focuses on building beginner-level intuition about how LLMs work internally, especially around tokenization, embeddings, inference, and generation.
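The temperature and top-k ideas from the future-work list can be sketched in pure Python over a toy set of logits. (With Transformers, these would instead be passed to model.generate as the temperature, top_k, and top_p arguments, with do_sample=True; the token names and logit values below are made up for illustration.)

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_top_k(tokens, logits, k=2, temperature=1.0):
    """Keep only the k highest-logit tokens, then sample among them."""
    top = sorted(zip(tokens, logits), key=lambda t: t[1], reverse=True)[:k]
    words, kept_logits = zip(*top)
    probs = softmax(list(kept_logits), temperature)
    return random.choices(words, weights=probs, k=1)[0]

tokens = ["cat", "mat", "dog", "car"]
logits = [2.0, 1.5, 0.5, -1.0]

print(softmax(logits, temperature=0.5))   # sharper: most mass on "cat"
print(sample_top_k(tokens, logits, k=2))  # always "cat" or "mat"
```

Top-p (nucleus) sampling works similarly, except it keeps the smallest set of tokens whose cumulative probability exceeds p rather than a fixed count k.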